SSIS with Excel Source/Destination on x64 SQL Server Instance

Not all people know, that the Excel file format is not supported by the x64 version of SQL server and the same apply to the SSIS runtime. This apply not only to excel but to all data sources for which you have only 32 bit drivers (e.g. ODBC drivers of external data sources etc.

Although you can develop the SSIS package with Excel source/destination in Business Intelligence Studio (BIDS)  on x64 machine, you can design the whole ETL process, but when you try to debug such package, you will receive an error message, that Excel file source is not supported on x64.

Debugging on x64 machine

If you want to debug a package on x64 machine, you have to set the environment not to use the x64 runtime. BIDS by default uses the x64 runtime on x64 machines so it is necessary to disable it. You can do this in the SSIS project properties.

SSIS Project Properties in BIDS

Once you set the Run64BitRuntime to false, you will be able to run and debug the package in the BIDS environment.

Running package on x64 version of SQL Server

As Mentioned above, you can set the SSIS properties, so the BIDS does not use the x64 runtime for running the package containing an Excel source/destination. Unfortunately this settings is only for design time debugging in the BIDS environment. When you deploy the package to the server and try to run it directly or from within and agent job, execution will fail as the Excel is not supported by x64 version of SQL Server.

SQL Server 2008 and newer

On the SQL Server 2008 and newer the solution is easy as the Job Step Properties dialog for a SQL Server Integration Services Package type has an option to execute the package using the 32 bit runtime. The option can be found on the Execution options tab. Once you set the “Use 32 bit runtime”, the SSIS package will run correctly on the x64 version of SQL Server 2008 and above. The option simply set the /X86 argument of the DTExec utility and you can see it on the Command Line tab, once you check it.

Agent Job Step Properties

SQL Server 2005

Situation is quite different on SQL Server 2005 as the Job Step Properties dialog doesn’t have the option to run the package using the X86 runtime. Although it seems, that we are blocked and it will not be possible to run such SSIS package on the x64 version of SQL Server 2005, there is one workaround you can use to run that package. The workaround is to use Operating System (CmdExec) task and use the x86 version of DTExec utility which is also installed on the x64 instance of SQL Server. It’s located in the SQL Server installation directory under “Program Files (X86)” in DTSBinn directory.

To simplify configuration, first choose the SQL Server Integration Services Package Type and select location a SSIS package you want to run.

New Job Step

Once you have selected the package and configured everything necessary for the package execution, you can switch to the Command Line tab where you can see all the necessary arguments to execute the DTExec utility. If you know all the arguments, you do not need to do this steps and simply configure the CmdExec task, although in this way you do not need to write the arguments manually.

New Job Step Command Line

After setting all the necessary parameters, change the step Type to Operating System (CmdExec). All the arguments for the DTExec appear in the command TextBox.

New Job Step - CmdExec

Now you only add the x86 version of DTExec utility including whole path to it.

New Job Step - CmdExec after update

After this, you have a SSIS Package with Excel source/destination (or other data source/destination with only 32bit drivers) configured to run on X64 instance of SQL Server.

Advertisements

Fuzzy strings matching using Levenshtein algorithm on SQL Server (T-SQL vs CLR)

Levenshtein algorithm is one of possible fuzzy strings matching algorithm. Levenshtein algorithm calculates Levenshtein distance which is a metric for measuring a difference between two strings. The Levenshtein distance is also called an edit distance and it defines minimum single character edits (insert/updates/deletes) needed to transform one string to another. Details on the algorithm itself can be found on Wikipedia.
When you need to use it in queries, functions or stored procedures you have two possibilities – T-SQL implementation and CLR implementation.
I will show both solutions here and also compare the speed of both solutions.

T-SQL implementation of Levenshtein algorithm

For T-SQL I will took the one I have found on the SqlTeam.com forums a which was originally developed by Joseph Gama as mentioned in the SqlTeam post.

CREATE FUNCTION [dbo].[edit_distance](
  @s1 nvarchar(3999),
  @s2 nvarchar(3999))
RETURNS int
AS
BEGIN
  DECLARE @s1_len int, @s2_len int
  DECLARE @i int, @j int, @s1_char nchar, @c int, @c_temp int
  DECLARE @cv0 varbinary(8000), @cv1 varbinary(8000)
  SELECT
    @s1_len = LEN(@s1),
    @s2_len = LEN(@s2),
    @cv1 = 0x0000,
    @j = 1, @i = 1, @c = 0

  WHILE @j <= @s2_len
    SELECT @cv1 = @cv1 + CAST(@j AS binary(2)), @j = @j + 1
  WHILE @i <= @s1_len

  BEGIN
    SELECT
      @s1_char = SUBSTRING(@s1, @i, 1),
      @c = @i,
      @cv0 = CAST(@i AS binary(2)),
      @j = 1

    WHILE @j <= @s2_len
    BEGIN
      SET @c = @c + 1
      SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j-1, 2) AS int) +
        CASE WHEN @s1_char = SUBSTRING(@s2, @j, 1) THEN 0 ELSE 1 END
      IF @c > @c_temp SET @c = @c_temp
      SET @c_temp = CAST(SUBSTRING(@cv1, @j+@j+1, 2) AS int)+1
      IF @c > @c_temp SET @c = @c_temp
      SELECT @cv0 = @cv0 + CAST(@c AS binary(2)), @j = @j + 1
    END
    SELECT
      @cv1 = @cv0,
      @i = @i + 1
  END

  RETURN @c
END

CLR implementation of Levenshtein algorithm

public class FuzzyStrings{
  /// <summary>
  /// Calculates the Levenshtein Distance between two strings.
  /// It is minimum of single character insert/delete/update operations needed to transfrom
  /// first string into the second string
  /// </summary>
  /// <param name="firstString">First string to calculate the distance</param>
  /// <param name="secondString">Second string to calculate the distance</param>
  /// <param name="ignoreCase">Specifies whether to ignore case in comparison</param>
  /// <returns>int represending the Levenshtein Distance</returns>
  public static int LevenshteinDistance(SqlString firstString, SqlString secondString, SqlBoolean ignoreCase)
  {
    string strF = ignoreCase ? firstString.Value.ToLower() : firstString.Value;
    string strS = ignoreCase ? secondString.Value.ToLower() : secondString.Value;
    int lenF = strF.Length;
    int lenS = strS.Length;
    int[,] d = new int[lenF + 1, lenS + 1];

    for (int i = 0; i <= lenF; i++)
      d[i, 0] = i;
    for (int j = 0; j <= lenS; j++)
      d[0, j] = j;

    for (int j = 1; j <= lenS; j++)
    {
      for (int i = 1; i <= lenF; i++)
      {
        if (strF[i - 1] == strS[j - 1])
          d[i, j] = d[i - 1, j - 1];
        else
          d[i, j] = Math.Min(Math.Min(
            d[i - 1, j] + 1,        // a deletion
            d[i, j - 1] + 1),       //an Insertion
            d[i - 1, j - 1] + 1);   // a substitution
      }
    }

    return d[lenF, lenS];
  }
}

You need to compile the code using e.g. C# Express into a an assembly and create the assembly in DB. In my Case the assembly is named [SQLCLR].

CREATE FUNCTION [dbo].[fn_LevenshteinDistance](
  @firstString [nvarchar](4000),
  @secondString [nvarchar](4000),
  @ingoreCase [bit] = 1
)
RETURNS [int]
WITH EXECUTE AS CALLER
AS EXTERNAL NAME [CLRSQL].[FuzzyStrings].[LevenshteinDistance]
GO

Testing of the functions

Once we have properly created both functions we can start testing it. Here is a script for test of both functions on several strings.

SELECT
  dbo.edit_distance('Sunday', 'Monday') AS TSQLDistance,
  ClrSafe.fn_LevenshteinDistance('Sunday', 'Monday', 1) AS CLRDistance

UNION ALL 

SELECT
  dbo.edit_distance('Sunday', 'Sunday') AS TSQLDistance,
  ClrSafe.fn_LevenshteinDistance('Sunday', 'Sunday', 0) AS CLRDistance

UNION ALL

SELECT
  dbo.edit_distance('Sunday', 'sunday') AS TSQLDistance,
  ClrSafe.fn_LevenshteinDistance('Sunday', 'sunday', 0) AS CLRDistance

UNION ALL

SELECT
  dbo.edit_distance('Saturday', 'Monday') AS TSQLDistance,
  ClrSafe.fn_LevenshteinDistance('Saturday', 'Monday', 1) AS CLRDistance

UNION ALL

SELECT
  dbo.edit_distance('This is a first string to Compare', 'This is a second string to Compare') AS TSQLDistance,
  ClrSafe.fn_LevenshteinDistance('This is a first string to Compare', 'This is a second string to Compare', 1) AS CLRDistance

And here are the results of the tests:

TSQLDistance CLRDistance
------------ -----------
2            2
0            0
0            1
5            5
6            6

As we can see, the functions return the same results except the second select. As we have selected not to ignore the case in the CLR version. For T-SQL strings are identical as the function uses a default collation which I have Case Insensitive. If I had a default collation Case Sensitive, then the results will be the same. It could be also possible to modify the T-SQL function to accept parameter for Case Sensitive/Insensitive comparison and then use different collations for that, but it’s not what we want to do here.

Speed comparison

As we saw in previous paragraphs here, both T-SQL and CLR version of the algorithm woks correctly. Now take a look on the calculation speed of the Levenshtein distance by both version.

For the test we can use a simple script, which will calculate the Levenshtein distance in cycle 10 000 times

DECLARE
   @TSQLStartTime datetime,
  @TSQLEndTime datetime,
  @CLRStartTime datetime,
  @CLREndTime datetime,
  @distance int,
  @i int

SELECT
  @i = 0,
  @TSQLStartTime = GETDATE();

WHILE (@i < 10000)
BEGIN
  SELECT
    @distance = dbo.edit_distance('This is a first string to Compare', 'This is a second string to compare'),
    @i = @i + 1
END

SELECT
  @TSQLEndTime = GETDATE(),
  @i = 0,
  @CLRStartTime = GETDATE()

WHILE (@i < 10000)
BEGIN
  SELECT
    @distance = [ClrSafe].fn_LevenshteinDistance('This is a first string to Compare', 'This is a second string to compare', 1),
    @i = @i + 1
END

SELECT @CLREndTime = GETDATE()

SELECT
  DATEDIFF(millisecond, @TSQLStartTime, @TSQLEndTime) AS TSQLDuration,
  DATEDIFF(millisecond, @CLRStartTime, @CLREndTime) AS CLRDuration

Here are the results

TSQLDuration CLRDuration
------------ -----------
22993        763

As we can see the 10 000 times calculation using the T-SQL version took 22993 milliseconds which is in average circa 2.3 millisecond for calculating the distance for the strings in our test query.

On the other side the 10 000 times calculation using CLR took only 763 milliseconds which is in average circa 0.08 milliseconds for calculating the distance for the same strings as in T-SQL version.

Conclusion

From the results we can see that the CLR is about 30 times faster on the same machine than the T-SQL version of the same algorithm. Even the T-SQL version took only 2.3 milliseconds per calculation of sample texts and it’s quite good to use in normal usage, the use or CLR can enormously decrease the processing time when processing higher volume of records.

Also it is another example where CLR beats the T-SQL solution and where it has sense to use the CLR instead of pure T-SQL.

Keyboard shortcuts manager in SSMS Denali

Another new feature of the new SSMS Denali is introduction of the new Keyboard Shortcuts Manager known from the Visual Studio, as the new version of SSMS is based on the VS 2010 shell and is more bound to the VS shell than any previous version.

If you get used for shortcuts in some 3rd party SQL Server management application, now you can configure the SSMS to use your own shortcuts.

SSMS Denali Options Dialog - Keyboard Shortcuts

For example if you are missing a default shortcut Ctrl+Shift+M for “Specify Values for Template Parameters” from previous version of SSMS and which was not set-up after my installation of SSMS Denali, you can use the shortcuts manager and assign it.

What I have found missing compared to previous version of SSMS are the custom query shortcuts. Here is screenshot from SSMS 2008R2

SSMS 2008R2 Options Dialog - Keyboard Shortcuts

But When you take a look on the keyboard shortcuts in the SSMS Denali, you will find shortcuts for the Custom Queries, although I didn’t find a way how to define the custom queries. So hope, that Microsoft will add a GUI into the options dialog for defining such queries in upcoming releases of the Denali. On the first screenshot here you can see, that there are also separate commands and shortcuts for the sp_help, sp_who and sp_lock procedures.

SSMS Denali Options Dialog - Keyboard Shortcuts - Custom Queries

Code Snippets in SSMS Denali

Microsoft has introduced a new version of SQL Server Management Studio in the first CTP1 preview of the next SQL Server version code name Denali. The new SSMS is based on new VS 2010 shell and brings some new features.

One of the new features introduced in the CTP1 are new code snippets. Who was working with third party tools or was developing applications in Visual Studio, for sure knows this feature well.

The SSMS supports two kinds of snippets

  1. Insert snippets (Expansion)
  2. Surround With snippets

You can manage the snippets using the Code Snippets Manager available from Tools menu.

SSMS Denali - Code Snippets Manager

The code snippets are using the standard VS2005 snippet XML format, so you can use current available tools for creating and modifying the SQL Server snippets.

<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets  xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
<_locDefinition xmlns="urn:locstudio">
    <_locDefault _loc="locNone" />
    <_locTag _loc="locData">Title</_locTag>
    <_locTag _loc="locData">Description</_locTag>
    <_locTag _loc="locData">Author</_locTag>
    <_locTag _loc="locData">ToolTip</_locTag>
</_locDefinition>
    <CodeSnippet Format="1.0.0">
        <Header>
            <Title>Create Inline Table Function</Title>
            <Shortcut></Shortcut>
            <Description>Creates an inline table function.</Description>
            <Author>Microsoft Corporation</Author>
            <SnippetTypes>
                <SnippetType>Expansion</SnippetType>
            </SnippetTypes>
        </Header>
        <Snippet>
            <Declarations>
                                <Literal>
                                    <ID>SchemaName</ID>
                                    <ToolTip>Name of the schema</ToolTip>
                                    <Default>dbo</Default>
                                </Literal>
                                <Literal>
                                    <ID>FunctionName</ID>
                                    <ToolTip>Name of the function</ToolTip>
                                    <Default>FunctionName</Default>
                                </Literal>
                                <Literal>
                                    <ID>Param1</ID>
                                    <ToolTip>Name of the input parameter</ToolTip>
                                    <Default>param1</Default>
                                </Literal>
                                <Literal>
                                    <ID>Datatype_Param1</ID>
                                    <ToolTip>Data type of the input parameter</ToolTip>
                                    <Default>int</Default>
                                </Literal>
                                <Literal>
                                    <ID>Param2</ID>
                                    <ToolTip>Name of the input parameter</ToolTip>
                                    <Default>param2</Default>
                                </Literal>
                                <Literal>
                                    <ID>Datatype_Param2</ID>
                                    <ToolTip>Data type of the input parameter</ToolTip>
                                    <Default>char(5)</Default>
                                </Literal>
            </Declarations>
            <Code Language="SQL">
<![CDATA[CREATE FUNCTION [$SchemaName$].[$FunctionName$](
    @$Param1$ $Datatype_Param1$,
    @$Param2$ $Datatype_Param2$)
RETURNS TABLE AS RETURN(
    SELECT @$Param1$ AS c1,
           @$Param2$ AS c2)]]>
            </Code>
        </Snippet>
    </CodeSnippet>
</CodeSnippets>

Insert (Expansion) snippets

The expansion snippets are similar to the Templates known from previous versions of SSMS and which are also available in new version of SSMS. You can insert the snippet using the menu Edit/ItelliSense/Insert Snippet…, using a default keyboard shortcut Ctrl+K, Ctrl+X, or using the context menu after mouse right click.

SSMS Denali IntelliSense - Isert Snippet

Once you invoke the command you can choose from the available snippets.

SSMS Denali IntelliSense - choosing snippet

The snippet behave differently from the Templates. After the snippet is inserted several replacement parts are highlighted and you can substitute the default values by your own. Then all all occurrences are automatically substituted in the snippet. You can switch among the replacements using the Tab key. Once you are ready with modifications you simply hit the Enter Key.

SSMS Denali - Create Function Snippet

Surround With snippets

In contrast to Insert (Expansion) snippets the Surround with snippets take all the selected text and surround it by the code contained in the snippet. Surround with snippets are again accessible using the menu or Ctrl+K, Ctrl+S default shortcut.

SSMS Denali - Inserting Surround With Snippet

SSMS Denali - Surround With Snippet

Missing snippet functionality

The current snippet implementation in SSMS has only a few possibilities and small power comparing the to snippets in Visual Studio. Hope that it is only by the current CTP status and Microsoft will implement following before the final release of the product.

Snippets shortcuts

As you can see from the image of the Code Snippet Manager above, there is Shortcut written I the details of the snippet. Unfortunately all the snippets have unassigned shortcuts and event you edit the source of the snippet and put some shortcuts, the current version of SSMS doesn’t support the snippet shortcuts. It could be nice to have eg. “itf” shortcut for Inline Table Function and simple press Tab for the snippet to be inserted as it works in Visual Studio

Functions in Snippets

I would welcome a functions element functionality in the SQL snippets. E.g.. Function for expanding table fields from a table name etc. Something like GenerateSwitchCases function for expanding enum members in C#. By adding such functionality it would be very easy to write a easy to use complex snippets e.g.. for CRUD procedures generation etc.. SSMS Denali is Based on VS 2010 and visual studio doesn’t allow writing custom functions for use inside of snippets, at least Microsoft could introduce at least the function for table expansions I mentioned here.

SSRS 2008 & Multi-Line email body in Data-Driven Subscription delivered by email

Introduction

When you create a report in SSRS 2008 and then want to create a Data Driven Subscription to this report and want to deliver it by email you can come to a problems with the message in the email body.

Let’s say we have a test report and create Data Driven Subscription to this report. The query in the subscription will return fields SubscriberEmail and EmailBody. In the SubscriberEmail we will have email address to which the report should be delivered and in the EmailBody we will have a message we want to send in the email (e.g..explanation of the report, how to use it etc.)

SELECT
    'somebody@domain.com' AS SubscriberEmail,
    N'First Article of Email Body.
Second Article of Email Body.
Third Article of Email Body.' AS EmailBody

The setting could be like below (Comment is email body). Also notice the edit box for the Comment static value.. It’s a small single line TextBox.

SSRS Data Driven Subscription Step 4

SSRS Data Driven Subscription - Email Body

A problem occurs when  you want the email body to contain several articles separated by new lines. All the new lines in the delivered email will be removed and all the articles will be delivered as a single article.

First Article of Email Body.  Second Article of Email Body.  Third Article of Email Body.

In contrast, when you are creating a normal subscription  the Comment text box is multi line and you will receive a correct multi line email.

SSRS Standard Email Subscription

There is no way to send a simple text as multi line email with several articles.

Solution

As I mentioned above, you are not able to send a simple test as multi line email with several articles. Although, there is workaround and if the email client is accepting HTML, you can put as the email body a simple HTML source code. Then the email body will be correctly rendered in the email client as multi line text.

SELECT
    'somebody@domain.com' AS SubscriberEmail,
    N'
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
  </head>
  <body>
    <p>First Article of Email Body.</p>
    <p>Second Article of Email Body.<p>
    <p>Third Article of Email Body.<p>
  </body>
</html>' AS EmailBody

As I was searching over the Internet I found, that people have the same problem also with SSRS 2005. The issue is solved in SSRS 2008 R2 and in the R2 it works correctly even without the HTML.

Denali side by side installation with SQL 2008 or SQL 2008 R2 corrupts BIDS

Today I’ve just installed a SQL Server Code Name Denali CTP1 as a named instance side by side with instance of SQL Server 2008 R2 to take a look on the new features of Denali

Everything worked fine, unless I started a BIDS and wanted to create a new SSIS project (or open existing one).

Once you try it, you will se a below message.

Corrupted Installation of BIDS cannot open project

The problem is that the CTP1 version of Denali still uses VS2008 shell as Business Intelligence Development studio. In future versions it will be replaced by the VS2010 shell. But now it uses the V2008 shell as SQL Server 2008 or 2008R2 and the Side by Side installation corrupts BIDS.

To correct this, you have to uninstall Denali, and reinstall SQL2008. Then you can install Denali without BIDS.

So to save you trouble, and you want to install Denali side by side with existing SQL 2008 or SQL 2008 R2 installation, you can install all the features of Denali, except the BIDS.