PP.SSIS Custom Components Updated on GitHub

An updated version of the custom SSIS Data Flow and Control Flow components has been published on GitHub.

The update brings seamless Upgrade/Downgrade functionality in VS 2015 and VS 2017 by simply selecting the Destination Platform (SSIS 2012 – SSIS 2017).

Data Flow Components

  • Hash Column Transformation
  • Columns To Xml Transformation
  • RegEx Extraction Transformation
  • Row Number Transformation
  • History Lookup Transformation
  • Lookup Error Aggregation Transformation

Control Flow Components

  • Variables To Xml Task
  • Sleep Task
  • Wait For File Task
  • Wait For Signal Task
  • Wait For Time Task

To build the components, run Visual Studio as Administrator. The build then takes care of registering the components in the GAC and of copying the DLL files, together with the Extensions and Upgrade Mappings XML files, to the appropriate directories.

If Visual Studio is not running as Administrator, copying the above files will fail. You then have to register the components in the GAC manually and copy the DLL files, as well as the Extensions and Upgrade Mappings files, to the appropriate DTS directories yourself.

You can find the proper target directories in the Post Build events settings of the individual projects.


Clone / Copy SSIS Catalog Project Configuration – sp_SSISCloneConfiguration GitHub

When you deploy an SSIS project to the SSIS Catalog, you can configure the deployed project so that you do not need to pass parameter values each time you execute the project's packages. You can configure the project to use an explicitly provided value or a reference to Environment Variables.

This is a very useful feature. However, if you are using multiple environments, like DEV, TEST, QA and PROD, you have to handle that configuration in each of them. To simplify setting up the parameters, you can script the configuration.

The pitfall of scripting the configuration is that you can do it automatically only during the initial setup of the configuration. Once you confirm the configuration values in SSMS, there is no way to script out an existing configuration.

sp_SSISCloneConfiguration

sp_SSISCloneConfiguration is a custom stored procedure that clones (copies) configurations between projects in the same SSISDB catalog. It can also generate reusable scripts so that configurations can easily be re-applied, even on different SSIS servers.

The stored procedure also re-encrypts sensitive values when they are cloned. SSIS uses a different encryption key for each project in the SSIS catalog, so simply copying the encrypted values between projects does not work.

The stored procedure executes in the context of the AllSchemaOwner database user in SSISDB, which ensures access to all areas of SSISDB, including the certificates and symmetric keys needed for scripting. The actual re-application of the configuration script runs in the context of the caller, because the internal SSIS catalog stored procedures for configuring projects need to be executed in the context of Windows credentials.

The create script grants EXECUTE permission to the ssis_admin database role in the SSISDB catalog so that SSIS admins can use the stored procedure.

Be careful: anyone with EXECUTE permission on the stored procedure can see decrypted sensitive values when printing the script. By default the script grants that permission to the ssis_admin database role. If you want to restrict this functionality to sysadmins only, remove the GRANT at the end of the script.

The complete source code for the sp_SSISCloneConfiguration is available in my SQL-Scripts project on GitHub in the SSISDB folder.

The syntax of the stored procedure is as follows:

sp_SSISCloneConfiguration
    @sourceFolder             = 'Source Folder Name'
   ,@sourceProject            = 'Source Project Name'
   ,@sourceObject             = 'Optional Source Object Name'
   ,@destinationFolder        = 'Optional Destination Folder Name'
   ,@destinationProject       = 'Optional Destination Project Name'
   ,@printScript              = 1 --Specifies whether only script should be printed
   ,@decryptSensitiveInScript = 1 --Specifies whether sensitive values should be decrypted in script

When the procedure copies the configuration, it applies the source configurations to the target. Any other configurations that already exist on the target are not touched by the cloning script. To get an exact copy of the source configuration, reset the configurations on the target project first.

The generated script is also easily reusable as there are two variables in the beginning for specification of the destination folder and project. By simple modification of those variables the script will apply the configuration on different projects in different folders.

Parameters

The procedure has several parameters; some are required and some are optional.

The required parameters are @sourceFolder and @sourceProject. The @sourceObject, @destinationFolder, @destinationProject, @printScript and @decryptSensitiveInScript parameters are optional.

@sourceFolder

Specifies the source folder where the source project is located

@sourceProject

Specifies the name of the source project from within source folder to be scripted

@sourceObject

Optional name of the source object. It specifies the name of the object within the project whose configuration should be scripted. It can point to the project name or to an individual package name. When provided, only configurations related to that object are cloned or scripted.

@destinationFolder

Name of the folder of the destination project to which the configuration should be cloned. This parameter is optional; if it is not provided, the source folder name is used.

@destinationProject

Name of the destination project to which the configuration should be cloned. This parameter is optional; if it is not provided, the name of the source project is used. If the destination is not provided or matches the source (both folders and projects are the same), the @printScript parameter is forced to 1 and the script is always generated.

@printScript

Specifies whether a script should be printed to the output window. If yes, you can simply copy the script from the output window and apply it immediately on another SSIS instance.

Default value = 0. This means the procedure executes within the SSISDB catalog and clones the configuration directly, unless printing is enforced because the source matches the destination or the destination is not provided.

@decryptSensitiveInScript

Specifies whether sensitive values should be decrypted in the script. If yes, the values are part of the script in plain-text form; however, they are marked as sensitive and will be encrypted again when the script is applied.

If the parameter is 0 (default), the sensitive values are not scripted out and a NULL value is scripted instead. The user must then fill in those missing values before executing the script.
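The defaulting rules described above for @destinationFolder, @destinationProject and @printScript can be summarised in a short sketch. It is written in Python purely for illustration and is not the procedure's actual T-SQL. The function name resolve_clone_options is hypothetical, and the exact (case-sensitive) comparison is an assumption.

```python
from typing import Optional, Tuple


def resolve_clone_options(
    source_folder: str,
    source_project: str,
    destination_folder: Optional[str] = None,
    destination_project: Optional[str] = None,
    print_script: bool = False,
) -> Tuple[str, str, bool]:
    """Return (destination folder, destination project, print_script) after defaulting."""
    # A missing destination falls back to the corresponding source name.
    dest_folder = destination_folder or source_folder
    dest_project = destination_project or source_project

    # When source and destination are the same project, the script is
    # always printed instead of being applied (assumption: exact match).
    if dest_folder == source_folder and dest_project == source_project:
        print_script = True

    return dest_folder, dest_project, print_script


print(resolve_clone_options("DEV", "DataLoading"))
print(resolve_clone_options("DEV", "DataLoading", destination_folder="TEST"))
```

The first call has no destination, so printing is enforced; the second call targets a different folder and keeps the default of cloning directly.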

Sample Output

Below is a sample script generated by the stored procedure when @printScript = 1 and @decryptSensitiveInScript = 1:

RAISERROR(N'sp_SSISCloneConfiguration v0.10 (2016-12-18) (C) 2016 Pavel Pawlowski', 0, 0) WITH NOWAIT;
RAISERROR(N'=====================================================================', 0, 0) WITH NOWAIT;

DECLARE @destinationFolder nvarchar(128) = N'Test' --Specify Destination Folder Name
DECLARE @destinationProject nvarchar(128) = N'DataLoading' --Specify Destination Project Name

--Checking for destination folder existence
IF NOT EXISTS(SELECT 1 FROM [SSISDB].[catalog].[folders] WHERE [name] = @destinationFolder)
BEGIN
RAISERROR(N'Destination folder [%s] does not exists.', 15, 0, @destinationFolder) WITH NOWAIT;
RETURN;
END
--Checking for destination project existence
IF NOT EXISTS(SELECT 1
FROM [SSISDB].[catalog].[projects] p
INNER JOIN [SSISDB].[catalog].[folders] f ON f.folder_id = p.folder_id
WHERE f.name = @destinationFolder AND p.name = @destinationProject)
BEGIN
RAISERROR(N'Destination project [%s]\[%s] does not exists.', 15, 1, @destinationFolder, @destinationProject) WITH NOWAIT;
RETURN;
END

--Project parameters configuration
DECLARE @var sql_variant

RAISERROR(N'Creating Configuration [SSISDB]\[%s]\[%s]\[DataLoading]\[SourceDB_ConnectionString]', 0, 0, @destinationFolder, @destinationProject) WITH NOWAIT;
EXEC [SSISDB].[catalog].[set_object_parameter_value] @object_type=20, @parameter_name = N'SourceDB_ConnectionString', @object_name = @destinationProject, @folder_name = @destinationFolder, @project_name = @destinationProject, @value_type = 'R', @parameter_value = N'SourceDB_ConnectionString'

RAISERROR(N'Creating Configuration [SSISDB]\[%s]\[%s]\[DataLoading]\[Interval]', 0, 0, @destinationFolder, @destinationProject) WITH NOWAIT;
SET @var = CONVERT(int, N'120');
EXEC [SSISDB].[catalog].[set_object_parameter_value] @object_type=20, @parameter_name = N'Interval', @object_name = @destinationProject, @folder_name = @destinationFolder, @project_name = @destinationProject, @value_type = 'V', @parameter_value = @var

RAISERROR(N'Creating Configuration [SSISDB]\[%s]\[%s]\[DataLoading]\[SecurePassword]', 0, 0, @destinationFolder, @destinationProject) WITH NOWAIT;
SET @var = CONVERT(sql_variant, N'Pa$$w0rd'); --SENSITIVE
EXEC [SSISDB].[catalog].[set_object_parameter_value] @object_type=20, @parameter_name = N'SecurePassword', @object_name = @destinationProject, @folder_name = @destinationFolder, @project_name = @destinationProject, @value_type = 'V', @parameter_value = @var

RAISERROR(N'Creating Configuration [SSISDB]\[%s]\[%s]\[DataLoad.dtsx]\[SourceTableName]', 0, 0, @destinationFolder, @destinationProject) WITH NOWAIT;
SET @var = CONVERT(sql_variant, N'Person');
EXEC [SSISDB].[catalog].[set_object_parameter_value] @object_type=30, @parameter_name = N'SourceTableName', @object_name = N'DataLoad.dtsx', @folder_name = @destinationFolder, @project_name = @destinationProject, @value_type = 'V', @parameter_value = @var
RAISERROR(N'-----------------------------------------------------------------------------', 0, 0) WITH NOWAIT;
RAISERROR(N'There are configurations using Environment varaibles references.', 0, 0) WITH NOWAIT;
RAISERROR(N'DON''T FORGET TO SET ENVIRONMENT REFERENCES for project [%s]\[%s].', 0, 0, @destinationFolder, @destinationProject) WITH NOWAIT
RAISERROR(N'-----------------------------------------------------------------------------', 0, 0) WITH NOWAIT;

Reading SharePoint Surveys and Lists using a SQL Server Integration Services Package – Updated

On www.mssqltips.com (Reading SharePoint Surveys and Lists using a SQL Server Integration Services SSIS Package) you can find an updated version of my original post Reading SharePoint List and Survey data using Integration Services package.

The updated version on mssqltips describes an easier way of reading the Surveys and Lists data, with simplified use of the URL protocol and a more elegant way of extracting the final data using the T-SQL XML methods.

The post also contains a complete demo solution.

Manipulating SSIS packages using PowerShell

You have probably been in a situation where you have an SSIS project with a large number of packages and you need to add, for example, a parameter or variable to all or most of them. Normally you have to open the packages one by one, add the variable or parameter and specify all of its properties in each package.

Using PowerShell will save you a lot of manual work and tons of clicks inside SQL Server Data Tools (SSDT) or Business Intelligence Development Studio (BIDS), because PowerShell allows you to automate the manual work. It is very easy to manipulate the packages with PowerShell, as SSIS packages are pure XML files and PowerShell has good XML support.

For writing, testing and running PowerShell scripts I suggest you download and install PowerGUI, which has great IntelliSense support and is probably the best PowerShell editor available. I also suggest installing the Run AddOn, which allows you to execute only a selected part of a script.

Before you start executing the scripts, ensure that the ExecutionPolicy is set to allow script execution. You have to set it with elevated Administrator privileges.

#Enable local execution of unsigned scripts
Set-ExecutionPolicy RemoteSigned

Example how to manipulate SSIS using PowerShell

In this example we will manipulate SSIS 2012 packages and we will add a new parameter to the packages.

The first thing you have to do is retrieve the list of SSIS packages to be processed. This can easily be done with the Get-ChildItem cmdlet.

$files = Get-ChildItem "C:\MySSISProjects\*.dtsx"

Once we have the list of files to be processed, we can process them one by one using the foreach statement. In this example we will add an SSISTaskID package parameter.

#Get list of all relevant SSIS packages
$files = Get-ChildItem "C:\MyProjects\MySSISProject\*.dtsx"

#process file by file
foreach($file in $files)
{
    #get the content of SSIS package as XML
    $dts = [xml](Get-Content $file.FullName)

    #create XmlNamespaceManager
    $mng = [System.Xml.XmlNamespaceManager]($dts.NameTable)
    #add a DTS namespace to the XmlNamespaceManager
    $mng.AddNamespace("DTS", "www.microsoft.com/SqlServer/Dts")

    #use XPath query to get DTS:PackageParameters node
    $params = $dts.SelectSingleNode("/DTS:Executable/DTS:PackageParameters[1]", $mng)

    #use XPath query to get eventual existing SSISTaskID parameter
    $param = $dts.SelectSingleNode("/DTS:Executable/DTS:PackageParameters/DTS:PackageParameter[@DTS:ObjectName='SSISTaskID'][1]", $mng)

    if ($param -eq $null) #parameter does not exists
    {
        #Create New Package parameter with prefix DTS and correct namespace uri
        $param = $dts.CreateElement("DTS", "PackageParameter", "www.microsoft.com/SqlServer/Dts")

        $attr = $dts.CreateAttribute("DTS", "CreationName", "www.microsoft.com/SqlServer/Dts")
        $param.Attributes.Append($attr)

        #Parameter Data Type
        $attr = $dts.CreateAttribute("DTS", "DataType", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "3" #DataType = 3 is equal to integer
        $param.Attributes.Append($attr)

        ##Description
        $attr = $dts.CreateAttribute("DTS", "Description", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "This is a TaskID"
        $param.Attributes.Append($attr)

        #DTSID - unique GUID for created parameter
        $attr = $dts.CreateAttribute("DTS", "DTSID", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "{" + [System.Guid]::NewGuid().toString().ToUpper() + "}" #Each parameter has unique GUID in SSIS package, so let's generate one
        $param.Attributes.Append($attr)

        #parameter name
        $attr = $dts.CreateAttribute("DTS", "ObjectName", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "SSISTaskID"
        $param.Attributes.Append($attr)

        #Parameter Value property - setting the parameter value
        $paramValue = $dts.CreateElement("DTS", "Property", "www.microsoft.com/SqlServer/Dts");

        #setting attributes of Parameter Value Property
        $attr = $dts.CreateAttribute("DTS", "DataType", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "3"
        $paramValue.Attributes.Append($attr)

        $attr = $dts.CreateAttribute("DTS", "Name", "www.microsoft.com/SqlServer/Dts")
        $attr.Value = "ParameterValue"
        $paramValue.Attributes.Append($attr)

        #Set the value of the ParameterValue property
        $paramValue.InnerText = "0"

        #add the property to the Parameter
        $param.AppendChild($paramValue)     

        #Add the Parameter to the Params collection
        $params.AppendChild($param)
    }       

    ################################################################################################
    #  Once the Parameter is created it is time to write the package back to disk
    ################################################################################################

    #Create XmlWriterSettings as we want to format the SSIS package properly
    [System.Xml.XmlWriterSettings] $settings = New-Object -TypeName System.Xml.XmlWriterSettings

    $settings.Indent = $true #Indent child nodes
    $settings.NewLineOnAttributes = $true #put each attribute on a separate line

    #Create XmlWriter with appropriate settings and path to overwrite the original package
    [System.Xml.XmlWriter]$writer = [System.Xml.XmlWriter]::Create($file.FullName, $settings)

    #Save the package using the XmlWriter
    $dts.Save($writer)

    #Close the writer
    $writer.Close()
}
The script uses an XPath query to get the relevant parameter nodes. The same approach can be used to locate any node in the XML source. For more details on XPath, see the XPath Reference on MSDN. In our example we take the first DTS:PackageParameters node under the DTS:Executable node. To query correctly we have to use an XmlNamespaceManager, because all nodes in an SSIS package are prefixed with the DTS namespace.

To check whether the parameter we want to add already exists, we again use XPath with an appropriate filter to find it by name. If the parameter doesn't exist, we create a new one.

It is easy to find out which attributes and child nodes such a parameter needs. Create the parameter using BIDS or SSDT in one of the SSIS packages, then look at the source code of the package, where you will find all the necessary information. For example, in our case the parameter node in the source XML looks as below:
<DTS:PackageParameter
    DTS:CreationName=""
    DTS:DataType="3"
    DTS:Description="This is a TaskID"
    DTS:DTSID="{3252C760-114A-4E8F-86EB-D3E6BAACA0E4}"
    DTS:ObjectName="SSISTaskID">
    <DTS:Property
    DTS:DataType="3"
    DTS:Name="ParameterValue">0</DTS:Property>
</DTS:PackageParameter>

So we had to create a PackageParameter node with the CreationName, DataType, Description, DTSID and ObjectName attributes, and with a child Property node that represents the default value of the parameter.

As each parameter, variable or any other part of an SSIS package has its own unique DTSID property, which is a GUID, we used the [System.Guid]::NewGuid() method to generate a unique DTSID for our parameter.
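The same node can also be built outside PowerShell. The following is a minimal sketch in Python's standard xml.etree.ElementTree, included only to show the structure of the element (namespace-qualified attributes, a brace-wrapped upper-case GUID, and the child Property node). The helper names dts, new_dtsid and build_package_parameter are hypothetical and not part of the original script.

```python
import uuid
import xml.etree.ElementTree as ET

# Namespace URI used by SSIS packages (same value as in the PowerShell script).
DTS_NS = "www.microsoft.com/SqlServer/Dts"
ET.register_namespace("DTS", DTS_NS)


def dts(name: str) -> str:
    """Qualify a local name with the DTS namespace in ElementTree's {uri}name form."""
    return f"{{{DTS_NS}}}{name}"


def new_dtsid() -> str:
    """Generate a DTSID: an upper-case GUID wrapped in braces."""
    return "{" + str(uuid.uuid4()).upper() + "}"


def build_package_parameter(name: str, description: str, value: int) -> ET.Element:
    """Build a DTS:PackageParameter element with an integer (DataType 3) default value."""
    param = ET.Element(dts("PackageParameter"), {
        dts("CreationName"): "",
        dts("DataType"): "3",
        dts("Description"): description,
        dts("DTSID"): new_dtsid(),
        dts("ObjectName"): name,
    })
    # The child Property node carries the default value of the parameter.
    prop = ET.SubElement(param, dts("Property"), {
        dts("DataType"): "3",
        dts("Name"): "ParameterValue",
    })
    prop.text = str(value)
    return param


param = build_package_parameter("SSISTaskID", "This is a TaskID", 0)
print(ET.tostring(param, encoding="unicode"))
```

The printed element uses the DTS prefix because the namespace was registered up front; the DTSID differs on every run.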

Finally, once the changes to the SSIS package are applied, it is necessary to write it back to the file system. By default the source of an SSIS package is indented and each attribute is written on a separate line. SSIS packages are written that way to be more human-readable and to allow easy processing and comparison by versioning systems.

To achieve the same formatting we have to create and use instance of the XmlWriter class with appropriate XmlWriterSettings. As you can see from the script we set Indent and NewLineOnAttributes properties to true.

Once the writer is constructed, we can write the package back to the file system. It is also important to close the writer once the saving is done otherwise you will keep open handles to the processed files.

In the example we are also overwriting the original package as we specify the $file.FullName as path for the XmlWriter class. If we want, we can provide whatever name for the package and save it under different name. In case we would like to add a "_modified" suffix to the file name, we can use:

[System.IO.Path]::Combine($file.DirectoryName, $file.BaseName + "_modified") + $file.Extension

Conclusion

As we can see from the example, manipulating SSIS packages using PowerShell is quite easy. After a little practice, writing such modifications is also very quick and will save you a lot of manual work if you have to modify a large number of packages. You can alter the above script slightly to do nearly any modification you need.

This example was done on SSIS 2012 package source, but it can also be applied relatively easily to older versions such as SSIS 2005, although the XML source of those packages is not as easy to read as the source of SSIS 2012.

SSIS 2008 Columns To XML Data Flow Transformation

This article presents a custom component for transforming columns inside an SSIS Data Flow into XML.

If you write ETL processes using SSIS, you often come to a situation where you need to log errors or other problems that can occur during the ETL process. An example is a data conversion failure. In that situation you usually want to know exactly which record caused the error and log such conversion errors into a dedicated error table.

A problem occurs when you have a lot of tables and want to see which data were being converted and caused a particular error. SSIS requires sources and destinations to be strongly mapped, and different sources have different structures, so you end up with a lot of different error tables, generally one per source structure in which an error can occur.

This situation can easily be solved by a custom Data Flow component which converts all source columns into XML. You then need only one error table, containing e.g. Error Code, Error Column, Description and an XML field holding all the data of the row that caused the error. This greatly simplifies error logging and error log monitoring, as you need to monitor only a single table.

So the SSIS Data Flow with the Custom Columns to XML Transformation could look like the one below.

Columns To Xml Data Flow

ColumnsToXML Properties

ColumnsToXML Input Columns

You can provide the SourceID and SourceName properties, which then become part of the produced XML so you can easily identify the source. In the Input Columns you select the columns you want to be written to the XML. You can also specify an OutputAlias, in which case the column names written into the XML will be the ones specified in the OutputAlias property.

The Log written using that component and the XML produced will look like the ones below.

ColumnsToXML Log

<row sourceID="1" sourceName="Test Data Conversion">
  <Column name="Name" id="235" lineageId="72">Joseph</Column>
  <Column name="ValueInt" id="236" lineageId="75">2</Column>
  <Column name="ValueDate" id="237" lineageId="78">2012/01/35</Column>
</row>

Using the .value method of the XML data type, we can easily query the necessary information from the log table for any source in the table. In our sample scenario we would like to query the Name, ValueInt and ValueDate columns stored as XML. This can easily be achieved with the SQL query below:

SELECT
    [RowID]
    ,[ErrorCode]
    ,[ErrorColumn]
    ,[ColumnsXMLData].value('data((/row/Column[@name="Name"])[1])', 'varchar(128)') AS Name
    ,[ColumnsXMLData].value('data((/row/Column[@name="ValueInt"])[1])', 'varchar(10)') AS ValueInt
    ,[ColumnsXMLData].value('data((/row/Column[@name="ValueDate"])[1])', 'varchar(10)') AS ValueDate
FROM [TestDB].[dbo].[ETLErrorLog]
WHERE [ColumnsXMLData].value('data((/row/@sourceID)[1])', 'int') = 1
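If the log rows are consumed outside SQL Server, the same values can be pulled from the XML in application code. The following Python sketch assumes the row format shown above; the function name row_to_dict is hypothetical and only illustrates the lookup that the .value calls perform.

```python
import xml.etree.ElementTree as ET

# Sample row in the format produced by the component (copied from the log above).
SAMPLE = """<row sourceID="1" sourceName="Test Data Conversion">
  <Column name="Name" id="235" lineageId="72">Joseph</Column>
  <Column name="ValueInt" id="236" lineageId="75">2</Column>
  <Column name="ValueDate" id="237" lineageId="78">2012/01/35</Column>
</row>"""


def row_to_dict(row_xml: str) -> dict:
    """Parse one logged row into its source identification and column values."""
    row = ET.fromstring(row_xml)
    # Values stay as text: the logged data may be exactly what failed to convert.
    columns = {col.get("name"): col.text for col in row.findall("Column")}
    return {
        "sourceID": int(row.get("sourceID")),
        "sourceName": row.get("sourceName"),
        "columns": columns,
    }


parsed = row_to_dict(SAMPLE)
print(parsed["columns"]["ValueDate"])
```

Keeping the column values as strings mirrors the varchar types used in the query, since the invalid date in the sample could not be parsed as a date anyway.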

As you can see from the samples above, you can easily create a single error table for any ETL logging and greatly simplify the logging, as you do not need to create error tables with a separate structure for each source.

ColumnsToXML Implementation

The ColumnsToXmlTransform is a custom implementation of the PipelineComponent. It is developed in C# for .NET framework 3.5 and uses Linq to XML for creating the XML output.

The component processes selected input columns, generates the XML and stores it in the output column called ColumnsXMLData.

Besides the ColumnsXMLData output column, the component has two properties (SourceID and SourceName). They allow you to identify the source of a row once it is written as XML into the database, and they simplify querying for data coming from particular sources, as shown above.
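To make the output format concrete, here is a rough Python sketch of the XML generation step. It is not the component's C# implementation: the InputColumn type and the columns_to_xml function are hypothetical, and writing NULL values as empty elements is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, NamedTuple, Optional


class InputColumn(NamedTuple):
    name: str
    column_id: int
    lineage_id: int
    value: object
    output_alias: Optional[str] = None  # corresponds to the OutputAlias property


def columns_to_xml(source_id: int, source_name: str, columns: Iterable[InputColumn]) -> str:
    """Serialise the selected columns of one row into the <row>/<Column> format."""
    row = ET.Element("row", {"sourceID": str(source_id), "sourceName": source_name})
    for col in columns:
        element = ET.SubElement(row, "Column", {
            # The alias, when given, replaces the column name in the XML.
            "name": col.output_alias or col.name,
            "id": str(col.column_id),
            "lineageId": str(col.lineage_id),
        })
        # Assumption: NULL values are written as empty elements.
        element.text = "" if col.value is None else str(col.value)
    return ET.tostring(row, encoding="unicode")


xml_text = columns_to_xml(1, "Test Data Conversion", [
    InputColumn("Name", 235, 72, "Joseph"),
    InputColumn("ValueInt", 236, 75, 2, output_alias="Value"),
])
print(xml_text)
```

The second column shows the effect of an alias: it is written with name="Value" while its id and lineageId still point to the original input column.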

Once the component is built, it is necessary to install it into the Global Assembly Cache and to the PipelineComponents folder in the SQL Server Integration services installation path on the machine on which the SSIS using this component will be running.

To install it into the GAC you can use gacutil.exe, which is part of the Windows SDK.

The component is now publicly available with other components on GitHub as projects for SSIS 2008 – SSIS 2016.

If you run the development environment with elevated Administrator privileges, the Post Build events in the project will install the assembly into the GAC (this assumes you have Windows SDK 7.0 installed; if you have a newer version, modify the path to gacutil.exe accordingly) and also copy the assembly into the PipelineComponents folder of the default SQL Server Integration Services installation. If your installation directory differs, please modify the paths in the Post Build events accordingly.

Once you have compiled the code, installed it in the GAC and copied it to the PipelineComponents folder, you can add the component to the data flow items palette. Right-click on the palette and select Choose Items.

Choose Data Flow Transformation Items Pop-up

Choose SSIS Data Flow Items

Once you select the component, it will appear on the toolbox and you can start using it.

Columns To XML On Palette

Once I have more time, I will try to create an installer for the component. In the meantime, any comments are welcome.

Reading SharePoint List and Survey data using Integration Services package

Sometimes you need to read data from SharePoint and store them in a database on SQL Server.

If you need to read a SharePoint List, you can use the SharePoint List Source and Destination components from CodePlex. There is a quite good presentation of how to use them on the MSDN site: Extracting and Loading SharePoint Data in SQL Server Integration Services.

These components work very well for SharePoint Lists. Unfortunately, you will have no success using them to read data from a SharePoint Survey, although the structure of the data is nearly the same as for a list. These components simply don't support Survey data. It can also happen that you are not allowed to install any third-party component on your server and therefore cannot use the SharePoint List Source and Destination components.

In that case you can use a script which will extract the XML data out of the SharePoint and you have two possibilities how to handle the XML.

  1. You can save the XML file to some temporary location and then process the XML file using the XML Source in the Data Flow task to read data from it.
  2. You can process the XML inside the Script Component in the Data Flow task and direct records to corresponding output.

To retrieve the data you can use the class snippet below (C#), which you put into your Script Component or Script Task.

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

As mentioned above, put this class into the Script Component or Script Task and use the static method SharePointDownloader.GetSharePointListXml to retrieve the XML data.

The function takes as its argument a URL which points to the list or survey you need to download. You can retrieve this link from the .iqy file which you get when you select the Export To Spreadsheet action in SharePoint. Instead of opening the file in Excel, choose to save it and retrieve the URL from the file.

Note that the ServerCertificateValidationCallback property of the ServicePointManager class is assigned a callback from a newly created internal class which simply returns true. This means every server certificate is accepted, which avoids problems with certificates that would otherwise fail validation.

The code snippet also uses DefaultNetworkCredentials, which represents the credentials of the account under which the SSIS package will be executed. You can provide other credentials if necessary.

One very important thing related to this snippet: when downloading Survey data, the account under which the code is executed needs admin rights to the Survey in SharePoint, otherwise it will have no access to the complete list of responses in the survey.
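The reshaping step inside GetSharePointListXml (copying the attributes of every z:row element into a plain row element under a Data root) can be illustrated separately from the download. The sketch below uses Python only for illustration; the sample document, including its namespace declarations and ows_ attribute names, is an assumption about what the server returns, and flatten_rows is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the rowset XML returned by SharePoint; attribute names are made up.
SAMPLE = """<xml xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data>
    <z:row ows_ID="1" ows_Title="First answer" />
    <z:row ows_ID="2" ows_Title="Second answer" />
  </rs:data>
</xml>"""

# ElementTree addresses namespaced tags as {namespace-uri}localname.
Z_ROW = "{#RowsetSchema}row"


def flatten_rows(rowset_xml: str) -> ET.Element:
    """Copy the attributes of every z:row into a <row> element under <Data>."""
    source = ET.fromstring(rowset_xml)
    data = ET.Element("Data")
    for node in source.iter(Z_ROW):
        ET.SubElement(data, "row", dict(node.attrib))
    return data


result = flatten_rows(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```

The result has no namespaces left, which keeps the later XSD generation and XML Source mapping simple.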

So finally a complete sample of a Script Task could look similar to below code:

[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

    #region VSTA generated code
    enum ScriptResults
    {
        Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
        Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };

    #endregion

    public void Main()
    {

        string url = "source data URL retrieved from the .iqy file of the list/survey";
        string destFile = @"c:\temp\file.xml";
        try
        {
            XmlDocument data = SharePointDownloader.GetSharePointListXml(url);
            data.Save(destFile);
        }
        catch
        {
            Dts.TaskResult = (int)ScriptResults.Failure;
            return;
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
}

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        // Accepts every server certificate, including invalid ones.
        // Use only against servers you trust.
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

Once you save the data to a temporary destination, you can connect to the XML file using the XML Source in a Data Flow Task. You will need to generate an XSD schema from the XML file and keep that schema for future use, as with any other XML file imported by SSIS.

If you would rather process the XML in a Script Component, put the same code inside the Script Component, retrieve the XML, then walk through it programmatically and direct the data to the appropriate output of the component.

Of course, you can fine-tune the Script Component to avoid processing the data twice: reuse the core of GetSharePointListXml and send each row directly to an output of the Script Component while iterating over the nodes.
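As a rough illustration of that fine-tuning, here is a minimal sketch that reads the z:row elements one at a time with an XmlReader instead of building two XmlDocument instances. The SharePointRowReader class, the ReadRows method and the sample column names (ows_ID, ows_Title) are hypothetical and used only for illustration. In a real Script Component the loop body would add a row to the output buffer generated for your component rather than write to the console.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

internal static class SharePointRowReader
{
    // Streams <z:row> elements one by one; each row is returned as attribute name/value pairs.
    public static IEnumerable<Dictionary<string, string>> ReadRows(TextReader source)
    {
        using (XmlReader reader = XmlReader.Create(source))
        {
            while (reader.Read())
            {
                // Match on the qualified name, as the original GetElementsByTagName("z:row") does.
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "z:row")
                    continue;

                Dictionary<string, string> row = new Dictionary<string, string>();
                if (reader.MoveToFirstAttribute())
                {
                    do
                    {
                        row[reader.Name] = reader.Value;
                    } while (reader.MoveToNextAttribute());
                    reader.MoveToElement();
                }
                yield return row;
            }
        }
    }

    private static void Main()
    {
        // Hypothetical sample shaped like the list XML; real column names will differ.
        string sample =
            "<xml xmlns:rs='urn:schemas-microsoft-com:rowset' xmlns:z='#RowsetSchema'>" +
            "<rs:data>" +
            "<z:row ows_ID='1' ows_Title='First' />" +
            "<z:row ows_ID='2' ows_Title='Second' />" +
            "</rs:data></xml>";

        foreach (Dictionary<string, string> row in ReadRows(new StringReader(sample)))
        {
            // In a Script Component, this is where a row would be added to the output buffer.
            Console.WriteLine(row["ows_ID"] + ": " + row["ows_Title"]);
        }
    }
}
```

To use it against the real list, you would pass a StreamReader over the response stream to ReadRows in place of the StringReader.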

SSIS with Excel Source/Destination on x64 SQL Server Instance

Not everyone knows that the Excel file format is not supported by the x64 version of SQL Server, and the same applies to the SSIS runtime. This applies not only to Excel but to all data sources for which only 32-bit drivers are available (e.g. ODBC drivers of external data sources).

Although you can develop an SSIS package with an Excel source/destination in Business Intelligence Development Studio (BIDS) on an x64 machine and design the whole ETL process there, when you try to debug such a package you will receive an error message saying that the Excel source is not supported on x64.

Debugging on x64 machine

If you want to debug a package on an x64 machine, you have to configure the environment not to use the x64 runtime. BIDS uses the x64 runtime by default on x64 machines, so it has to be disabled. You can do this in the SSIS project properties, on the Debugging page.

SSIS Project Properties in BIDS

Once you set Run64BitRuntime to False, you will be able to run and debug the package in BIDS.
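If you want to verify which runtime actually executes the package, one option is to check the pointer size of the running process. The following is only a sketch: the RuntimeBitness class is a hypothetical helper, and inside a Script Task you would probably report the result through the task's logging or events instead of the console.

```csharp
using System;

internal static class RuntimeBitness
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
    public static string Describe()
    {
        return IntPtr.Size == 4 ? "32-bit" : "64-bit";
    }

    private static void Main()
    {
        Console.WriteLine("Running as a " + Describe() + " process.");
    }
}
```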

Running package on x64 version of SQL Server

As mentioned above, you can set the SSIS project properties so that BIDS does not use the x64 runtime when running a package containing an Excel source/destination. Unfortunately, this setting applies only to design-time debugging in BIDS. When you deploy the package to the server and try to run it directly or from within an Agent job, execution will fail because Excel is not supported by the x64 version of SQL Server.

SQL Server 2008 and newer

On SQL Server 2008 and newer the solution is easy, because the Job Step Properties dialog for the SQL Server Integration Services Package type has an option to execute the package using the 32-bit runtime. The option can be found on the Execution options tab. Once you check “Use 32 bit runtime”, the SSIS package will run correctly on the x64 version of SQL Server 2008 and above. The option simply sets the /X86 argument of the DTExec utility; once it is checked, you can see the argument on the Command Line tab.

Agent Job Step Properties

SQL Server 2005

The situation is quite different on SQL Server 2005, because the Job Step Properties dialog does not have an option to run the package using the x86 runtime. Although it may seem that we are blocked and that such an SSIS package cannot run on the x64 version of SQL Server 2005, there is a workaround. The workaround is to use an Operating System (CmdExec) step and call the x86 version of the DTExec utility, which is also installed with the x64 instance of SQL Server. It is located in the SQL Server installation directory under “Program Files (x86)”, in the DTS\Binn directory.

To simplify configuration, first choose the SQL Server Integration Services Package type and select the location of the SSIS package you want to run.

New Job Step

Once you have selected the package and configured everything necessary for its execution, switch to the Command Line tab, where you can see all the arguments needed to execute the DTExec utility. If you already know all the arguments, you can skip this step and configure the CmdExec step directly, but doing it this way saves you from writing the arguments manually.

New Job Step Command Line

After setting all the necessary parameters, change the step Type to Operating System (CmdExec). All the arguments for DTExec appear in the Command text box.

New Job Step - CmdExec

Now you only need to add the x86 version of the DTExec utility, including its full path, in front of the arguments.

New Job Step - CmdExec after update
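The resulting command is simply the quoted path to the 32-bit DTExec followed by the arguments copied from the Command Line tab. The sketch below composes such a command. The CmdExecCommandBuilder class is hypothetical, the path is an assumption based on a default SQL Server 2005 installation (version folder 90), and the package path in the arguments is made up for illustration, so adjust all of them to your server.

```csharp
using System;

internal static class CmdExecCommandBuilder
{
    // Assumed default location of the 32-bit DTExec on a SQL Server 2005 installation.
    public const string DefaultX86DtExecPath =
        @"C:\Program Files (x86)\Microsoft SQL Server\90\DTS\Binn\DTExec.exe";

    // Prepends the quoted DTExec path to the arguments taken from the Command Line tab.
    public static string Build(string dtExecPath, string arguments)
    {
        if (string.IsNullOrEmpty(dtExecPath))
            throw new ArgumentException("Path to DTExec is required.", "dtExecPath");

        // The path contains spaces, so it has to be enclosed in double quotes.
        return "\"" + dtExecPath + "\" " + (arguments ?? string.Empty).Trim();
    }

    private static void Main()
    {
        // Hypothetical arguments, similar to what the Command Line tab might show.
        string arguments = "/FILE \"C:\\Packages\\ExcelImport.dtsx\" /REPORTING E";
        Console.WriteLine(Build(DefaultX86DtExecPath, arguments));
    }
}
```

The printed line is what would go into the Command text box of the CmdExec step.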

After this, you have an SSIS package with an Excel source/destination (or any other data source/destination with only 32-bit drivers) configured to run on an x64 instance of SQL Server.