Reading SharePoint Surveys and Lists using a SQL Server Integration Services Package – Updated

On www.mssqltips.com (Reading SharePoint Surveys and Lists using a SQL Server Integration Services SSIS Package) you can find an updated version of my original post Reading SharePoint List and Survey data using Integration Services package.

The updated version on mssqltips describes an easier way of reading the Surveys and Lists data with simplified use of URL protocol an more elegant way of final data extraction using the T-SQL XML methods.

Post also contains complete demo solution.

Reading SharePoint List and Survey data using Integration Services package

Sometimes you can come to situation when you need to read data from SharePoint and store them in database on SQL Server.

If you need to read a SharePoint List, then you can use SharePoint List Source and Destination components from CodePlex. There is a quite good presentation how to use it on MSDN site Extracting and Loading SharePoint Data in SQL Server Integration Services

These components work very well for the SharePoint List. Unfortunately you will have no success using them for reading data from the SharePoint Survey, although the structure of the data are nearly the same as for list. Simply these components doesn’t support Survey data. Also it can happen, that you cannot install any third party component you your server and therefore you cannot use the SharePoint List Source and Destination components.

In that case you can use a script which will extract the XML data out of the SharePoint and you have two possibilities how to handle the XML.

  1. You can save the XML file to some temporary location and then process the XML file using the XML Source in the Data Flow task to read data from it.
  2. You can process the XML inside the Script Component in the Data Flow task and direct records to corresponding output.

To retrieve the data you can use a below class snippet (C#), which you put into your Script Component or Scrip Task.

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

As mentioned above put this class as part of the Script Component or Script Task and use the static method SharePointDownloader.GetSharePointListXml method to retrieve the XML data.

The function takes as argument an URL which points to the list/survey you need to download. You can retrieve this link from the .igy file which you get, when you select Export To Spreadsheet action in the SharePoint. Instead opening the file in excel you choose to save it and retrieve the URL from this file.

It is necessary to mention the assignation of the ServerCertificateValidationCallback property of the ServicePointManager class to a newly created internal class, which simply returns true. This is necessary to avoid problems with certificates.

The code snippet also uses the DefaultNetworkCredentials which represents credentials of the account under which the SSIS package will be executed. You can eventually provide other credentials if necessary.

One very important thing related to this snipped is, that in case of downloading Survey data, the account under which the code will be executed needs admin rights to the Survey in SharePoint, otherwise it will have no access to the complete list of responses in the survey.

So finally a complete sample of a Script Task could look similar to below code:

[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

    #region VSTA generated code
    enum ScriptResults
    {
        Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
        Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };

    #endregion

    public void Main()
    {

        string url = "source data URL retrieved from the .igy file of the list/survey";
        string destFile = "c:tempfile.xml";
        try
        {
            XmlDocument data = SharePointDownloader.GetSharePointListXml(url);
            data.Save(destFile);
        }
        catch
        {
            Dts.TaskResult = (int)ScriptResults.Failure;
            return;
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
}

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

Once you save the data to a temporary destination, you can connect to the xml file using the XML Source in the Data Flow Task. What you will need is to generate the XSD schema from the XML source and keep that XSD schema for future use as in case of any other XML file which needs to be imported by SSIS.

If you would like to process the XML in the Script component, again you simply put the code inside the Script component, retrieve the XML and then go through the XML programmatically and direct the data to appropriate output of the script component.

Of course you can make some fine tuning for the script component to avoid double processing of the file use the core of the GetSharePointListXML and redirect the data directly to and output of the script component when processing the nodes in that part.