Reading SharePoint List and Survey data using Integration Services package

Sometimes you can come to situation when you need to read data from SharePoint and store them in database on SQL Server.

If you need to read a SharePoint List, then you can use SharePoint List Source and Destination components from CodePlex. There is a quite good presentation how to use it on MSDN site Extracting and Loading SharePoint Data in SQL Server Integration Services

These components work very well for the SharePoint List. Unfortunately you will have no success using them for reading data from the SharePoint Survey, although the structure of the data are nearly the same as for list. Simply these components doesn’t support Survey data. Also it can happen, that you cannot install any third party component you your server and therefore you cannot use the SharePoint List Source and Destination components.

In that case you can use a script which will extract the XML data out of the SharePoint and you have two possibilities how to handle the XML.

  1. You can save the XML file to some temporary location and then process the XML file using the XML Source in the Data Flow task to read data from it.
  2. You can process the XML inside the Script Component in the Data Flow task and direct records to corresponding output.

To retrieve the data you can use a below class snippet (C#), which you put into your Script Component or Scrip Task.

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

As mentioned above put this class as part of the Script Component or Script Task and use the static method SharePointDownloader.GetSharePointListXml method to retrieve the XML data.

The function takes as argument an URL which points to the list/survey you need to download. You can retrieve this link from the .igy file which you get, when you select Export To Spreadsheet action in the SharePoint. Instead opening the file in excel you choose to save it and retrieve the URL from this file.

It is necessary to mention the assignation of the ServerCertificateValidationCallback property of the ServicePointManager class to a newly created internal class, which simply returns true. This is necessary to avoid problems with certificates.

The code snippet also uses the DefaultNetworkCredentials which represents credentials of the account under which the SSIS package will be executed. You can eventually provide other credentials if necessary.

One very important thing related to this snipped is, that in case of downloading Survey data, the account under which the code will be executed needs admin rights to the Survey in SharePoint, otherwise it will have no access to the complete list of responses in the survey.

So finally a complete sample of a Script Task could look similar to below code:

[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

    #region VSTA generated code
    enum ScriptResults
    {
        Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
        Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };

    #endregion

    public void Main()
    {

        string url = "source data URL retrieved from the .igy file of the list/survey";
        string destFile = "c:tempfile.xml";
        try
        {
            XmlDocument data = SharePointDownloader.GetSharePointListXml(url);
            data.Save(destFile);
        }
        catch
        {
            Dts.TaskResult = (int)ScriptResults.Failure;
            return;
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
}

internal class SharePointDownloader
{
    public static XmlDocument GetSharePointListXml(string listUrl)
    {
        ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);
        WebResponse response;
        HttpWebRequest request = (HttpWebRequest)System.Net.HttpWebRequest.Create(listUrl);
        CookieContainer cookieContainer = new CookieContainer();

        request.CookieContainer = cookieContainer;
        request.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        response = request.GetResponse();
        string html;
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            html = sr.ReadToEnd();
            sr.Close();
        }

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);

        XmlDocument destXml = new XmlDocument();
        XmlDeclaration dec = destXml.CreateXmlDeclaration("1.0", null, null);
        destXml.AppendChild(dec);
        XmlNode data = destXml.CreateElement("Data");
        destXml.AppendChild(data);

        foreach (XmlNode node in doc.GetElementsByTagName("z:row"))
        {
            XmlNode row = destXml.CreateElement("row");
            foreach (XmlAttribute attr in node.Attributes)
            {
                XmlAttribute da = destXml.CreateAttribute(attr.Name);
                da.Value = attr.Value;
                row.Attributes.Append(da);
            }
            data.AppendChild(row);
        }

        return destXml;
    }

    internal class CertificateOverride
    {
        public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
        {
            return true;
        }
    }
}

Once you save the data to a temporary destination, you can connect to the xml file using the XML Source in the Data Flow Task. What you will need is to generate the XSD schema from the XML source and keep that XSD schema for future use as in case of any other XML file which needs to be imported by SSIS.

If you would like to process the XML in the Script component, again you simply put the code inside the Script component, retrieve the XML and then go through the XML programmatically and direct the data to appropriate output of the script component.

Of course you can make some fine tuning for the script component to avoid double processing of the file use the core of the GetSharePointListXML and redirect the data directly to and output of the script component when processing the nodes in that part.

Advertisements

6 thoughts on “Reading SharePoint List and Survey data using Integration Services package

  1. Hi Pavel,
    I’ve implemented the above to try to retrieve a Survey list from a sharepoint site, however im getting a error around the;
    ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CertificateOverride.RemoteCertificateValidationCallback);

    Description of error is, no overload for RemoteCertificateValidationCallback

    I have;
    using System.Net.Security;

    So bit confused whats causing the error.

    • It seems you have some bug in the implementation of the internal class CertificateOverride.

      internal class CertificateOverride
      {
          public static bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chaing, SslPolicyErrors sslPolicyErrors)
          {
              return true;
          }
      }
      

      The Parameters of the RemoteCertificateValidationCallback method as to match the RemoteCertificateValidationCallback delegate.

  2. Hey Pavel,

    Thanks a bunch for this tutorial. It works great! For the next step, how would you go about updating or deleting surveys from a Sharepoint Survey list?

    For my specific project, I would like to download the surveys using the method you mentioned above. But then I need to delete the surveys that I just downloaded from the Sharepoint Survey List.

    Thanks,
    Travis

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s