Retrieve Part of XML File

Today I was asked how to restructure an XML file and save the desired result as another XML file.

The XML file contains many unnecessary tags that should be removed, so I did this with C# code.


This is the input XML:

<?xml version="1.0" encoding="utf-16"?>
<DataTable>
  <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
    <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:MainDataTable="Table" msdata:UseCurrentLocale="true">
      <xs:complexType>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="Table">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="PersonId" type="xs:long" minOccurs="0" />
                <xs:element name="StudentId" type="xs:long" minOccurs="0" />
                <xs:element name="FaAliyasFName" type="xs:string" minOccurs="0" />
                <xs:element name="FaAliyasLName" type="xs:string" minOccurs="0" />
                <xs:element name="BornInCity" type="xs:string" minOccurs="0" />
                <xs:element name="BornInCountry" type="xs:int" minOccurs="0" />
                <xs:element name="BornInProvince" type="xs:int" minOccurs="0" />
                <xs:element name="DateIn" type="xs:dateTime" minOccurs="0" />
                <xs:element name="EnAliyasFName" type="xs:string" minOccurs="0" />
                <xs:element name="EnAliyasLName" type="xs:string" minOccurs="0" />
                <xs:element name="EnFName" type="xs:string" minOccurs="0" />
                <xs:element name="EnLName" type="xs:string" minOccurs="0" />
                <xs:element name="HealthId" type="xs:int" minOccurs="0" />
                <xs:element name="MarriageId" type="xs:int" minOccurs="0" />
                <xs:element name="NoNational" type="xs:string" minOccurs="0" />
                <xs:element name="PassExpire" type="xs:dateTime" minOccurs="0" />
                <xs:element name="PassIssuance" type="xs:dateTime" minOccurs="0" />
                <xs:element name="PassNumber" type="xs:string" minOccurs="0" />
                <xs:element name="PursueCode" type="xs:string" minOccurs="0" />
                <xs:element name="ShIssunce" type="xs:string" minOccurs="0" />
                <xs:element name="ShNo" type="xs:string" minOccurs="0" />
                <xs:element name="Status" type="xs:short" minOccurs="0" />
                <xs:element name="SolicitorshipId" type="xs:int" minOccurs="0" />
                <xs:element name="FaAliyasMName" type="xs:string" minOccurs="0" />
                <xs:element name="EnAliyasMName" type="xs:string" minOccurs="0" />
                <xs:element name="NaAliyasFName" type="xs:string" minOccurs="0" />
                <xs:element name="NaAliyasLName" type="xs:string" minOccurs="0" />
                <xs:element name="NaAliyasMName" type="xs:string" minOccurs="0" />
                <xs:element name="EnMName" type="xs:string" minOccurs="0" />
                <xs:element name="NaFName" type="xs:string" minOccurs="0" />
                <xs:element name="NaLName" type="xs:string" minOccurs="0" />
                <xs:element name="NaMName" type="xs:string" minOccurs="0" />
                <xs:element name="FName" type="xs:string" minOccurs="0" />
                <xs:element name="LName" type="xs:string" minOccurs="0" />
                <xs:element name="MName" type="xs:string" minOccurs="0" />
                <xs:element name="NationId" type="xs:int" minOccurs="0" />
                <xs:element name="GHozeId" type="xs:int" minOccurs="0" />
                <xs:element name="GUniId" type="xs:int" minOccurs="0" />
                <xs:element name="Expr1" type="xs:long" minOccurs="0" />
                <xs:element name="FaMarriage" type="xs:string" minOccurs="0" />
                <xs:element name="EnMarriage" type="xs:string" minOccurs="0" />
                <xs:element name="ArMarriage" type="xs:string" minOccurs="0" />
                <xs:element name="ArNation" type="xs:string" minOccurs="0" />
                <xs:element name="EnNation" type="xs:string" minOccurs="0" />
                <xs:element name="FaNation" type="xs:string" minOccurs="0" />
                <xs:element name="FaGender" type="xs:string" minOccurs="0" />
                <xs:element name="EnGender" type="xs:string" minOccurs="0" />
                <xs:element name="ArGender" type="xs:string" minOccurs="0" />
                <xs:element name="Is_Sadat" type="xs:boolean" minOccurs="0" />
                <xs:element name="BirthDate" type="xs:dateTime" minOccurs="0" />
                <xs:element name="Code" type="xs:string" minOccurs="0" />
                <xs:element name="Accept_Date" type="xs:dateTime" minOccurs="0" />
                <xs:element name="GenderId" type="xs:int" minOccurs="0" />
                <xs:element name="ReligionId" type="xs:int" minOccurs="0" />
                <xs:element name="FatherName" type="xs:string" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <DocumentElement>
      <Table diffgr:id="Table1" msdata:rowOrder="0">
        <PersonId>1</PersonId>
        <StudentId>1</StudentId>
        <FaAliyasFName>اکبر </FaAliyasFName>
        <EnFName>MAGERRAM</EnFName>
        <EnLName>KAZEM OV</EnLName>
        <MarriageId>1</MarriageId>
        <FName>محرم</FName>
        <LName>كاظم</LName>
        <NationId>111</NationId>
        <Expr1>1</Expr1>
        <FaMarriage>نامشخص</FaMarriage>
        <EnMarriage>نامشخص</EnMarriage>
        <ArMarriage>نامشخص</ArMarriage>
        <FaNation>ایران</FaNation>
        <FaGender>مرد</FaGender>
        <EnGender>مرد</EnGender>
        <ArGender>مرد</ArGender>
        <Is_Sadat>false</Is_Sadat>
        <BirthDate>1976-03-21T00:00:00-07:00</BirthDate>
        <Code>1111</Code>
        <GenderId>2</GenderId>
        <ReligionId>6</ReligionId>
      </Table>
      <Table diffgr:id="Table2" msdata:rowOrder="1">
        <PersonId>2</PersonId>
        <StudentId>2</StudentId>
        <EnFName>tahsin</EnFName>
        <EnLName>tahmaz ov</EnLName>
        <MarriageId>1</MarriageId>
        <FName>تحسين</FName>
        <LName>تحمازاف</LName>
        <NationId>111</NationId>
        <Expr1>2</Expr1>
        <FaMarriage>نامشخص</FaMarriage>
        <EnMarriage>نامشخص</EnMarriage>
        <ArMarriage>نامشخص</ArMarriage>
        <FaNation>ایران</FaNation>
        <FaGender>مرد</FaGender>
        <EnGender>مرد</EnGender>
        <ArGender>مرد</ArGender>
        <Is_Sadat>false</Is_Sadat>
        <BirthDate>1978-03-21T00:00:00-07:00</BirthDate>
        <Code>1112</Code>
        <GenderId>2</GenderId>
        <ReligionId>6</ReligionId>
      </Table>  
    </DocumentElement>
  </diffgr:diffgram>
</DataTable>

If you try to use this XML directly in an XML Source in a Data Flow Task, you will get errors, so you should remove wrapper tags such as diffgr:diffgram and keep only the row data.

I used this C# script to transform the input XML:

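System.Xml.XmlDocument xdoc = new System.Xml.XmlDocument();
System.Xml.XmlDocument xdocResult = new System.Xml.XmlDocument();
xdoc.Load(@"D:\SSIS\New folder\GetStudentByCode.xml");

// New root element that will hold the flattened rows.
System.Xml.XmlNode ParentNew = xdocResult.CreateNode(System.Xml.XmlNodeType.Element, "Data", string.Empty);

// Every data row sits under <DocumentElement> inside the diffgram.
foreach (System.Xml.XmlNode xnode in xdoc.GetElementsByTagName("DocumentElement"))
{
    foreach (System.Xml.XmlNode xchild in xnode.ChildNodes)
    {
        // Skip comments and other non-element nodes; their Attributes property is null.
        if (xchild.NodeType != System.Xml.XmlNodeType.Element) continue;

        // Drop diffgr:id and msdata:rowOrder from the row element.
        xchild.Attributes.RemoveAll();

        // A node has to be imported before it can be appended to another document.
        System.Xml.XmlNode importNode = xdocResult.ImportNode(xchild, true);
        ParentNew.AppendChild(importNode);
    }
}
xdocResult.AppendChild(ParentNew);
xdocResult.Save(@"D:\SSIS\New folder\GetStudentByCode_Converted.xml");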
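
For comparison, the same flattening can be sketched with LINQ to XML (System.Xml.Linq) instead of XmlDocument. This is not the approach used in this post, only an alternative under some assumptions: the class name DiffgramFlattener and the method name Flatten are made up for illustration, and the input is a shortened in-memory copy of the sample above so the snippet runs without any file on disk.

```csharp
using System;
using System.Xml.Linq;

public static class DiffgramFlattener
{
    // Hypothetical helper: builds a <Data> root containing a copy of every row
    // found under <DocumentElement>, with the row's attributes removed.
    public static XDocument Flatten(XDocument source)
    {
        var data = new XElement("Data");
        foreach (XElement row in source.Descendants("DocumentElement").Elements())
        {
            var copy = new XElement(row); // deep copy, the source document is left untouched
            copy.RemoveAttributes();      // removes diffgr:id and msdata:rowOrder from the row
            data.Add(copy);
        }
        return new XDocument(data);
    }

    public static void Main()
    {
        // Shortened version of the input shown earlier in this post.
        const string input = @"<DataTable>
  <diffgr:diffgram xmlns:msdata=""urn:schemas-microsoft-com:xml-msdata"" xmlns:diffgr=""urn:schemas-microsoft-com:xml-diffgram-v1"">
    <DocumentElement>
      <Table diffgr:id=""Table1"" msdata:rowOrder=""0"">
        <PersonId>1</PersonId>
        <Code>1111</Code>
      </Table>
    </DocumentElement>
  </diffgr:diffgram>
</DataTable>";

        XDocument result = Flatten(XDocument.Parse(input));
        Console.WriteLine(result);
    }
}
```

Because the rows are copied rather than modified in place, the loaded source document keeps its attributes, which can be convenient if it is needed again later in the same script.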

The result of applying this code is:

<Data>
  <Table>
    <PersonId>1</PersonId>
    <StudentId>1</StudentId>
    <FaAliyasFName>اکبر </FaAliyasFName>
    <EnFName>MAGERRAM</EnFName>
    <EnLName>KAZEM OV</EnLName>
    <MarriageId>1</MarriageId>
    <FName>محرم</FName>
    <LName>كاظم</LName>
    <NationId>111</NationId>
    <Expr1>1</Expr1>
    <FaMarriage>نامشخص</FaMarriage>
    <EnMarriage>نامشخص</EnMarriage>
    <ArMarriage>نامشخص</ArMarriage>
    <FaNation>ایران</FaNation>
    <FaGender>مرد</FaGender>
    <EnGender>مرد</EnGender>
    <ArGender>مرد</ArGender>
    <Is_Sadat>false</Is_Sadat>
    <BirthDate>1976-03-21T00:00:00-07:00</BirthDate>
    <Code>1111</Code>
    <GenderId>2</GenderId>
    <ReligionId>6</ReligionId>
  </Table>
  <Table>
    <PersonId>2</PersonId>
    <StudentId>2</StudentId>
    <EnFName>tahsin</EnFName>
    <EnLName>tahmaz ov</EnLName>
    <MarriageId>1</MarriageId>
    <FName>تحسين</FName>
    <LName>تحمازاف</LName>
    <NationId>111</NationId>
    <Expr1>2</Expr1>
    <FaMarriage>نامشخص</FaMarriage>
    <EnMarriage>نامشخص</EnMarriage>
    <ArMarriage>نامشخص</ArMarriage>
    <FaNation>ایران</FaNation>
    <FaGender>مرد</FaGender>
    <EnGender>مرد</EnGender>
    <ArGender>مرد</ArGender>
    <Is_Sadat>false</Is_Sadat>
    <BirthDate>1978-03-21T00:00:00-07:00</BirthDate>
    <Code>1112</Code>
    <GenderId>2</GenderId>
    <ReligionId>6</ReligionId>
  </Table>
</Data>

This is a sample of using the .NET System.Xml classes to flatten and simplify a complicated XML file. You can now use the resulting XML directly in an XML Source.
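
Before pointing an XML Source at the converted file, it can help to confirm that nothing from the diffgram is left in it. The following is a minimal sketch of such a check, not part of the original script: FlatXmlCheck and IsFlat are hypothetical names, and the check simply treats a document as flat when no element has an attribute or a namespace prefix.

```csharp
using System;
using System.Xml;

public static class FlatXmlCheck
{
    // Hypothetical helper: returns true when no element in the document carries
    // an attribute (namespace declarations included) or a namespace prefix.
    public static bool IsFlat(XmlDocument doc)
    {
        foreach (XmlNode node in doc.SelectNodes("//*"))
        {
            if (node.Attributes.Count > 0 || node.Prefix.Length > 0)
                return false;
        }
        return true;
    }

    public static void Main()
    {
        var converted = new XmlDocument();
        converted.LoadXml("<Data><Table><PersonId>1</PersonId></Table></Data>");
        Console.WriteLine(IsFlat(converted));

        var original = new XmlDocument();
        original.LoadXml("<Data><Table rowOrder=\"0\"><PersonId>1</PersonId></Table></Data>");
        Console.WriteLine(IsFlat(original));
    }
}
```

In a real package the document would be loaded with Load and the path of the converted file instead of LoadXml with an inline string.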


Reza Rad
Trainer, Consultant, Mentor
