Assume .csv file which columns delimited by comma ( , ) . and Texts delimited by Double Quote ( " ) like below:
"Invoice Number","Invoice Status","Invoice Date of Service","Invoice Date Opened","Invoice Hold from printing/submission","Invoice Branch","Invoice Balances Adjustments","Invoice Balances Payments","Invoice Balances Balance","Patient Last Name","Patient First Name","Patient ID","Policy Payor Name","Invoice Detail Item ID","Invoice Detail Item Name","Invoice Detail Balance","Invoice Detail Qty","Invoice Detail Item Group"
"12345","Open","3/31/2009 12:00:00 AM","5/26/2009 8:09:02 AM","False","Test","($346.23)","$0.00","$346.23","Smith","John","12344","[Patient]","Test-1234","5""x8"" Multi-Axis Stump Support, Therafin, 31552","$51.12","1","Wheelchair Accessories"
If you look at "5""x8"" Multi-Axis Stump Support, Therafin, 31552" , there are some ( " ) and ( , ) inside this value.
So, How it must be handled in SSIS Flat File Source Connection Manager?
If you set column delimiter as comma ( , ) and text qualifier as double quote ( " ) you will receive this error:
The preview sample contains embedded text qualifiers ("). The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
the error says that you can not have delimiter values like comma inside a column.
Solution is:
1-change every comma values with another unique delimiter which values haven’t any of this characters inside,like : vertical bar ( | )
2- change column delimiter to this new delimiter , and set text qualifier with double quote ( " )
for the above .csv file, we can use ( | ) as delimiter because there is no instance of this characters inside values
but we can not replace any comma ( , ) with vertical bar ( | ). because there are commas who are inside the values . So I replaced every "," with "|" .
this will replace only column delimiters surrounded by text qualifiers .
the result will be
"Invoice Number"|"Invoice Status"|"Invoice Date of Service"|"Invoice Date Opened"|"Invoice Hold from printing/submission"|"Invoice Branch"|"Invoice Balances Adjustments"|"Invoice Balances Payments"|"Invoice Balances Balance"|"Patient Last Name"|"Patient First Name"|"Patient ID"|"Policy Payor Name"|"Invoice Detail Item ID"|"Invoice Detail Item Name"|"Invoice Detail Balance"|"Invoice Detail Qty"|"Invoice Detail Item Group"
"12345"|"Open"|"3/31/2009 12:00:00 AM"|"5/26/2009 8:09:02 AM"|"False"|"Test"|"($346.23)"|"$0.00"|"$346.23"|"Smith"|"John"|"12344"|"[Patient]"|"Test-1234"|"5""x8"" Multi-Axis Stump Support, Therafin, 31552"|"$51.12"|"1"|"Wheelchair Accessories"
as you can see there is no delimiter inside values , and now we can set column delimiter as ( | ) , and text qualifier as ( " ) .
everything is OK now.
but for automating the replace process, I added a Script Task before Dataflow Task and added this code in Visual C# 2008 for replacing delimiters:
public void Main()
{
System.IO.StreamReader sr = new System.IO.StreamReader(@"C:\Users\Reza\Downloads\test(2).csv");
string str1=(char)34+","+(char)34;
string str2=(char)34+"|"+(char)34;
string strContent = sr.ReadToEnd().Replace(str1, str2);
sr.Close();
sr.Dispose();
System.IO.StreamWriter sw = new System.IO.StreamWriter(@"C:\Users\Reza\Downloads\test(2).csv",false);
sw.Write(strContent);
sw.Flush();
sw.Close();
sw.Dispose();
Dts.TaskResult = (int)ScriptResults.Success;
}
Thats All.