Regular Expression in SSIS

Regular Expressions are very useful for text processing.

Many tasks, such as validating a text against a pattern or finding the parts of a text that match a defined pattern, can be solved with Regular Expressions.

To find out more about Regular Expressions read here.

Today I found a simple case of this in SSIS. SSIS works very well when combined with Regular Expressions.

Consider a case where the input column has values like this:


2011\09\23 rev.018
2011\09\26 rev.019
2011\09\25 rev.005
\\ rev.

The desired output is just the date part, like this:

2011\09\23
2011\09\26
2011\09\25
NULL (the last row contains no date)

Suppose we have a text file that contains the source data.

After the source, add a Script Component as a Transformation, set Col1 as the Input Column, and create a new output column of type DT_STR named OutputCleansed.

Then set the script language to C#, click Edit Script, and write the following code to apply the Regular Expression to the input column's data:

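public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Pattern: four word characters, a backslash, two, a backslash, two (YYYY\MM\DD).
    Regex reg = new Regex(@"\w{4}\\\w{2}\\\w{2}");
    if (Row.col1_IsNull || string.IsNullOrEmpty(Row.col1.Trim()))
        Row.OutputCleansed_IsNull = true;   // empty input: nothing to extract
    else if (reg.IsMatch(Row.col1))
        Row.OutputCleansed = reg.Match(Row.col1).Groups[0].Value;   // the matched text
    else
        Row.OutputCleansed_IsNull = true;   // no date found in this row
}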

Note that to use Regular Expressions in the script, you need to add this using directive:

using System.Text.RegularExpressions;
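To try the pattern outside of SSIS first, the same logic can be sketched as a small console program. This is only an illustration under assumptions: the class name DateExtractor and the method ExtractDate are hypothetical and are not part of the SSIS package. The sample values are the ones from the input column shown earlier.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateExtractor
{
    // Same pattern as in the Script Component; built once and reused for every value.
    private static readonly Regex DatePattern = new Regex(@"\w{4}\\\w{2}\\\w{2}");

    // Hypothetical helper: returns the matched YYYY\MM\DD text,
    // or null when the input is empty or contains no match.
    public static string ExtractDate(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return null;
        Match m = DatePattern.Match(input);
        return m.Success ? m.Value : null;   // m.Value is the same as Groups[0].Value
    }

    public static void Main()
    {
        string[] samples =
        {
            @"2011\09\23 rev.018",
            @"2011\09\26 rev.019",
            @"2011\09\25 rev.005",
            @"\\ rev."
        };

        foreach (string s in samples)
            Console.WriteLine("{0} -> {1}", s, ExtractDate(s) ?? "NULL");
    }
}
```

Creating the Regex once in a static field, instead of once per row, is a small design choice that may help when the data flow processes many rows.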

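The expression used in this sample only fetches the YYYY\MM\DD part. The expression is: \w{4}\\\w{2}\\\w{2}. Here \w{4} matches four word characters, \\ matches a single backslash, and \w{2} matches two word characters. Note that \w also matches letters and underscores, not only digits.

For other cases you can use any other regular expression; a quick reference of regular expressions can be found here: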
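Because \w accepts letters and underscores as well as digits, a value such as abcd\ef\gh would also match the pattern above. If the dates are known to be numeric, a stricter variant could use \d with named groups. The sketch below is an assumption for illustration, not the pattern used in this post; the names StrictDateExtractor and ToIsoDate are hypothetical, and the code does not check that the month or day is in a valid range.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictDateExtractor
{
    // Assumed stricter pattern: digits only, with named groups for each date part.
    private static readonly Regex StrictPattern =
        new Regex(@"(?<year>\d{4})\\(?<month>\d{2})\\(?<day>\d{2})");

    // Hypothetical helper: rewrites the first YYYY\MM\DD found as YYYY-MM-DD,
    // or returns null when no digit-only date is present.
    public static string ToIsoDate(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return null;
        Match m = StrictPattern.Match(input);
        if (!m.Success) return null;
        return m.Groups["year"].Value + "-"
             + m.Groups["month"].Value + "-"
             + m.Groups["day"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(ToIsoDate(@"2011\09\23 rev.018") ?? "NULL");
        Console.WriteLine(ToIsoDate(@"abcd\ef\gh rev.001") ?? "NULL");
    }
}
```

Named groups make it easy to rearrange the date parts or pass them on to separate output columns if that is needed.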

After the Script Component, add a destination and a Data Viewer.

This is a sample of the desired output fetched by the Script Component using Regular Expressions:

Reza Rad
