Finding Multiple File Patterns in an SSIS Foreach Loop

2015-02-27
I had come across a rare requirement in which I needed a foreach loop to enumerate over two different file patterns and skip a similar third pattern.  The files were named like this:
FileType.MyFirstPattern.<DateTime>.csv
FileType.MySecondPattern.<DateTime>.csv
FileType.FileShouldNotBeProcessed.<DateTime>.csv
So using the pattern "FileType.*.csv" would not work because it would pick up all three.  Unfortunately, the foreach file enumerator in SSIS does not support multiple patterns like:
FileType.MyFirstPattern.*.csv;FileType.MySecondPattern.*.csv
Similarly, .Net gets a little hacky for the same task, which you might solve with Linq.  There are some robust third party components which support Regex patterns for SSIS, but that would mean installing and maintaining some third party tool for a one-off, unique scenario.

This is amazingly frustrating because it is solved so simply in powershell:
(get-childitems "MyFolder" -include FileType.MyFirstPattern.*.csv;FileType.MySecondPattern.*.csv).fullname
So I came up with this compromise - run powershell from a script task.  Put the results in a variable and then iterate over the variable in the foreach loop.



This is what the script task looks like:

        public void Main()
        {

            try
            {
                string pattern = Dts.Variables["User::FileSpec"].Value.ToString();
                string dropFolder = Dts.Variables["User::DropFolder"].Value.ToString();

                Process p = new Process();
                p.StartInfo.FileName = "powershell";
                p.StartInfo.Arguments = String.Format(@"(get-childitem {0}* -include {1}).fullname", dropFolder, pattern);
                p.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
                p.StartInfo.RedirectStandardOutput = true;
                p.StartInfo.UseShellExecute = false;
                p.Start();
                string[] split = new string[1]{Environment.NewLine};
                var files = p.StandardOutput.ReadToEnd().Split(split, StringSplitOptions.RemoveEmptyEntries);
                Dts.Variables["User::FilesToRead"].Value = files;
                p.WaitForExit();


            }
            catch (Exception e)
            {
                Dts.Events.FireError(0, "", e.Message, "", 0);
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
           

            Dts.TaskResult = (int)ScriptResults.Success;

        }


The source folder and FileSpec (pattern to search for) are stored in SSIS variables.  The FilesToRead is an object variable.  We redirect the results from powershell to the standard output, which is formatted as a string.  This is then split into a string array and pushed into the object variable.




Mark Wojciechowicz

Labels: ,