In my last post (http://anexinetbi.blogspot.com/2015/11/introduction-to-business-intelligence.html)
we went through a quick introduction to BIML – AKA Business Intelligence Markup
Language. We used it, and the help of BIDS Helper, to
generate a very basic SSIS package simply from the XML based BIML code. In this post, we are going to jump ahead to
the really cool things that BIML can do by adding in the concept of BIML
Script. With BIML Script, we are going
to very quickly create a separate extract and load package for all of Adventure
Works DW tables just by using metadata.
To jump right ahead, all demo files and samples are at the end of the
post.
BIML Script
Along with BIML proper, our friends at Varigence also have defined a scripting
language that can be used within BIML files, called BIML Script. BIML and BIML Script are somewhat like HTML
and ASP.NET. Using BIML, BIML Script, and
our BIDS Helper to generate the packages, we can take one BIML file and
generate multiple SSIS packages for different meta data with the same exact
design pattern. You can probably already
start to imagine the possibilities.
I’m going to jump right into the BIML Script part of things,
so please see my previous post linked above to get caught up. First some light housekeeping. In order to properly use BIML Script, we need
to import some references just as we would in other languages. We also need to define which .Net language we
want to use – C# or VB.Net. We will be
using C# for this demo.
<Biml
xmlns="http://schemas.varigence.com/biml.xsd">
<!--Directives-->
<#@ template language ="C#" #> <!--VB.Net or C#-->
<#@ import namespace="System.Data" #> <!--Imported namespaces to give
us access to some BIML Script methods-->
<#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>
</Biml>
The <#@ #> tags are called “Directives” and actually can be
included at the beginning or the end of our BIML file because the compiler
finds them first wherever they are. I
like to keep them at the bottom because I think it keeps our BIML files a
little cleaner.
Now the magic of BIML Script is the ability to embed .Net
code “nuggets” inside of the BIML. We
embed these code nuggets within <# #> tags as shown below. As you can see, within the BIML file we can
jump in and out of C#. Below we jump to
C# and define some variables.
Unfortunately, at this point Intellisense gets confused with the mixing
of languages, so we lose a lot of syntax highlighting going forward. I will try to indent and italicize the C#
nuggets for clarity.
<Biml
xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<Connection Name="EriksDB" ConnectionString="Data Source=localhost;Initial
Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Persist Security Info=True;Auto
Translate=False; Integrated Security=SSPI" />
<Connection Name="EriksDestDB" ConnectionString="Data Source=localhost;Initial
Catalog=AdventureWorks2014Destination;Provider=SQLNCLI11.1;Persist Security
Info=True;Auto Translate=False; Integrated Security=SSPI" />
</Connections>
<Packages>
<#
//Now we are in C#!
//Create some variables
string MyConnectionString
="Data Source=localhost;Initial
Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Persist Security Info=True;Auto
Translate=False; Integrated Security=SSPI" ;
string DestinationTableName;
string TableSchema;
string SourceTableName;
DataTable MyDataTable;
#>
<!--Back in BIML XML-->
</Packages>
</Biml>
As you can see, we can jump back and forth between BIML/XML
and BIML Script C# code just by using the <# #> tags. The important take away here is that the embedded
BIML Script will be “run” when you compile your BIML file and will be used in
the expansion of the resulting BIML file – which is then compiled into your
SSIS packages. You could use BIDS
Helper to compile the above BIML but it would not produce anything.
Setting Up Our Demo
Environment
Now that we have a basic idea of BIML Script, let’s use it
to dynamically create a bunch of simple extract and load packages. For this demo we will use the AdventureWorks
2014 DW sample, as well as a modified version of it for our destination db that
removes constraints to make it easier. Here is the
original AdventureWorks 2014 DW db backup, and here
is the modified version. First
restore the .bak file from codeplex to create the AdventureWorks2014 db on your
test server, then run CREATE_AW2014Destination.sql file to create a new db
called AW2014Destination with all empty tables.
Using BIML Script to
Dynamically Create Multiple Packages
We will walk through this piece by piece, and then summarize
at the end, so please bear with me on the code snippets. Below is a continuation of the C# snippet
from above which is embeded within the BIML <Packages> tag. What we added below is a GetDataTable method
using a SQL query (highlighted in blue) to pull a list of tables from the INFORMATION_SCHEMA.Tables
table. Some tables with hierarchy
datatypes were filtered out for simplicity in this demo.
We then set up a for each loop to loop over each row (AKA
each record of a table from the meta data) . Then using the #> tag, we close out of the
C# nugget and jump back to BIML XML with a <package> tag. Notice that within the <package> tag,
the Name attribute contains <#=SourceTableName#> - using the <#= … #>
syntax, we can reference variables from previous C# BIML Script nuggets. In this case, the SourceTableName parameter.
Eagle eyed readers will notice that in the snippet below,
there is no closing “}” to the foreach loop.
We’ll get to that shortly!
<Packages>
<#
//Now we are in C#!
string MyConnectionString
="Data Source=localhost;Initial
Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Persist Security Info=True;Auto
Translate=False; Integrated Security=SSPI" ;
string DestinationTableName;
string TableSchema;
string SourceTableName;
DataTable MyDataTable;
//Populate Data Table with
list of tables
MyDataTable =
ExternalDataAccess.GetDataTable(MyConnectionString,"SELECT t.TABLE_CATALOG, t.TABLE_SCHEMA,t.TABLE_NAME FROM
INFORMATION_SCHEMA.TABLES t WHERE t.TABLE_SCHEMA <> 'dbo' AND t.TABLE_TYPE ='Base
Table' AND t.TABLE_NAME NOT LIKE '%Document%' AND t.TABLE_NAME <>'Employee'");
/*Loop over each target table
row in MyDataTable */
foreach(DataRow row in
MyDataTable.Rows){
TableSchema =
row["TABLE_SCHEMA"].ToString();
SourceTableName =
row["TABLE_NAME"].ToString();
#>
<Package ConstraintMode="Linear" Name="Extract <#=SourceTableName#>"
ProtectionLevel="EncryptSensitiveWithUserKey"> <!--ConstraintMode connects the
tasks in order-->
The key thing to realize with the above snippet, is that
everything after the foreach loop, is STILL contained in the foreach loop –
including the BIML XML. This means that
for each row in MyDataTable, when the BIML file is expanded, everything
following the foreach { is repeated.
Below, everything within the box is repeated for every row in
MyDataTable. You will finally notice the
closing “}” bracket just after the </package> tag.
<Packages>
<#
//Now we are in C#!
string MyConnectionString
="Data Source=localhost;Initial
Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Persist Security Info=True;Auto
Translate=False; Integrated Security=SSPI" ;
string DestinationTableName;
string TableSchema;
string SourceTableName;
DataTable MyDataTable;
//Populate Data Table with
list of tables
MyDataTable =
ExternalDataAccess.GetDataTable(MyConnectionString,"SELECT
t.TABLE_CATALOG, t.TABLE_SCHEMA,t.TABLE_NAME FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_SCHEMA <> 'dbo' AND t.TABLE_TYPE ='Base Table' AND t.TABLE_NAME NOT LIKE
'%Document%' AND t.TABLE_NAME <>'Employee'");
/*Loop over each target table
row in MyDataTable */
foreach(DataRow row in
MyDataTable.Rows){
TableSchema =
row["TABLE_SCHEMA"].ToString();
SourceTableName =
row["TABLE_NAME"].ToString();
#>
<Package ConstraintMode="Linear" Name="Extract <#=SourceTableName#>"
ProtectionLevel="EncryptSensitiveWithUserKey"> <!--ConstraintMode connects the
tasks in order-->
<Tasks>
<ExecuteSQL Name="Truncate <#=SourceTableName#>"
ConnectionName="EriksDestDB">
<DirectInput>
TRUNCATE TABLE <#=TableSchema#>.<#=SourceTableName#>
</DirectInput>
</ExecuteSQL>
<Dataflow Name ="DFT Extract" >
<Transformations>
<OleDbSource ConnectionName="EriksDB" Name="<#=SourceTableName#> Source" >
<DirectInput>
SELECT *
FROM <#=TableSchema#>.<#=SourceTableName#>
</DirectInput>
</OleDbSource>
<OleDbDestination Name="Insert <#=SourceTableName#>"
ConnectionName="EriksDestDB" KeepIdentity="true" >
<ExternalTableOutput Table="<#=TableSchema#>.<#=SourceTableName#>"></ExternalTableOutput>
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<#}#>
<!--End for each
MyDataTable loop-->
</Packages>
The whole BIML file for this demo can be downloaded here. A very brief review (see my previous post
for more) of what the BIML in the box above is actually doing is that it is
using <package> </package> tags to represent each package. Within that we have <tasks> which are
similar to the Control Flow. Here we
have an <ExecuteSQL> tag to TRUNCATE the current table in the loop. Then lastly we have the <DataFlow> tag
which contains a source and destination component based on the current table
name within the foreach loop. The table
names and schemas are represented as <#=TableSchema#>.<#=SourceTableName#> to access
the C# variables as described above.
Now all we have to do is Right Click our BIML file (in my
screenshot BimlScript_2.biml) and select Generate SSIS Packages (remember, you need
the free BIDS Helper add-in for this).
The results can also be seen in the below screenshot – dozens of
distinct SSIS packages, one for each Adventure Works table from the meta data.
The resulting .dtsx packages should look like the below –
Summary and Next
Steps
In this example we built on the basics of BIML from my first
post
on the topic and introduced the concept of BIML Script – C# or VB.Net code
nuggets embedded within the BIML XML itself.
Using these embedded code snippets, we gain the extremely powerful
ability to generate large numbers of ETL packages dynamically with very few
lines of code. This demo required less than 75 lines of code in the .biml file and only minutes to write, whereas manually creating all of these - even simple - SSIS packages would have taken a few hours.
In upcoming posts we will continue to build on the concept
of dynamic SSIS package generation and try to address some tips and common
pitfalls.
Sample Files:
Erik
Labels: BIML, Dynamic, ETL, SSIS