Archives

Generating Biml Scripts Using Excel

This post is part of a series, Short Cuts to Biml Mastery which shows you how to learn BimlScript through a series of short cuts and remain productive while doing so. In the previous article, I showed you how to simplify Biml code that was imported using BimlExpress or BimlStudio. All of the code used in this post can be downloaded here.

In this article, we’ll use Excel to dynamically generate the Biml code. First a disclaimer, I am not an Excel expert, I just know enough to be dangerous. This workbook will accomplish what we want, but is probably not the most elegant way to do it. To refresh your memory, this is the Biml from the previous article that we need to automate:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <FlatFileConnection Name="conn_ff_Orders" FilePath="C:\FileExport\Orders.csv" FileFormat="fff_Orders"/>
    </Connections>
    <Packages>
        <Package Name="Export Orders" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive" PackageSubpath="Export Flat Files">
            <Tasks>
                <Dataflow Name="DFT Load file" >
                    <Transformations>
                        <OleDbSource Name="OLE_SRC Orders" ConnectionName="connNorthwind">
                            <DirectInput>
SELECT * FROM dbo.Orders
                            </DirectInput>
                        </OleDbSource>
                        <FlatFileDestination Name="FF_DEST Orders"  ConnectionName="conn_ff_Orders" Overwrite="true" ValidateExternalMetadata="false"/>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
    <FileFormats>
        <FlatFileFormat Name="fff_Orders" FlatFileType="Delimited"  ColumnNamesInFirstDataRow="true" TextQualifier=""">
            <Columns>
                <Column Name="OrderID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="CustomerID" DataType="StringFixedLength" Length="5" Delimiter="Comma"/> 
                <Column Name="EmployeeID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="OrderDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="RequiredDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShippedDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipVia" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="Freight" DataType="Currency" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipName" DataType="String" Length="40" Delimiter="Comma"/> 
                <Column Name="ShipAddress" DataType="String" Length="60" Delimiter="Comma"/> 
                <Column Name="ShipCity" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipRegion" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipPostalCode" DataType="String" Length="10" Delimiter="Comma"/> 
                <Column Name="ShipCountry" DataType="String" Length="15" Delimiter="CRLF"/> 
            </Columns>
        </FlatFileFormat>
    </FileFormats>
</Biml>

We need a way to way to derive the metadata from a SQL query. SQL Server 2012 included some handy new metadata discovery features, and we’re going to use sys.dm_exec_describe_first_result_set. The primary attributes we need to derive are column name, SQL data type, and length. So, we’ll start with the following query and result set:

USE Northwind
GO

DECLARE @SrcQuery NVARCHAR(4000);
SET @SrcQuery = 'SELECT * FROM dbo.Orders';

SELECT name [Column Name]
 ,system_type_name 
 ,max_length 
 FROM sys.dm_exec_describe_first_result_set(@SrcQuery, NULL, 1) 
 WHERE [name] is not NULL  and is_hidden = 0
Column Namesystem_type_namemax_length
OrderIDint4
CustomerIDnchar(5)10
EmployeeIDint4
OrderDatedatetime8
RequiredDatedatetime8
ShippedDatedatetime8
ShipViaint4
Freightmoney8
ShipNamenvarchar(40)80
ShipAddressnvarchar(60)120
ShipCitynvarchar(15)30
ShipRegionnvarchar(15)30
ShipPostalCodenvarchar(10)20
ShipCountrynvarchar(15)30

The next thing we need to do is to derive the base SQL data type from system_type_name. In other words, we need to strip off any length, precision, or scale values from it. We could do this in Excel, but I prefer to do it in T-SQL.

DECLARE @SrcQuery NVARCHAR(4000);
SET @SrcQuery = 'SELECT * FROM dbo.Orders';
 
SELECT name [Column Name]
 ,CASE WHEN charindex('(',system_type_name) = 0 THEN system_type_name
  ELSE left(system_type_name, charindex('(',system_type_name) - 1)
  END [SQL DT]
 ,max_length 
  FROM sys.dm_exec_describe_first_result_set(@SrcQuery, NULL, 1) where [name] is not NULL  and is_hidden = 0
Column NameSQL DTmax_length
OrderIDint4
CustomerIDnchar10
EmployeeIDint4
OrderDatedatetime8
RequiredDatedatetime8
ShippedDatedatetime8
ShipViaint4
Freightmoney8
ShipNamenvarchar80
ShipAddressnvarchar120
ShipCitynvarchar30
ShipRegionnvarchar30
ShipPostalCodenvarchar20
ShipCountrynvarchar30

Finally, we need to convert max_length to an appropriate value for Biml Length. The Biml length for Unicode strings must be half the max_length, ANSI strings like char or varchar use max_length, and other data types will be set to 0. To make this easier, I’ll wrap the above code into a subquery like this:

DECLARE @SrcQuery NVARCHAR(4000);
SET @SrcQuery = 'SELECT * FROM dbo.Orders';

SELECT 
 t.[Column Name]
,t.[SQL DT]
,CASE 
WHEN t.[SQL DT] in ('char', 'varchar','text') THEN (t.max_length)
WHEN t.[SQL DT] in ('nchar', 'nvarchar','ntext') THEN (t.max_length/2)
ELSE 0 END [Length]
FROM (SELECT name [Column Name]
 ,CASE WHEN charindex('(',system_type_name) = 0 THEN system_type_name
  ELSE left(system_type_name, charindex('(',system_type_name) - 1)
  END [SQL DT]
 ,max_length
   FROM sys.dm_exec_describe_first_result_set(@SrcQuery, NULL, 1) where [name] is not NULL  and is_hidden = 0) t
Column NameSQL DTLength
OrderIDint0
CustomerIDnchar5
EmployeeIDint0
OrderDatedatetime0
RequiredDatedatetime0
ShippedDatedatetime0
ShipViaint0
Freightmoney0
ShipNamenvarchar40
ShipAddressnvarchar60
ShipCitynvarchar15
ShipRegionnvarchar15
ShipPostalCodenvarchar10
ShipCountrynvarchar15

The rest of our work will be in Excel, so please download the solution files so you can follow along. Open Delimited Feed Metadata Generator.xlsx and you’ll find two sheets included. The first contains a conversion table of SQL Server, Biml, and SSIS data types. Cathrine Wilhelmsen has a comprehensive blog article about data type conversion. Below are the contents of my Excel lookup table.

SQL ServerBimlSSIS
tinyintByteDT_UI1
smallintInt16DT_I2
intInt32DT_I4
bigintInt64DT_I8
charAnsiStringFixedLengthDT_STR
varcharAnsiStringDT_STR
ncharStringFixedLengthDT_WSTR
nvarcharStringDT_WSTR
dateDateDT_DBDATE
datetimeDateTimeDT_DBTIMESTAMP
datetime2DateTime2DT_DBTIMESTAMP2
datetimeoffsetDateTimeOffsetDT_DBTIMESTAMPOFFSET
smalldatetimeDateTimeDT_DBTIMESTAMP
timeTimeDT_DBTIME2
decimalDecimalDT_NUMERIC
numericDecimalDT_NUMERIC
bitBooleanDT_BOOL
floatDoubleDT_R8
realSingleDT_R4
moneyCurrencyDT_CY
smallmoneyCurrencyDT_CY
textAnsiStringDT_TEXT
ntextStringDT_NTEXT
timestampBinaryDT_BYTES
uniqueidentifierGuidDT_GUID
varbinaryBinaryDT_BYTES
xmlXmlDT_NTEXT
geographyObjectDT_IMAGE
geometryObjectDT_IMAGE
hierarchyidObjectDT_BYTES
imageBinaryDT_IMAGE
rowversionBinaryDT_BYTES
sql_variantObjectDT_WSTR

The second sheet is the template that we’ll use to create our feed Biml itself. Use the following steps to create your first export:

  • Make a copy of the FF Export Template and rename the sheet tab to “Orders”. Notice the Feed Name in cell B1 defaults to the tab name.
  • Enter the source query we’ve been using (SELECT * FROM dbo.Orders) into the shaded blue cell B2. Notice that the Biml in column G begins to generate.
  • Execute the latest metadata query against Northwind in SQL Server Management Studio.
  • Copy and paste the results into the blue shaded area starting on row 5.
  • Notice that the remaining Biml code for the Export Orders package is expanded for you in column G.
  • Open the included Visual Studio SSIS project and create a new Biml file. You can do so through the BimlExpress menu or Right Click on the project name in the Solution Explorer and select “Add New Biml File.”
  • Copy and paste the contents of column G into the new Biml file. Multi-select the new Biml file and Environment.Biml in the Solution Explorer, right click, and select Generate SSIS Packages. Your new package should generate and function correctly.

This is the generated Biml from the spreadsheet. Note that it looks identical to the Biml from the previous article.

<!-- Orders.biml -->
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <FlatFileConnection Name="conn_ff_Orders" FilePath="C:\FileExport\Orders.csv" FileFormat="fff_Orders"/>
    </Connections>
    <Packages>
        <Package Name="Export Orders" ConstraintMode="Linear"  ProtectionLevel="DontSaveSensitive" PackageSubpath="Export Flat Files">
            <Tasks>
                <Dataflow Name="DFT Load file" >
                    <Transformations>
                        <OleDbSource Name="OLE_SRC Orders" ConnectionName="connNorthwind">
                            <DirectInput>
                                 SELECT * FROM dbo.Orders
                            </DirectInput>
                        </OleDbSource>
                        <FlatFileDestination Name="FF_DEST Orders"  ConnectionName="conn_ff_Orders" Overwrite="true" ValidateExternalMetadata="false"/> 
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
    <FileFormats>
        <FlatFileFormat Name="fff_Orders" FlatFileType="Delimited"  ColumnNamesInFirstDataRow="true" TextQualifier=""">
            <Columns>
                <Column Name="OrderID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="CustomerID" DataType="StringFixedLength" Length="5" Delimiter="Comma"/> 
                <Column Name="EmployeeID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="OrderDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="RequiredDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShippedDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipVia" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="Freight" DataType="Currency" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipName" DataType="String" Length="40" Delimiter="Comma"/> 
                <Column Name="ShipAddress" DataType="String" Length="60" Delimiter="Comma"/> 
                <Column Name="ShipCity" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipRegion" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipPostalCode" DataType="String" Length="10" Delimiter="Comma"/> 
                <Column Name="ShipCountry" DataType="String" Length="15" Delimiter="CRLF"/> 
            </Columns>
        </FlatFileFormat>
    </FileFormats>
</Biml>

This is simple, right? More importantly, it’s very quick to create an endless number of export feeds by repeating the steps above.

Now let’s explore some of the Excel code which generates the Biml. The first calculated column (D) converts the base SQL data type to Biml using a standard VLookup. Since file exports have a variable number of columns, any blank spreadsheet row returns nothing.

=IF(B5 <> "",VLOOKUP(B5,BimlDT,2,FALSE),"")

The next column determines whether a Comma or CRLF delimiter is appropriate. Basically, the code checks to see if a Biml Data Type has been calculated on it’s row and at the cell beneath it. This is a simple way to determine if the feed column is the last in the feed.

=IF(D5 <> "", IF(E6 <> "", "Comma","CRLF"),"")

The actual column Biml creation begins in F24. The reason for this is that the “final” Biml in column G has to specify the Connection and Package code before we can tackle the FlatFileFormat details. So, row 24 generates from the column metadata in column 5. Remember that the column code ends up looking like this:

            <Columns>
                <Column Name="OrderID" DataType="Int32" Length="0" Delimiter="Comma"/>
                <Column Name="CustomerID" DataType="StringFixedLength" Length="5" Delimiter="Comma"/>

Notice that I’m actually indenting final code because I use this spreadsheet in my presentations as well and want it to look nice. This, coupled with the conditional display logic, makes the code a bit more difficult to follow. Since “scripting Biml” is meant to be a temporary short cut on your way to Biml mastery, feel free to skip the aesthetic aspects and just generate Biml code. Also, the double quote character is reserved in the Excel formula language and must be escaped with more quotes or by using CHAR(34). With that in mind, this is what the column generation code looks like.

=IF(D5 <> "", CONCAT("                <Column Name=""", A5,""" DataType=""", D5,""" Length=""", C5,""" Delimiter=""", E5,"""/> "), "")

Now let’s look at column G. Basically, the reason that this column exists, is to prettify the generated Biml and make it easier to use. This sheet currently supports fixed width feeds of up to 100 columns, but most feeds don’t require that many. Instead of having a large set of blank cells between the last column line and </Columns>, this code detects the last feed column required and then adds the final four statements. Again, you may want to skip this level of detail when creating your own Excel shortcuts to make things easier. The important code begins in cell G24. Through a series of IF statements it looks at column F to determine when the Biml column metadata is finished and then applies the correct closing snippets in the next four rows.

=IF(F24<>"",F24,IF(F23<>"", "            </Columns>", IF(F22<>"", "        </FlatFileFormat>", IF(F21<>"", "    </FileFormats>", IF(F20<>"", "</Biml>", "")))))
                <Column Name="ShipPostalCode" DataType="String" Length="10" Delimiter="Comma"/> 
                <Column Name="ShipCountry" DataType="String" Length="15" Delimiter="CRLF"/> 
            </Columns>
        </FlatFileFormat>
    </FileFormats>
</Biml>

If desired, we could continue refining this process until it’s almost entirely automated. For example, we could create a Tab containing connection information including the connection string. Add a Source Connection value in B3, and use Visual Basic for Applications to execute the metadata query above against the source and automatically populate column metadata in columns A, B, and C. This VBA macro could be tied to a button for ease of use. You could even automate the creation of the individual Biml files on disk as well. How far you want to go with this is up to you.

Feel free to create addition flat file exports from the Northwind database by following the steps above. In the next article, we’ll refine the metadata query above and generate the entire Biml script using only T-SQL.

Stay Tuned!

All of the code used in this post can be downloaded here.

1 comment to Generating Biml Scripts Using Excel

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>