Archives

My Wife and I are Presenting at SQL Saturday Dallas on June 1st!

It might surprise some of you that my lovely wife Melissa is an ETL developer as well, and she’ll be giving her first presentation at SQL Saturday Dallas.

My Better Half

She will be giving the following presentation, while I act as her lovely assistant.

The True Story of Learning Biml

How do you learn Biml when you don’t have years of SQL experience? Is Biml even worth my time and energy when I’m still working on perfecting my SSIS skills? It seems like a lot of work, what is in it for me? In this session you will learn:

  • Why Biml is worth your time and effort.
  • How to identify ideal SSIS packages for conversion.
  • How to convert those packages into Biml.
  • How to troubleshoot your Biml build errors when Google fails you.
  • How to automate SSIS package creation with BimlScript.

I can’t let her have all the fun, so I’m presenting an updated version of the following:

Azure Data Factory V2 and SSIS Automation With Biml

SSIS was originally designed to work in a world where all data was structured and on premise. As modern enterprise data projects increasingly move online, tools like Azure Data Factory (ADF) can extend SSIS capabilities to the cloud. Azure Data Factory is a flexible and powerful cloud based ETL service, but it doesn’t replace the functionality of SSIS. In this session you will learn:

  • The advantages and disadvantages of both ETL platforms.
  • How to use both in tandem to maximize productivity.
  • How both ETL platforms, as well as any supporting structures, can be be automatically generated with Biml Script.

All of the code used in this presentation can be downloaded by attendees, customized, and used in their production environments.

Anyway, it’ll be a great time, and I can’t wait to see some of you there!

Simplifying Imported Biml

This post is part of a series, Short Cuts to Biml Mastery which shows you how to learn BimlScript through a series of short cuts and remain productive while doing so. In the previous article, I covered how to import existing packages into Biml using BimlExpress. In this article, we’ll examine that Biml and simplify it in preparation for automation. If you’ve been reading along in this series, you’ll remember that we started with a single SSIS package which extracts data from a SQL source and creates a CSV file with quote identifiers. The generated file looks like the following:

"OrderID","CustomerID","EmployeeID","OrderDate","RequiredDate","ShippedDate","ShipVia","Freight","ShipName","ShipAddress","ShipCity","ShipRegion","ShipPostalCode","ShipCountry"
"10262","","8","1996-07-22 00:00:00","1996-08-19 00:00:00","1996-07-25 00:00:00","3","48.29","Rattlesnake Canyon Grocery","2817 Milton Dr.","Albuquerque","NM","87110","USA"
"10317","","6","1996-09-30 00:00:00","1996-10-28 00:00:00","1996-10-10 00:00:00","1","12.69","Lonesome Pine Restaurant","89 Chiaroscuro Rd.","Portland","OR","97219","USA"

The raw imported Biml is below:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <Connection Name="connNorthwind" ConnectionString="provider=SQLNCLI11;data source=localhost;integrated security=SSPI;initial catalog=Northwind" />
        <FlatFileConnection Name="conn_ff_Orders" FilePath="C:\FileExport\Orders.csv" FileFormat="conn_ff_Orders" />
    </Connections>
    <Packages>
        <Package Name="Export Orders" Language="None" ConstraintMode="LinearOnCompletion" ProtectionLevel="DontSaveSensitive">
            <Tasks>
                <Dataflow Name="DFT Load file">
                    <Transformations>
                        <OleDbSource Name="OLE_SRC Orders" ConnectionName="connNorthwind">
                            <DirectInput>
								SELECT * FROM dbo.Orders
                            </DirectInput>
                        </OleDbSource>
                        <FlatFileDestination Name="FF_DEST Orders" ValidateExternalMetadata="false" LocaleId="None" ConnectionName="conn_ff_Orders">
                            <Header></Header>
                        </FlatFileDestination>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
    <FileFormats>
        <FlatFileFormat Name="conn_ff_Orders" CodePage="1252" TextQualifier="_x0022_" ColumnNamesInFirstDataRow="true" RowDelimiter="">
            <Columns>
                <Column Name="OrderID" Delimiter="Comma" />
                <Column Name="CustomerID" Length="5" DataType="String" Delimiter="Comma" MaximumWidth="5" />
                <Column Name="EmployeeID" Delimiter="Comma" />
                <Column Name="OrderDate" DataType="DateTime" Delimiter="Comma" />
                <Column Name="RequiredDate" DataType="DateTime" Delimiter="Comma" />
                <Column Name="ShippedDate" DataType="DateTime" Delimiter="Comma" />
                <Column Name="ShipVia" Delimiter="Comma" />
                <Column Name="Freight" DataType="Currency" Delimiter="Comma" />
                <Column Name="ShipName" Length="40" DataType="String" Delimiter="Comma" MaximumWidth="40" />
                <Column Name="ShipAddress" Length="60" DataType="String" Delimiter="Comma" MaximumWidth="60" />
                <Column Name="ShipCity" Length="15" DataType="String" Delimiter="Comma" MaximumWidth="15" />
                <Column Name="ShipRegion" Length="15" DataType="String" Delimiter="Comma" MaximumWidth="15" />
                <Column Name="ShipPostalCode" Length="10" DataType="String" Delimiter="Comma" MaximumWidth="10" />
                <Column Name="ShipCountry" Length="15" DataType="String" Delimiter="CRLF" MaximumWidth="15" />
            </Columns>
        </FlatFileFormat>
    </FileFormats>
</Biml>

When first learning Biml, you should examine the Biml code and the SSIS package side by side. The first section (lines 2 through 5) covers the connections, one of which is an OLE DB connection to the Northwind database. The second connection is involved with the flat file export.

The second section (lines 6 through 23) describe the package and is fairly self explanatory. A single package, composed of a single data flow task, containing only the OLE DB source and flat file destination.

The third section (lines 24 through 43) is a bit more complicated. As shown previously, flat file connections require more configuration. Compare what you see in this section to these screenshots from a previous article.

The connection manager general tab configuration corresponds to line 25 from the code above. The most confusing thing would be the value “x0022” for TextQualifier, which is the hexadecimal value for double quotes. The double quotes is a reserved XML character, so we’ll substitute the following: &quot;

The rest of the FlatFileFormat describes the column information which corresponds to this:

So, when you compare the Biml to the SSIS GUI, everything starts to make sense. The columns typically specify the Biml data type, length and/or maximum width, and delimiter character. Notice that every column is “Comma” delimited except the last which specifies “CRLF”, carriage return line feed. These settings match the SSIS package flat file manager editor advanced tab.

Remember that our ultimate goal is to be able to loop through a specified set of queries (or tables, views, etc.) and dynamically create an export package for each. While the Northwind connection is shared by every export package, it can only be specified once. So, we’ll move that connection to it’s own file, Environment.biml.

<!-- Environment.biml -->
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
		<Connection Name="connNorthwind" ConnectionString="provider=SQLNCLI11;data source=localhost;integrated security=SSPI;initial catalog=Northwind" />
    </Connections>	
</Biml>

The last section, the FlatFileFormat, is the most involved because it requires column metadata which is very dynamic. I’ve made the following changes to the FlatFileFormat biml to make it more consistent and easier to automate.

  • Replaced the TextQualifier value mentioned above.
  • Removed the MaximumWidth attribute as it is not necessary.
  • Added the Length attribute to every column, even though it’s only required for character data types. A value of 0 is ignored anyway.

After our cleanup, the Export Orders Pkg.biml looks like this.

<!-- Export Orders Pkg.biml -->
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <FlatFileConnection Name="conn_ff_Orders" FilePath="C:\FileExport\Orders.csv" FileFormat="fff_Orders"/>
    </Connections>
    <Packages>
        <Package Name="Export Orders" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive" PackageSubpath="Export Flat Files">
            <Tasks>
                <Dataflow Name="DFT Load file" >
                    <Transformations>
                        <OleDbSource Name="OLE_SRC Orders" ConnectionName="connNorthwind">
                            <DirectInput>
SELECT * FROM dbo.Orders
                            </DirectInput>
                        </OleDbSource>
                        <FlatFileDestination Name="FF_DEST Orders"  ConnectionName="conn_ff_Orders" Overwrite="true" ValidateExternalMetadata="false"/>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
    <FileFormats>
        <FlatFileFormat Name="fff_Orders" FlatFileType="Delimited"  ColumnNamesInFirstDataRow="true" TextQualifier="&quot;">
            <Columns>
                <Column Name="OrderID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="CustomerID" DataType="StringFixedLength" Length="5" Delimiter="Comma"/> 
                <Column Name="EmployeeID" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="OrderDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="RequiredDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShippedDate" DataType="DateTime" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipVia" DataType="Int32" Length="0" Delimiter="Comma"/> 
                <Column Name="Freight" DataType="Currency" Length="0" Delimiter="Comma"/> 
                <Column Name="ShipName" DataType="String" Length="40" Delimiter="Comma"/> 
                <Column Name="ShipAddress" DataType="String" Length="60" Delimiter="Comma"/> 
                <Column Name="ShipCity" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipRegion" DataType="String" Length="15" Delimiter="Comma"/> 
                <Column Name="ShipPostalCode" DataType="String" Length="10" Delimiter="Comma"/> 
                <Column Name="ShipCountry" DataType="String" Length="15" Delimiter="CRLF"/> 
            </Columns>
        </FlatFileFormat>
    </FileFormats>
</Biml>

So, the Biml is easier to read now, but how are we going to automate it? Well, the ultimate solution is to use either C# or VB.NET code nuggets to loop through the entire file and create each package. This qualifies as BimlScript. However, this series is about short cuts which involve string concatenation. It’s faster and easier (for some), and I call this method Scripting Biml. In the next article, we’ll use Excel to generate our Biml dynamically, which will be our first productivity improvement.

The Export Orders Pkg and Environment Biml files we just created can be downloaded here.

Loving What You Do

Ever heard the following quote?

Choose a job you love, and you will never have to work a day in your life.

I can’t credit that because nobody knows who coined it. I always thought it was pie in the sky thinking, and tended to lean towards a work to live philosophy.

Don’t get me wrong, I find dimensional modeling, data warehousing, and many other BI tasks fascinating and rewarding. But it still feels like work. In particular, ETL development can become repetitious and boring after you’ve done it for years. It might surprise some that I seriously considered abandoning ETL development many years ago. I wanted to focus more on virtual OLAP models, and skip the physical data warehouse load entirely. Blasphemy! Luckily I found Biml and have steadily automated almost all of the tedium from my work.

Tim Costello and I – Pass Summit 2017

At a SQL Saturday event many years ago I ran into Tim Costello (Blog/Twitter) in the speaker room. He was hunched over his laptop actively working on something. I teased him by asking, “Hey, are you still writing your presentation?” He just laughed and kept messing with the computer. So, I sat near him so I could go my own PowerPoint presentation one last time.

I glanced over and Tim looked like he was playing a video game, lots of screen motion and color. He wasn’t, he was using Tableau. I asked him if he was fixing something for a client, but he wasn’t. He didn’t have a goal in mind, he simply loved playing with Tableau. Tim is someone who would do this stuff for free, and that inspires me.

Fast forward to two months ago. I had created a Biml framework which shreds most flat files and automatically stores the records into an Operational Data Store (ODS) for historical staging. In order to test all of the possible attribute variations, I needed to find or create many sample flat files representing each attribute value. For example, I needed to test delimited, fixed width, and ragged-right files. Files can be delimited with multiple characters, use various text qualifiers, have first row column names, etc. I could have created a SSIS package that exports a sample file and repeatedly edit it until all scenarios were covered. I started to do that and quickly realized how tedious and boring this would be.

Instead, I created a basic flat file export Biml framework that eventually became the basis for my latest blog series. Anyway, after several revisions it became a production level framework that will continue to provide value in many future projects. Why did I do this? Because, it was fun. It didn’t seem like work, I was “playing” with Biml, and was doing it for free.

Biml mastery is not easy, but it substitutes thought and effort for repetition and boredom. Now, I truly love what I do and can’t believe I get paid to work on puzzles every day.

So here’s to Tim, who’s on his own Biml journey. Thanks for the inspiration.

How to Import Existing SSIS Packages into Biml With BimlExpress

I’m asked the same question almost every time I present on Biml. “How do I import my existing SSIS packages?” Well in this post, I’m going to show you how to do it with BimlExpress 2019 for free.

This post is part of a series, Short Cuts to Biml Mastery which shows you how to learn BimlScript through a series of short cuts and remain productive while doing so.

What is BimlExpress?

BimlExpress is a free Visual Studio add-in created by the good folks at Varigence. Its a full featured Biml editor which allows you to dynamically create SSIS packages. It was first released back in 2017, and the latest version is 2019 (of course). The current version supports Visual Studio 2010 through 2019 as well as SQL Server 2005 through 2019.

Prior to it’s release, Biml was written with Bids Helper, now known as BI Developer Extensions. While BI Developer Extensions has many nice features, you should no longer use it to work with Biml as it is no longer being updated/supported.

How do I use it?

Well, first download the BimlExpress VSIX install file, run it, and follow the prompts. Next, please download the Northwind sample database and restore it to your development machine. Finally, download this VS Sample Project. As I mentioned previously, this simple package forms the basis for the next several blog posts. Open the package, you can see that it exports from a SQL Server table to a csv flat file. Edit the connection details if necessary, and create the location for the export file. The default location is C:\FileExport\.

SSIS Data Flow
Flat File General Tab

Flat File Advanced Tab

Most of the functionality of BimlExpress can be found in the custom toolbar. For our demo, make sure the Export Orders.dtsx package is selected, and then choose “Convert SSIS Packages to Biml.”

You can also right-click either the package or the entire project in the Solution Explorer to import as well.

Import Packages

This is the Import Packages window. There are four options for SSIS Asset Location:

  • DTProj from File System
  • ISPAC from File System
  • DTSX from File System
  • Project from Project Catalog

Select “DTSX from File System” since we are importing a single package. There are all sorts of Import Options which I don’t want to take the time to go into. Most often, I leave all of them unchecked anyway.

Imported Assets

Select Import and examine the imported assets. Everything looks correct, so click “Add to Project”.

Imported Biml

As you can see in the above screen shot, BimlExpress has made great strides in usability. Not only can you import packages, but you can edit BimlScript and see the results immediately in the Preview Pane. Notice that all of the Biml is imported into one file. We’ll examine and simplify the imported Biml in my next post.

To test your Biml file, edit the Package name to something different like “Export Orders Biml”. Save the change, right click on BimlScript.biml, and select “Generate SSIS Packages”. The package should generate and execute without error if you have the correct connection strings and such. Feel free to run the package and examine the resulting csv file. After the successful test, generate the package a second time and you’ll receive the following confirmation request to overwrite an existing package:

Confirm Overwrite

Next Steps

In the next article, we’ll examine our imported Biml code, clean it up, and start down the path of dynamic package generation.

Stay Tuned!

Short Cuts to Biml Mastery

I’ve been giving Biml related presentations for many years and shared what I believe are the most efficient ways to learn it. It’s possible to get enough Biml knowledge to become productive in a number of weeks, assuming one has time to focus on it. However context switching, commonly referred to as multi-tasking, kills the focus and concentration required to learn complex tasks. The unfortunate reality is that most employers can’t (or won’t) allow this kind of schedule flexibility.

So, how can you learn to apply Biml in a way that works within your schedule? Well, this series of articles will cover that very topic. I’ll show you some short cuts to generating Biml that will give you immediate productivity, so you can leverage “free” time to learn more.

In this series, we’ve been tasked by our boss to create several flat file extracts from a single database. The file specification is comma-separated values (CSV) format, double quote text qualified.

By the end of this series, we’ll have a completed, robust framework which will loop through a specified set of queries (or tables, views, etc.) and dynamically create an export for each. Additionally, it’ll support any
delimited flat file format.

Be sure to check back to this page for a list of articles as they are published.

Remembering Warren Thornthwaite

Warren Thornthwaite

I met Warren at my first PASS Summit (2010) after his “Dimensional Modeling Fundamentals” presentation. I think I embarrassed him a little when I gushed about how much I loved his books.

What? You didn’t know Warren? Well, let me tell you a bit about him. He was exceptionally kind, compassionate, brilliant, and generous with his time. He was a prolific presenter, consultant, and author. I’ve read all of the Kimball Group books in the past, but my favorite is The Microsoft Data Warehouse Toolkit. In fact, I was searching through that book yesterday, which is what prompted me to write this. Dimensional modeling is a timeless skill, and his books will forever impart this skill to those who seek to master it.

Warren was a master teacher and presenter. All of the Kimball Group presenters are/were excellent, but Warren was my favorite. He was brilliant, but self-effacing. I once saw him admit that a piece of code in his presentation wasn’t actually his, when an audience member asked a question about it. He was one of those rare people who was not afraid to say, “I don’t know.”

In 2013, I found myself involved in a very large and complicated data warehouse project. I had been thrust into the position of Architect and was in charge of a large technical team. This project had already been attempted and failed by several different teams. Anyway, I called in several experts to advise and work on the project and assure the business that the project was progressing efficiently. Since money was no object, I called in Warren to review and advise on the data model. Basically, he was there to check my homework. He spent several days on-site with us, and we all learned a great deal from him. I found him to be a masterful consultant as well. He didn’t look for “problems” just to justify his time. He wasn’t pushy or arrogant, even if someone asked him really basic questions. He didn’t lecture but rather relied more on the Socratic method of teaching. Dimensional modeling is a topic where spirited debates come up, some may even call it arguing. Warren didn’t argue, he listened to you and responded with subtle questions, he got you to realize your mistakes on your own.

Why do I keep referring to Warren in the past tense? He was diagnosed with a brain tumor shortly before joining my project. Regardless, he was the consummate professional, filled with enthusiasm and energy, and didn’t tell me until the end. The guy was fighting for his life, and we never knew it. Warren Thornthwaite passed away five years ago today.

Warren really loved data warehousing. He was immersed in it, and we bonded over our mutual obsession. I still miss him after all these years.

Feel free to share your experiences with him in the comments below.

Presenting Biml Frameworks Tonight at North Texas SSUG!

If you are in the DFW area, come down to the North Texas SQL Server User Group tonight at the Microsoft building in Irving, Tx. The meet and greet starts at 6:00pm and the presentation follows at 6:30pm. I’ll be giving the following brand new, deep dive presentation.

Creating Your Own ETL Automation Framework with Biml for Beginners

You may have heard of Business Intelligence Markup Language or even tried to learn it. But how can you take the time to learn it, while getting the rest of your work done. In this highly interactive presentation, I’ll guide you through the process of learning/using Biml, without sacrificing your daily productivity.
We’ll cover how to:

  • Identify “easy” ETL patterns to BimlScript
  • Take incremental steps in automating these processes
  • Move to metadata driven BimlScript
  • Tie it all together into a full-fledged framework

As a bonus, you’ll leave the presentation with a completed framework which dynamically generates most types of flat files that you can use in your own environment.

Anyway, additional details about the meeting can be found here. Hope to see some of you there!

Presentation Code Download

I'm Presenting Agile Data Warehouse Design with #Biml Tomorrow at SQL Saturday Houston!

This is just a quick note that I’ll be presenting an updated version of the following talk tomorrow at SQL Saturday Houston.

Agile Data Warehouse Design with Biml

Business Intelligence Markup Language is literally changing the way we think about Data Warehouse design and ETL. In this interactive presentation you’ll see how you can rapidly iterate through the Data Warehouse Development Cycle and provide ROI almost immediately. Learn how you can use BimlScript to automatically create staging tables, SSIS Packages, and other necessary objects during the modeling process.
Presentation Code Download

It’ll be a great time, and I can’t wait to see some of you there!

Presenting Azure Data Factory and SSIS Automation With Biml Tonight in Fort Worth!

If you are in the DFW area, come down to the Forth Worth SQL Server User Group tonight at the Frees and Nichols Headquarters. The meet and greet starts at 6:30pm and the presentation follows at 7pm. I’ll be giving an updated version of the following presentation.

Azure Data Factory and SSIS Automation With Biml

SSIS was originally designed to work in a world where all data was structured and on premise. As modern enterprise data projects increasingly move online, tools like Azure Data Factory (ADF) can extend SSIS capabilities to the cloud. Azure Data Factory is a flexible and powerful cloud based ETL service, but it doesn’t replace the functionality of SSIS.

In this session you will learn the advantages and disadvantages of both ETL platforms; how to use both in tandem to maximize productivity;
and how both ETL platforms, as well as any supporting structures, can be be automatically generated with Biml Script.

Anyway, additional details about the meeting can be found here. Hope to see some of you there!

I'm Presenting Tomorrow at SQL Saturday Dallas BI Edition!

This is just a quick note that I’ll be presenting an updated version of the following talk tomorrow at SQL Saturday Dallas.

Agile Data Warehouse Design with Biml

Business Intelligence Markup Language is literally changing the way we think about Data Warehouse design and ETL. In this interactive presentation you’ll see how you can rapidly iterate through the Data Warehouse Development Cycle and provide ROI almost immediately. Learn how you can use BimlScript to automatically create staging tables, SSIS Packages, and other necessary objects during the modeling process.
Presentation Code Download

Also, I’m bringing a bunch Biml t-shirts with me, courtesy of my friends at Varigence.

It’ll be a great time, and I can’t wait to see some of you there!