Writing T-SQL Merge Statements the Right Way

In a previous article, I discussed Merge statement basics. However, in extensive testing I’ve come to realize that my article, like most articles I’ve read about Merge, leaves out or mishandles several important aspects. Rather than edit that article, I’ve decided to publish a series of articles which I hope will clear up some of these misconceptions. If you already read the original article, a good portion of this will be review, as I’m using the original as a base for this one.

Let’s start by glancing at the syntax portion of the Books Online T-SQL Merge Page. I’ll take the liberty of re-posting just the first 25% or so below.

[ WITH <common_table_expression> [ ,...n ] ]
MERGE
    [ TOP ( expression ) [ PERCENT ] ]
    [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ]
    USING <table_source>
    ON <merge_search_condition>
    [ WHEN MATCHED [ AND <clause_search_condition> ]
        THEN <merge_matched> ] [ ...n ]
    [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
        THEN <merge_not_matched> ]
    [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
        THEN <merge_matched> ] [ ...n ]
    [ <output_clause> ]
    [ OPTION ( <query_hint> [ ,...n ] ) ]

<target_table> ::=
{
    [ database_name . schema_name . | schema_name . ] target_table
}

<merge_hint> ::=
{
    { [ <table_hint_limited> [ ,...n ] ]
      [ [ , ] INDEX ( index_val [ ,...n ] ) ] }
}
Simple, right? Great, I guess I’m done here… No, seriously, who can easily absorb that? So, what is Merge really, and how do we use it?

T-SQL Merge Basics

In a nutshell, the Merge statement allows you to Insert, Update, or Delete data in an entity, referred to as the Target, with data from another entity called the Source. The entities are compared on the fields which uniquely identify records in each, a Join if you will. Notice how I keep using the word entity rather than table; the reason is that the Target and Source could be many SQL Server objects such as Tables, Temp Tables, Views, Table Variables, or even Common Table Expressions. The Source could also be a complete Select statement. In this case, for the sake of simplicity, I’ll use Temp Tables.
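For instance, nothing stops the Source from being a derived Select statement. Here’s a minimal sketch of the idea; the table and column names are hypothetical, and the full syntax is explained below:

```sql
-- Hypothetical sketch: a Select statement as the Merge Source.
MERGE dbo.DimCustomer AS Target
USING (SELECT CustomerNum, CustomerName, Planet
       FROM   dbo.StageCustomer
       WHERE  LoadBatch = 42) AS Source   -- LoadBatch filter is illustrative only
   ON Target.CustomerNum = Source.CustomerNum
 WHEN NOT MATCHED BY TARGET THEN
      INSERT (CustomerNum, CustomerName, Planet)
      VALUES (Source.CustomerNum, Source.CustomerName, Source.Planet);
```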

I think most people learn best from examples, by doing rather than reading descriptions of syntax, so I’ve provided a brief script to create the tables required for the following example.

IF OBJECT_ID ('tempdb..#Customer_Orig') IS NOT NULL DROP TABLE #Customer_Orig;
IF OBJECT_ID ('tempdb..#Customer_New')  IS NOT NULL DROP TABLE #Customer_New;
CREATE TABLE #Customer_Orig
(  CustomerNum    TINYINT NOT NULL
  ,CustomerName   VARCHAR (25) NULL
  ,Planet         VARCHAR (25) NULL);
CREATE TABLE #Customer_New
(  CustomerNum    TINYINT NOT NULL
  ,CustomerName   VARCHAR (25) NULL
  ,Planet         VARCHAR (25) NULL);
INSERT INTO #Customer_Orig (CustomerNum, CustomerName, Planet)
   VALUES (1, 'Anakin Skywalker', 'Tatooine')
         ,(2, 'Yoda', 'Coruscant')
         ,(3, 'Obi-Wan Kenobi', 'Coruscant');
INSERT INTO #Customer_New (CustomerNum, CustomerName, Planet)
   VALUES (1, 'Anakin Skywalker', 'Tatooine')
         ,(2, 'Yoda', 'Coruscant')
         ,(3, 'Obi-Wan Kenobi', 'Coruscant');

So, I’ve created two temporary tables called Customer_New and Customer_Orig with identical data in each. In this case, Customer_New will be the Target and Customer_Orig will be the Source.
Merge Base
Now, we’re going to make the following changes to Customer_Orig. Notice that I’ve performed a single Insert, Update, and Delete.

-- Update Yoda's Name
UPDATE #Customer_Orig
   SET CustomerName = 'Master Yoda'
 WHERE CustomerNum = 2
-- Delete Anakin
DELETE #Customer_Orig
 WHERE CustomerNum = 1
--Add Darth
INSERT INTO #Customer_Orig (CustomerNum, CustomerName, Planet)
VALUES (4, 'Darth Vader', 'Death Star')

Merge Changes To Base

Now, being the geek that I am, I realize that Anakin became Darth Vader, which could have been seen as a change in name. However, Obi-Wan clearly states that Darth Vader betrayed and murdered Anakin, effectively becoming a new person. If that bothers you, then you’re a scruffy looking nerf herder.

Old School CRUD

Prior to SQL Server 2008, we could have accomplished this merge with the following code. Please note that I have used Joins in my example, which will align with my Merge code later.

--Process Updates
UPDATE Tgt
SET    Tgt.CustomerName = Src.CustomerName, Tgt.Planet = Src.Planet
FROM   #Customer_New Tgt INNER JOIN #Customer_Orig Src ON Tgt.CustomerNum = Src.CustomerNum
WHERE  Tgt.CustomerName <> Src.CustomerName OR Tgt.Planet <> Src.Planet; -- Eliminates needless updates.
--Process Inserts
INSERT INTO #Customer_New
  SELECT Src.CustomerNum, Src.CustomerName, Src.Planet
  FROM   #Customer_Orig Src LEFT JOIN #Customer_New Tgt ON Tgt.CustomerNum = Src.CustomerNum
  WHERE  Tgt.CustomerNum IS NULL;
--Process Deletes
DELETE Tgt
FROM   #Customer_New AS Tgt LEFT JOIN #Customer_Orig Src ON Tgt.CustomerNum = Src.CustomerNum
WHERE  Src.CustomerNum IS NULL;

This works, but it’s less than optimal for a few reasons. First, writing those statements is tedious, especially if this had been a typical table with 20+ fields to deal with. Second, this represents three separate units of work, and SQL Server processes them that way. Once you understand it, the T-SQL Merge statement is easier to write and can accomplish all of this in one transaction. It’s basically a win-win.

Components of Merge Statements

So, let’s break a Merge statement into its component parts to make it easy to understand. First, the Target and Source tables are specified along with the business key which identifies each record. This is the field that one would use in a join.

MERGE  #Customer_New AS Target
 USING #Customer_Orig AS Source
    ON Target.CustomerNum = Source.CustomerNum

The When Matched clause determines what will happen when records exist in both the Source and Target with the same CustomerNum. Notice the additional conditions I’ve added, which limit the updates to records where a value has actually changed. Strictly speaking, this isn’t required, but without it every matched record in the target would be updated regardless of need, which wastes resources.

  WHEN MATCHED
                AND (Target.CustomerName <> Source.CustomerName 
                  OR Target.Planet <> Source.Planet)
  THEN UPDATE SET --Updates Yoda's Name
      Target.CustomerName = Source.CustomerName,
      Target.Planet = Source.Planet

The When Not Matched by Target clause specifies what should be done with records in the Source that aren’t in the Target. The typical scenario is to insert records which are new. I could have added additional conditions or only added certain new records as well.

  WHEN NOT MATCHED BY TARGET THEN
   INSERT (CustomerNum, CustomerName, Planet) -- Inserts Darth
   VALUES (Source.CustomerNum, Source.CustomerName, Source.Planet)

The When Not Matched by Source clause specifies what should be done with records in the Target that aren’t in the Source. In our example, those records should be deleted. Keep in mind that if this was a Staging table which wasn’t comprehensive, perhaps the result of an incremental extraction, then you’d want to omit this portion of the statement.

  WHEN NOT MATCHED BY SOURCE THEN
   DELETE -- Deletes Anakin

Also, keep in mind that any Merge statement must be terminated with a semicolon. So, when you put your script together, it looks like the following. Go ahead and run it on your test data.

MERGE  #Customer_New AS Target
 USING #Customer_Orig AS Source
    ON Target.CustomerNum = Source.CustomerNum
  WHEN MATCHED
                AND (Target.CustomerName <> Source.CustomerName
                     OR Target.Planet <> Source.Planet)
  THEN UPDATE SET
      Target.CustomerName = Source.CustomerName
     ,Target.Planet = Source.Planet
  WHEN NOT MATCHED BY TARGET THEN
   INSERT (CustomerNum, CustomerName, Planet)
   VALUES (Source.CustomerNum, Source.CustomerName, Source.Planet)
  WHEN NOT MATCHED BY SOURCE THEN
   DELETE;

Result of Merge

What about Null Values?

Ah, very astute of you to notice that, my young padawan. This is the first mistake that many people make. You’ll notice in the When Matched portion above that I also check whether a value changed before I run my update. However, depending on your settings, Null values are not equal to each other. Therefore, a record will not be updated if either side of the comparison is NULL. A great explanation of NULL handling can be found here. First, let’s set up a NULL field comparison issue. Run the following update statement against the #Customer_Orig table.

Update CO set Planet = NULL
  FROM #Customer_Orig CO
 WHERE CustomerNum = 3 -- Obi-Wan Kenobi

If you’ll re-run the original merge statement, and compare the tables, you’ll find the following:
Result of Merge Non NULL Handling
Notice that Obi-Wan’s record in Customer_New has not been updated. There are a few ways to deal with this which I’ve seen online. The first two methods each have their problems, which I’ll briefly explain. The third method, using Select and Except, is the one I recommend, so feel free to “skip ahead” to it.

Adding “Is NULL” Statements
We could re-write the When Matched clause like this:

                AND (   Target.CustomerName <> Source.CustomerName
                     OR Target.CustomerName IS NULL
                     OR Source.CustomerName IS NULL
                     OR Target.Planet <> Source.Planet
                     OR Target.Planet IS NULL
                     OR Source.Planet IS NULL)

Technically this method works, in that it will update if one of your values is NULL. However, it’s really cumbersome to write; imagine writing a statement like that for 50 columns. Also, the resulting query plan is complex. Perhaps the biggest problem is that if both of your values are NULL, then an update will happen anyway. Go ahead and substitute the code above and re-run the merge statement where one of the values is NULL. You should receive the message “(1 row(s) affected).” However, if you repeatedly run the merge statement you will ALWAYS receive “(1 row(s) affected),” again depending on your database settings. For these reasons, this method isn’t recommended.

Using ISNULL or Coalesce
Either ISNULL or Coalesce can be used to substitute a comparable stand-in value for NULL. Coalesce is generally the better choice because it chooses the best datatype between arguments based on datatype precedence. Therefore, we could use the same basic Coalesce statement for most comparisons without throwing an error, and the When Matched portion from above becomes the following:

                AND (COALESCE(TARGET.CustomerName, '') <> COALESCE(SOURCE.CustomerName, '') 
                  OR COALESCE(TARGET.Planet, '') <> COALESCE(SOURCE.Planet, ''))

This works as well, with some caveats. First, the second value in each Coalesce statement has to be something that should NEVER occur in that column; therefore an empty string '' is most likely not a good choice. Second, notice how I emphasized that Coalesce will automatically cast most comparisons; there are some data types that will fail and require you to use something other than a default string value. The nice thing about this method is that the T-SQL code is short and easy to understand.
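To illustrate the data type caveat with a hypothetical column, a uniqueidentifier can’t be compared against an empty-string sentinel at all; Coalesce would try to convert '' to a uniqueidentifier and throw a conversion error. A sketch of a type-appropriate workaround (SomeGuidCol is a made-up column name):

```sql
-- Sketch: COALESCE(SomeGuidCol, '') fails with "Conversion failed when
-- converting from a character string to uniqueidentifier", so a typed
-- sentinel is required instead of the default empty string.
                AND (COALESCE(Target.SomeGuidCol,
                              CAST('00000000-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER))
                  <> COALESCE(Source.SomeGuidCol,
                              CAST('00000000-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER)))
```

Of course, the all-zeros GUID then becomes a value that must never legitimately occur in that column, which is exactly the caveat described above.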

Using Select and Except to Handle Nulls (Recommended)

I used the Coalesce method successfully for some time until I came across this excellent article by Paul White. In it he correctly points out that when using Except or Intersect, Null values are considered equal. Therefore, when comparing two fields with NULL values, they will be treated as equivalent, and when comparing a NULL value to some other actual value, they will evaluate as Not Equal. He seems to favor using “Not Exists” and Intersect. However, using Exists and Except just makes more sense to my brain. Therefore, I’d write the query like the following. If you take this code and run it on the original tables, as well as the NULL test, you’ll see that it works perfectly.

MERGE  #Customer_New AS Target
 USING #Customer_Orig AS Source
    ON Target.CustomerNum = Source.CustomerNum
  WHEN MATCHED
   AND EXISTS (SELECT Source.CustomerName, Source.Planet
                EXCEPT
               SELECT Target.CustomerName, Target.Planet)
  THEN UPDATE SET
      Target.CustomerName = Source.CustomerName
     ,Target.Planet = Source.Planet
  WHEN NOT MATCHED BY TARGET THEN
   INSERT (CustomerNum, CustomerName, Planet)
   VALUES (Source.CustomerNum, Source.CustomerName, Source.Planet)
  WHEN NOT MATCHED BY SOURCE THEN
   DELETE;

Limitations of Merge?

Of course, there are always limitations. The most important limitation is that both data sources should be on the same SQL Server instance. I’ve seen people use Merge with linked servers, but I wouldn’t recommend it.

Another factor that might give you some pause is that these statements are fairly complex and wordy. A long merge statement, such as merging two tables with 25 fields each, is tedious to write and it’s very easy to make a simple mistake. Well, stay tuned because later in this series I’ll share some code which will practically write the statements for you.

In the next article in this series we’ll discuss how to use the Merge statement with the Output clause as it’s required to load Slowly Changing Dimensions.

Changes in Priorities and Direction

Buy In

Everyone who knows me knows that I absolutely love Data Warehousing. I’m one of those fortunate few who finds their work fascinating, and I tend to dive head first into new projects.

I was recruited to join a team on a multi-terabyte data warehousing project late last summer. I was leery to take the position because the project had already been attempted by numerous other teams, which had failed and subsequently no longer worked for the company. However, the Architect in charge was very persuasive, and the lure of a ground-breaking project, using some of the best toys out there, was something that I couldn’t resist. I knew going in that this might be a short-term situation, but I calculated that the benefits outweighed the negatives.

After I started, it became obvious that the team and project was in a state of flux. The Architect, who I previously mentioned, left shortly after I started, leaving the team to push forward anyway. The executive in charge of the project was fond of saying, “Teams pick their own captains,” and that’s what happened here as I was shortly placed in charge of the project as the new Architect.

I was running a team of a dozen in-house data professionals, as well as directing the efforts of world-class consultants in SSIS and Dimensional Modeling. Though I already knew these consultants personally, working directly with them was one of the highlights of my career so far. While I’ve already performed almost every individual role in the complete data warehouse life cycle, being in the position of Architect (and project leader) was particularly enlightening since it gave me a deeper appreciation of the entire process.

It was an amazing, rewarding, and exhausting experience. I got the chance to prove my skills to some of the industry’s best; finding solutions to problems that stymied some of the biggest names in our industry. However, it was not all rainbows and butterflies.

Stress and Burnout

In many data warehouse projects, the technical challenges (though formidable) can pale in comparison to the political and social challenges. As many of us have experienced, business decisions aren’t always based on technical reality. I worked 70 to 80 hours a week for more than 6 months to try to fulfill insanely aggressive deadlines, with no end in sight. I’ve always been a bit of a workaholic, but I had completely lost my work/life balance. Being the geek that I am, with all of the learning that was taking place, I didn’t see the situation clearly. It was taking a toll on all of the other aspects of my life. I wasn’t sleeping well and wasn’t eating properly. I fell way behind in my personal and professional correspondence.

Moments of Clarity


My sister, Regan, and I in 2011

My sister became critically ill the third week of March, and I caught the first flight back home to Michigan to be with family. I stayed a week, but the doctors said the crisis would be long term, and they wouldn’t be able to provide a prognosis for weeks or months. I did not want to leave, but I had a multi-million dollar project with a large team that was counting on me in DFW. This incident put a spotlight on what I was doing to myself and my family due to overwork and lack of balance. After a good deal of quiet reflection, I gave two weeks notice to my previous employer near the end of March. Tragically, my sister died early Easter morning. She was only 41, had young children, and her loss devastated my entire family.

Lessons Learned

The death of my sister really put an exclamation point on the whole mess. I’ve learned a lot through this whole ordeal, and I’ll share some of them with you in the future. However, the most important thing I learned was that while career is important, and what we do affects a lot of people, it isn’t as important as your physical and emotional well-being. Family commitments absolutely have to come first. No matter how fascinating the project is, I cannot neglect my personal life for very long. It’s not fair to myself, my family, or my clients.

New Beginnings

It’s always important to focus on what is good in our lives. My career continues to go well. As I was deciding to leave my previous gig, one of my independent consulting clients came to me needing some long term help with their data warehouse project. It’s a fascinating project, terabytes of data, challenging text file imports, and I’m learning C# to boot. This gig should last for several months and has been perfect for me as I transition to the next phase of my career. This good fortune can be directly attributed to my work in the PASS Community. I want to sincerely thank my #SQLFamily for all of their help and support.

More to come in future posts.

Setup and Performance Issues with the Integration Services (SSIS) 2012 Catalog

For those who don’t know, I’m currently the Data Warehouse Architect for a large scale and complex multi-tenant data warehousing project. Fortunately, the project is being created in SQL Server 2012 and I’m thrilled to be using the new features. Let me preface this blog post by admitting that my experience with using the SSIS 2012 Catalog, SSISDB, is fairly limited, but then again most people are in the same boat since adoption of 2012 has been fairly slow.

However, I’ve run into some problems which aren’t getting a lot of attention yet, but I’m fairly certain they will in the near future. Let’s break them down.

Initial Catalog Setup Issues

I won’t explain step by step how to create the Integration Services Catalog, because others already have. However, you’ll notice from the following screenshot that very few settings are available when the catalog is created.

I wish that Microsoft would upgrade the Create Catalog screen to enable the user to specify SSISDB settings upon creation. To see what I mean, right-click SSISDB under Integration Services Catalogs and select Properties. The following settings can and should be set:

  • Whether to Clean Logs Periodically – Defaults to True
  • Retention Period in Days – Defaults to 365
  • Default Logging Level – Defaults to Basic
  • Maximum Number of Old Versions Per Project – Defaults to 10
  • Whether to Periodically Remove Old Versions – Defaults to True


I wish we could set those at create time. As I’ll discuss in a minute, the default values can be disastrous for many installations.

SSISDB Database Ignores Model Database Settings

For some reason, when you create the SSISDB Catalog, the corresponding SSISDB database ignores the Model database settings. It appears that no matter what my Model database is set to, SSISDB defaults to fixed initial file sizes for the Data File and Log, a 10% Growth Rate, and the Full Recovery Model. I’ve reached out to Microsoft, but haven’t received an answer as to why.

This has become problematic for us because we’ve deliberately set up our Model Databases so that Simple Recovery is used in our Development and QA Environments while Full Recovery is used in our Production environments. I’ve had to set up periodic checking on all of our servers looking for databases in full recovery mode that aren’t getting transaction log backups to avoid runaway T-Log growth issues.
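That periodic check can be as simple as a query like the following sketch; the 24-hour threshold is an assumption, so tune it to your own log backup schedule:

```sql
-- Sketch: flag databases in Full Recovery with no recent transaction log backup.
SELECT    d.name,
          d.recovery_model_desc,
          MAX(b.backup_finish_date) AS LastLogBackup
FROM      sys.databases d
LEFT JOIN msdb.dbo.backupset b
       ON b.database_name = d.name
      AND b.type = 'L'                     -- 'L' = transaction log backup
WHERE     d.recovery_model_desc = 'FULL'
      AND d.database_id > 4                -- skip the system databases
GROUP BY  d.name, d.recovery_model_desc
HAVING    MAX(b.backup_finish_date) IS NULL
       OR MAX(b.backup_finish_date) < DATEADD(HOUR, -24, GETDATE());
```

Any database this returns is a runaway T-Log waiting to happen.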

The script I’ve used to re-set the SSISDB database is as follows:


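A minimal sketch of such a reset follows. The logical file names 'data' and 'log' are the SSISDB defaults in my experience, so confirm yours via sys.database_files, and treat the growth increments as placeholders for your environment:

```sql
-- Sketch: bring SSISDB in line with our non-production standards.
-- Switch out of Full Recovery so the log stops growing unchecked.
ALTER DATABASE SSISDB SET RECOVERY SIMPLE WITH NO_WAIT;
-- Replace the 10% growth rate with fixed increments.
-- Logical names assumed; verify with:
--   SELECT name FROM SSISDB.sys.database_files;
ALTER DATABASE SSISDB MODIFY FILE (NAME = data, FILEGROWTH = 256MB);
ALTER DATABASE SSISDB MODIFY FILE (NAME = log,  FILEGROWTH = 128MB);
```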
SSISDB Performance Issues

I’ve come to the conclusion that the SSISDB Database will not scale at all. I became painfully aware of this because of a problem we had in a “pseudo” Production environment. Our first nasty surprise was T-Log growth, because the database incorrectly defaulted to Full Recovery mode, which I fixed by shrinking the log. Before you mentally criticize me, dear reader, please keep in mind this wasn’t an actual Production environment and the data being kept in SSISDB was inconsequential so far.

Another issue became apparent shortly afterward. We were running with the default settings of Basic level logging, 365 Days Retention, and 10 Versions of the project being kept. Our SSIS project is already fairly large with around 50 packages, but it was only being run periodically in this environment approximately once or twice per week. We are not even hitting this thing very hard. However, the SSISDB database size was growing rapidly anyway. In this particular situation, it had grown to 60GB. While I realize that 60GB is not very large, SSISDB performance slowed to a crawl. In any of our environments, when the SSISDB approaches 50GB the performance degrades dramatically. The first sign of this is that the built in reports take longer and longer to return data and eventually seem to stop returning data at all.
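If you want to see where that space is going (in my case the logging tables dominated), a quick size-by-table query works. A sketch, run from within SSISDB:

```sql
-- Sketch: list SSISDB tables by reserved space, largest first.
USE SSISDB;
SELECT   s.name + '.' + t.name                   AS TableName,
         SUM(ps.reserved_page_count) * 8 / 1024  AS ReservedMB  -- 8KB pages
FROM     sys.dm_db_partition_stats AS ps
JOIN     sys.tables                AS t ON t.object_id  = ps.object_id
JOIN     sys.schemas               AS s ON s.schema_id  = t.schema_id
GROUP BY s.name, t.name
ORDER BY ReservedMB DESC;
```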

Unfortunately this pseudo Production server was set up with very little hard drive space, 100GB. So, I started receiving free space warnings after the SSISDB database reached 60GB.

I did a little research about reducing the size and performance issues with the SSISDB. I lowered the Retention Period to 7 days, set the Default Logging Level to None, and reduced the Maximum Number of Old Project Versions to 1. We don’t need to save old versions of our project as we are using TFS source control. You are using source control, right?

Keep in mind that changing your logging level to None is not advisable in most production environments; however, at this time we are using a 3rd party product to monitor our packages, so this isn’t an issue for us. I suspect that most of you will use the Basic level, but scale the Retention Period back and run the cleanup job often to avoid performance issues. Anyway, I changed my settings using the following script.

EXEC [SSISDB].[catalog].[configure_catalog] @property_name=N'MAX_PROJECT_VERSIONS', @property_value=1
EXEC [SSISDB].[catalog].[configure_catalog] @property_name=N'RETENTION_WINDOW', @property_value=7
EXEC [SSISDB].[catalog].[configure_catalog] @property_name=N'SERVER_LOGGING_LEVEL', @property_value=0

Then I proceeded to start the “SSIS Server Maintenance Job” automatically generated when you create the catalog. Wow, that was a mistake.

Don’t Touch That Button

The job ran for 2 hours and blew up the transaction log, even though the database was in the Simple Recovery Model. I then started receiving emails that the server had completely run out of space. When I checked the error log, I found an error indicating that the Log was out of space and a checkpoint record could not be written.

I immediately started researching and found this article about the cleanup job causing user packages to fail. If you read between the lines, you’ll see that the SSISDB is not properly indexed to handle the cleanup job. In my case, I had to drop the SSISDB, recreate it, and apply the script above to avoid similar issues in the future. Once again, this was not a huge problem for me because it wasn’t truly a Production environment and we rely on a third party product for our SSIS logging anyway. Don’t forget that even if you do change your settings, you must also schedule the cleanup job to run regularly. If you allow too much history to accumulate, you may suffer the same fate I did when running it.
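One way to reduce that risk, offered as a suggestion rather than a guarantee, is to walk the retention window down in steps instead of one giant jump, running the cleanup after each step so no single run has a year of history to delete. A sketch (the step size is arbitrary, and internal.cleanup_server_retention_window is the procedure the maintenance job calls):

```sql
-- Sketch: step RETENTION_WINDOW down gradually (e.g., 300 -> 250 -> ... -> 7)
-- so each cleanup pass deletes a manageable slice of history.
DECLARE @days INT = 300;
WHILE @days > 7
BEGIN
    EXEC SSISDB.catalog.configure_catalog
         @property_name = N'RETENTION_WINDOW', @property_value = @days;
    EXEC SSISDB.internal.cleanup_server_retention_window;
    SET @days = @days - 50;
END
-- Land on the final target.
EXEC SSISDB.catalog.configure_catalog
     @property_name = N'RETENTION_WINDOW', @property_value = 7;
EXEC SSISDB.internal.cleanup_server_retention_window;
```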

My take on all of this is that SSISDB is not ready for prime time. I don’t think I’m being unfair with that statement. I do appreciate the fact that the SSIS Catalog is a great addition to SSIS 2012, but I expect to be able to use it in Production, running packages every 5 or 15 minutes, retaining months worth of history, and for it to perform regardless.

After writing this article I ran across this excellent article on SSIS 2012 Catalog Indexing, and time permitting I will implement his suggestions and report back. Are any of you having similar issues with your SSIS 2012 Catalog (SSISDB)?

Upcoming Presentations This Week

I’m in for a busy week as I have back to back presentations this Wednesday and Thursday nights.

On Wednesday the 20th, the Fort Worth SQL Server User Group has asked me to present one of my favorites. I’m constantly revising this presentation as I come across new mistakes to share. So, even if you’ve seen it, there will still be new material to shock and amaze you.

Data Warehouse Mistakes You Can’t Afford to Make

Many data professionals understand the basics of Data Warehouse design, including Dimension and Fact Tables, slowly changing Dimensions, and the use of meaningless surrogate keys. However, it isn’t until you’ve created a dimensional model and put it into production, that you realize just how much of an impact seemingly trivial mistakes can make. They can hobble performance, allow inaccuracy, and perhaps worst of all, inhibit adoption and usage of the new system.

Learn how to avoid many common mistakes, from someone who’s made them and then found ways to correct them.

On Thursday the 21st, I’ll be continuing my Data Warehouse Series for the North Texas SQL Server User Group.

Data Warehouse Series – Basics and Beyond

In a series of 1 hour sessions, I’ll cover both the basics and gotchas of Data Warehouse Development using the SQL Server 2012 toolset. Some of the things we’ll cover are:

  • Basic methodology for evaluating business requirements/needs, obtaining user cooperation, and identifying source data quality issues.
  • Data Warehouse basic concepts including Facts, Dimensions, Star and Snowflake Schemas.
  • Deep dive into slowly changing Dimension types and how to handle them using T-SQL and SSIS.
  • T-SQL Merge Fundamentals and how it can be used to load Dimension and Fact Tables. Yes, I swear the Merge Generation Code will be ready for the presentation. 🙂
  • Understanding the three basic Fact Table types, how they are used, and loaded with SSIS.
  • Date Dimension creation and configuration.
  • Taking advantage of SSIS 2012 functionality improvements including Environments, Project Parameters etc.
  • Strategies for SSIS and database optimization, ensuring data integrity, etc.

And much, much more. Each session will build upon the previous sessions so they will be recorded and posted to my blog. Come watch this exciting and progressive series as I take you from padawan to Jedi Master.

I can’t wait to see my friends at the meetings.

Gratitude, Resolutions, and Shit

“Shit!” I said, followed by even more graphic expletives. I lost you? Sorry, I forget that you don’t follow me around. Let me explain.

Like most people, I have a junk room filled with boxes and plastic storage bins. After Thanksgiving I was in the garage looking for our stored Christmas ornaments and came across a very large bin that was unlabeled. When I opened it, I found a ton of brand new clothes, including many pairs of Dockers pants, all of them with the tags still on. To understand my anger we need to back up several years.

Clowning around, but not really happy.

Around 15 years ago, on my way to morbid obesity, I was outgrowing my clothes much faster than they wore out. During the rise, I bought a bunch of Dockers Slacks and other clothes for work because I got them on sale. However, because my weight rose so fast, I had pairs of them that were never worn. I never even took the tags off. I could have returned them, but I was embarrassed because of the reason. Those of you who know me know that I’m unfailingly honest. I couldn’t look a clerk in the face and tell them that I had to return the pants because I was too fat to wear them. For all of these years I have kept them in a plastic bin, telling myself that I would lose weight and wear them again someday.

I carted those pants from apartment to apartment. I had garage sales and didn’t sell those pants. Even as I gave away half my stuff when I packed up everything to move across the country years ago, I kept that damn bin. I knew it would feel great to be able to wear them. However, I lost track of it in my latest move.

So, why the expletives? Well, in case anyone doesn’t know I had weight loss surgery back in April. I figured that as my weight dropped I would be able to use the brand new Dockers (and other clothes) from that bin. However, by the time I found it they were already several sizes too large for me anyway. Sometime in December I passed 100 lbs of weight lost. So, my immediate reaction was anger that I had wasted the money, time, and effort carting around those stupid pants only to never get to wear the damn things.

Then, “Bang” I felt like something hit me upside the head.


Who’s this guy? – Photo courtesy of Tim Mitchell

What the heck was I thinking? I have so many things for which I should be grateful. I’ve lost a hundred pounds, 10 inches from my waist, and feel great. Weight Loss Surgery saved my life. I would absolutely do it all over again, and again, and again if necessary. I’ve reached a point in my career where money isn’t a problem, who cares if I had to donate a couple of hundred dollars in clothes to Goodwill? I’ve already donated thousands of dollars of clothes as I’ve gone down through the sizes. There are lots of people in my #SQLFamily who care and I’m surrounded by great people at the North Texas SQL Server User Group as well.

I have been working for the past several months for a company in Carrollton, Texas on a fascinating multi-tenant data warehouse project as the Data Warehouse Architect. The project is amazing, we’re using all of the best toys (tools), and I’ve brought in some of the very best consultants to assist with it. I have my SQL Family to thank for that. More on that in a later article.

Anyway, life is too short to dwell on meaningless things like Dockers pants. Life is good.


This is the first year that my primary resolution is NOT to lose weight, though I intend to lose another 20 to 40 pounds this year. I’m back in the gym and have been investigating Aikido Dojos in the area. I have Steve Jones (Blog/Twitter) to thank for pushing me back toward the art. I can tell you that if he likes you, the man is relentless. So, I promise to be attending one before the end of January.

I want to start getting certified in the BI Stack. I’m calling myself out right now in front of all (both) of my readers. I will get at least 3 certifications this year and will shoot for more.

Regardless of my difficult work schedule, I’ll take more time to blog and present this year. In fact, I’ve just agreed to present an ongoing series at the North Texas SQL Server User Group meetings called, “Data Warehouse Series – Basics and Beyond” and I hope to see some of you there. I love giving back to my #SQLFamily and need to do as much as I can this year. Mentoring others in our profession makes me feel great.

Having shed almost a full person, I have more energy than ever. Watch out folks, you ain’t seen nothing yet.

PASS Summit and My SQL Family

Those who read this blog know that I’ve always been into the PASS Community, particularly about attending the PASS Summit. This is my third and I hope never to miss one.

I’ve written articles about why it’s worth paying your own way, how going is like finding a golden ticket and even like going on vacation. Many people, including my “real” family don’t really understand why I love my SQL Family (#sqlfamily) so much.

Changes and The Future

I’m particularly geeked about this one because so much has changed in the past year. I’ve lost almost 100 pounds and feel so much better now. Also I’ve switched jobs and am now working as a corporate employee on a huge data warehousing project which is unlike anything I’ve ever seen before. Last night I went to dinner with most of the authors of SQL Server 2012 Integration Design Patterns, shared a few tidbits of the project, and they agreed that it was something special. More on the book and my current project to come in a future article.

A Scare Brings Focus

Anyway, a few recent things really brought me back to why I am so committed to this community. I thought that I had lost the blog a few weeks ago. One of my readers notified me that my site was down and I had no idea how long it had been down. My provider did not respond immediately to my email and when they did the response was something like, “It’s down? Oh, I’d better check that out.”

Because it’s a hosted site, my backup/restore options are limited, and I had never performed a disaster recovery test of the blog. To be honest, I wouldn’t know how to. I was surprised at my visceral reaction when I thought I had lost it, even though I’ve been neglecting the blog for the past year. It turned out that the problem wasn’t data loss, but rather a Denial of Service Attack which my provider was unaware of. Anyway, everything turned out alright, but it reminded me of how important the blog and SQL Community are to me and I intend to make it a priority in the future.

My Calling

Another reminder came after the latest SQL Saturday for the North Texas SQL Server User Group. For those who don’t know, I recently gave a precon for SQL Saturday Dallas. It went extremely well because I really love presenting, and I eagerly read all of the Session Evaluations I received. I don’t think that your average attendee realizes how much we care about what they have to say, and I love poring over the comments to find ways to improve. One attendee said something which really touched me and made me re-focus on my community involvement.

“It can be easily seen that David has found his calling in life. He is incredibly knowledgeable and passionate about Data Warehousing. I only wish we had more time for a deeper dive on some topics…”

Well, this person really nailed it. I have found my calling and love presenting, blogging, and volunteering in the PASS Community. I am fascinated by Data Warehousing, Integration, and Analytics. There’s something special about sharing knowledge with your fellow database pros (and friends) which feels so good that it’s worth all of the effort.

Last night at dinner, Tim Mitchell (Blog/Twitter) summed up the feeling very well. He’s a rabid Rangers baseball fan and therefore used a baseball metaphor. I’m paraphrasing, but he likened his precon experience to the feeling when a hush comes over a crowd, the batter makes solid contact with the ball, the “crack of the bat” echoes through the stadium, and it’s a home run. You just know when you hit it out of the park, and it’s one of the best feelings ever.

When I’m at the top of my game, this is exactly how it feels. When you are connected to the audience, you can feel that they are learning, are entertained, and most importantly you’ve made a difference. You know that they’ll take what you’ve taught and go back to their jobs to make things better.

Join the SQL Family #SQLFamily

Well, I hope you’ll forgive my rambling. It’s 4am as I write this, and I don’t adjust well to time zone changes. The official conference starts tomorrow and I can barely contain my excitement. It’s great seeing all of my friends again, some of whom I haven’t seen since last year. If you see me out and about, and manage to recognize me since I’ve lost almost 100 lbs, please stop me even if we’ve never met before.

I’ve already met some folks who I’m sure will become part of my #SQLFamily, and am always looking to connect with more.

If you’re not here, then make plans for next year. You cannot beat the networking and educational experience you get at a PASS Summit.

SQL Saturday Dallas Reflections and Presentation Downloads

Every time I attend a SQL Saturday I am re-energized, and SQL Saturday 163 BI Edition was no different. This one was particularly special because of my precon presentation, “Data Warehousing In a Day”. I just love getting up in front of a group and teaching, and apparently that’s obvious to the attendees.

If all of the emails I’ve received are any indication, the attendees enjoyed it as much as I did. I want to thank those who attended the precons (not just mine) and attended SQL Saturday. You folks make the PASS Community worthwhile.

I also want to thank our team of volunteers who put on the event.

As anyone who reads my blog knows, my time has been ridiculously scarce lately, and I wasn’t able to do much more than get my presentations ready. It’s nice to know that when I can’t give my usual effort, everyone else picks up the slack. Everything went off great. Congrats guys.

Presenting “Dimensions – It’s All Meaningless Without Them”

Anyway, as promised the following are available for download:

Source Database Backup, Scripts, and SSIS 2012 Packages
Building an ETL Automation Framework for SSIS 2008 – Rushabh Mehta

Once again folks, please accept my sincere apologies that I haven’t made enough time for my community work, especially blogging. I just recently severed ties with a project that was taking all my time and preventing me from blogging. That, and another event I’ll talk about soon, reminded me how important the SQL Community is to me. I will endeavor to make this the last time I need to apologize.

Anyway, coming up in future posts I’ll cover the following:

  • More on the Merge Statement including a wrinkle I ran into last week regarding foreign keys.
  • The Balanced Data Distributor Transform for SSIS.
  • A series of articles regarding Dimensions, SCDs, and how to load them in SSIS 2008 through SSIS 2012.

Stay Tuned!

It's Not Too Late To Sign Up for a Precon

This is just a quick reminder that two of my friends and I are presenting Precons tomorrow for SQL Saturday Dallas. While I am biased and think that you should sign up for mine, any of these sessions are a huge educational value. If you’re in the DFW area, do yourself a favor and attend a Precon and SQL Saturday. The abstracts are below.

Hope to see you there!

Become an Enterprise DBA by Sean McCown

Speaker Bio: Sean McCown is a Microsoft Certified Master in SQL Server 2008. He is also a SQL Server MVP with over 15 years of experience in databases. Sean is also founder and co-owner of the renowned website where he records free SQL Server training videos and co-hosts the popular web show DBAs@Midnight.
In addition to this training, Sean has decided to offer all of the Enterprise DBA precon attendees a FREE copy of the DBA Roadmap webinar, which means, of course, that the precon pays for itself… or you get both for 50% off, or however you want to look at it. Bottom line: attend Sean’s precon, learn massive amounts of stuff, AND improve your career exponentially. Win-win-win.

Data Warehousing in a Day by Dave Stein

Speaker Bio: David Stein is a Senior Business Intelligence Consultant with WaveTwo in Irving, TX. He’s worked in SQL Server Business Intelligence for almost a decade. He’s designed Data Warehouses from scratch as well as saved projects in serious jeopardy. He is also very active in the community, having spoken at PASS events at all levels, including the PASS Summit, SQL Rally, and at numerous SQL Saturday events and user groups. He’s also given multi-day pre-conference seminars at third party events regarding Microsoft Business Intelligence. You can learn more about Data Warehousing, Business Intelligence, and other technical topics on his blog at or follow him on Twitter @Made2Mentor
Attend this full day pre-con and let David be your guide through this process. He’ll save you time, money, and headaches.

Additional information about my presentation can be found on my blog at:

Real World SSIS: A Survival Guide by Tim Mitchell

Speaker Bio: Tim Mitchell is a business intelligence consultant, database developer, speaker, and trainer. He has been working with SQL Server for over 9 years, primarily in business intelligence, ETL/SSIS, database development, and reporting. He has earned a number of industry certifications, holds a Bachelor’s Degree in Computer Science from Texas A&M at Commerce, and is a Microsoft SQL Server MVP. Tim is a business intelligence consultant for Artis Consulting in the Dallas, Texas area.
As an active member of the community, Tim has spoken at national, regional, and local venues including the SQL PASS summit, SQL Connections, numerous SQL Saturday events, and various user groups and PASS virtual chapters. He is a board member and speaker at the North Texas SQL Server User Group in Dallas, serves as the co-chair of the PASS BI Virtual Chapter, and is an active volunteer for PASS. Tim is an author and forum contributor on and has published dozens of SQL Server training videos on
Tim is a contributing author on the charity book project MVP Deep Dives 2, and is coauthoring SSIS Design Patterns (available July 16th).
You can visit his website and blog at or follow him on Twitter at @Tim_Mitchell.

Rock the Vote... or Something.

Let me just start this post with an apology to all of my readers, well, both of you. There have been a lot of changes in my life lately, and I’ve been overwhelmed with demands on my time. I won’t go into details, but those demands will be letting up soon, and I will be able to get back to blogging, community involvement, and catching up on sleep.

However, this topic is important enough that I had to take a break from my frantic schedule to jot down a few words about it. As I started to write this, I got a notification of a blog article by my friend and fellow North Texas SQL Server User Group member, Ryan Adams (Blog/Twitter). As I read his article, I realized that our views were nearly identical about two candidates, Sri Sridhara (Blog/Twitter) and Allen Kinsel (Blog/Twitter). I urge you to read what he wrote because I echo each and every one of his sentiments.

However, I’d like to say just a few words about each candidate myself.

Allen Kinsel

I met Allen at the first SQL Saturday for which I was an active volunteer. Allen was the first PASS member that I connected with who wasn’t in one of my local groups. The guy is tireless, has a pragmatic point of view about the community, cares about members’ opinions, and I’ve never heard anyone have a negative thing to say about him. When I have something to comment on (whine about) regarding PASS at the national level, Allen is the guy I go to, and I can tell you that he truly cares… or at least is good at acting like he does. 🙂

Sri Sridhara

Sri is an interesting PASS leader and volunteer. I’ve worked with him for several years organizing PASS events like SQL Rally, multiple SQL Saturdays, etc. In some ways, he’s a lot like me. Some people take to him, some not so much, but what I’ve found is that people in both camps are reacting to the same thing. The man is highly driven. Nobody can question his dedication, drive, and single-minded focus on pushing the community further. Sri embodies what PASS is and should continue to be: relentlessly driven to spread the adoption of SQL Server and to raise the ability and skill set of DBAs everywhere.

He and I don’t always agree, but he’s a guy you can count on to always follow through on his commitments, to do what he says he will do. Without belaboring the point, what happened last year was shameful, yet to this day Sri hasn’t had a negative thing to say about it, even privately. Let’s not let that happen again, Vote for Sri.

Like Ryan said, I’m not commenting on the other candidates, but not because they aren’t good people. I just happen to know these two exceptionally well, and intend to vote for them. You should too.

You Should Attend SQL in the City

This is just a friendly plug for my friends at RedGate regarding SQL in the City, a free series of events traveling around the country. It’s a free day of training and networking led by spectacular presenters such as Steve Jones (Blog/Twitter) and Grant Fritchey (Blog/Twitter), who will be headlining the Austin event next week.

Steve and I at my first Summit

Grant and I at SQL Sat. Chicago

I had originally planned on going myself, but I recently got a new gig (more on that in the near future) and can’t get away at the moment. However, these guys are top notch, and you really shouldn’t miss the chance to learn from them.

So, Register Today!