Now that we’ve discussed various common metadata strategies used in creating BimlScript, I’m going to explain exactly how the metadata in my sample BimlScript framework functions. Basically, I’m going to start to pull back the curtain and explain some of the “magic” involved in my Biml – The Next Step in Agile Data Warehousing presentation.
As I mentioned in a previous article, I use two different forms of metadata in this example, SQL Server Extended Properties and Biml Annotation Tags. This metadata applies at both the table and column levels. While many examples and demonstrations work from the source forward, I’m a dimensional modeler so I tend to work from the data warehouse backward. My basic metadata process consists of the following steps.
- Metadata is defined and kept in a dimension model created with modeling software. Much of this metadata is specified automatically and requires very little effort from the architect/modeler.
- The modeling software generates the DDL necessary to convert the metadata into SQL Server extended properties.
- BimlScripts, which I’ll share in an upcoming article, convert the extended properties into Biml Annotation Tags. The other BimlScripts in the framework works from those tags. The import scripts will work fine with BidsHelper and do not require a Mist license.
There are several reasons I am choosing this multiple step process.
- Complex data warehouse projects typically require modeling software.
- I needed to decouple that software from my framework wherever possible so I wasn’t locked into using one product.
- Extended properties are already widely used for documentation.
- Biml Annotation Tags make writing BimlScript a lot easier, and the code is easier understand and maintain as well.
- If more people standardize on Biml Annotation Tags for their metadata, shared BimlScript snippets become much easier to adapt to multiple projects.
- This solution provides maximum flexibility since the rest of my code uses Biml Tags. For example, I could forego the use of a dedicated modeling and handle simpler projects entirely in Mist, or even manually with BidsHelper.
The modeling software I’m using for this demo is ModelRight. I don’t work for them, and don’t profit in any way if you purchase their software. That being said, I think it’s a good value compared to the other modeling products on the market. Anyway, the specific modeling software used isn’t integral to this solution. Any modeling tool which allows you to create custom metadata and generate the corresponding SQL Server extended properties DDL code will work. In fact, I have a friend who is using my framework the same way with ERStudio right now. Before we dive into the metadata details, let’s briefly discuss what we are going to load.
In the simple BimlScript framework example that I’m sharing, we’ll load the 4 tables shown above.
- DimCustomer – This will be a Dimension with only Type 1 columns.
- DimProduct – This Dimension has both Type 1 and Type 2 columns.
- DimProductCategory – This is a Junk Dimension and also a Role Playing Dimension. In order to keep this example simple, I’ve created a contrived example and used DimProductCategory as PrimaryProductCategory and SecondaryProductCategory. Again, this is not a real world example, but it makes for a simple demonstration.
- FactSalesOrderDetails – A Fact table with foreign keys to each of the three Dimensions listed above.
As I mentioned in the previous metadata post, I tend to favor Convention over Configuration. In order to produce a framework that was powerful and flexible, yet at the same time easy to understand, I analyzed exactly what my BimlScript code would need to “know” in order to load Dimension and Fact tables. I distilled the metadata requirements down to only three table level items and four column level items. Let’s start with what one would need to know at the table level. Notice that I do abbreviate my metadata titles which makes writing BimlScript faster, and the users never see this information anyway.
Table Level Metadata
- TableRole – This property contains the type or role of the table and therefore includes values such as Dimension, DimJunk (Junk Dimension), Fact, StaticLkp (Lookup), and Custom.
- LkpNoMatch – This stands for “Lookup No Match” and this defines the type of behavior necessary to handle any SSIS Lookup against this table which fails to find a match. The current choices are Unkown, Inferred, and None. I’ll go into more detail in a future article when I explain how the Fact table BimlScript works.
- IsScripted – This setting determines if my BimlScript will automatically execute on this particular table or not. For example, I may have already created a custom SSIS package to load a particular table and I don’t want my BimlScript over-writing it.
Column Level Metadata
- ColumnRole – This column, similar to the TableRole above, classifies the functionality of each column in the data warehouse. Common values are BK (Business Key), SK (Surrogate Key), Type 1, Type 2, etc.
- IsViewColumn – I typically create views which overlay the Dimension and Fact tables in my data warehouse. This column, which accepts a simple “Y” or “N” allows the designer to indicate whether this column should be included in any views. This setting has the same effect when I use BimlScript to create Role Playing Dimensions.
- IsLkpColumn – This indicates whether the column value is the result of a SSIS Surrogate Key Lookup and the allowed values are “Y” or “N”. Again, I’ll provide more details on this when I cover Fact table loading.
- LkpRolePlayingPrefix – When creating a Role Playing Dimension, I prefix the names of all of the columns, including the Surrogate Key with a word(s) that represents the Dimension. So, if I created a Role Playing Dimension (view) on a table called DimDate which represents a “ShipDate”, then each column would be prefixed by the word “Ship”. The word ship is the LkpRolePlayingPrefix. If the value for this piece of metadata is NULL, or is missing entirely for a column, then no prefix is applied. Again, this will make more sense later.
It’s kind of amazing so much can be accomplished with so little metadata, much of which is automatically created by my modeling tool. In the next several articles I’ll use it to create staging tables, Role Playing Dimensions and other views, and the SSIS packages necessary to load Junk Dimensions, Slowly Changing Dimensions, and Fact tables. Keep in mind that this is only one of several BimlScript frameworks I’ve developed, and is the simplest to use and understand. My more elaborate BimlScript frameworks require more extensive metadata.
In the next article, I’ll walk through the steps to creating a simple, functional BimlScript which loads the Junk Dimension above, DimProductCategory. The SQL Server DDL and Biml file will be available for download in that article. However, to end this one, I thought I would share a snippet of code from each step listed above. Notice that each snippet includes some metadata at both table and column level. This is a portion of the SQL Server extended property DDL produced automatically by ModelRight.
EXEC sp_addextendedproperty @name = 'TableRole', @value = 'DimJunk', @level0type = 'Schema', @level0name = 'dbo', @level1type = 'Table', @level1name = 'DimProductCategory'; GO EXEC sp_addextendedproperty @name = 'IsScripted', @value = 'Y', @level0type = 'Schema', @level0name = 'dbo', @level1type = 'Table', @level1name = 'DimProductCategory'; GO EXEC sp_addextendedproperty @name = 'ColumnRole', @value = 'SK', @level0type = 'Schema', @level0name = 'dbo', @level1type = 'Table', @level1name = 'DimProductCategory', @level2type = 'Column', @level2name = 'ProductCategoryKey'; GO EXEC sp_addextendedproperty @name = 'IsViewColumn', @value = 'Y', @level0type = 'Schema', @level0name = 'dbo', @level1type = 'Table', @level1name = 'DimProductCategory', @level2type = 'Column', @level2name = 'ProductCategoryKey'; GO
My BimlScript automatically imports the database DDL and converts it to Biml Annotation Tags like the following:
<Dimension Name="DimProductCategory" SchemaName="DataWarehouse.dbo" PackageSubpath="ImportTables"> <Annotations> <Annotation Tag="IsScripted">Y</Annotation> <Annotation Tag="LkpNoMatch">Unknown</Annotation> <Annotation Tag="TableRole">DimJunk</Annotation> </Annotations> <Columns> <Column Name="ProductCategoryKey" DataType="Int16" ScdType="Key"> <Annotations> <Annotation Tag="ColumnRole">SK</Annotation> <Annotation Tag="IsViewColumn">Y</Annotation> </Annotations> </Column>