Who Designs the Data Warehouse (Dimensional Model)?
If you build it, will they come? It seems obvious that you design the data warehouse, that’s why you’re taking your time read this blog, or maybe you’re just fascinated with my movie quotes. But seriously, who designs this thing? The business users.
The business users have a large part in designing a data warehouse. This isn’t an IT process, its a business process. The subjects of the data warehouse are decided by the business, as well as what information needs to be kept on each. In reality the data warehouse exists to assist users in making better decisions. The data warehouse architect is there to assist and guide in this process. One sure way to fail is to have IT department centered data warehouse because the users won’t see value in it.
Notice that I didn’t suggest that you should ask the business users what they want. If you do so, you are handing over the design role to the business user. They don’t know what they want because they don’t know what you provide. They will tell you something, but because they don’t understand what you can do, it will be wrong. It’s up to you to bridge the gap and learn the business to so you can design a successful dimensional model.
So, the business users don’t understand tables, keys, referential integrity, etc. Part of the data architect’s job is to educate the users about what can be done so they can provide better input into the process. So, how can you teach them the basics of fact and dimension tables? The same way I’m going to teach you.
Let’s start out with pivot tables, which most users understand. Ask the user for a typical reporting requirement that they might have. Suppose they say, “I want to know incoming order amounts by fiscal year, period, and product category.” Well, the most users understand how to use Excel pivot tables so that’s the method we’ll use.
Open Excel, query the data, and create a pivot table. The user may already know that they need to drag the amount to the Values box, and already know how to pull the values for year, month, and product category to the columns and rows as shown. The values, or the column you want to measure, are your facts. The things you measure facts by (by fiscal year, period, and product category) are your dimensions. Dimensions and facts go in separate tables, but the users may never see them. Now that they understand that, what’s next?
How do we model the model?
Do you want to know what the Matrix is Neo? – Morpheus The Matrix
Start with the Kimball Bus Matrix, which is kind of like a high level road map of your project. Along the left hand side are the business processes which bring value to your organization. The columns are the descriptive objects (dimensions) for your organization. If a dimension is involved with a specific business process, place an X in the intersecting cell. Notice that I have added a column called “Business Priority” which I use to indicate the order in which I will handle each process.
Typically the Bus Matrix is much larger because enterprises typically have more business processes. The example presented is a simplified version demo purposes.
How do we determine which order to tackle the business processes?
Well, the first business process takes the longest. You’re going to find that dealing with the dimensions typically takes a lot longer than fact tables. When you work on your first business process, you will need to create all of the dimensions associated with it. In my example above, if we start with Incoming Orders, every dimension but Vendors has to be created. However, each additional business process becomes easier because we re-use the already existing dimensions.
So, which business process do you start with? Start with the low hanging fruit. You want to pick a process which is relatively easy and has high business value. Perhaps that value represents money savings, increasing sales opportunities, or something that is a mandated compliance issue. Once again, all of this is considered with a focus on the business. We want a quick return on investment so we continue to have the support of management.
So, you’ve identified the business process and the facts and dimensions associated with it. Note that we haven’t started fleshing out the dimension fields with the descriptive data. That’s what we’ll handle next when I show you the basics of modeling dimensions.