I mentioned in the last article that I am taking a temporary break from the T-SQL Basics Series of articles. I’ll return to it in a few weeks, but in the meantime I’ve been working other stuff for your amusement.
Zombie Database Survival Guide: Implementing PowerPivot with Zombie Databases
Do you have a Zombie Database? Does your database suffer from roaming hordes of problems like the lack of primary and foreign keys, poor normalization, in place updates and deletes, and an insatiable drive to eat your braaaaaaaaaaaaaains (CPU cycles).
Fear not fellow survivor, for I’ll show you how to control the zombie apocalypse and prepare your data for use with PowerPivot, a free add on for Excel 2010. I’ll show you how to use PowerPivot to slice and dice your data and keep your users safe from the zombie databases which threaten their very way of life.
A great time will be had by all.
SQL University
My good friend Jorge Segarra (Blog/Twitter) aka SQLChicken has asked me to become a professor at SQL University. Next week I’ll present a series of articles on Dimensional Modeling Fundamentals.
Tune in and you learn the following:
Why your database sucks, and what you can do about it.
The reasons you actually have to listen to business folks.
What’s with all the dimensional modeling techno-babble? What does it really mean?
How to break a Data Warehouse project into digestible chunks so you don’t get overwhelmed from the start.
What you really need to know about Dimension and Fact Tables.
There will be chills, thrills, and most likely spills; charts, pretty pictures, dancing girls…. Well, maybe not dancing girls, but I’ll share my views on Dimensional Modeling.
So, attend my presentation in Forth Worth if you can, and tune in next week to see if I can’t get myself disbarred from SQL University.
In this seventh article of the T-SQL Basics Series, we will be covering both the basics as well as some specific issues with SQL Joins. If you’ll recall our last lesson, we wrote a sales query for Master Po. He then requested that we add customer information to the report.
Note, after receiving e-mails from non-M2M users complaining that they couldn’t follow along, I’ve decided to switch to using a database that everyone can use. So, I’ve created a quick database for our Kung Fu Temple Gift Shop. If you want to follow along, please run the following code on a test server: Kung Fu Database Script. It will create a small and imperfect database for us to use in this series.
Prerequisite Reading
Before we begin this lesson, I ask that you read an ancient manuscript by Master Jeff Atwood (Blog/Twitter). Yes, I know he belongs to the developer’s temple, but he’s already written the excellent primer, A Visual Explanation of SQL Joins. Please read the text and come back when you’re finished so I can fill in the details. Go ahead Grasshopper, I’ll wait.
Additional Wisdom about Joins
I love the primer because of his use of Venn Diagrams. The first time I worked with SQL Joins, memories of middle school math lessons about Venn Diagrams popped into my mind. While nothing he wrote is incorrect, I wish to add to it and flesh out your knowledge of SQL Joins.
Let me give you a brief run down of the tables in the Kung Fu Temple Gift Shop Database, and how they are related. The tables we will be using are listed below. I’ve listed the Primary Keys, but have not yet set up Foreign Keys relationships. We’ll discuss those in a later lesson.
Table Name
Primary Key
Customers
CustomerID
Products
ProductID
OrderMaster
OrderID
OrderDetails
OrderID, LineNo
Inner Join
The Inner Join is most commonly used and is in fact the default join type. This type of join only returns records which match from both tables. There are two basic ways to write it. Let’s link the Order Master and Order Items tables and return a few fields from each.
Select
OM.OrderID
,OM.CustomerID
,OM.OrderDate
,OD.[LineNo]
,OD.ProductID
,OD.OrderQtyFrom
dbo.OrderMaster OM
,dbo.OrderDetails OD
where
OM.OrderID= OD.OrderID
Does that code look strange to you? Well, if you’ve been following my writings, it should. Some of the M2M Reports, which I’ll cover in a later lesson, use this old syntax and there is technically nothing wrong with writing it. The optimizer processes it the same way as the syntax I will cover in a minute and it runs just as fast. However, Master Po has forbidden it as an affront to the laws of nature, so you will never see me use it again. Why? I’m glad you asked.
It’s hard to read. This example looks simple because it only includes two tables. However, when more tables are involved its just too easy to forget one of the Where clause conditions. Therefore, using this joining method is…..
Dangerous. If you forget to include a table join in the Where clause, you will create a cartesian product with that join. A cartesian product returns every combination of each table where the join is either missing or invalid. For example, in our Kung Fu Gift Shop database there are 4 records in Order Master and 5 in Order Items, if we neglect to add the Where clause, the recordset jumps to 20 (4 * 5) records. In a production database with millions of rows in each table, the result set becomes ridiculously large. Also, every additional join you miss exponentially increases your recordset.
For these reasons we will explicitly the Join this way:
SELECT
OM.OrderID
,OM.CustomerID
,OM.OrderDate
,OD.[LineNo]
,OD.ProductID
,OD.OrderQtyFROM
dbo.OrderMaster OM INNERJOIN dbo.OrderDetails OD ON OM.OrderID= OD.OrderID
In T-SQL the use of the word Inner is optional as well. I personally add it, but many don’t.
SELECT
OM.OrderID
,OM.CustomerID
,OM.OrderDate
,OD.[LineNo]
,OD.ProductID
,OD.OrderQtyFROM
dbo.OrderMaster OM JOIN dbo.OrderDetails OD ON OM.OrderID= OD.OrderID
Please note that you can mix both styles in the same query, but I don’t recommend it for the same reasons.
SELECT
OM.OrderID
,OM.CustomerID
,OM.OrderDate
,OD.[LineNo]
,OD.ProductID
,OD.OrderQtyFROM
dbo.Products PRO, dbo.OrderMaster OM -- Do Not do this. JOIN
dbo.OrderDetails OD
ON OM.OrderID= OD.OrderIDWHERE
PRO.ProductID= OD.ProductID-- Join between Products and Order Details.
Outer Joins – Left, Right, and Full
Left Outer Join
The Left Outer Join returns all of the records (not filtered by a where clause) from the field to the left of the Join and then any corresponding records from the table to the right of the Join. The records from the first (left) table are returned even if there are no corresponding records in the other table. As with other Joins the word Outer is optional as well. Left Outer Join works the same as Left Join.
For example, lets say that Master Po wants a list of all of the Kung Fu Temple Gift Shop’s customers and any of the orders they have placed. You could write this query two ways. First, let’s use a Left Outer Join. Option 1 Left Outer Join
SELECT
CUST.CustomerID
,CUST.Title
,CUST.CustFirstName
,CUST.CustLastName
,OM.OrderID
,OM.OrderDate
,OM.DueDate
,OM.PONoFROM
dbo.Customers CUST
LeftJOIN dbo.OrderMaster OM
ON CUST.CustomerID= OM.CustomerID
Take a look at the first couple rows of the result set for this query. 5 Total Records were returned.
My name is listed twice because I placed 2 orders. Also, notice the Null values for Master Kan? The values for Order Master fields is null because there is a customer record for him but he hasn’t placed any sales orders.
Let’s look at the same query with an Inner Join. 61 Total Records were returned, one less than the previous query because we only have 1 customer which hasn’t placed orders. Option 2 – Inner Join
SELECT
CUST.CustomerID
,CUST.Title
,CUST.CustFirstName
,CUST.CustLastName
,OM.OrderID
,OM.OrderDate
,OM.DueDate
,OM.PONoFROM
dbo.Customers CUST
INNERJOIN
dbo.OrderMaster OM
ON CUST.CustomerID= OM.CustomerID
When run, the record for Master Kan is removed. So, which is right? Well, it depends on what Master Po meant. Does he want to include customers which have no orders? Option 1 includes all customers whereas option 2 omits customers without orders. I used this particular example to emphasize that you need to learn to ask the right questions in order to get the right answers. In this case, we’ll stick with the Left Join.
What if Master Po wants a list of only those customers who never placed an order? We can achieve this using a Left Join and adding “WHERE OM.OrderID IS NULL”. In this case, only one record shows. Do you remember seeing that type of join before? I used it to find orphan records in your M2M Database. M2M Support still hasn’t forgiven me for posting that as so many of you called them asking them to fix your database. I’ll cover more about Left Join usage and problems with Nulls in a future lesson.
Right Outer Join
Take everything I just said for Left Outer Join and reverse it. Period. In fact, I have NEVER written a Right Outer Join. There isn’t a reason that why one would need to use one. There’s a rumor that persists that Left Joins are faster than Right Joins, but that isn’t true either. I only use Left Joins for consistency sake, and I suggest you do as well.
Full Outer Join
A Full Outer Join is returns all rows from both tables regardless of whether there is a match. In a way, it combines a Left and Right Outer Join. It’s not all that commonly used. The most common instance in which I use it is to compare the results of two queries. For example, let’s compare simplified versions the Inner Join and Outer Join queries I listed previously.
SELECTFirst.CustomerID FirstCustID,
First.OrderID FirstOrderId,
Second.CustomerID SecondCustID,
Second.OrderID SecondorderID
FROM(SELECT CUST.CustomerID, OM.OrderIDFROM dbo.Customers CUST
INNERJOIN
dbo.OrderMaster OM
ON CUST.CustomerID= OM.CustomerID)FirstFULLOUTERJOIN(SELECT CUST.CustomerID, OM.OrderIDFROM dbo.Customers CUST
LEFTJOIN
dbo.OrderMaster OM
ON CUST.CustomerID= OM.CustomerID)SecondONFirst.OrderID=Second.OrderIDWHEREFirst.CustomerID<>Second.CustomerIDORSecond.CustomerIDISNULLORFirst.CustomerIDISNULL
I know it looks complicated, and the results are underwhelming, but we’re working with a small, simplistic data set. Basically I took the first query and creatively aliased it as “First”, the second as “Second”, put them together with a Full Outer Join, and then looked for any differences in their CustomerID fields. Don’t be thrown off by the “Is Null” statement as we’ll be covering Nulls in detail in a future article.
Another real world example where I use Full Outer Joins is if I’m changing report logic and need to prove that the results are the same. I’ll join the results of the first query, with my new query, and look for differences. There are also some 3rd party products which do that kind of comparison automatically.
Non Equi Joins
All of the examples we’ve used so far are Equi Joins because the join conditions have always been equal. Table A Inner Join Table B on Field A = Field B. Non Equi Joins are not equal, greater than, less than, etc. One common use for Non Equi Joins is joining dates to a table of fiscal periods. I’ll cover dates and calendar tables in detail later, but let’s assume that we had a table with a list of fiscal periods with a begin and end date for each period (Like M2M’s GLRULE table). In psudo-code you would join them like this:
SELECT*FROM dbo.OrderMaster OM
INNERJOIN
FiscalPeriods FIS
ON OM.OrderDate>= Fis.BeginDateAND OM.OrderDate< Fis.BeginDate
SQL Statements in Review
So far in this series, we have discussed the basics of T-SQL Syntax, importance of coding consistently, elements of the Where clause, and summarizing and grouping data. Let’s take a moment to review the order of statements in a SQL clause as a way of closing this series of lessons in the T-SQL Basics Series.
Select – followed by a list of fields, functions, etc. and should be properly aliased.
From – followed by the tables to be used in the query and how they are to be joined.
Where – followed by the conditions by which individual records are to be filtered out.
Group By – followed by fields by which records are to be grouped. Remember that order is important.
Having – followed by the conditions by which the groups are to be filtered out.
Order By – followed by the fields by which the final result set is to be ordered.
This is not the end of our lessons in T-SQL. In future installments I’ll discuss in more depth topics like sub-queries, the challenges associated with Nulls, common functions and their usage, working with dates and date tables, as many others. Patience Grasshopper, we’ll get there.
Today Grasshopper, I will teach you the basics of ordering, summarizing, and grouping your data to give it meaning. Let’s say that the Kung Fu Temple Gift Shop has had poor sales. Master Po wants a list of sales details from you, but he wants them ordered by the part number.
Note for the non Made2Manage people, I am using the M2M Educational Database. Sorels is the Sales Order Release table. I’m using it because it is the sales table which has dollar values. I’ve attempted to document the code with comments so students from other temples can follow the lesson.
Also, all of the screenshots are the data grids produced by Toad for SQL Server which recently released a new beta. Anyone can use the full beta until March 3rd.
Order By
The Order By clause does just what the name suggests, orders records. The Default order is ascending (ASC), but you can also return them in descending (DESC) order as well.
SELECT
SOR.fsono SalesNo
,SOR.fenumber ItemNo
,SOR.frelease Release
,SOR.fpartno PartNo
,(SOR.forderqty* SOR.funetprice) NetAmount -- Order Quantity * Net Unit Price FROM
dbo.sorels SORORDERBY
SOR.partno-- ASC is default and optional, DESC needs to be specified.
Master Po now decides that he wants the records sorted in descending (DESC) order and then sorted by part revision. Note that I’ve used the aliased field names as well, more on that later.
SELECT
SOR.fsono SalesNo
,SOR.fenumber ItemNo
,SOR.frelease Release
,SOR.fpartno PartNo
,SOR.fpartrev PartRev
,(SOR.forderqty* SOR.funetprice) NetAmount -- Order Quantity * Net Unit Price FROM
dbo.sorels SORORDERBY
PartNo Desc, PartRev
Master Po likes the report, but he would like to have the details summarized for him. How do we accomplish this?
Using Aggregate Functions
The most commonly used aggregate functions are listed below. The complete list and definitions can be found here.
Sum(SomeField)
Avg(SomeField)
Count(SomeField)
Max(SomeField)
Min(SomeField)
So Grasshopper, rather than tediously explaining what you can already guess, let’s look at the following examples.
This works, but Master Po is not happy. The data has little meaning because it doesn’t differentiate by Part Number, Customer, Date, or any other useful attributes. So, how can you summarize the data by part number for him? Well, you might think that you could do this:
SELECT SOR.fpartno PartNumber
,sum(SOR.forderqty* SOR.funetprice) TotalAmountOrdered
,Avg(SOR.funetprice) AverageUnitPrice
,count(*) SorelsRecordCount
,count(distinct SOR.fsono) SalesOrderCount
,Max(SOR.funetprice) MaxUnitPrice
,MIN(SOR.funetprice) MinUnitPrice
FROM
dbo.sorels SOR
However, SQL Server will return an error like “Column ‘dbo.sorels.fpartno’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.”
Using Group By
Grouping returns a single row for each unique value in the selected column with the corresponding aggregate value. To appease Master Po, let’s add a Group By clause to the above query.
So, what have we returned? Well, now the information has more meaning. Look at it again, after I’ve adjusted the alias names.
We can now see important patterns in the data. For example, why is there such a large discrepancy between the minimum and maximum unit price for part number CP2010 and CP0500? Could it be that we’ve been swindled by Ninjas? Group by can also be used for more than one column. In the following example, we are grouping first by the Part Number and then by the Sales Order Number.
Granted, the results are not earth shattering because I have a very limited data set. Also, since we have not joined to other tables, I can’t group by less specific attributes like Customer or Product Class, but you get the idea.
Let’s say that Master Po wants to know the total quantity due to ship per month per part number in the year 2010. In the future, I’ll show you a much easier way to do this using a date table, for now, we’ll use the following code.
The Month function returns the month number of a specific date as a number. Master Po can now scan the list, or better yet you could prepare a good looking report for him using SQL Server Reporting Services (SSRS). Using it he’ll know when certain parts are due and can schedule his monk’s time accordingly. Before any of you geeks out there call me on this, yes I know Master Po was blind, but he was also magical and I’m sure he could “read” SSRS reports using those abilities.
Master Po is a demanding Sifu, he must be considering what Caine had to do with his forearms before leaving the temple, and suspects you have some duplicate records in the sorels table because your figures seem a little too high. Bitter experience has shown Master Po that M2M was designed by the Drunken Master and lacks primary and foreign keys. How can we verify that we do not have duplicates in sorels and be allowed to leave the temple?
The Having Clause
If Master Po had designed M2M the sorels would have a primary key consisting of Sales Order Number, Sales Order Line Item, and Sales Order Release. You can learn more about Primary Keys and Referential Integrity problems in M2M here.
Having is nearly identical to the Where clause. Having filters groups in the same way that Where filters records. Since we want to know if we have duplicates, we’ll group by the columns which should be unique in each record.
Zero records are returned, meaning you have no duplicates. After telling Master Po, he wants another variation of the report. He wants to know the total dollar amount per sales order, but only wants to see those with a total more than $10,000. Yes, I know he would want it in Yen but my data set is limited to USD so just play along.
To impress Master Po further, we ordered them from highest to lowest dollar amount.
Field Name Alias Usage
Notice that the aliased field name cannot be used in the Group By or Where clause , but they are permissable in the Order By clause.
Groupby PartNumber -- not legalOrderby PartNumber --Legal
Why? The reason field aliases cannot be used in these clauses statements is due to the order in which SQL Server evaluates statements. Basically, SQL Server evaluates the statement in the following steps.
The From clause including the Joins (more on Joins later).
The Where clause filters out any unnecessary records.
The Group By is then applied.
Having filters out unneeded groups.
The Select is evaluated including the aliased field names assuming they’ve been assigned.
The Order By then sorts the records.
Since the Select portion of the SQL statement is evaluated after the Where, Group By, and Having clauses, none of them can use aliased field names. SQL Server really has no idea that sor.fpartno is PartNumber at that point. The table alias can be used throughout because it is assigned in the first step. Make Sense?
The Demanding Master Po
Anyway, back to our story. You’ve accomplished what Master Po asked, verified that your table has no duplicates, and you are sure he will be pleased. However, when you bring it to him, he smiles and says, “Very good Grasshopper… but which customers placed these orders? Oh, and be sure to practice your pebble snatching as well.”
Well Grasshoppers, I’ll show you how to get the customer information in the next lesson. As far as snatching pebbles, you’re on your own.
Does anyone else have pebble snatching tips to share?
SQL Rally is a brand new event for this year. It’s kind of a hybrid between the PASS Summit and a SQL Saturday. It’s fairly low cost and being held May 11-13 in Orlando, Florida.
Anyway, the reason for this post is completely self serving. I really want to present at this event. Please register to vote and vote for my sessions. Mine are in the 10th section in the SSRS category. The abstracts I submitted are:
Title: Choose Your Own (Reporting) Adventure – SSRS Survivor Series
Abstract: Create a story of guided analytics using Reporting Services drill down, drill through, and expressions. Sure, you may be using drill down already, but I’ll show you how to make the most of Reporting Services interactions, and more importantly how to avoid becoming dragon chow at the end.
Title: The Power of Parameters – SSRS Survivor Series
Abstract: Want to know the best techniques to help your user narrow their search in the most efficient ways possible? Come watch me demonstrate which parameter techniques are best for certain situations and field types. Real world problems and applications are stressed, and I’ll demonstrate why some commonly accepted methods simply do not work and should be avoided.
–My name is Made2Mentor, and I approved this message.
All of the queries we’ve looked at so far will return all of the records in a table. This isn’t very efficient. The Where clause filters records out of your result set that you do not want.
For example, suppose we’re only interested in Sales Order Master records for “Acme Rockets Inc.” The following select statement will work:
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM dbo.somast SOM
where SOM.fcompany='Acme Rockets Inc.'
Notice that I have used single quotes around the company name, they are required for character values, but not numerics. Please note that many fields that M2M refers to as numbers are actually not numeric fields. Most record number fields such as Sales Order Number, Purchase Order Number, and Invoice Number are 6 digit character fields even though they may contain only numeric values. Therefore, this is how to write the statement if you only want to return information for Sales Order “Number” 000001.
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM dbo.somast SOM
where SOM.fsono='000001'
And/Or Operators
What if we took the above example and wanted to retrieve 2 sales order records? An Or statement will return as true if either expression is true. Therefore, an Or statement like this will retrieve both records.
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM dbo.somast SOM
where SOM.fsono='000001'
or SOM.fsono='000002'
Let’s say you wanted to check for records for Acme Rockets Inc. with a particular Sales Order Number. The And statement will only evaluate to true if both conditions are true.
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM dbo.somast SOM
where SOM.fsono='000001'
and SOM.fcompany='Acme Rockets Inc.'
I recorded a quick video to explain the gotchas of using And with Or in a Where clause.
Parenthesis Use
As I mentioned in the video, parenthesis can be used to modify expression evaluation in a Where clause. Suppose you wanted to run the previous query for two separate sales order numbers. In English we are looking for any Acme orders with an order number of ’000001′ or ’000002′ that are for Acme Rockets Inc.
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM dbo.somast SOM
where(SOM.fsono='000001' or SOM.fsono='000002')-- evaluated as a group
and SOM.fcompany='Acme Rockets Inc.'
Greater Than, Less Than, Not Equal To
Where clauses are not limited to equal conditions. Not Equal To (<> , !=), Greater Than or Equal To (>=) Less Than or Equal To (<=) and simple Greater Than or Less Than are all supported. So, the previous query could also be written:
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SalesNo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM
dbo.somast SOM
where SOM.fsono<='000002'
and SOM.fcompany='Acme Rockets Inc.'
It’s important to note that since somast.fsono is a character field the less than or equal to is treated a like an alphabetical lookup rather than a numeric one. Typically this isn’t a problem because the numbers consistently include preceding zeros, but it is something to keep in mind.
The Between Statement
Suppose we wanted to return all of the Sales Order Numbers between ’000001′ and ’000100′. Well, we’d use the Between statement of course.
Where fsono between '000001' and '000100'
Keep in mind that the statement is inclusive meaning that the first and last value are also returned. If you didn’t want to include ’000100′ you would need to edit the last value or use a less than instead. The most common use of between is with date comparison and there are gotchas with that. I’ll explain date manipulation and comparison in a later blog article.
The IN Operator
Let’s say that you want retrieve Sales Orders from several of Acme’s subsidiaries. In this example, you want to extract records for “Acme Rockets Inc.” and “Acme Rocket Skates” as well. You could use the following Where clause:
Where SOM.fcompany='Acme Rockets Inc.' or SOM.fcompany='Acme Rocket Skates'
However, suppose Acme has 20 subsidiary companies all with some version of Acme in the name. Using Or would be unwieldy. Therefore, you can use the In statement like this:
WHERE
SOM.fcompanyIN('Acme Rockets Inc.'
,'Acme Rocket Skates'
,'The Acme Company'
,'Acme Inc'
,'A Company Making Everything'
,'Acme'
,'Acme Rackets Inc.'
,'Acme Rickets Inc.')-- etc.
While this is better, listing 20 companies this way would still be a pain. So, for this case, we’re going to use the Like statement.
Using Like and Pattern Matching
The most commonly used wild card character is %. % matches any string of characters or no characters at all. For the sake of argument, let’s say that all of the variations of Acme actually started with the word Acme. They could be returned like this:
WHERE
SOM.fcompanyLIKE('Acme%')
Returns:
CustName
Acme Rockets Inc.
Acme Rocket Skates
Acme Inc
Acme
Acme Rackets Inc.
Notice that we lost “The Acme Company”. If you wanted to include them the following statement will work.
WHERE
SOM.fcompanyLIKE('%Acme%')
Returns:
CustName
Acme Rockets Inc.
Acme Rocket Skates
The Acme Company
Acme Inc
Acme
Acme Rackets Inc.
Use a preceding % sparingly because the statement becomes non-Sargable. Basically that query will be unable to use an index placed on the field and may cause performance problems.
The astute reader will notice that we missed the Acme company named “A Company Making Everything”. How can we make sure they are returned as well? More than one wildcard character can be used.
WHERE
SOM.fcompanyLIKE'a%c%m%e%'
Returns:
CustName
Acme Rockets Inc.
Acme Rocket Skates
Acme Inc
A Company Making Everything
Acme
Acme Rackets Inc.
Acme Rickets Inc.
Important to note. You will notice that my last wildcard statement is in a different case than the records in the table. Depending on the collation setting of your database, your results may vary as some are case sensitive.
An Underscore (_) replaces a single character.
WHERE
SOM.fcompanyLIKE'Acme R_ckets Inc.'
Returns:
CustName
Acme Rockets Inc.
Acme Rackets Inc.
Acme Rickets Inc.
Let’s raise the complexity a notch. Brackets [] can also be used to match a single character in a range.
WHERE
SOM.fcompanyLIKE'Acme R[a-i]ckets Inc.'
Returns:
CustName
Acme Rackets Inc.
Acme Rickets Inc.
The range can also represent distinct letters.
WHERE
SOM.fcompanyLIKE'Acme R[a,o]ckets Inc.'
CustName
Acme Rockets Inc.
Acme Rackets Inc.
Brackets can also exclude characters in a range. The symbol ^ (typically over the 6 key) represents Not.
WHERE
SOM.fcompanyLIKE'Acme R[^a-i]ckets Inc.'
Only Acme Rockets Inc is returned. Brackets and underscores are pretty powerful, but aren’t used nearly as frequently as the general wildcard %.
Using Not
I’ve already eluded to the use of Not with wildcards. However, it’s easy to use in non-wildcard clauses as well. Not negates any expression that it precedes. Using the sample database the following query:
SELECT
SOM.fsono SalesNo
FROM
dbo.somast SOM
WHERE
SOM.fcustno='000104'AND
SOM.fcompanyNOTLIKE('Acme Rockets Inc.')
Returns:
SalesNo
000003
000020
000052
That about covers the basics of filtering records using a Where clause. In future articles I’ll discuss Joins, Grouping, and the use of Sub Queries.
Any questions? Is there anything in particular you’d like me to cover?
As I mentioned in the last article, you need to develop or adopt a standardized way of writing T-SQL code. In this article, I’m going to share the habits I’ve developed, “my way” if you will.
Vizzini: Finish him. Finish him, your way.
Fezzik: Oh good, my way. Thank you Vizzini… what’s my way?
Vizzini: Pick up one of those rocks, get behind a boulder, in a few minutes the man in black will come running around the bend, the minute his head is in view, hit it with the rock.
Fezzik: My way’s not very sportsman-like.
- The Princess Bride
Just like Fezzik, you may be asking, “Well, which way is my way?” The best way for me to help you with that is to share the method(s) I’ve developed and reasons for doing so. Keep in mind that many people will disagree with my way, and that’s alright. This article however, is all about me, and since these methods work for me, I think they’ll work for you too. Adoption of a 3rd party query tool helps to keep your code consistent.
Capitalization Usage
I write all of the T-SQL functions in ALL CAPS and typically CamelCase field and database names. More on this later.
Field Order
I list the fields in an order which makes sense to me. Consider the script I shared with you in the last article:
SELECT fcompany, fcustno, fcustpono, forderdate, fsono, fsorev FROM somast
While the computer doesn’t care, I list my fields in roughly most general to most specific order. Therefore, I would order those in the following manner:
SELECT fcustno, fcompany, fsono, fsorev, fcustpono, forderdate FROM somast
This order just “feels” right to me. Customers “are larger” or less specific than Sales Orders and the fields are listed in this order. This will become a lot more important when we discuss joining tables.
I use a lot of vertical space in my queries. The query above becomes this:
SELECT fcustno,
fcompany,
fsono,
fsorev,
fcustpono,
forderdate
FROM somast
Once again, the computer couldn’t care less, it just makes everything easier for me to read.
Leading vs. Trailing Commas
I prefer to use leading commas. The query now looks like this with leading commas:
SELECT fcustno
,fcompany
,fsono
,fsorev
,fcustpono
,forderdate
FROM
somast
It may take you awhile to get used to seeing the query like that, but there’s a good reason for doing so. If you are editing a long query with trailing commas, and you need to remove the last entry in the list, you will invariably forget to remove the final comma and receive an error.
SELECT fcustno,
fcompany,
fsono,
fsorev,
fcustpono,
-- forderdate **** Using Double dashes comments out this line. This code now ends with a comma and will error out. FROM somast
Leading comma queries are easier to edit in my opinion.
Aliasing
Butch: Would you hand me a dry towel, miss beautiful tulip?
Fabienne: Oh, I like that. I like being called a tulip. Tulip is much better than mongoloid.
-Pulp Fiction
Aliasing is used in two places in a SQL Statement, the Select clause and the From clause. The From clause alias renames the database object and I typically use 3 or 4 character alias names. I also include the database owner or schema as well. Alias names can be assigned following the AS keyword, but it is optional. For example, the leading comma query becomes this:
SELECT SOM.fcustno
,SOM.fcompany
,SOM.fsono
,SOM.fsorev
,SOM.fcustpono
,SOM.forderdateFROM
dbo.somastAS SOM -- Or dbo.somast SOM
I know that seems like extra work, but its absolutely necessary when you write multiple table queries. Some people alias their tables with meaningless single letter values like A, B, and C. I can’t stand this because when I read a SQL statement aliased in this way, I have to constantly look at the from clause to remind myself what each alias represents.
Once again, consistency is key. I always use the following alias names:
somast = SOM
soitem = SOI
sorels = SOR
slcdpm = SLC
etc.
Since I am consistent, I always know that a field aliased with ARI represents Aritem; and I can cut and paste between queries as well. Incidentally, FastForms adds _EXT to the name of it’s extension tables.. Therefore, I add a lowercase e to any extension table alias. Soitem_EXT becomes SOIe. Keep it simple.
The second alias type applies to the field name. What the heck does fsono mean to a user? Therefore, I alias my queries like this:
SELECT SOM.fcustno CustNo
,SOM.fcompany CustName
,SOM.fsono SONo
,SOM.fsorev SORev
,SOM.fcustpono CustPONo
,SOM.forderdate OrderDate
FROM
dbo.somast SOM
Much Easier to Read
Also, because I’ve aliased my table names in CamelCase, many query tools such as SSRS will automatically format them as two words. “OrderDate” becomes “Order Date”. You can include spaces in your alias names, but you must put brackets around them. [Order Date] will render properly in almost any reporting tool, but I dislike all the extra typing.
Tomorrow, I’ll post my first video which will be a short demonstration of how I follow these “rules” with Toad for SQL Server.
“First learn stand, then learn fly. Nature rule, Daniel-san, not mine.” – Karate Kid
What is SQL?
SQL (Structured Query Language) is a language for manipulating database data and structure. It can be pronounced by stating it’s letters “S-Q-L” as well as the word “sequel”, as in the Matrix had two crappy sequels.
SQL is mostly standardized across Relational Database Management Systems RDMBS, but each, like SQL Server, may have it’s own unique syntax. T-SQL is Microsoft’s version of the language and it’s the version I focus on.
Two Main SQL Statement Types
There are other types of SQL Statements but we’ll restrict our discussions to these two.
Data Definition Language (DDL) – These define database structure and schema. Eg. Create Database.
Data Manipulation Language (DML) – These statements manipulate data. Eg. Select * from
Most of what I’ll cover with you are DML since most Made2Manage Admins will not be creating a lot of databases, manipulating indexes, etc.
Basic Syntax Rules of Using SQL
“Do you understand the words that are coming out of my mouth? – Rush Hour
These rules are true for the most part and in most circumstances. I’m generalizing to make things as simple to learn as possible. The syntax requirements of T-SQL are incredibly flexible. In this article, I’ll explain what you can do, and in a future article, I’ll explain what I think you should do.
I’m explaining these rules in the context of a very basic select statement. This following statement requests all of the values in the column (or field) fsono from the somast table.
SELECT fsono FROM somast
T-SQL statements are case insensitive. For example the following scripts will function the same way:
SELECT fsono FROM somast
SeleCT fSoNO froM SOMast
The “*” is a wildcard and will select all fields from a table. “Never” use this as it wastes resources and creates maintenance headaches. If you delve into M2M’s own code however, you will see many instances where they do that very thing.
When selecting fields, separate them with commas. The order of fields selected doesn’t matter to the computer, but as I said earlier I’ll explain my preferences in a future post.
White Space, and placement of some punctuation, doesn’t matter. The following statements are equivalent to the computer.
SELECT fcompany, fcustno, fcustpono, forderdate, fsono, fsorev FROM somast
-- Extra SpacesSELECT fcompany , fcustno, fcustpono, forderdate, fsono, fsorev FROM somast
-- Extra Carriage ReturnsSELECT fcompany,
fcustno,
fcustpono,
forderdate,
fsono,
fsorev
FROM somast
--Leading CommasSELECT
fcompany
,fcustno
,fcustpono
,forderdate
,fsono
,fsorev
FROM
somast
Consistency
If there’s one thing I’ve learned over the years its that consistency is key. Since most M2M Admins are the lone DBA and reporting person, you must create your own standard ways of writing SQL code and stick to them. Consistency has several benefits.
Your code will be easily read by you and others.
You will be able to cut and paste code snippets between SQL Scripts (more on this later).
Your efficiency goes up dramatically when you work this way.
So Daniel-san, as I said before this is your first lesson. Tomorrow, I’ll show you around our dojo that will use for future lessons in this T-SQL Basics Series.
In previous articles, I’ve mentioned that I use Toad for SQL Server, but until now I have not really advocated that my readers use it. While I’ve been using Toad for several years, I’ve had a love/hate relationship with it. When I first started using it, the product was buggy and problematic. Also, while Toad does everything, it is often unintuitive. To compound that problem, not many people blog about it so information isn’t readily available as to how to leverage Toad effectively. Even after all this time I still come across some new feature in the product on a weekly basis. If my readership, all 5 of you, are interested in learning more about it, I’ll create a regular blog segment for Toad tips and tricks.
Unsurprisingly, I took an active role in the improvement of the product. Quest Management sensed my enthusiasm for giving feedback (loudly demanding to be heard), so they assigned me a dedicated tech support rep. He has dutifully relayed my suggestions (rants and complaints) directly to the development team and it’s nice to be heard. Amazingly, they’ve implemented most of my suggestions over the years and it’s a much better product because of it.
What’s the Point?
Toad is in a new beta cycle for version 5.5. This means that anyone can download and use the full product until February 13th. As I am embarking on this series of basic T-SQL articles, I am encouraging you download Toad and use it with me. All of my screenshots in these articles will be using the new Toad beta.
Keep in mind that this is a beta product, and therefore will have bugs. For example, someone has disabled the ability to save editor windows individually, presumably by accident. As of now, you must save all files at once. Even so, I’ve already switched to using it in production because so many of my latest rounds of my suggestions were implemented that this tool now feels like it was designed specifically for me.
Why do I need a 3rd party program to write T-SQL?
In my opinion, all beginning SQL writers should use aids in writing T-SQL, especially if you are using SQL 2000′s Query Analyzer. Query Analyzer is basically like programming in Notepad. There is no Intellisense or code assist to speak of, the error messages are cryptic and difficult to understand, and there isn’t any structure or style enforced.
The two most common 3rd party choices are Red Gate’s SQL Prompt and Quest’s TOAD for SQL Server. Both offer code completion which increases the rate at which you code, and SQL Formatting which standardizes your code structure. However, they are paid software. SQL Prompt works directly in SSMS, while TOAD is a stand alone product. They both have features to recommend them.
There are also some free SQL Formatting options you can use such as:
SSMS Tools Pack by the great Mladen Prajdic(Blog/Twitter) adds all sorts of extra functionality to SSMS.
The web sites will format your T-SQL code, but it’s a pain to open them, copy your T-SQL from SSMS, paste it into the website, format it, and then copy and paste it out again. My time is worth money to me, so I use the paid version of TOAD for SQL Server. The SSMS Tools Pack is great, but it doesn’t offer the code formatting (vertical and horizontal spacing) that Toad and SQL Prompt do.
How to set up Toad
Simply download and install the product with the defaults. As Toad fires up the first time, you’ll encounter the following Configuration Wizard. For now, choose MS Management Studio.
After that you may get a Connections Wizard screen. Cancel that and I’ll show you how to set up a connection in the regular interface.
Create a Connection
One of the great aspects of Toad is that you can create several connections to servers and databases, save the login information if you choose, and switch back and forth between them seamlessly. Toad makes extensive use of the right click menu and I rely heavily on it.
Right click in the upper left hand corner of the screen in the Connection Manager and select Create.
The Create New Connection screen appears. Just like in SSMS, the server name can be typed in or browsed by selecting the ellipses. Select your authentication method, password if required, and default database for that connection. You aren’t limited to that database for query writing of course.
The interesting settings are below the Name line. Toad allows you to name your data connections. At a glance I can tell which server I am accessing and in what capacity. I use names such as:
M2M Production – Read Only
M2M Production – Administrator
M2M Development
M-Data Analytics Production
Categories allow you to color code editor windows. Connections with production logins that have write access I use red, development is green, etc.
Finally, you can elect to have connections start up automatically when you start Toad and to save passwords when you’re using a SQL account login. I never connect to my production server automatically with write access. I never save passwords for accounts that have write access either.
Create a New Editor Tab
Depending on the default settings you may see a blank editor or you may see something like this:
That blank tab is new in this beta, and although I like the concept I don’t care for the way they implemented it. Anyway, right click on the tab and select Open New Editor.
The editor is where you write your T-SQL Code. In future articles I’ll describe how to write T-SQL the Toad way to increase your speed and effectiveness.
As I mentioned before, one of the reasons why I’m suggesting you use Toad is for the automatic formatting. Kevin Kline (Blog/Twitter) has put out a few excellent videos on Toad. You should watch Basic Formatting and Advanced Formatting as soon as you start using the product. I just wish Quest would produce more of them (Hint Hint Hint).
Full Disclosure: I am not sponsored, nor paid by Quest in any way. I just simply like Toad for SQL Server.
So, who’s with me? Who’s going to give Toad a try? Are you interested in learning more about Toad?
In my previous article I mentioned my surprise that many readers were not running the sample code posted on my blog. When I asked a few as to why, the number one answer was that some M2M Admins don’t know which program(s) they can use to query their SQL databases. I don’t know why this came as a shock since M2M Support instructed me how to open Query Analyzer my first time a decade ago. We all start somewhere.
Anyway, I’ve put together this basic tutorial as to how to do that. I’m hitting only the bare basics of how to use the query tools, and will mostly leave it up to you dear reader to Google more about them or to ask questions in the comments.
Keep in mind that you should not remote desktop your server and log into administration and query software on it because that eats up memory and other resources. You should install the SQL tools on a workstation and administer and query remotely. Also, you should never run code from my site, or any site for that matter, on your production server, you really need a test server.
SQL 2000 Query Analyzer
In order to open up the SQL 2000 Query Analyzer, you’ll typically select Start, All Programs, Microsoft SQL Server, and Query Analyzer.
M2M requires you to install SQL Server in Mixed Mode and requires administrator access to its databases. Therefore, you can log in to QA with Windows Authentication or SQL Server Authentication. Notice how the SQL Server selection defaults to “localhost” which is the SQL Engine running on the local machine. You can also substitute that with a “.” as well. When connecting to another server, you can either type the name in directly or select the ellipses and a list of SQL Servers on your network will display for you to select.
After logging into QA, you’ll see a screen similar to the following. I’ve entered a simple query and executed it against M2M’s sample database.
Keep in mind that I’m only covering the basics. Queries are entered in the Query Pane, the results are shown in the Results Pane, and you can browse objects such as databases and fields in the Object Browser. I’ve also highlighted where you can change the database your query will run against, as well as the buttons to check your syntax and execute your query.
You can also execute your query by hitting F5 as well. If you have multiple queries in the window, all will execute. However, selecting and highlighting specific query(ies) and hitting F5 will limit execution to those queries.
SQL Server Management Studio (SSMS) 2005 through 2008/R2
I’m covering all of these together, because for the most part they function the same way. To open SSMS, typically you’ll select Start, All Programs, Microsoft SQL Server 2005 (or whichever version you are using), and SQL Server Management Studio. In the screenshot below, you’ll notice that I’ve substituted a “.” for local host.
After logging in, you’ll see something similar to the following.
Notice that I have browsed into the M2MData01 Database, and you can see the objects in the Summary Pane. To write a new query, select the button I’ve highlighted and the Summary Pane. I’ve written and executed the same query below.
This isn’t to say that there haven’t been improvements in SSMS over the years. Brad McGehee (Blog/Twitter) wrote an excellent article on these improvements some time ago.
Third Party Products
There are several third party products you can use to query your database and I will cover TOAD for SQL Server in a later blog post.
Popular Articles