Having had the opportunity to do extensive SQL work as a developer when first starting out I have been accustomed to joins for some time. Joins are one of the basic constructions of SQL and Databases as such – they combine records from two or more database tables into one row source. Depending on join type and join restrictions returned row count can be from 0 till all possible combinations of involved tables. Databases are built to make joins as efficient as possible. It means almost always joining data in database is more efficient than doing that somewhere else. It also means one has to know the power and possibilities of joins to fully exploit their strength.
Types of Joins
These are the most common joins that I have ran into:
- INNER JOIN – only rows satisfying selection criteria from both joined tables are selected.
- LEFT OUTER JOIN – rows satisfying selection criteria from both joined tables are selected as well as all remaining rows from left joined table are being kept along with Nulls instead of actual right joined table values.
- RIGHT OUTER JOIN – rows satisfying selection criteria from both joined tables are selected as well as all remaining rows from right joined table are being kept along with Nulls instead of actual left joined table values.
- FULL OUTER JOIN – rows satisfying selection criteria from both joined tables are selected as well as all remaining rows both from left joined table and right joined table are being kept along with Nulls instead of values from other table.
- CROSS JOIN – A cross join performs a cartesian product on the tuples of the two sets.
Usually cross joins are used quite rarely; some of the scenarios could be as follows:
- Possibility to generate high amount of rows. As we can see from relatively small tables there is possibility to get quite monstrous numbers.
- Find out all possible row combinations of some tables. Mostly this is useful for reports where one needs to generate all combinations for example all nationalities x genders for persons.
- To join a table with just one row. Most often used to get some configuration parameters.
Some other questions I’ve been asked over the years is proper aliasing. I’ve had people use the a,b,c, method and a short table def. for example:
from table 1 a (nolock)
inner join table 2 b on a.id = b.id
from customer cust (nolock)
inner join address adr on cust.id = adr.id
I’ve seen queries astronomicaly long with nasty subselects, joins, union alls, etc. Alot of times in my query tuning and looking at poorly designed queries, stored procedures, udf’s etc it is quickly seeable whether joins are being used correctly or incorrectly. When one takes the proper time to review the database and the schema and then build the queries with powerful joins performance can be quite optimal.
Future T-SQL Tuesdays
I urge you to take the time to check out Sebastians site and hey while I’m at it if you feel like you want to host a T-SQL Tuesday give Adam Machanic a shout. I can tell you from experience that I’m glad I did. Getting involved is a great way to get going with the SQL Community and the SQL family.
Thanks Sebastian for hosting this month!