I enjoy using sub-queries with the IN clause because it fits in my head nicely. I mean that I can translate a sub-query easily in my head with little effort whereas a join requires a little more brain gymnastics. Of course, mileage may vary for you.
Here’s an example taken from the Internet:
USE AdventureWorks ;
GO
SELECT DISTINCT c.LastName, c.FirstName
FROM Person.Contact c JOIN HumanResources.Employee e
ON e.ContactID = c.ContactID WHERE EmployeeID IN
(SELECT SalesPersonID
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM Sales.SalesOrderDetail
WHERE ProductID IN
(SELECT ProductID
FROM Production.Product p
WHERE ProductNumber LIKE ‘FW%’)));
GO
This appeals to my way of thinking: the employeeIDs from the SalesOrderHeader table and orderID from the SalesOrderDetail and ProductID like a certain product number. The bucket brigade up the list of nested queries makes eminent sense.
Unfortunately, the SQL Server database engine isn’t human and doesn’t ‘think’ like me either. It’s inner workings, like all relational databases, are based on Set Theory from the abstract world of mathematics.
So if efficiency is what you’re after then use joins. Even if I only remove one of the sub-queries above and make it a join I get a performance boost:
USE AdventureWorks ;
GO
SELECT DISTINCT c.LastName, c.FirstName
FROM Person.Contact c JOIN HumanResources.Employee e
ON e.ContactID = c.ContactID WHERE EmployeeID IN
(SELECT SalesPersonID
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM Sales.SalesOrderDetail sod, Production.Product p
WHERE sod.ProductID = p.ProductID
and p.ProductNumber LIKE ‘FW%’));
GO
Here the last sub-query is made into a join. Try it for yourself with your own T-SQL code.
As I’ve heard said before. The best optimization is done using brain power rather than computing power.
Two good articles on Optimization by Tim Chapman on the Tech Republic site.
Basic (agrees with many of the points Ken Henderson makes in chapter 16 [Transact-SQL Performance Tuning] of The Guru’s Guide to Transact-SQL ):
http://blogs.techrepublic.com.com/datacenter/?p=173
Advanced:
http://blogs.techrepublic.com.com/datacenter/?p=179
I came across your post when I was writing my own post on the same issue. Take a look at my solution and let me know what you think…
-Abe
Performance tuning SQL that uses the In Clause
Pingback: 2010 in review « Golbing
Hi everyone,
Please, some light over my problem…
I´m executing the query:
declare @dir varchar
SET @dir = ’4, 2′
SELECT *, empresa FROM DIRETORIAS WHERE (empresa IN (@dir))
The problem is that only the first record is showed “4″, if I change to ’2, 4′ only one record is showed, too, “2″.
Thanks in advance. Best
Daniel
@Daniel,
Three problems: first, by not specifying a width, you’re setting your @dir to the first character of the string you pass.
Second, the IN statement will match the whole value of @dir to the requisite column. If you want it to expand out, you would need to create @dir as a table variable, or use dynamic sql.
usefull link for Performance optimization
http://blogunlimited.com/MohammedRashid/archive/2012/04/10/performance-optimization-of-sql-server-using-where-clause.aspx