SQL Server Performance and the IN clause

I enjoy using sub-queries with the IN clause because it fits in my head nicely. I mean that I can translate a sub-query easily in my head with little effort whereas a join requires a little more brain gymnastics. Of course, mileage may vary for you.
Here’s an example taken from the Internet:

USE AdventureWorks ;
GO
SELECT DISTINCT c.LastName, c.FirstName
FROM Person.Contact c JOIN HumanResources.Employee e
ON e.ContactID = c.ContactID WHERE EmployeeID IN
(SELECT SalesPersonID
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM Sales.SalesOrderDetail
WHERE ProductID IN
(SELECT ProductID
FROM Production.Product p
WHERE ProductNumber LIKE ‘FW%’)));
GO

This appeals to my way of thinking: the employeeIDs from the SalesOrderHeader table and orderID from the SalesOrderDetail and ProductID like a certain product number. The bucket brigade up the list of nested queries makes eminent sense.
Unfortunately, the SQL Server database engine isn’t human and doesn’t ‘think’ like me either. It’s inner workings, like all relational databases, are based on Set Theory from the abstract world of mathematics.
So if efficiency is what you’re after then use joins. Even if I only remove one of the sub-queries above and make it a join I get a performance boost:

USE AdventureWorks ;
GO
SELECT DISTINCT c.LastName, c.FirstName
FROM Person.Contact c JOIN HumanResources.Employee e
ON e.ContactID = c.ContactID WHERE EmployeeID IN
(SELECT SalesPersonID
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM Sales.SalesOrderDetail sod, Production.Product p
WHERE sod.ProductID = p.ProductID
and p.ProductNumber LIKE ‘FW%’));
GO

Here the last sub-query is made into a join. Try it for yourself with your own T-SQL code.
As I’ve heard said before. The best optimization is done using brain power rather than computing power.

About tgrignon

I came I saw I rented the DVD
This entry was posted in Miscellany and tagged , , . Bookmark the permalink.

6 Responses to SQL Server Performance and the IN clause

  1. tgrignon says:

    Two good articles on Optimization by Tim Chapman on the Tech Republic site.

    Basic (agrees with many of the points Ken Henderson makes in chapter 16 [Transact-SQL Performance Tuning] of The Guru’s Guide to Transact-SQL ):
    http://blogs.techrepublic.com.com/datacenter/?p=173
    Advanced:
    http://blogs.techrepublic.com.com/datacenter/?p=179

  2. Abe Miessler says:

    I came across your post when I was writing my own post on the same issue. Take a look at my solution and let me know what you think…

    -Abe

    Performance tuning SQL that uses the In Clause

  3. Pingback: 2010 in review « Golbing

  4. Daniel says:

    Hi everyone,

    Please, some light over my problem…

    I´m executing the query:

    declare @dir varchar
    SET @dir = ‘4, 2’
    SELECT *, empresa FROM DIRETORIAS WHERE (empresa IN (@dir))

    The problem is that only the first record is showed “4”, if I change to ‘2, 4’ only one record is showed, too, “2”.

    Thanks in advance. Best

    Daniel

    • Lupis42 says:

      @Daniel,

      Three problems: first, by not specifying a width, you’re setting your @dir to the first character of the string you pass.

      Second, the IN statement will match the whole value of @dir to the requisite column. If you want it to expand out, you would need to create @dir as a table variable, or use dynamic sql.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s