Wednesday, October 2, 2013

Different table alias => Different execution plan


So, I finally decided to start writing a little bit about the SQL experiences I've gathered along my career as a DB-involved person.

For my first entry, I decided to write about a problem I had a year ago: a query that performed very poorly under some conditions, mostly because of SQL Server parameter sniffing. I had the following scenario:
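The original schema isn't reproduced here, so here is a minimal sketch of the two tables involved, with column names taken from the queries below and the types assumed, just to make the rest of the post easier to follow:

create table Subscription (
    SubscriptionId     int           not null primary key,  -- assumed key
    SubscriptionName   nvarchar(60)  not null,               -- heavily duplicated values
    SubscriptionNumber nvarchar(20)  not null                -- type assumed
)

create table Client (
    ClientId       int           not null primary key,       -- assumed key
    ClientName     nvarchar(100) not null,                    -- type assumed
    SubscriptionId int           not null                     -- references Subscription
)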


And the problematic query was the following (inside a stored procedure, of course):
       
declare @SubscriptionName nvarchar(60)

select @SubscriptionName = SubscriptionName
from Subscription
where SubscriptionId = 246939

select
      c.ClientId
    , c.ClientName
    , su.SubscriptionNumber
from Client c
inner join Subscription su
    on c.SubscriptionId = su.SubscriptionId
where su.SubscriptionName = @SubscriptionName

The parameter sniffing occurred because of the distribution of the SubscriptionName column in the Subscription table: we have a lot of duplicated names across different SubscriptionIds. So, for instance, we have:
       
select count(*) as [ = 1 ] from (
    select SubscriptionName, count(*) as SubscriptionCount
    from Subscription su
    group by SubscriptionName
    having count(*) = 1 ) as Subscriptions

 = 1
-----
26275

select count(*) as [ > 1 ] from (
    select SubscriptionName, count(*) as SubscriptionCount
    from Subscription su
    group by SubscriptionName
    having count(*) > 1 ) as Subscriptions

 > 1
-----
11093

So: a lot of names that appear only once, against a lot of repeated ones. But a deeper analysis showed this:
select count(*) as [ >= 1000 ] from (
    select SubscriptionName, count(*) as SubscriptionCount
    from Subscription su
    group by SubscriptionName
    having count(*) >= 1000 ) as Subscriptions

 >= 1000
--------
17

select count(*) as [ < 1000 ] from (
    select SubscriptionName, count(*) as SubscriptionCount
    from Subscription su
    group by SubscriptionName
    having count(*) < 1000 ) as Subscriptions

 < 1000
-------
37351

So now we have a LOT of names with low repetition, and very few (in comparison) highly repeated names. And when one of these values was used to create the execution plan (or vice versa), the resulting plan was not adequate for the other kind of distribution.
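To see that skew the way the optimizer sees it, you can inspect the statistics histogram on SubscriptionName. This is just a sketch, assuming an auto-created statistic exists on that column; the _WA_Sys_placeholder name below is hypothetical, replace it with whatever name the first query returns:

-- find the (auto-created) statistics object on SubscriptionName
select s.name as stats_name
from sys.stats s
inner join sys.stats_columns sc
    on s.object_id = sc.object_id and s.stats_id = sc.stats_id
inner join sys.columns c
    on sc.object_id = c.object_id and sc.column_id = c.column_id
where s.object_id = object_id(N'Subscription')
  and c.name = N'SubscriptionName'

-- inspect its histogram (replace the placeholder with the name returned above)
dbcc show_statistics (N'Subscription', _WA_Sys_placeholder) with histogram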

I couldn't change the table’s structure (nor create any index), and I was working on SQL Server 2005 at the time, so a lot of possible optimizations could not be applied. RECOMPILE was not an option either, since the SP was HEAVILY used and the DBA was explicitly against it. So I had to think of a different way of dealing with this, and the proposal was to use different aliases for the Subscription table according to the distribution of the name used for the search.

So, if the name count was above 1000, one alias would be used; if not, the default alias would be used:
declare
  @SubscriptionName   nvarchar(60),
  @sql                nvarchar(max),
  @alias              nvarchar(6)

-- default alias, used for the low-count names
select @alias = N'su'

-- @SubscriptionId is the stored procedure's input parameter
select @SubscriptionName = SubscriptionName
from Subscription
where SubscriptionId = @SubscriptionId

-- highly repeated names get a different alias, and therefore a different query text
if (select count(SubscriptionName)
    from Subscription
    where SubscriptionName = @SubscriptionName) > 1000
  select @alias = N'sus'

set @sql =
  N'select top 100
      c.ClientId
    , c.ClientName
    , ' + @alias + N'.SubscriptionNumber
  from Client c
  inner join Subscription ' + @alias + N'
    on c.SubscriptionId = ' + @alias + N'.SubscriptionId
  where ' + @alias + N'.SubscriptionName = @SubscriptionName'

-- the search value is still passed as a parameter, so it can be sniffed as usual
exec sp_executesql @sql,
    N'@SubscriptionName nvarchar(60)',
    @SubscriptionName

This way, I was forcing the optimizer to use one or the other execution plan. Since the query text changes with the alias, sp_executesql caches a separate plan for each variant, and each one gets compiled (and sniffed) for its own kind of parameter. If the SubscriptionName count was under 1000, this was the generated execution plan:


But if the SubscriptionName count was above 1000, this happened:


If I used the same alias for both cases, the execution plan was the same, no matter what distribution the value had. So, this way I was able to force the optimizer to deal differently with the name distribution issue we had (and still have) on that table. I think it’s a nice way to handle some specific parameter sniffing problems, when you’re not able to use the “normal” ways of dealing with them.
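As a rough sanity check (just a sketch, assuming you have VIEW SERVER STATE permission), you can confirm that each alias really gets its own entry in the plan cache by looking for the two query texts:

select st.text, cp.usecounts, cp.objtype
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like N'%.SubscriptionName = @SubscriptionName%'
  and st.text not like N'%dm_exec_cached_plans%'  -- exclude this query itself

After both branches have run at least once, you should see two cached entries, one per alias, each with its own plan and use count.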

Ok, that’s it, thanks for reading! This is my first time writing an IT blog, so any comments and suggestions are highly welcome. Stay tuned; in the next installment I will be talking about running background traces in SQL Server to help with debugging.