So, finally I decided to start writing a little bit of my SQL experiences
along my career as a DB involved person.
For my first entry, I decided to write about a problem I had a year ago,
about a query that performed very poorly in some conditions, mostly because of
SQL parameter sniffing. I had the following scenario:
And the problematic query was the next (inside a SP, of course):
declare @SubscriptionName nvarchar(60)
select @SubscriptionName = SubscriptionName
from Subscription where SubscriptionId = 246939
select
c.ClientId
,c.ClientName
,su.SubscriptionNumber
from Client c
inner join Subscription su
on c.SubscriptionId = su.SubscriptionId
where su.SubscriptionName = @SubscriptionName
The parameter sniffing occurred because the distribution of the SubscriptionName field in Subscription table. We have a lot of
duplicated names for different SubscriptionIDs.
So, for instance, we have:
select count(*) as [ = 1 ] from (
select SubscriptionName, count(*) as SubscriptionCount
from Subscription su
group by SubscriptionName
having count(*) = 1 ) as Subscriptions
select count(*) as [ > 1] from (
select SubscriptionName, count(*) as SubscriptionCount
from Subscription su
group by SubscriptionName
having count(*) > 1 ) as Subscriptions
So, a lot of non-repeated, against a lot of repeated. But, a deeper analysis
showed this:
select count(*) as [ >= 1000] from (
select SubscriptionName, count(*) as SubscriptionCount
from Subscription su
group by SubscriptionName
having count(*) >= 1000 ) as Subscriptions
select count(*) as [ < 1000] from (
select SubscriptionName, count(*) as SubscriptionCount
from Subscription su
group by SubscriptionName
having count(*) < 1000 ) as Subscriptions
So now we have a LOT of low repeated names, and very few (in comparison)
highly repeated names. And when one of these rows were used to create the
execution plan (or vice versa), the resulting plan was not adequate for the
other kind of distribution.
I couldn't change the table’s structure (nor create any index), and I
was working on SQL2005 at that time, so a lot of possible optimizations could
not be applied. No RECOMPILE option was available either, since the SP was
HEAVILY used, and the DBA was explicitly against it. So I had to think in a
different way of dealing whit this. So he proposed using different aliases to
the Subscription table according to the distribution of the Name used for the
search.
So, if the Name count was above 1000, one alias should be used, and if
not, the default alias had to be used:
declare
@SubscriptionName nvarchar(60),
@sql nvarchar(max),
@alias nvarchar(6)
select @alias = N'su'
select @SubscriptionName = SubscriptionName
from Subscription
where SubscriptionId = @SubscriptionId
if (select count(SubscriptionName)
from Subscription
where SubscriptionName = @SubscriptionName) > 1000
select @alias = N'sus'
set @sql =
N'select top 100
c.ClientId
, c.ClientName
, ' + @alias + '.SubscriptionNumber
from Client c
inner join Subscription ' + @alias + '
on c.SubscriptionId = ' + @alias + '.SubscriptionId
where ' + @alias + '.SubscriptionName = @SubscriptionName'
exec sp_executesql @sql,
N'@SubscriptionName nvarchar(60)',
@SubscriptionName
This way, I was forcing the optimizer
to use one or another execution plan. If the SubscriptionName distribution was under 1000, this was the
generated execution plan:
But, if the SubscriptionName
distribution was above 1000, this happened:
If I used the same alias for both cases, the execution plan was the
same, no matter what distribution the column had. So, this way was able to
force the optimizer to use different ways to deal with the name distribution
issue we had (and have) on that table. I think it’s a nice way to handle some
specific parameter sniffing problems, when you’re not able to use the “normal”
ways of dealing with it.
Ok,
that’s it, thanks for reading! This is my first time writing a IT blog, so any comments and suggestions are highly welcomed. Stay tuned, I will be talking about running background
traces in SQL server to help debugging for the next installment of
this blog.