Thursday, June 16, 2011

Dynamic search in SQL

There may be a number of reasons to implement dynamic SQL, some are good, some bad, and others...well, how else would we learn?
In this post I am going to focus on one of the better uses for dynamic SQL-building dynamic search based on conditions.
Let's take an example scenario where we need to search a table based on user user input. For the sake of this demonstration let's assume that application logic implies that empty fields should be excluded from search conditions. This leaves a question on how to implement supporting functionality on the server side. While this may not be a problem for databases that use row modeling where data is represented across multiple rows, most of today's relational databases use good ol' columnar representation. 
One of the ways to implement dynamic search query in MS SQL Server 2005/2008 is using COALESCE function.

Select Firstname,LastName, Phone,CustID from myTable where
Firstname=COALESCE(@name,Firstname)
and
Phone=COALESCE(@phone, Phone)


This functionality can easily be replaced with ISNULL checks if you are running an earlier version of SQL Server.
First nuance, which  is rarely addressed, is the issue of NULLs in tuples. Without introducing additional NULL checks for column contents you are running into risk of having a portion of your query turn into NULL=NULL comparison, yielding empty result sets.
Another drawback to keep in mind is that similar queries are great for small tables and smaller set of searchable variables. COALESCE usage may cause full table scans thus significantly affecting performance.  As a faster option for searching large tables we can use dynamic queries. So the statement will now look like this:

Declare @QueryString as nvarchar(1000)
set @QueryString='Select Name,Address,Phone from Customer where '

if @CustName is not null
                set @QueryString+='Name = '+@CustName +' and '
if @CustAdderss is not null
                set @QueryString+='Address='+@CustAdderss+'and '
set @QueryString+='1=1'
EXECUTE sp_executesql @QueryString

While the second option performs faster and generally utilizes less resources, there are a few caveats to keep in mind. As Erland Sommarskog warns in his much referenced "The Curse and Blessings of Dynamic SQL", there are consequences to using dynamic SQL, a few of which are:
  * You cannot access local variables within the block of the dynamic SQL. We can still pass parameters in and out though.
  * Temp tables created in dynamic SQL will not be accessible to the rest of the SP (as a matter of fact, they will be dropped as soon as the dynamic SQL exectues).
  * USE statements within the dynamic SQL do not affect the calling stored procedure.
I have mentioned above that dynamic search query performs better than coalesce search. Now it is time to put the money where the cursor is and deliver the test results. For this performance test I have created three tables, with 78,000 , 430,000 and 2,200,000 records.  I ran each query several times, each time introducing more search variables to see whether the number of search variables may (or may not) affect performance.  
Trace on the smallest table yields the following results:

COALESCE Search
Dynamic Query
78,000 rows
Search Variables
CPU
Duration
Reads
Search Variables
CPU
Duration
Reads
1
47
65
1598
1
0
49
1598
2
31
69
1598
2
31
34
1598
4
63
62
1598
4
16
42
1598
6
32
70
1680
6
31
42
1680
avg
43.25
66.5
1618.5
avg
19.5
41.75
1618.5

As we can see, the dynamic query is slightly faster than COALESCE search.


For the larger table trace results yield the following:
COALESCE Search
Dynamic Query
430,000 rows
Search Variables
CPU
Duration
Reads
Search Variables
CPU
Duration
Reads
1
297
231
8834
1
125
102
8870
2
295
217
8834
2
126
128
8876
4
249
221
8834
4
172
118
8878
6
281
181
8834
6
171
126
8834
avg
280.5
212.5
8834
avg
148.5
118.5
8864.5

It is clear from the second table that the performance gap increases.
And the largest table yields:
COALESCE Search
Dynamic Query
2,200,000 rows
Search Variables
CPU
Duration
Reads
Search Variables
CPU
Duration
Reads
1
1450
1120
44568
1
719
483
44502
2
1435
1117
44568
2
671
434
44518
4
1451
1068
44568
4
811
604
44520
6
1295
1049
44568
6
857
497
44568
avg
1407.75
1088.5
44568
avg
764.5
504.5
44527
Now we can see that the dynamic query performs nearly twice as fast as COALESCE search, with CPU resource utilization of nearly half than that of COALESCE search.  Based on trace results it is also clear that there is no significant performance change associated with the number of search variables. 

No comments:

Post a Comment