Wednesday, August 27, 2014

NULL vs Empty Strings

A lot of situations I had to face, info I had to learn, or study I had to make about SQL came from a job colleague asking me something. This is no different, since this time the question was: "I have an address field. It is better to store the empty fields as NULL or as empty strings?"

Ok, let's get down to business. This is a rather easy test, since we only need to create two tables, one with a varchar (or nvarchar) NULLable field , and another table with the same datatype, but non NULLable field. After that, fill the first table with NULL data, and the second one with empty strings. Also fill both tables with random numbers. This is the creation/filling script:
create table temp_null(id int identity primary key, data nvarchar(4000) NULL)
create table temp_emptystring(id int identity primary key, data nvarchar(4000) NOT NULL)

declare @counter int = 0
while (@counter < 100000)
begin
 insert into temp_null(data) values(NULL)
 insert into temp_emptystring(data) values('')

 set @counter = @counter + 1
end

set @counter = 0
while (@counter < 100000)
begin
 insert into temp_null(data) values(abs(checksum(@counter)))
 insert into temp_emptystring(data) values(abs(checksum(@counter)))

 set @counter = @counter + 1
end
Let's also create some indexes on both tables on the data column:
create index IX_temp_null_data on temp_null(data)

create index IX_temp_emptystring on temp_emptystring(data)
Ok, now with the tables/index created , let's check different things. First, space used:
exec sp_spaceused 'temp_null'
exec sp_spaceused 'temp_emptystring'
name rows reserved data index_size unused
temp_null 200000 11544 KB 6720 KB 3752 KB 1072 KB
name rows reserved data index_size unused
temp_emptystring 200000 10392 KB 6720 KB 3144 KB 528 KB

Less space used by date on the empty string option, and also less space used on index on the same option, ok. Next, search performance check. For that, let's execute some search queries. The search will be done on the data column, if it's NULL, or equal to an empty string, and also if it's NOT NULL, or different than an empty string:
select * from temp_null
where data is NULL

select * from temp_emptystring
where data = ''

select * from temp_null
where data is not NULL

select * from temp_emptystring
where data != ''
These are the results:

Lower logical reads and lower execution plan consumption when using empty strings. So using NULL values is not so performant as using empty strings. Good info to have in mind when designing new tables and CRUDs.

No comments:

Post a Comment