Friday 19 April 2013

Null storage

I was trying to figure out how much space storage of nulls takes. There are any number of articles online (SQL Server 2012 and prior versions) saying that nulls take no space except for the 2 byte row overhead for having any number of nullable columns.

However, I am convinced I read somewhere that null columns only take no space when they are not subsequently followed by non null data.

So I set up a small experiment that is easy to repeat :

       
 create table fred (
      name varchar(100) not null,
      name2 varchar(100) null,
      name3 varchar(100) not null
)
create table fred2 (
      name varchar(100) not null,
      name3 varchar(100) not null,
      name2 varchar(100) null
)

set nocount on

declare @i int
set @i = 0
while @i < 50000 begin
  set @i=@i+1
  insert into fred values ('dklfsdklfsdlhfsdfhsdfjsdjk',null,'jkadlfhklsdfsdfjksdjkfjksd')
  insert into fred2 values ('dklfsdklfsdlhfsdfhsdfjsdjk','jkadlfhklsdfsdfjksdjkfjksd',null)
  if @i%1000 = 0 select @i -- display progress occassionally
end
 
exec sp_spaceused 'fred'
exec sp_spaceused 'fred2'
 
drop table fred
drop table fred2
       
 
I had a couple of false starts when running this so I received the output below from sp_spaceused where I had more rows that specified in the code but the principle remains the same :

So to summarise what I've done, both tables have 3 fields of varchar(100), the table "fred" has the middle field nullable and the table "fred2" has the final field nullable. Both are filled with identical data and the results show that fred2 with the trailing null colum has taken less space.

This demonstrates that storage of nulls last is most space efficient which on a larger table could translate to real returns on the amount of data being stored on disk and in the buffer cache.

I have a document called "Writing Efficient SQL" which is aimed at in-house developers and I have updated that to recommend all not null fields in a table come first, followed by fixed length nullable fields (fixed length fields are still a fixed length whether null or not it seems) followed by nullable variable length fields in order the the most likely to be not null first to give the greatest chance of storage efficiency.

This does mean that fields are not in a natural order, indeed, it might be easier to simply keep to natural order for all except the variable length nullable fields but generally ordering of fields is not relevant to presentation to the user so shouldn't really make any usability differences.

No comments:

Post a Comment