Why & When should I use SPARSE COLUMN? (SQL SERVER 2008)
A sparse column doesn't use 4x the amount of space to store a value, it uses a (fixed) 4 extra bytes per non-null value. (As you've already stated, a NULL takes 0 space.)
So a non-null value stored in a bit column would be 1 bit + 4 bytes = 4.125 bytes. But if 99% of these are NULL, it is still a net savings.
A non-null value stored in a GUID (UniqueIdentifier) column is 16 bytes + 4 bytes = 20 bytes. So if only 50% of these are NULL, that's still a net savings.
So the "expected savings" depends strongly on what kind of column we're talking about, and your estimate of what ratio will be null vs non-null. Variable width columns (varchars) are probably a little more difficult to predict accurately.
This Books Online Page has a table showing what percentage of different data types would need to be null for you to end up with a benefit.
So when should you use a Sparse Column? When you expect a significant percentage of the rows to have a NULL value. Some examples that come to mind:
- A "Order Return Date" column in an order table. You would hope that a very small percent of sales would result in returned products.
- A "4th Address" line in an Address table. Most mailing addresses, even if you need a Department name and a "Care Of" probably don't need 4 separate lines.
- A "Suffix" column in a customer table. A fairly low percent of people have a "Jr." or "III" or "Esquire" after their name.
Use column set for ALL_SPARSE_COLUMNS in the Sql View
This behaviour is documented for SPARSE
columns when there is a COLUMN_SET
present.
Warning:
Adding a column set changes the behavior of
SELECT *
queries. The query will return the column set as an XML column and not return the individual sparse columns. Schema designers and software developers must be careful not to break existing applications. Individual sparse columns can still be queried by name in aSELECT
statement.
So the view will never contain that column, unless you specifically select it, not just using select *
.
There is another issue that you would get even if it wasn't SPARSE
.
You are adding the column after creating the view.
You need to then run the following statement:
EXEC sp_refreshview N'dbo.v_my_view';
When creating a view, the view is parsed into a compiled expression tree (without any optimizations). Then, when you use the view, the compiler does not simply dump the view text into the outer query. Instead, it parses the outer query into an expression tree, and uses the expression tree from the view in the correct place.
So when you add a column, the expression tree is not updated. So you need to refresh the view definition.
You also need to rebuild any stored procedures which access this table or the view, for the same reason.
EXEC sp_refreshsqlmodule N'dbo.YourProc';
db<>fiddle
Is SPARSE the best way to implement an SQL column union?
According to the BOL section Estimated Space Savings by Data Type you should consider SPARSE
for INT
columns if at least 64% are NULL
.
You have 94.7% NULL
so seems like you could benefit from this to me. This does seem an unusual design though.
Remove SPARSE from column definition
According to Microsoft:
Changing a column from sparse to nonsparse or nonsparse to sparse requires changing the storage format of the column. The SQL Server Database Engine uses the following procedure to accomplish this change:
1) Adds a new column to the table in the new storage size and format.
2) For each row in the table, updates and copies the value stored in the old column to the new column.
3) Removes the old column from the table schema.
4) Rebuilds the table (if there is no clustered index) or rebuilds the clustered index to reclaim space used by the old column.
Note
Step 2 can fail when the size of the data in the row exceeds the maximum allowable row size. This size includes the size of the data stored in the old column and the updated data stored in the new column. This limit is 8060 bytes for tables that do not contain any sparse columns or 8018 bytes for tables that contain sparse columns. This error can occur even if all eligible columns have been pushed off-row.
https://learn.microsoft.com/en-us/sql/relational-databases/tables/use-sparse-columns
Related Topics
The Transaction Log for the Database Is Full
Concat All Column Values in SQL
Should Every SQL Server Foreign Key Have a Matching Index
Sql: Using Null Values VS. Default Values
How to Pass a Temp Table as a Parameter into a Separate Stored Procedure
"Order By" Using a Parameter for the Column Name
Oracle: Updating a Table Column Using Rownum in Conjunction with Order by Clause
SQL Server Join Tables and Pivot
Using Bind Variables with Dynamic Select into Clause in Pl/Sql
How to Get the Full Resultset from Ssms
How to Select the First N Rows of Each Group
SQL Server Cumulative Sum by Group
Redundant Data in Update Statements
SQL Query for Getting Data for Last 3 Months
How to Make a View Column Not Null
Copy Data from One Column to Other Column (Which Is in a Different Table)