In SQL Server, Why Is It That Null Does Not Equal Empty String and Doesn't Not Equal Empty String

In SQL Server, why is it that NULL does not equal empty string AND doesn't not equal empty string?

"How can both of those WHEREs be "false"?"

It's not!
The answer is not "true" either!
The answer is "we don't know".

Think of NULL as a value you don't know yet.

Would you bet it's '' ?

Would you bet it's not '' ?

So, safer is to declare you don't know yet. The answer to both questions, therefore, is not false but I don't know, e.g. NULL in SQL.

How to check for Is not Null And Is not Empty string in SQL server?

If you only want to match "" as an empty string

WHERE DATALENGTH(COLUMN) > 0 

If you want to count any string consisting entirely of spaces as empty

WHERE COLUMN <> '' 

Both of these will not return NULL values when used in a WHERE clause. As NULL will evaluate as UNKNOWN for these rather than TRUE.

CREATE TABLE T 
(
C VARCHAR(10)
);

INSERT INTO T
VALUES ('A'),
(''),
(' '),
(NULL);

SELECT *
FROM T
WHERE C <> ''

Returns just the single row A. I.e. The rows with NULL or an empty string or a string consisting entirely of spaces are all excluded by this query.

SQL Fiddle

Empty string vs NULL

Some differences between them:

  • NULL can be assigned to any type, as opposed to empty string which won't be compatible with date/numerical fields.
  • NULL is an UNKNOWN value, it doesn't have a value as opposed to an empty string, which is a value, but empty one.
  • As far as I'm aware of, NULL shouldn't capture memory as opposed to an empty string which does.
  • null = null will result in null as opposed to ''='' which will result in TRUE.

Comparing empty string with null value - SQL Server

Your first example returns fail because you have the wrong operator. If you want to see if something equals something else you use =, not !=

Here is the code that proves that NULL can be compared to '':

DECLARE @EmptyString VARCHAR(20) = '',
@Null VARCHAR(20) = Null;

SELECT
CASE WHEN ISNULL(@EmptyString, '') = ISNULL(@Null, '')
THEN 'Pass' ELSE 'Fail'
END AS EmptyStringVsNull

It returns pass because you use =, not !=

SQL Server: Null VS Empty String

There's a nice article here which discusses this point. Key things to take away are that there is no difference in table size, however some users prefer to use an empty string as it can make queries easier as there is not a NULL check to do. You just check if the string is empty. Another thing to note is what NULL means in the context of a relational database. It means that the pointer to the character field is set to 0x00 in the row's header, therefore no data to access.

Update
There's a detailed article here which talks about what is actually happening on a row basis

Each row has a null bitmap for columns that allow nulls. If the row in
that column is null then a bit in the bitmap is 1 else it's 0.

For variable size datatypes the acctual size is 0 bytes.

For fixed size datatype the acctual size is the default datatype size
in bytes set to default value (0 for numbers, '' for chars).

the result of DBCC PAGE shows that both NULL and empty strings both take up zero bytes.

SQL Server: NULL or empty string?

What you have already stated yourself and the code you have given is correct.

Notably, the code from your first example,

`INNER JOIN myTable B ON ISNULL(B.someColumn, '') = ISNULL(A.someColumn, '')`

forces SQL server to scan the whole tables and to compute ISNULL(...) before being able to do the join. This means that it can't use any index to speed up the join, which will drastically decrease performance. The same procedure will be repeated when executing that query the next time, so you can't expect that subsequently running the query again will be faster.

[ For the record, of course you could make ISNULL(...) a computed column, index that and use it in the joins, but that seems a little exaggerated given the other possible solutions. ]

Whether you should use '' instead of NULL or rather make another approach depends of the effort the former requires:

If your application touches that column only if it wants to write a meaningful (read: non-null) value there, then you could solve the problem in three easy steps (update the column and turn all NULL values into ''; make the column non-nullable; change the column's default value from NULL to '').

But if your application touches that column in every case and actively writes NULL to it if appropriate, you would have to change the application itself to take that route. Whether changing the application is possible is known only to you ...

If you can't change the application, your third approach is good and works reliably:

(B.someColumn = A.someColumn) or (B.someColumn is NULL and A.someColumn is NULL)

It wouldn't disturb me that it is "wordy". The length of code in general does not mean anything with respect to performance, and in this case, the wordy code enables you to leave things as-is in your application (except this query), while enabling SQL server to use indexes to speed up the join.

In summary:

Use '' instead of NULL if your application allows that easily, or can be changed easily to allow it. Otherwise, use your third approach.

A final note regarding ANSI_NULLS:

Walter Vehoeven's comment below your question is correct, but in your case, ANSI_NULLS would not change anything. From the documentation (first paragraph below the table, formatting mine):

SET ANSI_NULLS ON affects a comparison only if one of the operands of the comparison is either a variable that is NULL or a literal NULL. If both sides of the comparison are columns or compound expressions, the setting does not affect the comparison.

The second sentence means that it won't have any effect in your case, because you are comparing / joining two columns (even if it actually is the same column on both sides of the comparison).

Not equal != operator on NULL

<> is Standard SQL-92; != is its equivalent. Both evaluate for values, which NULL is not -- NULL is a placeholder to say there is the absence of a value.

Which is why you can only use IS NULL/IS NOT NULL as predicates for such situations.

This behavior is not specific to SQL Server. All standards-compliant SQL dialects work the same way.

Note: To compare if your value is not null, you use IS NOT NULL, while to compare with not null value, you use <> 'YOUR_VALUE'. I can't say if my value equals or not equals to NULL, but I can say if my value is NULL or NOT NULL. I can compare if my value is something other than NULL.

How to use NULL or empty string in SQL

Select *
From Table
Where (col is null or col = '')

Or

Select *
From Table
Where IsNull(col, '') = ''


Related Topics



Leave a reply



Submit