Is It Faster to Check If Length = 0 Than to Compare It to an Empty String

Edit
You've updated your question since I first looked at it. For that example, you should definitely always use

SELECT user_id FROM users WHERE user_email = ''

rather than

SELECT user_id FROM users WHERE LEN(user_email) = 0

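The first one allows an index to be used, because the column is compared directly; wrapping the column in LEN() means the predicate generally can't be matched with an index seek. As a performance optimisation this will trump a string micro-optimisation every time! To see this: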

SELECT * into #temp FROM [master].[dbo].[spt_values]

CREATE CLUSTERED INDEX ix ON #temp([name],[number])

SELECT [number] FROM #temp WHERE [name] = ''

SELECT [number] FROM #temp WHERE LEN([name]) = 0

Execution Plans

Original Answer

In the code below (SQL Server 2008; I "borrowed" the timing framework from @8kb's answer here) I got a slight edge for testing the length rather than the contents when @stringToTest contained a string. The timings were equal when it was NULL. I probably didn't test enough to draw any firm conclusions, though.

In a typical execution plan I would imagine the difference is negligible. If you're doing so much string comparison in T-SQL that it makes a significant difference, you should probably be using a different language for it.

DECLARE @date DATETIME2
DECLARE @testContents INT   -- accumulated microseconds for the = '' test
DECLARE @testLength INT     -- accumulated microseconds for the LEN() = 0 test

SET @testContents = 0
SET @testLength = 0

DECLARE
    @count INT,
    @value INT,
    @stringToTest varchar(100)

SET @stringToTest = 'jasdsdjkfhjskdhdfkjshdfkjsdehdjfk'
SET @count = 1

-- Note: GETDATE() returns datetime (roughly 3 ms resolution),
-- so each individual iteration is timed only coarsely.
WHILE @count < 10000000
BEGIN

    -- Time the comparison against the empty string
    SET @date = GETDATE()
    SELECT @value = CASE WHEN @stringToTest = '' THEN 1 ELSE 0 END
    SET @testContents = @testContents + DATEDIFF(MICROSECOND, @date, GETDATE())

    -- Time the length check
    SET @date = GETDATE()
    SELECT @value = CASE WHEN LEN(@stringToTest) = 0 THEN 1 ELSE 0 END
    SET @testLength = @testLength + DATEDIFF(MICROSECOND, @date, GETDATE())

    SET @count = @count + 1
END

SELECT
    @testContents / 1000000. AS Seconds_TestingContents,
    @testLength / 1000000. AS Seconds_TestingLength

Is checking whether string.length == 0 still faster than checking string == ""?

Usually, string objects store their length, so reading and comparing that integer is very fast and touches less memory than an equals() call, which in the worst case has to check the length and then loop over the characters.

That said, the equals() method of a string should nowadays also check the length first, so it should be nearly as fast as checking the length directly.

equals part in Java (http://www.docjar.com/html/api/java/lang/String.java.html):

int n = count;
if (n == anotherString.count) {...}

equals part in Objective-C (http://www.opensource.apple.com/source/CF/CF-476.15/CFString.c) - NSString is based on CFString:

if (len1 != __CFStrLength2(str2, contents2)) return false;
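
The length-first pattern in the two excerpts above can be sketched in Go as follows. This is only an illustration: equalLengthFirst is a hypothetical helper written for this example and is not taken from either library.

```go
package main

import "fmt"

// equalLengthFirst is a hypothetical helper (not from any library).
// It rejects on a length mismatch before reading any bytes.
func equalLengthFirst(a, b string) bool {
	if len(a) != len(b) {
		return false // decided by one integer comparison
	}
	// Same length: fall back to comparing byte by byte.
	for i := 0; i < len(a); i++ {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(equalLengthFirst("", ""))       // both empty
	fmt.Println(equalLengthFirst("abc", ""))    // decided by length alone
	fmt.Println(equalLengthFirst("abc", "abd")) // decided inside the loop
}
```

When one side is the empty string, the loop is never entered, which is why such an equals() ends up close to a plain length check.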

Checking for string contents? string Length Vs Empty String

Yes, it depends on language, since string storage differs between languages.

  • Pascal-type strings: Length = 0.
  • C-style strings: [0] == 0.
  • .NET: .IsNullOrEmpty.

Etc.
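
To make the storage difference concrete, here is a rough Go model of the first two layouts. The types pascalString and cString are hypothetical and exist only for this sketch; they do not correspond to real Pascal or C APIs.

```go
package main

import "fmt"

// pascalString is a hypothetical model of a length-prefixed string.
type pascalString struct {
	length int
	data   []byte
}

// cString is a hypothetical model of a NUL-terminated buffer.
type cString []byte

// isEmpty only has to look at the stored length.
func (p pascalString) isEmpty() bool { return p.length == 0 }

// isEmpty only has to look at the first byte. A buffer with no bytes
// at all is treated as empty here, which is an assumption of this sketch.
func (c cString) isEmpty() bool { return len(c) == 0 || c[0] == 0 }

func main() {
	fmt.Println(pascalString{}.isEmpty())      // length field is 0
	fmt.Println(cString{0}.isEmpty())          // first byte is the terminator
	fmt.Println(cString("hi\x00").isEmpty())   // first byte is 'h'
}
```

In both models the emptiness test is a constant-time check; what differs is which piece of data it reads.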

Why should I use string.length == 0 over string == "" when checking for empty string in ECMAScript?

I actually prefer that technique in a number of languages, since it's sometimes hard to differentiate between an empty string literal "" and several other strings (" ", '"').

But there's another reason to avoid theString == "" in ECMAScript: 0 == "" evaluates to true, as does false == "" and 0.0 == ""...

...so unless you know that theString is actually a string, you might end up causing problems for yourself by using the weak comparison. Fortunately, you can avoid this with judicious use of the strict equal (===) operator:

if (theString === "") {
    // theString is a string and is empty
}

See also:

  • What is the best way to check for an empty string in JavaScript?

Why is String.IsNullOrEmpty faster than String.Length?

It's because you ran your benchmark from within Visual Studio, which prevents the JIT compiler from optimizing the code. Without optimizations, this code is produced for String.IsNullOrEmpty:

00000000 push ebp
00000001 mov ebp,esp
00000003 sub esp,8
00000006 mov dword ptr [ebp-8],ecx
00000009 cmp dword ptr ds:[00153144h],0
00000010 je 00000017
00000012 call 64D85BDF
00000017 mov ecx,dword ptr [ebp-8]
0000001a call 63EF7C0C
0000001f mov dword ptr [ebp-4],eax
00000022 movzx eax,byte ptr [ebp-4]
00000026 mov esp,ebp
00000028 pop ebp
00000029 ret

Now compare it to the code produced for Length == 0:

00000000 push ebp
00000001 mov ebp,esp
00000003 sub esp,8
00000006 mov dword ptr [ebp-8],ecx
00000009 cmp dword ptr ds:[001E3144h],0
00000010 je 00000017
00000012 call 64C95BDF
00000017 mov ecx,dword ptr [ebp-8]
0000001a cmp dword ptr [ecx],ecx
0000001c call 64EAA65B
00000021 mov dword ptr [ebp-4],eax
00000024 cmp dword ptr [ebp-4],0
00000028 sete al
0000002b movzx eax,al
0000002e mov esp,ebp
00000030 pop ebp
00000031 ret

You can see that the code for Length == 0 does everything that the code for String.IsNullOrEmpty does, but it additionally converts the boolean result of the length comparison to a boolean again, and this makes it slower than String.IsNullOrEmpty.

If you compile the program with optimizations enabled (Release mode) and run the .exe file directly from Windows, the code generated by the JIT compiler is much better. For String.IsNullOrEmpty it is:

001f0650 push ebp
001f0651 mov ebp,esp
001f0653 test ecx,ecx
001f0655 je 001f0663
001f0657 cmp dword ptr [ecx+4],0
001f065b sete al
001f065e movzx eax,al
001f0661 jmp 001f0668
001f0663 mov eax,1
001f0668 and eax,0FFh
001f066d pop ebp
001f066e ret

and for Length == 0:

001406f0 cmp dword ptr [ecx+4],0
001406f4 sete al
001406f7 movzx eax,al
001406fa ret

With this code, the results are as expected, i.e. Length == 0 is slightly faster than String.IsNullOrEmpty.

It's also worth mentioning that using LINQ, lambda expressions and modulo computations in your benchmark is not such a good idea, because these operations are slow (relative to string comparison) and make the benchmark results inaccurate.

Is string comparison faster than string length?

Well, why not test it? http://jsperf.com/empty-string-comparison2

In terms of calculations per second, they differ by less than 1% (at least on Chromium). Unless you're testing millions of strings every second, I wouldn't worry about it.

Which is faster, string.empty() or string.size() == 0?

The standard defines empty() like this:

bool empty() const noexcept;

Returns: size() == 0.

You'd be hard-pressed to find something that doesn't do that, and any performance difference would be negligible due to both being constant time operations. I would expect both to compile to the exact same assembly on any reasonable implementation.

That said, empty() is clear and explicit. You should prefer it over size() == 0 (or !size()) for readability.

len(string) == 0 or len(string) < 1

Since the empty string is the zero value of a string, you should compare against that.

str == ""

Checking variables against their zero values to see if they are empty is the Go way of doing this.

In terms of performance, there's no notable difference. len(str) reads like a function call, so in theory it could be slower, although len is a built-in that the compiler handles directly.

EDIT: Some evidence:

I benchmarked this code:

func BenchmarkNil(b *testing.B) {
	str := "asd"
	cnt := 0
	for i := 0; i < b.N; i++ {
		if str == "" {
			cnt++
		}
	}
}

with the three different checks in the if-statement: str == "", len(str) == 0 and len(str) < 1.

BenchmarkLenEq-8      2000000000    0.77 ns/op
BenchmarkLenLess-8    2000000000    0.76 ns/op
BenchmarkNil-8        2000000000    0.50 ns/op

The results above are for str := "asd"; those below are for str := "". For checking against an empty string, there is no measurable difference. Checking against a non-empty string takes more time, and there the str == "" check (BenchmarkNil) is noticeably faster.

BenchmarkLenEq-8      2000000000    0.34 ns/op
BenchmarkLenLess-8    2000000000    0.33 ns/op
BenchmarkNil-8        2000000000    0.33 ns/op

EDIT2:
The only thing you can do these days to be somewhat sure of how fast something is, is to benchmark it. Modern CPUs are superscalar, so the assumption of one clock cycle per instruction is simply not true anymore. The benchmark code comparing against the empty string ran at 2.94*10^9 op/s on my 4GHz 6700k, which is less than two clock cycles per loop iteration. The str == "" check against the non-empty string ran at 2*10^9 op/s on the same CPU.

This means about 2 CPU cycles per loop iteration for the str == "" check and about 3 for the len check when the string is non-empty, and between one and two cycles per loop iteration for the check against the empty string.
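
The cycle counts follow from multiplying the ns/op figures by the clock speed, since 1 GHz is one cycle per nanosecond. A small sketch of that arithmetic, assuming the 4GHz clock quoted above holds steadily during the run (cyclesPerOp is a hypothetical helper name):

```go
package main

import "fmt"

// cyclesPerOp converts a benchmark result in ns/op into clock cycles
// per iteration for a given clock speed in GHz (1 GHz = 1 cycle per ns).
func cyclesPerOp(nsPerOp, clockGHz float64) float64 {
	return nsPerOp * clockGHz
}

func main() {
	const clockGHz = 4.0 // assumption: the CPU really ran at 4GHz throughout
	// ns/op figures taken from the benchmark output above.
	for _, ns := range []float64{0.34, 0.50, 0.77} {
		fmt.Printf("%.2f ns/op is about %.2f cycles per iteration\n",
			ns, cyclesPerOp(ns, clockGHz))
	}
}
```

Turbo boost or frequency scaling would change the real clock, so these numbers are only approximate.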

What is the best way to test for an empty string in Go?

Both styles are used within Go's standard library.

if len(s) > 0 { ... }

can be found in the strconv package: http://golang.org/src/pkg/strconv/atoi.go

if s != "" { ... }

can be found in the encoding/json package: http://golang.org/src/pkg/encoding/json/encode.go

Both are idiomatic and are clear enough. It is more a matter of personal taste and about clarity.

Russ Cox writes in a golang-nuts thread:

The one that makes the code clear.

If I'm about to look at element x I typically write len(s) > x, even for x == 0, but if I care about "is it this specific string" I tend to write s == "".

It's reasonable to assume that a mature compiler will compile len(s) == 0 and s == "" into the same, efficient code.

...

Make the code clear.

As pointed out in Timmmm's answer, the Go compiler does generate identical code in both cases.
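
A minimal sketch showing that the two styles always give the same answer, so the choice really is about readability. The helper names isEmptyByLen and isEmptyByCompare are made up for this example.

```go
package main

import "fmt"

// isEmptyByLen uses the length-based style.
func isEmptyByLen(s string) bool { return len(s) == 0 }

// isEmptyByCompare uses the comparison style.
func isEmptyByCompare(s string) bool { return s == "" }

func main() {
	// A string holding a single space is not empty under either check.
	for _, s := range []string{"", " ", "asd"} {
		fmt.Printf("%q: len check = %v, compare check = %v\n",
			s, isEmptyByLen(s), isEmptyByCompare(s))
	}
}
```

To confirm what your own compiler version emits for each form, you can inspect the generated assembly, for example with go build -gcflags=-S.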

Why is string.IsNullOrEmpty faster than comparison?

The MS analyzer recommends using string.IsNullOrEmpty instead of comparing the string with either null or the empty string, for performance reasons:

Warning 470 CA1820 : Microsoft.Performance : Replace the call to 'string.operator ==(string, string)' in ... with a call to 'String.IsNullOrEmpty'.

Just read the fine manual:

A string is compared to the empty string by using Object.Equals.

...

Comparing strings using the String.Length property or the String.IsNullOrEmpty method is significantly faster than using Equals. This is because Equals executes significantly more MSIL instructions than either IsNullOrEmpty or the number of instructions executed to retrieve the Length property value and compare it to zero.

...

To fix a violation of this rule, change the comparison to use the Length property and test for the null string. If targeting .NET Framework 2.0, use the IsNullOrEmpty method.

Your problem is not so much the null check, but instead testing for equality (via Equals) with an empty string instance rather than checking its Length.

Again, from the fine manual:

public void EqualsTest()
{
    // Violates rule: TestForEmptyStringsUsingStringLength.
    if (s1 == "")
    {
        Console.WriteLine("s1 equals empty string.");
    }
}

// Use for .NET Framework 1.0 and 1.1.
public void LengthTest()
{
    // Satisfies rule: TestForEmptyStringsUsingStringLength.
    if (s1 != null && s1.Length == 0)
    {
        Console.WriteLine("s1.Length == 0.");
    }
}

