Why Does String to Number Comparison Work in JavaScript

Why does string to number comparison work in JavaScript?

Because JavaScript defines >= and <= (and several other operators) in a way that allows them to coerce their operands to different types. It's just part of the definition of the operator.

In the case of <, >, <=, and >=, the gory details are laid out in §11.8.5 of the ES5 specification. The short version is: If both operands are strings (after having been coerced from objects, if necessary), it does a string comparison. Otherwise, it coerces the operands to numbers and does a numeric comparison.

Consequently, you get fun results, like that "90" > "100" (both are strings, it's a string comparison) but "90" < 100 (one of them is a number, it's a numeric comparison). :-)
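A quick sketch of the rule above (the variable names are just for illustration):

```javascript
// Both operands are strings: compared code unit by code unit, and "9" sorts after "1".
const bothStrings = "90" > "100";

// One operand is a number: the string is converted first, so this is 90 < 100.
const mixed = "90" < 100;

console.log(bothStrings, mixed); // true true
```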

Is it okay to have a comparison like this, or should I use parseInt() to convert x to an integer?

That's a matter of opinion. Some people think it's totally fine to rely on the implicit coercion; others think it isn't. There are some objective arguments. For instance, suppose you relied on implicit conversion and it was fine because you had those numeric constants, but later you were comparing x to another value you got from an input field. Now you're comparing strings, but the code looks the same. But again, it's a matter of opinion and you should make your own choice.
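To make that objective argument concrete, here is a small sketch. The helper isBelowLimit and the sample values are hypothetical; the point is only that the same-looking comparison behaves differently once both operands are strings:

```javascript
// Hypothetical helper: the comparison looks identical whatever the operand types are.
function isBelowLimit(x, limit) {
  return x < limit;
}

const x = "90"; // e.g. a value read from a text field

console.log(isBelowLimit(x, 100));   // true:  numeric comparison, 90 < 100
console.log(isBelowLimit(x, "100")); // false: string comparison, "9" sorts after "1"

// Converting explicitly makes the result independent of the operand types.
console.log(isBelowLimit(Number(x), Number("100"))); // true
```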

If you do decide to explicitly convert to numbers first, parseInt may or may not be what you want, and it doesn't do the same thing as the implicit conversion. Here's a rundown of options:

  • parseInt(str[, radix]) - Converts as much of the beginning of the string as it can into a whole (integer) number, ignoring extra characters at the end. So parseInt("10x") is 10; the x is ignored. Supports an optional radix (number base) argument, so parseInt("15", 16) is 21 (15 in hex). If there's no radix, assumes decimal unless the string starts with 0x (or 0X), in which case it skips those and assumes hex. Does not look for the new 0b (binary) or 0o (new style octal) prefixes; both of those parse as 0. (Some browsers used to treat strings starting with 0 as octal; that behavior was never specified, and was specifically disallowed in the ES5 specification.) Returns NaN if no parseable digits are found.

  • Number.parseInt(str[, radix]) - Exactly the same function as parseInt above. (Literally, Number.parseInt === parseInt is true.)

  • parseFloat(str) - Like parseInt, but does floating-point numbers and only supports decimal. Again extra characters on the string are ignored, so parseFloat("10.5x") is 10.5 (the x is ignored). As only decimal is supported, parseFloat("0x15") is 0 (because parsing ends at the x). Returns NaN if no parseable digits are found.

  • Number.parseFloat(str) - Exactly the same function as parseFloat above.

  • Unary +, e.g. +str - (I.e., implicit conversion) Converts the entire string to a number using floating point and JavaScript's standard number notation (just digits and a decimal point = decimal; 0x prefix = hex; 0b = binary [ES2015+]; 0o prefix = octal [ES2015+]; some implementations extend it to treat a leading 0 as octal, but not in strict mode). +"10x" is NaN because the x is not ignored. +"10" is 10, +"10.5" is 10.5, +"0x15" is 21, +"0o10" is 8 [ES2015+], +"0b101" is 5 [ES2015+]. Has a gotcha: +"" is 0, not NaN as you might expect.

  • Number(str) - Exactly like implicit conversion (e.g., like the unary + above), but slower on some implementations. (Not that it's likely to matter.)

  • Bitwise OR with zero, e.g. str|0 - Implicit conversion, like +str, but then it also converts the number to a 32-bit integer (and converts NaN to 0 if the string cannot be converted to a valid number).

So if it's okay that extra bits on the string are ignored, parseInt or parseFloat are fine. parseInt is quite handy for specifying radix. Unary + is useful for ensuring the entire string is considered. Take your choice. :-)
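To see the options side by side, something like the following can help. convertAll is a hypothetical helper written for this comparison, not part of any library, and it passes an explicit radix of 10 to parseInt:

```javascript
// Hypothetical helper that applies each conversion discussed above to one string.
function convertAll(s) {
  return {
    parseInt: parseInt(s, 10), // explicit radix 10, so a 0x prefix is NOT treated as hex
    parseFloat: parseFloat(s),
    unaryPlus: +s,
    number: Number(s),
    orZero: s | 0,
  };
}

for (const s of ["10", "10.5", "10x", "0x15", "", "abc"]) {
  console.log(JSON.stringify(s), convertAll(s));
}
```

For "10x", the two parse functions give 10 while unary + and Number give NaN and |0 gives 0. For the empty string it is the other way round: the parse functions give NaN, the rest give 0.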

For what it's worth, I tend to use this function:

const parseNumber = (str) => str ? +str : NaN;

(Or a variant that trims whitespace.) Note how it handles the issue with +"" being 0.
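The trimming variant isn't shown, so the following is only one possible version. The name parseNumberTrimmed is made up here, and it assumes whitespace-only input should count as "no number":

```javascript
// Hypothetical trimming variant: whitespace-only input yields NaN instead of 0.
const parseNumberTrimmed = (str) => {
  const trimmed = String(str).trim();
  return trimmed ? +trimmed : NaN;
};

console.log(parseNumberTrimmed(" 42 ")); // 42
console.log(parseNumberTrimmed("   "));  // NaN, whereas +"   " is 0
console.log(parseNumberTrimmed("10x"));  // NaN
```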

And finally: If you're converting to number and want to know whether the result is NaN, you might be tempted to do if (convertedValue === NaN). But that won't work, because as Rick points out below, comparisons involving NaN are always false. Instead, it's if (isNaN(convertedValue)).
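A short sketch of the difference. Number.isNaN (ES2015+) is not mentioned above; it is included here because, unlike the global isNaN, it does not convert its argument first:

```javascript
const convertedValue = +"10x"; // NaN: the trailing x is not ignored

console.log(convertedValue === NaN);       // false, always
console.log(isNaN(convertedValue));        // true
console.log(Number.isNaN(convertedValue)); // true

// The global isNaN converts its argument to a number first; Number.isNaN does not.
console.log(isNaN("abc"));        // true:  "abc" converts to NaN
console.log(Number.isNaN("abc")); // false: a string is not the NaN value
```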

How does string comparison work in JavaScript?

String comparison follows the Abstract Relational Comparison Algorithm in ECMAScript 5. The relevant part is quoted below.

4. Else, both px and py are Strings
a) If py is a prefix of px, return false. (A String value p is a prefix
of String value q if q can be the result of concatenating p and some
other String r. Note that any String is a prefix of itself, because
r may be the empty String.)
b) If px is a prefix of py, return true.
c) Let k be the smallest nonnegative integer such that the character
at position k within px is different from the character at position
k within py. (There must be such a k, for neither String is a prefix
of the other.)
d) Let m be the integer that is the code unit value for the character
at position k within px.
e) Let n be the integer that is the code unit value for the character
at position k within py.
f) If m < n, return true. Otherwise, return false.
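The quoted step can be sketched in plain JavaScript. This is only an illustration of step 4 (stringLessThan is a made-up name), not how any engine actually implements it:

```javascript
// Sketch of step 4: returns true when px < py, comparing UTF-16 code units.
function stringLessThan(px, py) {
  if (px.startsWith(py)) return false; // 4a: py is a prefix of px (covers px === py)
  if (py.startsWith(px)) return true;  // 4b: px is a prefix of py
  let k = 0;                           // 4c: find the first differing position
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;
  const m = px.charCodeAt(k);          // 4d: code unit in px
  const n = py.charCodeAt(k);          // 4e: code unit in py
  return m < n;                        // 4f
}

console.log(stringLessThan("90", "100"));       // false, same as "90" < "100"
console.log(stringLessThan("apple", "apples")); // true, same as "apple" < "apples"
```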

Can anyone explain how string and number comparison works in JavaScript?

Comparing a string to a number forces the string to be converted to a number. If the string cannot be converted to a numeric value, the conversion yields NaN, and NaN is what takes part in the comparison.

NaN is neither comparable nor equal to anything at all, not even to another NaN:

NaN == NaN // false

A 'greater than' or 'smaller than' comparison with NaN therefore has to return false both ways. That is the only correct answer: nothing can be greater or smaller than a value you don't have, so both claims are false. For example, 0 > NaN and 0 < NaN are both false.
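A few concrete cases, collected in an object purely so they print together (the property names are arbitrary):

```javascript
const comparisons = {
  numericString: "42" > 7,    // true:  "42" converts to 42
  greaterThan: "abc" > 7,     // false: "abc" converts to NaN
  lessThan: "abc" < 7,        // false
  looseEqual: "abc" == 7,     // false
  nanEqualsNan: NaN == NaN,   // false
  nanNotEqualNan: NaN != NaN, // true: inequality is the one test that holds
};

console.log(comparisons);
```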

But keep in mind that comparing two strings of data such as:
"98A" > "999" will return a comparative false,
whereas:
"9A" > "999" will return true

This is very useful to know: because two strings are compared lexicographically, code unit by code unit, you can compare time data directly without first converting the values to numbers, provided the fields are zero-padded and in a fixed layout:

"09:32:28" > "09:31:59" // true

And luckily "PM" > "AM" is also true, by pure (linguistic) chance.
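One caveat worth sketching: the trick depends on every field having the same width. padTime below is a hypothetical helper that zero-pads each field of an H:M:S string before comparing:

```javascript
// Fixed-width, zero-padded times order correctly as plain strings.
const later = "09:32:28" > "09:31:59";

// Without padding, "9" is compared with "1", so 9:05 wrongly looks later than 10:00.
const unpadded = "9:05:00" < "10:00:00";

// Hypothetical helper: pad every field to two digits.
const padTime = (t) => t.split(":").map((part) => part.padStart(2, "0")).join(":");
const padded = padTime("9:05:00") < padTime("10:00:00");

console.log(later, unpadded, padded); // true false true
```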

Why is comparing strings O(n), but comparing numbers O(1)?

Numbers in computers are usually handled in fixed-size units. An int might be 32 or 64 bits in any given language and/or compiler/platform combination, but it will never be variable-length.

Therefore you have a fixed number of bits to compare when comparing numbers. It's very easy to build a hardware circuit that compares that many bits at once (i.e. as "one action").

Strings, on the other hand, have inherently variable lengths, so just saying "string" doesn't tell you how many bits you'll have to compare.

There are exceptions, however: variable-length numbers, usually called something like BigInteger or BigDecimal, behave much like strings here. Comparing two BigDecimal values for equality may take O(n), where n is the length of the BigDecimals, not either of their numeric values.
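To make the O(n) cost visible, here is a naive equality check that counts how many code units it inspects. countedEquals is a made-up illustration; real engines compare in native code and may take shortcuts:

```javascript
// Sketch: naive string equality that reports how many code units it had to inspect.
function countedEquals(a, b) {
  let inspected = 0;
  if (a.length !== b.length) return { equal: false, inspected };
  for (let i = 0; i < a.length; i++) {
    inspected++;
    if (a.charCodeAt(i) !== b.charCodeAt(i)) return { equal: false, inspected };
  }
  return { equal: true, inspected };
}

console.log(countedEquals("a".repeat(1000), "a".repeat(1000))); // { equal: true, inspected: 1000 }
console.log(countedEquals("xbc", "abc"));                       // { equal: false, inspected: 1 }
```

Equal strings are the worst case: every code unit has to be looked at, so the work grows with the length.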

Is JavaScript string comparison just as fast as number comparison?

String comparison could be "just as fast" (depending on implementation and values) - or it could be "much slower".

The ECMAScript specification describes the semantics, not the implementation. The only way to Know for Certain is to create an applicable performance benchmark and run it on a particular implementation.

Trivially, and I expect this is the case [1], the effects of string interning for a particular implementation are being observed.

That is, all string values (not String Objects) from literals can be trivially interned into a pool such that implIdentityEq("foo", "foo") is true - that is, there need be only one string object. Such interning can be done after constant folding, such that "f" + "oo" -> "foo" - again, per a particular implementation as long as it upholds the ECMAScript semantics.

If such interning is done, then for implStringEq the first check could be to evaluate implIdentityEq(x,y) and, if true, the comparison is trivially-true and performed in O(1). If false, then a normal string character-wise comparison would need to be done which is O(min(n,m)).

(Immediate falseness can also be determined with x.length != y.length, but that seems less relevant here.)
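Since JavaScript code cannot observe the identity of string primitives, the idea can only be modelled. The toy sketch below stands in for engine-level string objects with frozen wrapper objects; pool and intern are hypothetical, and implIdentityEq / implStringEq simply mirror the names used above:

```javascript
// Toy model: a frozen { text } wrapper plays the role of an engine-level string object.
const pool = new Map();

// Return the single shared wrapper for a given text, creating it on first use.
function intern(text) {
  if (!pool.has(text)) pool.set(text, Object.freeze({ text }));
  return pool.get(text);
}

function implIdentityEq(x, y) {
  return x === y; // same wrapper object
}

function implStringEq(x, y) {
  if (implIdentityEq(x, y)) return true;             // O(1) fast path
  if (x.text.length !== y.text.length) return false; // fast fail on length
  for (let i = 0; i < x.text.length; i++) {          // O(min(n, m)) fallback
    if (x.text.charCodeAt(i) !== y.text.charCodeAt(i)) return false;
  }
  return true;
}

const a = intern("foo");
const b = intern("f" + "oo");               // same pooled wrapper as a
const c = Object.freeze({ text: "foo" });   // same text, but not interned

console.log(implIdentityEq(a, b), implIdentityEq(a, c), implStringEq(a, c)); // true false true
```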


[1] While in the above I argue for string interning being a likely cause, modern JavaScript implementations perform a lot of optimizations - as such, interning is only a small part of the various optimizations and code hoistings that can be (and are) done!

I've created an "intern breaker" jsperf. The numbers agree with the hypothesis presented above.

  1. If a string is interned then comparison performs approximately as well as testing for "identity" - while it is slower than a numeric comparison, this is still much faster than a character-by-character string comparison.

  2. Holding the above assertion, IE10 does not appear to consider object-identity for pass-fast string comparisons although it does use a fast-fail length check.

  3. In Chrome and Firefox, two intern'ed strings which are not equal are also compared as quickly as two that are - there is likely a special case for comparing between two different interned strings.

  4. Even for small strings (length = 8), interning can be much faster. IE10 again shows it doesn't have this "optimization" even though it appears to have an efficient string comparison implementation.

  5. The string comparison can fail as soon as the first different character is encountered: even comparing long strings of equal length might only compare the first few characters.


  • Do common JavaScript implementations use string interning? (but no references given)

    Yes. In general any literal string, identifier, or other constant string in JS source is interned. However, implementation details (exactly what is interned, for instance) vary, as does when the interning occurs.

  • See JS_InternString (FF does have string interning, although where/how the strings are implicitly interned from JavaScript, I know not)

How does JavaScript's string comparison work at the lower level?

There are two considerations:

  • === comparison of string primitives is implemented in lower-level, compiled code, which indeed executes faster than explicit JavaScript looping. A JavaScript loop has additional work to do, including updating a variable (i) and performing an access with that variable (a[i]), and all the prescribed ECMAScript procedures must be followed.

  • The JavaScript engine may optimise memory usage and use the knowledge that two strings are the same (for instance when that is already detected at parsing time, or one string is assigned to a second variable/property) and only store that string once (cf. String pool). In that case the comparison is a trivial O(1) comparison of two references. There is however no way in JavaScript to inspect whether two string primitives actually share the same memory.

As an illustration of the second point, note how the comparison-time is different for two cases of comparing long strings that are equal -- which probably is a hint that this string pooling is happening:

function compare(a, b) {
    let sum = 0, start, p;
    for (let i = 0; i < 10; i++) { // Perform 10 comparisons
        start = performance.now();
        p = a === b;
        sum += performance.now() - start;
    }
    return sum / 10; // Return average time to make the comparison
}

console.log("Hold on while strings are being created...");
setTimeout(function () {
    // Create long, non-trivial string
    let a = Array.from({length: 10000000}, (_, i) => i).join("");
    let b = a.slice(0); // Plain Copy - engine realises it is the same string & does not allocate space
    let c = a[0] + a.slice(1); // More complex - engine does not realise it is the same string
    console.log("Strings created. Test whether they are equal:", a === b && b === c);
    console.log(compare(a, b) + "ms");
    console.log(compare(a, c) + "ms");
});

JavaScript: compare numbers as strings

If you want to compare them without converting them to numbers, you can set numeric: true in the options parameter of localeCompare:

console.log("116457".localeCompare("3085", undefined, { numeric: true })); // positive: 116457 sorts after 3085
console.log("116457".localeCompare("3085")); // negative: "1" sorts before "3"
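The same option is handy as a sort comparator. A small sketch with made-up sample values, assuming a runtime with Intl support (current browsers and Node.js builds have it):

```javascript
const ids = ["116457", "3085", "20"];

// Default sort compares code units, so "116457" comes first.
const lexical = [...ids].sort();

// With numeric: true, runs of digits are compared by their numeric value.
const numeric = [...ids].sort((a, b) => a.localeCompare(b, undefined, { numeric: true }));

console.log(lexical); // [ '116457', '20', '3085' ]
console.log(numeric); // [ '20', '3085', '116457' ]
```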

Why the small number string is greater than big number string

You're comparing strings, so a lexical comparison is performed instead of a numerical comparison.

Lexically, 2 comes after 1.


