Java Integer Compareto() - Why Use Comparison VS. Subtraction

Java Integer compareTo() - why use comparison vs. subtraction?

This is due to integer overflow. When thisVal is very large and anotherVal is negative then subtracting the latter from the former yields a result that is bigger than thisVal which may overflow to the negative range.

JavaScript numeric compare function - comparison vs subtraction

Does the following implementation of compareNumeric() work as expected in JavaScript?

Yes. As you've already figured out, there are no integer overflows or underflows in javascript (because all the number values are floats). If your values are getting too large or too small, you'll end up with Infinity or -Infinity, which are positive and negative respectively.

Why are Byte.compare() and Integer.compare() implemented differently?

The implementation of Integer.compare does not use subtraction, as this could cause an overflow in case you're comparing an integer that is close to Integer.MIN_VALUE with another that is close to Integer.MAX_VALUE.

This overflow cannot happen in case of Byte.compare, as there the byte values are implicitely converted to integers before x-y is calculated.

(see also: Java Language Specification - 5.6.2 Binary Numeric Promotion)

how can compareTo() method avoid overflowing?

As said in this post, the problem with using subtraction is the following.

This is due to integer overflow. When thisVal is very large and anotherVal is negative then subtracting the latter from the former yields a result that is bigger than thisVal which may overflow to the negative range.

https://stackoverflow.com/a/2728810/9520059

Indeed, the compiler will probably use a subtraction at very low level when evaluating the '<', however, it will check for overflows.

Why does compareTo return an integer

[This answer is for C#, but it probably also apples to Java to some extent.]

This is for historical, performance and readability reasons. It potentially increases performance in two places:

  1. Where the comparison is implemented. Often you can just return "(lhs - rhs)" (if the values are numeric types). But this can be dangerous: See below!
  2. The calling code can use <= and >= to naturally represent the corresponding comparison. This will use a single IL (and hence processor) instruction compared to using the enum (although there is a way to avoid the overhead of the enum, as described below).

For example, we can check if a lhs value is less than or equal to a rhs value as follows:

if (lhs.CompareTo(rhs) <= 0)
...

Using an enum, that would look like this:

if (lhs.CompareTo(rhs) == CompareResult.LessThan ||
lhs.CompareTo(rhs) == CompareResult.Equals)
...

That is clearly less readable and is also inefficient since it is doing the comparison twice. You might fix the inefficiency by using a temporary result:

var compareResult = lhs.CompareTo(rhs);

if (compareResult == CompareResult.LessThan || compareResult == CompareResult.Equals)
...

It's still a lot less readable IMO - and it's still less efficient since it's doing two comparison operations instead of one (although I freely admit that it is likely that such a performance difference will rarely matter).

As raznagul points out below, you can actually do it with just one comparison:

if (lhs.CompareTo(rhs) != CompareResult.GreaterThan)
...

So you can make it fairly efficient - but of course, readability still suffers. ... != GreaterThan is not as clear as ... <=

(And if you use the enum, you can't avoid the overhead of turning the result of a comparison into an enum value, of course.)

So this is primarily done for reasons of readability, but also to some extent for reasons of efficiency.

Finally, as others have mentioned, this is also done for historical reasons. Functions like C's strcmp() and memcmp() have always returned ints.

Assembler compare instructions also tend to be used in a similar way.

For example, to compare two integers in x86 assembler, you can do something like this:

CMP AX, BX ; 
JLE lessThanOrEqual ; jump to lessThanOrEqual if AX <= BX

or

CMP AX, BX
JG greaterThan ; jump to greaterThan if AX > BX

or

CMP AX, BX
JE equal ; jump to equal if AX == BX

You can see the obvious comparisons with the return value from CompareTo().

Addendum:

Here's an example which shows that it's not always safe to use the trick of subtracting the rhs from the lhs to get the comparison result:

int lhs = int.MaxValue - 10;
int rhs = int.MinValue + 10;

// Since lhs > rhs, we expect (lhs-rhs) to be +ve, but:

Console.WriteLine(lhs - rhs); // Prints -21: WRONG!

Obviously this is because the arithmetic has overflowed. If you had checked turned on for the build, the code above would in fact throw an exception.

For this reason, the optimization of suusing subtraction to implement comparison is best avoided. (See comments from Eric Lippert below.)

Is there the matter with the compareTo return value?

... why do almost the all method implementations return only -1, 0, 1?

I can't speak for other programmers, but I typically do this because it is simpler and more convenient to do that in most case. And most important, it is correct.

I imagine that you are thinking along the lines of doing this:

  public int compareTo (MyClass other) {
return this.intField - other.intField;
}

Beware. This code is subtly wrong. See this Q&A: Java Integer compareTo() - why use comparison vs. subtraction?

Negative and positive return values of compare and compareTo

Is there any reason for using Math.signum

Yes there is.

order = (int) Math.signum(book1.getPrice() - book2.getPrice());

Suppose you have replace the above line with this

order = (int)(book1.getPrice() - book2.getPrice());

Now let us assume

book1.getPrice() returns 10.50 
book2.getPrice() returns 10.40

If you do not use signum you will never have any compile time or run time error but value of order will be 0. This implies that book1 is equals to book2 which is logically false.

But if you use signum value of order will be 1 which implies book1 > book2.

But it must be mentioned that you should never make any assumption about compare function returning value between 1 and -1.
You can read official document for comparator http://docs.oracle.com/javase/7/docs/api/java/util/Comparator.html.

Does the specific signed integer matter when implementing compareTo in a Comparable Type class?

Interesting question, but nonetheless no, the magnitude of the int has no significance as per Comparable<T> and Comparator<T> specifications, only the sign. Conceivably some sorting algorithm can additionally specify that they can take "hints" from the magnitude, but I'm not sure how practical that would be for comparison-based sorting, since we really need only to know if a < b, a == b, or a > b (which is really what Comparable and Comparator are OOP abstractions of).


Now it needs to be said that there may be a hidden intention here of using the subtraction idiom for comparing numeric values, i.e. something like this:

public int compare(T t1, T t2) {
return t1.intField - t2.intField;
}

Do note that this comparison method is potentially broken, due to possible overflow when the difference between the two numbers is greater than Integer.MAX_VALUE. In fact, this is one of the puzzles covered in Java Puzzlers.

To demonstrate, consider the following snippet (taken from the book):

int x = -2000000000;
int z = 2000000000;
System.out.println(x - z); // prints a positive number due to overflow

Clearly x < z, and yet x - z is a positive number. Beware of using this subtraction idiom: it's always much safer to do an explicit comparison and return -1, 0, or 1 instead.



Related Topics



Leave a reply



Submit