Why Are Interface Method Invocations Slower Than Concrete Invocations

Why are interface method invocations slower than concrete invocations?

There are many performance myths, and some were probably true several years ago, and some might still be true on VMs that don't have a JIT.

The Android documentation (remember that Android doesn't have a JVM; it has the Dalvik VM) used to say that invoking a method on an interface was slower than invoking it on a class, so it contributed to spreading the myth (it is also possible that it really was slower on the Dalvik VM before the JIT was turned on). The documentation now says:

Performance Myths

Previous versions of this document made various misleading claims. We
address some of them here.

On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it was cheaper to invoke methods on a
HashMap map than a Map map, even though in both cases the map was a
HashMap.) It was not the case that this was 2x slower; the actual
difference was more like 6% slower. Furthermore, the JIT makes the two
effectively indistinguishable.

Source: Designing for performance on Android

The same is probably true for the JIT in the JVM; it would be very odd otherwise.
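For illustration (a minimal, made-up example, not from the Android docs), the two call shapes look like this in Java: the call through the exact type compiles to `invokevirtual`, the call through the interface type to `invokeinterface`, and a JIT will typically devirtualize both once it knows the receiver's actual class:

```java
import java.util.HashMap;
import java.util.Map;

public class DispatchDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> exact = new HashMap<>();
        Map<String, Integer> viaInterface = exact;

        // Compiles to invokevirtual: the receiver's exact class is known statically.
        exact.put("a", 1);

        // Compiles to invokeinterface: only the interface type is known statically,
        // but the JIT can still devirtualize this once it sees the receiver is a HashMap.
        viaInterface.put("b", 2);

        System.out.println(exact.get("a") + " " + viaInterface.get("b")); // prints "1 2"
    }
}
```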

Does having a dependency on interface rather than concrete class reduce compile time in Java if changes are made to concrete class?

I have never seen compile time as a design consideration in Java (I have worked with Java since version 1). Specific to your question, the language spec provides no guarantees in this regard; it may change from version to version and from implementation to implementation.

Outside of javac, a build system may do the sensible thing; see for example: https://blog.gradle.org/incremental-compiler-avoidance

Edit: I just double-checked, and javac recompiles a source file even if it did not change:

$ java -version
openjdk version "10.0.2" 2018-07-17

$ javac *.java && ls -l A.*
-rw-rw-r-- 1 usr usr 143 Oct 14 20:07 A.class
-rw-rw-r-- 1 usr usr 59 Oct 14 19:57 A.java

After waiting a minute...

$ javac *.java && ls -l A.*
-rw-rw-r-- 1 usr usr 143 Oct 14 20:08 A.class
-rw-rw-r-- 1 usr usr 59 Oct 14 19:57 A.java

Which one is fast, Abstract class or Interface?

The answer depends on the programming language and possibly on the compiler you use. In environments like the Java VM, where run-time optimizations are used, it probably cannot be answered at all. And honestly, in a typical Java project, no one cares, because even if there were a difference, it would be too tiny to slow down your software noticeably. Unless you have strict real-time constraints, in which case you wouldn't use Java (and possibly no polymorphism at all).

Basically, both interface methods and abstract methods use dynamic dispatch, so the difference is minimal, if there is any at all. Without deep knowledge of the implementation details, I would assume that, in theory, abstract methods dispatch faster as long as the language doesn't implement multiple inheritance for classes: the position of the method pointer in the dispatch vector is static, while it is not for interface methods (a class can typically implement multiple interfaces).
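A hypothetical Java sketch of that point (all type names invented for illustration): two classes can mix the same interface with different supertypes, so the VM cannot assign `Flyer.fly()` a single fixed table slot across all implementors, whereas a class method sits at a fixed offset in its single-inheritance hierarchy:

```java
interface Flyer { String fly(); }
interface Swimmer { String swim(); }

// Each class mixes interfaces differently, so there is no single fixed slot
// for Flyer.fly() across all implementors; a method inherited from a class,
// by contrast, sits at a fixed offset in the single-inheritance vtable.
class Duck implements Flyer, Swimmer {
    public String fly() { return "duck flies"; }
    public String swim() { return "duck swims"; }
}

class Plane implements Flyer {
    public String fly() { return "plane flies"; }
}

public class SlotDemo {
    public static void main(String[] args) {
        Flyer[] flyers = { new Duck(), new Plane() };
        for (Flyer f : flyers) {
            System.out.println(f.fly()); // interface dispatch: receiver class varies
        }
    }
}
```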

But as I said, I don't know the details of what happens in the compiler. There may be other factors I haven't thought about. If I had to answer this question in an interview, I'd quote Donald Knuth's "premature optimization is the root of all evil".

in which scenario abstract class and interface should not be used?

Perhaps when you explicitly want just one implementation, so that you don't maintain two versions of the same thing. In that situation, one might also make the class final.
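For example (a made-up class, just to illustrate the idea): a type intended to have exactly one implementation declares no interface, extends no abstract parent, and is marked `final` so no subclass can ever appear:

```java
// A final class: the single, closed implementation of its own behavior.
public final class Version {
    private final int major;
    private final int minor;

    public Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    public String label() {
        return major + "." + minor;
    }
}

// class CustomVersion extends Version {} // would not compile: cannot inherit from final Version
```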

Are there any differences between a normal interface class and an abstract class that only has abstract methods?

Yes, they are different.

With an interface, clients can implement it as well as extend a class:

class ClientType implements YourInterface, SomeOtherInterface { //can still extend other types

}

With a class, clients can extend it, but cannot extend any other type:

class ClientType extends YourClass { //can no longer extend other types

}

Another difference arises when the interface or abstract class has only a single abstract method declaration, and it has to do with anonymous functions (lambdas).

As @AlexanderPetrov said, an interface with one method can be used as a functional interface, allowing us to create functions "on the fly" wherever a functional interface type is expected:

//the interface
interface Runnable {
    void run();
}

//where it's specified
void execute(Runnable runnable) {
    runnable.run();
}

//specifying the argument using a lambda
execute(() -> /* code here */);

This cannot be done with an abstract class.
So you cannot use them interchangeably. The difference comes down to the limitations on how a client can use them, which are enforced by the semantics of the JVM.
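To see the restriction concretely (with a made-up `Task` class for illustration): the same single-method shape declared as an abstract class cannot be the target of a lambda, so the client is forced into an anonymous (or named) subclass:

```java
abstract class Task {
    abstract String run();
}

public class TaskDemo {
    public static void main(String[] args) {
        // Task t = () -> "ran"; // does not compile: Task is a class, not a functional interface

        // An abstract class forces an anonymous (or named) subclass instead:
        Task t = new Task() {
            String run() { return "ran"; }
        };
        System.out.println(t.run()); // prints "ran"
    }
}
```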

As for differences in resource usage, that's not something to worry about unless it's causing your software problems. The point of using a memory-managed language is not to worry about such things unless you actually have problems. Don't pre-optimize; I'm sure the difference is negligible. And even if there is a difference, it only matters if it can cause a problem for your software.

If your software is having resource problems, profile your application. If it does cause memory issues, you will be able to see it, as well as how much resources each one consumes. Until then, you shouldn't worry about it. You should prefer the feature that makes your code easier to manage, as opposed to which consumes the least amount of resources.

Using reflection based static invocation instead of interfaces

Because the Reflection API is much slower than a direct method call: when you use reflection, the compiler can do no optimization whatsoever, since it has no real idea what you are doing.
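A small sketch of the difference: the direct call below is statically typed and trivially inlinable by the JIT, while the reflective call goes through a runtime `Method` lookup and `Method.invoke`, with boxing and access checks the optimizer has to work around:

```java
import java.lang.reflect.Method;

public class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        String s = "hello";

        // Direct call: statically typed, trivially inlinable by the JIT.
        int direct = s.length();

        // Reflective call: resolved at runtime, the result comes back boxed as Integer,
        // and the JIT has far less information to optimize with.
        Method length = String.class.getMethod("length");
        int reflective = (Integer) length.invoke(s);

        System.out.println(direct + " " + reflective); // prints "5 5"
    }
}
```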

By having an interface and multiple concrete classes that implement it, we get the advantage of polymorphic behavior. Add the Factory pattern to the equation, and you can work directly with your interfaces without needing to know which concrete class's method you are invoking. The added benefit is that the compiler can optimize your code, making it run much faster than the equivalent Reflection API calls.
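A minimal sketch of that combination, with invented `Shape` types: the caller works only with the interface, the factory picks the concrete class, and every call site stays statically typed and JIT-friendly:

```java
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class ShapeFactory {
    // Callers never name Circle or Square; they only ever see Shape.
    static Shape create(String kind) {
        switch (kind) {
            case "circle": return new Circle(1.0);
            case "square": return new Square(2.0);
            default: throw new IllegalArgumentException("unknown shape: " + kind);
        }
    }
}

public class FactoryDemo {
    public static void main(String[] args) {
        Shape s = ShapeFactory.create("square");
        System.out.println(s.area()); // prints "4.0"
    }
}
```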

As of Java 8, we can have static and instance method implementations in interfaces. Here's a demo:

public class InterfaceDemo
{
    public static void main(String... args)
    {
        XYZ.executeStatic("Hello");

        XYZ object = new Implementer();
        object.executeInstance("Message");
    }
}

interface XYZ
{
    static void executeStatic(String message)
    {
        System.out.println("Static: " + message);
    }

    default void executeInstance(String message)
    {
        System.out.println("Instance: " + message);
    }
}

class Implementer implements XYZ {}

Then there are functional interfaces like:

@FunctionalInterface
public interface Returnable<T>
{
    public T value();
}

Which allows the creation of different implementations using lambda expressions like:

Returnable<Integer> integerReturnable = () -> 42;
Returnable<String> stringReturnable = () -> "Hello";

And use them like:

System.out.println(integerReturnable.value()); // Prints 42
System.out.println(stringReturnable.value()); // Prints Hello

By the way, if there are lots of objects that need to be created, we use Object Pool pattern or something similar, and if there are lots of similar objects, we use Flyweight Pattern to minimise memory usage while still keeping speed.

Why is returning a Java object reference so much slower than returning a primitive

TL;DR: You should not put BLIND trust into anything.

First things first: it is important to verify the experimental data before jumping to the conclusions from them. Just claiming something is 3x faster/slower is odd, because you really need to follow up on the reason for the performance difference, not just trust the numbers. This is especially important for nano-benchmarks like you have.

Second, the experimenters should clearly understand what they control and what they don't. In your particular example, you are returning the value from @Benchmark methods, but can you be reasonably sure the callers outside will do the same thing for the primitive and the reference? If you ask yourself this question, you'll realize you are basically measuring the test infrastructure.

Down to the point. On my machine (i5-4210U, Linux x86_64, JDK 8u40), the test yields:

Benchmark                      (value)   Mode  Samples  Score   Error   Units
...benchmarkReturnOrdinal            3  thrpt        5  0.876 ± 0.023  ops/ns
...benchmarkReturnOrdinal            2  thrpt        5  0.876 ± 0.009  ops/ns
...benchmarkReturnOrdinal            1  thrpt        5  0.832 ± 0.048  ops/ns
...benchmarkReturnReference          3  thrpt        5  0.292 ± 0.006  ops/ns
...benchmarkReturnReference          2  thrpt        5  0.286 ± 0.024  ops/ns
...benchmarkReturnReference          1  thrpt        5  0.293 ± 0.008  ops/ns

Okay, so the reference tests appear 3x slower. But wait, this uses an old JMH (1.1.1); let's update to the current latest (1.7.1):

Benchmark                      (value)   Mode  Cnt  Score   Error   Units
...benchmarkReturnOrdinal            3  thrpt    5  0.326 ± 0.010  ops/ns
...benchmarkReturnOrdinal            2  thrpt    5  0.329 ± 0.004  ops/ns
...benchmarkReturnOrdinal            1  thrpt    5  0.329 ± 0.004  ops/ns
...benchmarkReturnReference          3  thrpt    5  0.288 ± 0.005  ops/ns
...benchmarkReturnReference          2  thrpt    5  0.288 ± 0.005  ops/ns
...benchmarkReturnReference          1  thrpt    5  0.288 ± 0.002  ops/ns

Oops, now they are only barely slower. BTW, this also tells us the test is infrastructure-bound. Okay, can we see what really happens?

If you build the benchmarks, and look around what exactly calls your @Benchmark methods, then you'll see something like:

public void benchmarkReturnOrdinal_thrpt_jmhStub(InfraControl control, RawResults result, ReturnEnumObjectVersusPrimitiveBenchmark_jmh l_returnenumobjectversusprimitivebenchmark0_0, Blackhole_jmh l_blackhole1_1) throws Throwable {
    long operations = 0;
    long realTime = 0;
    result.startTime = System.nanoTime();
    do {
        l_blackhole1_1.consume(l_returnenumobjectversusprimitivebenchmark0_0.benchmarkReturnOrdinal());
        operations++;
    } while (!control.isDone);
    result.stopTime = System.nanoTime();
    result.realTime = realTime;
    result.measuredOps = operations;
}

That l_blackhole1_1 has a consume method, which "consumes" the values (see Blackhole for rationale). Blackhole.consume has overloads for references and primitives, and that alone is enough to justify the performance difference.

There is a rationale why these methods look different: they try to be as fast as possible for their argument types. They do not necessarily exhibit the same performance characteristics, even though we try to match them, hence the more symmetric result with the newer JMH. Now, you can even use -prof perfasm to see the generated code for your tests and see why the performance differs, but that's beside the point here.

If you really want to understand how returning the primitive and/or reference differs performance-wise, you would need to enter a big scary grey zone of nuanced performance benchmarking. E.g. something like this test:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(5)
public class PrimVsRef {

    @Benchmark
    public void prim() {
        doPrim();
    }

    @Benchmark
    public void ref() {
        doRef();
    }

    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    private int doPrim() {
        return 42;
    }

    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    private Object doRef() {
        return this;
    }

}

...which yields the same result for primitives and references:

Benchmark        Mode  Cnt  Score   Error  Units
PrimVsRef.prim   avgt   25  2.637 ± 0.017  ns/op
PrimVsRef.ref    avgt   25  2.634 ± 0.005  ns/op

As I said above, these tests require following up on the reasons for the results. In this case, the generated code for both is almost the same, and that explains the result.

prim:

                  [Verified Entry Point]
 12.69%    1.81%  0x00007f5724aec100: mov    %eax,-0x14000(%rsp)
  0.90%    0.74%  0x00007f5724aec107: push   %rbp
  0.01%    0.01%  0x00007f5724aec108: sub    $0x30,%rsp
 12.23%   16.00%  0x00007f5724aec10c: mov    $0x2a,%eax            ; load "42"
  0.95%    0.97%  0x00007f5724aec111: add    $0x30,%rsp
           0.02%  0x00007f5724aec115: pop    %rbp
 37.94%   54.70%  0x00007f5724aec116: test   %eax,0x10d1aee4(%rip)
  0.04%    0.02%  0x00007f5724aec11c: retq

ref:

                  [Verified Entry Point]
 13.52%    1.45%  0x00007f1887e66700: mov    %eax,-0x14000(%rsp)
  0.60%    0.37%  0x00007f1887e66707: push   %rbp
           0.02%  0x00007f1887e66708: sub    $0x30,%rsp
 13.63%   16.91%  0x00007f1887e6670c: mov    %rsi,%rax             ; load "this"
  0.50%    0.49%  0x00007f1887e6670f: add    $0x30,%rsp
           0.01%  0x00007f1887e66713: pop    %rbp
 39.18%   57.65%  0x00007f1887e66714: test   %eax,0xe3e78e6(%rip)
           0.02%  0x00007f1887e6671a: retq

[sarcasm] See how easy it is! [/sarcasm]

The pattern is: the simpler the question, the more you have to work out to make a plausible and reliable answer.


