JIT Compiler vs Offline Compilers

JIT compiler vs offline compilers

Yes, there certainly are such scenarios.

  • JIT compilation can use runtime profiling to optimize for what the code is actually doing at the moment, and can recompile "hot" code as necessary. That's not theoretical; Java's HotSpot actually does this.
  • JITters can optimize for the specific CPU and memory configuration of the actual hardware where the program happens to be executing. For example, many .NET applications will run as either 32-bit or 64-bit code, depending upon where they are JITted. On 64-bit hardware they will use more registers, more memory, and a richer instruction set.
  • Virtual method calls inside of a tight loop can be replaced with static calls based on runtime knowledge of the type of the reference.
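To make the devirtualization point concrete, here is a minimal Java sketch (the `Shape`/`Circle` names are invented for illustration): when only one implementation is ever observed at a call site, HotSpot can replace the virtual dispatch with a direct, inlinable call, guarded by a cheap type check in case that assumption later breaks.

```java
// Hypothetical example of a virtual call in a tight loop.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class Devirtualize {
    // If the JIT only ever sees Circle instances flow through this
    // loop, the call site is monomorphic: the virtual dispatch can be
    // replaced with a direct call to Circle.area() and inlined, with a
    // guard that triggers deoptimization if another type shows up.
    static double totalArea(Shape[] shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            sum += s.area(); // monomorphic call site -> inlining candidate
        }
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Circle(2.0) };
        System.out.println(totalArea(shapes)); // Math.PI * (1 + 4)
    }
}
```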

I think there will be breakthroughs in the future. In particular, I think that the combination of JIT compilation and dynamic typing will be significantly improved. We are already seeing this in the JavaScript space with Chrome's V8 and TraceMonkey. I expect to see other improvements of similar magnitude in the not-too-distant future. This is important because even so-called "statically typed" languages tend to have a number of dynamic features.

Does a JIT compiler have any disadvantages compared to a traditional compiler?

Interpreter, JIT Compiler and "Offline" Compiler

Difference between a JIT compiler and an interpreter

To keep it simple, let's just say that an interpreter will run the bytecode (intermediate code/language). When the VM/interpreter decides it is better to do so, the JIT compilation mechanism will translate that same bytecode into native code targeted at the hardware in question, with a focus on the optimizations requested.

So basically a JIT might produce a faster executable but take way longer to compile?

I think what you are missing is that JIT compilation happens at runtime, not at compile time (unlike an "offline" compiler).

JIT Compilation has overhead

Compiling code is not free; it takes time too. If the VM invests time compiling a piece of code that then runs only a few times, it may not have made a good trade. So the VM has to decide what counts as a "hot spot" and JIT-compile only that.

Allow me to give examples on the Java virtual machine (JVM):

The JVM accepts switches with which you can define the threshold after which code will be JIT-compiled, e.g. -XX:CompileThreshold=10000.

To illustrate the cost of JIT compilation time, suppose you set that threshold to 20 and have a piece of code that needs to run 21 times. After it runs 20 times, the VM will invest some time into JIT-compiling it. You now have native code from the JIT compilation, but it will only run one more time (the 21st), which may not bring enough of a performance boost to make up for the cost of the JIT process.
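The scenario above can be sketched as a runnable class (the class name, loop sizes, and threshold are illustrative; HotSpot's tiered compiler uses several internal thresholds in practice, so treat the flag value as a sketch, not an exact switch point):

```java
// Run with, e.g.:
//   java -XX:CompileThreshold=20 -XX:+PrintCompilation HotLoop
// to watch when the method gets picked up by the JIT compiler.
public class HotLoop {
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) acc += i;
        return acc;
    }

    public static void main(String[] args) {
        // Call the method 21 times: once it crosses the threshold, the
        // VM invests time compiling it, but only one compiled
        // invocation remains to pay that cost back.
        long last = 0;
        for (int call = 0; call < 21; call++) {
            last = work(1_000);
        }
        System.out.println(last); // 499500 (sum of 0..999)
    }
}
```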

I hope this illustrates it.

Here is a JVM switch that shows the time spent on JIT compilation: -XX:+CITime ("Prints time spent in JIT Compiler").

Side Note: I don't think it's a "big deal", just something I wanted to point out since you brought up the question.

What does a just-in-time (JIT) compiler do?

A JIT compiler runs after the program has started and compiles the code (usually bytecode or some kind of VM instructions) on the fly (or just-in-time, as it's called) into a form that's usually faster, typically the host CPU's native instruction set. A JIT has access to dynamic runtime information whereas a standard compiler doesn't and can make better optimizations like inlining functions that are used frequently.

This is in contrast to a traditional compiler that compiles all the code to machine language before the program is first run.

To paraphrase, conventional compilers build the whole program as an EXE file BEFORE the first time you run it. For newer-style programs, an assembly is generated with pseudocode (p-code). Only AFTER you execute the program on the OS (e.g., by double-clicking its icon) will the (JIT) compiler kick in and generate machine code (m-code) that the Intel-based processor or whatever will understand.

Why are JIT-ed languages still slower and less memory efficient than native C/C++?

Some reasons for the differences:

  • JIT compilers mostly compile quickly and skip some optimizations that take longer to find.

  • VMs often enforce safety, and this slows execution. E.g. array access is always bounds-checked in .NET unless the JIT can guarantee the index is within the correct range.

  • Using SSE (great for performance where applicable) is easy from C++ and hard from current VMs.

  • In C++, performance gets more priority over other aspects than it does in VMs.

  • VMs often keep unused memory for a while before returning it to the OS, so they appear to 'use' more memory.

  • Some VMs box value types like int/ulong, adding per-object memory overhead.

  • Some VMs auto-align data structures aggressively, wasting memory (for performance gains).

  • Some VMs implement a boolean as an int (4 bytes), showing little focus on memory conservation.
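The bounds-checking point can be seen directly in Java (method and class names here are invented for illustration): every array access is checked unless the JIT can prove the index is in range, e.g. from the loop bounds, and an out-of-range access raises an exception rather than reading arbitrary memory.

```java
public class BoundsCheck {
    // Each array access below is bounds-checked by the VM. That check
    // is the safety cost mentioned above: in C/C++ an out-of-range
    // read would be undefined behavior; here it raises an exception
    // the program can handle.
    static int readOrDefault(int[] a, int i, int dflt) {
        try {
            return a[i];   // checked access
        } catch (ArrayIndexOutOfBoundsException e) {
            return dflt;   // safe fallback instead of reading garbage
        }
    }

    public static void main(String[] args) {
        int[] a = { 10, 20, 30 };
        System.out.println(readOrDefault(a, 1, -1)); // prints 20
        System.out.println(readOrDefault(a, 5, -1)); // prints -1
    }
}
```

In a loop like `for (int i = 0; i < a.length; i++)` the JIT can prove the index is always in range and hoist the check out entirely, which is why idiomatic loops often pay no bounds-checking cost at all.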

Can JIT compilation run faster than compile time template instantiation?

JIT compilers can measure the likelihood of a conditional jump being taken, and adjust the emitted code accordingly. A static compiler can do this as well, but not automatically; it requires a hint from the programmer.

Obviously this is just one factor among many, but it does indicate that it's possible for JIT to be faster under the right conditions.

What can a JIT compiler do that an AOT compiler cannot?

A JIT can optimize based on run-time information, which yields tighter invariants than were provable at compile time. Examples:

  • It can see that a memory location is not aliased (because the code path taken never aliased it) and thus keep the variable in a register;
  • it can eliminate a test for a condition which can never occur (e.g. based on the current values of parameters);
  • it has access to the complete program and can inline code where it sees fit;
  • it can perform branch prediction based on the specific use pattern at run time so that it's optimal.
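As a sketch of the "eliminate a test which can never occur" point (the class name and the `app.verbose` property are invented for illustration): if a flag is fixed at startup, the JIT can observe that a branch is never taken, compile the method without it, and deoptimize in the unlikely event that changes.

```java
public class DeadBranch {
    // Read once from configuration at startup; an AOT compiler must
    // keep both branches, but a JIT that observes VERBOSE is false
    // can compile process() with the logging path removed entirely.
    static final boolean VERBOSE = Boolean.getBoolean("app.verbose");

    static int process(int x) {
        if (VERBOSE) {
            System.out.println("processing " + x); // never-taken path
        }
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(process(21)); // prints 42
    }
}
```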

Inlining is in principle also available to the link-time optimization of modern compilers/linkers, but applying it throughout the code just in case may lead to prohibitive code bloat; at run time it can be applied just where necessary.

Branch prediction can be improved with normal compilers if the program is compiled twice, with a test run in between: in the first run the code is instrumented so that it generates profiling data, which the production compilation run then uses to optimize branch prediction. The prediction is also less than optimal if the test run was not typical (and it's not always easy to produce typical test data, or the usage patterns may shift over the lifetime of the program).

Additionally, both link-time and profile-guided optimization with static compilation require significant effort in the build process (to the degree that I have not seen them employed in production in the 10 or so places where I have worked); with a JIT they are on by default.

Why is it hard to beat AOT compiler with a JIT compiler (in terms of app. performance)?

There's a definite trade-off between JIT and AOT (ahead-of-time) compilation.

As you stated, JIT has access to run-time information that can aid in optimization. This includes data about the machine it's executing on, enabling platform-specific native optimization. However, JIT also has the overhead of translating byte-code to native instructions.

This overhead often becomes apparent in applications where a fast start-up or near real-time responses are necessary. JIT is also not as effective if the machine does not have sufficient resources for advanced optimization, or if the nature of the code is such that it cannot be "aggressively optimized."

For example, taken from the article you linked:

... what should we improve in the absence of clear performance bottlenecks? As you may have guessed, the same problem exists for profile-guided JIT compilers. Instead of a few hot spots to be aggressively optimized, there are plenty of "warm spots" that are left intact.

AOT compilers can also spend as much time optimizing as they like, whereas JIT compilation is bound by time requirements (to maintain responsiveness) and the resources of the client machine. For this reason AOT compilers can perform complex optimization that would be too costly during JIT.

Also see this SO question: JIT compiler vs offline compilers

JIT compilers for math

You might want to take a look at LLVM.

when is java faster than c++ (or when is JIT faster then precompiled)?

In practice, you're likely to find your naively written Java code outperform your naively written C++ code in these situations (all of which I've personally observed):

  • Lots of little memory allocations/deallocations. The major JVMs have extremely efficient memory subsystems, and garbage collection can be more efficient than requiring explicit freeing (plus it can shift memory addresses and such if it really wants to).

  • Efficient access through deep hierarchies of method calls. The JVM is very good at eliding anything that is not necessary, usually better in my experience than most C++ compilers (including gcc and icc). In part this is because it can do dynamic analysis at runtime (i.e. it can optimize speculatively and deoptimize only if it detects a problem).

  • Encapsulation of functionality into small short-lived objects.

In each case, if you put the effort in, C++ can do better (between free lists and block-allocated/deallocated memory, C++ can beat the JVM memory system in almost every specific case; with extra code, templates, and clever macros, you can collapse call stacks very effectively; and you can have small partially-initialized stack-allocated objects in C++ that outperform the JVM's short-lived object model). But you probably don't want to put the effort in.
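The "lots of little allocations" point can be sketched as follows (the `Point` class and method names are invented for illustration): in HotSpot, each allocation is usually just a thread-local pointer bump, objects that die young are reclaimed cheaply by the generational GC, and escape analysis may eliminate the allocation altogether.

```java
public class SmallObjects {
    // A small, short-lived value object: exactly the allocation
    // pattern where the JVM's memory subsystem shines.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
        double dist() { return Math.sqrt(x * x + y * y); }
    }

    static double sumDistances(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            // Each Point is a short-lived temporary. The JIT may prove
            // it never escapes this method and allocate nothing at all.
            sum += new Point(i, i).dist();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumDistances(1_000_000));
    }
}
```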

Why do compiled languages not perform equally if they eventually become machine code?

For one thing, C++ optimizers are much more mature. Another, performance has always been the overarching goal of the C++ language designers ("you don't pay for what you don't use" is the mantra, which clearly can't be said about Java's every-method-is-virtual policy).

Beyond that, C++ templates are far more optimization-friendly than Java or C# generics. Although JITs are often praised for their ability to optimize across module boundaries, generics stop this dead in its tracks. The CLR (.NET runtime) generates only one version of machine code for a generic, covering all reference types. The C++ optimizer, on the other hand, runs for each combination of template parameters and can inline dependent calls.
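On the Java side the same limitation shows up as type erasure, which can be demonstrated directly (the `Erasure` class name is invented for illustration): all instantiations of a generic share one compiled class, so element access goes through `Object` references and primitives must be boxed, whereas a C++ `std::vector<int>` is compiled specifically for `int`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Erasure {
    // List<Integer> and List<String> share a single erased class, so
    // the optimizer cannot specialize this method per element type,
    // and each element is unboxed from an Integer object.
    static int sum(List<Integer> xs) {
        int total = 0;
        for (Integer x : xs) total += x; // unboxing on every access
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<String> strs = new ArrayList<>();
        // Both lists are instances of the same erased runtime class.
        System.out.println(ints.getClass() == strs.getClass()); // true
        System.out.println(sum(ints)); // 6
    }
}
```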

Next, with C# and Java you have very little control over memory layout. Parallel algorithms can suffer an order of magnitude performance degradation from false sharing of cache lines, and there's almost nothing that the developer can do about it. OTOH C++ provides tools to place objects at specific offsets relative to RAM pages and cache boundaries.


