Compiled VS. Interpreted Languages

Compiled vs. Interpreted Languages

A compiled language is one where the program, once compiled, is expressed in the instructions of the target machine. For example, an addition "+" operation in your source code could be translated directly to the "ADD" instruction in machine code.

An interpreted language is one where the instructions are not directly executed by the target machine, but instead read and executed by some other program (which normally is written in the language of the native machine). For example, the same "+" operation would be recognised by the interpreter at run time, which would then call its own "add(a,b)" function with the appropriate arguments, which would then execute the machine code "ADD" instruction.

You can do anything that you can do in an interpreted language in a compiled language and vice-versa - they are both Turing complete. Both however have advantages and disadvantages for implementation and use.

I'm going to completely generalise (purists forgive me!) but, roughly, here are the advantages of compiled languages:

  • Faster performance by directly using the native code of the target machine
  • Opportunity to apply quite powerful optimisations during the compile stage

And here are the advantages of interpreted languages:

  • Easier to implement (writing good compilers is very hard!!)
  • No need to run a compilation stage: can execute code directly "on the fly"
  • Can be more convenient for dynamic languages

Note that modern techniques such as bytecode compilation add some extra complexity - what happens here is that the compiler targets a "virtual machine" which is not the same as the underlying hardware. These virtual machine instructions can then be compiled again at a later stage to get native code (e.g. as done by the Java JVM JIT compiler).

Difference between compiled and interpreted languages?

Neither approach has a clear advantage over the other - if one approach was always better, chances are that we'd start using it everywhere!

Generally speaking, compilers offer the following advantages:

  1. Because they can see all the code up-front, they can perform a number of analyses and optimizations when generating code that makes the final version of the code executed faster than just interpreting each line individually.

  2. Compilers can often generate low-level code that performs the equivalent of a high-level ideas like "dynamic dispatch" or "inheritance" in terms of memory lookups inside of tables. This means that the resulting programs need to remember less information about the original code, lowering the memory usage of the generated program.

  3. Compiled code is generally faster than interpreted code because the instructions executed are usually just for the program itself, rather than the program itself plus the overhead from an interpreter.

Generally speaking, compilers have the following drawbacks:

  1. Some language features, such as dynamic typing, are difficult to compile efficiently because the compiler can't predict what's going to happen until the program is actually run. This means that the compiler might not generate very good code.
  2. Compilers generally have a long "start-up" time because of the cost of doing all the analysis that they do. This means that in settings like web browsers where it's important to load code fast, compilers might be slower because they optimize short code that won't be run many times.

Generally speaking, interpreters have the following advantages:

  1. Because they can read the code as written and don't have to do expensive operations to generate or optimize code, they tend to start up faster than compilers.

  2. Because interpreters can see what the program does as its running, interpreters can use a number of dynamic optimizations that compilers might not be able to see.

Generally speaking, interpreters have the following disadvantages:

  1. Interpreters typically have higher memory usage than compilers because the interpreter needs to keep more information about the program available at runtime.

  2. Interpreters typically spend some CPU time inside of the code for the interpreter, which can slow down the program being run.

Because interpreters and compilers have complementary strengths and weaknesses, it's becoming increasingly common for language runtimes to combine elements of both. Java's JVM is a good example of this - the Java code itself is compiled, and initially it's interpreted. The JVM can then find code that's run many, many times and compile it directly to machine code, meaning that "hot" code gets the benefits of compilation while "cold" code does not. The JVM can also perform a number of dynamic optimizations like inline caching to speed up performance in ways that compilers typically don't.

Many modern JavaScript implementations use similar tricks. Most JavaScript code is short and doesn't do all that much, so they typically start off using an interpreter. However, if it becomes clear that the code is being run repeatedly, many JS engines will compile the code - or at least, compile bits and pieces of it - and optimize it using standard techniques. The net result is that the code is fast at startup (useful for loading web pages quickly) but gets faster the more that it runs.

One last detail is that languages are not compiled or interpreted. Usually, C code is compiled, but there are C interpreters available that make it easier to debug or visualize the code that's being run (they're often used in introductory programming classes - or at least, they used to be.) JavaScript used to be thought of as an interpreted language until some JS engines started compiling it. Some Python implementations are purely interpreters, but you can get Python compilers that generate native code. Now, some languages are easier to compile or interpret than others, but there's nothing stopping you from making a compiler or interpreter for any particular programming language. There's a theoretical result called the Futamura projections that shows that anything that can be interpreted can be compiled, for example.

What's the difference between compiled and interpreted language?

What’s the difference between compiled and interpreted language?

The difference is not in the language; it is in the implementation.

Having got that out of my system, here's an answer:

  • In a compiled implementation, the original program is translated into native machine instructions, which are executed directly by the hardware.

  • In an interpreted implementation, the original program is translated into something else. Another program, called "the interpreter", then examines "something else" and performs whatever actions are called for. Depending on the language and its implementation, there are a variety of forms of "something else". From more popular to less popular, "something else" might be

    • Binary instructions for a virtual machine, often called bytecode, as is done in Lua, Python, Ruby, Smalltalk, and many other systems (the approach was popularized in the 1970s by the UCSD P-system and UCSD Pascal)

    • A tree-like representation of the original program, such as an abstract-syntax tree, as is done for many prototype or educational interpreters

    • A tokenized representation of the source program, similar to Tcl

    • The characters of the source program, as was done in MINT and TRAC

One thing that complicates the issue is that it is possible to translate (compile) bytecode into native machine instructions. Thus, a successful intepreted implementation might eventually acquire a compiler. If the compiler runs dynamically, behind the scenes, it is often called a just-in-time compiler or JIT compiler. JITs have been developed for Java, JavaScript, Lua, and I daresay many other languages. At that point you can have a hybrid implementation in which some code is interpreted and some code is compiled.

Why are Interpreted Languages Slow?

Native programs runs using instructions written for the processor they run on.

Interpreted languages are just that, "interpreted". Some other form of instruction is read, and interpreted, by a runtime, which in turn executes native machine instructions.

Think of it this way. If you can talk in your native language to someone, that would generally work faster than having an interpreter having to translate your language into some other language for the listener to understand.

Note that what I am describing above is for when a language is running in an interpreter. There are interpreters for many languages that there is also native linkers for that build native machine instructions. The speed reduction (however the size of that might be) only applies to the interpreted context.

So, it is slightly incorrect to say that the language is slow, rather it is the context in which it is running that is slow.

C# is not an interpreted language, even though it employs an intermediate language (IL), this is JITted to native instructions before being executed, so it has some of the same speed reduction, but not all of it, but I'd bet that if you built a fully fledged interpreter for C# or C++, it would run slower as well.

And just to be clear, when I say "slow", that is of course a relative term.

What are the pros and cons of interpreted languages?

Blatant copy from wikipedia so I'll make this community wiki.

Advantages of interpreted languages

Interpreted languages give programs certain extra flexibility over compiled languages. Features that are easier to implement in interpreters than in compilers include (but are not limited to):

  • platform independence (Java's byte code, for example)
  • reflection and reflective usage of the evaluator (e.g. a first-order eval function)
  • dynamic typing
  • ease of debugging (it is easier to get source code information in interpreted languages)
  • small program size (since interpreted languages have flexibility to choose instruction code)
  • dynamic scoping
  • automatic memory management

Disadvantages of interpreted languages

An execution by an interpreter is usually much less efficient than regular program execution. It happens because either every instruction should pass an interpretation at runtime or as in newer implementations, the code has to be compiled to an intermediate representation before every execution. The virtual machine is a partial solution to the performance issue as the defined intermediate-language is much closer to machine language and thus easier to be translated at run-time. Another disadvantage is the need for an interpreter on the local machine to make the execution possible.

Compiled interpreted language

Compilation vs. "interpretation" is essentially a matter of implementation, not the language itself. For example, MRI Ruby 1.8 is interpreted, while MacRuby is compiled to native machine code. Both include an interactive REPL. All the languages I know that have at least one machine-code compiler and at least one REPL:

  • Ruby
  • Python
  • Almost all Lisps (Lisp was the language that pioneered this technique, AFAIK)
  • OCaml
  • Haskell
  • Forth

If we're counting compilation to bytecode as well as machine code, it's true of the vast majority of popular bytecode-compiled languages:

  • Java
  • Scala
  • Groovy
  • Erlang
  • C#
  • F#
  • Smalltalk


Related Topics



Leave a reply



Submit