How to Compile Ruby to Byte Code as with Python

How to compile Ruby?

The simple answer is that you can't, at least with MRI 1.8 (the standard implementation at the time). This is because 1.8 executes code by walking the abstract syntax tree. Python, Ruby 1.9, JRuby, and Rubinius instead compile the source to an intermediate representation (byte code) first. Since MRI Ruby 2.3 it has become easy to do this yourself; see the RubyVM::InstructionSequence answer further below.
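As a rough illustration of what MRI 2.3 and later allow, the sketch below compiles a source string, writes the serialized byte code to a file, and loads and runs it again. It assumes MRI (CRuby), since RubyVM does not exist on JRuby or other implementations; the file name example.yarb and the sample source are made up for the example.

```ruby
# Minimal sketch, assuming MRI (CRuby) 2.3 or newer.
require "tmpdir"

source = "[1, 2, 3].map { |n| n * 2 }"

# Compile the source text to a YARV instruction sequence.
iseq = RubyVM::InstructionSequence.compile(source)

# Serialize the byte code and write it to disk (the file name is arbitrary).
path = File.join(Dir.tmpdir, "example.yarb")
File.binwrite(path, iseq.to_binary)

# Later, with the same Ruby build: read the file back and run it.
loaded = RubyVM::InstructionSequence.load_from_binary(File.binread(path))
result = loaded.eval
p result
```

The binary is only meant to be read back by the same Ruby version that produced it, so this is closer to a private cache than to a distributable .pyc file.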

With Rubinius, you can do something as described in this post: http://rubini.us/2011/03/17/running-ruby-with-no-ruby/

In JRuby you can use the "Ahead Of Time" compiler through, I believe, jrubyc.

This isn't really the standard way of doing things and you're generally better off just letting your Ruby implementation handle it like it wants to. Rubinius, at least, will cache byte code after the first compilation, updating it as it needs to.

Compile a string to Ruby bytecode for better performance -- like compile() in Python

This is based on the solution by jhs, but uses the lambda directly as the loop body: the & calls to_proc on the lambda and passes it as the block to select.

given_code = 'n % 2 == 1'
pred = eval "lambda { |n| #{given_code} }"
p all = (1..10).select(&pred)
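The same idea can be sketched with RubyVM::InstructionSequence on MRI 2.3 or later, which makes the "compile once, call many times" step explicit. This is only an illustration, not a claim that it is faster than the eval version above: on YARV, eval also compiles to byte code internally, so the gain comes from building the lambda once instead of evaluating the string inside the loop.

```ruby
# Minimal sketch, assuming MRI (CRuby) 2.3 or newer.
given_code = "n % 2 == 1"

# Compile the lambda's source text once.
iseq = RubyVM::InstructionSequence.compile("lambda { |n| #{given_code} }")

# Evaluating the instruction sequence returns the lambda object.
pred = iseq.eval

# The compiled lambda can now be reused as a block any number of times.
odds_small = (1..10).select(&pred)
odd_count  = (1..1_000).count(&pred)

p odds_small
p odd_count
```

As with eval, the string is executed as code, so given_code must come from a trusted source.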

Does Ruby have an equivalent to the __pycache__ folder in Python?

Ruby doesn't have an equivalent. It just wouldn't make sense: Ruby is a programming language. A programming language is an abstract mathematical concept, a specification. Putting such detailed things as the name of the directory of the byte code cache in a language would be way too restrictive: what if somebody wants to implement Ruby on a platform which doesn't have files? What if someone wants to implement Ruby on a platform where underscores are illegal in directory names? What if someone wants to implement Ruby with an interpreter instead of a compiler?

There are, however, some Ruby implementations which do compile to byte code. YARV and Rubinius are two examples of those. YARV only compiles in memory, whereas Rubinius caches the compiled byte code on disk. In fact, it must have the ability to save and read the compiled byte code because the compiler itself is written in Ruby, and otherwise it would have to compile itself in order to be able to compile any code, but in order to compile itself it would first have to compile itself and in order to do that it would first have to …

But that is a private internal implementation detail of Rubinius. It is not part of Ruby nor should it be.

Can Ruby, PHP, or Perl create a pre-compiled file for the code like Python?

There is no portable bytecode specification for Ruby, and thus also no standard way to load precompiled bytecode archives. However, almost all Ruby implementations use some kind of bytecode or intcode format, and several of them can dump and reload bytecode archives.

YARV always compiles to bytecode before executing the code; however, that is usually done only in memory. There are ways to dump out the bytecode to disk. At the time this was written, there was no way to read it back in. The plan then was for this to change: work was underway on a bytecode verifier for YARV, and once that was done, bytecode could safely be loaded into the VM, without fear of corruption. Also, the JRuby developers had indicated that they were willing to implement a YARV VM emulator inside JRuby, once the YARV bytecode format and verifier were stabilized, so that you could load YARV bytecode into JRuby. (Note that this description is obsolete: since MRI 2.3, YARV can load dumped bytecode back in, although still without a verifier.)

Rubinius also always compiles to bytecode, and it has a format for compiled files (.rbc files, analogous to JVM .class files) and there is talk about a bytecode archive format (.rba files, analogous to JVM .jar files). There is a chance that Rubinius might implement a YARV emulator, if deploying apps as YARV bytecode ever becomes popular. Also, the JRuby developers have indicated that they are willing to implement a Rubinius bytecode emulator inside JRuby, if Rubinius bytecode becomes a popular way of deploying Ruby apps. (Note that this version is obsolete.)

XRuby is a pure compiler: it compiles Ruby sourcecode straight to JVM bytecode (.class files). You can deploy these .class files just like any other Java application.

JRuby started out as an interpreter, but it has both a JIT compiler and an AOT compiler (jrubyc) that can compile Ruby sourcecode to JVM bytecode (.class files). Also, work is underway to create a new compiler that can compile (type-annotated) Ruby code to JVM bytecode that actually looks like a Java class and can be used from Java code without barriers.

Ruby.NET is a pure compiler that compiles Ruby sourcecode to CIL bytecode (PE .dll or .exe files). You can deploy these just like any other CLI application.

IronRuby also compiles to CIL bytecode, but typically does this in-memory. However, you can pass commandline switches to it, so it dumps the .dll and .exe files out to disk. Once you have those, they can be deployed normally.

BlueRuby automatically pre-parses Ruby sourcecode into BRIL (BlueRuby Intermediate Language), which is basically a serialized parsetree. (See "Blue Ruby - A Ruby VM in SAP ABAP" (PDF) for details.)

I think (but I am definitely not sure) that there is a way to get Cardinal to dump out Parrot bytecode archives. (Actually, Cardinal only compiles to PAST, and then Parrot takes over, so it would be Parrot's job to dump and load bytecode archives.)

How to run the bytecode generated by Ruby?

TL;DR: You are looking for the .eval method.

The .compile method returns an instance of the RubyVM::InstructionSequence class, which has an .eval method that evaluates (runs) your "compiled" instructions.

iseq = RubyVM::InstructionSequence.compile("x = 50; x > 100 ? 'foo' : 'bar'")
iseq.eval # => "bar"

Or, as a one-liner:

RubyVM::InstructionSequence.compile("x = 50; x > 100 ? 'foo' : 'bar'").eval
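To see what was actually compiled before running it, the instruction sequence can also be disassembled. The sketch below assumes MRI 2.3 or later and reuses the expression from above; the exact listing differs between Ruby versions, so no sample output is shown.

```ruby
# Minimal sketch, assuming MRI (CRuby).
iseq = RubyVM::InstructionSequence.compile("x = 50; x > 100 ? 'foo' : 'bar'")

# disasm returns a human-readable listing of the YARV instructions as a String.
listing = iseq.disasm
puts listing

# The same object can then be executed.
value = iseq.eval
p value
```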

Should I use Python or Ruby for creating a cross-platform, compiled application?

You can indeed distribute Python bytecode (.pyc files) to avoid distributing your Python source code.

According to this answer, some Ruby implementations also support compiling to bytecode.

So it sounds like, depending on which Ruby implementation you pick, you may find very little difference between using Python and Ruby.

Keep in mind that bytecode isn't hard to disassemble, so a motivated user would still be able to find out quite a bit about your program's internals. Using an obfuscator can make it harder (but still not impossible) to reverse engineer your bytecode. This is discussed more in this Python question and this Ruby question.

What does compiling prevent?

There is no standardized byte code format for Ruby. Therefore, whatever you have there, it is not "Ruby byte code", it is byte code for one version of one implementation of Ruby.

In your particular case, it is byte code for YARV. It will not work on MRuby, JRuby, Rubinius, Opal, MagLev, IronRuby, Topaz, MRI, or any other Ruby implementation.

Also, YARV does not guarantee forwards or backwards compatibility for its byte code, so there is no guarantee it will work on newer or older versions of YARV. The documentation says:

The instruction sequence results will almost certainly change as Ruby changes

Likewise, YARV does not guarantee byte code portability, so there is no guarantee it will work on a different operating system, different CPU, or different platform, even using the same version of YARV.

Lastly, YARV's byte code is unsafe, and there is no verifier. YARV will happily execute any unsafe byte code without checking, and you can construct byte code that leaves the VM in an unsafe state. Therefore, you should never ever do this with byte code you haven't created yourself and that is fully under your own control. The documentation says:

This loader does not have a verifier, so that loading broken/modified binary causes critical problem.

You should not load binary data provided by others. You should use binary data translated by yourself.
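One way to respect these caveats is to treat dumped byte code purely as a local cache. The sketch below is an assumption-laden illustration, not an established API: run_with_cache is a hypothetical helper, and the cache key built from RUBY_VERSION, RUBY_PLATFORM, and the source text is just one possible scheme. It only ever loads files that the same Ruby build wrote itself and recompiles whenever the version, platform, or source differs.

```ruby
# Minimal sketch, assuming MRI (CRuby) 2.3 or newer.
require "digest"
require "tmpdir"

# Hypothetical helper: runs `source`, reusing byte code that this same Ruby
# build dumped earlier into `cache_dir`, and compiling it otherwise.
def run_with_cache(source, cache_dir)
  # A different Ruby version, platform, or source yields a different file name,
  # so stale or foreign byte code is never picked up under the same key.
  key  = Digest::SHA256.hexdigest([RUBY_VERSION, RUBY_PLATFORM, source].join("\0"))
  path = File.join(cache_dir, "#{key}.yarb")

  iseq =
    if File.exist?(path)
      RubyVM::InstructionSequence.load_from_binary(File.binread(path))
    else
      RubyVM::InstructionSequence.compile(source).tap do |compiled|
        File.binwrite(path, compiled.to_binary)
      end
    end

  iseq.eval
end

cache_dir = Dir.mktmpdir
first  = run_with_cache("6 * 7", cache_dir) # compiles and writes the cache file
second = run_with_cache("6 * 7", cache_dir) # loads the cached byte code
p [first, second]
```

This does not make loading safe against tampering: anyone who can write to the cache directory can still plant a malicious file, so the directory has to stay under your own control.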

Note, with regard to your specific question:

Are there any obvious downsides that come with compiling Ruby to byte code, (except of course readability)?

You seem to be under the false impression that you actively need to do something special in order to compile Ruby to byte code. That is not necessarily true.

If you use YARV, Rubinius, MRuby, MagLev, or Topaz, then your Ruby code is always compiled to byte code, without you having to do anything. With IronRuby and JRuby, it may or may not be compiled to byte code, depending on whether the code is "hot" enough.

Also, with MagLev, your byte code will be compiled to native code if it is "hot" enough. With Rubinius and YARV, it might get compiled to native code, depending on the version. With IronRuby and JRuby, the CIL / JVM byte code might get compiled to native code, depending on the CLI VES / JVM implementation.


