Using Fork in Windows with Ruby

Equivalent of fork do for Windows machines in Ruby

fork and Windows don't really go well together. The philosophy towards processes in Unix and Windows is very different.

In Unix, processes are the primary mechanism for functional decomposition. I.e. in Unix, an application or a service would be composed as a pipeline of multiple cooperating processes. Processes are used in Unix in a similar way as objects are in Ruby. In order for that to work, processes are cheap and lightweight.

In Windows, components are the primary mechanism for functional decomposition. I.e. in Windows, an application or a service would consist of a single process, implemented using multiple components. Components are used like objects in Ruby or processes in Unix. That means processes in Windows do not need to be cheap or lightweight, and in fact, they are pretty expensive, heavy, and slow to create and destroy.

In addition to processes being used very differently and being more expensive, they are also interacted with very differently. The APIs for creating, managing, and destroying processes are very different. There is no exact equivalent to fork in Windows, and no easy, performant way to implement it.

Unfortunately, switching to TruffleRuby or JRuby won't help either. Their Windows support has traditionally tended to be even better than YARV's, but that does not extend to fork, because the JVM does not allow it: forking a JVM produces a child that contains only the thread that called fork and none of the GC, compiler, I/O, or other helper and auxiliary threads, which leaves you with a broken JVM. It only really works if you immediately exec a different program, thereby leaving the JVM behind.

That being said, this particular usage of Kernel#fork could probably be implemented on JRuby, because (if I remember correctly), you can instantiate multiple JRuby instances in the same JVM. So, for this particular usage, where you only execute some Ruby code in the background, you don't actually need to fork, you could spin up a new JRuby instance in a different thread.

But as far as I know, that is not implemented. (Might make an interesting project for a first contribution, though!)

The closest equivalent to your code using Kernel#spawn in Windows would probably be something like:

require 'rbconfig'

spawn(RbConfig.ruby, '-e', <<~BLOCK_END)
  # Example background code
  x = 0

  while x < 100 do
    File.write("./example_file.txt", x.to_s, mode: "a")
    x += 1 # without this increment the loop never terminates
  end
BLOCK_END

But that is not the correct way to approach your problem. You are approaching this as an X/Y problem: you have problem X. You know that the solution in Unix is Y. Now, instead of asking the question "how do I solve problem X on Windows", you are asking the question "how do I implement solution Y on Windows", without even questioning whether Y is the right solution.

So, your question should not be what is the equivalent of fork in Windows, but rather how to design an application in Windows.

Using Process.spawn as a replacement for Process.fork

EDIT: There is one common use case of fork() that can be replaced with spawn() -- the fork()--exec() combo. A lot of older (and modern) UNIX applications, when they want to spawn another process, will first fork, and then make an exec call (exec replaces the current process with another). This doesn't actually need fork(), which is why it can be replaced with spawn(). So, this:

if !fork
  exec("dir")
end

can be replaced with:

Process.spawn("dir")

If any of the gems are using fork() like this, the fix is easy. Otherwise, it is almost impossible.
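As a rough sketch of that replacement: most fork-then-exec call sites also want the child's exit status, and Process.spawn combined with Process.wait2 covers that without fork. The child command below is an illustrative assumption, not taken from any particular gem.

```ruby
require 'rbconfig'

# Hypothetical child command: a second Ruby interpreter that exits with code 3.
# RbConfig.ruby is the path of the interpreter running this script.
child_command = [RbConfig.ruby, '-e', 'exit 3']

# spawn returns immediately with the child's pid, like the parent side of fork.
pid = Process.spawn(*child_command)

# wait2 blocks until the child exits and returns [pid, Process::Status].
_, status = Process.wait2(pid)

exit_code = status.exitstatus
puts "child #{pid} exited with #{exit_code}"
```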


EDIT: The reason why win32-process' implementation of fork() doesn't work is that (as far as I can tell from the docs), it basically is spawn(), which isn't fork() at all.


No, I don't think it can be done. You see, Process.spawn creates a new process with the default blank state and native code. So, while something like Process.spawn('dir') will start a new, blank process running dir, it won't clone any of the current process' state. Its only connection to your program is the parent-child relationship.
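To illustrate the "blank state" point, here is a minimal sketch in which the parent has to hand its state to the spawned child explicitly. The variable name GREETING and the one-line child script are assumptions made up for the example.

```ruby
require 'rbconfig'

# State that lives only in the parent process.
greeting = 'hello from the parent'

rd, wr = IO.pipe

# The child is a fresh interpreter: it cannot see `greeting`, so the value is
# handed over through an environment variable (GREETING is a made-up name).
pid = Process.spawn({ 'GREETING' => greeting },
                    RbConfig.ruby, '-e', 'print ENV["GREETING"]',
                    out: wr)
wr.close                 # the parent keeps only the read end

child_output = rd.read   # returns once the child closes its stdout
rd.close
Process.wait(pid)

puts child_output
```

Anything the child needs (arguments, environment, open pipes) has to be passed like this; nothing is inherited from the parent's Ruby heap.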

You see, fork() is a very low level call. For example, on Linux, what fork() basically does is this: first, a new process is created with exactly cloned register state. Then, Linux does a copy-on-write reference to all of the parent process' pages. Linux then clones some other process flags. Obviously, all of these operations can only be done by the kernel, and the Windows kernel doesn't have the facilities to do that (and can't be patched to either).
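For contrast, this is roughly what the cloning looks like from Ruby on a system that does have fork. It is only a sketch: it is guarded with Process.respond_to?(:fork), which returns false where fork is not implemented, and the counter variable is purely illustrative.

```ruby
counter = 41
child_value = nil

if Process.respond_to?(:fork)   # false on platforms without fork
  rd, wr = IO.pipe

  pid = fork do
    # The child starts with a copy of the parent's memory, including `counter`.
    rd.close
    counter += 1                # changes only the child's copy
    wr.write(counter.to_s)
    wr.close
    exit!(0)                    # leave the child without running at_exit hooks
  end

  wr.close
  child_value = rd.read.to_i    # value computed inside the child
  rd.close
  Process.wait(pid)

  puts "child saw #{child_value}, parent still has #{counter}"
else
  puts 'fork is not available on this platform'
end
```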

Technically, only native programs need the OS for some sort of fork()-like support. Any layer of code needs the cooperation of the layer beneath it to do something like fork(). So while native C code needs the cooperation of the kernel to fork, Ruby theoretically only needs the cooperation of the interpreter to do a fork. However, the Ruby interpreter does not have a snapshot/restore feature, which would be necessary to implement a fork. Because of this, normal Ruby fork is achieved by forking the interpreter itself, not the Ruby program.

So, if you could patch the Ruby interpreter to add a stop/start and snapshot/restore feature, you could do it. Otherwise? I don't think so.

So what are your options? This is what I can think of:

  • Patch the Ruby interpreter
  • Patch the code that uses fork() to maybe use threads or spawn
  • Get a UNIX (I suggest this one)
  • Use Cygwin
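For the second option, one possible shape of such a patch is sketched below: use fork where it exists and fall back to a thread elsewhere. The helper name in_background is hypothetical, and the fallback is not a true equivalent, since a thread shares memory with the caller while a forked child gets its own copy.

```ruby
require 'tmpdir'

# Runs the block in the background and returns a lambda that waits for it.
# NOTE: `in_background` is a hypothetical helper name, not an existing API.
def in_background(&block)
  if Process.respond_to?(:fork)
    pid = fork do
      block.call
      exit!(0)                  # skip at_exit hooks inherited from the parent
    end
    -> { Process.wait(pid) }
  else
    thread = Thread.new(&block) # fallback for platforms without fork
    -> { thread.join }
  end
end

# Usage: the block writes a file, which is visible from either a child
# process or a thread.
path = File.join(Dir.tmpdir, "in_background_#{Process.pid}.txt")
wait_for_it = in_background { File.write(path, 'done') }
wait_for_it.call

result = File.read(path)
File.delete(path)
puts result
```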

Edit 1:
I wouldn't suggest using Cygwin's fork: it involves special Cygwin process tables and has no copy-on-write, which makes it very inefficient. It also involves a lot of jumping back and forth and a lot of copying. Avoid it if possible. Also, because Windows provides no facilities to copy address spaces, forks are likely to fail, and quite often do (see here).

Running fork(2) from Windows with Cygwin. Possible?

fork(2) is kludgey under Cygwin, as the Windows process model does not easily allow it to happen. Cygwin may allow its spawn to use it, but you're going to suffer a serious performance hit as Cygwin has to emulate everything by hand -- including copying the executable data, copying the open handles, etc.

Depending on how much shotgun uses fork(2), this emulation could be painful or it could be relatively minor.

Here's a good thread on GameDev.net discussing the lack of a fork facility on Win32. They bring up something which I don't have the patience or platform accessibility to investigate, but certainly sounds fun, dangerous, and explosive all at the same time:

So, you need to bypass Win32 and call the native API ({Nt|Zw}CreateProcess). The book "Windows NT/2000 Native API Reference" has an example "Forking a Win32 Process". This may be what you need.

I'm intrigued, but I doubt Cygwin uses it. It's probably there, to reiterate my answer to your question -- a lot of Unix apps rely on fork, and Cygwin likely makes it available. Just don't expect miracles, and you'll have to make Ruby aware of Cygwin by recompiling it to include its emulation layer.

Spawn a background process in Ruby on Windows?

The win32-process library, part of the Win32Utils suite, is probably what you're after.

http://win32utils.rubyforge.org/

The win32-process library adds the Process.create and Process.fork methods for MS Windows. In addition, it provides different implementations of the wait, wait2, waitpid, and waitpid2 methods.
The Process.create method allows you to create native MS Windows processes using a variety of different configuration options.

The Process.fork implementation should be considered experimental and not used in production code.

Installation: gem install win32-process

Fork child process with timeout and capture output

You can use IO.pipe and tell Process.spawn to use the redirected output without the need of external gem.

Of course, this only works starting with Ruby 1.9.2 (and I personally recommend 1.9.3).

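The following is adapted from a simple implementation used by Spinach BDD internally to capture both out and err outputs:

# stdout, stderr pipes
rout, wout = IO.pipe
rerr, werr = IO.pipe

# `command` is the command line to run, supplied by the surrounding method
pid = Process.spawn(command, :out => wout, :err => werr)

# close the parent's write ends so the reads below see EOF when the child exits
wout.close
werr.close

# read before waiting, so a child that fills the stdout pipe is not left blocked
# (a child that floods stderr first can still stall; read both concurrently if that matters)
@stdout = rout.read
@stderr = rerr.read
_, status = Process.wait2(pid)

# dispose the read ends of the pipes
rout.close
rerr.close

@last_exit_status = status.exitstatus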

The original source is in features/support/filesystem.rb
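If you don't need to manage the pipes yourself, the standard library's Open3.capture3 does the same job in one call. This is a sketch of an alternative, not what Spinach does; the child script below is a made-up example.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical command: writes to both streams and exits with status 2.
script = '$stdout.print "to stdout"; $stderr.print "to stderr"; exit 2'

# capture3 spawns the command, reads both streams to the end, and waits for
# the child. It returns [stdout, stderr, Process::Status].
stdout_text, stderr_text, status = Open3.capture3(RbConfig.ruby, '-e', script)

puts "stdout: #{stdout_text}"
puts "stderr: #{stderr_text}"
puts "exit status: #{status.exitstatus}"
```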

It is highly recommended that you read Ruby's own Process.spawn documentation.

Hope this helps.

PS: I left the timeout implementation as homework for you ;-)
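For reference, one way the timeout part could look. This is a sketch under assumptions, not a definitive implementation: run_with_timeout is a hypothetical name, stdout and stderr are merged into one pipe for brevity, and the child is killed outright when the deadline passes.

```ruby
require 'rbconfig'
require 'timeout'

# Runs a command, returning [combined_output, status].
# status is nil when the command was killed because it ran too long.
# NOTE: `run_with_timeout` is a hypothetical helper, not a library method.
def run_with_timeout(*command, timeout: 5)
  rout, wout = IO.pipe
  pid = Process.spawn(*command, out: wout, err: wout)
  wout.close

  # Drain the pipe in a thread so a chatty child never blocks on a full buffer.
  reader = Thread.new { rout.read }

  status = begin
    Timeout.timeout(timeout) { Process.wait2(pid).last }
  rescue Timeout::Error
    Process.kill('KILL', pid)   # deadline passed: terminate the child
    Process.wait(pid)           # reap it so no zombie is left behind
    nil
  end

  output = reader.value
  rout.close
  [output, status]
end

# Usage with a hypothetical quick command.
output, status = run_with_timeout(RbConfig.ruby, '-e', 'print "hi"', timeout: 10)
puts "output=#{output.inspect} success=#{status && status.success?}"
```

Note that grandchildren started by the command keep the pipe open, so reading can still wait on them after the direct child has been killed.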


