Marshal (Ruby) Pipes: Sending Serialized Object to Child Processes

Object serialization sent through pipe NoMethodError

The reason is that the line reader.gets returns a string and not a ruby object. Change the line obj = reader.gets to:

obj = Marshal.load(reader.gets)

And in the first fork block, you weren't writing the marshalled object but the object Pair.new(1,2).to_s. It should be writer.puts temp_str instead of writer.puts temp.

Python multiprocessing guidelines seems to conflict: share memory or pickle?

The second guideline is the one relevant to your use case.

The first is reminding you that this isn't threading where you manipulate shared data structures with locks (or atomic operations). If you use Manager.dict() (which is actually SyncManager.dict) for everything, every read and write has to access the manager's process, and you also need the synchronization typical of threaded programs (which itself may come at a higher cost from being cross-process).

The second guideline suggests inheriting shared, read-only objects via fork; in the forkserver case, this means you have to create such objects before the call to set_start_method, since all workers are children of a process created at that time.

The reports on the usability of such sharing are mixed at best, but if you can use a small number of any of the C-like array types (like numpy or the standard array module), you should see good performance (because the majority of pages will never be written to deal with reference counts). Note that you do not need multiprocessing.Array here (though it may work fine), since you do not need writes in one concurrent process to be visible in another.

Cross platform IPC

In terms of speed, the best cross-platform IPC mechanism will be pipes. That assumes, however, that you want cross-platform IPC on the same machine. If you want to be able to talk to processes on remote machines, you'll want to look at using sockets instead. Luckily, if you're talking about TCP at least, sockets and pipes behave pretty much the same behavior. While the APIs for setting them up and connecting them are different, they both just act like streams of data.

The difficult part, however, is not the communication channel, but the messages you pass over it. You really want to look at something that will perform verification and parsing for you. I recommend looking at Google's Protocol Buffers. You basically create a spec file that describes the object you want to pass between processes, and there is a compiler that generates code in a number of different languages for reading and writing objects that match the spec. It's much easier (and less bug prone) than trying to come up with a messaging protocol and parser yourself.



Related Topics



Leave a reply



Submit