How Does Ruby Serialization (Marshaling) Work

How does Ruby serialization (marshaling) work?

The Ruby marshalling methods store the type of the object they're encoding, so the two hooks you can define on your class, marshal_dump and marshal_load, only have to deal with the object's state. They work like this:

  • marshal_dump has to return some data describing the state of your object. Ruby couldn't care less about the format of this data — it just has to be something that your marshal_load method can use to reconstruct the object's state.

  • marshal_load is called on the marshalled object just after it's been recreated. It's basically the initialize method for a marshalled object. It's passed whatever object marshal_dump returned and has to use that data to reconstruct its state.

Here's an example:

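class Messenger
  attr_accessor :name, :message

  # Return any marshallable object that describes this object's state.
  def marshal_dump
    {'name' => name, 'message' => message}
  end

  # Receives whatever marshal_dump returned and restores the state from it.
  def marshal_load(data)
    self.name = data['name']
    self.message = data['message']
  end
end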
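To see the two hooks in action, here is a minimal round-trip sketch. The class is repeated so the snippet runs on its own, and the sample values 'Alice' and 'hello' are made up for illustration.

```ruby
# Same class as above, repeated so this snippet is self-contained.
class Messenger
  attr_accessor :name, :message

  def marshal_dump
    { 'name' => name, 'message' => message }
  end

  def marshal_load(data)
    self.name = data['name']
    self.message = data['message']
  end
end

original = Messenger.new
original.name = 'Alice'      # illustrative sample values
original.message = 'hello'

bytes = Marshal.dump(original)   # Marshal calls marshal_dump for us
copy  = Marshal.load(bytes)      # allocates a Messenger, then calls marshal_load

puts copy.name                   # Alice
puts copy.message                # hello
puts copy.equal?(original)       # false, because the copy is a new object
```

Note that initialize is never called on the copy, which is why marshal_load has to set every piece of state itself.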

Ruby Marshal serialization

Four bytes? It so happens that marshalling the empty array [] produces a 4-byte string on my Ruby:

> Marshal.dump([]).length
=> 4

Are you sure vuser_ary isn't empty when you try marshalling it?

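As it happens, you don't need to treat a "reference" to an object differently from the object itself: Ruby variables always hold references and there are no explicit pointers to dereference, so Marshal follows the reference and serializes the object it finds. If you have an array (and it's non-empty), then its contents get marshalled: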

> Marshal.dump([1, 2, 3]).length
=> 10

For good measure:

> vuser_ary = [{1=>"Fail", 2=>"Fail", 3=>"Pass", 4=>"Pass", 5=>"Fail"}]
=> [{1=>"Fail", 2=>"Fail", 3=>"Pass", 4=>"Pass", 5=>"Fail"}]
> Marshal.dump(vuser_ary).length
=> 72
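If you are curious where those four bytes come from, you can look at them directly. This is only a small sketch; the byte meanings in the comments describe the 4.8 Marshal format used by current Ruby versions.

```ruby
empty_bytes = Marshal.dump([]).bytes
p empty_bytes   # [4, 8, 91, 0]

# 4 and 8 are the format's major and minor version numbers,
# 91 is the character code of '[' (the type tag for an Array),
# and the final 0 is the array length.
puts empty_bytes[0] == Marshal::MAJOR_VERSION   # true
puts empty_bytes[1] == Marshal::MINOR_VERSION   # true
puts empty_bytes[2] == '['.ord                  # true
```

So a 4-byte result is exactly what an empty array looks like, which is why it's worth checking the contents of vuser_ary before dumping it.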

How do I marshal a lambda (Proc) in Ruby?

You cannot marshal a lambda or Proc; Marshal.dump raises a TypeError if you try. This is because both of them are closures, which means they close over the environment (the local variables and self) in which they were defined and can reference it. In order to marshal them, you'd have to marshal everything they could access at the time they were created.

As Gaius pointed out, though, you can use ruby2ruby to get hold of the source code as a string. That is, you can marshal the string that represents the Ruby code and then re-evaluate it later.
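A quick sketch of both points: the failure when dumping a lambda, and the workaround of marshalling source code instead. The source string here is written by hand as a stand-in for whatever a tool like ruby2ruby would produce; that substitution is an assumption of this example, and ruby2ruby itself isn't used.

```ruby
double = ->(x) { x * 2 }

dump_error = nil
begin
  Marshal.dump(double)
rescue TypeError => e
  dump_error = e
  puts "Cannot marshal a lambda: #{e.class}"
end

# Workaround: marshal the code as a plain String and rebuild the lambda later.
# Hand-written source, standing in for tool-generated source.
source   = '->(x) { x * 2 }'
restored = eval(Marshal.load(Marshal.dump(source)))

puts restored.call(21)   # 42
```

Only eval strings you trust. The rebuilt lambda also closes over the scope where eval runs, not the scope where the original was defined, so any captured local variables are lost.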

Why can marshaling serialize a circularly referenced list when JSON can't?

In my understanding, both of them convert an object to a string and get the object back from the string.

Yes. That is pretty much the definition of "serialization" or "marshaling".

I can also see that JSON doesn't have any construct for referring to another part of the JSON document, which may be the reason why it can't support this kind of operation.

Yes, that is the reason.

But is it that difficult to introduce such a construct in JSON to handle this situation?

You cannot introduce constructs in JSON. It was deliberately designed to have no version number, so that it can never, ever be changed.

Of course, this only means that we cannot add it now, but could Doug Crockford have added it from the beginning, back when he was designing JSON? Yes, of course. But he didn't. JSON was deliberately designed to be simple (bold emphasis mine):

JSON is not a document format. It is not a markup language. It is not even a general serialization format in that it does not have a direct representation for cyclical structures […]

Compare that with YAML, for example, which is a superset of JSON (as of YAML 1.2) and has anchors and aliases that let one part of a document refer to another, so it can represent cyclical data.
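The difference is easy to reproduce in Ruby. The sketch below builds an array that contains itself, round-trips it through Marshal, and then tries the same thing with the standard json library; the variable names are just for illustration.

```ruby
require 'json'

list = [1, 2]
list << list   # the array now contains a reference to itself

# Marshal records the repeated object as a link back to the first occurrence.
copy = Marshal.load(Marshal.dump(list))
cycle_preserved = copy[2].equal?(copy)
puts cycle_preserved   # true

# JSON has no way to express "this same array again", so the generator
# keeps descending until it hits its nesting limit and raises.
json_failed = false
begin
  JSON.generate(list)
rescue JSON::NestingError => e
  json_failed = true
  puts "JSON could not serialize the cycle: #{e.class}"
end
```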

Marshalling vs ActiveRecord Serialization in Ruby On Rails

IIRC:

Ruby's Marshal format is not guaranteed to work across different Ruby versions, or across the same Ruby version on different platforms.

Because you may have different Ruby versions accessing the same serialized column, Rails implements its serialization using YAML. While this is slower, it means your serialized column can be read by other Ruby versions, by Ruby on other operating systems, and by other programming languages.
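To make the trade-off a bit more concrete, here is a plain-Ruby sketch (no Rails involved) that dumps the same made-up hash both ways. Marshal output is binary and starts with the format's version bytes, which Marshal.load checks before reading anything else; YAML output is ordinary text.

```ruby
require 'yaml'

data = { 'status' => 'Pass', 'attempts' => 3 }   # illustrative sample data

marshalled = Marshal.dump(data)
major, minor = marshalled.bytes.first(2)
puts "Marshal format version: #{major}.#{minor}"
puts major == Marshal::MAJOR_VERSION   # true

yaml_text = YAML.dump(data)
puts yaml_text                          # human-readable text
yaml_round_trip = YAML.load(yaml_text)
puts yaml_round_trip == data            # true
```

The YAML text can be read by any YAML parser in any language, whereas the Marshal bytes are only meaningful to a compatible Ruby.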

Modify an object before marshaling it in Ruby

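It may be that there is no superclass method for _dump. If it's defined on your object, it's called; if not, the default handler is used.

You probably want to clone your object and remove the sensitive fields, then return the remaining state from your _dump method. Note that _dump has to return a String, so pack the state (a Hash, say) into one, for example by marshalling it. Then undo that in the class method _load, which receives that String and has to return the rebuilt object.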

You can also read the documentation on Marshal where it describes the recommended methods.
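As a rough sketch of that approach, the hypothetical Account class below leaves its password out of the dump. The class, its fields, and the sample values are all invented for illustration; only _dump and _load are the hooks Marshal actually looks for.

```ruby
class Account
  attr_reader :user, :password

  def initialize(user, password)
    @user = user
    @password = password
  end

  # Instance method: must return a String. The argument is the depth limit
  # that was passed to Marshal.dump and is not needed here.
  def _dump(_level)
    Marshal.dump({ 'user' => @user })   # password deliberately omitted
  end

  # Class method: receives the String from _dump and returns a new object.
  def self._load(string)
    data = Marshal.load(string)
    new(data['user'], nil)
  end
end

bytes    = Marshal.dump(Account.new('alice', 's3cret'))
restored = Marshal.load(bytes)

puts restored.user               # alice
p restored.password              # nil
puts bytes.include?('s3cret')    # false, the secret never reached the dump
```

If you don't need a String-based format, marshal_dump and marshal_load are usually simpler, since marshal_dump may return any marshallable object.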

How to write to file when using Marshal::dump in Ruby for object serialization

Like this:

class Line
  attr_reader :p1, :p2

  def initialize(point1, point2)
    @p1 = point1
    @p2 = point2
  end
end

line = Line.new([1,2], [3,4])

Save line:

FNAME = 'my_file'

File.open(FNAME, 'wb') {|f| f.write(Marshal.dump(line))}

Retrieve into line1:

line1 = Marshal.load(File.binread(FNAME))

Confirm it works:

line1.p1 # => [1, 2]
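Marshal.dump also accepts an IO object as its second argument, and Marshal.load can read straight from an IO, so the intermediate string isn't strictly necessary. The sketch below writes two objects to one file and reads them back in order; the Point struct and the temporary file name are stand-ins chosen for this example.

```ruby
require 'tmpdir'

Point = Struct.new(:x, :y)   # hypothetical stand-in for your own class

loaded = Dir.mktmpdir do |dir|
  path = File.join(dir, 'objects.bin')

  # Write two objects back to back, in binary mode.
  File.open(path, 'wb') do |f|
    Marshal.dump(Point.new(1, 2), f)
    Marshal.dump(Point.new(3, 4), f)
  end

  # Each Marshal.load call reads exactly one object from the stream.
  File.open(path, 'rb') do |f|
    [Marshal.load(f), Marshal.load(f)]
  end
end

p loaded.map(&:to_a)   # [[1, 2], [3, 4]]
```

As with any Marshal.load call, only load files you trust, because loading can instantiate arbitrary classes.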

