Ruby YAML parser bypassing the constructor

Deserializing an object from YAML doesn’t use the initialize method because, in general, there is no correspondence between the object’s instance variables (which is what the default YAML serialization stores) and the parameters to initialize.

As an example, consider an object with an initialize that looks like this (with no other instance variables):

def initialize(param_one, param_two)
  @a_variable = some_calculation(param_one, param_two)
end

Now when an instance of this is deserialized, the YAML processor has a value for @a_variable, but the initialize method requires two parameters, so it can’t call it. Even if the number of instance variables matches the number of parameters to initialize, it is not necessarily the case that they correspond, and even if they did, the processor doesn’t know the order in which they should be passed to initialize.

The default process for serializing and deserializing a Ruby object to YAML is to write out all instance variables (with their names) during serialization, then, when deserializing, allocate a new instance of the class and simply set the same instance variables on this new instance.
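To see that default process in action, here is a minimal sketch. The Point class and the load_object helper are hypothetical names made up for this illustration. The helper assumes that newer Psych versions only revive arbitrary objects through unsafe_load, while older versions do so from plain load.

```ruby
require 'yaml'

# Hypothetical class used only for this illustration.
class Point
  @init_calls = 0

  class << self
    attr_accessor :init_calls
  end

  attr_reader :x, :y

  def initialize(x, y)
    self.class.init_calls += 1 # count how often initialize runs
    @x = x
    @y = y
  end
end

# Assumption: newer Psych versions need unsafe_load to revive arbitrary
# objects; older versions do it from plain load.
def load_object(yaml)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
end

original = Point.new(1, 2)
yaml     = YAML.dump(original)
copy     = load_object(yaml)

puts yaml             # the instance variables, tagged with the class name
puts copy.x           # restored from the YAML
puts Point.init_calls # initialize ran only for `original`, not for `copy`
```

The counter lives on the class rather than on the instance so that it is not itself written to the YAML.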

Of course, sometimes you need more control over this process. If you are using the Psych YAML processor (which has been the default since Ruby 1.9.3), then you should implement the encode_with (for serialization) or init_with (for deserialization) methods as appropriate.

For serialization, Psych will call the encode_with method of an object if it is present, passing a coder object. This object allows you to specify how the object should be represented in YAML; normally you just treat it like a hash.

For deserialization, Psych will call the init_with method if it is present on your object instead of using the default procedure described above, again passing a coder object. This time the coder will contain the information about the object’s representation in YAML.

Note that you don’t need to provide both methods; you can provide just one of them if you want. If you do provide both, the coder object you get passed in init_with will essentially be the same as the one passed to encode_with after that method has run.

As an example, consider an object that has some instance variables that are calculated from others (perhaps as an optimisation to avoid a large calculation), but that shouldn’t be serialized to the YAML.

class Foo
  def initialize(first, second)
    @first = first
    @second = second
    @calculated = expensive_calculation(@first, @second)
  end

  def encode_with(coder)
    # @calculated shouldn’t be serialized, so we just add the other two.
    # We could provide different names to use in the YAML here if we
    # wanted (as long as the same names are used in init_with).
    coder['first'] = @first
    coder['second'] = @second
  end

  def init_with(coder)
    # The YAML only contains values for @first and @second, we need to
    # recalculate @calculated so the object is valid.
    @first = coder['first']
    @second = coder['second']
    @calculated = expensive_calculation(@first, @second)
  end

  # The expensive calculation
  def expensive_calculation(a, b)
    ...
  end
end

When you dump an instance of this class to YAML, it will look something like this, without the calculated value:

--- !ruby/object:Foo
first: 1
second: 2

When you load this YAML back into Ruby, the created object will have the @calculated instance variable set.

If you wanted, you could call initialize from within init_with, but I think it would be better to keep a clear separation between initializing a new instance of the class and deserializing an existing instance from YAML. I would recommend extracting the common logic into methods that can be called from both instead.
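As a rough sketch of that recommendation, the shared logic can live in one private method that both initialize and init_with call. Rectangle, assign and the area calculation are hypothetical names chosen for illustration, and the multiplication is only a stand-in for an expensive calculation. The last lines assume that newer Psych versions need unsafe_load to revive custom classes.

```ruby
require 'yaml'

# Hypothetical example class; the names are made up for illustration.
class Rectangle
  attr_reader :width, :height, :area

  def initialize(width, height)
    assign(width, height)
  end

  def encode_with(coder)
    # Only the inputs are written; @area is left out on purpose.
    coder['width']  = @width
    coder['height'] = @height
  end

  def init_with(coder)
    assign(coder['width'], coder['height'])
  end

  private

  # Shared by initialize and init_with, so neither has to call the other.
  def assign(width, height)
    @width  = width
    @height = height
    @area   = width * height # stand-in for the expensive calculation
  end
end

yaml = YAML.dump(Rectangle.new(2, 3))
# Assumption: newer Psych versions need unsafe_load for custom classes.
restored = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)

puts yaml
puts restored.area # recalculated during init_with, not read from the YAML
```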

How to pass parameters from a YAML file to a constructor, without explicitly mentioning each one?

Having looked into the source code, I found this:

def hostnames_from(options)
  options.fetch(:hosts_shuffle_strategy, @default_hosts_shuffle_strategy).call(
    [ options[:hosts] || options[:host] || options[:hostname] || DEFAULT_HOST ].flatten
  )
end

It seems it is expecting the symbol :host, not the string 'host', which is practically the only difference between the two ways you're calling the initializer. Try:

config = HashWithIndifferentAccess.new YAML.load_file('config/rabbitmq.yml')
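If ActiveSupport is not available, a plain-Ruby sketch of the same idea is to convert the string keys to symbols before passing the hash on. FakeClient is a hypothetical stand-in for a client whose constructor reads symbol keys, and the host name is made up. The sketch assumes a Ruby version that has Hash#transform_keys, and note that it only converts the top-level keys.

```ruby
require 'yaml'

# Hypothetical stand-in for a client whose constructor reads symbol keys.
class FakeClient
  attr_reader :host

  def initialize(options = {})
    @host = options[:host] || 'localhost'
  end
end

yaml_text = <<~CONFIG
  host: rabbit.example.com
  port: 5672
CONFIG

config     = YAML.safe_load(yaml_text)       # keys are strings: "host", "port"
symbolized = config.transform_keys(&:to_sym) # keys are symbols: :host, :port

puts FakeClient.new(config).host     # string keys are ignored, default is used
puts FakeClient.new(symbolized).host # symbol keys are picked up
```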

Calling initialize when loading an object serialized with YAML

I don't think you can. Since the code you will add is really specific to the class being deserialized, you should consider adding the feature in the class. For instance, let Foo be the class you want to deserialize, you can add a class method such as:

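require 'yaml'

class Foo
  # Takes the path of a YAML file and returns the deserialized Foo.
  def self.from_yaml(path)
    # On newer Psych versions, reviving custom classes may need
    # YAML.unsafe_load_file instead.
    foo = YAML.load_file(path)
    # edit the foo object here
    foo
  end
end

myFoo = Foo.from_yaml("myFoo.yaml")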

Ruby YAML parser producing extraneous `<<` when aliasing array?

It seems that you cannot use arrays as top level keys when merging.
If you create a new rails app, you can see rails using this pattern inside config/database.yml (here's a snippet)

# config/database.yml

default: &default
  adapter: postgresql

development:
  <<: *default

In the example above it works fine, but see there's no array right below the default key.

In order for your code to work, you should wrap your array inside another key, like so:

# example.yml
foo: &foo
  list:
    - a: 1
bar:
  <<: *foo

And then you'll get:

YAML.load_file('example.yml')
#=> {"foo"=>{"list"=>[{"a"=>1}]}, "bar"=>{"list"=>[{"a"=>1}]}}

As you can see, the main problem is using arrays without wrapping them into a parent key.
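The wrapped version can also be checked without a file. This is a small sketch that assumes a Psych version whose safe_load accepts the aliases: keyword; the heredoc simply repeats the example document inline.

```ruby
require 'yaml'

yaml_text = <<~DOC
  foo: &foo
    list:
      - a: 1
  bar:
    <<: *foo
DOC

# safe_load rejects anchors and aliases unless they are explicitly allowed.
data = YAML.safe_load(yaml_text, aliases: true)

p data['foo']
p data['bar'] # the merge key copies foo's "list" entry into bar
```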

Ruby custom class to and from YAML

When you write to yaml, you don't need to first call to_yaml, just pass the object itself to YAML.dump( object )

This probably led you into other problems, because the output of to_yaml is a string, so YAML.dump actually wrote your object to the file as a string (that's why the file starts with a "--- |" line). Any code loading that file would load that data as a string.
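A short sketch of the difference, using a plain hash as made-up sample data instead of a custom class so that safe_load can read it back:

```ruby
require 'yaml'

data = { 'name' => 'test', 'values' => [1, 2] }

once  = YAML.dump(data)         # the object serialized as a mapping
twice = YAML.dump(data.to_yaml) # a YAML document wrapped inside a YAML string

puts YAML.safe_load(once).class  # the original structure comes back
puts YAML.safe_load(twice).class # only a String comes back
```

Loading the double-encoded text a second time would recover the hash, but it is simpler to dump the object once.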

Load a single object like this:

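# Assign the block's return value; a variable set inside the block would not be visible outside it.
mq_loaded = File.open( 'test_set.yaml', 'r') { |fh| YAML.load( fh ) }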

The "new" you're using is generally confusing, because new is the name of the standard constructor method in Ruby (it is not a reserved keyword, but it reads like one).

Serialise hash with dates as YAML rails

Rails 5.1/Ruby 2.4 do it correctly, since 2017-12-27 00:00:00 is not a valid YAML value.

The good thing is serialize accepts two parameters, the second one being a serializer class.

So, all you need to do is:

class ReportService < ApplicationRecord
  # irrelevant AR associations omitted
  serialize :report_params, MyYaml
  serialize :output_urls, MyYaml
end

and implement MyYaml, delegating everything except date/time values to YAML, and producing whatever you need for those.

The above is valid for any format of serialized data; it’s completely format-agnostic.
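One possible shape for such a serializer, as a sketch only: an object that responds to dump and load. The body of MyYaml below is an assumption, not the original author's implementation. It turns Time values into plain strings in a fixed format before handing the rest to YAML, and the chosen format is only an example. Newer Rails versions may expect the serializer to be passed with a coder: keyword instead of as the second positional argument.

```ruby
require 'yaml'

# Hypothetical serializer: everything except Time values goes to YAML as-is.
module MyYaml
  def self.dump(obj)
    YAML.dump(stringify_times(obj))
  end

  def self.load(yaml)
    return nil if yaml.nil?
    YAML.safe_load(yaml)
  end

  # Walks hashes and arrays and replaces Time values with formatted strings.
  def self.stringify_times(value)
    case value
    when Hash  then value.each_with_object({}) { |(k, v), h| h[k] = stringify_times(v) }
    when Array then value.map { |v| stringify_times(v) }
    when Time  then value.strftime('%Y-%m-%d %H:%M:%S')
    else value
    end
  end
end

dumped = MyYaml.dump('from' => Time.new(2017, 12, 27, 0, 0, 0), 'tags' => ['a'])
puts dumped
p MyYaml.load(dumped)
```

With this sketch the time comes back as a string, so any parsing back into a Time object would have to be added to load.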

instance_variable_set in constructor

Diagnosing the problem

What you're doing here is a fairly simple example of metaprogramming, i.e. dynamically generating code based on some input. Metaprogramming often reduces the amount of code you need to write, but makes the code harder to understand.

In this particular case, it also introduces some coupling concerns: the public interface of the class is directly related to the internal state in a way that makes it hard to change one without changing the other.

Refactoring the example

Consider a slightly longer example, where we make use of one of the instance variables:

class Foo
  def initialize(opts={})
    opts.each do |k, v|
      instance_variable_set("@#{k}", v)
    end
  end

  def greet(name)
    greeting = @greeting || "Hello"
    puts "#{greeting}, #{name}"
  end
end

Foo.new(greeting: "Hi").greet("World") # "World" is just an example argument

In this case, if someone wanted to rename the @greeting instance variable to something else, they'd possibly have a hard time understanding how to do that. It's clear that @greeting is used by the greet method, but searching the code for @greeting wouldn't help them find where it was first set. Even worse, to change this bit of internal state they'd also have to change any calls to Foo.new, because the approach we've taken ties the internal state to the public interface.

Remove the metaprogramming

Let's look at an alternative, where we just store all of the opts and treat them as state:

class Foo
  def initialize(opts={})
    @opts = opts
  end

  def greet(name)
    greeting = @opts.fetch(:greeting, "Hello")
    puts "#{greeting}, #{name}"
  end
end

Foo.new(greeting: "Hi").greet("World") # "World" is just an example argument

By removing the metaprogramming, this clarifies the situation slightly. A new team member who's looking to change this code for the first time is going to have a slightly easier time of things, because they can use editor features (like find-and-replace) to rename the internal ivars, and the relationship between the arguments passed to the initialiser and the internal state is a bit more explicit.

Reduce the coupling

We can go even further, and decouple the internals from the interface:

class Foo
  def initialize(opts={})
    @greeting = opts.fetch(:greeting, "Hello")
  end

  def greet(name)
    puts "#{@greeting}, #{name}"
  end
end

Foo.new(greeting: "Hi").greet("World") # "World" is just an example argument

In my opinion, this is the best implementation we've looked at:

  1. There's no metaprogramming, which means we can find explicit references to variables being set and used, e.g. with an editor's search features, grep, git log -S, etc.
  2. We can change the internals of the class without changing the interface, and vice-versa.
  3. By calling opts.fetch in the initialiser, we're making it clear to future readers of our class what the opts argument should look like, without making them read the whole class.

When to use metaprogramming

Metaprogramming can sometimes be useful, but those situations are rare. As a rough guide, I'd be more likely to use metaprogramming in framework or library code which typically needs to be more generic (e.g. the ActiveModel::AttributeAssignment module in Rails), and to avoid it in application code, which is typically more specific to a particular problem or domain.

Even in library code, I'd prefer the clarity of a few lines of repetition.

Ruby 2.2: passing a parameter to an SQL request string in a config YAML file

For the Ruby DB interfaces I'm familiar with, you can pass arguments to execute that will be SQL escaped and interpolated into the query at points marked by ?. So, first you want to rewrite the query to: SELECT * FROM mytable WHERE name = ?;. Then, you can call @db.execute(@config['folders']['tree']['bottom'], name). Compared to Ruby string interpolation, this also has the advantage of ensuring that any untrusted parameters are properly escaped.
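A sketch of how the pieces fit together. RecordingDb is a hypothetical stand-in for a real database handle: it only records what it is given, whereas a real driver would send the query and the bound values to the database. The config layout mirrors the keys used above, and the sample name is made up.

```ruby
require 'yaml'

# Hypothetical stand-in for a database handle; a real driver would bind and
# escape the values instead of recording them.
class RecordingDb
  attr_reader :calls

  def initialize
    @calls = []
  end

  def execute(sql, *binds)
    @calls << [sql, binds]
  end
end

config = YAML.safe_load(<<~CONFIG)
  folders:
    tree:
      bottom: "SELECT * FROM mytable WHERE name = ?;"
CONFIG

db   = RecordingDb.new
name = "O'Brien" # untrusted input is passed separately, never spliced into the SQL
db.execute(config['folders']['tree']['bottom'], name)

p db.calls.first
```

The query text in the YAML file stays constant; only the bound value changes from call to call.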


