Rails XML parsing
There are a lot of Ruby XML parsing libraries. However, if your XML is small, you can use the ActiveSupport Hash extension .from_xml:
Hash.from_xml(x)["message"]["param"].inject({}) do |result, elem|
result[elem["name"]] = elem["value"]
result
end
# => {"msg"=>"xxxxxxxxxxxxx", "messageType"=>"SMS", "udh"=>nil, "id"=>"xxxxxxxxxxxxxx", "target"=>"xxxxxxxxxxxxx", "source"=>"xxxxxxxxxxx"}
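The fold itself can be tried without any XML. The sketch below assumes Hash.from_xml has already returned a list of param hashes shaped like the ones in the answer (the `params` data is made up for illustration) and puts the same inject next to an equivalent `to_h` form.

```ruby
# Hypothetical stand-in for Hash.from_xml(x)["message"]["param"]:
# one hash per <param> element, with "name" and "value" keys.
params = [
  { "name" => "messageType", "value" => "SMS" },
  { "name" => "udh",         "value" => nil },
  { "name" => "id",          "value" => "42" }
]

# Same fold as in the answer: build one flat hash keyed by name.
folded = params.inject({}) do |result, elem|
  result[elem["name"]] = elem["value"]
  result
end

# Equivalent, shorter form using Array#to_h with a block (Ruby 2.6+).
mapped = params.to_h { |elem| [elem["name"], elem["value"]] }

p folded
```

Either form works; the `to_h` version simply avoids having to return `result` from the block by hand.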
Parsing XML with Ruby
As @pguardiario mentioned, Nokogiri is the de facto XML and HTML parsing library. If you wanted to print out the Id and Name values in your example, here is how you would do it:
require 'nokogiri'
xml_str = <<EOF
<THING1:things type="Container">
<PART1:Id type="Property">1234</PART1:Id>
<PART1:Name type="Property">The Name</PART1:Name>
</THING1:things>
EOF
doc = Nokogiri::XML(xml_str)
thing = doc.at_xpath('//things')
puts "ID = " + thing.at_xpath('//Id').content
puts "Name = " + thing.at_xpath('//Name').content
A few notes:
- at_xpath is for matching one thing. If you know you have multiple items, you want to use xpath instead.
- Depending on your document, namespaces can be problematic, so calling doc.remove_namespaces! can help (see this answer for a brief discussion).
- You can use the css methods instead of xpath if you're more comfortable with those.
- Definitely play around with this in irb or pry to investigate methods.
Resources
- Parsing an HTML/XML document
- Getting started with Nokogiri
Update
To handle multiple items, you need a root element, and you need to remove the // in the xpath query.
require 'nokogiri'
xml_str = <<EOF
<root>
<THING1:things type="Container">
<PART1:Id type="Property">1234</PART1:Id>
<PART1:Name type="Property">The Name1</PART1:Name>
</THING1:things>
<THING2:things type="Container">
<PART2:Id type="Property">2234</PART2:Id>
<PART2:Name type="Property">The Name2</PART2:Name>
</THING2:things>
</root>
EOF
doc = Nokogiri::XML(xml_str)
doc.xpath('//things').each do |thing|
puts "ID = " + thing.at_xpath('Id').content
puts "Name = " + thing.at_xpath('Name').content
end
This will give you:
ID = 1234
Name = The Name1
ID = 2234
Name = The Name2
If you are more familiar with CSS selectors, you can use this nearly identical bit of code:
doc.css('things').each do |thing|
puts "ID = " + thing.at_css('Id').content
puts "Name = " + thing.at_css('Name').content
end
Parse xml file with nokogiri
Problem #1
In this line:
@parentN =parent.xpath('///ancestor::*/@name')
you override the previous value of @parentN.
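Problem #2
By running
<% for x in 0...@parentN.count %>
you loop over index numbers instead of over the values themselves. .count is equal to the last index + 1 (for an array with only [0], .count is 1), so the three-dot range 0...@parentN.count stops at the last index, while a two-dot range 0..@parentN.count would give you 2 values for a single-valued array. Your @parentN is assigned to an object.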
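The difference between the two range forms is easy to check in plain Ruby. A minimal sketch, using a made-up single-element array:

```ruby
values = ["only one"]   # made-up single-valued array

exclusive = (0...values.count).to_a  # three dots: the end value is excluded
inclusive = (0..values.count).to_a   # two dots: the end value is included

# Iterating the collection directly avoids index arithmetic altogether.
seen = []
values.each { |value| seen << value }

p exclusive
p inclusive
```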
Recommendation (simple)
Use a single array to hold the nested values (as a hash) rather than two variables.
#xmlController.rb
@codes = []
doc.xpath('Report/Node').each do |parent|
  # one single-key hash per parent: name => list of child texts
  @codes << { parent.xpath('@name') => parent.xpath('Node').map { |child| child.text } }
end
#show.html.erb
<% @codes.each do |code| %>
  <% code.each do |parent, children| %>
    <p> PARENT: <%= parent %> </p>
    <p> CHILDREN: <%= children.join(', ') %> </p>
  <% end %>
<% end %>
Recommendation based on comments below
The above was shown to demonstrate the simplest way to think about the problem. Now that we are ready to parse all the data in the node, we need to change our xpath and our map. The doc.xpath('Report/Node') call is used to select the parent node, and that can stay the same. We will want to set the @codes key to the actual value of the string embedded in the Node, which is not parent.xpath('@name') but actually parent.xpath('@name')[0].value. xpath returns a set of matches, since there could be multiple XML representations of nodes with the attribute 'name', and we want the first ([0]) one. The value of the name attribute is returned using the .value method.
Make a class so the nodes become objects
Your Parent node has a name and a color and your children have name, color, and rank. It looks like you have a model for Node that looks like:
class Node
include ActiveModel::Model
attr_accessor :name, :color, :rank, :children
end
I'm simplifying things by not using persistence here, but you may want to save your records to disk, and if you do, look into the slew of things ActiveRecord does on RailsGuides.
Now when we go through the xml document, we will create an array of objects rather than the hash of strings (which both happen to be objects, but I'll leave that quandary for you to check out).
Parse the Xpath to get attributes of Node Objects
A quick way to set the name and color attributes of the parent looks like this:
@node = Node.new(doc.xpath('Report/Node').first.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
OK, so maybe that wasn't all that easy. What we do is take the Enumerable result of the XPath, navigate to the first node's attributes and make a hash of attribute names (name, color, rank) and their corresponding values. Once we have the hash we pass it to our Node class' new method to instantiate (create) a node. This will pass us an object that we can use:
@node.name
#=> "Example Parent 1"
Extend the Class for children
Once we have the parent node, we can give it children, creating new nodes in an array. To facilitate this, we extend the definition of the model to include an overridden initializer (new()).
class Node
include ActiveModel::Model
attr_accessor :name, :color, :rank, :children
def initialize(*args)
self.children = []
super(*args)
end
end
Adding children
@node.children << Node.new(doc.xpath('Report/Node').first.xpath('Node').first.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
We can automate this process now that we know how to create a Node object using .first and a child of it using .first with the previous enumeration.
doc.xpath('Report/Node').each do |parent|
  node = Node.new(parent.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
  node.children = parent.xpath('Node').map do |child|
    Node.new(child.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
  end
end
Ugly controller code
Move it to the model
But wait! That isn't very DRY! Let's move the logic that hurts our eyes to look at into the model to make it easier to work with.
class Node
include ActiveModel::Model
attr_accessor :name, :color, :rank, :children
def initialize(*args)
self.children = []
super(*args)
end
def self.new_from_xpath(xml_node)
self.new(xml_node.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
end
end
Final controller
Now the controller looks like this:
@nodes = []
doc.xpath('Report/Node').each do |parent|
node = Node.new_from_xpath(parent)
node.children = parent.xpath('Node').map do |child|
Node.new_from_xpath(child)
end
@nodes << node
end
Using this in the view
In the view you can use the @nodes like this:
<% for @node in @nodes %>
Parent: <%= @node.name %>
Children: <% for @child in @node.children %>
<%= @child.name %> is <%= @child.color %>
<% end %>
<% end %>
Rails nokogiri parse XML file
You're on the right track. parts = xml_doc.xpath('/root/rows/row') gives you back a NodeSet, i.e. a list of the <row> elements.
You can loop through these using each or use row indexes like parts[0], parts[1] to access specific rows. You can then get the values of child nodes using xpath on the individual rows.
e.g. you could build a list of the AnalogueCode for each part with:
codes = []
parts.each do |row|
codes << row.xpath('AnalogueCode').text
end
Looking at the full example of the XML you're processing, there are 2 issues preventing your XPath from matching:
- the <root> tag isn't actually the root element of the XML, so /root/.. doesn't match
- the XML is using namespaces, so you need to include these in your XPaths
So there are a couple of possible solutions:
1. use CSS selectors rather than XPaths (i.e. use search) as suggested by the Tin Man
2. after xml_doc = Nokogiri::XML(response.body) do xml_doc.remove_namespaces! and then use parts = xml_doc.xpath('//root/rows/row') where the double slash is XPath syntax to locate the root node anywhere in the document
3. specify the namespaces, e.g.:
xml_doc = Nokogiri::XML(response.body)
ns = xml_doc.collect_namespaces
parts = xml_doc.xpath('//xmlns:rows/xmlns:row', ns)
codes = []
parts.each do |row|
codes << row.xpath('xmlns:AnalogueCode', ns).text
end
I would go with 1. or 2. :-)
Ruby on Rails error when parsing XML
Depending on the versions of Ruby and Rails you use (&. needs Ruby 2.3 or newer; try comes from ActiveSupport), you can change the following line to one of the options below it:
action.file_name = [doc.xpath("//field[@index='103']").first.content]
Updating to:
action.file_name = [doc.xpath("//field[@index='103']").first&.content]
# or
action.file_name = [doc.xpath("//field[@index='103']").first.try(:content)]
Both of these options protect against NilClass errors. If you don't necessarily need a value for action.file_name, this will fix the error.
Otherwise, it's a case of ensuring the selector (doc.xpath("//field[@index='103']")) is definitely correct (it seems to be, as you're not getting an error calling first) and, if so, that there is definitely data in the array it returns.
Hope that helps - let me know if you've any questions.
How to use XML with Ruby on Rails
You can use from_xml to parse XML data into a hash:
xml = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<hash>
<foo type="integer">1</foo>
<bar type="integer">2</bar>
</hash>
XML
hash = Hash.from_xml(xml)
# => {"hash"=>{"foo"=>1, "bar"=>2}}
Reading from a local file:
# reading the file content into a variable
xml_file = File.read("my_xml_file.xml")
hash = Hash.from_xml(xml_file)
Reference:
https://apidock.com/rails/v4.2.7/Hash/from_xml/class
Xml parsing in rails
You can use Nokogiri here.
Suppose this is your error.xml:
<?xml version="1.0" encoding="UTF-8"?>
<responseParam>
<RESULT>-1</RESULT>
<ERROR_CODE>509</ERROR_CODE>
</responseParam>
You can do something like:
@doc = Nokogiri::XML(File.open("error.xml"))
@doc.xpath("//ERROR_CODE")
will give you something like:
# => ["<ERROR_CODE>509</ERROR_CODE>"]
The Node methods xpath and css actually return a NodeSet, which acts very much like an array, and contains matching nodes from the document.