How to get the value of an attribute using Nokogiri
It's idiomatic to access attribute values by treating the node as a hash:
require 'nokogiri'
doc = Nokogiri::HTML('<div class="foo"></div>')
doc.at('div')['class'] # => "foo"
And, just like a hash, you can assign to it too:
doc.at('div')['class'] = 'bar'
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body><div class="bar"></div></body></html>
See [] and []= under "Modifying Nodes and Attributes" in the documentation.
Nokogiri - Get attributes?
Meditate on this:
require 'nokogiri'
doc = Nokogiri::XML("<root attr=1></root>")
doc.errors # => [#<Nokogiri::XML::SyntaxError: 1:12: FATAL: AttValue: " or ' expected>, #<Nokogiri::XML::SyntaxError: 1:12: FATAL: attributes construct error>, #<Nokogiri::XML::SyntaxError: 1:12: FATAL: Couldn't find end of Start Tag root line 1>, #<Nokogiri::XML::SyntaxError: 1:12: FATAL: Extra content at the end of the document>]
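doc.errors is your friend. Here the attribute value is not quoted, so the XML is malformed and the parser reports errors instead of giving you the attribute.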
Nokogiri to Find All Data Attributes Using a Wildcard
You can search for img tags with an attribute that starts with "data-" using the following:
//img[@*[starts-with(name(),'data-')]]
To break this down:
- // - Anywhere in the document
- img - img tag
- @* - All Attributes
- starts-with(name(),'data-') - Attribute's name starts with "data-"
Example:
require 'nokogiri'
doc = Nokogiri::HTML(<<-END_OF_HTML)
<img src='' />
<img data-method='a' src= ''>
<img data-info='b' src= ''>
<img data-type='c' src= ''>
<img src= ''>
END_OF_HTML
imgs = doc.xpath("//img[@*[starts-with(name(),'data-')]]")
puts imgs
# <img data-method="a" src="">
# <img data-info="b" src="">
# <img data-type="c" src="">
Or, if you prefer to loop over the img tags yourself:
doc.css('img').select do |img|
img.xpath(".//@*[starts-with(name(),'data-')]").any?
end
#[#<Nokogiri::XML::Element:0x384 name="img" attributes=[#<Nokogiri::XML::Attr:0x35c name="data-method" value="a">, #<Nokogiri::XML::Attr:0x370 name="src">]>,
# #<Nokogiri::XML::Element:0x3c0 name="img" attributes=[#<Nokogiri::XML::Attr:0x398 name="data-info" value="b">, #<Nokogiri::XML::Attr:0x3ac name="src">]>,
# #<Nokogiri::XML::Element:0x3fc name="img" attributes=[#<Nokogiri::XML::Attr:0x3d4 name="data-type" value="c">, #<Nokogiri::XML::Attr:0x3e8 name="src">]>]
UPDATE: To remove the attributes:
doc.css('img').each do |img|
img.xpath(".//@*[starts-with(name(),'data-')]").each(&:remove)
end
puts doc.to_s
# <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# <html>
# <body>
#   <img src="">
#   <img src="">
#   <img src="">
#   <img src="">
#   <img src="">
# </body>
# </html>
This can be simplified to doc.xpath("//img/@*[starts-with(name(),'data-')]").each(&:remove)
How can I get some attributes when using Nokogiri
You can use a CSS selector:
result.css("attr[name='English']").children.to_s
This will give you "B".
How to get an attribute of the children of a Nokogiri nodeset
You can use @ in the XPath expression to select an attribute and get its value:
file.xpath('//w:ins/w:r/@w:rsidR|//w:del/w:r/@w:rsidDel').each do |id|
puts id
end
Note that the w:r element inside the w:del element doesn't have a w:rsidR attribute, only a w:rsidDel attribute.
How to get attribute values using Nokogiri
To select all attributes of an element that is selected using the XPath expression someExpr, you need to evaluate a new XPath expression:
someExpr/@*
where someExpr must be substituted with the real XPath expression used to select the particular element.
This selects all attributes of all elements (we assume there is just one) that are selected by the XPath expression someExpr.
For example, if the element we want is selected by:
/a/b/c
then all of its attributes are selected by:
/a/b/c/@*