Why does this Nokogiri XPath have a null return?
The <tbody>
tag is an optional tag which is implicit if it is omitted. This means the <tbody>
tags are inserted automatically by the browser when not present. They are not in the source code in your example, so nokogiri doesn't know about them.
Firebug uses the generated DOM, which does contains the tbody
elements, so the statement does match inside a browser.
Remove both the tbody
selectors and you should be fine.
Nokogiri returning nothing when doing an XPath search
The solution to your problem is to get rid of the tbody
parts in your xPath, as suggested in "Why does this Nokogiri XPath have a null return?".
Firefox generated tbody
elements for you, which is why they appear in Firefox's xPath, but they are not part of the original page source.
Try the following:
response = RestClient.get "http://www.buenosaires.gob.ar/areas/seguridad_justicia/seguridad_urbana/estaciones_servicio/buscador.php?&pag=0"
doc = Nokogiri::HTML(response.body,nil,'utf-8')
doc.remove_namespaces!
table = doc.xpath(".//*[@id='contsinderecha']/form/table/tr[4]/td/table/tr[5]/td/table")
XPath in Nokogiri returning empty array [] whereas I am expecting to have results
Include the namespace information when calling xpath
:
doc.xpath("//x:Meta", "x" => "test:com.test.search")
Why does .css work with this Nokogiri object but not XPath?
Assuming case_insensitive_equals
does what its name suggests, it is because the class
attribute isn’t equal to post-head
(case insensitively or not), but it does contain it. XPath treats class
attributes as plain strings, it doesn’t split them and handle the classes individually as CSS does.
A simple XPath that would work would be:
doc.xpath('//hgroup[contains(@class, "post-head")]//h2')
(I’ve removed the custom function, you will need to write your own to do this case insensitively.)
This isn’t quite the same though, as it will also match classes such as not-post-head
. A more complete XPath would be something like this:
doc.xpath('//hgroup[contains(concat(" ", normalize-space(@class), " "), " post-head ")]//h2')
nokogiri: why is this an invalid xpath?
normalize-space is a function. You can't use it there.
You need a node-set.
maybe you mean
//br/preceding-sibling::*
or you could use normalize-space in a predicate, inside square brackets. Think of the predicate as a filter or selector on the node-set. So you can do this:
//br/preceding-sibling::*[normalize-space()='Fred']
In English that translates to "all elements preceding <br>
in the document, and for which the (normalized) text is 'Fred' ". In this document:
<html>
<p>
<h2>Fred</h2>
<br/>
</p>
<table>
<tr>
<td>
<br/>
</td>
</tr>
</table>
</html>
...the xpath expression selects the <h2>
node.
I figured this out with the free XpathVisualizer tool available on codeplex.
Why does parsing HTML with Nokogiri return a blank?
To answer the question asked, link returns "", because it's an empty NodeSet. In other words, Nokogiri didn't find what you were looking for. A NodeSet behaves like an Array, so when you try to puts
an empty array you get ""
.
Because it's a NodeSet you should iterate over it, as you would an array. (The same is true of your doc.css
, which would also return a NodeSet.)
The reason it's empty is because Nokogiri can't find what you want. You're looking for the contents of aa
which are:
"audio59779184_153635497_-28469067_16663"
Substituting that into "//input[@id='#{aa}']"
gives:
"//input[@id='audio59779184_153635497_-28469067_16663']"
but should be:
"//input[@id='audio_info59779184_153635497_-28469067_16663']"
Searching for that finds content:
doc.search("//input[@id='audio_info59779184_153635497_-28469067_16663']").size => 1
XPath : Is it possible to return an empty String if elements are not present
In XPath 2.0:
parent/string(definition)
In XPath 1.0 - oh dear, why would anyone want to use XPath 1.0? ;-(
Nokogiri get xpath from Nokogiri::XML::Element
rc is not an element-it's an array of matching elements:
results = rc.map do |node|
Nokogiri::CSS.xpath_for node.css_path
end
p results
Or, if you know there is only one matching element:
xpath = Nokogiri::CSS.xpath_for rc[0].css_path
Note that xpath_for returns an array, so you will need to extract the first element of the array:
xpath.first
Related Topics
M Hartl's Ruby on Rails Tutorial Chapter 5 Custom Title on Home Page
Remove Substring from the String
Parse Command Line Arguments in a Ruby Script
How to Remove Carriage Returns with Ruby
What Is the Use of "#!/Usr/Local/Bin/Ruby -W" at the Start of a Ruby Program
Mongo - Ruby Connection Problem
How to Run Selenium (Used Through Capybara) at a Lower Speed
Where to Put Common Code Found in Multiple Models
How to Use Ruby's Readlines.Grep for Utf-16 Files
How to Resolve "Gpg: Command Not Found" Error During Rvm Installation
Vim Command-T Plugin Error: Could Not Load the C Extension
Ruby - Can't Modify Frozen String (Typeerror)