Undefined namespace prefix in Nokogiri and XPath
I'm not sure why, but it seems that you have to drop the namespace prefix to get the node:
xmlfeed.at_xpath("//totalresults")
Also note that I added the double forward slash, which scopes the search over the whole document (it won't work without it).
UPDATE:
Based on this answer: How do I get Nokogiri to understand my namespaces? I'd guess that the namespace prefix used in openSearch:totalResults is not correctly declared (as an xmlns:openSearch attribute) on the root node of the document, so Nokogiri just ignores it. That would explain why the selector above works but the namespaced one doesn't.
Avoiding Nokogiri::XML::XPath::SyntaxError: ERROR: Undefined namespace prefix
I ended up solving the problem by editing the XML file and adding the namespaces in the root. Here is an example:
temp = Nokogiri::XML(@document_xml)
temp.root['xmlns:w'] = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
@doc = Nokogiri::XML(temp.to_xml(:save_with => Nokogiri::XML::Node::SaveOptions::AS_XML))
Nokogiri/Xpath namespace query
All namespaces need to be registered when parsing. Nokogiri automatically registers namespaces on the root node. Any namespaces that are not on the root node you have to register yourself. This should work:
puts doc.xpath('//dc:title', 'dc' => "URI")
Alternately, you can remove namespaces altogether. Only do this if you are certain there will be no conflicting node names.
doc.remove_namespaces!
puts doc.xpath('//title')
How do I use xpath on nodes with a prefix but without a namespace?
The problem is that the namespace is not properly defined in the XML document. As a result, Nokogiri sees the node names as being "a:root" instead of "a" being a namespace and "root" being the node name:
xml = %Q{
<?xml version="1.0" encoding="UTF-8"?>
<a:root>
<a:thing>stuff0</a:thing>
<a:thing>stuff1</a:thing>
</a:root>
}
doc = Nokogiri::XML(xml)
puts doc.at_xpath('*').node_name
#=> "a:root"
puts doc.at_xpath('*').namespace
#=> (prints an empty line, because namespace is nil)
Solution 1 - Specify node name with colon
One solution is to search for nodes with the name "a:thing". You cannot use //a:thing, since XPath will treat the "a" as a namespace prefix. You can get around this with //*[name()="a:thing"]:
xml = %Q{
<?xml version="1.0" encoding="UTF-8"?>
<a:root>
<a:thing>stuff0</a:thing>
<a:thing>stuff1</a:thing>
</a:root>
}
doc = Nokogiri::XML(xml)
things = doc.xpath('//*[name()="a:thing"]')
puts things
#=> <a:thing>stuff0</a:thing>
#=> <a:thing>stuff1</a:thing>
Solution 2 - Modify the XML document to define the namespace
An alternative solution is to modify the XML file that you get to properly define the namespace. The document will then behave with namespaces as expected:
xml = %Q{
<?xml version="1.0" encoding="UTF-8"?>
<a:root>
<a:thing>stuff0</a:thing>
<a:thing>stuff1</a:thing>
</a:root>
}
xml.gsub!('<a:root>', '<a:root xmlns:a="foo">')
doc = Nokogiri::XML(xml)
things = doc.xpath('//a:thing')
puts things
#=> <a:thing>stuff0</a:thing>
#=> <a:thing>stuff1</a:thing>
Syntax error about XPath in Nokogiri, when combining namespace and node()
Unlike with element names, you don't need a namespace prefix to match by node(). The following will return all nodes in any namespace just fine:
result = xml_doc.xpath("//node()")
There are several types of nodes in XPath: text nodes, comment nodes, element nodes, and so on. node() is a node test that simply returns true for any node type whatsoever. Compare it to text(), another node test, which returns true only for text nodes. (See "w3.org > XPath > Node Tests")
In my understanding, the notions of local name and namespace only exist for named nodes such as elements (and attributes), so using a namespace prefix along with the node() test simply doesn't make sense.
If you meant to select all elements in a specific namespace, use * instead of node():
result = xml_doc.xpath("//x:*", 'x' => 'www.example.com')
How do I get Nokogiri to understand my namespaces?
It doesn't look like the namespaces in this document are correctly declared: there should be xmlns:samlp and xmlns:saml attributes on the root node. In cases like this, Nokogiri essentially ignores the namespaces (as it can't map them to URIs or URNs), so your XPath works if you remove them, i.e.
doc.xpath(XPATH_QUERY)
Splunk-client (with Nokogiri) giving Undefined Namespace Prefix
I found out the issue: the Splunk client wasn't authenticating properly, so search was actually a broken SplunkJob object (with a nil username and authentication key). It's strange that no error was raised until the wait command, but upon inspecting the search object, one of the fields stated that the object was malformed.
Remove nokogiri attribute based on namespace prefix
Node objects have a remove method that drops them from the tree, so you can write something like this:
require 'nokogiri'
doc = Nokogiri::XML(DATA)
puts '--- Before'
puts doc.to_s
doc.traverse do |node|
  next unless node.respond_to? :attributes
  node.attributes.each do |key, val|
    val.remove if val&.namespace&.prefix == 'opf'
  end
end
puts
puts '--- After'
puts doc.to_s
__END__
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="iden" opf:scheme="ISBN">xxxx</dc:identifier>
<dc:creator opf:role="aut" opf:file-as="Name">xxxx</dc:creator>
<dc:date opf:event="publication">xxxx</dc:date>
<dc:publisher>xxxx</dc:publisher>
<meta name="cover" content="x"/>
</metadata>
And see the following output:
➜ ~ ruby test.rb
--- Before
<?xml version="1.0"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="iden" opf:scheme="ISBN">xxxx</dc:identifier>
<dc:creator opf:role="aut" opf:file-as="Name">xxxx</dc:creator>
<dc:date opf:event="publication">xxxx</dc:date>
<dc:publisher>xxxx</dc:publisher>
<meta name="cover" content="x"/>
</metadata>
--- After
<?xml version="1.0"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="iden">xxxx</dc:identifier>
<dc:creator>xxxx</dc:creator>
<dc:date>xxxx</dc:date>
<dc:publisher>xxxx</dc:publisher>
<meta name="cover" content="x"/>
</metadata>
Note: If the Ruby version you are using doesn't support &. (the safe navigation operator, added in Ruby 2.3), you'll need to handle the namespace being potentially nil yourself.