How to Use XPath on XML Docs Having a Default Namespace

XPath processing for a document that uses a default namespace (no prefix) is the same as for a document that uses prefixes.

For namespace-qualified documents, you can supply a NamespaceContext when you execute the XPath. You will need to prefix the element names in the XPath with prefixes that the NamespaceContext knows about. These prefixes do not need to match the prefixes used in the document.

  • http://download.oracle.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html

Here is how it looks with your code:

import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        // Required: without namespace awareness the prefixed XPath below matches nothing.
        domFactory.setNamespaceAware(true);
        try {
            DocumentBuilder builder = domFactory.newDocumentBuilder();
            Document dDoc = builder.parse("E:/test.xml");

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MyNamespaceContext());
            NodeList nl = (NodeList) xPath.evaluate("/ns:root/ns:author", dDoc, XPathConstants.NODESET);
            System.out.println(nl.getLength());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static class MyNamespaceContext implements NamespaceContext {

        // Maps the prefix used in the XPath to the document's default namespace URI.
        public String getNamespaceURI(String prefix) {
            if ("ns".equals(prefix)) {
                return "http://www.mydomain.com/schema";
            }
            return null;
        }

        public String getPrefix(String namespaceURI) {
            return null;
        }

        public Iterator getPrefixes(String namespaceURI) {
            return null;
        }

    }

}

Note:
I also used the corrected XPath suggested by Dennis.

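The following also appears to work, and is closer to your original question. Note that it never calls setNamespaceAware(true), so the parser ignores the namespace declaration and the unprefixed XPath matches: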

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = domFactory.newDocumentBuilder();
            Document dDoc = builder.parse("E:/test.xml");

            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nl = (NodeList) xPath.evaluate("/root/author", dDoc, XPathConstants.NODESET);
            System.out.println(nl.getLength());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
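
To see why the two versions behave differently, here is a minimal sketch that runs the same unprefixed XPath against a namespace-aware and a non-namespace-aware parse. The inline XML is a hypothetical stand-in for E:/test.xml, and the class and method names are made up for illustration; the behaviour described is that of the JDK's built-in parser and XPath engine.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAwareComparison {

    // Hypothetical stand-in for E:/test.xml: a root with a default namespace.
    static final String XML =
        "<root xmlns=\"http://www.mydomain.com/schema\">"
      + "<author>A</author><author>B</author></root>";

    // Parses XML with the given namespace awareness and counts the nodes
    // selected by the expression (no NamespaceContext is set).
    static int count(boolean namespaceAware, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Not namespace-aware (the factory default): elements carry no namespace.
        System.out.println("not aware: " + count(false, "/root/author"));
        // Namespace-aware: the elements are in the default namespace, so an
        // unprefixed name test no longer selects them.
        System.out.println("aware:     " + count(true, "/root/author"));
    }
}
```

Relying on the non-namespace-aware mode is fragile if the document ever mixes namespaces, which is why the NamespaceContext version is usually the safer choice.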

Java XPath: Queries with default namespace xmlns

In your Namespace context, bind a prefix of your choice (e.g. df) to the namespace URI in the document

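xpath.setNamespaceContext(new NamespaceContext() {
    public String getNamespaceURI(String prefix) {
        switch (prefix) {
            case "df": return "http://xml.sap.com/2002/10/metamodel/webdynpro";
            // ...
            default: return XMLConstants.NULL_NS_URI; // unbound prefix
        }
    }
    // The two reverse lookups are part of the interface but are not needed here.
    public String getPrefix(String namespaceURI) { return null; }
    public Iterator<String> getPrefixes(String namespaceURI) { return null; }
});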

and then use that prefix in your path expressions to qualify element names e.g. /df:ModelClass/df:ModelClass.Parent/df:Core.Reference[@type = 'Model']/@package.
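
If you bind more than one prefix, a small reusable NamespaceContext may be tidier than an anonymous class. The following is only a sketch: the class name MapNamespaceContext and its bind method are hypothetical, and it deliberately skips parts of the interface contract (for example the reserved xml and xmlns prefixes and null checks on the URI arguments).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical helper: a NamespaceContext backed by a prefix-to-URI map.
class MapNamespaceContext implements NamespaceContext {

    private final Map<String, String> prefixToUri = new HashMap<>();

    // Registers a prefix and returns this, so calls can be chained.
    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        // Unbound prefixes map to the empty string, as the interface documents.
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }

    public static void main(String[] args) {
        MapNamespaceContext ctx = new MapNamespaceContext()
            .bind("df", "http://xml.sap.com/2002/10/metamodel/webdynpro");
        System.out.println(ctx.getNamespaceURI("df"));
    }
}
```

It would be passed to the evaluator with xpath.setNamespaceContext(new MapNamespaceContext().bind("df", "...")).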

Using XPath With Default Namespace in C#

First - you don't need a navigator; SelectNodes / SelectSingleNode should suffice.

You may, however, need a namespace-manager - for example:

XmlElement el = ...; //TODO
XmlNamespaceManager nsmgr = new XmlNamespaceManager(
el.OwnerDocument.NameTable);
nsmgr.AddNamespace("x", el.OwnerDocument.DocumentElement.NamespaceURI);
var nodes = el.SelectNodes(@"/x:outerelement/x:innerelement", nsmgr);

how to retrieve XML data using XPath which has a default namespace in Java?

If the problem is that you're getting zero as the length of the result nodelist, have you tried changing

final String expression = "yfs:league";

to

final String expression = "//yfs:league";

?

It appears that the context for evaluating your XPath expressions, doc, is the root node of the document. dBuilder.parse(file) returns the document root node, not the outermost element (a.k.a. document element). Remember, in XPath, a root node is not an element. So doc is not the yfs:fantasy_content element node but is its (invisible) parent.

In that context, the XPath expression "yfs:league" will only select an element that is a direct child of that root node, of which there is no yfs:league -- only yfs:fantasy_content.
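
The effect of the context node can be reproduced with a small sketch. The document and the namespace URI bound to yfs below are hypothetical placeholders (the real feed is not shown), as are the class and method names.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RootNodeContextDemo {

    // Hypothetical namespace URI and document, for illustration only.
    static final String YFS = "http://example.com/yfs";
    static final String XML =
        "<yfs:fantasy_content xmlns:yfs=\"" + YFS + "\">"
      + "<yfs:league/></yfs:fantasy_content>";

    // Counts the nodes the expression selects with the Document as context node.
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "yfs".equals(prefix) ? YFS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return YFS.equals(uri) ? "yfs" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return YFS.equals(uri)
                        ? Collections.singletonList("yfs").iterator()
                        : Collections.<String>emptyList().iterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 0: the root node's only element child is yfs:fantasy_content.
        System.out.println(count("yfs:league"));
        // 1: the descendant search reaches yfs:league.
        System.out.println(count("//yfs:league"));
        // 1: spelling out the full path from the root node also works.
        System.out.println(count("yfs:fantasy_content/yfs:league"));
    }
}
```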

How does XPath deal with XML namespaces?

Defining namespaces in XPath (recommended)

XPath itself doesn't have a way to bind a namespace prefix with a namespace. Such facilities are provided by the hosting library.

It is recommended that you use those facilities and define namespace prefixes that can then be used to qualify XML element and attribute names as necessary.


Here are some of the various mechanisms which XPath hosts provide for specifying namespace prefix bindings to namespace URIs.

(OP's original XPath, /IntuitResponse/QueryResponse/Bill/Id, has been elided to /IntuitResponse/QueryResponse.)

C#:

XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3");
XmlNodeList nodes = el.SelectNodes(@"/i:IntuitResponse/i:QueryResponse", nsmgr);

Java (SAX):

NamespaceSupport support = new NamespaceSupport();
support.pushContext();
support.declarePrefix("i", "http://schema.intuit.com/finance/v3");

Java (XPath):

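xpath.setNamespaceContext(new NamespaceContext() {
    public String getNamespaceURI(String prefix) {
        switch (prefix) {
            case "i": return "http://schema.intuit.com/finance/v3";
            // ...
            default: return XMLConstants.NULL_NS_URI; // unbound prefix
        }
    }
    // The two reverse lookups are part of the interface but are not needed here.
    public String getPrefix(String namespaceURI) { return null; }
    public Iterator<String> getPrefixes(String namespaceURI) { return null; }
});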
  • Remember to call
    DocumentBuilderFactory.setNamespaceAware(true).
  • See also:
    Java XPath: Queries with default namespace xmlns

JavaScript:

See Implementing a User Defined Namespace Resolver:

function nsResolver(prefix) {
  var ns = {
    'i' : 'http://schema.intuit.com/finance/v3'
  };
  return ns[prefix] || null;
}
document.evaluate( '/i:IntuitResponse/i:QueryResponse',
                   document, nsResolver, XPathResult.ANY_TYPE,
                   null );

Note that if the default namespace has an associated namespace prefix defined, using the nsResolver() returned by Document.createNSResolver() can obviate the need for a custom nsResolver().

Perl (LibXML):

my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs('i', 'http://schema.intuit.com/finance/v3');
my @nodes = $xc->findnodes('/i:IntuitResponse/i:QueryResponse');

Python (lxml):

from io import StringIO
from lxml import etree
f = StringIO('<IntuitResponse>...</IntuitResponse>')
doc = etree.parse(f)
r = doc.xpath('/i:IntuitResponse/i:QueryResponse',
              namespaces={'i':'http://schema.intuit.com/finance/v3'})

Python (ElementTree):

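namespaces = {'i': 'http://schema.intuit.com/finance/v3'}
# root is assumed to be the i:IntuitResponse element; ElementTree does not
# accept absolute paths on an Element, so the path is relative to root.
root.findall('i:QueryResponse', namespaces)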

Python (Scrapy):

response.selector.register_namespace('i', 'http://schema.intuit.com/finance/v3')
response.xpath('/i:IntuitResponse/i:QueryResponse').getall()

PHP:

Adapted from @Tomalak's answer using DOMDocument:

$result = new DOMDocument();
$result->loadXML($xml);

$xpath = new DOMXpath($result);
$xpath->registerNamespace("i", "http://schema.intuit.com/finance/v3");

$result = $xpath->query("/i:IntuitResponse/i:QueryResponse");

See also @IMSoP's canonical Q/A on PHP SimpleXML namespaces.

Ruby (Nokogiri):

puts doc.xpath('/i:IntuitResponse/i:QueryResponse',
'i' => "http://schema.intuit.com/finance/v3")

Note that Nokogiri supports removal of namespaces,

doc.remove_namespaces!

but see the below warnings discouraging the defeating of XML namespaces.

VBA:

xmlNS = "xmlns:i='http://schema.intuit.com/finance/v3'"
doc.setProperty "SelectionNamespaces", xmlNS
Set queryResponseElement = doc.SelectSingleNode("/i:IntuitResponse/i:QueryResponse")

VB.NET:

xmlDoc = New XmlDocument()
xmlDoc.Load("file.xml")
nsmgr = New XmlNamespaceManager(xmlDoc.NameTable)
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3")
nodes = xmlDoc.DocumentElement.SelectNodes("/i:IntuitResponse/i:QueryResponse",
                                           nsmgr)

SoapUI (doc):

declare namespace i='http://schema.intuit.com/finance/v3';
/i:IntuitResponse/i:QueryResponse

xmlstarlet:

-N i="http://schema.intuit.com/finance/v3"

XSLT:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:i="http://schema.intuit.com/finance/v3">
...

Once you've declared a namespace prefix, your XPath can be written to use it:

/i:IntuitResponse/i:QueryResponse


Defeating namespaces in XPath (not recommended)

An alternative is to write predicates that test against local-name():

/*[local-name()='IntuitResponse']/*[local-name()='QueryResponse']

Or, in XPath 2.0:

/*:IntuitResponse/*:QueryResponse

Skirting namespaces in this manner works but is not recommended because it

  • Under-specifies the full element/attribute name.

  • Fails to differentiate between element/attribute names in different
    namespaces (the very purpose of namespaces). Note that this concern could be addressed by adding an additional predicate to check the namespace URI explicitly [1]:

     /*[    namespace-uri()='http://schema.intuit.com/finance/v3'
        and local-name()='IntuitResponse']
     /*[    namespace-uri()='http://schema.intuit.com/finance/v3'
        and local-name()='QueryResponse']

    [1] Thanks to Daniel Haley for the namespace-uri() note.

  • Is excessively verbose.
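
The under-specification concern can be made concrete with a short sketch. The sample document is hypothetical: it adds a second QueryResponse element in an invented namespace (http://example.com/other) purely to show the difference. Class and constant names are likewise made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameDemo {

    static final String V3 = "http://schema.intuit.com/finance/v3";

    // Hypothetical document: two QueryResponse children in different namespaces.
    static final String XML =
        "<IntuitResponse xmlns=\"" + V3 + "\" xmlns:o=\"http://example.com/other\">"
      + "<QueryResponse/><o:QueryResponse/></IntuitResponse>";

    // Matches on local name only, ignoring namespaces.
    static final String BY_LOCAL_NAME =
        "/*[local-name()='IntuitResponse']/*[local-name()='QueryResponse']";

    // Matches on local name and namespace URI together.
    static final String BY_URI_AND_LOCAL_NAME =
        "/*[namespace-uri()='" + V3 + "' and local-name()='IntuitResponse']"
      + "/*[namespace-uri()='" + V3 + "' and local-name()='QueryResponse']";

    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 2: both QueryResponse elements match, whatever their namespace.
        System.out.println(count(BY_LOCAL_NAME));
        // 1: only the element in the v3 namespace matches.
        System.out.println(count(BY_URI_AND_LOCAL_NAME));
    }
}
```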

XPath Expression for getting default namespace

Your expression

/aaa/*[name()='bbb' and position()=1]/namespace::*

is correct and returns three namespace nodes. The problem might be in the way you are processing these nodes after they are returned. The expression should work in both XPath 1.0 and XPath 2.0, though I haven't checked it with the XPath engine built in to the JDK.

(Incidentally the notion that because you are using JDK 1.7 therefore you are using XPath 1.0 is a complete non sequitur, since there are several XPath 2.0 engines available for Java users).

To return only the namespace URI corresponding to the default namespace, use

/aaa/*[name()='bbb' and position()=1]/namespace::*[name()='']

Or indeed, since this query already assumes that the bbb element is in the default namespace, use

namespace-uri(/aaa/*[name()='bbb' and position()=1])

Default XML namespace, JDOM, and XPath

XPath 1.0 doesn't support the concept of a default namespace (XPath 2.0 does).
Any unprefixed tag is always assumed to be part of the no-name namespace.

When using XPath 1.0 you need something like this:

public static void main(String args[]) throws Exception {
    SAXBuilder builder = new SAXBuilder();
    Document d = builder.build("xpath.xml");
    XPath xpath = XPath.newInstance("x:collection/x:dvd");
    xpath.addNamespace("x", d.getRootElement().getNamespaceURI());
    System.out.println(xpath.selectNodes(d));
}

XPath and namespace specification for XML documents with an explicit default namespace

A namespace declaration without a prefix (xmlns="...") declares the default namespace. In an XML document with a default namespace, the element on which it is declared and all of its unprefixed descendants that do not fall under a different default namespace declaration are in that default namespace.

Therefore, in your case, you need to put a prefix registered for the default namespace in front of every element name in the XPath, for example:

/xmlns:doc//xmlns:b[@omegahat:status='foo']

UPDATE:

I'm not an R user, but judging from some references online, something like this may work:

getNodeSet(doc, "/ns:doc//ns:b[@omegahat:status='foo']", c(ns="http://something.org"))
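
The scoping rule for default namespaces described at the start of this answer can be checked with a small DOM sketch in Java. The sample document is hypothetical: only http://something.org comes from the snippet above, and the other URIs and element names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DefaultNamespaceScopeDemo {

    // Hypothetical document exercising the three cases of the scoping rule.
    static final String XML =
        "<doc xmlns=\"http://something.org\" xmlns:p=\"http://example.com/p\">"
      + "<a><b/></a>"                                        // unprefixed: inherit
      + "<p:c/>"                                             // prefixed: own namespace
      + "<d xmlns=\"http://example.com/other\"><e/></d>"     // new default namespace
      + "</doc>";

    // Returns the namespace URI of the first element with the given local name.
    static String namespaceOf(String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            Element element = (Element) doc.getElementsByTagNameNS("*", localName).item(0);
            return element.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"doc", "a", "b", "c", "d", "e"}) {
            System.out.println(name + " -> " + namespaceOf(name));
        }
    }
}
```

Here doc, a and b report http://something.org, c reports the URI bound to p, and d and e report the redeclared default namespace.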

xpath to select namespace from xml document

The namespace of the outermost element can be found using namespace-uri(/*).

Alternatively, the default namespace that's in scope for the outermost element is /*/namespace::*[name()=''].

These aren't the same thing. Consider

<p:root xmlns="a.ns" xmlns:p="b.ns"/>

The first expression will give you "b.ns", the second will give you "a.ns". It's not clear from your question which you want.

Note that namespaces are not attributes in the XDM data model, so you never access them using the attribute axis. @xmlns will therefore never work.
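
As a quick check of the first expression, here is a minimal Java sketch that evaluates namespace-uri(/*) against the example document above using the JDK's XPath API. The class and method names are hypothetical, and the namespace-axis expression is left out because engines differ in how they expose namespace nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class OutermostNamespaceDemo {

    // Returns the namespace URI of the document element of the given XML.
    static String outermostNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // With no return type given, evaluate() returns the string value.
            return XPathFactory.newInstance().newXPath()
                .evaluate("namespace-uri(/*)", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The element is p:root, so its namespace is the one bound to p,
        // not the default namespace that is merely in scope.
        System.out.println(outermostNamespace("<p:root xmlns=\"a.ns\" xmlns:p=\"b.ns\"/>"));
    }
}
```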

Applying xpath on xml with default namespace with XOM

Unprefixed names in XPath always mean "no namespace"; they don't respect the default namespace declaration. You need to use a prefix:

Element rootElem = new Builder().build(xml).getRootElement();
XPathContext xc = XPathContext.makeNamespaceContext(rootElem);
xc.addNamespace("ex", "http://www.edankert.com/examples/");
Nodes matchedNodes = rootElem.query("ex:cd/ex:artist", xc);
System.out.println(matchedNodes.size());

It doesn't matter that the XPath expression uses a prefix where the original document didn't, as long as the namespace URI that is bound to the prefix in the XPath namespace context is the same as the URI that is bound by xmlns in the document.
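
The same point can be sketched with the JDK's own XPath API instead of XOM: one XPath, using the prefix ex, selects the same element whether the document uses a default namespace or a different prefix for that URI. The catalog wrapper element, the sample documents and the class name are hypothetical; only the namespace URI and the cd/artist names come from the snippet above.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrefixIndependenceDemo {

    static final String NS = "http://www.edankert.com/examples/";

    // Same content twice: once with a default namespace, once with prefix p.
    static final String DEFAULT_NS_DOC =
        "<catalog xmlns=\"" + NS + "\"><cd><artist>X</artist></cd></catalog>";
    static final String PREFIXED_DOC =
        "<p:catalog xmlns:p=\"" + NS + "\"><p:cd><p:artist>X</p:artist></p:cd></p:catalog>";

    // Counts artist elements using the prefix ex, which appears in neither document.
    static int countArtists(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "ex".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return NS.equals(uri) ? "ex" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                        ? Collections.singletonList("ex").iterator()
                        : Collections.<String>emptyList().iterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(
                "/ex:catalog/ex:cd/ex:artist", doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countArtists(DEFAULT_NS_DOC));
        System.out.println(countArtists(PREFIXED_DOC));
    }
}
```

Both calls select one artist element, because matching is done on the namespace URI and local name rather than on the prefix text.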


