XML Parsing - Read a Simple XML File and Retrieve Values

An easy way to parse XML in C# is to use LINQ to XML.

For example, suppose you have the following XML file:

<library>
<track id="1" genre="Rap" time="3:24">
<name>Who We Be RMX (feat. 2Pac)</name>
<artist>DMX</artist>
<album>The Dogz Mixtape: Who's Next?!</album>
</track>
<track id="2" genre="Rap" time="5:06">
<name>Angel (ft. Regina Bell)</name>
<artist>DMX</artist>
<album>...And Then There Was X</album>
</track>
<track id="3" genre="Break Beat" time="6:16">
<name>Dreaming Your Dreams</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
<track id="4" genre="Break Beat" time="9:38">
<name>Finished Symphony</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
</library>

For reading this file, you can use the following code:

// Requires: using System; using System.Xml.Linq;
public void Read(string fileName)
{
    XDocument doc = XDocument.Load(fileName);

    foreach (XElement el in doc.Root.Elements())
    {
        // The cast yields null instead of throwing when "id" is missing.
        Console.WriteLine("{0} {1}", el.Name, (string)el.Attribute("id"));
        Console.WriteLine(" Attributes:");
        foreach (XAttribute attr in el.Attributes())
            Console.WriteLine(" {0}", attr);
        Console.WriteLine(" Elements:");

        foreach (XElement element in el.Elements())
            Console.WriteLine(" {0}: {1}", element.Name, element.Value);
    }
}
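
For comparison, the same traversal can be sketched in Python with the standard-library xml.etree.ElementTree module. This is only an illustration and not part of the original answer: the function name read_tracks and the inline LIBRARY_XML string (a shortened copy of the file above) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample file above, inlined so the sketch runs as-is.
LIBRARY_XML = """
<library>
  <track id="1" genre="Rap" time="3:24">
    <name>Who We Be RMX (feat. 2Pac)</name>
    <artist>DMX</artist>
    <album>The Dogz Mixtape: Who's Next?!</album>
  </track>
  <track id="3" genre="Break Beat" time="6:16">
    <name>Dreaming Your Dreams</name>
    <artist>Hybrid</artist>
    <album>Wide Angle</album>
  </track>
</library>
"""


def read_tracks(xml_text):
    """Return one dict per child of the root: attributes plus child element text."""
    root = ET.fromstring(xml_text)
    tracks = []
    for el in root:                   # direct children of <library>
        record = dict(el.attrib)      # id, genre, time
        for child in el:              # name, artist, album
            record[child.tag] = child.text
        tracks.append(record)
    return tracks


for track in read_tracks(LIBRARY_XML):
    print(track["id"], track["name"], "-", track["artist"])
```

Returning plain dicts keeps the sketch short; it works here only because no attribute shares a name with a child element.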

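Parsing an XML file in Java to fetch the element matching a given tag value

XPath is an expressive query language for selecting elements in an XML document. For example, the following expression selects the age element of the outer element whose name child is "ghi":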

/root/outer[name = "ghi"]/age

This article https://www.baeldung.com/java-xpath provides a pretty good overview and explanation of how to apply an XPath in Java.

Adjusting one of their code samples for your XPath:

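import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

String name = "ghi";

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
// getFile() comes from the article's sample; pass your own File or InputStream here
Document xmlDocument = builder.parse(this.getFile());
XPath xPath = XPathFactory.newInstance().newXPath();

String expression = "/root/outer[name='" + name + "']/age";
Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);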
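
One caveat with building the expression by string concatenation: a name value that contains a quote character will break or change the expression. The Python sketch below performs the same lookup with ElementTree's limited XPath support and simply rejects such values. The document layout (root/outer/name/age) is inferred from the XPath above, and the sample values and the age_for helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped like the XPath /root/outer[name = "ghi"]/age.
SAMPLE = """
<root>
  <outer><name>abc</name><age>20</age></outer>
  <outer><name>ghi</name><age>30</age></outer>
</root>
"""


def age_for(xml_text, name):
    """Return the <age> text of the first <outer> whose <name> equals name, or None."""
    if "'" in name:
        # The value is wrapped in single quotes below, so refuse anything
        # that could break out of the string literal.
        raise ValueError("name must not contain a single quote")
    root = ET.fromstring(xml_text)
    # Paths are relative to the root element, so the leading "/root" is omitted.
    node = root.find("outer[name='{}']/age".format(name))
    return node.text if node is not None else None


print(age_for(SAMPLE, "ghi"))
```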

Parsing XML file using C#?

You can use XPath to find all nodes that match, for example:

XmlNodeList matches = xmlDoc.SelectNodes("proj[proj_title='heat_run']");

matches will contain all proj nodes that match the criteria. Learn more about XPath: http://www.w3schools.com/xsl/xpath_syntax.asp

See also the MSDN documentation on SelectNodes.
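
One thing to watch with an expression like proj[proj_title='heat_run']: it is a relative path, so it only matches proj elements that are direct children of the node the query is run on. The Python sketch below illustrates the difference between that child search and a descendant search (.//proj[...]) using ElementTree. The projects, group, and owner elements and all values are hypothetical, since the original file is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one <proj> directly under the root, two nested deeper.
SAMPLE = """
<projects>
  <proj><proj_title>heat_run</proj_title><owner>a</owner></proj>
  <group>
    <proj><proj_title>heat_run</proj_title><owner>b</owner></proj>
    <proj><proj_title>cold_run</proj_title><owner>c</owner></proj>
  </group>
</projects>
"""

root = ET.fromstring(SAMPLE)

# Relative path: only <proj> elements that are direct children of the root.
children = root.findall("proj[proj_title='heat_run']")

# Descendant path: <proj> elements at any depth below the root.
anywhere = root.findall(".//proj[proj_title='heat_run']")

print([p.findtext("owner") for p in children])
print([p.findtext("owner") for p in anywhere])
```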

Parsing XML file in Swift (Xcode v 7.0.1) and retrieving values from dictionary


NOTE: I've put the whole thing in a gist which you can copy and paste into a playground.


Let's look at a simple example to get a start:

let xml = "<coord2 count=\"3\">"
+ "<markers>"
+ "<marker>"
+ "<item>marker1</item>"
+ "</marker>"
+ "<marker>"
+ "<item>marker2</item>"
+ "<lat>36</lat>"
+ "</marker>"
+ "</markers>"
+ "</coord2>"

This is a simplified example: a Marker can have an item name (string) and a lat value (int). A Coord2 has an array of Markers and a count (int) attribute.

To parse the above with custom classes, here's one approach.

First, create a ParserBase class that does some groundwork for us, namely accumulating foundCharacters so that subclasses can use it easily.
More importantly, it also has a parent property that holds a reference to the parent container object; this is what lets a child object hand parsing responsibility back once it is done.

// Simple base class that is used to consume foundCharacters
// via the parser

class ParserBase : NSObject, NSXMLParserDelegate {

var currentElement:String = ""
var foundCharacters = ""
weak var parent:ParserBase? = nil

func parser(parser: NSXMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String]) {

currentElement = elementName
}

func parser(parser: NSXMLParser, foundCharacters string: String) {
self.foundCharacters += string
}

}

Since coord2 is our root tag, we will create a class that maps to that tag. It represents the root object, has an array of Markers and a count property, and is also the root delegate object for the NSXMLParser.

// Represents a coord2 tag
// It has a count attribute
// and a collection of markers

class Coord2 : ParserBase {

var count = 0
var markers = [Marker]()

override func parser(parser: NSXMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String]) {

print("processing <\(elementName)> tag from Coord")

if elementName == "coord2" {

// if we are processing a coord2 tag, we are at the root
// of this example
// extract the count value and set it
// bind the attribute first so a missing count does not crash
if let countText = attributeDict["count"], let c = Int(countText) {
self.count = c
}
}

// if we found a marker tag, delegate further responsibility
// to parsing to a new instance of Marker

if elementName == "marker" {
let marker = Marker()
self.markers.append(marker)

// push responsibility
parser.delegate = marker

// let marker know who we are
// so that once marker is done XML processing
// it can return parsing responsibility back
marker.parent = self
}
}

}

The Marker class is as follows:

class Marker : ParserBase {

var item = ""
var lat = 0

func parser(parser: NSXMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {

print("processing <\(elementName)> tag from Marker")

// if we finished an item tag, the ParserBase superclass
// has accumulated the found characters,
// so just assign that to our item variable
if elementName == "item" {
self.item = foundCharacters
}

// similarly for lat tags
// convert the lat to an int for example
else if elementName == "lat" {
if let l = Int(foundCharacters) {
self.lat = l
}
}

// if we reached the </marker> tag, we do not
// have anything further to do, so delegate
// parsing responsibility to parent
else if elementName == "marker" {
parser.delegate = self.parent
}

// reset found characters
foundCharacters = ""
}

}

Now on to parsing, extracting info, and printing something.

let xmlData = xml.dataUsingEncoding(NSUTF8StringEncoding)!
let parser = NSXMLParser(data: xmlData)

let coord = Coord2()
parser.delegate = coord

parser.parse()

print("coord has a count attribute of \(coord.count)")
print("coord has \(coord.markers.count) markers")

for marker in coord.markers {
print("marker item = \(marker.item) and lat = \(marker.lat)")
}

which outputs the following (the "processing <...>" trace lines printed by the delegates are omitted):

coord has a count attribute of 3
coord has 2 markers
marker item = marker1 and lat = 0
marker item = marker2 and lat = 36
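
For readers more familiar with Python, roughly the same event-driven idea can be sketched with the standard-library xml.sax module. Note that this is not a port of the delegate hand-off used above: it keeps a single handler and tracks the marker currently being filled in a field instead. The class name Coord2Handler and the use of plain dicts for markers are assumptions made for the example.

```python
import xml.sax


class Coord2Handler(xml.sax.ContentHandler):
    """Collects the count attribute and one dict per <marker>."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.markers = []
        self._chars = []       # text chunks seen since the last tag
        self._current = None   # marker being filled in, if any

    def startElement(self, name, attrs):
        self._chars = []
        if name == "coord2":
            self.count = int(attrs.get("count", "0"))
        elif name == "marker":
            self._current = {"item": "", "lat": 0}
            self.markers.append(self._current)

    def characters(self, content):
        # May be called several times for one text node, so accumulate.
        self._chars.append(content)

    def endElement(self, name):
        text = "".join(self._chars).strip()
        if self._current is not None:
            if name == "item":
                self._current["item"] = text
            elif name == "lat":
                self._current["lat"] = int(text)
            elif name == "marker":
                self._current = None
        self._chars = []


XML = (
    '<coord2 count="3">'
    "<markers>"
    "<marker><item>marker1</item></marker>"
    "<marker><item>marker2</item><lat>36</lat></marker>"
    "</markers>"
    "</coord2>"
)

handler = Coord2Handler()
xml.sax.parseString(XML.encode("utf-8"), handler)
print(handler.count, handler.markers)
```

A single stateful handler is simpler for a shallow document like this one; the hand-off approach tends to pay off when many nested element types each need their own parsing logic.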

Parsing XML and checking values against an element's value

If you print the size of the NodeList returned by current.childNodes, you will get 5. The first, third, and last nodes are whitespace-only text nodes; your data is in nodes 2 and 4. Therefore, you can replace your code with the following, which looks up the password node by name, and it will work.

val xmlFile = File("Users.xml")
val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile)
var list = doc.getElementsByTagName("user")
for (i in 0 until list.length) {
var current = list.item(i)
if (current.attributes.getNamedItem("id").nodeValue == username) {
println(current.childNodes.length)

for (j in 0 until current.childNodes.length) {
if (current.childNodes.item(j).nodeName == "password") {
return current.childNodes.item(j).textContent == password
}
}
}
}
return false

That said, I would be tempted to use a better XML parser, because this code is far more verbose than needed.

For example, with DOM4J, you can replace all of the above code with:

val doc = SAXReader().read(File("users.xml"))
val users = doc.rootElement.elements("user")
for (user in users) {
if (user.attribute("id").value == username) {
return user.element("password").text == password
}

}
return false
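
The whitespace-node behaviour described at the start of this answer is not specific to Kotlin or Java; any DOM parser that preserves whitespace shows it. The Python sketch below reproduces it with the standard-library minidom. The Users.xml content is hypothetical: only the user element with an id attribute and the password child come from the code above, and the email element is an assumption.

```python
import xml.dom.minidom as minidom

# Hypothetical Users.xml content; <email> is an assumed sibling of <password>.
USERS_XML = """<users>
  <user id="alice">
    <email>alice@example.com</email>
    <password>secret</password>
  </user>
</users>"""

doc = minidom.parseString(USERS_XML)
user = doc.getElementsByTagName("user")[0]

# Indentation and line breaks between tags become text nodes.
kinds = [node.nodeName for node in user.childNodes]
print(len(user.childNodes), kinds)

# Keeping only element nodes skips the whitespace.
elements = [n for n in user.childNodes if n.nodeType == n.ELEMENT_NODE]
print([e.tagName for e in elements])
```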

Reading XML file and fetching its attributes value in Python

Here's an lxml snippet that extracts an attribute as well as element text (your question was a little ambiguous about which one you needed, so I'm including both):

from lxml import etree
doc = etree.parse(filename)

memoryElem = doc.find('memory')
print(memoryElem.text)  # element text
print(memoryElem.get('unit'))  # attribute

You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good alternative. Here's the equivalent code using minidom; judge for yourself which is nicer:

import xml.dom.minidom as minidom
doc = minidom.parse(filename)

memoryElem = doc.getElementsByTagName('memory')[0]
print(''.join(node.data for node in memoryElem.childNodes))
print(memoryElem.getAttribute('unit'))

lxml seems like the winner to me.
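
If installing lxml is not an option, the standard library's xml.etree.ElementTree offers the same find, text, and get calls used in the lxml snippet. A minimal sketch follows; the sample XML is hypothetical, since the original question's file is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; only the <memory unit="..."> element is taken from the answer.
SAMPLE = "<domain><memory unit='KiB'>1048576</memory></domain>"

root = ET.fromstring(SAMPLE)
memory_elem = root.find("memory")

print(memory_elem.text)         # element text
print(memory_elem.get("unit"))  # attribute
```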


