PHP's SimpleXML: How to Use Colons in Names

Simple XML - Dealing With Colons In Nodes

The solution is explained in this nice article: to access XML elements that belong to a namespace, you need the children() method. This code snippet is quoted from the article:

$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
foreach ($feed->item as $item) {
    $ns_dc = $item->children('http://purl.org/dc/elements/1.1/');
    echo $ns_dc->date;
}
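A self-contained variation on the same idea, as a minimal sketch: it parses an inline sample instead of fetching the remote feed. The sample XML and its values are made up for illustration; only the Dublin Core namespace URI comes from the snippet above.

```php
<?php
// Hypothetical sample standing in for the remote feed.
$rss = <<<XML
<rss xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>First post</title>
      <dc:date>2010-02-01T06:00:00Z</dc:date>
    </item>
    <item>
      <title>Second post</title>
      <dc:date>2010-02-02T06:00:00Z</dc:date>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

$dates = [];
foreach ($feed->channel->item as $item) {
    // Switch to the Dublin Core namespace to reach <dc:date>.
    $ns_dc = $item->children('http://purl.org/dc/elements/1.1/');
    $dates[] = (string) $ns_dc->date;
}

echo implode("\n", $dates), "\n";
```

Casting to string matters when the value is stored: without the cast, $ns_dc->date is still a SimpleXMLElement object.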

PHP library for parsing XML with colons in tag names?

Say you have some XML like this:

<xhtml:div>
<xhtml:em>italic</xhtml:em>
<date>2010-02-01 06:00</date>
</xhtml:div>

You can access 'em' like this: $xml->children('xhtml', true)->div->em;

However, if you want the date field, this: $xml->children('xhtml', true)->div->date; won't work, because you are stuck in the xhtml namespace.

You must call children() again to get back to the default namespace:

$xml->children('xhtml', true)->div->children()->date;
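Put together as a runnable sketch, the three cases look like this. The wrapper element and the xmlns:xhtml declaration are assumptions added so the fragment parses cleanly; the element names and values are the ones from the example above.

```php
<?php
// <root> and the namespace declaration are assumed for this sample.
$source = <<<XML
<root xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:div>
    <xhtml:em>italic</xhtml:em>
    <date>2010-02-01 06:00</date>
  </xhtml:div>
</root>
XML;

$xml = simplexml_load_string($source);
$div = $xml->children('xhtml', true)->div;

// Still looking in the xhtml namespace, so <xhtml:em> is found.
$em = (string) $div->em;

// <date> has no prefix, so nothing is found while in the xhtml namespace.
$missing = (string) $div->date;

// Calling children() with no arguments goes back to un-prefixed elements.
$date = (string) $div->children()->date;

echo $em, "\n", $date, "\n";
```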

Using PHP to include a colon within a XML child elements name

If an element name contains a : then the part before the : is the namespace prefix. If you are using namespace prefixes then you need to define the namespace somewhere in the document.

Check the manual for SimpleXMLElement::addChild(). You need to pass the namespace URI as the third argument to make it work:

$img = $map->addChild($imagechild, '', 'http://your.namespace.uri/path');
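As a minimal sketch of that call in context, the block below builds a small tree and reads it back. The root element, the image:loc child and the URI are placeholders, not taken from a real schema.

```php
<?php
$ns = 'http://your.namespace.uri/path'; // placeholder URI, as above

$map = new SimpleXMLElement('<map/>');

// Prefixed name plus namespace URI as the third argument.
$img = $map->addChild('image:image', null, $ns);
$img->addChild('image:loc', 'your image location comes here', $ns);

$out = $map->asXML();
echo $out;

// Read the result back through the namespace URI.
$check = simplexml_load_string($out);
$loc = (string) $check->children($ns)->image->children($ns)->loc;
```

Reading back by URI rather than by prefix keeps the check independent of whatever prefix the writer chose.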

I would encourage you to use the DOMDocument class instead of the SimpleXML extension. It handles namespaces more robustly. Check this example:

Assuming you have this XML:

<?xml version="1.0"?>
<map>
</map>

And this PHP code:

$doc = new DOMDocument();
$doc->load("sitemap.xml");

$map = $doc->documentElement;

// Define the xmlns "image" in the root element
$attr = $doc->createAttribute('xmlns:image');
$attr->nodeValue = 'http://your.namespace.uri/path';
$map->setAttributeNode($attr);

// Create new elements
$loc = $doc->createElement('loc', 'your location comes here');
$image = $doc->createElement('image:image');
$imageloc = $doc->createElement('loc', 'your image location comes here');

// Add them to the tree
$map->appendChild($loc);
$image->appendChild($imageloc);
$map->appendChild($image);

// Save to file
file_put_contents('sitemap.xml', $doc->saveXML());

You'll get this output:

<?xml version="1.0"?>
<map xmlns:image="http://your.namespace.uri/path">
<loc>your location comes here</loc>
<image:image>
<loc>your image location comes here</loc>
</image:image>
</map>
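The example above creates the prefixed element by name only. A namespace-aware variant, sketched here with the same placeholder URI, uses createElementNS() so the element is actually bound to the namespace rather than merely carrying a colon in its name. It builds the document in memory instead of loading sitemap.xml, which is an assumption made to keep the block standalone.

```php
<?php
$ns = 'http://your.namespace.uri/path'; // placeholder URI, as above

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;

$map = $doc->appendChild($doc->createElement('map'));

// Declare the "image" prefix on the root element.
$map->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:image', $ns);

$map->appendChild($doc->createElement('loc', 'your location comes here'));

// createElementNS() ties the element to the namespace URI.
$image = $map->appendChild($doc->createElementNS($ns, 'image:image'));
$image->appendChild($doc->createElement('loc', 'your image location comes here'));

echo $doc->saveXML();
```

With the element bound to its namespace, namespace-aware lookups such as getElementsByTagNameNS() can find it later.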

How to escape colon when reading XML in PHP, simplexml

Your initial code:

$variable = $xml->rdf:RDF->Status->presence;

does not work because it is creating a syntax error:

Parse error: syntax error, unexpected ':' in /test.php on line 8

A colon is not valid in a property name. PHP's usual way to handle that is curly-brace syntax:

$xml->{'rdf:RDF'}->Status->presence

As you then found out, you get this notice:

Notice: Trying to get property of non-object in /test.php on line 8

That is primarily because no such property exists, as var_dump shows:

var_dump($xml);

class SimpleXMLElement#1 (1) {
  public $Status =>
  class SimpleXMLElement#2 (2) {
    public $statusCode =>
    string(1) "1"
    public $presence =>
    array(13) {
      [0] =>
      string(1) "1"
      ...
    }
  }
}

Apart from that, even if there were a child element with a namespace-prefixed name, it would not work that way. SimpleXML never exposes an element under its prefixed name, so such a property is always undefined.
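To illustrate, here is a minimal sketch with a document that really does contain a prefixed child. The XML is invented for the example and is not the asker's document. The curly-brace form finds nothing, while children() with the namespace URI does:

```php
<?php
$rdfNs = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

// Hypothetical document with a genuinely prefixed child element.
$xml = simplexml_load_string(
    '<root xmlns:rdf="' . $rdfNs . '">'
    . '<rdf:RDF><Status><presence>Offline</presence></Status></rdf:RDF>'
    . '</root>'
);

// The colon form is syntactically valid with braces, but matches no element.
$viaBraces = isset($xml->{'rdf:RDF'});

// Select the namespace first, then step back to un-prefixed children.
$rdf = $xml->children($rdfNs)->RDF;
$presence = (string) $rdf->children()->Status->presence;

var_dump($viaBraces);
echo $presence, "\n";
```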

However, the dump above does show the property you're looking for, $Status:

$variable = $xml->Status->presence;

So you were just looking in the wrong place. The output of var_dump($variable) is:

class SimpleXMLElement#4 (13) {
string(1) "1"
string(7) "Offline"
string(12) "Déconnecté"
string(7) "Offline"
string(15) "オフライン"
string(6) "離線"
string(6) "脱机"
string(7) "Offline"
string(7) "Offline"
string(12) "Non in linea"
string(12) "Desconectado"
string(15) "Niepodłączony"
string(7) "Offline"
}

PHP Parse XML with colon in tag name using children

First of all, you're using an undefined variable in your foreach loop. You've defined $clipinfo but you're trying to use $clipartinfo in your code.

Second, you're accessing the attributes incorrectly:

<media:thumbnail url="http://openclipart.org/image/foo.png"/>

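You're trying to access the url attribute. After children('media', true), array-style access keeps looking in the media namespace, and url has no prefix, so this needs to be done with the attributes() method.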

Change:

$thumb=$clipartinfo->children('media', true)->thumbnail['url'];

to:

$thumb = $clipinfo->children('media', true)->thumbnail->attributes()->url;
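A standalone sketch of the corrected access follows. The wrapping item element and the Media RSS namespace URI are assumptions made for this sample; the thumbnail element and its URL are the ones shown above.

```php
<?php
$mediaNs = 'http://search.yahoo.com/mrss/'; // assumed Media RSS namespace URI

$clipinfo = simplexml_load_string(
    '<item xmlns:media="' . $mediaNs . '">'
    . '<media:thumbnail url="http://openclipart.org/image/foo.png"/>'
    . '</item>'
);

$thumbnail = $clipinfo->children($mediaNs)->thumbnail;

// url has no prefix, so ask for un-namespaced attributes explicitly.
$thumb = (string) $thumbnail->attributes()->url;

echo $thumb, "\n";
```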

Hope this helps!

Adding Simple XML Element with PHP that has a Colon

In addition to the namespace prefix (the part before the colon), you must also include the corresponding namespace URI (as the third argument):

$record_xml->addAttribute(
    'xsi:schemaLocation',
    'http://abc.com file:///somepath/somename.xsd',
    'http://www.w3.org/2001/XMLSchema-instance'
);
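For a complete round trip, a minimal sketch might look like the following. The record root element is hypothetical; the attribute name, value and namespace URI are the ones from the call above.

```php
<?php
$xsi = 'http://www.w3.org/2001/XMLSchema-instance';

$record_xml = new SimpleXMLElement('<record/>'); // hypothetical root element
$record_xml->addAttribute(
    'xsi:schemaLocation',
    'http://abc.com file:///somepath/somename.xsd',
    $xsi
);

echo $record_xml->asXML();

// Reading it back also goes through the namespace URI.
$location = (string) $record_xml->attributes($xsi)->schemaLocation;
```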

Reference - How do I handle Namespaces (Tags and Attributes with a Colon in their Name) in SimpleXML?

What are XML namespaces?

A colon (:) in a tag or attribute name means that the element or attribute is in an XML namespace. Namespaces are a way of combining different XML formats / standards in one document, and keeping track of which names come from which format. The colon, and the part before it, aren't really part of the tag / attribute name, they just indicate which namespace it's in.

An XML namespace has a unique identifier, which is a URI (a URL or URN). The URI doesn't point at anything, it's just a way for someone to "own" the namespace. For instance, the SOAP standard uses the namespace http://www.w3.org/2003/05/soap-envelope and an OpenDocument file uses (among others) urn:oasis:names:tc:opendocument:xmlns:meta:1.0. The example in the question uses the namespaces http://example.com and https://namespaces.example.org/two.

Within a document, or a section of a document, a namespace is given a local prefix, which is the part you see before the colon. For instance, in different documents, the SOAP namespace might be given the local prefix soap:, SOAP:, SOAP-ENV:, env:, or just ns1:. These names are linked back to the identifier of the namespace using a special xmlns attribute, e.g. xmlns:soap="http://www.w3.org/2003/05/soap-envelope". The choice of prefix in a particular document is completely arbitrary, and could change each time it was generated without changing the meaning.
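The effect of an arbitrary prefix can be sketched in PHP with two made-up documents that use the same SOAP namespace under different prefixes. Only the namespace URI is taken from the text; the element content is invented.

```php
<?php
$soapNs = 'http://www.w3.org/2003/05/soap-envelope';

// Two made-up documents: same namespace, different local prefixes.
$a = simplexml_load_string(
    '<soap:Envelope xmlns:soap="' . $soapNs . '"><soap:Body>hello</soap:Body></soap:Envelope>'
);
$b = simplexml_load_string(
    '<env:Envelope xmlns:env="' . $soapNs . '"><env:Body>hello</env:Body></env:Envelope>'
);

// Selecting by URI works for both, whatever prefix the document chose.
$bodyA = (string) $a->children($soapNs)->Body;
$bodyB = (string) $b->children($soapNs)->Body;

// Selecting by prefix only works where that exact prefix was used.
$byPrefixA = (string) $a->children('soap', true)->Body;
$byPrefixB = (string) $b->children('soap', true)->Body;

var_dump($bodyA, $bodyB, $byPrefixA, $byPrefixB);
```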

Finally, there is a default namespace in each document, or section of a document, which is the namespace used for elements with no prefix. It is defined by an xmlns attribute with no :, e.g. xmlns="http://www.w3.org/2003/05/soap-envelope". In the example above, <list> is in the default namespace, which is defined as http://example.com.

Somewhat peculiarly, un-prefixed attributes are never in the default namespace, but in a kind of "void namespace", which the standard doesn't clearly define. See: XML Namespaces and Unprefixed Attributes

SimpleXML gives me an empty object; what's wrong?

If you use print_r, var_dump, or similar "dump structure" functions on a SimpleXML object that contains namespaced content, some of the contents will not be displayed. They are still there, and can be accessed as described below.
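A small sketch of the symptom, using an invented namespace URI and element name: the dump should come out without the prefixed child, yet the child is reachable once its namespace is selected.

```php
<?php
$ns = 'urn:example:things'; // made-up namespace URI

$xml = simplexml_load_string(
    '<root xmlns:t="' . $ns . '"><t:thing>still here</t:thing></root>'
);

// The dump is expected to omit the prefixed child.
print_r($xml);

// Counting children with and without the namespace selected.
$plainCount = count($xml->children());
$nsCount = count($xml->children($ns));

// The element is reachable through its namespace URI.
$value = (string) $xml->children($ns)->thing;

echo $plainCount, ' ', $nsCount, ' ', $value, "\n";
```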

How do you access namespaces in SimpleXML?

SimpleXML provides two main methods for using namespaces:

  • The ->children() method allows you to access child elements in a particular namespace. It effectively switches your object to look at that namespace, until you call it again to switch back, or to another namespace.
  • The ->attributes() method works in a similar way, but allows you to access attributes in a particular namespace.

For instance, the example above might become:

define('XMLNS_EG1', 'http://example.com');
define('XMLNS_EG2', 'https://namespaces.example.org/two');
define('XMLNS_SEQ', 'urn:example:sequences');

foreach ( $sx->children(XMLNS_EG1)->list->children(XMLNS_EG2)->item as $item ) {
    echo 'Position: ' . $item->attributes(XMLNS_SEQ)->position . "\n";
    echo 'Item: ' . (string)$item . "\n";
}
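For a runnable version, the sketch below pairs the same traversal with a sample document. The XML is an assumption, shaped to fit the three namespace URIs and the list, item and position names used above; variables replace the constants only to keep the block standalone.

```php
<?php
$nsEg1 = 'http://example.com';
$nsEg2 = 'https://namespaces.example.org/two';
$nsSeq = 'urn:example:sequences';

// Assumed sample document, shaped to match the loop above.
$xml = <<<XML
<document xmlns="http://example.com"
          xmlns:ns2="https://namespaces.example.org/two"
          xmlns:seq="urn:example:sequences">
    <list type="short">
        <ns2:item seq:position="1">A thing</ns2:item>
        <ns2:item seq:position="2">Another thing</ns2:item>
    </list>
</document>
XML;

$sx = simplexml_load_string($xml);

$rows = [];
foreach ($sx->children($nsEg1)->list->children($nsEg2)->item as $item) {
    // Attributes are selected by namespace URI, just like child elements.
    $position = (string) $item->attributes($nsSeq)->position;
    $rows[] = $position . ': ' . (string) $item;
}

echo implode("\n", $rows), "\n";
```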

You can also select the initial namespace when you first parse the XML, using the $namespace_or_prefix parameter, which is the fourth parameter to simplexml_load_string, simplexml_load_file, or new SimpleXMLElement.

For instance, if we created the object this way, we wouldn't need the ->children(XMLNS_EG1) call to access the list element:

$sx = simplexml_load_string($xml, null, 0, XMLNS_EG1);

(Note that if the root element uses a default namespace rather than a prefix, SimpleXML will select it automatically; but since you can't predict which namespace will be the default in future, it's best to always include the $namespace_or_prefix parameter or initial ->children() call.)

Short-hand (not recommended)

As a short-hand, you can also pass the methods the local alias of the namespace, by giving the second parameter as true. Remember that this prefix could change at any time, for instance, a generator might assign prefixes ns1, ns2, etc, and assign them in a different order if the code changes slightly. Relying on the full namespace URIs is always the best approach.

Using this short-hand, the code would become:

foreach ( $sx->list->children('ns2', true)->item as $item ) {
    echo 'Position: ' . $item->attributes('seq', true)->position . "\n";
    echo 'Item: ' . (string)$item . "\n";
}

(This short-hand was added in PHP 5.2, and you may see really old examples that take a more long-winded route, calling $sx->getNamespaces to get a list of prefix-identifier pairs. This is the worst of both worlds, as you're still hard-coding the prefix rather than the identifier.)


