Adding a Namespace When Using SimpleXMLElement

adding a namespace when using SimpleXMLElement

SimpleXML has an unusual quirk where the namespace prefixes are filtered from the root element. I'm not sure why it does this.

However, a workaround I've used is to prefix the prefix, so that the parser removes only the first one and leaves the second:

$xmlTest = new SimpleXMLElement('<xmlns:ws:Test></xmlns:ws:Test>', LIBXML_NOERROR, false, 'ws', true);
$xmlTest->addAttribute('xmlns:xmlns:ws', 'http://url.to.namespace');
$xmlTest->addAttribute('xmlns:xmlns:xsi', 'http://www.w3.org/2001/XMLSchema-instance');

This seems to work for me, though I'm interested to understand why SimpleXML does this exactly.

SimpleXMLElement Access elements with namespace?

All you need is

$data = new SimpleXMLElement($xml);
$data->registerXPathNamespace('ns1','http://endpoint.websitecom/');
$part = $data->xpath("//ns1:return");
var_dump($part[0]->children("ns1",true));

Output

object(SimpleXMLElement)[3]
public 'campaignID' => string '0' (length=1)
public 'categoryID' => string '200230455' (length=9)
public 'categoryName' => string 'Promotion' (length=9)
public 'linkID' => string '10001599' (length=8)
public 'linkName' => string 'KFL-20% off No Min' (length=18)
public 'mid' => string '3071' (length=4)
public 'nid' => string '1' (length=1)
public 'clickURL' => string '
http://someurl
' (length=36)
public 'endDate' => string 'Oct 15, 2012' (length=12)
public 'height' => string '250' (length=3)
public 'iconURL' => string '
http://someurl
' (length=36)
public 'imgURL' => string '
http://someurl
' (length=36)
public 'landURL' => string '
http://someurl
' (length=36)
public 'serverType' => string '22' (length=2)
public 'showURL' => string '
http://someurl
' (length=36)
public 'size' => string '13' (length=2)
public 'startDate' => string 'Oct 14, 2012' (length=12)
public 'width' => string '300' (length=3)
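
For a self-contained illustration of the same lookup, here is a minimal sketch. The sample document is made up: only the ns1 namespace URI and the return, campaignID and categoryName names come from the output above, and the getLinkResponse wrapper is a hypothetical name. It also suggests that the prefix registered for XPath does not have to match the prefix used in the document, since matching is done on the namespace URI.

```php
<?php
// Hypothetical sample response; the wrapper element name is an assumption.
$xml = <<<XML
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns1:getLinkResponse xmlns:ns1="http://endpoint.websitecom/">
      <ns1:return>
        <ns1:campaignID>0</ns1:campaignID>
        <ns1:categoryName>Promotion</ns1:categoryName>
      </ns1:return>
    </ns1:getLinkResponse>
  </soap:Body>
</soap:Envelope>
XML;

$ns = 'http://endpoint.websitecom/';
$data = new SimpleXMLElement($xml);

// The XPath prefix "x" is arbitrary; only the URI has to match the document.
$data->registerXPathNamespace('x', $ns);
$matches = $data->xpath('//x:return');

// Select the child elements by namespace URI instead of by prefix.
$fields = $matches[0]->children($ns);
$campaignId = (string) $fields->campaignID;
$categoryName = (string) $fields->categoryName;

echo $campaignId, ' ', $categoryName, PHP_EOL;
```

Passing the URI to children() keeps the code working even if the service later changes the prefix it emits.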


PHP SimpleXMLElement addAttribute namespaces syntax

solution 1: add a prefix to the prefix

<?php
$node = new SimpleXMLElement('<Product/>');
$node->addAttribute("xmlns:xmlns:xsi", 'http://www.w3.org/2001/XMLSchema-instance');
$node->addAttribute("xmlns:xmlns:xsd", 'http://www.w3.org/2001/XMLSchema');
echo $node->asXML();

output:

<?xml version="1.0"?>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>

note: this is a workaround. It doesn't actually declare the namespace in the document, it only creates an attribute whose name looks like a declaration. That is enough if you are just going to echo the result or save it to a file.

solution 2: put namespace directly in the SimpleXMLElement constructor

<?php
$node = new SimpleXMLElement('<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>');
echo $node->asXML();

output is the same as in solution 1

solution 3 (adds additional attribute)

<?php
$node = new SimpleXMLElement('<Product/>');
$node->addAttribute("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance", "xmlns");
$node->addAttribute("xmlns:xsd", 'http://www.w3.org/2001/XMLSchema', "xmlns");
echo $node->asXML();

the output contains an additional xmlns:xmlns="xmlns" declaration:

<?xml version="1.0"?>
<Product xmlns:xmlns="xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>
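
A further option, not part of the three solutions above, is to switch to the DOM API for the declaration only. The following is a minimal sketch, assuming the DOM extension is available: dom_import_simplexml() exposes the same node as a DOMElement, and setAttributeNS() with the reserved http://www.w3.org/2000/xmlns/ URI adds the namespace declaration.

```php
<?php
$node = new SimpleXMLElement('<Product/>');

// Both objects wrap the same underlying node, so changes made through
// the DOM element are visible through the SimpleXMLElement as well.
$dom = dom_import_simplexml($node);

// The reserved xmlns namespace URI marks these as namespace declarations.
$xmlnsUri = 'http://www.w3.org/2000/xmlns/';
$dom->setAttributeNS($xmlnsUri, 'xmlns:xsi', 'http://www.w3.org/2001/XMLSchema-instance');
$dom->setAttributeNS($xmlnsUri, 'xmlns:xsd', 'http://www.w3.org/2001/XMLSchema');

$result = $node->asXML();
$declared = $node->getDocNamespaces();
echo $result;
```

Unlike the prefix-the-prefix workaround, the namespaces added this way should also be reported by getDocNamespaces().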

How to use namespaces when writing XML file with SimpleXML

Here is an example of how to do this using DOM:

<?php

$nsUrl = 'http://base.google.com/ns/1.0';

$doc = new DOMDocument('1.0', 'UTF-8');

$rootNode = $doc->appendChild($doc->createElement('rss'));
$rootNode->setAttribute('version', '2.0');
$rootNode->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:g', $nsUrl);

$channelNode = $rootNode->appendChild($doc->createElement('channel'));
$channelNode->appendChild($doc->createElement('title', 'Removed'));
$channelNode->appendChild($doc->createElement('description', 'Removed'));
$channelNode->appendChild($doc->createElement('link', 'Removed'));

// $products is assumed to be an array of rows with the keys used below.
foreach ($products as $product) {
    $itemNode = $channelNode->appendChild($doc->createElement('item'));
    $itemNode->appendChild($doc->createElement('title'))->appendChild($doc->createTextNode($product['title']));
    $itemNode->appendChild($doc->createElement('description'))->appendChild($doc->createTextNode($product['title']));
    $itemNode->appendChild($doc->createElement('link'))->appendChild($doc->createTextNode($product['url']));
    $itemNode->appendChild($doc->createElement('g:id'))->appendChild($doc->createTextNode($product['product_id']));
    $itemNode->appendChild($doc->createElement('g:price'))->appendChild($doc->createTextNode($product['price_latest']));
    $itemNode->appendChild($doc->createElement('g:brand'))->appendChild($doc->createTextNode($product['range']));
    $itemNode->appendChild($doc->createElement('g:condition'))->appendChild($doc->createTextNode('new'));
    $itemNode->appendChild($doc->createElement('g:image_link'))->appendChild($doc->createTextNode($product['image']));
}

echo $doc->saveXML();
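
The example above creates the g: elements with createElement(), which is fine for serialization but leaves them without a namespace URI inside the DOM tree. If the tree is also going to be queried before it is written out, createElementNS() may be the safer choice. A minimal sketch of the difference, with '123' as a placeholder value:

```php
<?php
$nsUrl = 'http://base.google.com/ns/1.0';

$doc = new DOMDocument('1.0', 'UTF-8');
$rss = $doc->appendChild($doc->createElement('rss'));
$rss->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:g', $nsUrl);
$item = $rss->appendChild($doc->createElement('item'));

// Prefix only: the name contains "g:", but the node is in no namespace.
$plain = $doc->createElement('g:id');

// Namespace-aware: the node is bound to the namespace URI.
$namespaced = $item->appendChild($doc->createElementNS($nsUrl, 'g:id'));
$namespaced->appendChild($doc->createTextNode('123')); // placeholder value

var_dump($plain->namespaceURI);      // no namespace URI
var_dump($namespaced->namespaceURI); // the Google Base namespace URI

echo $doc->saveXML();
```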


Unable to add namespace to an attribute with PHP's SimpleXML

If you want to add an attribute from the namespace/prefix i to $node, don't bother declaring the namespace beforehand. Just use the third parameter of addAttribute() to provide the namespace URI for the prefix you're using in the first parameter.

$node = new SimpleXMLElement('<root></root>');
$node->addAttribute("i:somename", "somevalue", 'http://www.w3.org/2001/XMLSchema-instance');
echo $node->asXml();

prints

<?xml version="1.0"?>
<root xmlns:i="http://www.w3.org/2001/XMLSchema-instance" i:somename="somevalue"/>

If the attribute itself isn't needed, you can then remove it with unset(), leaving the namespace declaration.

unset($node->attributes('i', TRUE)['somename']);
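
To read such an attribute back, the namespace has to be named again, because a plain attributes() call only covers attributes without a namespace. A small sketch using the same names as above; the $xsi variable is introduced here only for readability:

```php
<?php
$xsi = 'http://www.w3.org/2001/XMLSchema-instance';

$node = new SimpleXMLElement('<root></root>');
$node->addAttribute('i:somename', 'somevalue', $xsi);

// Look the attribute up by namespace URI ...
$byUri = (string) $node->attributes($xsi)['somename'];

// ... or by prefix, by passing true as the second argument.
$byPrefix = (string) $node->attributes('i', true)['somename'];

// The declaration created by addAttribute() is visible on the document.
$declared = $node->getDocNamespaces();

echo $byUri, PHP_EOL;
```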

Resolve namespaces with SimpleXML regardless of structure or namespace

You want to use SimpleXMLElement to extract data from XML and convert it into an array.

This is generally possible but comes with some caveats. Before getting to XML namespaces: your XML contains CDATA sections. For XML-to-array conversion with SimpleXML you need to convert CDATA to text when you load the XML string. This is done with the LIBXML_NOCDATA flag. Example:

$xml = simplexml_load_string($buffer, null, LIBXML_NOCDATA);
print_r($xml); // print_r shows how SimpleXMLElement does array conversion

This gives you the following output:

SimpleXMLElement Object
(
[@attributes] => Array
(
[version] => 2.0
)

[title] => Blah
[description] => Blah
)

As you can already see, there is no natural way to represent attributes in an array, so SimpleXML by convention puts them under the @attributes key.

The other problem you have is to handle those multiple XML namespaces. In the previous example no specific namespace was used. That is the default namespace. When you convert a SimpleXMLElement to an array, the namespace of the SimpleXMLElement is used. As none was explicitly specified, the default namespace has been taken.

But if you specify a namespace when you create the SimpleXMLElement, that namespace is used instead.

Example:

$xml = simplexml_load_string($buffer, null, LIBXML_NOCDATA, "http://base.google.com/ns/1.0");
print_r($xml);

This gives you the following output:

SimpleXMLElement Object
(
[id] => Blah
[product_type] => Blah
)

As you can see, this time the namespace that has been specified when the SimpleXMLElement was created is used in the array conversion: http://base.google.com/ns/1.0.

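Since you want to take all namespaces of the document into account, you need to obtain them first, including the default one:

$xml = simplexml_load_string($buffer, null, LIBXML_NOCDATA);
// '' stands for the default (unprefixed) namespace; passing null as the
// namespace argument of simplexml_load_string() is deprecated since PHP 8.1.
$namespaces = [''] + $xml->getDocNamespaces(true);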

Then you can iterate over all namespaces and recursively merge them into the same array shown below:

$array = [];
foreach ($namespaces as $namespace) {
    $xml = simplexml_load_string($buffer, null, LIBXML_NOCDATA, $namespace);
    $array = array_merge_recursive($array, (array) $xml);
}
print_r($array);

This then finally should create and output the array of your choice:

Array
(
[@attributes] => Array
(
[version] => 2.0
)

[title] => Blah
[description] => Blah
[id] => Blah
[product_type] => Blah
)

As you can see, this is perfectly possible with SimpleXMLElement. However, it's important that you understand how SimpleXMLElement converts into an array (or serializes to JSON, which follows the same rules). To simulate the SimpleXMLElement-to-array conversion, you can use print_r for a quick output.

Note that not all XML constructs convert equally well into an array. That's not specifically a limitation of SimpleXML; it lies in the nature of the structures XML can represent and the structures an array can represent.

Therefore it is most often better to keep the XML inside an object like SimpleXMLElement (or DOMDocument) to access and deal with the data - and not with an array.

However, it's perfectly fine to convert data into an array as long as you know what you are doing and you don't need to write much code to access members deeper down the tree. Otherwise SimpleXMLElement is to be favored over an array, because it gives dedicated access to many XML features and also allows querying the document like a database with the SimpleXMLElement::xpath method. You would need to write many lines of your own code to access data inside the XML tree that comfortably on an array.
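
As a rough illustration of working on the object instead of an array, here is a minimal sketch. The buffer is a shortened stand-in for the feed discussed here, and the variable names are illustrative only:

```php
<?php
// Shortened stand-in for the feed; element names follow the example feed.
$buffer = <<<XML
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
  <g:id><![CDATA[Blah]]></g:id>
  <title><![CDATA[Blah]]></title>
  <g:product_type><![CDATA[Blah]]></g:product_type>
</rss>
XML;

$g = 'http://base.google.com/ns/1.0';
$feed = simplexml_load_string($buffer);

// Elements without a prefix are reachable directly.
$title = (string) $feed->title;

// Elements in the g: namespace are reached through children().
$id = (string) $feed->children($g)->id;

// Attributes use array access.
$version = (string) $feed['version'];

// XPath can query across namespaces once the prefix is registered.
$feed->registerXPathNamespace('g', $g);
$types = array_map('strval', $feed->xpath('//g:product_type'));

echo $title, ' ', $id, ' ', $version, PHP_EOL;
```

Casting to string also returns the text of CDATA sections, so LIBXML_NOCDATA is not needed for this style of access.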

To get the best of both worlds, you can extend SimpleXMLElement for your specific conversion needs:

$buffer = <<<BUFFER
<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
...
<g:id><![CDATA[Blah]]></g:id>
<title><![CDATA[Blah]]></title>
<description><![CDATA[Blah]]></description>
<g:product_type><![CDATA[Blah]]></g:product_type>
</rss>
BUFFER;

$feed = new Feed($buffer, LIBXML_NOCDATA);
print_r($feed->toArray());

Which does output:

Array
(
[@attributes] => stdClass Object
(
[version] => 2.0
)

[title] => Blah
[description] => Blah
[id] => Blah
[product_type] => Blah
[@text] => ...
)

For the underlying implementation:

class Feed extends SimpleXMLElement implements JsonSerializable
{
    // Keeps the untyped signature free of deprecation notices on PHP 8.1+;
    // older PHP versions treat this line as a comment.
    #[\ReturnTypeWillChange]
    public function jsonSerialize()
    {
        $array = array();

        // json encode attributes if any.
        if ($attributes = $this->attributes()) {
            $array['@attributes'] = iterator_to_array($attributes);
        }

        $namespaces = [null] + $this->getDocNamespaces(true);
        // json encode child elements if any. group on duplicate names as an array.
        foreach ($namespaces as $namespace) {
            foreach ($this->children($namespace) as $name => $element) {
                if (isset($array[$name])) {
                    if (!is_array($array[$name])) {
                        $array[$name] = [$array[$name]];
                    }
                    $array[$name][] = $element;
                } else {
                    $array[$name] = $element;
                }
            }
        }

        // json encode non-whitespace element simplexml text values.
        $text = trim((string) $this);
        if (strlen($text)) {
            if ($array) {
                $array['@text'] = $text;
            } else {
                $array = $text;
            }
        }

        // return empty elements as NULL (self-closing or empty tags)
        if (!$array) {
            $array = NULL;
        }

        return $array;
    }

    public function toArray() {
        return (array) json_decode(json_encode($this));
    }
}

This is an adaptation, with namespace support, of the Changing JSON Encoding Rules example given in SimpleXML and JSON Encode in PHP – Part III and End.

SimpleXML - add a new node using a namespace previously declared - how?

<?php
// test document, registrant as first/last element and somewhere in between
$xmlObj = new SimpleXMLElement('<epp>
<domain:create xmlns:domain="urn:someurn">
<domain:name></domain:name>
<domain:registrant></domain:registrant>
<domain:contact></domain:contact>
</domain:create>
<domain:create xmlns:domain="urn:someurn">
<domain:name></domain:name>
<domain:contact></domain:contact>
<domain:registrant></domain:registrant>
</domain:create>
<domain:create xmlns:domain="urn:someurn">
<domain:registrant></domain:registrant>
<domain:name></domain:name>
<domain:contact></domain:contact>
</domain:create>
</epp>');

foreach ($xmlObj->children("urn:someurn")->create as $create) {
    $registrant = $create->registrant;
    insertAfter($registrant, 'domain:ns', 'some text');
}
echo $xmlObj->asXML();

// Inserts a new element named $qname directly after $prevSibling, using the DOM API.
function insertAfter(SimpleXMLElement $prevSibling, $qname, $val) {
    $sd = dom_import_simplexml($prevSibling);
    $newNode = $sd->ownerDocument->createElement($qname, $val);
    $newNode = $sd->parentNode->insertBefore($newNode, $sd->nextSibling);
    return simplexml_import_dom($newNode);
}

prints

<?xml version="1.0"?>
<epp>
<domain:create xmlns:domain="urn:someurn">
<domain:name/>
<domain:registrant/><domain:ns>some text</domain:ns>
<domain:contact/>
</domain:create>
<domain:create xmlns:domain="urn:someurn">
<domain:name/>
<domain:contact/>
<domain:registrant/><domain:ns>some text</domain:ns>
</domain:create>
<domain:create xmlns:domain="urn:someurn">
<domain:registrant/><domain:ns>some text</domain:ns>
<domain:name/>
<domain:contact/>
</domain:create>
</epp>
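
The helper above builds the new element from a prefixed name only. If the inserted node also needs to be found through namespace-aware lookups afterwards, a variant based on createElementNS() may be preferable. The following is a sketch, not the original answer's code: insertAfterNs is a hypothetical name, and it assumes the new element should live in the same namespace as the sibling it follows.

```php
<?php
// Hypothetical variant of insertAfter(): takes a local name and reuses the
// previous sibling's prefix and namespace URI for the new element.
function insertAfterNs(SimpleXMLElement $prevSibling, string $localName, string $value): SimpleXMLElement
{
    $prev = dom_import_simplexml($prevSibling);
    $doc = $prev->ownerDocument;

    $qname = $prev->prefix ? $prev->prefix . ':' . $localName : $localName;
    $newNode = $doc->createElementNS($prev->namespaceURI, $qname);
    // A text node avoids the escaping pitfalls of createElement's value argument.
    $newNode->appendChild($doc->createTextNode($value));

    // With a null reference node, insertBefore() appends at the end.
    $prev->parentNode->insertBefore($newNode, $prev->nextSibling);

    return simplexml_import_dom($newNode);
}

$ns = 'urn:someurn';
$xmlObj = new SimpleXMLElement('<epp>
<domain:create xmlns:domain="urn:someurn">
<domain:name></domain:name>
<domain:registrant></domain:registrant>
</domain:create>
</epp>');

$create = $xmlObj->children($ns)->create;
$inserted = insertAfterNs($create->children($ns)->registrant, 'ns', 'some text');

// The new element can be found again through the namespace URI.
$found = (string) $create->children($ns)->ns;
echo $found, PHP_EOL;
```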

SimpleXML has declaration of xmlns:xmlns= - no way to remove

The problem with SimpleXML is that its addAttribute function adds an attribute, not a namespace, and although it seems to do what you want, it's not meant to be used the way you are using it.

It's meant to add a value that's part of a particular namespace (specified as the third parameter), not to add the namespace itself. The reason you end up with xmlns:xmlns is that SimpleXML found that you used the xmlns prefix when specifying a name such as xmlns:media, so it created an empty xmlns:xmlns declaration.

Here are two solutions to your problem:

1. Specify the namespaces in the constructor.

$rssXML = new SimpleXMLElement('<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:dcterms="http://purl.org/dc/terms/" />');
$rssXML->addAttribute('version', '2.0');

2. Replace xmlns:xmlns="" using preg_replace

echo preg_replace('/xmlns:xmlns=""\s?/', '', $rssXML->asXML());
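
Once the namespaces are declared in the constructor as in solution 1, namespaced elements can be added by passing the namespace URI as the third argument of addChild(). A minimal sketch, assuming a channel/item structure and a placeholder URL that are not part of the original question:

```php
<?php
$mediaNs = 'http://search.yahoo.com/mrss/';

$rssXML = new SimpleXMLElement('<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:dcterms="http://purl.org/dc/terms/" />');
$rssXML->addAttribute('version', '2.0');

// Assumed structure for illustration only.
$item = $rssXML->addChild('channel')->addChild('item');

// The third argument is the namespace URI, not the prefix.
$content = $item->addChild('media:content', null, $mediaNs);
$content->addAttribute('url', 'http://example.com/video.mp4'); // placeholder URL

$out = $rssXML->asXML();
echo $out;
```

Because the URI is already declared on the root element, the existing media prefix is reused and no xmlns:xmlns declaration should appear.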

