Writing Some Characters Like '<' in an Xml File

Writing some characters like '<' in an XML file

Use

&lt; for <

&gt; for >

&amp; for &

What characters do I need to escape in XML documents?

If you use an appropriate class or library, they will do the escaping for you. Many XML issues are caused by string concatenation.
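As a minimal sketch of letting a library do the escaping, the snippet below builds an element with Python's standard xml.etree.ElementTree module instead of concatenating strings. The element name, attribute name and sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build the element through the library rather than by string concatenation.
# "note" and "title" are hypothetical names chosen for this example.
node = ET.Element("note", attrib={"title": 'Tom & "Jerry"'})
node.text = "1 < 2 && 3 > 2"

# The serializer escapes special characters in text and attribute values.
serialized = ET.tostring(node, encoding="unicode")
print(serialized)

# Parsing the serialized form gives back the original, unescaped values.
parsed = ET.fromstring(serialized)
print(parsed.text)
print(parsed.get("title"))
```

The special characters never have to be handled by hand: they go in as plain strings and come back out as plain strings.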

XML escape characters

There are only five:

"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;
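If the escaping has to be done on a raw string, Python's xml.sax.saxutils module offers escape() and unescape(). By default they handle only &, < and >, so this sketch passes the two quote characters through the optional entities mapping. The QUOTES and UNQUOTES names and the sample string are assumptions for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; the quote characters are
# supplied through the optional entities dictionary.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = """<a href="x">Tom & Jerry's</a>"""
escaped = escape(raw, QUOTES)
print(escaped)

# unescape() reverses the mapping and restores the original string.
restored = unescape(escaped, UNQUOTES)
print(restored == raw)
```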

Escaping characters depends on where the special character is used.

The examples can be validated at the W3C Markup Validation Service.

Text

The safe way is to escape all five characters in text. However, the three characters ", ' and > needn't be escaped in text:

<?xml version="1.0"?>
<valid>"'></valid>

Attributes

The safe way is to escape all five characters in attributes. However, the > character needn't be escaped in attributes:

<?xml version="1.0"?>
<valid attribute=">"/>

The ' character needn't be escaped in attributes if the quotes are ":

<?xml version="1.0"?>
<valid attribute="'"/>

Likewise, the " needn't be escaped in attributes if the quotes are ':

<?xml version="1.0"?>
<valid attribute='"'/>
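The quoting rules for attributes can be delegated to quoteattr() from Python's xml.sax.saxutils, which picks the surrounding quote character based on the value and escapes only what it has to. The sample values below are made up, and the sketch only illustrates the rules described above.

```python
from xml.sax.saxutils import quoteattr

# quoteattr() returns the value already wrapped in quotes.
only_single = quoteattr("it's")           # wrapped in double quotes
only_double = quoteattr('say "hi"')       # wrapped in single quotes
both_kinds = quoteattr("both ' and \"")   # double quotes, " escaped
specials = quoteattr("a > b & c")         # & and > are escaped as well

for value in (only_single, only_double, both_kinds, specials):
    print("<valid attribute=%s/>" % value)
```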

Comments

None of the five special characters should be escaped in comments:

<?xml version="1.0"?>
<valid>
<!-- "'<>& -->
</valid>

CDATA

None of the five special characters should be escaped in CDATA sections:

<?xml version="1.0"?>
<valid>
<![CDATA["'<>&]]>
</valid>
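A quick way to check this is to parse the same characters twice, once inside a CDATA section and once as escaped text, and compare the results. This sketch uses Python's xml.etree.ElementTree; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The same five characters, written unescaped inside CDATA ...
cdata_doc = """<valid><![CDATA["'<>&]]></valid>"""
# ... and written as ordinary text with <, > and & escaped.
escaped_doc = """<valid>"'&lt;&gt;&amp;</valid>"""

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

# Both documents carry identical character data.
print(cdata_text == escaped_text)
```

To the parser, CDATA is only an alternative spelling of the same text, which is why a consumer normally cannot tell the two forms apart.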

Processing instructions

None of the five special characters should be escaped in XML processing instructions:

<?xml version="1.0"?>
<?process <"'&> ?>
<valid/>

XML vs. HTML

HTML has its own set of escape codes which cover a lot more characters.

How to write '&' in xml?

In a proper XML file, you cannot have a standalone & character unless it starts an escape sequence. So if you need an XML node to contain the literal text good&ndash;bad, then it will have to be encoded as good&amp;ndash;bad. There is no workaround, as anything different would not be valid XML. The only way to make it work is to write the file as plain text exactly how you want it, but then it could not be read by an XML parser, because it is not proper XML.

Here's a code example of my suggested workaround (you didn't specify a language, so I am showing you in C#, but Java should have something similar):

using (var sw = new StreamWriter(stream))
{
    // other code to write XML-like data
    sw.WriteLine("<node>good&ndash;bad</node>");
    // other code to write XML-like data
}

As you discovered, another option is the WriteRaw() method on XmlTextWriter (in C#), which writes an unencoded string, but that does not change the fact that the result is not going to be a valid XML file.

But as I mentioned, if you tried to read this with an XML parser, it would fail, because &ndash; is not a valid XML character entity, so the file is not valid XML.

&ndash; is an HTML character entity, so escaping it in an XML document should not normally be necessary.

In the XML language, & is the escape character, so &amp; is the appropriate string representation of &. You cannot use a bare & character, because & has a special meaning and a single & would therefore be misinterpreted by the parser.

You will see similar behavior with the <, >, ", and ' characters. All have meaning within the XML language, so they must be escaped if you need to represent them as text in a document.

Here's a reference to all of the character entities in XML (and HTML) from Wikipedia. Each is always represented by the escape character and the name (&gt;, &lt;, &quot;, &apos;).
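The behavior described in the answers above can be checked with a small sketch that feeds a few variants to Python's xml.etree.ElementTree parser. The is_well_formed helper is a hypothetical name introduced only for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A bare ampersand and an undeclared HTML entity are both rejected.
bare_amp = is_well_formed("<node>good & bad</node>")
html_entity = is_well_formed("<node>good&ndash;bad</node>")

# An escaped ampersand is accepted and keeps the literal text "&ndash;".
escaped_amp = ET.fromstring("<node>good&amp;ndash;bad</node>").text

# A numeric character reference produces the actual en dash character.
numeric_ref = ET.fromstring("<node>good&#8211;bad</node>").text

print(bare_amp, html_entity, escaped_amp, numeric_ref)
```

The numeric reference is the usual choice when the dash itself is wanted in the document, rather than the six characters of its HTML entity name.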

How do I escape ampersands in XML so they are rendered as entities in HTML?

When your XML contains &amp;amp;, this will result in the text &amp;.

When you use that in HTML, that will be rendered as &.
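The two decoding steps can be traced with the standard library: the XML parser removes one level of escaping, and the HTML layer removes the second. In this sketch html.unescape() stands in for what a browser does when it renders the text, and the sample document is made up.

```python
import html
import xml.etree.ElementTree as ET

# The ampersand is escaped twice in the XML source.
xml_source = "<p>Tom &amp;amp; Jerry</p>"

# Step 1: the XML parser decodes one level.
text_from_xml = ET.fromstring(xml_source).text

# Step 2: treating that text as HTML decodes the remaining entity.
rendered = html.unescape(text_from_xml)

print(text_from_xml)
print(rendered)
```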

Best way to encode text data for XML in Java?

Very simply: use an XML library. That way it will actually be right instead of requiring detailed knowledge of bits of the XML spec.

Avoid writing character to XML in python

If it were me, I'd use lxml's CDATA class.

However, if you wanted to stick with ElementTree you could probably redefine ET._escape_cdata and make sure the text doesn't start with <![CDATA[ and doesn't end with ]]> before escaping.

Example...

Python 3.#

import xml.etree.ElementTree as ET

def escape_cdata(text):
    # escape character data, leaving ready-made CDATA sections untouched
    try:
        # skip escaping only when the text is wrapped in <![CDATA[ ... ]]>
        if not (text.startswith("<![CDATA[") and text.endswith("]]>")):
            if "&" in text:
                text = text.replace("&", "&amp;")
            if "<" in text:
                text = text.replace("<", "&lt;")
            if ">" in text:
                text = text.replace(">", "&gt;")
        return text
    except (TypeError, AttributeError):
        ET._raise_serialization_error(text)

# replace the private helper that ElementTree uses when serializing text
ET._escape_cdata = escape_cdata

map_elem = ET.Element("Map")

parameters = ET.SubElement(map_elem, "Parameters")
ET.SubElement(parameters, "Parameter", name="bounds").text = "-180,-85.05112877980659,180,85.05112877980659"
ET.SubElement(parameters, "Parameter", name="center").text = "0,0,2"
ET.SubElement(parameters, "Parameter", name="format").text = "png"
ET.SubElement(parameters, "Parameter", name="minzoom").text = "0"
ET.SubElement(parameters, "Parameter", name="maxzoom").text = "22"
ET.SubElement(parameters, "Parameter", name="scale").text = "1"
ET.SubElement(parameters, "Parameter", name="metatile").text = "2"
ET.SubElement(parameters, "Parameter", name="id").text = "<![CDATA[xyzvalue]]>"
ET.SubElement(parameters, "Parameter", name="_updated").text = "1552288036000"
ET.SubElement(parameters, "Parameter", name="name").text = "<![CDATA[xyzvalue]]>"
ET.SubElement(parameters, "Parameter", name="tilejson").text = "<![CDATA[2.0.0]]>"
ET.SubElement(parameters, "Parameter", name="scheme").text = "<![CDATA[xyz]]>"

tree = ET.ElementTree(map_elem)
tree.write("test.xml", xml_declaration=True, encoding='utf-8', method="xml")

XML Output (test.xml; pretty printed for readability)

<Map>
    <Parameters>
        <Parameter name="bounds">-180,-85.05112877980659,180,85.05112877980659</Parameter>
        <Parameter name="center">0,0,2</Parameter>
        <Parameter name="format">png</Parameter>
        <Parameter name="minzoom">0</Parameter>
        <Parameter name="maxzoom">22</Parameter>
        <Parameter name="scale">1</Parameter>
        <Parameter name="metatile">2</Parameter>
        <Parameter name="id"><![CDATA[xyzvalue]]></Parameter>
        <Parameter name="_updated">1552288036000</Parameter>
        <Parameter name="name"><![CDATA[xyzvalue]]></Parameter>
        <Parameter name="tilejson"><![CDATA[2.0.0]]></Parameter>
        <Parameter name="scheme"><![CDATA[xyz]]></Parameter>
    </Parameters>
</Map>

Update: Function for Python 2.7

def escape_cdata(text, encoding):
    # escape character data, leaving ready-made CDATA sections untouched
    try:
        # skip escaping only when the text is wrapped in <![CDATA[ ... ]]>
        if not (text.startswith("<![CDATA[") and text.endswith("]]>")):
            if "&" in text:
                text = text.replace("&", "&amp;")
            if "<" in text:
                text = text.replace("<", "&lt;")
            if ">" in text:
                text = text.replace(">", "&gt;")
        return text.encode(encoding, "xmlcharrefreplace")
    except (TypeError, AttributeError):
        ET._raise_serialization_error(text)

How to write character & in android strings.xml

Encode it:

&amp;

