Format Xml String to Print Friendly Xml String

Format XML string to print friendly XML string

You will have to parse the content somehow ... I find using LINQ the most easy way to do it. Again, it all depends on your exact scenario. Here's a working example using LINQ to format an input XML string.

string FormatXml(string xml)
{
try
{
XDocument doc = XDocument.Parse(xml);
return doc.ToString();
}
catch (Exception)
{
// Handle and throw if fatal exception here; don't just ignore them
return xml;
}
}

[using statements are ommitted for brevity]

How to pretty print XML from Java?

Now it's 2012 and Java can do more than it used to with XML, I'd like to add an alternative to my accepted answer. This has no dependencies outside of Java 6.

import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

/**
* Pretty-prints xml, supplied as a string.
* <p/>
* eg.
* <code>
* String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
* </code>
*/
public class XmlFormatter {

public String format(String xml) {

try {
final InputSource src = new InputSource(new StringReader(xml));
final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));

//May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");


final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
final LSSerializer writer = impl.createLSSerializer();

writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set this to true if the output needs to be beautified.
writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set this to true if the declaration is needed to be outputted.

return writer.writeToString(document);
} catch (Exception e) {
throw new RuntimeException(e);
}
}

public static void main(String[] args) {
String unformattedXml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
" xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
" xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
" <Query>\n" +
" <query:CategorySchemeWhere>\n" +
" \t\t\t\t\t <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
" </query:CategorySchemeWhere>\n" +
" </Query>\n\n\n\n\n" +
"</QueryMessage>";

System.out.println(new XmlFormatter().format(unformattedXml));
}
}

In C#, what is the best method to format a string as XML?


string unformattedXml = "<?xml version=\"1.0\"?><book><author>Lewis, C.S.</author><title>The Four Loves</title></book>";
string formattedXml = XElement.Parse(unformattedXml).ToString();
Console.WriteLine(formattedXml);

Output:

<book>
<author>Lewis, C.S.</author>
<title>The Four Loves</title>
</book>

The Xml Declaration isn't output by ToString(), but it is by Save() ...

  XElement.Parse(unformattedXml).Save(@"C:\doc.xml");
Console.WriteLine(File.ReadAllText(@"C:\doc.xml"));

Output:

<?xml version="1.0" encoding="utf-8"?>
<book>
<author>Lewis, C.S.</author>
<title>The Four Loves</title>
</book>

Python pretty print an XML given an XML string

Here's how to parse from a text string to the lxml structured data type.

Python 2:

from lxml import etree
xml_str = "<parent><child>text</child><child>other text</child></parent>"
root = etree.fromstring(xml_str)
print etree.tostring(root, pretty_print=True)

Python 3:

from lxml import etree
xml_str = "<parent><child>text</child><child>other text</child></parent>"
root = etree.fromstring(xml_str)
print(etree.tostring(root, pretty_print=True).decode())

Outputs:

<parent>
<child>text</child>
<child>other text</child>
</parent>

JAVA pretty print XML with properly formatted comments

I am afraid, that you cannot achieve this with settings.
Use a bit of brute force:

    return writer.getBuffer().toString().replaceAll("--><", "-->\n<");

What is the simplest way to get indented XML with line breaks from XmlDocument?

Based on the other answers, I looked into XmlTextWriter and came up with the following helper method:

static public string Beautify(this XmlDocument doc)
{
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings
{
Indent = true,
IndentChars = " ",
NewLineChars = "\r\n",
NewLineHandling = NewLineHandling.Replace
};
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
doc.Save(writer);
}
return sb.ToString();
}

It's a bit more code than I hoped for, but it works just peachy.



Related Topics



Leave a reply



Submit