How to Pretty Print XML from Java

How to pretty print XML from Java?

Now that it's 2012 and Java can do more with XML than it used to, I'd like to add an alternative to my accepted answer. It has no dependencies outside of Java 6.

import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

/**
 * Pretty-prints xml, supplied as a string.
 * <p/>
 * eg.
 * <code>
 * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
 * </code>
 */
public class XmlFormatter {

    public String format(String xml) {

        try {
            final InputSource src = new InputSource(new StringReader(xml));
            final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
            // Keep the XML declaration in the output only if the input had one.
            final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));

            // May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");

            final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
            final LSSerializer writer = impl.createLSSerializer();

            writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set this to true if the output needs to be beautified.
            writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set this to true if the XML declaration should be written.

            return writer.writeToString(document);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String unformattedXml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                " xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
                " xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
                " <Query>\n" +
                " <query:CategorySchemeWhere>\n" +
                " \t\t\t\t\t <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                " </query:CategorySchemeWhere>\n" +
                " </Query>\n\n\n\n\n" +
                "</QueryMessage>";

        System.out.println(new XmlFormatter().format(unformattedXml));
    }
}

Pretty print XML in Java 8

I suspect the problem is caused by blank text nodes (text nodes that contain only whitespace) in the original file. Try removing them programmatically right after parsing, using the following code. If you don't remove them, the Transformer will preserve them.

original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);

for (int i = 0; i < blankTextNodes.getLength(); i++) {
    blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}
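
For context, here is a minimal self-contained sketch of how that snippet could fit into a complete parse, strip, and print cycle. The class name `BlankNodeStripper` and the method `prettyPrint` are made up for illustration, and the output step assumes the JDK's built-in `Transformer`; the exact indentation width may differ between Java versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are not from the original answer.
class BlankNodeStripper {

    static String prettyPrint(String xml) throws Exception {
        Document original = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Merge adjacent text nodes before looking for blank ones.
        original.getDocumentElement().normalize();

        // Select every text node that contains only whitespace.
        NodeList blankTextNodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", original, XPathConstants.NODESET);

        // Detach each blank text node from its parent.
        for (int i = 0; i < blankTextNodes.getLength(); i++) {
            Node blank = blankTextNodes.item(i);
            blank.getParentNode().removeChild(blank);
        }

        // Serialize with indentation switched on; the declaration is omitted for brevity.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(original), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<a>\n\n   <b>text</b>\n\n\n</a>"));
    }
}
```

Without the removal loop, the blank lines from the input would typically show up again in the output.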

Java pretty print XML with properly formatted comments

I am afraid you cannot achieve this with settings alone.
Use a bit of brute force:

    return writer.getBuffer().toString().replaceAll("--><", "-->\n<");
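
To see what that replacement does, here is a small sketch that applies it to a hard-coded string. The class and method names are hypothetical, and the sample input merely stands in for whatever your serializer produced.

```java
// Hypothetical helper; the names are not from the original answer.
class CommentLineBreaker {

    // Puts an element that directly follows a comment onto its own line.
    static String breakAfterComments(String serializedXml) {
        return serializedXml.replaceAll("--><", "-->\n<");
    }

    public static void main(String[] args) {
        String serialized = "<root><!-- note --><item>a</item></root>";
        System.out.println(breakAfterComments(serialized));
    }
}
```

Be aware that this is plain text substitution: it would also change a `--><` sequence that happens to appear inside a CDATA section, and the element moved to the new line is not indented.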

How to generate formatted .xml file?

This is partly a guess, based on looking at this question: adding these lines
after obtaining the transformer might do the trick:

transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
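
A minimal sketch of where those two lines could sit when writing a DOM document to a file. The class name, the `write` method, and the tiny `root`/`item` document are invented for illustration. The `indent-amount` property is an implementation-specific extension recognised by the JDK's built-in (Xalan-derived) transformer, so other implementations may ignore it.

```java
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; the names are not from the original answer.
class FormattedXmlFileWriter {

    static void write(Path target) throws Exception {
        // Build a tiny document: <root><item>a</item></root>
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        Element item = doc.createElement("item");
        item.setTextContent("a");
        root.appendChild(item);
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // The two lines from the answer: enable indentation and request two spaces per level.
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("formatted", ".xml");
        write(target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```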

Pretty print XML file

You have tagged this XSLT, and if you apply the following XSLT stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xalan"
    exclude-result-prefixes="xalan">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" xalan:indent-amount="4"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

to your XML input, the result will be:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <test>
        <item0>a</item0>
        <item1>b</item1>
    </test>
</root>

Live demo: http://xsltransform.net/ncdD7mg

Note that the items are "pretty printed" as:

<item0>a</item0>

and not as shown in your post:

<item0>
a
</item0>

which would represent a change in the content payload of the XML.
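
If you want to run the stylesheet from Java rather than an online tool, something along these lines should work with the JDK's built-in XSLT processor. This is only a sketch: the class name is invented, the stylesheet is inlined as a string for brevity, and the input in `main` is an assumed single-line version of the XML from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; the names are not from the original answer.
class IdentityTransformFormatter {

    // The identity stylesheet from the answer, inlined as a string.
    private static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
            + " xmlns:xalan=\"http://xml.apache.org/xalan\""
            + " exclude-result-prefixes=\"xalan\">"
            + "<xsl:output method=\"xml\" version=\"1.0\" encoding=\"UTF-8\""
            + " indent=\"yes\" xalan:indent-amount=\"4\"/>"
            + "<xsl:template match=\"@*|node()\">"
            + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String format(String xml) throws Exception {
        // Compile the stylesheet, then run the input through it.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed input; the original question's XML is not shown here.
        System.out.println(format("<root><test><item0>a</item0><item1>b</item1></test></root>"));
    }
}
```

Because the identity template copies text nodes unchanged, `<item0>a</item0>` stays on one line and the content of the element is not altered.</root>");
if (!out.contains("<item0>a</item0>")) throw new AssertionError(out);
if (!out.contains("<item1>b</item1>")) throw new AssertionError(out);
if (out.contains("<test><item0>")) throw new AssertionError(out);
</test>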

How to prettify XML String in Java

You can use string manipulation to get close.

Let's say you have your structure in a string called xml:

<person>
<address>New York</address>
</person>
<person>
<address>Ottawa</address>
</person>

Then add a root element:

xml = "<myDummyRoot>" + xml + "</myDummyRoot>";

which gives the structure

<myDummyRoot>
<person>
<address>New York</address>
</person>
<person>
<address>Ottawa</address>
</person>
</myDummyRoot>

This is a valid XML document that should be possible to indent using the method linked, giving something like:

<myDummyRoot>
  <person>
    <address>New York</address>
  </person>
  <person>
    <address>Ottawa</address>
  </person>
</myDummyRoot>

A simple string replaceAll can then remove the root element again:

xml = xml.replaceAll("</?myDummyRoot>", "");

...which should leave you with readable XML (although every line keeps the extra level of indentation left over from the removed root element).
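
Put together, the wrap, indent, and unwrap steps might look like the sketch below. The class name is hypothetical, and since the indentation method linked above is not reproduced here, this version assumes a plain `Transformer` with the JDK-specific `indent-amount` property for that step.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; the names are not from the original answer.
class FragmentFormatter {

    static String format(String fragment) throws Exception {
        // 1. Wrap the fragment so that it becomes a well-formed document.
        String wrapped = "<myDummyRoot>" + fragment + "</myDummyRoot>";

        // 2. Indent it (assumed method: identity Transformer with INDENT enabled).
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(wrapped)), new StreamResult(out));

        // 3. Remove the opening and closing dummy root tags again.
        return out.toString().replaceAll("</?myDummyRoot>", "");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><address>New York</address></person>"
                + "<person><address>Ottawa</address></person>";
        System.out.println(format(xml));
    }
}
```

The result starts and ends with an empty line where the dummy tags used to be; call `trim()` on it if that matters.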


