How to pretty print XML from Java?
Now that it's 2012 and Java can do more with XML than it used to, I'd like to add an alternative to my accepted answer. It has no dependencies outside of Java 6.
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

/**
 * Pretty-prints xml, supplied as a string.
 * <p/>
 * eg.
 * <code>
 * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
 * </code>
 */
public class XmlFormatter {

    public String format(String xml) {
        try {
            final InputSource src = new InputSource(new StringReader(xml));
            final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
            final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));
            //May need this: System.setProperty(DOMImplementationRegistry.PROPERTY,"com.sun.org.apache.xerces.internal.dom.DOMImplementationSourceImpl");
            final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
            final LSSerializer writer = impl.createLSSerializer();
            writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE); // Set this to true if the output needs to be beautified.
            writer.getDomConfig().setParameter("xml-declaration", keepDeclaration); // Set this to true if the declaration is needed to be outputted.
            return writer.writeToString(document);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String unformattedXml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                " xmlns=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message\"\n" +
                " xmlns:query=\"http://www.SDMX.org/resources/SDMXML/schemas/v2_0/query\">\n" +
                " <Query>\n" +
                " <query:CategorySchemeWhere>\n" +
                " \t\t\t\t\t <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                " </query:CategorySchemeWhere>\n" +
                " </Query>\n\n\n\n\n" +
                "</QueryMessage>";
        System.out.println(new XmlFormatter().format(unformattedXml));
    }
}
Pretty print XML in Java 8
I guess that the problem is related to blank text nodes (i.e. text nodes containing only whitespace) in the original file. You should try to remove them programmatically just after parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.
original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);
for (int i = 0; i < blankTextNodes.getLength(); i++) {
    blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}
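To put the snippet above in context, here is a minimal end-to-end sketch: parse a string, strip the blank text nodes, then serialize with indentation. The class name BlankNodeStripper, the method prettyPrint, and the choice to omit the XML declaration are assumptions made for illustration, not part of the original answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for the example.
public class BlankNodeStripper {

    /** Removes whitespace-only text nodes, then serializes the document with indentation. */
    public static String prettyPrint(String xml) throws Exception {
        Document original = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Merge adjacent text nodes, then select every text node that is blank.
        original.getDocumentElement().normalize();
        NodeList blankTextNodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", original, XPathConstants.NODESET);
        for (int i = 0; i < blankTextNodes.getLength(); i++) {
            Node blank = blankTextNodes.item(i);
            blank.getParentNode().removeChild(blank);
        }

        // Assumption: the declaration is omitted to keep the output short.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(original), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<root>\n\n  <a>1</a>\n     <b>2</b>\n</root>"));
    }
}
```

The exact indentation width depends on the Java version and on the Transformer implementation in use, so treat the output layout as indicative only.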
Java pretty print XML with properly formatted comments
I am afraid that you cannot achieve this with settings alone.
Use a bit of brute force:
return writer.getBuffer().toString().replaceAll("--><", "-->\n<");
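As a small illustration of what that replacement does, the sketch below applies it to an already serialized string. The class and method names are made up for the example, and the input string is invented. One caveat worth keeping in mind: the same character sequence inside a CDATA section would be rewritten too.

```java
// Hypothetical helper class, named here only for the example.
public class CommentLineBreaks {

    /** Inserts a line break between the end of a comment and a tag that follows it directly. */
    public static String breakAfterComments(String serializedXml) {
        return serializedXml.replaceAll("--><", "-->\n<");
    }

    public static void main(String[] args) {
        // Invented sample input: a comment glued to the root element.
        String serialized = "<!-- header comment --><root>\n  <child/>\n</root>";
        System.out.println(breakAfterComments(serialized));
    }
}
```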
How to generate formatted .xml file?
With some guessing, and after looking at this question, adding these lines after obtaining the transformer might do the trick:
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
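For context, a minimal sketch of where those two lines might sit when the goal is an indented file on disk. The class name IndentedFileWriter, the method writeIndented, and the use of an identity Transformer fed from a string are assumptions for illustration; the original question's own transformer setup is not shown in the source.

```java
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named here only for the example.
public class IndentedFileWriter {

    /** Writes the given XML to target, indented by two spaces per level. */
    public static void writeIndented(String xml, Path target) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific property understood by the Xalan-based JDK transformer.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(writer));
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("formatted", ".xml");
        writeIndented("<root><child>text</child></root>", target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

The indent-amount key is namespace-qualified, so a Transformer implementation that does not recognize it should simply ignore it rather than fail.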
pretty print XML file
You have tagged this XSLT, and if you apply the following XSLT stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xalan"
    exclude-result-prefixes="xalan">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" xalan:indent-amount="4"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
to your XML input, the result will be:
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <test>
        <item0>a</item0>
        <item1>b</item1>
    </test>
</root>
Live demo: http://xsltransform.net/ncdD7mg
Note that the items are "pretty printed" as:
<item0>a</item0>
and not as shown in your post:
<item0>
    a
</item0>
which would represent a change in the content payload of the XML.
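To see why the second form changes the payload, one can compare the text content a parser reports for each variant. This is a minimal sketch; the class name PayloadCheck and the helper textOf are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for the example.
public class PayloadCheck {

    /** Returns the text content of the root element of the given XML string. */
    static String textOf(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String compact = textOf("<item0>a</item0>");
        String spread = textOf("<item0>\n    a\n</item0>");
        // The second variant carries the surrounding line breaks and spaces as data.
        System.out.println(compact.equals(spread)); // prints false
    }
}
```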
How to prettify XML String in Java
You can use string manipulation to get close.
Let's say you have your structure in a string called xml:
<person>
<address>New York</address>
</person>
<person>
<address>Ottawa</address>
</person>
Then add a root element:
xml = "<myDummyRoot>" + xml + "</myDummyRoot>";
which gives the structure:
<myDummyRoot>
<person>
<address>New York</address>
</person>
<person>
<address>Ottawa</address>
</person>
</myDummyRoot>
This is a valid XML document that should be possible to indent using the method linked, giving something like:
<myDummyRoot>
  <person>
    <address>New York</address>
  </person>
  <person>
    <address>Ottawa</address>
  </person>
</myDummyRoot>
A simple string replaceAll can then remove the root element again:
xml = xml.replaceAll("</?myDummyRoot>", "");
...which should leave you with a readable XML document (although indented with some extra spacing).
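Put together, the wrap, indent, and unwrap steps could look roughly like the sketch below. The class name FragmentFormatter and the use of a Transformer for the indenting step are assumptions for illustration; the answer itself only refers to a linked method for that step.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named here only for the example.
public class FragmentFormatter {

    /** Indents a sequence of sibling elements that has no single root element. */
    public static String format(String fragments) throws Exception {
        // Step 1: wrap the fragments so the parser sees one well-formed document.
        String wrapped = "<myDummyRoot>" + fragments + "</myDummyRoot>";

        // Step 2: indent (assumed here to be done with an identity Transformer).
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(wrapped)), new StreamResult(out));

        // Step 3: strip the dummy root tags again; the extra indent level remains.
        return out.toString().replaceAll("</?myDummyRoot>", "");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format(
                "<person><address>New York</address></person>"
                + "<person><address>Ottawa</address></person>"));
    }
}
```

This only works if the dummy element name does not already occur in the input, so a less likely name may be a safer choice in practice.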