Creating a simple XML file using python
These days, the most popular (and very simple) option is the ElementTree API,
which has been included in the standard library since Python 2.5.
The available options for that are:
- ElementTree (Basic, pure-Python implementation of ElementTree. Part of the standard library since 2.5)
- cElementTree (Optimized C implementation of ElementTree. Also offered in the standard library since 2.5. Deprecated and folded into the regular ElementTree as an automatic thing as of 3.3.)
- LXML (Based on libxml2. Offers a rich superset of the ElementTree API as well XPath, CSS Selectors, and more)
Here's an example of how to generate your example document using the in-stdlib cElementTree:
import xml.etree.cElementTree as ET
root = ET.Element("root")
doc = ET.SubElement(root, "doc")
ET.SubElement(doc, "field1", name="blah").text = "some value1"
ET.SubElement(doc, "field2", name="asdfasd").text = "some vlaue2"
tree = ET.ElementTree(root)
tree.write("filename.xml")
I've tested it and it works, but I'm assuming whitespace isn't significant. If you need "prettyprint" indentation, let me know and I'll look up how to do that. (It may be an LXML-specific option. I don't use the stdlib implementation much)
For further reading, here are some useful links:
- API docs for the implementation in the Python standard library
- Introductory Tutorial (From the original author's site)
- LXML etree tutorial. (With example code for loading the best available option from all major ElementTree implementations)
As a final note, either cElementTree or LXML should be fast enough for all your needs (both are optimized C code), but in the event you're in a situation where you need to squeeze out every last bit of performance, the benchmarks on the LXML site indicate that:
- LXML clearly wins for serializing (generating) XML
- As a side-effect of implementing proper parent traversal, LXML is a bit slower than cElementTree for parsing.
How to create XML file using Python?
You need to
- Use
ET.tostring()
to get your xml as a string - Use
xml.dom.minidom
to get a DOM object from the string - Use
toprettyxml()
of DOM object to get the xml as formatted string - Add encoding information to the xml declaration part by simply splitting and concating the formatted string
- Write the string to a file
Code:
import xml.etree.cElementTree as ET
import xml.dom.minidom
m_encoding = 'UTF-8'
root = ET.Element("data")
doc = ET.SubElement(root, "status", date="20210123")
ET.SubElement(doc, "name", name="john").text = "some value1"
ET.SubElement(doc, "class", name="abc").text = "some vlaue2"
dom = xml.dom.minidom.parseString(ET.tostring(root))
xml_string = dom.toprettyxml()
part1, part2 = xml_string.split('?>')
with open("FILE.xml", 'w') as xfile:
xfile.write(part1 + 'encoding=\"{}\"?>\n'.format(m_encoding) + part2)
xfile.close()
Output file
<?xml version="1.0" encoding="UTF-8"?>
<data>
<status date="20210123">
<name name="john">some value1</name>
<class name="abc">some vlaue2</class>
</status>
</data>
Best way to generate xml?
Using lxml:
from lxml import etree
# create XML
root = etree.Element('root')
root.append(etree.Element('child'))
# another child with text
child = etree.Element('child')
child.text = 'some text'
root.append(child)
# pretty string
s = etree.tostring(root, pretty_print=True)
print s
Output:
<root>
<child/>
<child>some text</child>
</root>
See the tutorial for more information.
Python create XML file
You are on the right track save for a few minor syntax and usage issues:
class
is a Python keyword, you can't use it as a function parameter name (which is essentially whatclass = 'playlistItem'
is doingdata-type
is not a valid variable name in Python, it will be evaluated asdata MINUS type
, consider using something likedataType
ordata_type
. There might be ways around this but, IMHO, that would make the code unnecessarily complicated without adding any value (please see Edit #1 on how to do this)
That being said, the following code snippet should give you something usable and you can move on from there. Please feel free to let me know if you need any additional help:
from lxml import etree
data_el = etree.Element('data')
# You can do this in a loop and keep adding new elements
# Note: A deepcopy will be required for subsequent items
li_el = etree.SubElement(data_el, "li", class_name = 'playlistItem', data_type = "local", data_mp3 = "PATH")
a_el = etree.SubElement(li_el, "a", class_name = 'playlistNotSelected', href='#')
print etree.tostring(data_el, encoding='utf-8', xml_declaration = True, pretty_print = True)
This will generate the following output (which you can write to a file):
<?xml version='1.0' encoding='utf-8'?>
<data>
<li class_name="playlistItem" data_mp3="PATH" data_type="local">
<a class_name="playlistNotSelected" href="#"/>
</li>
</data>
Edit #0:
Alternatively, you can also write to a file by converting it to an ElementTree
first, e.g.
# Replace sys.stdout with a file object pointing to your object file:
etree.ElementTree(data_el).write(sys.stdout, encoding='utf-8', xml_declaration = True, pretty_print = True)
Edit #1:
Since element attributes are dictionaries, you can use set
to specify arbitrary attributes without any restrictions, e.g.
li_el.set('class', 'playlistItem')
li_el.set('data-type', 'local')
generate a full XML file with python
Reference: http://lxml.de/tutorial.html#namespaces
You mustn't specify the namespace as a :
-qualified string. Instead you may use {http://url.url/url}tag
form, or QName
form.
Here is your program, using namespaces:
from lxml import etree as ET
NS_DC = "http://purl.org/dc/elements/1.1/"
NS_OPF = "http://www.idpf.org/2007/opf"
nsmap = {
"dc": NS_DC,
None: NS_OPF,
}
PACKAGE = ET.QName(NS_OPF, 'package')
METADATA = ET.QName(NS_OPF, 'metadata')
TITLE = ET.QName(NS_DC, 'title')
et = ET.Element(PACKAGE,
attrib={'version': "2.0",
'unique-identifier': "BookId"},
nsmap=nsmap)
md = ET.SubElement(et, METADATA)
au = ET.SubElement(md, TITLE)
au.text = "A Tale of Two Cities"
s = ET.tostring(et, pretty_print=True)
print(s.decode('utf-8'))
You might choose to use lxml.builder.ElementMaker
. Here is a program that creates a portion of your example.
Notes:
- Note the use of
dict
s to represent attribute names that are not valid Python names. - Note the use of
QName
to represent namespace-qualified names. - Note the use of
validator
to validate the resulting tree.
from lxml import etree as ET
from lxml.builder import ElementMaker
NS_DC = "http://purl.org/dc/elements/1.1/"
NS_OPF = "http://www.idpf.org/2007/opf"
SCHEME = ET.QName(NS_OPF, 'scheme')
FILE_AS = ET.QName(NS_OPF, "file-as")
ROLE = ET.QName(NS_OPF, "role")
opf = ElementMaker(namespace=NS_OPF, nsmap={"opf": NS_OPF, "dc": NS_DC})
dc = ElementMaker(namespace=NS_DC)
validator = ET.RelaxNG(ET.parse("opf-schema.xml"))
tree = (
opf.package(
{"unique-identifier": "uuid_id", "version": "2.0"},
opf.metadata(
dc.identifier(
{SCHEME: "uuid", "id": "uuid_id"},
"d06a2234-67b4-40db-8f4a-136e52057101"),
dc.creator({FILE_AS: "Homes, A. M.", ROLE: "aut"}, "A. M. Homes"),
dc.title("My Book"),
dc.language("en"),
),
opf.manifest(
opf.item({"id": "foo", "href": "foo.pdf", "media-type": "foo"})
),
opf.spine(
{"toc": "uuid_id"},
opf.itemref({"idref": "uuid_id"}),
),
opf.guide(
opf.reference(
{"href": "cover.jpg", "title": "Cover", "type": "cover"})
),
)
)
validator.assertValid(tree)
print(ET.tostring(tree, pretty_print=True).decode('utf-8'))
Creating an XML file using Python
Use lxml.builder, here's a tutorial too.
import lxml.builder as lb
from lxml import etree
y=lb.E.Title(lb.E.s(name="hello",adress="abcdef"),
lb.E.s(name="",adress=""),
rollid="1", mainid="1",teamid="1")
print etree.tostring(y, pretty_print=True)
>>>
<Title teamid="1" rollid="1" mainid="1">
<s adress="abcdef" name="hello"/>
<s adress="" name=""/>
</Title>
Python writing to an xml file
If you are wanting to replace a value within an existing XML file then use:
tree.write(xmlfile)
Currently you are simply overwriting your file entirely and using the incorrect method (open()
). tree.write()
is normally what you would want to use. It might look something like this:
tree = etree.parse(xmlfile, parser=XMLParser)
root = tree.getroot()
hardwareRevisionNode = root.find(".//hardwareRevision")
if hardwareRevisionNode.text == "5":
print "Old Tag: " + hardwareRevisionNode.text
hardwareRevisionNode.text = "DVT2"
print "New Tag: " + hardwareRevisionNode.text
tree.write(xmlfile)
↳ https://docs.python.org/2/library/xml.etree.elementtree.html
Python generating xml file by python
Use this ElementTree. An example of it is,
from xml.etree import ElementTree, cElementTree
from xml.dom import minidom
root = ElementTree.Element('root')
child1 = ElementTree.SubElement(root, 'doc')
child1_1 = ElementTree.SubElement(child1, 'Jankyframes')
child1_1.text = jank_perc
print ElementTree.tostring(root)
tree = cElementTree.ElementTree(root) # wrap it in an ElementTree instance, and save as XML
t = minidom.parseString(ElementTree.tostring(root)).toprettyxml() # Since ElementTree write() has no pretty printing support, used minidom to beautify the xml.
tree1 = ElementTree.ElementTree(ElementTree.fromstring(t))
tree1.write("filename.xml",encoding='utf-8', xml_declaration=True)
Related Topics
How to Load and Play a Video in Pygame
Getting List of Parameter Names Inside Python Function
Stacked Bar Chart with Centered Labels
How to Change the Font Size on a Matplotlib Plot
How to Handle the Window Close Event in Tkinter
Best Way to Find the Intersection of Multiple Sets
Pandas: Drop Consecutive Duplicates
Determine Function Name from Within That Function (Without Using Traceback)
Why Does (1 in [1,0] == True) Evaluate to False
Plotting with Seaborn Using the Matplotlib Object-Oriented Interface
Python: Why Is Functools.Partial Necessary
Can Not Click on a Element: Elementclickinterceptedexception in Splinter/Selenium
How to Capture Stdout Output from a Python Function Call