1. Introduction to common XML processing libraries
-
- Python standard library, lightweight, suitable for basic operations.
-
lxml
- Third-party high-performance library that supports XPath, XSLT and Schema verification.
-
/
- Standard library implementation, suitable for handling large XML or event-driven parsing.
-
Auxiliary tool library
-
xmltodict
(XML ↔ Dictionary),untangle
(XML → Python objects) etc.
-
2. Detailed explanation
1. Parsing XML
import as ET # parse from stringroot = ("<root><child>text</child></root>") # parse from filetree = ("") root = ()
2. Data query and modification
# traverse child elementsfor child in root: print(, ) # Find elementselement = ("child") = "new_text" # Modify text("attr", "value") # Modify properties
3. Generate and save XML
root = ("root") child = (root, "child", attrib={"key": "value"}) = "content" tree = (root) ("", encoding="utf-8", xml_declaration=True)
3. Advanced operations of lxml library
1. XPath query
from lxml import etree tree = ("") elements = ("//book[price>20]/title/text()") # Query the book title of price >20
2. XML Schema Verification
schema = (("")) parser = (schema=schema) try: ("", parser) except as e: print("Verification failed:", e)
4. XML and Excel transfer
1. XML → Excel (XLSX)
from openpyxl import Workbook import as ET tree = ("") root = () wb = Workbook() ws = (["Book Title", "author", "price"]) for book in ("book"): title = ("title").text author = ("author").text price = ("price").text ([title, author, price]) ("")
2. Excel (XLSX) → XML
from openpyxl import load_workbook import as ET def xlsx_to_xml(input_file, output_file): wb = load_workbook(input_file) ws = root = ("bookstore") headers = [() for cell in ws[1]] for row in ws.iter_rows(min_row=2, values_only=True): book = (root, "book") for idx, value in enumerate(row): tag = (book, headers[idx]) = str(value) (root).write(output_file, encoding="utf-8", xml_declaration=True) xlsx_to_xml("", "books_from_excel.xml")
5. XML and database interaction
Store to SQLite
import sqlite3 import as ET conn = ("") cursor = () ("CREATE TABLE IF NOT EXISTS books (title TEXT, author TEXT, price REAL)") tree = ("") for book in ("book"): title = ("title").text author = ("author").text price = float(("price").text) ("INSERT INTO books VALUES (?, ?, ?)", (title, author, price)) () ()
6. XML and JSON are transferred
1. XML → JSON
import xmltodict import json with open("") as f: xml_data = () data_dict = (xml_data) with open("", "w") as f: (data_dict, f, indent=4, ensure_ascii=False)
2. JSON → XML
import xmltodict import json with open("") as f: json_data = (f) xml_data = (json_data, pretty=True) with open("books_from_json.xml", "w") as f: (xml_data)
7. Performance optimization and FAQs
1. Performance recommendations
-
Small data: use
。
-
Large data: use
lxml
ofiterparse
Streaming analysis. -
Memory sensitive scenarios: use
Event-driven parsing.
2. Frequently Asked Questions
Handle namespaces:
# lxml examplenamespaces = {"ns": "/ns"} elements = ("//ns:book", namespaces=namespaces)
Special character processing:
= ("<Special Character>&")
Depend on installation
pip install lxml openpyxl xmltodict
Input and output examples
- XML file(
)
<bookstore> <book> <title>Pythonprogramming</title> <author>John</author> <price>50.0</price> </book> </bookstore>
- JSON file (
)
{ "bookstore": { "book": { "title": "Python Programming", "author": "John", "price": "50.0" } } }
Through this guide, you can quickly realize the interchange, storage and complex query operations between XML and other formats, meeting the diverse needs of data migration and interface development.
This is the end of this article about the full strategy for sharing Python XML automation processing. For more related content on Python XML automation processing, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!