Introduction to xmltodict
Concept
- xmltodict is a Python module for processing XML data. It converts XML into a dictionary, which simplifies parsing while preserving the data's structure so that it is easy to work with.
- Conversely, it can convert a dictionary back to XML. The module provides an intuitive, concise interface for working with XML.
Install xmltodict
xmltodict is a third-party Python library, so it must be downloaded and installed separately. The command is as follows:
pip install xmltodict
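As a quick check that the installation worked, the following minimal sketch round-trips a small document through both directions described above. The sample XML is made up for illustration, and full_document=False is used only to keep the XML declaration out of the comparison.

```python
import xmltodict

# Illustrative sample document (not from the article's data set)
source = '<note><to>Li Si</to><body>Hello</body></note>'

# XML -> dictionary
doc = xmltodict.parse(source)

# dictionary -> XML, without the leading XML declaration
round_tripped = xmltodict.unparse(doc, full_document=False)

print(doc['note']['to'])
print(round_tripped == source)
```

If the import fails, the package is most likely installed into a different Python environment than the one running the script.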
Generate XML data
The unparse function converts a Python dictionary to XML data, which is convenient for data storage and transmission.
The parameters are as follows:
- input_dict: Python dictionary to convert to XML.
- output (optional): The output target, a file-like object. The default is None, in which case the XML is returned as a string.
- pretty (optional): Whether to beautify the output. Default is False.
- full_document (optional): Whether to output a complete XML document, including an XML declaration. Default is True.
import xmltodict

# Python dictionary
data = {
    'persons': {
        'person': [
            {
                'name': 'Zhang San',
                'age': '18',
                'gender': 'male',
                'address': {
                    'street': 'Pudong Avenue',
                    'district': 'Pudong New District',
                    'city': 'Shanghai',
                    'state': 'China'
                }
            },
            {
                'name': 'Li Si',
                'age': '20',
                'gender': 'female',
                'address': {
                    'street': 'Landianchang Road',
                    'district': 'Haidian District',
                    'city': 'Beijing',
                    'state': 'China'
                }
            }
        ]
    }
}

# Convert the dictionary to XML
xml_string = xmltodict.unparse(data, pretty=True)

# Print the XML
print(xml_string)
# Output (each nesting level is indented with a tab by default):
# <?xml version="1.0" encoding="utf-8"?>
# <persons>
#     <person>
#         <name>Zhang San</name>
#         <age>18</age>
#         <gender>male</gender>
#         <address>
#             <street>Pudong Avenue</street>
#             <district>Pudong New District</district>
#             <city>Shanghai</city>
#             <state>China</state>
#         </address>
#     </person>
#     <person>
#         <name>Li Si</name>
#         <age>20</age>
#         <gender>female</gender>
#         <address>
#             <street>Landianchang Road</street>
#             <district>Haidian District</district>
#             <city>Beijing</city>
#             <state>China</state>
#         </address>
#     </person>
# </persons>
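The output and full_document parameters are not exercised by the example above. The sketch below shows one way they might be used, assuming an in-memory io.StringIO buffer stands in for a real file; the small data dictionary is illustrative.

```python
import io

import xmltodict

# Illustrative data: a single root element with two children
data = {'person': {'name': 'Zhang San', 'age': '18'}}

# full_document=False omits the XML declaration and returns a bare fragment
fragment = xmltodict.unparse(data, full_document=False)
print(fragment)

# Passing a file-like object as output writes the XML there;
# in that case unparse does not return the string itself
buffer = io.StringIO()
result = xmltodict.unparse(data, output=buffer)
written = buffer.getvalue()
print(written)
```

A file opened with open(path, 'w', encoding='utf-8') could be passed in place of the buffer when the XML should be written to disk.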
Parse XML data
The parse function parses XML data into a Python dictionary, so you can access and manipulate the data using ordinary Python syntax.
The parameters are as follows:
- xml_input: XML data to parse. Can be a string or a file object.
- encoding (optional): Encoding of the XML document. The default is None, which means the encoding specified in the XML document is used.
- expat (optional): Custom XML parser module. By default, the expat parser from the Python standard library is used.
- process_namespaces (optional): Whether to process namespaces. Default is False.
- namespace_separator (optional): The separator between the namespace and the tag name when process_namespaces=True. The default is ':'.
- postprocessor (optional): A function called as each item is parsed. It lets you modify the parsing results, for example to convert data types or merge nodes. It receives three parameters: path, key, and value. path describes the element path leading to the current item, key is the tag name (or prefixed attribute name) of the current item, and value is its value, which may be text or a dictionary of attributes and child elements. The function returns a (key, value) pair, or None to drop the item.
- dict_constructor (optional): The constructor used to create dictionaries. By default, recent versions of xmltodict use the built-in dict (older releases used collections.OrderedDict). If you want a different mapping type, you can specify it via this parameter.
- xml_attribs (optional): Controls whether the parser includes element attributes. The default is True, which means attributes are included in the parsing result. If set to False, attributes are ignored and only the text content and child elements are included.
import xmltodict

# XML data
xml_string = '''
<persons>
    <person>
        <name>Zhang San</name>
        <age>18</age>
        <gender>Male</gender>
        <address>
            <street>Pudong Avenue</street>
            <district>Pudong New District</district>
            <city>Shanghai</city>
            <state>China</state>
        </address>
    </person>
    <person>
        <name>Li Si</name>
        <age>20</age>
        <gender>Female</gender>
        <address>
            <street>Landianchang Road</street>
            <district>Haidian District</district>
            <city>Beijing</city>
            <state>China</state>
        </address>
    </person>
</persons>
'''

# Parse the XML data
data = xmltodict.parse(xml_string)
print(type(data), data)
# Output (wrapped for readability):
# <class 'dict'> {'persons': {'person': [
#     {'name': 'Zhang San', 'age': '18', 'gender': 'Male',
#      'address': {'street': 'Pudong Avenue', 'district': 'Pudong New District',
#                  'city': 'Shanghai', 'state': 'China'}},
#     {'name': 'Li Si', 'age': '20', 'gender': 'Female',
#      'address': {'street': 'Landianchang Road', 'district': 'Haidian District',
#                  'city': 'Beijing', 'state': 'China'}}]}}

# Access the data
print(data['persons']['person'][0]['name'])  # Output: Zhang San
print(data['persons']['person'][1]['name'])  # Output: Li Si
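Two of the parameters listed above, postprocessor and xml_attribs, change the shape of the result. The following is a minimal sketch of each; convert_age is a hypothetical helper written for this illustration, and the one-line XML sample is made up.

```python
import xmltodict

# Illustrative sample with one attribute and two child elements
XML = '<person id="1"><name>Zhang San</name><age>18</age></person>'


def convert_age(path, key, value):
    """Hypothetical postprocessor: turn the text of <age> into an int."""
    if key == 'age' and value is not None:
        return key, int(value)
    # Every other item is passed through unchanged
    return key, value


# Values are strings by default; the postprocessor converts <age> only
typed = xmltodict.parse(XML, postprocessor=convert_age)
print(typed)

# xml_attribs=False drops the id attribute from the result
no_attrs = xmltodict.parse(XML, xml_attribs=False)
print(no_attrs)
```

Note that the postprocessor is also called for attributes (with the prefixed key, here '@id'), so a real conversion function should match keys carefully.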
Extensions
1. Change the attribute prefix
The attr_prefix parameter specifies the prefix added to the keys of XML attributes when they are converted to a dictionary. The default value is '@'.
import xmltodict

xml_string = '''
<persons>
    <person name="zhangsan" age="18" gender="male">
        <address>Shanghai Pudong New District</address>
    </person>
    <person name="lisi" age="20" gender="female">
        <address>Haidian District, Beijing</address>
    </person>
</persons>
'''

# Use the default attr_prefix='@'
data1 = xmltodict.parse(xml_string)
print(data1)
# Output (reformatted for readability):
# {'persons':
#     {
#         'person': [
#             {'@name': 'zhangsan', '@age': '18', '@gender': 'male', 'address': 'Shanghai Pudong New District'},
#             {'@name': 'lisi', '@age': '20', '@gender': 'female', 'address': 'Haidian District, Beijing'}
#         ]
#     }
# }

# Use a custom attr_prefix='attr_'
data2 = xmltodict.parse(xml_string, attr_prefix='attr_')
print(data2)
# Output (reformatted for readability):
# {'persons':
#     {
#         'person': [
#             {'attr_name': 'zhangsan', 'attr_age': '18', 'attr_gender': 'male', 'address': 'Shanghai Pudong New District'},
#             {'attr_name': 'lisi', 'attr_age': '20', 'attr_gender': 'female', 'address': 'Haidian District, Beijing'}
#         ]
#     }
# }
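The prefix matters in the other direction too: when generating XML, unparse treats keys that start with the attribute prefix as attributes, and it accepts the same attr_prefix keyword. The sketch below illustrates this under the assumption that a dictionary parsed with a custom prefix must be written back with that same prefix; the sample data is illustrative.

```python
import xmltodict

# Keys starting with '@' (the default prefix) become XML attributes
person = {'person': {'@name': 'zhangsan', 'address': 'Shanghai Pudong New District'}}
default_xml = xmltodict.unparse(person, full_document=False)
print(default_xml)

# Parse with a custom prefix, then write back using the same prefix
custom = xmltodict.parse(default_xml, attr_prefix='attr_')
custom_xml = xmltodict.unparse(custom, full_document=False, attr_prefix='attr_')
print(custom_xml == default_xml)
```

If the custom prefix were omitted in the unparse call, the attr_name key would be emitted as a child element rather than as an attribute.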
2. Remove whitespace characters from text values
The strip_whitespace parameter controls whether leading and trailing whitespace is removed from text values. The default value is True.
import xmltodict

xml_string = '''
<person name="zhangsan">
    <address>  Shanghai Pudong New District</address>
</person>
'''

# strip_whitespace=True (default)
data1 = xmltodict.parse(xml_string)
print(data1)
# {'person': {'@name': 'zhangsan', 'address': 'Shanghai Pudong New District'}}

# strip_whitespace=False
data2 = xmltodict.parse(xml_string, strip_whitespace=False)
print(data2)
# {'person': {'@name': 'zhangsan', 'address': '  Shanghai Pudong New District', '#text': '\n    \n'}}
3. Remove empty value tags
Use the postprocessor hook to specify a function that processes each key and value according to the logic you need. Returning None from the function drops the item from the result.
import xmltodict

xml_string = '''
<persons>
    <person>
        <name>Zhang San</name>
        <age>18</age>
        <gender>Male</gender>
        <address>
            <street></street>
            <district desc="test">Pudong New District</district>
            <city></city>
            <state>China</state>
        </address>
    </person>
    <person>
        <name>Li Si</name>
        <age>20</age>
        <gender>Female</gender>
        <address>
            <street></street>
            <district desc="test"></district>
            <city>Beijing</city>
            <state>China</state>
        </address>
    </person>
</persons>
'''


def _remove_empty(_, key, value):
    # Returning None tells xmltodict to skip this item
    if value is None:
        return
    return key, value


result1 = xmltodict.parse(xml_string)
result2 = xmltodict.parse(xml_string, postprocessor=_remove_empty)

print(result1['persons']['person'][0]['address'])
# Output: {'street': None, 'district': {'@desc': 'test', '#text': 'Pudong New District'}, 'city': None, 'state': 'China'}
print(result2['persons']['person'][0]['address'])
# Output: {'district': {'@desc': 'test', '#text': 'Pudong New District'}, 'state': 'China'}

print(result1['persons']['person'][1]['address'])
# Output: {'street': None, 'district': {'@desc': 'test'}, 'city': 'Beijing', 'state': 'China'}
print(result2['persons']['person'][1]['address'])
# Output: {'district': {'@desc': 'test'}, 'city': 'Beijing', 'state': 'China'}
Summary
The xmltodict module is a practical tool for processing XML data that combines the flexibility of XML with the simplicity of Python dictionaries.
Whether you need to parse complex XML documents or generate structured XML data, xmltodict handles the task in a simple and intuitive way.
By mapping XML onto ordinary dictionary operations, xmltodict removes much of the boilerplate of XML handling, allowing developers to focus on implementing business logic.