I looked up some material on parsing XML in Go and am recording it here.
1. Simple parsing of small files
To parse small XML files, we can use the encoding/xml package from the Go standard library.
Suppose we have an XML file with the following content:
<?xml version="1.0" encoding="UTF-8"?>
<config>
    <smtpServer>smtp.</smtpServer>
    <smtpPort>25</smtpPort>
    <sender>user@</sender>
    <senderPasswd>123456</senderPasswd>
    <receivers flag="true">
        <age>16</age>
        <user>Mike_Zhang@</user>
        <user>test1@</user>
        <script>
        <![CDATA[
        function matchwo(a,b) {
            if (a < b && a < 0) then {
                return 1;
            } else {
                return 0;
            }
        }
        ]]>
        </script>
    </receivers>
</config>
The corresponding Go code is as follows:
package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
)

// Define structures to map the XML structure
type SConfig struct {
	XMLName      xml.Name   `xml:"config"`
	SmtpServer   string     `xml:"smtpServer"`
	SmtpPort     int        `xml:"smtpPort"`
	Sender       string     `xml:"sender"`
	SenderPasswd string     `xml:"senderPasswd"`
	Receivers    SReceivers `xml:"receivers"`
}

type SReceivers struct {
	Age    int      `xml:"age"`
	Flag   string   `xml:"flag,attr"`
	User   []string `xml:"user"`
	Script string   `xml:"script"`
}

func readXml(path string) {
	// Read the file content directly; ioutil.ReadFile opens and closes the
	// file internally (os.ReadFile is the equivalent since Go 1.16)
	data, err := ioutil.ReadFile(path)
	if err != nil {
		fmt.Println("There was an error reading the file!", err)
		return
	}

	// Initialize the structure variable and unmarshal the XML into it
	v := SConfig{}
	err = xml.Unmarshal(data, &v)
	if err != nil {
		fmt.Printf("error: %v", err)
		return
	}

	// Print the parsed results
	fmt.Println("SmtpServer : ", v.SmtpServer)
	fmt.Println("SmtpPort : ", v.SmtpPort)
	fmt.Println("Sender : ", v.Sender)
	fmt.Println("SenderPasswd : ", v.SenderPasswd)
	fmt.Println(" : ", v.Receivers.Flag)
	fmt.Println(" : ", v.Receivers.Age)
	fmt.Println(" : ", v.Receivers.Script)
	for i, element := range v.Receivers.User {
		fmt.Println(i, element)
	}
}

func main() {
	readXml("") // path to the XML file shown above
}
After running this code, the output is as follows:
SmtpServer : smtp.
SmtpPort : 25
Sender : user@
SenderPasswd : 123456
: true
: 16
:function matchwo(a,b) {
if (a < b && a < 0) then {
return 1;
} else {
return 0;
}
}
0 Mike_Zhang@
1 test1@
2. Parsing large files
When dealing with larger XML files, we can adopt streaming parsing to avoid loading the entire file into memory at once.
We use the same XML file as an example again; its content is unchanged.
The code is as follows:
package main

import (
	"bufio"
	"encoding/xml"
	"fmt"
	"io"
	"os"
)

// Define structures to map the XML structure
type SConfig struct {
	XMLName      xml.Name   `xml:"config"`
	SmtpServer   string     `xml:"smtpServer"`
	SmtpPort     int        `xml:"smtpPort"`
	Sender       string     `xml:"sender"`
	SenderPasswd string     `xml:"senderPasswd"`
	Receivers    SReceivers `xml:"receivers"`
}

type SReceivers struct {
	Age    int      `xml:"age"`
	Flag   string   `xml:"flag,attr"`
	User   []string `xml:"user"`
	Script string   `xml:"script"`
}

func readXml(path string) {
	// Open the file
	file, errOpen := os.Open(path)
	if errOpen != nil {
		fmt.Println("Error opening the file!", errOpen)
		return
	}
	defer file.Close()

	// Create a buffered reader and a streaming decoder on top of it
	reader := bufio.NewReader(file)
	decoder := xml.NewDecoder(reader)

	for {
		t, err := decoder.Token()
		if err != nil {
			// io.EOF is the normal end of input; stop here so the loop
			// cannot spin forever when no config element is found
			if err != io.EOF {
				fmt.Println("Parse error:", err)
			}
			return
		}
		switch token := t.(type) {
		case xml.StartElement:
			name := token.Name.Local
			fmt.Println(name)
			if name == "config" {
				// Parse the config element
				var sConfig = SConfig{}
				configErr := decoder.DecodeElement(&sConfig, &token)
				if configErr != nil {
					fmt.Println("Parse error:")
					fmt.Println(configErr)
				} else {
					fmt.Println(sConfig)
				}
				return
			}
		}
	}
}

func main() {
	readXml("") // path to the XML file shown above
}
The output result is:
config
{{ config} smtp. 25 user@ 123456 {16 true [Mike_Zhang@ test1@]function matchwo(a,b) {
if (a < b && a < 0) then {
return 1;
} else {
return 0;
}
}}}
3. Parsing complex structures
For XML files with a complex structure, we can use the third-party library github.com/beevik/etree to parse them.
Suppose we have an XML file with the following content:
<bookstore xmlns:p="urn:schemas-books-com:prices">
    <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <p:price>30.00</p:price>
    </book>
    <book category="CHILDREN">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <p:price>29.99</p:price>
    </book>
    <book category="WEB">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <p:price>49.99</p:price>
    </book>
    <book category="WEB">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <p:price>39.95</p:price>
    </book>
</bookstore>
The code is as follows:
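package main

import (
	"fmt"

	"github.com/beevik/etree"
)

func readXml(path string) {
	// Load the whole document into an element tree
	doc := etree.NewDocument()
	if err := doc.ReadFromFile(path); err != nil {
		panic(err)
	}

	root := doc.SelectElement("bookstore")
	fmt.Println("ROOT element:", root.Tag)

	// Visit every <book> child of the root
	for _, book := range root.SelectElements("book") {
		fmt.Println("CHILD element:", book.Tag)
		if title := book.SelectElement("title"); title != nil {
			// Fall back to "unknown" when the lang attribute is absent
			lang := title.SelectAttrValue("lang", "unknown")
			fmt.Printf("  TITLE: %s (%s)\n", title.Text(), lang)
		}
		for _, attr := range book.Attr {
			fmt.Printf("  ATTR: %s=%s\n", attr.Key, attr.Value)
		}
	}
}

func main() {
	readXml("") // path to the XML file shown above
}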
The output result is:
ROOT element: bookstore
CHILD element: book
TITLE: Everyday Italian (en)
ATTR: category=COOKING
CHILD element: book
TITLE: Harry Potter (en)
ATTR: category=CHILDREN
CHILD element: book
TITLE: XQuery Kick Start (en)
ATTR: category=WEB
CHILD element: book
TITLE: Learning XML (en)
ATTR: category=WEB
When an XML file is parsed as a stream, a field whose element contains nested tags can be read incompletely or come back empty, because the text inside the nested tags is not captured. One way to handle this is as follows:
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"os"
)

// Define a structure to map pre tags in HTML fragments
type HTMLPre struct {
	XMLName xml.Name `xml:"htmlpre"`
	Text    string   `xml:",innerxml"`
}

// Define a structure to map Rule elements
type XCCDFRule struct {
	XMLName     xml.Name  `xml:"Rule"`
	ID          string    `xml:"id,attr"`
	Title       string    `xml:"title"`
	Description string    `xml:"description"`
	HTMLPre     []HTMLPre `xml:"htmlpre"`
}

func parseRule(reader io.Reader) (*XCCDFRule, error) {
	decoder := xml.NewDecoder(reader)
	// Lenient settings so that HTML-style content does not abort parsing
	decoder.Strict = false
	decoder.AutoClose = xml.HTMLAutoClose
	decoder.Entity = xml.HTMLEntity

	var rule XCCDFRule
	for {
		token, err := decoder.Token()
		if err == io.EOF {
			break
		} else if err != nil {
			return nil, err
		}
		switch se := token.(type) {
		case xml.StartElement:
			if se.Name.Local == "Rule" {
				if err := decoder.DecodeElement(&rule, &se); err != nil {
					return nil, err
				}
				// Further parsing of the HTML content in description
				// would go here; that step is not part of this listing
				return &rule, nil
			}
		}
	}
	return nil, fmt.Errorf("Rule element not found")
}

func main() {
	file, err := os.Open("your_file.xml")
	if err != nil {
		fmt.Println("Open file error:", err)
		return
	}
	defer file.Close()

	rule, err := parseRule(file)
	if err != nil {
		fmt.Println("Parse error:", err)
		return
	}
	fmt.Printf("Rule ID: %s\n", rule.ID)
	fmt.Printf("Rule Title: %s\n", rule.Title)
	fmt.Printf("Rule Description: %s\n", rule.Description)
}
In this improved code:
- The HTMLPre structure is defined to map the content of the <htmlpre> tag, including the text inside the tag (using xml:",innerxml" to get all XML content in the tag as a string).
- An HTMLPre field is added to the XCCDFRule structure to store the list of parsed <htmlpre> tag contents.
- In the parseRule function, after the Rule element is decoded, the parseDescription function is called to further parse the HTML content in the description field; the text in the <htmlpre> tags is extracted and stored in the list.
- The parseDescription function creates a new XML parser to parse the contents of the description string, specifically searching for <html:pre> tags and decoding their contents.
Note that this is only one way to handle the problem. Depending on your actual needs, the code may need to be adjusted and extended to handle more complex nested HTML structures or other types of content in the XML. Also, replace "your_file.xml" with the actual path to your XML file.