outline:
Introduction
1. Related terms for XML documents
2. Related terms of DTD
Introduction
The hardest part of getting started with XML is the number of new terms and concepts to absorb. XML is itself a new technology that is still developing and changing, and standards bodies and major companies (Microsoft, IBM, SUN, etc.) keep introducing their own ideas and standards, so it is no surprise that new concepts are flying everywhere. In China there is no authoritative institution or organization that formally names these terms. Most Chinese textbooks on XML translate them according to the author's own understanding; some translations are correct and some are wrong, which makes these concepts even harder to learn.
The explanations of XML terms below are likewise my own understanding and translation. I (Ajie) base them on the XML 1.0 specification and the related formal documents published by the W3C, so they should be essentially correct, or at least not misleading. If you want to read further, the last part of this article lists the sources and links of the relevant resources, which you can visit directly. OK, let's get to the topic:
1. Related terms for XML documents
What is an XML document? If you know what an HTML source file is, you already know: an XML document is a source file written with XML tags. XML documents are also plain text files, which you can create and modify with Notepad. The file extension of an XML document is .xml. You can also open an .xml file directly in IE 5.0 or later, but what you see is the XML source; the content is not rendered as a page. You can save the following code to a file and try it:
<?xml version="1.0" encoding="GB2312"?>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie</author>
<email>ajie@</email>
<date>20010115</date>
</myfile>
An XML document contains three parts:
1. An XML declaration;
2. A document type declaration, which refers to the document type definition (DTD);
3. The content marked up with XML tags.
For example:
<?xml version="1.0"?>
<!DOCTYPE filelist SYSTEM "">
<filelist>
<myfile>
<title>QUICK START OF XML</title>
<author>ajie</author>
</myfile>
......
</filelist>
The first line, <?xml version="1.0"?>, is the XML declaration. The second line is the document type declaration, which states which DTD defines the document type. From the third line on is the main content of the document.
Let's learn about the terms related to XML documents:
Element:
You already know elements from HTML: they are the smallest units that make up an HTML document, and the same is true in XML. An element is defined by a tag and consists of the start tag, the end tag, and the content between them, like this: <author>ajie</author>
The only difference is that in HTML the tags are fixed, while in XML you create the tags yourself.
Tag
Tags are used to define elements. In XML, tags must appear in pairs that surround the data in between. The name of the tag is the same as the name of the element. For example, in an element like this:
<author>ajie</author>
<author> is the tag.
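The relationship between an element, its tag name, and its content can be checked with a few lines of code. The following is only a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of the original tutorial):

```python
import xml.etree.ElementTree as ET

# Parse the single element used in the text above.
author = ET.fromstring("<author>ajie</author>")

element_name = author.tag    # the name shared by the start tag and the end tag
element_text = author.text   # the data enclosed by the tag pair

print(element_name, element_text)
```

The parser reports one name for the element because the start tag and the end tag must carry the same name.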
Attribute:
What is an attribute? Look at this HTML code: <font color="red">word</font>. Here color is one of the attributes of font.
An attribute further describes a tag. A tag can have several attributes; font, for example, also has a size attribute. Attributes in XML work the same way as in HTML. Each attribute has its own name and value, and the attribute is part of the tag. For example:
<author sex="female">ajie</author>
Attributes in XML are also defined by you. We recommend that you avoid attributes where possible and turn them into child elements instead. For example, the code above can be changed to this:
<author>ajie
<sex>female</sex>
</author>
The reason is that attributes are harder to extend and harder for programs to process.
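To compare the two forms from a program's point of view, here is a small sketch that reads the same value once from an attribute and once from a child element. It assumes Python's standard xml.etree.ElementTree module, which the original text does not mention:

```python
import xml.etree.ElementTree as ET

# The attribute form and the child-element form from the text above.
with_attribute = ET.fromstring('<author sex="female">ajie</author>')
with_child = ET.fromstring("""<author>ajie
<sex>female</sex>
</author>""")

# An attribute is looked up by name on the element itself.
sex_from_attribute = with_attribute.get("sex")

# A child element is a node of its own and can later hold
# further children or attributes.
sex_from_child = with_child.findtext("sex")

# The author's name is the text before the first child element.
author_name = with_child.text.strip()

print(sex_from_attribute, sex_from_child, author_name)
```

Both forms carry the same data; the child-element form simply leaves room to add more structure under <sex> later.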
Declaration
An XML document should begin with an XML declaration on its first line. The declaration states that the document is an XML document and which XML version it follows. An XML declaration looks like this:
<?xml version="1.0"?>
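A parser makes the information in the declaration available to programs. As a rough sketch, assuming Python's standard xml.dom.minidom module (an assumption of this example, not something the tutorial uses), the version and the encoding can be read back like this:

```python
from xml.dom import minidom

# A one-element document with an XML declaration.
# The encoding attribute is the same kind shown in the first example.
source = b'<?xml version="1.0" encoding="UTF-8"?><myfile/>'

doc = minidom.parseString(source)

declared_version = doc.version     # value of version="..."
declared_encoding = doc.encoding   # value of encoding="..."

print(declared_version, declared_encoding)
```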
DTD (Document Type Definition)
A DTD defines the elements and attributes in an XML document and the relationships between elements.
A DTD file can be used to check whether the structure of an XML document is correct. Creating an XML document does not necessarily require a DTD file, however. DTD files are described in detail in a separate section below.
Well-formed XML
A document that follows the XML syntax rules and conforms to the XML specification is called "well-formed". If all your tags strictly follow the XML specification, your XML document does not need a DTD file to define it.
A well-formed document should start with an XML declaration, for example:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The declaration first specifies the XML version the document complies with, currently 1.0. Second, it states the character encoding the document uses; the default is UTF-8, and a file saved in a Chinese encoding such as GB2312 must declare that encoding instead. Third, standalone="yes" states that the document is "standalone", that is, it does not need a DTD file to verify whether its tags are valid.
A well-formed XML document must have a root element, which is the first element created immediately after the declaration. All other elements are child elements of this root element and are nested inside it.
The content of a well-formed XML document must be written according to XML syntax. (We will explain XML syntax in the next chapter.)
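Whether a document is well-formed can be tested by simply handing it to a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One root element, every tag closed.
good = '<?xml version="1.0"?><myfile><title>XML Easy Learning Manual</title></myfile>'

# The <title> start tag has no matching end tag.
unclosed = '<?xml version="1.0"?><myfile><title>XML Easy Learning Manual</myfile>'

# Two top-level elements, so there is no single root element.
two_roots = '<?xml version="1.0"?><title>a</title><author>b</author>'

for sample in (good, unclosed, two_roots):
    print(is_well_formed(sample))
```

No DTD is involved here: the check covers only the syntax rules and the single-root requirement described above.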
Valid XML
An XML document that follows the XML syntax rules and also conforms to its DTD is called a valid XML document. Comparing "well-formed XML" with "valid XML", the biggest difference is that the former only has to comply with the XML specification, while the latter must also satisfy its own document type definition (DTD).
The process of checking an XML document against its DTD to see whether it follows the DTD rules is called validation. This is usually done by a piece of software called a parser.
A valid XML document should also start with an XML declaration, for example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
Unlike the previous example, the standalone attribute is set to "no" here because the document must be used together with its DTD. The DTD is referenced as follows:
<!DOCTYPE type-of-doc SYSTEM/PUBLIC "dtd-name">
where:
"!DOCTYPE" means that you are declaring a document type;
"type-of-doc" is the name of the document type, which you define yourself; it is usually the same as the DTD file name, and in a valid document it must match the name of the root element;
Only one of the two keywords "SYSTEM/PUBLIC" is used. SYSTEM refers to the URL of a private DTD file used by the document, while PUBLIC indicates that the document uses a publicly shared DTD;
"dtd-name" is the URL and name of the DTD file. DTD files conventionally have the extension ".dtd".
Continuing the example above, it should be written like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE filelist SYSTEM "">
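The parts of a DOCTYPE declaration can also be inspected from code. The following sketch assumes Python's standard xml.dom.minidom module, and the file name "filelist.dtd" is a hypothetical placeholder chosen for this example only (the parser used here records the name but does not open the file):

```python
from xml.dom import minidom

# "filelist.dtd" is a hypothetical file name; it is never read.
source = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE filelist SYSTEM "filelist.dtd">
<filelist/>"""

doc = minidom.parseString(source)
doctype = doc.doctype

doctype_name = doctype.name     # the "type-of-doc" part
system_id = doctype.systemId    # the "dtd-name" part after SYSTEM
public_id = doctype.publicId    # None, because PUBLIC is not used here
is_standalone = doc.standalone  # False, from standalone="no"

print(doctype_name, system_id, public_id, is_standalone)
```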
2. Related terms of DTD
We briefly mentioned above what a DTD is. A DTD is an effective way to make sure XML documents have the correct structure. By checking an XML document against its DTD, you can see whether the document follows the specification and whether elements and tags are used correctly. A DTD contains: rules defining the elements, rules defining the relationships between elements, the attributes that elements may use, and the entities or symbols that may be used.
A DTD file is also a plain text file, with the extension .dtd.
Why use DTD files? As I understand it, they serve data sharing and data exchange over the network. The biggest benefit of DTDs is that they can be shared (this is the PUBLIC keyword in the DOCTYPE declaration above). For example, if two people in the same industry but in different regions use the same DTD as the specification for their documents, their data can easily be exchanged and shared. Anyone else online who wants to contribute data only needs to create documents according to the public DTD and can join in immediately.
A large number of ready-made DTD files are already available. For different industries and applications, these DTDs establish common rules for elements and tags. You don't need to create everything from scratch; just add the new tags you need on top of them.
Of course, if you want, you can create your own DTD, which may fit your documents better. Creating your own DTD is quite simple; generally you only need to define four or five elements.
There are two ways to reference a DTD:
1. Include the DTD directly in the XML document
You just need to insert some special declarations inside the DOCTYPE declaration. Suppose we have an XML document:
<?xml version="1.0" encoding="GB2312"?>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie</author>
</myfile>
We can insert the following code after the first line:
<!DOCTYPE myfile [
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ENTITY copyright "Copyright 2001, Ajie.">
]>
2. Reference a separate DTD file
Save the DTD as a .dtd file and then reference it in the DOCTYPE declaration. For example, save the following code as a .dtd file:
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
Then reference it in the XML document by inserting the following after the first line:
<!DOCTYPE myfile SYSTEM "">
As you can see, referencing a DTD file from an XML document is similar to referencing a .js file from HTML. How to write DTDs will be covered together with the syntax of XML documents in the next chapter.
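For the first method, the ENTITY line in the internal DTD defines a shorthand that the parser expands wherever &copyright; appears in the document. Here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; placing the entity reference inside <author> and including a declaration for myfile are choices made only for this illustration:

```python
import xml.etree.ElementTree as ET

# The document from method 1 with its DTD included directly.
source = """<?xml version="1.0"?>
<!DOCTYPE myfile [
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ENTITY copyright "Copyright 2001, Ajie.">
]>
<myfile>
<title>XML Easy Learning Manual</title>
<author>&copyright;</author>
</myfile>"""

root = ET.fromstring(source)

# The parser replaces &copyright; with the text defined in the DTD.
author_text = root.findtext("author")
print(author_text)
```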
Let’s learn about DTD terms below:
Schema
A schema is a description of data rules. A schema does two things:
a. It defines the data types of elements and the relationships between elements;
b. It defines the types of content that an element can contain.
A DTD is a schema for XML documents.
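To make the idea of "rules about which elements may contain which" concrete, the sketch below writes the rules of the small myfile DTD as a plain table and checks a document against it. This is a deliberately simplified, hand-written stand-in and not a real DTD validator; the names RULES and conforms are hypothetical, and Python with xml.etree.ElementTree is an assumption of the example:

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table mirroring the myfile DTD in this chapter:
# element name -> the exact sequence of child elements it must contain.
RULES = {
    "myfile": ["title", "author"],
    "title": [],    # text only, like (#PCDATA)
    "author": [],   # text only, like (#PCDATA)
}

def conforms(element):
    """Check an element and all its descendants against RULES."""
    expected = RULES.get(element.tag)
    if expected is None:
        return False  # the element is not declared at all
    if [child.tag for child in element] != expected:
        return False  # wrong children or wrong order
    return all(conforms(child) for child in element)

ok_doc = ET.fromstring("<myfile><title>t</title><author>a</author></myfile>")
swapped_doc = ET.fromstring("<myfile><author>a</author><title>t</title></myfile>")
unknown_doc = ET.fromstring("<myfile><title>t</title><email>e</email></myfile>")

print(conforms(ok_doc), conforms(swapped_doc), conforms(unknown_doc))
```

A real DTD can express much more (optional and repeated elements, attributes, entities), which is why actual validation is left to a validating parser.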
Document Tree
The "document tree" was already mentioned in the second chapter. It is a visual representation of the hierarchical structure of the document's elements. A document tree contains a root element, which is the top-level element (the first element immediately after the XML declaration). See the example:
<?xml version="1.0"?>
<filelist>
<myfile>
<title>...</title>
<author>...</author>
</myfile>
</filelist>
The example above is arranged as a three-level "tree", in which <filelist> is the root element. In both XML and DTD files, the first element defined is the root element.
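The three levels can be made visible by walking the tree and indenting each element by its depth. This is a small sketch assuming Python's standard xml.etree.ElementTree module; the function name outline is made up for the example:

```python
import xml.etree.ElementTree as ET

source = """<?xml version="1.0"?>
<filelist>
<myfile>
<title>...</title>
<author>...</author>
</myfile>
</filelist>"""

def outline(element, depth=0):
    """Return one line per element, indented by its level in the tree."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(source))
print("\n".join(tree_lines))
```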
Parent Element/Child Element
A parent element is an element that contains other elements, and the contained elements are called its child elements. In the document tree above, <myfile> is a parent element, <title> and <author> are its child elements, and <myfile> is in turn a child element of <filelist>. A last-level element such as <title>, which contains no child elements, is also called a "leaf element".
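The parent and child relationships, and the last-level elements that have no children, can be looked up programmatically. The following sketch assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<filelist><myfile><title>...</title><author>...</author></myfile></filelist>"
)

# ElementTree elements do not store a link to their parent,
# so build a child -> parent lookup table by visiting every element.
parent_of = {child: parent for parent in root.iter() for child in parent}

myfile = root.find("myfile")
title = root.find("myfile/title")

parent_of_title = parent_of[title].tag     # the element that contains <title>
parent_of_myfile = parent_of[myfile].tag   # the element that contains <myfile>

# Last-level elements are those that contain no child elements.
last_level = [el.tag for el in root.iter() if len(el) == 0]

print(parent_of_title, parent_of_myfile, last_level)
```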
Parser
A parser is a software tool that checks whether XML documents are well-formed and, optionally, whether they comply with a DTD.
XML parsers fall into two categories. A "non-validating parser" only checks whether the document follows the XML syntax rules and builds the document tree from the element tags. A "validating parser" checks not only the syntax and the document tree, but also whether the element tags you use comply with the corresponding DTD.
A parser can be used on its own or as part of an editor or a browser. In the related resource list later on, I list some of the most popular parsers.
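The difference between the two kinds of parser is easy to observe. Python's standard xml.etree.ElementTree module (an assumption of this example, not a tool named in the tutorial) is built on Expat, which is a non-validating parser, so it accepts a document that is well-formed even when the document breaks the rules of its own DTD:

```python
import xml.etree.ElementTree as ET

# The internal DTD requires <author> after <title>,
# but the document below leaves <author> out.
source = """<?xml version="1.0"?>
<!DOCTYPE myfile [
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
]>
<myfile>
<title>XML Easy Learning Manual</title>
</myfile>"""

# No error is raised: the syntax is checked, the DTD rules are not.
root = ET.fromstring(source)
child_tags = [child.tag for child in root]
print(child_tags)
```

A validating parser given the same input would be expected to report the missing <author> element.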
Well, in this third chapter we have learned some basic terms of XML and DTD, but we don't yet know how to write these files or what syntax to follow. In the next chapter we will focus on the syntax for writing XML and DTD documents. Please read on, thank you!