SoFunction
Updated on 2025-04-08

A Simple XML Tutorial, Part 3

The future of XML
So now you know what XML is. Admittedly, the structure is a bit complicated, and DTDs offer all kinds of options for defining what a document can contain. But that's not all there is to it.

Consider an industry for which the exchange of data is vital, such as banking. Banks use proprietary systems to track their internal transactions, but if they use a common XML format on the web, they can describe transaction information to another institution or to an application (such as Quicken or MS Money). Of course, they can also mark up the data on a web page. FYI: this markup language really exists. It is called OFX, the Open Financial Exchange format.

In some cases, when IE 4 on the PC encounters a <SOFTPKG> tag, a function is launched that gives the user the chance to update installed software. If you use Windows 98, you may have seen this happen without knowing that it was an XML application.

So here we have three XML applications that look as different from one another as the adding machine, typewriter, and pencil you would have seen around Andy Grove in the 70s. But just as with the applications that eventually appeared on the PC, the benefits of XML can be described in general terms: "Good things happen when you use human- and machine-readable tags to describe your data."

What are these good things? I have no idea. But then, I don't know what the next generation of programs on my PC will look like either. As long as data is marked up this way, all sorts of different applications can be built on it.

Are you starting to think about how far this could go?

We have a lot of practical applications of XML to talk about, and I will cover them in the near future. Since we are all web people, we will start with XSL (eXtensible Style Language).

By the way, this recipe really is from my mother, and it is outstanding. If you use it, add half a cup of crushed coconut.


I am writing this because I sincerely care about your opinion of me. My worry is this: suppose you have read my introduction to XML and are ready to start writing your own XML documents. You go looking for an established DTD to represent your information, and you find one that contains something like this:

 

<!ATTLIST fn

%;

value CDATA #FIXED "TEXT">

<!ENTITY % "

CDATA #REQUIRED

ENTITY #REQUIRED">


You will conclude that Jay must be an idiot: he never said anything about ATTLIST or ENTITY, whatever they are.

So let's talk about them now. Bear with me for a moment.

The lines above may look intimidating, but there is really nothing to them. They are used in a DTD to define the attributes and entities of an XML document. Anyone who knows HTML will find this familiar. Attributes are the entries inside an HTML tag that describe the tag more precisely. The frequently seen <img src="" height="20" width="20"> has three attributes: src, height, and width. As you will see below, using attributes in XML documents works in much the same way.

Entities are nothing new either. If you have ever typed &amp; in HTML, you have already mastered the basics: a string that starts with & and ends with a semicolon stands for another character or set of characters. (Here is a complete list of ISO entities.)
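To make the idea concrete, here is a minimal sketch in Python using only the standard library. The document, the note element, and the entity name mom are made up for illustration; the point is simply that the parser swaps each entity reference for the text it stands for.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one entity, "mom".
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY mom "my mother">
]>
<note>This recipe is from &mom; &amp; it is outstanding.</note>"""

note = ET.fromstring(DOC)

# The parser replaces &mom; with its declared text and &amp; with "&".
print(note.text)
```

Entities declared in an external DTD file are a different matter; this sketch only covers declarations made inside the document itself.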

Of course, attributes and entities can do more than that in XML. Explaining them inevitably means introducing some syntax, though not much. Once you know it, you will be able to work with XML documents without effort.

Simplifying the recipe

If you have read my introduction to XML, you will remember that the ingredients in the recipe were represented by simple tags, such as <item>2 cups flour</item>. After writing that article, I roamed the internet and found another XML document for recipes. Its ingredient elements look like this:


<ingredient quantity="2" units="cups">flour</ingredient>

This approach has one practical benefit: it makes the data easier to control. In the first method, the <item> tag holds a jumble of different information. If I wanted to extract a list of ingredients without the amount of each one, I couldn't do it.
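As a rough illustration of that benefit, the sketch below pulls the ingredient names and the amounts apart with Python's standard library. The recipe wrapper element and the sugar entry are assumptions added only to make a complete document.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe fragment using the attribute-based ingredient element.
RECIPE = """<recipe>
  <ingredient quantity="2" units="cups">flour</ingredient>
  <ingredient quantity="1" units="cup">sugar</ingredient>
</recipe>"""

root = ET.fromstring(RECIPE)

# The element text is only the ingredient name, with no amounts mixed in.
names = [ing.text for ing in root.iter("ingredient")]

# The amounts live in attributes and can be read separately when needed.
amounts = [(ing.get("quantity"), ing.get("units")) for ing in root.iter("ingredient")]

print(names)
print(amounts)
```

With the <item>2 cups flour</item> form, the same separation would require picking the text apart by hand.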

I could get similar functionality with the following structure:


<item>flour
  <quantity>2</quantity>
  <units>cups</units>
</item>


This works, but there are two problems. First, the item element has mixed content: text and other tags. I quickly discovered that this kind of structure should be avoided whenever possible. Second, the quantity and units tags have almost no meaning on their own. It's hard to imagine units without an actual ingredient. These entries are short descriptions, and I would rather treat them as attributes.
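Here is a small sketch of why mixed content is awkward to process, again with Python's standard library. The variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# The mixed-content version: loose text followed by child elements.
ITEM = """<item>flour
  <quantity>2</quantity>
  <units>cups</units>
</item>"""

item = ET.fromstring(ITEM)

# The ingredient name is loose text in front of the child tags, so it
# arrives with the line break and indentation still attached.
raw_name = item.text
name = raw_name.strip()

quantity = item.findtext("quantity")
units = item.findtext("units")

print(repr(raw_name), name, quantity, units)
```

The name has to be cleaned up by hand, while the attribute-based form hands it over as plain element text.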

The first thing to note is that the attribute names quantity and units only mean something when they are processed by an application that knows how to interpret them.

Before attributes can appear in a valid document, the DTD has to declare them. For the ingredient element above, we only need to include the following in the DTD:

 

<!ELEMENT ingredient (#PCDATA)>

<!ATTLIST ingredient quantity CDATA #REQUIRED>

<!ATTLIST ingredient units CDATA #REQUIRED>


The first line looks familiar: it is a standard element declaration of the kind found in any DTD. Each ATTLIST line contains the following parts, in order:

<!ATTLIST ingredient quantity CDATA #REQUIRED>

ingredient: this is the element the attribute belongs to.

quantity: this is where the attribute name is defined.

CDATA: this is the attribute type. CDATA stands for character data, which means the processor accepts any text as the attribute's value.

#REQUIRED: the last part defines the attribute's default. An actual value can be used here, such as "3". An attribute left out of the XML will then have the value 3, and any value that is entered overrides the default.

In the example above I did not set a specific value but used the XML keyword #REQUIRED instead. It tells the processor that this attribute must be given a value. If it is missing, the document is not valid and a validating processor will reject it.
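The default-value behavior can be sketched with Python's built-in Expat bindings. The document below is an assumption made up for this example: it declares "3" as the default for quantity in an internal DTD subset. Note that Expat is a non-validating parser, so it fills in declared defaults but would not complain about a missing #REQUIRED attribute.

```python
import xml.parsers.expat

# Hypothetical document: quantity defaults to "3" when it is left out.
DOC = """<!DOCTYPE recipe [
  <!ATTLIST ingredient quantity CDATA "3">
]>
<recipe>
  <ingredient>eggs</ingredient>
  <ingredient quantity="2">cups of flour</ingredient>
</recipe>"""

seen = []

def start_element(name, attrs):
    # attrs holds both the attributes written in the tag and any
    # defaults supplied by the ATTLIST declaration.
    if name == "ingredient":
        seen.append(attrs.get("quantity"))

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOC, True)

print(seen)
```

The first ingredient picks up the declared default, while the second keeps the value that was typed in.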

There are two other keywords for the default. The first is #FIXED, which is used when the attribute value stays the same throughout the document. Suppose I define the attributes of a picture element, and all pictures have the same size, say 100 by 50 pixels. The attributes can then be declared in the DTD like this:


<!ATTLIST picture length CDATA #FIXED "100 px">

<!ATTLIST picture width CDATA #FIXED "50 px">

 

The other keyword is #IMPLIED, which means that the attribute is optional: it can be given a value or left out.

Let's take a look at the attribute types below.

If you decide to write a DTD yourself, you may need a book that explains every combination XML allows in an ATTLIST declaration. But if you are borrowing a DTD, you probably only need to know CDATA and three other attribute types.

The first is ID. It requires that the attribute's value is never repeated within the document. Anyone who has used a database knows why a unique identifier is necessary. The ATTLIST declaration in the DTD looks like this:

<!ATTLIST element_name attribute_name ID #REQUIRED>

It's hard to imagine an ID attribute type without the #REQUIRED default. With that combination, any duplicate or missing ID forces the processor to report an error. An ID must start with a letter or an underscore and cannot contain any spaces.
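Python's standard library does not validate against a DTD, so the sketch below imitates the uniqueness rule by hand. The duplicate_ids helper, the id attribute name, and the sample document are all hypothetical; a validating parser would perform this check for you.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def duplicate_ids(xml_text, attr="id"):
    """Return the values of `attr` that occur more than once, sorted."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get(attr) for el in root.iter() if el.get(attr) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical document in which the value "r1" is used twice.
DOC = """<recipes>
  <recipe id="r1"/>
  <recipe id="r2"/>
  <recipe id="r1"/>
</recipes>"""

print(duplicate_ids(DOC))
```

An empty result means every value is unique, which is what the ID type demands.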

The NMTOKEN type follows similar naming rules: the value cannot contain spaces, although unlike an ID it may begin with a digit. Repeated values are allowed. It serves as a safeguard when data is passed on to applications. Most programming languages, including Java and JavaScript, do not allow spaces in identifiers. In most cases it is best to make sure attribute values follow the same rule.
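A rough way to picture the NMTOKEN rule is a character check like the one below. This is a simplified sketch that only covers ASCII; the XML specification also allows many non-ASCII letters, and the is_nmtoken helper is a made-up name. In the specification a name token may start with any of these characters, including a digit.

```python
import re

# ASCII-only approximation of the characters allowed in a name token:
# letters, digits, period, underscore, colon, and hyphen.
NMTOKEN_RE = re.compile(r"[A-Za-z0-9._:-]+")

def is_nmtoken(value):
    """Return True if `value` is one or more name characters, with no spaces."""
    return NMTOKEN_RE.fullmatch(value) is not None

for candidate in ["cups", "2cups", "two cups", ""]:
    print(repr(candidate), is_nmtoken(candidate))
```

The value with a space and the empty value fail the check, and the other two pass.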

Finally, there is the enumerated type, which needs no special keyword. Instead, list the allowed values in parentheses, separated by the "|" symbol, for example:

<!ATTLIST element_name sibling (brother | sister) #REQUIRED>

Use this type when an attribute has only a limited set of possible values.
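Since Python's standard library will not enforce an enumerated type, here is a hand-rolled sketch of the same check. The person and family elements and the bad_sibling_values helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# The values permitted by the enumerated type (brother | sister).
ALLOWED = {"brother", "sister"}

def bad_sibling_values(xml_text):
    """Return the sibling values that fall outside the allowed set.

    A person element with no sibling attribute shows up as None,
    mirroring the #REQUIRED keyword.
    """
    root = ET.fromstring(xml_text)
    return [
        el.get("sibling")
        for el in root.iter("person")
        if el.get("sibling") not in ALLOWED
    ]

# Hypothetical document with one value that is not in the list.
DOC = """<family>
  <person sibling="brother">Al</person>
  <person sibling="cousin">Bo</person>
</family>"""

print(bad_sibling_values(DOC))
```

A validating parser would reject the second person element outright; this sketch only reports the offending value.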

I hope today's lesson wasn't too boring, so let's keep reading!