Document format errors
My mom's list contains dozens of recipes, maybe even hundreds. If something in it
is faulty, troubleshooting will be very difficult: you will be hunting for a missing
tag line by line. With several layers of nesting, errors get even harder to find.
But very good help is available. Parsers, programs that analyze XML code and report
format errors, can be had for free online. The best of them is Lark, written by
Tim Bray, an editor of the XML specification, one of its most vocal advocates,
and one of the smart people.
I used Lark to analyze the following code. Note that the closing </ingredients> tag
appears in the wrong place, inside the "chocolate chips" item:
<?xml version="1.0"?>
<list>
<recipe>
<author>Carol Schmidt</author>
<recipe_name>Chocolate Chip Bars</recipe_name>
<meal>Dinner
<course>Dessert</course>
</meal>
<ingredients>
<item>2/3 C butter</item>
<item>2 C brown sugar</item>
<item>1 tsp vanilla</item>
<item>1 3/4 C unsifted all-purpose flour</item>
<item>1 1/2 tsp baking powder</item>
<item>1/2 tsp salt</item>
<item>3 eggs</item>
<item>1/2 C chopped nuts</item>
<item>
</ingredients>2 cups (12-oz pkg.) semi-sweet choc.
chips</item>
<directions>
Preheat oven to 350 degrees. Melt butter;
combine with brown sugar and vanilla in large mixing bowl.
Set aside to cool. Combine flour, baking powder, and salt; set aside.
Add eggs to cooled sugar mixture; beat well. Stir in reserved dry
ingredients, nuts, and chips.
Spread in greased 13-by-9-inch pan. Bake for 25 to 30 minutes
until golden brown; cool. Cut into squares.
</directions>
</recipe>
</list>
Here are the results returned by the parser:
Error Report
Line 17, column 22: Encountered </ingredients> expected </item>
... assumed </item>
Line 18, column 36: Encountered </item> with no start-tag.
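A check of this kind can be sketched with a present-day parser as well. The example below is only an illustration and is not Lark: it assumes Python's standard xml.etree.ElementTree module, and the function name check_well_formed and the shortened ingredient list are made up for the example. The wording and column numbers will differ from Lark's report, but the parser stops at the same misplaced closing tag.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return (line, column, str(err))
    return None


# Shortened version of the faulty list: </ingredients> closes before </item>.
BROKEN = """<ingredients>
<item>1/2 C chopped nuts</item>
<item>
</ingredients>2 cups (12-oz pkg.) semi-sweet choc. chips</item>"""

# The same fragment with the closing tags in the right order.
FIXED = """<ingredients>
<item>1/2 C chopped nuts</item>
<item>2 cups (12-oz pkg.) semi-sweet choc. chips</item>
</ingredients>"""

if __name__ == "__main__":
    print(check_well_formed(BROKEN))
    print(check_well_formed(FIXED))
```

For the broken fragment the function returns a tuple whose first entry is the line holding the stray </ingredients> tag; for the fixed fragment it returns None.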
With a report like this, finding the error is no longer a problem. But what does it
mean for an XML file to be valid?
Achieving validity
So far, we have put our information into a well-formed XML document. In fact, there
is still a lot to do, and there are still dangers lurking: even though an XML file is
well-formed, key information may be missing. Take a look at the following example:
<recipe>
<author>Carol Schmidt</author>
<recipe_name>Chocolate Chip Bars</recipe_name>
<meal>Dinner <course>Dessert</course> </meal>
<ingredients> </ingredients>
<directions>Melt butter; combine with, etc. ... </directions>
</recipe>
This recipe contains no ingredients, yet because it is well-formed,
the Lark parser won't find any problems with it. Anyone who has managed even the most
forgiving database knows the common mistakes we humans make: given the chance, we will
leave out key information and add useless nonsense. This is why the inventors of XML
included the DTD, the Document Type Definition. A DTD provides a kind of insurance
that the XML is more or less what you think it is.
Let's take a look at a DTD for the recipe list.
<!DOCTYPE list [
<!ELEMENT recipe (recipe_name, author, meal, ingredients, directions)>
<!ELEMENT ingredients (item+)>
<!ELEMENT meal (#PCDATA, course?)>
<!ELEMENT item (#PCDATA, sub_item*)>
<!ELEMENT recipe_name (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT course (#PCDATA)>
<!ELEMENT sub_item (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
]>
This code doesn't look friendly at first, but once you break it down its meaning
becomes clear. Let's go through it in detail:
<!DOCTYPE list [
This line says that what is enclosed in the square brackets is the DTD of a document
whose root element is <list>. As we mentioned before, the root element contains all other elements.
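The link between the name in the DOCTYPE line and the root element can be checked by hand. The sketch below assumes Python's standard xml.dom.minidom module; the function name doctype_matches_root and the two tiny documents are hypothetical, and the single declaration inside the brackets is only a placeholder. The parser used here does not validate, so it accepts both documents and the comparison is done explicitly. The function expects a document that has a DOCTYPE.

```python
from xml.dom import minidom


def doctype_matches_root(xml_text):
    """Return (doctype_name, root_name, names_match) for a document with a DOCTYPE."""
    document = minidom.parseString(xml_text)
    doctype_name = document.doctype.name          # name given after <!DOCTYPE
    root_name = document.documentElement.tagName  # outermost element
    return (doctype_name, root_name, doctype_name == root_name)


# Hypothetical miniature documents, not the full recipe list.
MATCHING = """<!DOCTYPE list [
<!ELEMENT author (#PCDATA)>
]>
<list><author>Carol Schmidt</author></list>"""

# Same document, but the DOCTYPE now names a different root element.
MISMATCHED = MATCHING.replace("<!DOCTYPE list", "<!DOCTYPE recipe")

if __name__ == "__main__":
    print(doctype_matches_root(MATCHING))
    print(doctype_matches_root(MISMATCHED))
```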
<!ELEMENT recipe (recipe_name, author, meal, ingredients, directions)>
This line defines the <recipe> tag. The parentheses mean that these five elements must
appear inside the <recipe> tag, in this order.
<!ELEMENT meal (#PCDATA, course?)>
This line needs a more detailed explanation. It defines the following structure:
<meal>Here the meal name is mandatory
<course>One course name may appear, but it is not
mandatory</course>
</meal>
I do this because, as I see it, lunch doesn't have to be tied to a particular course, whereas
dinner may run to appetizers, main courses, and desserts. The text part is provided by
#PCDATA, which stands for parsed character data (that is, non-binary data).
Here #PCDATA is plain text, for example "Dinner".
The question mark after "course" indicates that the <course> tag is optional: it may appear
once inside the <meal> tag or not at all.
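One caveat is worth checking against the parser you use. In XML 1.0 as published, an element that mixes text with child elements can only be declared in the form (#PCDATA | course)*, which allows text and <course> elements in any order and any number. The comma form shown above, with its text followed by an optional <course>, is not part of that grammar, and the same goes for the <item> declaration further on. The sketch below is an illustration under assumptions: it relies on the expat parser behind Python's standard xml.etree.ElementTree, and the helper name declaration_parses is made up. It hands each declaration to the parser and reports whether it was accepted.

```python
import xml.etree.ElementTree as ET


def declaration_parses(content_model):
    """Return True if a <meal> declaration with this content model is accepted."""
    document = (
        "<!DOCTYPE meal [\n"
        f"<!ELEMENT meal {content_model}>\n"
        "<!ELEMENT course (#PCDATA)>\n"
        "]>\n"
        "<meal>Dinner<course>Dessert</course></meal>"
    )
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Form used in this article: text, then an optional <course>.
    print(declaration_parses("(#PCDATA, course?)"))
    # Mixed-content form defined by XML 1.0: text and <course> in any order.
    print(declaration_parses("(#PCDATA | course)*"))
```

If your parser rejects the comma form, the choice form is the replacement, at the cost of no longer being able to say that the meal name comes first or that <course> appears at most once.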
Now let's look at the next line:
<!ELEMENT ingredients (item+)>
The plus sign here means that at least one pair of <item> tags must appear inside
the <ingredients> tag.
The last line we are interested in is:
<!ELEMENT item (#PCDATA, sub_item*)>
I use sub_item* as a safety measure. Besides requiring text for each item, I want to
allow for an item that is made up of several parts. The asterisk means that any number of
sub-items, including none, may appear inside the <item> tag. The Chocolate Chip Bars recipe
doesn't need any sub-items, but they come in handy when an ingredient is more complicated.
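The three occurrence indicators used in this DTD boil down to ranges of allowed counts. The following sketch spells that out in Python; the function name count_allowed is made up for the illustration, and the empty string stands for an element name with no indicator after it.

```python
def count_allowed(indicator, count):
    """Return True if `count` occurrences satisfy the DTD occurrence indicator."""
    if indicator == "":   # no indicator: exactly one
        return count == 1
    if indicator == "?":  # question mark: zero or one
        return count in (0, 1)
    if indicator == "+":  # plus sign: one or more
        return count >= 1
    if indicator == "*":  # asterisk: zero or more
        return count >= 0
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")


if __name__ == "__main__":
    print(count_allowed("?", 1))  # one <course> inside <meal>
    print(count_allowed("+", 0))  # no <item> inside <ingredients>
    print(count_allowed("*", 3))  # three <sub_item> tags inside <item>
```

So a <meal> with two <course> tags, or an <ingredients> tag with no <item> at all, falls outside what the DTD allows, while an <item> with no sub-items is fine.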
Now let's put these together and see what we can get.
A complete DTD example
Here is a complete example. I added another recipe to the file and
commented the DTD. Notice that I used sub-items in the second recipe.
<?xml version="1.0"?>
<!--This starts the DTD. The first four lines address document structure-->
<!DOCTYPE list [
<!ELEMENT recipe (recipe_name, author, meal, ingredients, directions)>
<!ELEMENT ingredients (item+)>
<!ELEMENT meal (#PCDATA, course?)>
<!ELEMENT item (#PCDATA, sub_item*)>
<!--These are the remaining elements of the recipe tag -->
<!ELEMENT recipe_name (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
<!--The remaining element of the meal tag -->
<!ELEMENT course (#PCDATA)>
<!--The remaining element of the item tag -->
<!ELEMENT sub_item (#PCDATA)>
]>
<list>
<recipe>
<author>Carol Schmidt</author>
<recipe_name>Chocolate Chip Bars</recipe_name>
<meal>Dinner
<course>Dessert</course>
</meal>
<ingredients>
<item>2/3 C butter</item>
<item>2 C brown sugar</item>
<item>1 tsp vanilla</item>
<item>1 3/4 C unsifted all-purpose flour</item>
<item>1 1/2 tsp baking powder</item>
<item>1/2 tsp salt</item>
<item>3 eggs</item>
<item>1/2 C chopped nuts</item>
<item>2 cups (12-oz pkg.) semi-sweet choc. chips</item>
</ingredients>
<directions>
Preheat oven to 350 degrees. Melt butter;
combine with brown sugar and vanilla in large mixing bowl.
Set aside to cool. Combine flour, baking powder, and salt;
set aside. Add eggs to cooled sugar mixture; beat well.
Stir in reserved dry ingredients, nuts, and chips.
Spread in greased 13-by-9-inch pan.
Bake for 25 to 30 minutes until golden brown; cool.
Cut into squares.
</directions>
</recipe>
<recipe>
<recipe_name>Pasta with Tomato Sauce</recipe_name>
<meal>Dinner
<course>Entree</course>
</meal>
<ingredients>
<item>1 lb spaghetti</item>
<item>1 16-oz can diced tomatoes</item>
<item>4 cloves garlic</item>
<item>1 diced onion</item>
<item>Italian seasoning
<sub_item>oregano</sub_item>
<sub_item>basil</sub_item>
<sub_item>crushed red pepper</sub_item>
</item>
</ingredients>
<directions>
Boil pasta. Sauté garlic and onion.
Add hot.
</directions>
</recipe>
</list>
Because there is now a DTD, the document will be checked against the restrictions the DTD
imposes. In other words, we have to ensure that the document is valid.
To do that we need another tool: a validating parser. Microsoft's
MSXML, a Java-based program, works well and is easy to use. When the document
above is checked by this program, no errors are found. But if I check a
recipe that has no items inside its ingredients tag, the following message is returned:
ingredients is not complete. Expected elements [item].
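To get a feel for what such a check involves, here is a hand-rolled stand-in for one rule of the DTD. It is not how MSXML works, and Python's standard library has no validating parser, so everything here is an assumption made for illustration: the function name find_problems, the CONTENT_MODELS table, and the wording of the message are all made up. The idea is to write an element-only content model as a regular expression over the names of an element's children.

```python
import re
import xml.etree.ElementTree as ET

# Content models as (DTD text, regular expression over child tag names).
# In the regular expression every child name is followed by one space.
CONTENT_MODELS = {
    "ingredients": ("(item+)", r"(item )+"),
}


def find_problems(element):
    """Return a message for every element whose children break its model."""
    problems = []
    children = "".join(child.tag + " " for child in element)
    if element.tag in CONTENT_MODELS:
        dtd_text, pattern = CONTENT_MODELS[element.tag]
        if re.fullmatch(pattern, children) is None:
            problems.append(
                f"{element.tag}: children [{children.strip()}] do not match {dtd_text}"
            )
    for child in element:
        problems.extend(find_problems(child))
    return problems


if __name__ == "__main__":
    recipe = ET.fromstring(
        "<recipe><recipe_name>Chocolate Chip Bars</recipe_name>"
        "<ingredients> </ingredients></recipe>"
    )
    for message in find_problems(recipe):
        print(message)
```

Only the <ingredients> rule is encoded here. Rules that mix text with elements, such as the one for <meal>, would need more than a pattern over child names.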