SoFunction
Updated on 2025-04-08

XML Simple Tutorial 4

Entities

There are five predefined XML entities, all of which HTML coders should already know. The characters &, <, >, " and ' are represented in XML documents as &amp;, &lt;, &gt;, &quot; and &apos; respectively.
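As a small illustration of the five predefined entities, the following Python sketch escapes and restores a string using the standard library's xml.sax.saxutils module. The sample sentence is invented for this example; escape() covers &, < and > on its own, so the two quote characters are supplied explicitly.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters are
# passed in explicitly so that all five predefined entities are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}

raw = 'Tom & Jerry say "1 < 2" and it\'s true > false'
escaped = escape(raw, QUOTES)
print(escaped)

# unescape() needs the reverse mapping for the two quote entities.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)
```

The round trip returns the original string, which is a quick way to check that no character was escaped twice.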

XML greatly extends what entities can do: you can define your own entities in the DTD and use them throughout the rest of the document. For example, if I need to use the phrase "Wired Digital" frequently in an XML document, I can declare it in the DTD like this:

<!ENTITY wd "Wired Digital">

Then, whenever I need the phrase, I can type &wd; instead. This avoids misspellings and retyping the same information. Entities work much like macros in a word processor.
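To see the substitution happen, here is a minimal Python sketch that parses a tiny document with the standard library's xml.etree.ElementTree. The root element name "note" and the sentence inside it are assumptions made for this example; only the wd entity comes from the text above.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares the general entity "wd".
# The element name "note" is a placeholder chosen for this sketch.
doc = """<!DOCTYPE note [
  <!ENTITY wd "Wired Digital">
]>
<note>Published by &wd;. Copyright &wd;.</note>"""

root = ET.fromstring(doc)

# The parser has already replaced every &wd; with the declared text.
print(root.text)
```

Changing the replacement text in the single ENTITY declaration changes every occurrence in the parsed result.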

The replacement text can be arbitrarily long, but if it is really long you may want to store it in a separate file. This is done with an external entity reference: put the keyword SYSTEM after the entity name, followed by the URL of the file:

<!ENTITY text SYSTEM "">

These features are powerful, but they have one limitation: general entities like these cannot be used to build declarations inside the DTD itself. For that you need a special kind of entity, the parameter entity. A parameter entity is declared by putting a "%" before the entity name, and once declared it is referenced by placing its name between a percent sign and a semicolon.

Why do this? Take a look at the following code:


<!ELEMENT vCard (%;, (%; | %; |
                 %; | %; | %; |
                 %; | %;)*)>


This code comes from a draft XML format for business cards. When defining the root element, the authors found it easier to split the information into separate parameter entities. Looking at one of those entities shows why:


<!ENTITY % "
  (nickname | photo | bday)">


If each of these long strings were written out in place of its entity reference, the element declaration would be very hard to read.
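The readability argument can be made concrete with a rough Python sketch that mimics parameter entity expansion by plain text substitution. This is not a DTD parser, and it ignores details such as the spaces a real XML processor adds around the replacement text. The entity names "personal" and "contact", their contents other than the (nickname | photo | bday) group, and the element name "card" are all hypothetical.

```python
import re

# Hypothetical parameter entities; the names are invented for this sketch.
dtd_entities = {
    "personal": "(nickname | photo | bday)",
    "contact": "(tel | email)",
}

def expand(declaration, entities):
    """Replace each %name; reference with its declared replacement text."""
    return re.sub(
        r"%([A-Za-z_][\w.-]*);",
        lambda match: entities[match.group(1)],
        declaration,
    )

# The compact form an author would write in the DTD.
compact = "<!ELEMENT card (name, (%personal; | %contact;)*)>"

# The form the declaration takes once the references are substituted.
expanded = expand(compact, dtd_entities)
print(expanded)
```

Even with only two short entities the expanded declaration is noticeably harder to scan than the compact one, and the gap grows with every additional group.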

Now you can read some DTDs.

Jay Greenspan

ISO-8859-1 Entities

Named Entity  Numeric Entity  Glyph  Description
              &#00;-&#08;            unused
              &#09;                  horizontal tab
              &#10;                  line feed
              &#11;-&#31;            unused
              &#32;                  space
              &#33;           !      exclamation mark
&quot;        &#34;           "      double quotation mark
              &#35;           #      number sign
              &#36;           $      dollar sign
              &#37;           %      percent sign
&amp;         &#38;           &      ampersand
              &#39;           '      apostrophe
              &#40;           (      left parenthesis
              &#41;           )      right parenthesis
              &#42;           *      asterisk
              &#43;           +      plus sign
              &#44;           ,      comma
              &#45;           -      hyphen
              &#46;           .      period (full stop)
              &#47;           /      solidus (slash)
              &#48;-&#57;     0-9    digits 0-9
              &#58;           :      colon
              &#59;           ;      semicolon
&lt;          &#60;           <      less-than sign
              &#61;           =      equals sign
&gt;          &#62;           >      greater-than sign
              &#63;           ?      question mark
              &#64;           @      commercial at
              &#65;-&#90;     A-Z    letters A-Z
              &#91;           [      left square bracket
              &#92;           \      reverse solidus (backslash)
              &#93;           ]      right square bracket
              &#94;           ^      caret
              &#95;           _      horizontal bar (underscore)
              &#96;           `      grave accent
              &#97;-&#122;    a-z    letters a-z
              &#123;          {      left curly brace
              &#124;          |      vertical bar
              &#125;          }      right curly brace
              &#126;          ~      tilde
              &#127;-&#159;          unused
              &#160;                 non-breaking space
              &#161;          ¡      inverted exclamation
              &#162;          ¢      cent sign
              &#163;          £      pound sterling
              &#164;          ¤      general currency sign
              &#165;          ¥      yen sign
              &#166;          ¦      broken vertical bar
              &#167;          §      section sign
              &#168;          ¨      umlaut (dieresis)
&copy;        &#169;          ©      copyright
              &#170;          ª      feminine ordinal
              &#171;          «      left angle quote, guillemot left
              &#172;          ¬      not sign
              &#173;                 soft hyphen
              &#174;          ®      registered trademark
              &#175;          ¯      macron accent
              &#176;          °      degree sign
              &#177;          ±      plus or minus
              &#178;          ²      superscript two
              &#179;          ³      superscript three
              &#180;          ´      acute accent
              &#181;          µ      micro sign
              &#182;          ¶      paragraph sign
              &#183;          ·      middle dot
              &#184;          ¸      cedilla
              &#185;          ¹      superscript one
              &#186;          º      masculine ordinal
              &#187;          »      right angle quote, guillemot right
              &#188;          ¼      one-fourth
              &#189;          ½      one-half
              &#190;          ¾      three-fourths
              &#191;          ¿      inverted question mark
&Agrave;      &#192;          À      uppercase A, grave accent
&Aacute;      &#193;          Á      uppercase A, acute accent
&Acirc;       &#194;          Â      uppercase A, circumflex accent
&Atilde;      &#195;          Ã      uppercase A, tilde
&Auml;        &#196;          Ä      uppercase A, dieresis or umlaut mark
&Aring;       &#197;          Å      uppercase A, ring
&AElig;       &#198;          Æ      uppercase AE diphthong (ligature)
&Ccedil;      &#199;          Ç      uppercase C, cedilla
&Egrave;      &#200;          È      uppercase E, grave accent
&Eacute;      &#201;          É      uppercase E, acute accent
&Ecirc;       &#202;          Ê      uppercase E, circumflex accent
&Euml;        &#203;          Ë      uppercase E, dieresis or umlaut mark
&Igrave;      &#204;          Ì      uppercase I, grave accent
&Iacute;      &#205;          Í      uppercase I, acute accent
&Icirc;       &#206;          Î      uppercase I, circumflex accent
&Iuml;        &#207;          Ï      uppercase I, dieresis or umlaut mark
&ETH;         &#208;          Ð      uppercase Eth, Icelandic
&Ntilde;      &#209;          Ñ      uppercase N, tilde
&Ograve;      &#210;          Ò      uppercase O, grave accent
&Oacute;      &#211;          Ó      uppercase O, acute accent
&Ocirc;       &#212;          Ô      uppercase O, circumflex accent
&Otilde;      &#213;          Õ      uppercase O, tilde
&Ouml;        &#214;          Ö      uppercase O, dieresis or umlaut mark
              &#215;          ×      multiply sign
&Oslash;      &#216;          Ø      uppercase O, slash
&Ugrave;      &#217;          Ù      uppercase U, grave accent
&Uacute;      &#218;          Ú      uppercase U, acute accent
&Ucirc;       &#219;          Û      uppercase U, circumflex accent
&Uuml;        &#220;          Ü      uppercase U, dieresis or umlaut mark
&Yacute;      &#221;          Ý      uppercase Y, acute accent
&THORN;       &#222;          Þ      uppercase THORN, Icelandic
&szlig;       &#223;          ß      lowercase sharp s, German (sz ligature)
&agrave;      &#224;          à      lowercase a, grave accent
&aacute;      &#225;          á      lowercase a, acute accent
&acirc;       &#226;          â      lowercase a, circumflex accent
&atilde;      &#227;          ã      lowercase a, tilde
&auml;        &#228;          ä      lowercase a, dieresis or umlaut mark
&aring;       &#229;          å      lowercase a, ring
&aelig;       &#230;          æ      lowercase ae diphthong (ligature)
&ccedil;      &#231;          ç      lowercase c, cedilla
&egrave;      &#232;          è      lowercase e, grave accent
&eacute;      &#233;          é      lowercase e, acute accent
&ecirc;       &#234;          ê      lowercase e, circumflex accent
&euml;        &#235;          ë      lowercase e, dieresis or umlaut mark
&igrave;      &#236;          ì      lowercase i, grave accent
&iacute;      &#237;          í      lowercase i, acute accent
&icirc;       &#238;          î      lowercase i, circumflex accent
&iuml;        &#239;          ï      lowercase i, dieresis or umlaut mark
&eth;         &#240;          ð      lowercase eth, Icelandic
&ntilde;      &#241;          ñ      lowercase n, tilde
&ograve;      &#242;          ò      lowercase o, grave accent
&oacute;      &#243;          ó      lowercase o, acute accent
&ocirc;       &#244;          ô      lowercase o, circumflex accent
&otilde;      &#245;          õ      lowercase o, tilde
&ouml;        &#246;          ö      lowercase o, dieresis or umlaut mark
              &#247;          ÷      division sign
&oslash;      &#248;          ø      lowercase o, slash
&ugrave;      &#249;          ù      lowercase u, grave accent
&uacute;      &#250;          ú      lowercase u, acute accent
&ucirc;       &#251;          û      lowercase u, circumflex accent
&uuml;        &#252;          ü      lowercase u, dieresis or umlaut mark
&yacute;      &#253;          ý      lowercase y, acute accent
&thorn;       &#254;          þ      lowercase thorn, Icelandic
&yuml;        &#255;          ÿ      lowercase y, dieresis or umlaut mark
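
The relationship between the named and numeric columns can be checked with Python's standard html module, which knows the HTML entity names. This is only a quick sketch: the helper numeric_reference is a name invented here, and the three sample entities are arbitrary picks from the table.

```python
import html
from html.entities import name2codepoint

def numeric_reference(char):
    """Build the decimal numeric entity for a single character."""
    return f"&#{ord(char)};"

# A named entity maps to a code point, which is the number used
# in the corresponding numeric entity.
copy_code = name2codepoint["copy"]
print(copy_code)

# The named and numeric forms decode to the same glyph.
decoded = html.unescape("&copy; &#169;")
print(decoded)

# Going the other way: from a glyph to its numeric entity.
eacute_ref = numeric_reference(html.unescape("&Eacute;"))
print(eacute_ref)
```

Numeric references work for any character, while named ones only work where the name has been declared, which is why XML documents that rely on names such as &copy; need a DTD that defines them.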