SoFunction
Updated on 2025-04-14

WAP Site Building: A Basic Tutorial on WML Syntax (Page 4 of 6)


2.3 Basic knowledge of WML language
The previous section introduced the basic structure of WML programs. This section covers the basics of the WML language itself: its character set, variables, data types, and the basic components of a WML program.
2.3.1 WML character set and encoding
WML uses the XML character set, that is, the Universal Character Set defined by ISO/IEC 10646, which corresponds to the Unicode 2.0 character encoding standard. WML also supports other character encodings, including ones that cover only a subset of this character set, such as UTF-8, ISO-8859-1, and UCS-2. Of these:
UTF-8 is UCS Transformation Format 8, a transformation format of the Universal Character Set (UCS) used mainly for transmitting international character data. UTF-8 encodes each UCS character as a sequence of one or more 8-bit bytes. It is a "safe" encoding only in the sense that its bytes pass unharmed through software and transports built for ASCII; it is not encryption and offers no protection against eavesdropping or interception. UTF-8 is fully compatible with 7-bit ASCII, so programs that work with ASCII data are unaffected. Its encoding rules are strict: the first byte of a character can always be told apart from the bytes that follow it, so a receiver can find the next character boundary again after a transmission error. It also has enough code space to represent the characters of other character sets.
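The UTF-8 properties described above can be checked with a short Python sketch. The helper names `utf8_length` and `char_starts` and the sample characters are illustrative assumptions, not part of WML or any WAP toolkit.

```python
# Sketch: UTF-8 byte lengths, ASCII compatibility, and resynchronization.

def utf8_length(ch):
    """Return the number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))

def char_starts(data):
    """Return the byte offsets at which a character begins.

    Continuation bytes always have the bit pattern 10xxxxxx, so any
    other byte starts a new character. A decoder that lost its place
    can skip forward to the next such byte.
    """
    return [i for i, b in enumerate(data) if b & 0xC0 != 0x80]

samples = ["A", "é", "中"]  # U+0041, U+00E9, U+4E2D
utf8_lengths = [utf8_length(ch) for ch in samples]

# Pure ASCII text is byte-for-byte identical in UTF-8.
ascii_same = "<wml/>".encode("utf-8") == "<wml/>".encode("ascii")

# "A" is 1 byte, "中" is 3 bytes, "é" is 2 bytes.
starts = char_starts("A中é".encode("utf-8"))

print(utf8_lengths, ascii_same, starts)
```

The variable width is the trade-off: ASCII markup stays compact, while Chinese text takes three bytes per character.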
ISO-8859-1 is an extension of the ASCII character set defined by the International Organization for Standardization (ISO). It can represent the characters of most Western European languages and is also known as ISO Latin-1. It is very similar to the "ANSI" (American National Standards Institute) character set commonly used in Windows environments, and in most cases the two need not be distinguished. HTTP assumes the ISO Latin-1 character set when none is specified. Therefore, to represent non-ASCII characters in a WML page delivered this way, developers need to use the corresponding ISO Latin-1 encoded characters.
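As a rough illustration of the paragraph above, the following Python sketch encodes a non-ASCII character in ISO-8859-1 and shows one place where the Windows character set differs. It assumes that Python's `cp1252` codec stands in for the Windows "ANSI" set; the sample strings are made up for the example.

```python
# Sketch: ISO-8859-1 uses exactly one byte per character.
text = "café"
latin1_bytes = text.encode("iso-8859-1")   # é becomes the single byte 0xE9

# The first 128 codes are plain ASCII.
ascii_part_same = "cafe".encode("iso-8859-1") == "cafe".encode("ascii")

def encodable(ch, encoding):
    """Return True if the character exists in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# The euro sign is one of the few characters where the two sets differ:
# Windows-1252 has it, ISO-8859-1 does not.
euro_in_latin1 = encodable("€", "iso-8859-1")
euro_in_cp1252 = encodable("€", "cp1252")

print(latin1_bytes, ascii_part_same, euro_in_latin1, euro_in_cp1252)
```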
UCS-2 is the 2-byte (16-bit) encoding form of the Universal Multiple-Octet Coded Character Set defined in ISO/IEC 10646. Each character's code value equals the standard code value of the corresponding Unicode character.
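The claim that UCS-2 code values equal Unicode code values can be sketched in Python. Python has no codec named UCS-2, so this example assumes `utf-16-be` as a stand-in, which produces the same bytes for every character that fits in 16 bits. The function name `ucs2_code_values` is hypothetical.

```python
# Sketch: 16-bit UCS-2 code values match Unicode code points.

def ucs2_code_values(text):
    """Return the 16-bit code value of each character.

    Assumption: utf-16-be equals UCS-2 for characters that fit in
    16 bits. Anything larger is rejected, because UCS-2 cannot
    represent it.
    """
    data = text.encode("utf-16-be")
    if len(data) != 2 * len(text):
        raise ValueError("text contains characters outside the 16-bit range")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

sample = "W中"
code_values = ucs2_code_values(sample)
code_points = [ord(ch) for ch in sample]

print([hex(v) for v in code_values])
```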
WML documents can be encoded with any character encoding defined by the HTML 2.0 specification. In general, a WML document's encoding has to be converted to the encoding used by the WAP user's mobile browser; otherwise the browser cannot display the characters in the page. Conversion can lose character information, however, and if it is performed on the user side, the lost information cannot be recovered and the user will never see it. Where conversion is necessary, it should therefore be completed before the WML page is delivered to the user's browser.
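The information loss described above is easy to reproduce. This Python sketch converts page text to an encoding that cannot represent all of its characters; the sample text and the helper `can_convert` are assumptions made for illustration, standing in for a check a server could run before delivery.

```python
# Sketch: converting text to a narrower encoding loses characters.
page_text = "价格: 10€"

# Characters missing from ISO-8859-1 are replaced by "?", and the
# original characters cannot be recovered from the result.
lossy = page_text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

def can_convert(text, encoding):
    """Return True if text survives conversion to the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(lossy)
print(can_convert("价格", "gb2312"), can_convert("价格", "iso-8859-1"))
```

Running a check like this before the page leaves the server lets the author choose a different target encoding instead of shipping unreadable text.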
To address this problem, the web server first needs to be configured with the WML data types (MIME types) so that it can deliver this data accurately. Second, rules for encoding and conversion need to be established.
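Assuming "WML data type" here means the MIME types registered for WAP content, a server written in Python could declare them roughly as follows. This is a sketch, not a complete server: the helper `content_type_for` and the default charset are assumptions, and real deployments usually set these types in the web server's own configuration.

```python
import mimetypes

# MIME types commonly registered for WAP content.
WML_MIME_TYPES = {
    ".wml": "text/vnd.wap.wml",
    ".wmls": "text/vnd.wap.wmlscript",
    ".wbmp": "image/vnd.wap.wbmp",
}

for extension, mime_type in WML_MIME_TYPES.items():
    mimetypes.add_type(mime_type, extension)

def content_type_for(path, charset="utf-8"):
    """Build a Content-Type header value for a file (hypothetical helper).

    Text types carry an explicit charset so the client does not have
    to guess the encoding.
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        return "application/octet-stream"
    if mime_type.startswith("text/"):
        return f"{mime_type}; charset={charset}"
    return mime_type

print(content_type_for("index.wml"))
print(content_type_for("logo.wbmp"))
```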
2.3.2 Basic rules for using WML characters
WML is a relatively strict language, and the corresponding rules must be followed in character use. These basic rules mainly include the following aspects:
1) Case sensitivity. In WML, tag names and attribute content are both case sensitive. This strictness is inherited from XML, and any error in case can cause the page to fail.
Generally, all WML tags, attributes, and enumerated values, along with their accepted values, must be lowercase. Card names and variables may use uppercase or lowercase letters but are case sensitive. Parameter names and parameter values are case sensitive as well; for example, variable1, Variable1, and VARIABLE1 are three different parameters.
2) Spaces. A run of consecutive whitespace characters is treated as a single space when the program runs. There must be no spaces between an attribute name, the equals sign (=), and the value.
3) Tags. Attribute values in a tag must be enclosed in double quotes (") or single quotes ('). A tag that does not come in a pair must have a slash (/) before its closing angle bracket (>). For example, the line-break tag is only correct when written as <br/>.
4) Non-displayed characters. In WML, the non-displayed characters are mainly the line feed, carriage return, space, and horizontal tab, whose decimal character codes are 10, 13, 32, and 9 respectively (hexadecimal 0A, 0D, 20, and 09).
When the program runs, WML ignores all but one of a run of non-displayed characters; that is, it converts one or more consecutive line feeds, carriage returns, horizontal tabs, and spaces into a single space.
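5) Reserved characters. WML treats some characters as special: the less-than sign (<), the greater-than sign (>), the single quote ('), the double quote ("), and the ampersand (&). To show one of them as ordinary text, write it as the corresponding entity: &lt;, &gt;, &apos;, &quot;, or &amp;.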
6) Displaying Chinese characters. For a WML program to display Chinese characters, it only needs to declare a Chinese character set with the encoding attribute at the start of the program. For example: <?xml version="1.0" encoding="gb2312"?>.
Note: The form and method of specifying a Chinese character set may vary depending on the development tool or WAP phone.
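Several of the rules above can be tried out with Python's standard XML tools. The sketch below assumes that a generic XML parser is a fair stand-in for a WML browser on well-formedness only (it does not know the WML DTD). The helper names `is_well_formed`, `collapse_whitespace`, `escape_reserved`, and `encode_deck` are illustrative.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

def collapse_whitespace(text):
    """Turn each run of line feeds, carriage returns, tabs, and spaces
    (codes 10, 13, 9, 32) into one space."""
    return re.sub(r"[\n\r\t ]+", " ", text)

def escape_reserved(text):
    """Replace the five reserved characters with their entities."""
    return escape(text, {'"': "&quot;", "'": "&apos;"})

def encode_deck(body, encoding="gb2312"):
    """Prefix an XML declaration naming the encoding, then encode
    the whole document in that same encoding."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'
    return (declaration + body).encode(encoding)

# Rule 1: tag names are case sensitive, so <Card> is not closed by </card>.
print(is_well_formed('<Card id="a"></card>'))
# Rule 3: attribute values need quotes, and unpaired tags need a slash.
print(is_well_formed("<card id=a></card>"))
print(is_well_formed("<p>one<br>two</p>"), is_well_formed("<p>one<br/>two</p>"))
# Rules 2 and 4: whitespace runs collapse to one space.
print(collapse_whitespace("Hello \r\n\t  WAP"))
# Rule 5: reserved characters become entities.
print(escape_reserved('Tom & "Jerry" <1>'))
# Rule 6: the declared encoding matches the actual bytes.
deck = encode_deck('<wml><card id="c1"><p>中文</p></card></wml>')
print(deck.decode("gb2312").splitlines()[0])
```

Keeping the declared encoding and the actual byte encoding in one function avoids the common mistake of declaring gb2312 while saving the file in a different encoding.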