Before we talk about how .NET can randomly generate Chinese characters, let me explain how Chinese character encoding is structured and how it works.
1. Chinese character encoding principle
How can Chinese characters be generated at random? Where do the characters come from? Is there a back-end data table that stores every character we need, from which the program simply picks a few at random? Storing all the characters in a database and drawing from it when needed is one way to do it, but there are so many Chinese characters that building such a table is a chore. In fact, none of this requires a database.
To see how to generate Chinese characters, you first have to understand how they are encoded.
In 1980, to give every Chinese character a unified national code, China promulgated its first national standard for Chinese character encoding: GB2312-80, "Code of Chinese Graphic Character Set for Information Interchange, Primary Set", GB2312 for short. This character set is the foundation on which Chinese information processing technology in China developed, and it is the common standard for all Chinese character systems in China. Later the national standard GB18030-2000, "Chinese Ideograms Coded Character Set for Information Interchange: Extension for the Basic Set", GB18030 for short, was published. Anyone who has dealt with encoding and localization while programming should be familiar with GB18030. It is the most important Chinese character encoding standard in China after GB2312-1980 and GB13000-1993, and one of the fundamental standards that computer systems in China must follow from now on.
On Simplified Chinese Windows, the default code page that .NET programs see is the Simplified Chinese code page (936, a superset of GB2312). For generating Chinese character verification codes, however, the GB2312 character set is more than enough. Besides the characters we all know, the larger character set contains many characters that we do not recognize and rarely ever see. If a generated verification code is full of characters nobody recognizes, having to type it in is no fun for anyone who uses a pinyin input method. Wubi users can just about manage, since they type by the shape of the character. Heh! For the same reason, we do not even need every character in the GB2312 character set.
Chinese characters can be represented by location codes. A location code table can be written in two equivalent ways: one uses hexadecimal area and position values, the other uses decimal area and position numbers. For example, the hexadecimal location code of the character 好 ("good") is BA C3: the first two digits give the area and the last two give the position. BA is area 26, and 好 sits at position 35 of that area, which is C3, so its numeric location code is 2635 (each byte is the area or position number plus hexadecimal A0). This is the location principle of GB2312 Chinese characters. In the "Chinese Character Location Code Table" we can see that area 15 (AF) and the areas before it contain no Chinese characters, only a small number of symbols, and that the Chinese characters start at area 16 (B0). This is why the Chinese characters in GB2312 start from area 16.
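The conversion between the two notations is plain arithmetic, so it can be sketched in a few lines. The class and method names below (LocationCode, FromBytes, ToBytes) are made up for illustration and are not part of .NET.

```csharp
using System;

// Hypothetical helper, named here for illustration only.
static class LocationCode
{
    // Each GB2312 byte is the area (or position) number plus 0xA0.
    public static int FromBytes(byte areaByte, byte positionByte)
    {
        int area = areaByte - 0xA0;          // 0xBA -> 26
        int position = positionByte - 0xA0;  // 0xC3 -> 35
        return area * 100 + position;        // 26 and 35 -> 2635
    }

    // Inverse direction: decimal location code back to the two bytes.
    public static byte[] ToBytes(int locationCode)
    {
        int area = locationCode / 100;
        int position = locationCode % 100;
        return new[] { (byte)(area + 0xA0), (byte)(position + 0xA0) };
    }
}
```

For example, LocationCode.FromBytes(0xBA, 0xC3) gives 2635, and LocationCode.ToBytes(1601) gives the bytes B0 A1, the first cell of area 16.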
2. How Chinese character encoding is processed in .NET
In .NET, System.Text can be used to handle the encodings of all languages. The System.Text namespace contains many classes for working with encodings and converting between them. Among them, the Encoding class is the one that matters most for Chinese character encoding. Looking up the methods of the Encoding class in the .NET documentation, we find that everything related to text encoding works on byte arrays, and two methods are especially useful:
Encoding.GetBytes() encodes all or part of a specified String or character array into a byte array.
Encoding.GetString() decodes a specified byte array into a string.
That's right: with these two methods we can encode Chinese characters into byte arrays, and if we know the GB2312 byte array of a Chinese character we can decode that array back into the character. After encoding the character 好 into a byte array with
Encoding gb = Encoding.GetEncoding("gb2312");
byte[] bytes = gb.GetBytes("好");
I found that the result was a byte array of length 2. Then, using
string highCode = Convert.ToString(bytes[0], 16); // element 1 as two hexadecimal digits
string lowCode = Convert.ToString(bytes[1], 16); // element 2 as two hexadecimal digits
I found that the hexadecimal content of the byte array was {ba, c3}, which is exactly the hexadecimal location code of 好 (see the location code table).
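The reverse direction works the same way. Here is a minimal sketch that decodes the two bytes back into the character, assuming the GB2312 encoding is available to the runtime. The class name Gb2312RoundTrip is hypothetical. The RegisterProvider line is only relevant on .NET Core and .NET 5+, where code-page encodings are opt-in; on .NET Framework it should be dropped unless the System.Text.Encoding.CodePages package is referenced.

```csharp
using System;
using System.Text;

// Hypothetical helper, named here for illustration only.
static class Gb2312RoundTrip
{
    public static string Decode(byte high, byte low)
    {
        // Assumption: running on .NET Core / .NET 5+, where the
        // code-page encodings have to be registered before use.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding gb = Encoding.GetEncoding("gb2312");
        // Two GB2312 bytes decode to a single Chinese character.
        return gb.GetString(new[] { high, low });
    }
}
```

Calling Gb2312RoundTrip.Decode(0xBA, 0xC3) returns 好.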
Therefore, we can randomly generate a byte array of length 2 and decode it with the GetString() method to obtain a Chinese character. For generating Chinese character verification codes, however, some ranges have to be avoided. Area 15 (AF) and the areas before it contain no Chinese characters, only a small number of symbols, and the Chinese characters start at area 16 (B0). The areas from D7 onward are dominated by complex, rarely seen characters (area D7 itself is only partly filled), so they must be excluded as well. The first hexadecimal digit of a randomly generated location code is therefore B, C, or D, and if the first digit is D, the second digit cannot be 7 or higher. The third digit always lies between A and F. Looking at the location code table again, the first and last cells of every area are empty and hold no character, so if the third digit is A the fourth digit cannot be 0, and if the third digit is F the fourth digit cannot be F.
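Written as byte ranges instead of single hexadecimal digits, these rules come down to a first byte between B0 and D6 and a second byte between A1 and FE. The sketch below expresses that directly. It is an alternative formulation for illustration, not the digit-by-digit method used in this article, and the names Gb2312Range, IsAllowed, and NextPair are hypothetical.

```csharp
using System;

// Hypothetical helper, named here for illustration only.
static class Gb2312Range
{
    // First byte B0..D6 (areas 16 to 54), second byte A1..FE
    // (the empty cells A0 and FF are skipped).
    public static bool IsAllowed(byte first, byte second)
    {
        return first >= 0xB0 && first <= 0xD6
            && second >= 0xA1 && second <= 0xFE;
    }

    // Draws one byte pair inside the allowed range.
    public static byte[] NextPair(Random rnd)
    {
        // The upper bound of Random.Next is exclusive.
        byte first = (byte)rnd.Next(0xB0, 0xD7);
        byte second = (byte)rnd.Next(0xA1, 0xFF);
        return new[] { first, second };
    }
}
```

A single Random instance can be passed in and reused for every pair, which avoids creating a new generator for each digit.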
Okay, once the principle is clear, the program to randomly generate Chinese characters follows naturally. The following is C# code for generating 4 random Chinese characters:
// Requires: using System; using System.Text;
/// <summary>
/// Randomly generates Chinese characters.
/// </summary>
/// <param name="strlength">Number of characters to generate (for example, 4)</param>
/// <returns>A string of random GB2312 Chinese characters</returns>
public string CreateCode(int strlength)
{
    // Hexadecimal digits used to assemble each location code
    string[] r = new String[16] { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f" };
    Random rnd = new Random();
    // Object array that holds one two-byte array per Chinese character
    object[] bytes = new object[strlength];
    /*
       Each pass of the loop generates a two-element byte array and puts it into the object array.
       Each Chinese character consists of four hexadecimal location-code digits:
       digits 1 and 2 form the first element of the byte array,
       digits 3 and 4 form the second element of the byte array.
    */
    for (int i = 0; i < strlength; i++)
    {
        // First digit of the location code: b, c, or d
        int r1 = rnd.Next(11, 14);
        string str_r1 = r[r1].Trim();
        // Second digit of the location code
        rnd = new Random(unchecked(r1 * (int)DateTime.Now.Ticks + i)); // reseed the generator to avoid repeated values
        int r2;
        if (r1 == 13)
            r2 = rnd.Next(0, 7);   // after d, only 0..6 is allowed
        else
            r2 = rnd.Next(0, 16);
        string str_r2 = r[r2].Trim();
        // Third digit of the location code: a..f
        rnd = new Random(unchecked(r2 * (int)DateTime.Now.Ticks + i));
        int r3 = rnd.Next(10, 16);
        string str_r3 = r[r3].Trim();
        // Fourth digit of the location code
        rnd = new Random(unchecked(r3 * (int)DateTime.Now.Ticks + i));
        int r4;
        if (r3 == 10)
        {
            r4 = rnd.Next(1, 16);  // a0 is an empty cell
        }
        else if (r3 == 15)
        {
            r4 = rnd.Next(0, 15);  // ff is an empty cell
        }
        else
        {
            r4 = rnd.Next(0, 16);
        }
        string str_r4 = r[r4].Trim();
        // Two byte variables that hold the generated location code
        byte byte1 = Convert.ToByte(str_r1 + str_r2, 16);
        byte byte2 = Convert.ToByte(str_r3 + str_r4, 16);
        // Store the two bytes in a byte array
        byte[] str_r = new byte[] { byte1, byte2 };
        // Put the byte array of this Chinese character into the object array
        bytes.SetValue(str_r, i);
    }
    // Get the GB2312 encoding (code page)
    Encoding gb = Encoding.GetEncoding("gb2312");
    // Decode each byte array into a Chinese character; looping over
    // strlength avoids assuming that exactly 4 characters were requested
    StringBuilder txt = new StringBuilder(strlength);
    for (int i = 0; i < strlength; i++)
    {
        txt.Append(gb.GetString((byte[])bytes[i]));
    }
    return txt.ToString();
}
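The code above implements random generation of Chinese characters. One point needs explaining: the code depends on the GB2312 encoding being available to .NET. That is always the case on the Chinese version of Windows. On an operating system in another language, the GB character set may need to be installed first, and on .NET Core and .NET 5+ the code-page encodings have to be registered through CodePagesEncodingProvider before Encoding.GetEncoding("gb2312") will succeed.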
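To make that dependency explicit, the encoding lookup can be wrapped in a guard, so that a missing GB character set is reported as a clear failure rather than an exception in the middle of the method. This is a sketch under the assumption of a .NET Core or .NET 5+ runtime, and the class name Gb2312Lookup is hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper, named here for illustration only.
static class Gb2312Lookup
{
    // Returns the GB2312 encoding, or null when the runtime cannot provide it.
    public static Encoding GetOrNull()
    {
        try
        {
            // Assumption: .NET Core / .NET 5+; code-page encodings are opt-in there.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            return Encoding.GetEncoding("gb2312");
        }
        catch (ArgumentException)
        {
            return null; // the name is not a supported encoding on this system
        }
        catch (NotSupportedException)
        {
            return null; // the platform does not support this code page
        }
    }
}
```

A caller can check the result for null and fall back to another kind of verification code when the encoding is unavailable.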