SoFunction
Updated on 2025-03-02

Fixing garbled characters when converting UTF-8 to GB2312

I recently ran into this problem on a small project and am writing it down as a summary.
The project has two parts: a news data collection tool and an audit (review) system for the collected information, which finally generates an XML file.

After the user edits the collected data, it is exported as an Access file and then imported into the information audit system. In the Access database the news content is stored in an ntext field, while the corresponding field in the audit system database was varchar(max). After the import, some characters that looked like whitespace showed up garbled, as question marks (?). Later testing showed that the character was not an ordinary space but a special character. After several tests, I found that the varchar(max) column has to be changed to nvarchar(max): varchar stores text in the database's code page, so a character outside that code page is replaced with "?", whereas nvarchar stores Unicode. With that change, the imported data no longer has the problem.
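To illustrate why a code-page column turns the character into a question mark, here is a minimal Python sketch. It is only an analogy, not the database's actual conversion code. The code page (`gbk`), the helper names, and the sample text with U+00A0 are assumptions chosen for illustration.

```python
# Analogy only: a code-page column keeps whatever survives encoding to that code page.
CODE_PAGE = "gbk"  # assumption: a Chinese code page similar to the database default


def store_as_code_page(text: str) -> str:
    """Simulate a lossy round trip through a single code page (varchar-like)."""
    raw = text.encode(CODE_PAGE, errors="replace")  # unrepresentable characters become b"?"
    return raw.decode(CODE_PAGE)


def store_as_unicode(text: str) -> str:
    """Simulate a Unicode column (nvarchar-like): UTF-16 can hold every character."""
    return text.encode("utf-16-le").decode("utf-16-le")


sample = "News\u00a0content"  # U+00A0 looks like a space but is a different character
print(store_as_code_page(sample))          # the special character comes back as "?"
print(store_as_unicode(sample) == sample)  # the Unicode round trip is lossless
```

The first call loses the special character, while the second returns the text unchanged, which mirrors the difference seen between the varchar(max) and nvarchar(max) columns.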

However, later testing turned up another problem: after an imported record was edited through the .NET program's edit function, that record was garbled again in the database. It turned out that the problem goes away if the string literal in the SQL statement is prefixed with N, for example: insert into tablename (news) values (N'" + updatedValue + "'). Why add N? The N prefix marks the literal as Unicode (nvarchar); without it, the literal is first converted to the database's code page, and the special character is lost.
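As a rough sketch of the statement text described above, the following Python helper builds the same kind of INSERT with an N-prefixed literal. The table name `news_table` and the function name are hypothetical, and the helper only assembles a string; it does not talk to a database. In real code a parameterized query is generally the safer choice, since it avoids both the encoding issue and SQL injection.

```python
def build_insert(updated_value: str) -> str:
    """Build an INSERT whose string literal carries the N (Unicode) prefix."""
    # Double any single quotes so the value cannot terminate the SQL literal early.
    escaped = updated_value.replace("'", "''")
    # news_table is a hypothetical table name used only for this illustration.
    return "insert into news_table (news) values (N'" + escaped + "')"


print(build_insert("today's news"))
```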

At this point I finally felt some relief, but the next problem was just as frustrating.
The reviewed information has to be written out as XML files, and the XML must be encoded in GB2312. Because many of the news sites being collected use UTF-8, garbled characters appeared during the conversion (again the special "blank" character). The advice online is that converting UTF-8 to GB2312 is enough, but in practice that did not solve the problem. I spent the whole morning on it without finding a fix. Then it occurred to me to use the Visual Studio debugger to see what this special character actually is. I read the field's value from the database, converted it into a character array, and went through the characters one by one. The character causing the garbling was ' ' (note the blank between the quotes). It is not a space but a special character that GB2312 cannot represent. Could I simply replace this character with an ordinary space? I tried it immediately, and sure enough the garbling disappeared. It is frustrating that this one character cost me half a day.
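The character-by-character inspection can also be automated. The Python sketch below is not the debugger workflow used in the project; it lists every character in a string that the target encoding cannot represent. The function name and the sample text (which assumes the offending character is a no-break space, U+00A0) are illustrative assumptions.

```python
import unicodedata


def find_unencodable(text: str, encoding: str = "gb2312"):
    """Return (index, code point, Unicode name) for each character the encoding rejects."""
    problems = []
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            name = unicodedata.name(ch, "UNKNOWN")
            problems.append((index, f"U+{ord(ch):04X}", name))
    return problems


# Assumed sample: U+00A0 looks like a space but is not part of GB2312.
print(find_unencodable("News\u00a0content"))  # [(4, 'U+00A0', 'NO-BREAK SPACE')]
```

Printing the code point and name makes it clear which character to replace, instead of relying on how the character looks on screen.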

Note that you must use the value taken from the debugger (because that is the special character that actually causes the garbling) and paste it into the code exactly as copied. The code is as follows:

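content = content.Replace(" ", " "); // first argument: the special character pasted from the debugger; second argument: an ordinary space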
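For comparison, here is the same replace-then-encode idea as a small Python sketch. It assumes the special character is a no-break space (U+00A0), which the article does not confirm; substitute whatever character the debugger actually shows. The names `SPECIAL` and `to_gb2312` are hypothetical.

```python
SPECIAL = "\u00a0"  # assumption: the offending character is a no-break space


def to_gb2312(content: str) -> bytes:
    """Replace the special character with an ordinary space, then encode as GB2312."""
    cleaned = content.replace(SPECIAL, " ")
    # Strict encoding: raises UnicodeEncodeError if another unrepresentable character remains.
    return cleaned.encode("gb2312")


print(to_gb2312("News\u00a0content"))  # b'News content'
```

Writing the character as an escape sequence rather than pasting it keeps the source code readable, since the pasted character is indistinguishable from a normal space in most editors.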