SoFunction
Updated on 2025-04-09

Fixing garbled text when requesting a GBK-encoded web page with RestSharp in .NET 6

public IActionResult GetHiPda()
        {
            cookies = @"__utmz=128828693.1622702936.1.=(direct)|utmccn=(direct)|utmcmd=(none); cdb_cookietime=2592000; cdb_auth=fd05ACWP2GIZl8k0oqBaZUtQ8WjXIxIXESeqpdSfAzikXEX4tYdJM%2B4FIBRY7jXLyGQs0yjP3K2kgFK6MFe6fcJkrIH5; smile=1D1; discuz_fastpostrefresh=0; __utmc=128828693; cdb_visitedfid=2D6; cdb_sid=0ZwKQ7; __utma=128828693.1700824799.1622702936.1623767772.1623808037.73; __utmt=1; __utmb=128828693.1.10.1623808037; checkpm=1";
            string url = @"/forum/?fid=2";
            var client = new RestClient(url);
            var request = new RestRequest();
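            // Requires: using System.Text; using RestSharp;
            request.AddHeader("cookie", cookies);
            // Send the request (assuming RestSharp's synchronous Execute method).
            var response = client.Execute(request);
            // GBK is a code-page encoding, so register the code-pages provider first.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            // Decode the raw response bytes as GBK rather than using the default decoding.
            var data = Encoding.GetEncoding("gbk").GetString(response.RawBytes);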
            return Content(data);
        }
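To see why the page comes back garbled and why decoding the raw bytes fixes it, the following minimal sketch leaves out the network call entirely. It assumes a hard-coded byte sequence (D6 D0 CE C4, the GBK encoding of "中文") in place of the response body; the class and method names are illustrative only and not part of the original code.

```csharp
using System;
using System.Text;

public static class GbkDecodeDemo
{
    // Decodes a byte array as GBK. On .NET Core and .NET 5+ the code-pages
    // provider has to be registered before GetEncoding("gbk") can succeed.
    public static string DecodeGbk(byte[] raw)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("gbk").GetString(raw);
    }

    // Decodes the same bytes as UTF-8, which is what produces garbled output
    // when the server actually sent GBK.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    public static void Main()
    {
        // Stand-in for response.RawBytes (assumption: GBK bytes for "中文").
        byte[] raw = { 0xD6, 0xD0, 0xCE, 0xC4 };

        Console.WriteLine(DecodeGbk(raw));     // correctly decoded text
        Console.WriteLine(DecodeAsUtf8(raw));  // U+FFFD replacement characters
    }
}
```

The same bytes give readable text or replacement characters depending only on which Encoding decodes them, which is why the fix works on the raw bytes rather than on an already decoded string.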

Reference the NuGet package:

Encoding in CLR is used for conversion between bytes and characters.

The Encoding class lives in the System.Text namespace. It is an abstract class, so it cannot be instantiated directly. Its main derived classes are ASCIIEncoding, UnicodeEncoding, UTF32Encoding, UTF7Encoding, and UTF8Encoding; choose whichever suits your encoding and decoding needs. You can also obtain an instance through the static properties Encoding.ASCII, Encoding.Unicode, Encoding.UTF32, Encoding.UTF7, and Encoding.UTF8, where Unicode means the 16-bit encoding (UTF-16). Reading a static property is equivalent to instantiating the corresponding subclass, as the following two lines show.

Encoding encodingFromProperty = Encoding.UTF8;

Encoding encodingFromConstructor = new UTF8Encoding(true);
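The equivalence can be checked with a short sketch: both instances report the same byte-order-mark preamble, and bytes produced by one decode cleanly with the other. The class and method names here are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8EquivalenceDemo
{
    // True when Encoding.UTF8 and new UTF8Encoding(true) expose the same
    // preamble (the UTF-8 byte order mark EF BB BF).
    public static bool SamePreamble()
    {
        Encoding fromProperty = Encoding.UTF8;
        Encoding fromConstructor = new UTF8Encoding(true);
        return fromProperty.GetPreamble().SequenceEqual(fromConstructor.GetPreamble());
    }

    // Encodes with the static property and decodes with a constructed instance.
    public static string RoundTrip(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return new UTF8Encoding(true).GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(SamePreamble());
        Console.WriteLine(RoundTrip("hello") == "hello");
    }
}
```

Note that the preamble is only reported by GetPreamble; GetBytes does not prepend it to the encoded output.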

Brief descriptions of these encodings follow:

  • ASCII: encodes 16-bit characters as ASCII. Only characters with a value below 0x0080 can be converted, each into a single byte, so one character corresponds to one byte. Use it when every character falls within the ASCII range (0x00 to 0x7F). It is very fast and suits English text, but it is very limited: Chinese characters cannot be represented and come out garbled. The corresponding CLR class is ASCIIEncoding.
  • UTF-16: each 16-bit character is encoded as 2 bytes. The characters are left unchanged and no compression is involved. Performance is very good because characters in the CLR are themselves 16-bit Unicode. The corresponding CLR class is UnicodeEncoding.
  • UTF-32: every character is encoded as 4 bytes. In terms of memory it is not an efficient scheme, because even the simplest character occupies 4 bytes, so it is rarely used for encoding and decoding files and network streams. The corresponding CLR class is UTF32Encoding.
  • UTF-8: characters with values below 0x0080 are encoded as one byte, identical to ASCII; characters from 0x0080 to 0x07FF take 2 bytes, which suits European and Middle Eastern text; characters at 0x0800 and above take 3 bytes, which suits East Asian text; surrogate pairs take 4 bytes. It is a very popular encoding on the internet, although for characters at 0x0800 and above it is less efficient than UTF-16. The corresponding CLR class is UTF8Encoding.
  • UTF-7: typically used with older systems that could only handle 7-bit values. It has been deprecated by the Unicode Consortium. The corresponding CLR class is UTF7Encoding.
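The byte sizes described in the list can be confirmed with Encoding.GetByteCount. The sketch below uses four sample characters chosen for illustration (they are assumptions, not from the original text): "A" (U+0041), "é" (U+00E9), "中" (U+4E2D), and an emoji (U+1F600) that is stored as a surrogate pair in a .NET string.

```csharp
using System;
using System.Text;

public static class ByteCountDemo
{
    // Returns how many bytes the given encoding needs for the text.
    public static int Count(Encoding encoding, string text)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        // Escapes are used so the samples do not depend on the source file's encoding.
        string[] samples = { "A", "\u00E9", "\u4E2D", "\U0001F600" };
        Encoding[] encodings = { Encoding.UTF8, Encoding.Unicode, Encoding.UTF32 };

        foreach (string sample in samples)
        {
            foreach (Encoding encoding in encodings)
            {
                Console.WriteLine($"{encoding.WebName}: {Count(encoding, sample)} byte(s)");
            }
        }

        // ASCII cannot represent a Chinese character; it is replaced by '?' (0x3F).
        byte[] lossy = Encoding.ASCII.GetBytes("\u4E2D");
        Console.WriteLine(lossy[0] == 0x3F);
    }
}
```

For these samples UTF-8 needs 1, 2, 3, and 4 bytes respectively, UTF-16 needs 2, 2, 2, and 4, and UTF-32 needs 4 in every case, which matches the trade-offs described in the list.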
