SoFunction
Updated on 2025-04-09

Converting a C# string to Unicode values and escape sequences

1. Get the Unicode value of each character in the string

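Use the implicit conversion from char to int, or the Convert.ToInt32 method, to obtain the Unicode value of a character. A C# char is a UTF-16 code unit, so for characters in the Basic Multilingual Plane this value equals the Unicode code point. Characters outside that range are covered in section 4.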

Sample code:

using System;

class Program
{
    static void Main()
    {
        string input = "Hello 你好";
        foreach (char c in input)
        {
            int unicodeValue = c; // Implicit conversion from char to int yields the Unicode value
            Console.WriteLine($"Character: {c}, Unicode value: {unicodeValue}");
        }
    }
}

Output:

Character: H, Unicode value: 72
Character: e, Unicode value: 101
Character: l, Unicode value: 108
Character: l, Unicode value: 108
Character: o, Unicode value: 111
Character:  , Unicode value: 32
Character: 你, Unicode value: 20320
Character: 好, Unicode value: 22909
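
The sample above only uses the implicit conversion. As a minimal sketch, the following compares it with an explicit cast and with Convert.ToInt32, which all return the same value for a given char. The helper name CodeUnitValue is hypothetical and exists only for this illustration.

```csharp
using System;

class Program
{
    // Hypothetical helper: returns the UTF-16 code unit value of c via Convert.ToInt32.
    public static int CodeUnitValue(char c)
    {
        return Convert.ToInt32(c);
    }

    static void Main()
    {
        char c = '你';
        int viaImplicit = c;               // implicit char-to-int conversion
        int viaCast = (int)c;              // explicit cast, same result
        int viaConvert = CodeUnitValue(c); // Convert.ToInt32(char), same result

        Console.WriteLine($"{viaImplicit} {viaCast} {viaConvert}"); // 20320 20320 20320
    }
}
```

The implicit conversion is the shortest form; Convert.ToInt32 can read more clearly when the intent would otherwise be easy to miss.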

2. Format Unicode values as \u escape sequences

If you need to format Unicode values as escape sequences that begin with \u (for example, \u0041 represents the character A), use ToString("X4") to convert the value to a 4-digit uppercase hexadecimal string. The X4 format specifier works the same way inside an interpolated string.

Sample code:

using System;

class Program
{
    static void Main()
    {
        string input = "Hello 你好";
        foreach (char c in input)
        {
            int unicodeValue = c;
            string unicodeEscape = $"\\u{unicodeValue:X4}"; // Format as \uHHHH
            Console.WriteLine($"Character: {c}, Unicode escape: {unicodeEscape}");
        }
    }
}

Output:

Character: H, Unicode escape: \u0048
Character: e, Unicode escape: \u0065
Character: l, Unicode escape: \u006C
Character: l, Unicode escape: \u006C
Character: o, Unicode escape: \u006F
Character:  , Unicode escape: \u0020
Character: 你, Unicode escape: \u4F60
Character: 好, Unicode escape: \u597D
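
The text mentions ToString("X4"), while the sample uses the equivalent format specifier inside an interpolated string. A small sketch of the ToString form follows; the helper name ToEscape is hypothetical.

```csharp
using System;

class Program
{
    // Hypothetical helper: formats one UTF-16 code unit as \uHHHH.
    public static string ToEscape(char c)
    {
        int unicodeValue = c;
        // "X4" pads with leading zeros to at least 4 uppercase hex digits.
        return "\\u" + unicodeValue.ToString("X4");
    }

    static void Main()
    {
        Console.WriteLine(ToEscape('A'));  // \u0041
        Console.WriteLine(ToEscape('好')); // \u597D
    }
}
```

Using "x4" instead of "X4" produces lowercase hexadecimal digits.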

3. Convert the entire string to Unicode escape sequences

To convert an entire string to Unicode escape format, iterate over its characters and append each escape sequence to a StringBuilder.

Sample code:

using System;
using System.Text;

class Program
{
    static void Main()
    {
        string input = "Hello 你好";
        StringBuilder unicodeBuilder = new StringBuilder();

        foreach (char c in input)
        {
            int unicodeValue = c;
            unicodeBuilder.Append($"\\u{unicodeValue:X4}"); // Append \uHHHH for each char
        }

        string unicodeString = unicodeBuilder.ToString();
        Console.WriteLine(unicodeString); // Output: \u0048\u0065\u006C\u006C\u006F\u0020\u4F60\u597D
    }
}
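
One consequence of the loop above is worth seeing: because it works on individual char values, a character outside the Basic Multilingual Plane comes out as two \u escapes, one per surrogate. The sketch below wraps the same loop in a hypothetical method named EscapeCodeUnits to show this.

```csharp
using System;
using System.Text;

class Program
{
    // Hypothetical helper: escapes every UTF-16 code unit as \uHHHH.
    public static string EscapeCodeUnits(string input)
    {
        // Each code unit becomes exactly 6 characters, so size the buffer up front.
        var builder = new StringBuilder(input.Length * 6);
        foreach (char c in input)
        {
            builder.Append($"\\u{(int)c:X4}");
        }
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(EscapeCodeUnits("Hi"));         // \u0048\u0069
        Console.WriteLine(EscapeCodeUnits("\U0001F60A")); // \uD83D\uDE0A (the emoji 😊)
    }
}
```

The two-escape form is still valid in formats that expect UTF-16 escapes. Section 4 shows how to emit a single code point instead.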

4. Handle surrogate pairs

Some Unicode characters (such as emojis and other characters outside the Basic Multilingual Plane) are represented by two char values, called a surrogate pair. Use char.IsSurrogatePair to detect them and char.ConvertToUtf32 to obtain the full code point.

Sample code:

using System;
using System.Text;

class Program
{
    static void Main()
    {
        string input = "Hello 😊 你好";
        StringBuilder unicodeBuilder = new StringBuilder();

        for (int i = 0; i < input.Length; i++)
        {
            if (char.IsSurrogatePair(input, i))
            {
                // Handle a surrogate pair: combine both chars into one code point
                int codePoint = char.ConvertToUtf32(input, i);
                unicodeBuilder.Append($"\\U{codePoint:X8}"); // Use \U with 8 hexadecimal digits
                i++; // Skip the low surrogate
            }
            else
            {
                // Handle an ordinary character
                int unicodeValue = input[i];
                unicodeBuilder.Append($"\\u{unicodeValue:X4}");
            }
        }

        string unicodeString = unicodeBuilder.ToString();
        Console.WriteLine(unicodeString); // Output: \u0048\u0065\u006C\u006C\u006F\u0020\U0001F60A\u0020\u4F60\u597D
    }
}
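
To make the conversion less opaque, here is a sketch of the arithmetic that combines a high and a low surrogate into one code point, checked against char.ConvertToUtf32. The helper name CombineSurrogates is hypothetical, and it assumes its arguments are already a valid surrogate pair.

```csharp
using System;

class Program
{
    // Hypothetical helper: combines a valid surrogate pair by hand.
    // Each surrogate carries 10 bits of the value above 0x10000.
    public static int CombineSurrogates(char high, char low)
    {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    static void Main()
    {
        string emoji = "\U0001F60A"; // 😊, stored as the two chars D83D and DE0A
        char high = emoji[0], low = emoji[1];

        int manual = CombineSurrogates(high, low);
        int builtIn = char.ConvertToUtf32(high, low);

        Console.WriteLine($"\\U{manual:X8}"); // \U0001F60A
        Console.WriteLine(manual == builtIn); // True
    }
}
```

In real code, prefer char.ConvertToUtf32, which also validates its input and throws on an invalid pair.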

5. Summary

  • Use the implicit conversion from char to int, or Convert.ToInt32, to get the Unicode value of a character.
  • Use ToString("X4") to format a Unicode value as a \uHHHH escape sequence.
  • For characters stored as surrogate pairs, use char.ConvertToUtf32 and the \UHHHHHHHH format.
  • By iterating over the string and concatenating the results, you can convert the entire string to Unicode escape format.

With these methods, you can convert strings to Unicode values or escape sequences in C#.
