string is a very special data type in C#. It is a built-in (primitive) type with its own keyword, and at the same time it is a reference type. Both the compiler and the runtime in .Net apply optimizations to it, and it is precisely these optimizations that sometimes confuse programmers and make strings look hard to figure out. This article has four sections in total; let's talk about the strange side of string.
One. String immutability
To understand the string type more thoroughly, you must first be clear about value types and reference types in .Net.
In C#, the following data types are value types: bool, byte, char, enum, sbyte, and the numeric types (including their nullable forms).
The following data types are reference types: class, interface, delegate, object, string.
Did you notice? The string type we are going to discuss is right there in the list. A string object lives on the heap, and a variable declared as string only holds a reference to it, so string is a genuine reference type. Many readers will then wonder about the following code: if two variables refer to the same string, does changing one of them affect the other? Let's first look at the mystery of the following three lines of code:
    string a = "str_1";
    string b = a;
    a = "str_2";
Don't dismiss this as boring; it must be made clear! In the code above, line 3 hides a secret: it should be understood as making variable a refer to another string, not as modifying the string that a used to refer to. Here is the IL code that illustrates this:
    .maxstack 1
    .locals init ([0] string a, [1] string b)
    IL_0000: nop
    IL_0001: ldstr "str_1"
    IL_0006: stloc.0
    IL_0007: ldloc.0
    IL_0008: stloc.1
    IL_0009: ldstr "str_2"
    IL_000e: stloc.0   // the 2 lines above correspond to the C# code a = "str_2";
    IL_0015: ret
It can be seen that the ldstr instruction at IL_0001 loads the string "str_1" and stloc.0 at IL_0006 associates it with variable a; IL_0007 and IL_0008 load the reference held by a onto the stack and store it into variable b; IL_0009 and IL_000e load the string "str_2" with ldstr and associate it with variable a (the old string is not modified as we might have imagined; a simply refers to a different string object).
In C#, when you instantiate a class with the new keyword, the corresponding IL instruction is newobj; a string literal, however, is loaded by the ldstr instruction. When we see ldstr, we can think of it as IL asking for a string object with that content. (Note: IL asks for the string, but whether a new object is actually created is decided at runtime by the string interning mechanism, which is introduced in Section 2.)
Therefore, the third line of C# code (a = "str_2";) looks as if it changes the old value "str_1" of variable a, but in fact variable a is simply pointed to the memory address of another string, "str_2", while "str_1" stays in memory unaffected, so the value of variable b has not changed at all. This is the immutability of string. Keep this firmly in mind: in .Net, once a string object has been created, it cannot be modified! Operations such as ToUpper, Substring and Trim never change the original; whenever the result differs from the original, they return a new string in memory.
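The behaviour described above can be checked with a small sketch. The class and method names (ImmutabilityDemo, AliasAfterReassign, ToUpperLeavesOriginalAlone) are made up for illustration and are not part of any library.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns what b refers to after a has been pointed at another string.
    public static string AliasAfterReassign()
    {
        string a = "str_1";
        string b = a;   // b and a refer to the same string object
        a = "str_2";    // only the reference held by a changes
        return b;
    }

    // ToUpper hands back a separate object; the original is left untouched.
    public static bool ToUpperLeavesOriginalAlone()
    {
        string original = "str_1";
        string upper = original.ToUpper();
        return original == "str_1"
            && upper == "STR_1"
            && !object.ReferenceEquals(original, upper);
    }

    public static void Main()
    {
        Console.WriteLine(AliasAfterReassign());          // str_1
        Console.WriteLine(ToUpperLeavesOriginalAlone());  // True
    }
}
```

Variable b still sees "str_1" because the assignment to a never touched the object that b refers to.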
Key point of this section: because of the immutability of the string type, readers often conclude that string, although a reference type, behaves like a value type. That impression comes from not understanding string immutability; it is not a "value type characteristic" at all. For example:
    string a = "str_1";
    a = "str_2";
These two lines involve two string objects in memory, "str_1" and "str_2", but only "str_2" is still in use; "str_1" is neither modified nor removed. When strings are built up at runtime in this way, every intermediate result is a separate object, which wastes memory. This is why it is recommended to use StringBuilder when doing a lot of string operations.
Two. String interning in .Net (important)
In the first section we talked about the immutability of strings, which leads us to another important feature of strings: string interning.
In a sense, it is the immutability of strings that makes the interning mechanism possible, and it also makes strings safe to share between threads without synchronization. (In the .Net Framework, the same interned string object can be accessed from different application domains, so interned strings are process-level: garbage collection does not release these string objects, and they are released only when the process ends.)
We use the following 2 lines of code to illustrate string interning:
    string a = "str_1";
    string b = "str_1";
Please think about it: how many string objects do these 2 lines of code create in memory? You might think the answer is 2: since 2 variables are declared, line 1 creates "str_1" in memory for variable a to reference, and line 2 creates another "str_1" for variable b to reference. But is this really the case? Let's use the ReferenceEquals method to check whether variables a and b refer to the same memory address:
    string a = "str_1";
    string b = "str_1";
    Console.WriteLine(ReferenceEquals(a, b)); // Compare whether a and b are the same reference. Output: True
Did you see that? We compared a and b with the ReferenceEquals method. Although we declared 2 variables, they actually refer to the same memory address! This means that string b = "str_1"; does not create a new string in memory at all.
This is because .Net has a very important mechanism for handling strings, called string interning. Since string is used so frequently in programming, the CLR allocates memory only once for identical string literals. The CLR maintains a special internal data structure, which we call the string pool (or intern pool). It can be understood as a hash table that records the literal strings used in the program: the key is the value of the string, and the value is the memory address of the string object. Generally speaking, when a string literal is assigned to a variable, the CLR first looks in the hash table for a string with the same content. If one is found, its address is returned to the variable directly. If not, a new string object is created in memory and added to the pool.
Therefore, these 2 lines of code only produce 1 string object in memory, and variables a and b share the same "str_1".
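As a minimal sketch of the pool lookup described above, string.IsInterned can be used to ask whether a string with a given content is already in the pool; it returns the pooled instance, or null if there is none. The class and method names here are hypothetical.

```csharp
using System;

public static class LiteralPoolDemo
{
    // Two identical literals end up referring to one object.
    public static bool LiteralsShareOneObject()
    {
        string a = "str_1";
        string b = "str_1";
        return object.ReferenceEquals(a, b);
    }

    // The literal can be found in the intern pool.
    public static bool LiteralIsInPool()
    {
        return string.IsInterned("str_1") != null;
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareOneObject()); // True
        Console.WriteLine(LiteralIsInPool());        // True
    }
}
```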
OK, let's understand the following three lines of code based on the string immutability from Section 1 and the interning mechanism from Section 2:
    string a = "str_1"; // Declare variable a and point it to the address of the newly created "str_1" in memory
    a = "str_2"; // The CLR first looks in the string pool to see whether "str_2" already exists. If not, it creates a new "str_2" and makes variable a point to its address, while "str_1" remains unchanged (string immutability)
    string c = "str_2"; // The CLR first looks in the string pool to see whether "str_2" already exists. If it does, variable c is pointed directly to the address of "str_2" (string interning)
So what about strings that are created dynamically? Are they still interned?
Let's look at how the interning mechanism behaves when strings are created dynamically:
(1). Concatenation of string constants
    string a = "str_1" + "str_2";
    string b = "str_1str_2";
    Console.WriteLine(ReferenceEquals(a, b)); // Compare whether a and b are the same reference. Output: True
IL code:
    .maxstack 1
    .locals init ([0] string a, [1] string b)
    IL_0000: nop
    IL_0001: ldstr "str_1str_2"
    IL_0006: stloc.0
    IL_0007: ldstr "str_1str_2"
    IL_000c: stloc.1
    IL_000d: ret
IL_0001 and IL_0006 correspond to the C# code string a = "str_1" + "str_2"; and IL_0007 and IL_000c correspond to string b = "str_1str_2";. It can be seen that when string constants are concatenated, the compiler has already computed the result before the program is compiled into IL. The ldstr instruction directly loads the string value computed by the compiler, so in this case the interning mechanism is effective!
(2). Concatenation of string variables
    string a = "str_1";
    string b = a + "str_2";
    string c = "str_1str_2";
    Console.WriteLine(ReferenceEquals(b, c)); // Output: False
IL code:
    .maxstack 2
    .locals init ([0] string a, [1] string b, [2] string c)
    IL_0000: nop
    IL_0001: ldstr "str_1"
    IL_0006: stloc.0
    IL_0007: ldloc.0
    IL_0008: ldstr "str_2"
    IL_000d: call string [mscorlib]System.String::Concat(string, string)
    IL_0012: stloc.1
    IL_0013: ldstr "str_1str_2"
    IL_0018: stloc.2
    IL_0019: ret
Here, IL_0001 and IL_0006 correspond to string a = "str_1"; IL_0007 through IL_0012 correspond to string b = a + "str_2";, where IL uses the Concat method to join the strings; IL_0013 and IL_0018 correspond to string c = "str_1str_2";. It can be seen that when string variables are concatenated, IL calls the Concat method to produce the result at runtime, so in this case the interning mechanism does not apply!
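The contrast between cases (1) and (2) can be sketched side by side. This example also uses a const local, on the assumption (true for the C# compiler) that a const string counts as a compile-time constant and is folded just like a literal. The names ConcatDemo, ConstPartsAreFolded and VariablePartsAreNot are hypothetical.

```csharp
using System;

public static class ConcatDemo
{
    // Both operands are compile-time constants, so the compiler emits
    // a single ldstr "str_1str_2" and the result is the interned literal.
    public static bool ConstPartsAreFolded()
    {
        const string prefix = "str_1";
        string b = prefix + "str_2";
        return object.ReferenceEquals(b, "str_1str_2");
    }

    // One operand is an ordinary variable, so string.Concat runs at
    // runtime and produces a new object that is not in the pool.
    public static bool VariablePartsAreNot()
    {
        string a = "str_1";
        string b = a + "str_2";
        return object.ReferenceEquals(b, "str_1str_2");
    }

    public static void Main()
    {
        Console.WriteLine(ConstPartsAreFolded());  // True
        Console.WriteLine(VariablePartsAreNot());  // False
    }
}
```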
(3). Explicit instantiation
    string a = "a";
    string b = new string('a', 1);
    Console.WriteLine(ReferenceEquals(a, b)); // Output: False
IL code:
    .maxstack 3
    .locals init ([0] string a, [1] string b)
    IL_0000: nop
    IL_0001: ldstr "a"
    IL_0006: stloc.0
    IL_0007: ldc.i4.s 97
    IL_0009: ldc.i4.1
    IL_000a: newobj instance void [mscorlib]System.String::.ctor(char, int32)
    IL_000f: stloc.1
    IL_0010: ret
This case is easier to understand: IL uses newobj to instantiate a string object, so the interning mechanism does not apply. The constructor call (a char and a repeat count) also shows that a string is essentially a sequence of char values. The birth of a string is never as simple as we think: the stack and the heap have to cooperate for a string to be born. This will be introduced in Section 4.
Of course, when the interning mechanism does not apply automatically, we can use the string.Intern method to place a string in the pool manually, as in the following code:
    string a = "a";
    string b = new string('a', 1);
    Console.WriteLine(ReferenceEquals(a, string.Intern(b))); // Output: True (the program returns True, which shows that a and the interned b come from the same memory address)
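A slightly fuller sketch of manual interning follows. It shows the reference comparison before and after the call to string.Intern. The class and method names are illustrative only.

```csharp
using System;

public static class InternDemo
{
    // A string built with new string(...) is a separate object
    // from the literal with the same content.
    public static bool SameBeforeIntern()
    {
        string a = "a";
        string b = new string('a', 1);
        return object.ReferenceEquals(a, b);
    }

    // string.Intern returns the pooled instance with the same content,
    // which is the object the literal already refers to.
    public static bool SameAfterIntern()
    {
        string a = "a";
        string b = new string('a', 1);
        return object.ReferenceEquals(a, string.Intern(b));
    }

    public static void Main()
    {
        Console.WriteLine(SameBeforeIntern()); // False
        Console.WriteLine(SameAfterIntern());  // True
    }
}
```

Note that string.Intern does not change b itself; it returns the pooled reference, which has to be assigned or used by the caller.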
Three. Interesting comparison operations
In the first and second sections we introduced the immutability and the interning of strings. If you feel that you have mastered the content above, check your learning results in this third section! The following 10 short code samples contrast value comparison with reference comparison to illustrate what was covered in the previous two sections. You can also use them to test your understanding of string.
Code 1:
    string a = "str_1";
    string b = "str_1";
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares the values of the string objects)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: True (ReferenceEquals compares the references of the string objects; because of string interning, a and b are the same reference)
Code 2:
    string a = "str_1str_2";
    string b = "str_1";
    string c = "str_2";
    string d = b + c;
    Console.WriteLine(a.Equals(d)); // Output: True (Equals compares the values of the string objects)
    Console.WriteLine(ReferenceEquals(a, d)); // Output: False (ReferenceEquals compares the references of the string objects; the value of d is the result of concatenating variables, so string interning does not apply)
Code 3:
    string a = "str_1str_2";
    string b = "str_1" + "str_2";
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares the values of the string objects)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: True (ReferenceEquals compares the references of the string objects; the value of b is the result of concatenating constants, so string interning is effective. If the value of b were obtained as "constant + variable", interning would not apply)
Code 4:
    string a = "str_1";
    string b = string.Copy(a);
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares the values of the string objects)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: False (ReferenceEquals compares the references of the string objects; the Copy operation has produced a new string object)
Code 5:
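    string a = "str_1";
    string b = string.Copy(a);
    b = string.Intern(b);
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares the values of the string objects)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: True (ReferenceEquals compares the references of the string objects; Intern returns the pooled string, so b now refers to the same object as a)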
Code 6:
    string a = "str_1";
    string b = string.Copy(a);
    string c = "str_1";
    Console.WriteLine((object)a == (object)b); // Output: False (when both sides of "==" are typed as object, the references are compared, and a and b are different references)
    Console.WriteLine((object)a == (object)c); // Output: True ("==" on object compares the references; because of string interning, a and c are the same reference)
Code 7:
    string a = "str_1";
    string c = "str_1";
    Console.WriteLine(a == c); // Output: True
We just said that when both sides of "==" are typed as object, the references are compared. string is a reference type, so does the code above compare the references of a and c, or their values? The answer is: the values. For the string type, the "==" operator is overloaded to call Equals, so even though "==" is applied to two reference types here, it is really Equals that compares them. (Equals checks the references first as a shortcut: if both sides are the same object, the result is True immediately; otherwise the characters are compared.)
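The difference between the overloaded "==" on string and the reference "==" on object can be sketched with two strings that have equal content but are distinct objects. Because string.Copy is marked obsolete in recent .Net versions, this sketch builds the second string with new string(char[]) instead; the class and method names are hypothetical.

```csharp
using System;

public static class ComparisonDemo
{
    // Builds a distinct object with the same characters as the input.
    private static string DistinctCopy(string s)
    {
        return new string(s.ToCharArray());
    }

    // "==" on two string operands compares the characters.
    public static bool StringEqualityOperator()
    {
        string a = "str_1";
        string b = DistinctCopy(a);
        return a == b;
    }

    // "==" on two object operands compares the references.
    public static bool ObjectEqualityOperator()
    {
        string a = "str_1";
        string b = DistinctCopy(a);
        return (object)a == (object)b;
    }

    public static void Main()
    {
        Console.WriteLine(StringEqualityOperator()); // True
        Console.WriteLine(ObjectEqualityOperator()); // False
    }
}
```

Which operator is used is decided at compile time from the static types of the operands, which is why the casts to object change the result.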
Code 8:
    string a = "a";
    string b = new string('a', 1);
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares values, and the values of a and b are the same)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: False (ReferenceEquals compares the references of the string objects)
Code 9:
    string a = "a";
    string b = new string('a', 1);
    Console.WriteLine(a.Equals(string.Intern(b))); // Output: True (Equals compares values, whether or not Intern is used)
    Console.WriteLine(ReferenceEquals(a, string.Intern(b))); // Output: True (ReferenceEquals compares the references of the string objects; Intern returns the pooled string with the value of b)
Code 10:
    string a = "str";
    string b = "str_2".Substring(0, 3);
    Console.WriteLine(a.Equals(b)); // Output: True (Equals compares values, and the values of a and b are the same)
    Console.WriteLine(ReferenceEquals(a, b)); // Output: False (ReferenceEquals compares the references of the string objects; the Substring operation produces a new string object)
Four. Odds and ends
This section will mainly introduce some common questions about string.
(1). The difference between "string = " and "new string()"
    string test1 = "a";
    string test2 = new string('a', 1);
Both lines above produce a string with the value "a". The difference lies in when "a" comes into being: the "a" in the first line is a literal, which is stored in the assembly's metadata at compile time and is interned when it is loaded at runtime; the second line has the CLR construct a new string object with the value "a" on the heap at runtime, so the latter is not interned.
(2). The difference between string and String
The full name of String is System.String. When compiled into IL code, string and System.String generate exactly the same code (ps: long and System.Int64, float and System.Single, etc. also have this feature):
C# code:
    string str_test = "test";
    String Str_test = "test";
IL code:
    // Code size 14 (0xe)
    .maxstack 1
    .locals init ([0] string str_test, [1] string Str_test)
    IL_0000: nop
    IL_0001: ldstr "test"
    IL_0006: stloc.0
    IL_0007: ldstr "test"
    IL_000c: stloc.1
    IL_000d: ret
Therefore, the two do not differ at the underlying level. string is a C# primitive type keyword, similar to int; System.String is the corresponding basic type in the Framework Class Library (FCL), and the two map directly onto each other.
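The keyword-to-type correspondence can also be checked from code with typeof, as in this small sketch (the class name AliasDemo and its methods are made up for illustration):

```csharp
using System;

public static class AliasDemo
{
    // The C# keyword and the FCL type name denote the same type.
    public static bool StringAliasMatches()
    {
        return typeof(string) == typeof(String)
            && typeof(string).FullName == "System.String";
    }

    // The same holds for the numeric keywords mentioned above.
    public static bool NumericAliasesMatch()
    {
        return typeof(long) == typeof(Int64)
            && typeof(float) == typeof(Single);
    }

    public static void Main()
    {
        Console.WriteLine(StringAliasMatches());  // True
        Console.WriteLine(NumericAliasesMatch()); // True
    }
}
```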
(3). StringBuilder
StringBuilder provides an efficient way to build strings. The character sequence held by a StringBuilder is mutable. When you need to concatenate string variables with "+" many times, it is recommended to use StringBuilder instead and call its ToString() method at the end to get the result. In the early framework implementation described here, calling ToString() returns a reference to the string field that StringBuilder maintains internally; if the StringBuilder is modified again, it creates a new string and modifies that one, so the string that has already been returned does not change. (Later versions of .Net keep the characters in internal char buffers and have ToString() build a new string, but the observable behaviour is the same: a string that has been returned never changes.)
StringBuilder has two important internal fields in that implementation, which are worth knowing:
m_MaxCapacity: the maximum capacity of the StringBuilder, which specifies the maximum number of characters it can hold. The default value is Int32.MaxValue. m_MaxCapacity cannot be changed once specified.
m_StringValue: the string maintained internally by the StringBuilder; it can be thought of as a character array. The ToString() method overridden by StringBuilder returns this field.
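A short sketch of typical StringBuilder usage follows. It relies only on the public API (Append, ToString, MaxCapacity) and not on the internal fields, whose names differ between framework versions. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class StringBuilderDemo
{
    // Joins many pieces with one builder instead of repeated "+".
    public static string JoinParts(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    // A string already returned by ToString() is not affected
    // by later changes to the builder.
    public static string SnapshotAfterFurtherAppend()
    {
        var sb = new StringBuilder("str_1");
        string snapshot = sb.ToString();
        sb.Append("str_2");
        return snapshot;
    }

    // The public MaxCapacity property exposes the maximum capacity.
    public static bool DefaultMaxCapacityIsIntMax()
    {
        return new StringBuilder().MaxCapacity == int.MaxValue;
    }

    public static void Main()
    {
        Console.WriteLine(JoinParts(new[] { "str_1", "str_2" })); // str_1str_2
        Console.WriteLine(SnapshotAfterFurtherAppend());          // str_1
        Console.WriteLine(DefaultMaxCapacityIsIntMax());          // True
    }
}
```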