Why do strings behave like value types?

Question:

I was perplexed after running this piece of code, in which strings seem to behave as if they were value types. I am wondering whether the assignment operator works on values for strings, the way the equality operator does.

Here is the code I wrote to test this behavior.

using System;

namespace RefTypeDelimma
{
    class Program
    {
        static void Main(string[] args)
        {
            string a1, a2;

            a1 = "ABC";
            a2 = a1; //This should assign a1 reference to a2
            a2 = "XYZ";  //I expect this should change the a1 value to "XYZ"

            Console.WriteLine("a1:" + a1 + ", a2:" + a2);//Outputs a1:ABC, a2:XYZ
            //Expected: a1:XYZ, a2:XYZ (as string being a ref type)

            Proc(a2); //Altering values of ref types inside a procedure
                      //should reflect in the variable that's being passed in

            Console.WriteLine("a1: " + a1 + ", a2: " + a2); //Outputs a1: ABC, a2: XYZ
            //Expected: a1: NEW_VAL, a2: NEW_VAL (as string being a ref type)
        }

        static void Proc(string Val)
        {
            Val = "NEW_VAL";
        }
    }
}

In the above code, if I use a custom class instead of strings, I get the behavior I expect. Does this have something to do with string immutability?

Any insight on this is welcome.

   a2 = "XYZ";

That's syntactic sugar provided by the compiler. A more accurate representation of this statement would be:

   a2 = CreateStringObjectFromLiteral("XYZ")

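which explains how a2 simply gets a reference to a different string object while a1 keeps referring to "ABC", and that answers your question. (CreateStringObjectFromLiteral is a made-up name, used only to show the idea.) The same reasoning covers Proc: Val holds a copy of the reference stored in a2, so assigning to Val rebinds only the parameter. The actual code is highly optimized because string literals are so common. There's a dedicated opcode available for it in IL: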

   IL_0000:  ldstr      "XYZ"
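To make the point about reassigning a reference concrete, here is a minimal sketch. The Box class and the method names are hypothetical, invented only for this illustration. It suggests that a custom class behaves exactly like string when you assign to the variable, and only looks different when you change the object through the reference, which string does not allow.

```csharp
using System;

// Hypothetical mutable reference type, used only for this illustration.
class Box
{
    public string Value = "";
}

static class RebindDemo
{
    // Assigns to the parameter, which is a copy of the caller's reference.
    // The caller's variable is unaffected.
    public static void Reassign(string val) { val = "NEW_VAL"; }

    // 'ref' passes the caller's variable itself, so the caller sees the change.
    public static void ReassignByRef(ref string val) { val = "NEW_VAL"; }

    // Changes the object that both the caller and the parameter refer to.
    public static void Mutate(Box box) { box.Value = "NEW_VAL"; }

    // Rebinds only the parameter; the caller's Box is untouched.
    public static void Replace(Box box) { box = new Box { Value = "NEW_VAL" }; }

    public static void Main()
    {
        string a1 = "ABC";
        string a2 = a1;                                     // one object, two references
        Console.WriteLine(object.ReferenceEquals(a1, a2));  // True
        a2 = "XYZ";                                         // a2 now refers to another object
        Console.WriteLine(a1 + ", " + a2);                  // ABC, XYZ

        Reassign(a2);
        Console.WriteLine(a2);                              // XYZ
        ReassignByRef(ref a2);
        Console.WriteLine(a2);                              // NEW_VAL

        Box b1 = new Box { Value = "ABC" };
        Box b2 = b1;
        b2.Value = "XYZ";                                   // mutation is visible through b1
        Console.WriteLine(b1.Value);                        // XYZ
        Replace(b1);
        Console.WriteLine(b1.Value);                        // XYZ
        Mutate(b1);
        Console.WriteLine(b1.Value);                        // NEW_VAL
    }
}
```

Immutability matters only in the sense that string has no equivalent of Mutate, so reassignment is the only thing you can do with a string variable. Passing by ref is the way to let a method replace the caller's reference.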

String literals are collected into a table inside the assembly, which allows the JIT compiler to implement the assignment statement very efficiently:

   00000004  mov         esi,dword ptr ds:[02A02088h] 

A single machine code instruction, can't beat that. Moreover, one notable consequence is that the assignment never allocates: the literal's string object is created once and the runtime keeps a reference to it, so the garbage collector never reclaims it while the code is loaded. Whether that object sits in the ordinary garbage-collected heap or in a separate non-collected region depends on the runtime version, but either way you don't pay allocation or collection overhead for each assignment.

Also note that this scheme easily allows for string interning. The compiler simply generates the same ldstr argument for an identical literal.
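A small sketch of what interning looks like from C#, assuming default compiler and runtime settings. The InternDemo class and its method names are made up for this example; object.ReferenceEquals and string.Intern are the standard library calls.

```csharp
using System;

static class InternDemo
{
    // Two identical literals in the same assembly refer to one interned object.
    public static bool LiteralsShareObject()
    {
        string x = "XYZ";
        string y = "XYZ";
        return object.ReferenceEquals(x, y);
    }

    // A string built at run time is a separate object, even with equal contents.
    public static bool RuntimeStringSharesObject()
    {
        string x = "XYZ";
        string z = new string(new[] { 'X', 'Y', 'Z' });
        return object.ReferenceEquals(x, z);
    }

    // string.Intern maps a run-time string back to the interned instance.
    public static bool InternedStringSharesObject()
    {
        string x = "XYZ";
        string z = new string(new[] { 'X', 'Y', 'Z' });
        return object.ReferenceEquals(x, string.Intern(z));
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareObject());        // True
        Console.WriteLine(RuntimeStringSharesObject());  // False
        Console.WriteLine(InternedStringSharesObject()); // True
    }
}
```

Note that == on strings compares contents, so it returns true in all three cases; only the reference comparison reveals which variables share an object.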