


I was perplexed after running this piece of code, where strings seem to behave as if they were value types. I am wondering whether the assignment operator works on values for strings, the way the equality operator does.


Here is the code I wrote to test this behavior.

using System;

namespace RefTypeDelimma
{
    class Program
    {
        static void Main(string[] args)
        {
            string a1, a2;

            a1 = "ABC";
            a2 = a1; //This should assign a1's reference to a2
            a2 = "XYZ";  //I expect this to change the a1 value to "XYZ"

            Console.WriteLine("a1:" + a1 + ", a2:" + a2); //Outputs a1:ABC, a2:XYZ
            //Expected: a1:XYZ, a2:XYZ (as string is a ref type)

            Proc(a2); //Altering values of ref types inside a procedure
                      //should be reflected in the variable that's being passed in

            Console.WriteLine("a1: " + a1 + ", a2: " + a2); //Outputs a1: ABC, a2: XYZ
            //Expected: a1: NEW_VAL, a2: NEW_VAL (as string is a ref type)
        }

        static void Proc(string Val)
        {
            Val = "NEW_VAL";
        }
    }
}


In the above code, if I use a custom class instead of strings, I get the expected behavior. Does this have something to do with string immutability?


   a2 = "XYZ";


That's syntactic sugar provided by the compiler. A more accurate representation of this statement would be:

   a2 = CreateStringObjectFromLiteral("XYZ");


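This shows that a2 simply gets a reference to a different string object while a1 keeps referring to "ABC", which answers your question. The same applies inside Proc: assigning to Val rebinds only the local parameter, not the caller's variable. The actual code is highly optimized because the operation is so common. There's a dedicated opcode available for it in IL: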

   IL_0000:  ldstr      "XYZ"
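
To make the rebinding point concrete, here is a minimal sketch that contrasts assigning to a variable with mutating the object it refers to. The names Box, RebindDemo, Rebind, Mutate and RebindByRef are hypothetical and exist only for this illustration. It suggests that a custom class behaves exactly like string under plain assignment; the difference the question observed most likely comes from changing a member rather than assigning a new object.

```csharp
using System;

// Hypothetical class used only for this illustration.
class Box
{
    public string Value;
}

static class RebindDemo
{
    // Rebinds only the local parameter; the caller's variable is untouched.
    public static void Rebind(Box box)
    {
        box = new Box { Value = "NEW_VAL" };
    }

    // Mutates the single object that caller and callee both refer to.
    public static void Mutate(Box box)
    {
        box.Value = "NEW_VAL";
    }

    // 'ref' passes the variable itself, so the assignment reaches the caller.
    public static void RebindByRef(ref string val)
    {
        val = "NEW_VAL";
    }

    public static void Main()
    {
        var b1 = new Box { Value = "ABC" };
        var b2 = b1;                        // b1 and b2 refer to the same object
        b2 = new Box { Value = "XYZ" };     // b2 now refers to a different object
        Console.WriteLine(b1.Value);        // ABC

        Rebind(b1);
        Console.WriteLine(b1.Value);        // ABC

        Mutate(b1);
        Console.WriteLine(b1.Value);        // NEW_VAL

        string s = "ABC";
        RebindByRef(ref s);
        Console.WriteLine(s);               // NEW_VAL
    }
}
```

Because string exposes no members that change its contents, only the rebinding case can ever be observed with strings, unless the parameter is declared with ref.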


String literals are collected into a table inside the assembly, which allows the JIT compiler to implement the assignment statement very efficiently:

   00000004  mov         esi,dword ptr ds:[02A02088h] 


A single machine code instruction, can't beat that. More so: one very notable consequence is that the string object doesn't live on the heap. The garbage collector doesn't bother with it since it recognizes that the address of the string reference isn't located in the heap. So you don't even pay for collection overhead. Can't beat that.


Also note that this scheme easily allows for string interning. The compiler simply generates the same LDSTR argument for an identical literal.
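
The interning effect can be observed from C# itself. The following is a small sketch, not a description of the runtime's internals; InternDemo and LiteralsShareObject are hypothetical names chosen for the example. It compares references rather than contents, using object.ReferenceEquals and string.Intern.

```csharp
using System;

static class InternDemo
{
    // Two occurrences of the same literal refer to one interned object.
    public static bool LiteralsShareObject()
    {
        string a = "XYZ";
        string b = "XYZ";
        return object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareObject());                        // True

        string a = "XYZ";
        // Built at run time, so a distinct object with equal contents.
        string c = new string(new[] { 'X', 'Y', 'Z' });
        Console.WriteLine(a == c);                                       // True
        Console.WriteLine(object.ReferenceEquals(a, c));                 // False
        // Intern returns the reference already held in the intern pool.
        Console.WriteLine(object.ReferenceEquals(a, string.Intern(c)));  // True
    }
}
```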