In C#, why is String a reference type that behaves like a value type?

A String is a reference type even though it has most of the characteristics of a value type such as being immutable and having == overloaded to compare the text rather than making sure they reference the same object.

Why isn't string just a value type then?


Strings aren't value types since they can be huge, and need to be stored on the heap. Value types are (in all implementations of the CLR as of yet) stored on the stack. Stack allocating strings would break all sorts of things: the stack is only 1MB, you'd have to box each string, incurring a copy penalty, you couldn't intern strings, and memory usage would balloon, etc...

(Edit: Added clarification about value type storage being an implementation detail, which leads to this situation where we have a type with value sematics not inheriting from System.ValueType. Thanks Ben.)


It is not a value type because performance (space and time!) would be terrible if it were a value type and its value had to be copied every time it were passed to and returned from methods, etc.

It has value semantics to keep the world sane. Can you imagine how difficult it would be to code if

string s = "hello";
string t = "hello";
bool b = (s == t);

set b to be false ? Imagine how difficult coding just about any application would be.


The distinction between reference types and value types are basically a performance tradeoff in the design of the language. Reference types have some overhead on construction and destruction and garbage collection, because they are created on the heap. Value types on the other hand have overhead on method calls (if the data size is larger than a pointer), because the whole object is copied rather than just a pointer. Because strings can be (and typically are) much larger than the size of a pointer, they are designed as reference types. Also, as Servy pointed out, the size of a value type must be known at compile time, which is not always the case for strings.

The question of mutability is a separate issue. Both reference types and value types can be either mutable or immutable. Value types are typically immutable though, since the semantics for mutable value types can be confusing.

Reference types are generally mutable, but can be designed as immutable if it makes sense. Strings are defined as immutable because it makes certain optimizations possible. For example, if the same string literal occurs multiple times in the same program (which is quite common), the compiler can reuse the same object.

So why is "==" overloaded to compare strings by text? Because it is the most useful semantics. If two strings are equal by text, they may or may not be the same object reference due to the optimizations. So comparing references are pretty useless, while comparing text are almost always what you want.

Speaking more generally, Strings has what is termed value semantics . This is a more general concept than value types, which is a C# specific implementation detail. Value types have value semantics, but reference types may also have value semantics. When a type have value semantics, you can't really tell if the underlying implementation is a reference type or value type, so you can consider that an implementation detail.

链接地址: http://www.djcxy.com/p/79184.html

上一篇: 不安全的c#中的32.dll比使用.net套接字更快?

下一篇: 在C#中,为什么String是一个类似于值类型的引用类型?