Null-coalescing operator custom implicit conversion behaviour
Note: this appears to have been fixed in Roslyn
This question arose when writing my answer to this one, which talks about the associativity of the null-coalescing operator.
Just as a reminder, the idea of the null-coalescing operator is that an expression of the form
x ?? y
first evaluates x, then:
- If x is null, y is evaluated and that is the end result of the expression
- If x is non-null, y is not evaluated, and the value of x is the end result of the expression, after a conversion to the compile-time type of y if necessary
Now usually there's no need for a conversion, or it's just from a nullable type to a non-nullable one - usually the types are the same, or just from (say) int? to int. However, you can create your own implicit conversion operators, and those are used where necessary.
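To make the expected semantics concrete, here is a minimal sketch of what x ?? y is supposed to mean when x is a nullable value type. The helper name Coalesce and its delegate parameters are hypothetical, invented for this illustration; this is not how the compiler implements the operator.

```csharp
using System;

static class CoalesceSketch
{
    // Hypothetical helper, for illustration only: models x ?? y with an
    // explicit conversion step on the non-null path.
    public static TResult Coalesce<TLeft, TResult>(
        Func<TLeft?> left, Func<TResult> right, Func<TLeft, TResult> convert)
        where TLeft : struct
    {
        TLeft? temp = left();          // x is evaluated exactly once
        return temp.HasValue
            ? convert(temp.Value)      // non-null: convert the stored value
            : right();                 // null: only now is y evaluated
    }

    static void Main()
    {
        int calls = 0;
        int? result = Coalesce<int, int?>(
            () => { calls++; return 5; },   // stands in for x
            () => null,                     // stands in for y (never run here)
            v => v);                        // conversion int -> int?
        // Prints: 5 after 1 evaluation(s) of x
        Console.WriteLine(result + " after " + calls + " evaluation(s) of x");
    }
}
```

The point of the sketch is the temporary: whatever conversion is needed, it should be applied to the value already obtained from x, never to a second evaluation of x.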
For the simple case of x ?? y, I haven't seen any odd behaviour. However, with (x ?? y) ?? z I see some confusing behaviour.
Here's a short but complete test program - the results are in the comments:
using System;

public struct A
{
    public static implicit operator B(A input)
    {
        Console.WriteLine("A to B");
        return new B();
    }

    public static implicit operator C(A input)
    {
        Console.WriteLine("A to C");
        return new C();
    }
}

public struct B
{
    public static implicit operator C(B input)
    {
        Console.WriteLine("B to C");
        return new C();
    }
}

public struct C {}

class Test
{
    static void Main()
    {
        A? x = new A();
        B? y = new B();
        C? z = new C();
        C zNotNull = new C();

        Console.WriteLine("First case");
        // This prints
        // A to B
        // A to B
        // B to C
        C? first = (x ?? y) ?? z;

        Console.WriteLine("Second case");
        // This prints
        // A to B
        // B to C
        var tmp = x ?? y;
        C? second = tmp ?? z;

        Console.WriteLine("Third case");
        // This prints
        // A to B
        // B to C
        C? third = (x ?? y) ?? zNotNull;
    }
}
So we have three custom value types, A, B and C, with conversions from A to B, A to C, and B to C.
I can understand both the second case and the third case... but why is there an extra A to B conversion in the first case? In particular, I'd really have expected the first case and second case to be the same thing - it's just extracting an expression into a local variable, after all.
Any takers on what's going on? I'm extremely hesitant to cry "bug" when it comes to the C# compiler, but I'm stumped as to what's going on...
EDIT: Okay, here's a nastier example of what's going on, thanks to configurator's answer, which gives me further reason to think it's a bug. EDIT: The sample doesn't even need two null-coalescing operators now...
using System;

public struct A
{
    public static implicit operator int(A input)
    {
        Console.WriteLine("A to int");
        return 10;
    }
}

class Test
{
    static A? Foo()
    {
        Console.WriteLine("Foo() called");
        return new A();
    }

    static void Main()
    {
        int? y = 10;
        int? result = Foo() ?? y;
    }
}
The output of this is:
Foo() called
Foo() called
A to int
The fact that Foo() gets called twice here is hugely surprising to me - I can't see any reason for the expression to be evaluated twice.
Thanks to everyone who contributed to analyzing this issue. It is clearly a compiler bug. It appears to only happen when there is a lifted conversion involving two nullable types on the left-hand side of the coalescing operator.
I have not yet identified where precisely things go wrong, but at some point during the "nullable lowering" phase of compilation -- after initial analysis but before code generation -- we reduce the expression
result = Foo() ?? y;
from the example above to the moral equivalent of:
A? temp = Foo();
result = temp.HasValue ?
    new int?(A.op_Implicit(Foo().Value)) :
    y;
Clearly that is incorrect; the correct lowering is
result = temp.HasValue ?
    new int?(A.op_Implicit(temp.Value)) :
    y;
My best guess based on my analysis so far is that the nullable optimizer is going off the rails here. We have a nullable optimizer that looks for situations where we know that a particular expression of nullable type cannot possibly be null. Consider the following naive analysis: we might first say that
result = Foo() ?? y;
is the same as
A? temp = Foo();
result = temp.HasValue ?
    (int?) temp :
    y;
and then we might say that
conversionResult = (int?) temp
is the same as
A? temp2 = temp;
conversionResult = temp2.HasValue ?
    new int?(op_Implicit(temp2.Value)) :
    (int?) null;
But the optimizer can step in and say "whoa, wait a minute, we already checked that temp is not null; there's no need to check it for null a second time just because we are calling a lifted conversion operator". We'd then optimize it away to just
new int?(op_Implicit(temp2.Value))
My guess is that we are somewhere caching the fact that the optimized form of (int?)Foo() is new int?(op_Implicit(Foo().Value)) but that is not actually the optimized form we want; we want the optimized form of Foo()-replaced-with-temporary-and-then-converted.
Many bugs in the C# compiler are a result of bad caching decisions. A word to the wise: every time you cache a fact for use later, you are potentially creating an inconsistency should something relevant change. In this case the relevant thing that has changed post initial analysis is that the call to Foo() should always be realized as a fetch of a temporary.
We did a lot of reorganization of the nullable rewriting pass in C# 3.0. The bug reproduces in C# 3.0 and 4.0 but not in C# 2.0, which means that the bug was probably my bad. Sorry!
I'll get a bug entered into the database and we'll see if we can get this fixed up for a future version of the language. Thanks again everyone for your analysis; it was very helpful!
UPDATE: I rewrote the nullable optimizer from scratch for Roslyn; it now does a better job and avoids these sorts of weird errors. For some thoughts on how the optimizer in Roslyn works, see my series of articles which begins here: https://ericlippert.com/2012/12/20/nullable-micro-optimizations-part-one/
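For readers who want to see the intended behaviour run, here is a hand-written version of the correct lowering described above. It is only a sketch: the class name LoweringSketch, the method Lowered and the FooCalls counter are invented for the illustration, and the code mirrors the shape of the lowering rather than anything the compiler literally emits.

```csharp
using System;

public struct A
{
    public static implicit operator int(A input)
    {
        Console.WriteLine("A to int");
        return 10;
    }
}

static class LoweringSketch
{
    public static int FooCalls;   // counter added purely for illustration

    static A? Foo()
    {
        FooCalls++;
        Console.WriteLine("Foo() called");
        return new A();
    }

    // Hand-written equivalent of: int? result = Foo() ?? y;
    public static int? Lowered(int? y)
    {
        A? temp = Foo();                    // evaluate the left operand once
        return temp.HasValue
            ? new int?((int)temp.Value)     // convert the temporary, not Foo() again
            : y;
    }

    static void Main()
    {
        int? result = Lowered(10);
        // Prints "Foo() called", then "A to int", then "10, Foo calls: 1"
        Console.WriteLine(result + ", Foo calls: " + FooCalls);
    }
}
```

Replacing temp.Value with Foo().Value inside Lowered reproduces the double call by hand, which is the shape of the faulty lowering.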
This is most definitely a bug.
public class Program {
    static A? X() {
        Console.WriteLine("X()");
        return new A();
    }
    static B? Y() {
        Console.WriteLine("Y()");
        return new B();
    }
    static C? Z() {
        Console.WriteLine("Z()");
        return new C();
    }
    public static void Main() {
        C? test = (X() ?? Y()) ?? Z();
    }
}
This code will output:
X()
X()
A to B (0)
X()
X()
A to B (0)
B to C (0)
That made me think that the first part of each ?? coalesce expression is evaluated twice. This code proved it:
B? test = (X() ?? Y());
outputs:
X()
X()
A to B (0)
This seems to happen only when the expression requires a conversion between two nullable types; I've tried various permutations with one of the sides being a string, and none of them caused this behaviour.
If you take a look at the generated code for the left-grouped case, it actually does something like this (csc /optimize-):
C? first;
A? atemp = a;
B? btemp = (atemp.HasValue ? new B?(a.Value) : b);
if (btemp.HasValue)
{
    first = new C?((atemp.HasValue ? new B?(a.Value) : b).Value);
}
Another find: if you use first, it will generate a shortcut if both a and b are null and return c. Yet if a or b is non-null, it re-evaluates a as part of the implicit conversion to B before returning whichever of a or b is non-null.
From the C# 4.0 Specification, §6.1.4:
If the nullable conversion is from S? to T?:
- If the source value is null (HasValue property is false), the result is the null value of type T?.
- Otherwise, the conversion is evaluated as an unwrapping from S? to S, followed by the underlying conversion from S to T, followed by a wrapping (§4.1.10) from T to T?.
This appears to explain the second unwrapping-wrapping combination.
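As a rough illustration of the conversion rule quoted from the specification, the steps can be spelled out by hand. The helper Lift and its underlying parameter are hypothetical names used only for this sketch; the compiler generates the equivalent logic inline rather than calling anything like it.

```csharp
using System;

static class LiftedConversionSketch
{
    // Hypothetical helper: spells out the S? -> T? steps for a given
    // underlying S -> T conversion.
    public static T? Lift<S, T>(S? source, Func<S, T> underlying)
        where S : struct
        where T : struct
    {
        if (!source.HasValue)
            return null;                      // null source gives the null value of T?
        S unwrapped = source.Value;           // unwrapping: S? -> S
        T converted = underlying(unwrapped);  // underlying conversion: S -> T
        return converted;                     // wrapping: T -> T?
    }

    static void Main()
    {
        int? some = 3;
        int? none = null;
        long? a = Lift<int, long>(some, i => i);
        long? b = Lift<int, long>(none, i => i);
        // Prints: True False
        Console.WriteLine(a.HasValue + " " + b.HasValue);
    }
}
```

Each call unwraps and wraps once, so a correct compiler needs only one such sequence per conversion, applied to a value it has already stored.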
The C# 2008 and 2010 compilers produce very similar code; however, this looks like a regression from the C# 2005 compiler (8.00.50727.4927), which generates the following code for the above:
A? a = x;
B? b = a.HasValue ? new B?(a.GetValueOrDefault()) : y;
C? first = b.HasValue ? new C?(b.GetValueOrDefault()) : z;
I wonder if this is not due to the additional magic given to the type inference system?