I'm surprised by the C# compiler's behavior in the following example:
int i = 1024;
uint x = 2048;
x = x + i; // error CS0266: Cannot implicitly convert type 'long' to 'uint'. An explicit conversion exists (are you missing a cast?)
This is a manifestation of overload resolution for numeric types:
Numeric promotion consists of automatically performing certain implicit conversions of the operands of the predefined unary and binary numeric operators. Numeric promotion is not a distinct mechanism, but rather an effect of applying overload resolution to the predefined operators. Numeric promotion specifically does not affect evaluation of user-defined operators, although user-defined operators can be implemented to exhibit similar effects.
http://msdn.microsoft.com/en-us/library/aa691328(v=vs.71).aspx
If you have a look at
long operator *(long x, long y);
uint operator *(uint x, uint y);
from that link, you see those are two possible overloads (the example refers to operator *, but the same is true for operator +).
The uint is implicitly converted to a long for overload resolution, as is the int:
From uint to long, ulong, float, double, or decimal.
From int to long, float, double, or decimal.
http://msdn.microsoft.com/en-us/library/aa691282(v=vs.71).aspx
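For illustration (a quick sketch of my own; the values are arbitrary), you can watch overload resolution land on the long overload:

uint x = 2048;
int  i = 1024;
var product = x * i;                             // no uint * int overload exists; long * long is chosen
Console.WriteLine(product.GetType().FullName);   // System.Int64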
What is the motivation for this decision?
It would likely take a member of the design team to answer that aspect. Eric Lippert, where are you? :-) Note though that @Nicolas's reasoning below is very plausible, that both operands are converted to the "smallest" type that can contain the full range of values for each operand.
int i = 1024;
uint x = 2048;
// Technique #1
x = x + Convert.ToUInt32(i);
// Technique #2
x = x + checked((uint)i);
// Technique #3
x = x + unchecked((uint) i);
// Technique #4
x = x + (uint)i;
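As a side note (a quick sketch, assuming default project settings): technique #1 and #2 throw an OverflowException when i is negative, while #3 and #4 silently wrap around under the default (unchecked) compiler setting:

int negative = -1;
Console.WriteLine(unchecked((uint)negative));    // 4294967295: the cast simply wraps around
Console.WriteLine((uint)negative);               // same result with the default (unchecked) setting
try
{
    Console.WriteLine(checked((uint)negative));  // throws instead of wrapping
}
catch (OverflowException)
{
    Console.WriteLine("checked cast overflowed");
}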
The short answer is "because the Standard says that it shall be so"; see the informative §14.2.5.2 of ISO 23270. The normative §13.1.2 (Implicit numeric conversions) says:
The implicit numeric conversions are:
...
- From int to long, float, double, or decimal.
- From uint to long, ulong, float, double, or decimal.
...
Conversions from int, uint, long or ulong to float and from long or ulong to double can cause a loss of precision, but will never cause a loss of magnitude. The other implicit numeric conversions never lose any information. (emph. mine)
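To illustrate that precision caveat (a quick sketch of my own; the value is just an example):

int big = 16777217;              // 2^24 + 1 has too many significant bits for a float
float f = big;                   // the implicit int -> float conversion is still allowed
Console.WriteLine((double)f);    // 16777216: precision lost, magnitude preserved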
The [slightly] longer answer is that you are adding two different types: a 32-bit signed integer and a 32-bit unsigned integer:

- int has the domain -2,147,483,648 (0x80000000) to +2,147,483,647 (0x7FFFFFFF)
- uint has the domain 0 (0x00000000) to +4,294,967,295 (0xFFFFFFFF)
So the types aren't compatible, since an int can't contain any arbitrary uint and a uint can't contain any arbitrary int. They are implicitly converted (a widening conversion, per the requirement of §13.1.2 that no information be lost) to the next largest type that can contain both: a long in this case, a signed 64-bit integer, which has the domain -9,223,372,036,854,775,808 (0x8000000000000000) to +9,223,372,036,854,775,807 (0x7FFFFFFFFFFFFFFF).
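You can observe that widening directly (a small sketch along the lines of the original example):

int i = 1024;
uint x = 2048;
var sum = x + i;                             // resolved as long + long
Console.WriteLine(sum.GetType().FullName);   // System.Int64
// x = x + i;                                // still CS0266: the long result does not fit back into a uint implicitly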
Edited to note: Just as an aside, executing this code:
var x = 1024 + 2048u;
Console.WriteLine("'x' is an instance of `{0}`", x.GetType().FullName);
does not yield a long as in the original poster's example. Instead, what is produced is:
'x' is an instance of `System.UInt32`
This is because of constant folding. The first element in the expression, 1024, has no suffix and as such is an int, and the second element in the expression, 2048u, is a uint, according to the rules:
- If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
- If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
And since the optimizer knows what the values are, the sum is precomputed and evaluated as a uint.
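In other words (a quick sketch of my own to contrast the two cases): the uint result appears when the operands are compile-time constants; the same values in plain variables give a long, as in the original example:

var k1 = 1024 + 2048u;                   // both operands are compile-time constants: System.UInt32
int  a = 1024;
uint b = 2048u;
var k2 = a + b;                          // same values in variables: long + long, so System.Int64
Console.WriteLine(k1.GetType().Name);    // UInt32
Console.WriteLine(k2.GetType().Name);    // Int64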
Consistency is the hobgoblin of little minds.
Why int + int = int and uint + uint = uint, but int + uint = long? What is the motivation for this decision?
The way the question is phrased implies the presupposition that the design team wanted int + uint to be long, and chose type rules to attain that goal. That presupposition is false.
Rather, the design team thought about questions like what the predefined numeric types and operators should be, which conversions between them should be implicit, and how overload resolution should pick among the predefined operators, as well as many other considerations such as whether the design works for or against debuggable, maintainable, versionable programs, and so on. (I note that I was not in the room for this particular design meeting, as it predated my time on the design team. But I have read their notes and know the kinds of things that would have concerned the design team during this period.)
Investigating these questions led to the present design: that arithmetic operations are defined as int + int --> int, uint + uint --> uint, long + long --> long, int may be converted to long, uint may be converted to long, and so on.
A consequence of these decisions is that when adding uint + int, overload resolution chooses long + long as the closest match, and long + long is long, therefore uint + int is long.
Making uint + int have some other behavior that you might consider more sensible was not a design goal of the team at all, because mixing signed and unsigned values is, first, rare in practice and, second, almost always a bug. The design team could have added special cases for every combination of signed and unsigned one-, two-, four-, and eight-byte integers, as well as char, float, double and decimal, or any subset of those many hundreds of cases, but that works against the goal of simplicity.
So in short, on the one hand we have a large amount of design work to make a feature that we want no one to actually use easier to use at the cost of a massively complicated specification. On the other hand we have a simple specification that produces an unusual behavior in a rare case we expect no one to encounter in practice. Given those choices, which would you choose? The C# design team chose the latter.
I think the behavior of the compiler is pretty logical and expected.
In the following code:
int i = 0;
int j = 0;
var k = i + j;
There is an exact overload for this operation, so k is int. The same logic applies when adding two uints, two bytes, or what have you. The compiler's job is easy here; it's happy because overload resolution finds an exact match. There is a pretty good chance that the person writing this code expects k to be an int and is aware that the operation can overflow in certain circumstances.
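As an aside, a tiny sketch of that overflow in the default unchecked context:

int max = int.MaxValue;
int wrapped = max + 1;          // int + int stays int; wraps to -2147483648 when not checked
Console.WriteLine(wrapped);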
Now consider the case you are asking about:
uint i = 0;
int j = 0;
var k = i + j;
What does the compiler see? Well, it sees an operation that has no matching overload; there is no operator + overload that takes an int and a uint as its two operands. So the overload resolution algorithm goes ahead and tries to find an operator overload that can be valid. This means it has to find an overload where the types involved can "hold" the original operands; that is, both i and j have to be implicitly convertible to said type(s).
The compiler can't implicitly convert uint to int because such a conversion doesn't exist. It can't implicitly convert int to uint either, because that conversion also doesn't exist (both can cause a change in magnitude). So the only choice it really has is to choose the first broader type that can "hold" both operand types, which in this case is long. Once both operands are implicitly converted to long, k being long is obvious.
The motivation of this behavior is, IMO, to choose the safest available option and not second-guess the dubious coder's intent. The compiler can't make an educated guess as to what the person writing this code expects k to be. An int? Well, why not a uint? Both options seem equally bad. The compiler chooses the only logical path, the safe one: long. If the coder wants k to be either int or uint, they only have to explicitly cast one of the operands.
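For example (a small sketch of my own; the values are arbitrary):

uint i = 10;
int  j = 3;
int  asInt  = (int)i + j;       // int + int   -> int
uint asUint = i + (uint)j;      // uint + uint -> uint
Console.WriteLine(asInt);       // 13
Console.WriteLine(asUint);      // 13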
And last, but not least, the C# compiler's overload resolution algorithm does not consider the return type when deciding the best overload. So the fact that you are storing the operation result in a uint is completely irrelevant to the compiler and has no effect whatsoever on the overload resolution process.
This is all speculation on my part, and I may be completely wrong. But it does seem logical reasoning.
The numerical promotion rules for C# are loosely based upon those of Java and C, which work by identifying a type to which both operands can be converted and then making the result be the same type. I think such an approach was reasonable in the 1980s, but newer languages should set it aside in favor of one that looks at how values are used (e.g. if I were designing a language, then given Int32 i1,i2,i3,i4; Int64 l; a compiler would process i4=i1+i2+i3; using 32-bit math [throwing an exception in case of overflow] but would process l=i1+i2+i3; with 64-bit math), but the C# rules are what they are and don't seem likely to change.
It should be noted that the C# promotion rules by definition always select the overloads which are deemed "most suitable" by the language specification, but that doesn't mean they're really the most suitable for any useful purpose. For example, double f=1111111100/11111111.0f; would seem like it should yield 100.0, and it would be correctly computed if both operands were promoted to double, but the compiler will instead convert the integer 1111111100 to float, yielding 1111111040.0f, and then perform the division, yielding 99.999992370605469.
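A quick sketch of my own to show the difference (promoting to double instead keeps the expected result):

double viaFloat  = 1111111100 / 11111111.0f;   // the int operand becomes 1111111040f before dividing
double viaDouble = 1111111100 / 11111111.0;    // promoted to double instead: no precision lost here
Console.WriteLine(viaFloat);                   // roughly 99.9999923706055, not 100
Console.WriteLine(viaDouble);                  // 100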