How Is the Boxing/Unboxing Behavior of Nullable<T> Possible

How is the boxing/unboxing behavior of Nullable<T> possible?

There are two things going on:

1) The compiler treats "null" not as a null reference but as a null value... the null value for whatever type it needs to convert to. In the case of a Nullable<T> it's just the value which has False for the HasValue field/property. So if you have a variable of type int?, it's quite possible for the value of that variable to be null - you just need to change your understanding of what null means a little bit.

2) Boxing nullable types gets special treatment by the CLR itself. This is relevant in your second example:

int? i = new int?();
object x = i;

The CLR boxes any nullable type value differently from non-nullable type values. If the value isn't null, the result is the same as boxing the same value as a non-nullable type value, so an int? with value 5 gets boxed in the same way as an int with value 5, and the "nullability" is lost. However, the null value of a nullable type is boxed to just the null reference, rather than creating an object at all.

This was introduced late in the CLR v2 cycle, at the request of the community.

It means there's no such thing as a "boxed nullable-value-type value".
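To see both cases side by side, here is a minimal sketch. The class and method names (NullableBoxingDemo, Box) are made up for illustration and are not from the question.

```csharp
using System;

public static class NullableBoxingDemo
{
    // Hypothetical helper: boxes a nullable int by converting it to object.
    public static object Box(int? value) => value;

    public static void Main()
    {
        object boxedFive = Box(5);
        object boxedNull = Box(null);

        // A non-null int? boxes to a plain boxed int.
        Console.WriteLine(boxedFive.GetType() == typeof(int)); // True
        // The null value of int? boxes to the null reference.
        Console.WriteLine(boxedNull == null);                  // True
    }
}
```

Note that there is no way to call GetType() on boxedNull: there is no object there at all, only a null reference.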

Boxing / Unboxing Nullable Types - Why this implementation?

I remember this behavior was kind of a last-minute change. In early betas of .NET 2.0, Nullable<T> was a "normal" value type. Boxing a null-valued int? turned it into a boxed int? with a Boolean flag. I think the reason they decided to choose the current approach is consistency. Say:

int? test = null;
object obj = test;
if (test != null)
    Console.WriteLine("test is not null");
if (obj != null)
    Console.WriteLine("obj is not null");

In the former approach (box null -> boxed Nullable<T>), you wouldn't get "test is not null" but you would get "obj is not null", which is weird.

Additionally, if they had boxed a nullable value to a boxed-Nullable<T>:

int? val = 42;
object obj = val;

if (obj != null) {
    // Our object is not null, so intuitively it's an `int` value:
    int x = (int)obj; // ...but this would have failed.
}

Besides that, I believe the current behavior makes perfect sense for scenarios like nullable database values (think SQL-CLR...)



Clarification:

The whole point of providing nullable types is to make it easy to deal with variables that have no meaningful value. They didn't want to provide two distinct, unrelated types. An int? should behave more or less like a simple int. That's why C# provides lifted operators.

"So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing)."

This is not true. The CLR has to allocate memory on the stack to hold the variable whether or not it's nullable. Allocating space for an extra Boolean field is not a performance issue.
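For what the unboxing side looks like from C#, here is a small sketch. The names NullableUnboxingDemo, UnboxToNullable and UnboxToIntThrows are invented for this example.

```csharp
using System;

public static class NullableUnboxingDemo
{
    // Unboxes a reference to int?. A boxed int becomes a value with HasValue == true;
    // a null reference becomes the null value (HasValue == false).
    public static int? UnboxToNullable(object o) => (int?)o;

    // Returns true if unboxing the reference straight to int throws NullReferenceException.
    public static bool UnboxToIntThrows(object o)
    {
        try
        {
            _ = (int)o;
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(UnboxToNullable(5).HasValue);    // True
        Console.WriteLine(UnboxToNullable(null).HasValue); // False
        Console.WriteLine(UnboxToIntThrows(null));         // True
    }
}
```

The int? result is an ordinary struct value held in a local or returned by value, so unboxing into it does not by itself need a heap allocation.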

Boxing/Unboxing and Nullable?

What it's saying is that if you do:

int? x = 5;
object y = x; // Boxing

You end up with a boxed int, not a boxed Nullable<int>. Similarly if you do:

int? x = null; // Same as new Nullable<int>() - HasValue = false;
object y = x; // Boxing

Then y ends up as a null reference.

CIL - Boxing/Unboxing vs Nullable

The new syntax in C# 7 is doing type checking and type conversion at once. In older versions, this was usually done in two possible ways.

if (o is T)
{
    // use (T)o
}

T t = o as T;
if (t != null)
{
    // use t
}

For reference types, the first one has a redundant conversion, because the is operator is compiled to isinst and a conditional branch, as you can see from the CIL instructions in your question. The second code is identical to the first in terms of CIL, minus the additional (T)o cast (compiled to castclass).

For value types, the second option can only be done with a nullable type, and I also think it is actually somewhat slower than the first one (a structure has to be created).
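Spelled out for a concrete value type, the three styles look roughly like this. This is only a sketch with int standing in for T; the class and method names are mine, not from the question.

```csharp
using System;

public static class TypeTestDemo
{
    // Older style 1: test with is, then cast (the type is checked twice).
    public static string WithIsAndCast(object o)
    {
        if (o is int)
            return "int " + (int)o;
        return "not an int";
    }

    // Older style 2: as with a nullable type, then a null check.
    public static string WithAsNullable(object o)
    {
        int? n = o as int?;
        if (n != null)
            return "int " + n.Value;
        return "not an int";
    }

    // C# 7 pattern matching: type check and conversion in one expression.
    public static string WithPattern(object o)
    {
        if (o is int i)
            return "int " + i;
        return "not an int";
    }

    public static void Main()
    {
        foreach (object o in new object[] { 42, "text", null })
        {
            Console.WriteLine(WithIsAndCast(o) + " / " + WithAsNullable(o) + " / " + WithPattern(o));
        }
    }
}
```

All three agree on every input here; the difference is only in what the compiler emits for each form.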

I have compiled the following method to CIL:

static void C<T>(object o) where T : struct
{
    T? t = o as T?;
    if (t != null)
        Console.WriteLine("Argument is {0}: {1}", typeof(T), t);
}

Producing this code:

.method private hidebysig static void  C<valuetype .ctor ([mscorlib]System.ValueType) T>(object o) cil managed
{
// Code size 48 (0x30)
.maxstack 3
.locals init (valuetype [mscorlib]System.Nullable`1<!!T> V_0)
IL_0000: ldarg.0
IL_0001: isinst valuetype [mscorlib]System.Nullable`1<!!T>
IL_0006: unbox.any valuetype [mscorlib]System.Nullable`1<!!T>
IL_000b: stloc.0
IL_000c: ldloca.s V_0
IL_000e: call instance bool valuetype [mscorlib]System.Nullable`1<!!T>::get_HasValue()
IL_0013: brfalse.s IL_002f
IL_0015: ldstr "Argument is {0}: {1}"
IL_001a: ldtoken !!T
IL_001f: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
IL_0024: ldloc.0
IL_0025: box valuetype [mscorlib]System.Nullable`1<!!T>
IL_002a: call void [mscorlib]System.Console::WriteLine(string,
object,
object)
IL_002f: ret
}

This is exactly the same code as in the question, except for the call to GetValueOrDefault, because I don't obtain the actual value of the nullable instance.

Nullable types cannot be boxed or unboxed directly, only via their underlying value, or as a normal null. The first isinst ensures that other types won't produce an exception (I suppose isinst !!T could also be used), only a null reference instead. The unbox.any opcode then forms a nullable instance from the reference, which is then used as usual. The instruction can also be written as a null check and forming the nullable instance on its own, but it's shorter this way.

C# 7 uses the second way for is T t, hence it has no choice but to use the nullable type if T is a value type. Why does it not choose the former option? I can only guess that there may be substantial differences in terms of semantics or implementation, variable allocation, etc. Therefore, they chose to be consistent with the implementation of the new construct.

For comparison, here's what is produced when I change T : struct to T : class in the method above (and T? to T):

.method private hidebysig static void  C<class T>(object o) cil managed
{
// Code size 47 (0x2f)
.maxstack 3
.locals init (!!T V_0)
IL_0000: ldarg.0
IL_0001: isinst !!T
IL_0006: unbox.any !!T
IL_000b: stloc.0
IL_000c: ldloc.0
IL_000d: box !!T
IL_0012: brfalse.s IL_002e
IL_0014: ldstr "Argument is {0}: {1}"
IL_0019: ldtoken !!T
IL_001e: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
IL_0023: ldloc.0
IL_0024: box !!T
IL_0029: call void [mscorlib]System.Console::WriteLine(string,
object,
object)
IL_002e: ret
}

Again fairly consistent with the original method.

Making value type variables nullable by using ?: does it imply boxing?

int? is sugar for Nullable<int>. See the documentation. If we look at the declaration of Nullable<T>, we see:

public struct Nullable<T> where T : struct
{
    ...
    public override bool Equals(object other)
    {
        if (!this.hasValue)
            return other == null;
        return other != null && this.value.Equals(other);
    }
}

Since it is a struct the value will not be boxed.

If you need to compare values, n.Equals(1) would cause boxing of the argument. I cannot find any documentation about the equality operator ==, but I think it would be fairly safe to assume that it should not cause boxing.
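A small sketch of the two comparison styles, with invented names (NullableEqualityDemo, ViaEquals, ViaOperator). As far as I know, == on int? is a lifted operator that the compiler expands into HasValue and value comparisons, so treat the "no boxing" comment as my understanding rather than something taken from the documentation.

```csharp
using System;

public static class NullableEqualityDemo
{
    // Calls Nullable<int>.Equals(object). The nullable itself is not boxed,
    // but a value-type argument is boxed when it is converted to object.
    public static bool ViaEquals(int? n, object other) => n.Equals(other);

    // Uses the lifted == operator on int?; no conversion to object appears here.
    public static bool ViaOperator(int? n, int? other) => n == other;

    public static void Main()
    {
        int? one = 1;
        int? none = null;

        Console.WriteLine(ViaEquals(one, 1));       // True
        Console.WriteLine(ViaEquals(none, null));   // True
        Console.WriteLine(ViaEquals(one, null));    // False
        Console.WriteLine(ViaOperator(one, 1));     // True
        Console.WriteLine(ViaOperator(none, null)); // True
    }
}
```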

Reflection and Nullable<T>

It's because PropertyInfo.GetValue's return type is object, which means that all value types are boxed.

Now, nullable types are boxed differently than other types:

  • a nullable value with HasValue == false will be boxed to the null reference, and
  • a nullable value with HasValue == true will be boxed like the underlying non-nullable type. The nullability information is "lost".
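Both boxing rules show up when a nullable property is read through reflection. The sketch below uses a made-up Sample class with a Count property; if you need to know that the property was declared nullable, you have to ask the PropertyInfo rather than the returned value.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
public class Sample
{
    public int? Count { get; set; }
}

public static class ReflectionNullableDemo
{
    // Reads Sample.Count through reflection; the value comes back boxed as object.
    public static object ReadCount(Sample s)
    {
        PropertyInfo prop = typeof(Sample).GetProperty("Count");
        return prop.GetValue(s, null);
    }

    // The declared type still carries the nullability, even though boxed values do not.
    public static bool IsDeclaredNullable()
    {
        PropertyInfo prop = typeof(Sample).GetProperty("Count");
        return Nullable.GetUnderlyingType(prop.PropertyType) != null;
    }

    public static void Main()
    {
        object withValue = ReadCount(new Sample { Count = 7 });
        object withoutValue = ReadCount(new Sample());

        Console.WriteLine(withValue.GetType() == typeof(int)); // True
        Console.WriteLine(withoutValue == null);               // True
        Console.WriteLine(IsDeclaredNullable());               // True
    }
}
```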


Is it baked into the language and only available to the Nullable type [...]

Exactly, (only) the Nullable struct gets special treatment by the CLR.

The reason for this special treatment is explained in detail in the following question. Basically, it boils down to wanting myNullable == null and myBoxedNullable == null to behave consistently:

  • Boxing / Unboxing Nullable Types - Why this implementation?

Is it possible to cheat the C# compiler to box the Nullable<T> struct and not its value?

I am conjecturing that the question you are trying to ask is:

Can you box a Nullable<int> value without producing a null reference or a boxed int value, but instead an actually boxed Nullable<int>?

No.
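One tempting workaround is to route the conversion through a generic method, so the compiler does not see a nullable type at the point of boxing. The sketch below (BoxGeneric is a hypothetical helper) shows that this makes no difference, since the special handling happens at run time.

```csharp
using System;

public static class BoxAttemptDemo
{
    // Hypothetical helper: boxes through a generic parameter instead of a direct conversion.
    public static object BoxGeneric<T>(T value) => value;

    public static void Main()
    {
        object a = BoxGeneric<int?>(5);
        object b = BoxGeneric<int?>(null);

        // Still a boxed int, not a boxed Nullable<int>.
        Console.WriteLine(a.GetType() == typeof(int)); // True
        // Still just the null reference.
        Console.WriteLine(b == null);                  // True
    }
}
```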

Unboxing Null-Object to primitive type results in NullPointerException, fine?

According to the Java language specification, unboxing happens via calling Number.longValue(), Number.intValue(), etc. There is no special byte code magic happening, it's exactly the same as if you call those methods manually. Thus, the NullPointerException is the natural result of unboxing a null (and in fact mandated by the JLS).

Throwing a different exception would require checking for null twice during every unboxing conversion (once to determine whether to throw the special exception, and once implicitly when the method is actually called). I suppose the language designers didn't think it useful enough to warrant that.


