Generics: Casting and Value Types, Why Is This Illegal

Why is this a compile time error?

The problem is that every possible combination of value types has different rules for what a cast means. Casting a 64-bit double to a 16-bit integer is completely different code from casting a decimal to a float, and so on. The number of possibilities is enormous. So think like the compiler: what code is the compiler supposed to generate for your program?

The compiler would have to generate code that starts the compiler again at runtime, does a fresh analysis of the types, and dynamically emits the appropriate code.

That seems like perhaps more work and less performance than you expected to get with generics, so we simply outlaw it. If what you really want is for the compiler to start up again and do an analysis of the types, use "dynamic" in C# 4; that's what it does.

And why is this a runtime error?

Same reason.

A boxed int may only be unboxed to int (or int?), for the same reason as above; if the CLR tried to do every possible conversion from a boxed value type to every other possible value type then essentially it has to run a compiler again at runtime. That would be unexpectedly slow.

So why is it not an error for reference types?

Because every reference type conversion is the same as every other reference type conversion: you interrogate the object to see if it is derived from or identical to the desired type. If it's not, you throw an exception (if doing a cast) or produce null or false (if using the "as" or "is" operators). The rules are consistent for reference types in a way that they are not for value types. Remember, reference types know their own type. Value types do not; with value types, the variable doing the storage is the only thing that knows the type semantics that apply to those bits. Value types contain their values and no additional information. Reference types contain their values plus lots of extra data.

For more information see my article on the subject:

http://ericlippert.com/2009/03/03/representation-and-identity/

Why couldn't Java figure out some obvious illegal casts when type parameters are involved?

Explanation

Java, based on its current rule set (see the JLS) has to treat the method content and its call-site separately.



Method content

The cast

(Integer) obj

has to be allowed at compile-time since T could be an Integer. After all, a call like

f(4)

should succeed and be allowed.

Java is not allowed to take the call-site of the method into consideration. Doing so would also imply that Java has to scan all call-sites, which is impossible: that set includes future call-sites that have not been written yet or are added later, for example when you are writing a library.
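The method f from the question is not reproduced here. As a minimal sketch, assuming it is a generic method that casts its argument to Integer (the class name CastDemo and the Integer return type are illustrative choices, not taken from the question), the behavior described above could look like this:

```java
class CastDemo {
    // T could be Integer, so the compiler has to accept this cast.
    static <T> Integer f(T obj) {
        return (Integer) obj;
    }

    public static void main(String[] args) {
        // T is inferred as Integer: the cast succeeds.
        System.out.println(f(4));

        // T is inferred as String: the call still compiles,
        // but the cast inside f fails at runtime.
        try {
            f("hello");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for f(\"hello\")");
        }
    }
}
```

Both calls are accepted by the compiler because each side is checked on its own: the body only against the bound of T, the call only against the signature.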



Call-site

The call-site also has to be legal since Java is not allowed to take the method-content into consideration.

The signature demands T (extends Object) and String fulfills that. So it is allowed to be called like that.

If Java also checked the content, imagine you hid the cast three levels deeper in some other method calls. Then Java would have to check not only f's code but also the code of those methods, and possibly all their if statements, to determine whether the line with the bad cast is even reached.
Proving that with 100% certainty at compile-time is undecidable in general (it amounts to the halting problem), hence it is not part of the rule set.



Why?

While we saw that such situations are not always easy to detect, and that actually proving it for all possible situations is impossible (the general problem is undecidable), Java's designers could certainly have added some weaker rules that cover the dangerous situations partially.

Also, there are actually some similar situations in which Java will help you out with weaker rules. For example a cast like

House a = new House();
Dog b = (Dog) a;

is forbidden because Java can easily prove that the two classes are completely unrelated. But as soon as the setup becomes more complex, with types coming from other methods or generics, Java cannot easily check it anymore.
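As a rough illustration of how quickly this weaker rule stops applying, here is a small sketch. House and Dog are taken from the example above; the Pet interface and the class UnrelatedCastDemo are hypothetical additions for demonstration only.

```java
class UnrelatedCastDemo {
    static class House {}
    static class Dog {}
    interface Pet {}

    // Once the static type is Object, the compiler can no longer rule the cast out.
    static boolean castHouseToDogViaObject() {
        Object a = new House();
        try {
            Dog b = (Dog) a; // compiles, fails at runtime
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    // A cast from a non-final class to an interface is also accepted,
    // because some subclass of House might implement Pet.
    static boolean castHouseToPet() {
        House a = new House();
        try {
            Pet p = (Pet) a; // compiles, fails at runtime
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Dog b = (Dog) new House(); // rejected at compile time
        System.out.println(castHouseToDogViaObject());
        System.out.println(castHouseToPet());
    }
}
```

Only the commented-out line is caught by the compiler; the other two casts surface as a ClassCastException at runtime.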

All in all, you will have to ask a Java Language Designer for the precise reasons in the decision making. It is what it is.



Static code analysis

What you have here is typically the job of a static code analyzer (like most IDEs already include). They will actually scan through your code, all usages and so on and try to figure out if your current code flow has possible issues.

It is important to note that this also includes a lot of false-positives, as we just learned that not all of such usages might actually be wrong, some dangerous setups might be intended.



Appendix: Comments

Based on the discussion in the comments, let me stress that your particular example is indeed simple to prove wrong. So the call-site could easily be forbidden in this particular case (any static code analyzer will happily throw a warning at you for this code).

However, very simple modifications to the code already demonstrate why it is so difficult to actually prove errors when connecting the call-site with the content of the method.

In short, almost all real code situations require much more effort for a tool to prove with 100% certainty that the call is incorrect. Such an analysis is also much more difficult to program, and it cannot always be ensured that there are no false-positives. That is why this kind of check is typically not done by a compiler but rather by static code analyzers.

The two common examples for this are method nesting and code branching.

Nesting

Imagine you hide the cast (Integer) obj a level deeper, in another method:

public static void main(String[] args) {
    f("hello");
}

public static <T> void f(T obj) {
    g(obj);
}

public static <T> void g(T obj) {
    Integer i = (Integer) obj;
}

In order to prove this, Java would now have to connect the call-site from main to the content in f, to the call-site in g. If you add more levels of nesting this quickly gets out of control and needs a recursive deep analysis before anything can be proven.

Branching

Another very common but difficult situation for the compiler is if you hide the bad cast in a branched code flow:

public static void main(String[] args) {
    f("hello");
}

public static <T> void f(T obj) {
    if (isMonday()) {
        Integer a = (Integer) obj;
    } else {
        String b = (String) obj;
    }
}

Now, Java would need knowledge about what isMonday() returns at compile-time, which is simply impossible.

But it would be bad if Java flagged this. What if we ensure externally that the program is never launched on Mondays? Then only the String branch runs, and it should work.

Why is this assignment illegal in Java generics?

Because, with:

ArrayList<? extends Number> list = new ArrayList<Number>(); //OK

you are declaring a variable whose type is ArrayList parameterized with the upper-bounded wildcard <? extends Number>.

The wildcard matches Number and any subtype of Number, which means that you can assign to your list variable any ArrayList parameterized with a type that extends Number:

list = new ArrayList<Float>(); //will work fine
list = new ArrayList<Double>(); //will work fine
list = new ArrayList<String>(); //will NOT work as String does not extend Number

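The only caveat is wildcard capture: because the compiler does not know the actual element type, you won't be able to add Number instances (or instances of its subclasses) to the list through this variable. Only null can be added.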
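A minimal sketch of that caveat follows. The class name WildcardReadDemo and the method sum are made up for illustration. Reading elements as Number is always safe, while adding is rejected by the compiler:

```java
import java.util.ArrayList;
import java.util.List;

class WildcardReadDemo {
    // Reading is fine: whatever the element type is, it is at least a Number.
    static double sum(List<? extends Number> list) {
        double total = 0;
        for (Number n : list) {
            total += n.doubleValue();
        }
        // list.add(Integer.valueOf(1)); // would not compile: element type is unknown
        return total;
    }

    public static void main(String[] args) {
        ArrayList<Float> floats = new ArrayList<>();
        floats.add(1.5f);
        floats.add(2.5f);

        // Allowed: Float extends Number.
        ArrayList<? extends Number> list = floats;
        System.out.println(sum(list)); // prints 4.0
    }
}
```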


In here, however:

ArrayList<T extends Number> list = new ArrayList<Number>(); //Syntax error

you have a syntax error in your variable declaration. You are using a bounded type parameter to declare your variable; however, a bounded type parameter may only appear where a generic class, interface, or method is defined.

A type parameter T, or any other (preferably capital) letter, is declared when you define your generic class (or method).

For example, your upper-bounded generic type in this definition:

public class YourClass<T extends Number> {
....
}

is replaced by a concrete type, at compile time, when you declare a variable of type YourClass and provide a real type as the generic type argument, like this:

YourClass<Integer> ints; //will work fine
YourClass<Double> doubles; //will work fine
YourClass<Float> floats; //will work fine
YourClass<String> strings; //will NOT work, as String does not extend Number

Note that here as well, T extends Number matches any type that extends Number (including Number itself).
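To show both places where a bounded type parameter may legitimately be declared, here is a short sketch. The names BoundedTypeDemo, Box, and half are hypothetical and only serve the illustration.

```java
class BoundedTypeDemo {
    // Bounded type parameter declared on a generic class.
    static class Box<T extends Number> {
        private final T value;

        Box(T value) {
            this.value = value;
        }

        // The bound guarantees that Number's methods are available on T.
        double doubled() {
            return value.doubleValue() * 2;
        }
    }

    // Bounded type parameter declared on a generic method.
    static <T extends Number> double half(T value) {
        return value.doubleValue() / 2;
    }

    public static void main(String[] args) {
        Box<Integer> ints = new Box<>(21);
        System.out.println(ints.doubled()); // prints 42.0
        System.out.println(half(3));        // prints 1.5
        // Box<String> strings;             // would not compile: String does not extend Number
    }
}
```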

Understanding Illegal Generic Cast

MyType<Base> and MyType<Derived> do not have any inheritance relationship, even if Derived derives from Base. The two generic types are just two different types.

One way to work around this problem is to have a non-generic interface as the base interface:

public interface IEntityService
{
    void DoSomething(object item);
}

public interface IEntityService<T> : IEntityService
{
    void DoSomething(T item);
}

This pattern is used in the .NET class library; for example, IEnumerable<T> derives from the non-generic IEnumerable.

If you know that your interface uses the generic type (T) only for outputs, you can use the out keyword: IMyInterface<out T>. You can then provide a more derived type for T. This is called covariance. The return values of the methods will then yield a more derived type than the consumer expects, and this is OK.

If it uses the generic type only for inputs, use the in keyword: IMyInterface<in T>. You can then provide a less derived type for T. This is called contravariance. The input arguments of the methods will then receive a more derived type than the implementation expects, and this is OK.

Java generics - Why is this assignment in the constructor illegal?

Remove the <T> in front of the constructor. Java thinks you are trying to declare a new type parameter. It treats the T in the class declaration as a different T than the one on the constructor. In Java's eyes you have T1 and T2, and you are trying to assign a T2 to a T1 variable. Even though they may have identical bounds, methods, and so on, they are two distinct type parameters.

This is how Java interprets what you've written.

public class Container<T1> {
    private T1 content;
    private T1 defaultValue;

    public <T2> Container(T2 defaultValue) {
        // Compiler error - incompatible types: T2 cannot be converted to T1.
        // (With both parameters named T, the message reads "T cannot be converted to T".)
        this.defaultValue = defaultValue;
    }
}

What you meant to write was this. You don't need to specify T inside < > anywhere since it's included in the class syntax.

public class Container<T> {
    private T content;
    private T defaultValue;

    public Container(T defaultValue) {
        this.defaultValue = defaultValue;
    }
}
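For contrast, a type parameter on a constructor is legal when it is genuinely meant to be independent of the class's T. The following sketch is illustrative only: GenericConstructorDemo, the origin field, and the second constructor argument are hypothetical and not part of the question.

```java
class GenericConstructorDemo {
    static class Container<T> {
        private final T defaultValue;
        private final String origin;

        // <S> belongs to this constructor only; it is unrelated to the class's T.
        <S> Container(T defaultValue, S origin) {
            this.defaultValue = defaultValue;     // T assigned to T: fine
            this.origin = String.valueOf(origin); // S is only used as a plain Object here
            // this.defaultValue = origin;        // would not compile: S is not T
        }

        T getDefaultValue() {
            return defaultValue;
        }

        String getOrigin() {
            return origin;
        }
    }

    public static void main(String[] args) {
        Container<Integer> c = new Container<Integer>(0, "config file");
        System.out.println(c.getDefaultValue() + " from " + c.getOrigin());
    }
}
```

The commented-out line reproduces the same category of error as in the question: the compiler has no reason to believe that S and T are the same type.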

Why is this generic assignment illegal?

This is a tricky but interesting thing about wildcard types that you have run into! It is quite logical once you understand it.

The error has to do with the fact that the wildcard ? extends Number does not refer to one single concrete type, but to some unknown type. Thus two occurrences of ? extends Number don't necessarily refer to the same type, so the compiler can't allow the assignment.

Detailed explanation

  1. The right-hand side of the assignment, tt.getList(), does not get the type List<List<? extends Number>>. Instead, each use of it is assigned a unique capture type generated by the compiler, for example List<List<capture#1 extends Number>>.

  2. The capture type List<capture#1 extends Number> is a subtype of List<? extends Number>, but it is not the same type! (This is to avoid mixing different unknown types together.)

  3. The type of the left-hand-side in the assignment is List<List<? extends Number>>. This type does not allow subtypes of List<? extends Number> to be the element type of the outer list, thus the return type of getList can't be used as the element type.

  4. The type List<? extends List<? extends Number>> on the other hand does allow subtypes of List<? extends Number> as the element type of the outer list. So that is the right fix for the problem.

Motivation

The following example code demonstrates why the assignment is illegal. Through a sequence of steps we end up with a List<Integer> which actually contains Floats!

import java.util.ArrayList;
import java.util.List;

class Generic<T> {
    private List<List<T>> list = new ArrayList<>();

    public List<List<T>> getList() {
        return list;
    }
}

// Start with a concrete type, which will get corrupted later on
Generic<Integer> genInt = new Generic<>();

// Add a List<Integer> to genInt.list. This is not necessary for the
// main example but might make things a little clearer.
List<Integer> ints = List.of(1);
genInt.getList().add(ints);

// Assign to a wildcard type as in the question
Generic<? extends Number> genWild = genInt;

// The illegal assignment. This doesn't compile normally, but we force it
// using an unchecked cast to see what would happen IF it did compile.
List<List<? extends Number>> list =
    (List<List<? extends Number>>) (Object) genWild.getList();

// This is the crucial step:
// It is legal to add a List<Float> to List<List<? extends Number>>.
// list refers to genInt.list, which has type List<List<Integer>>.
// Heap pollution occurs!
List<Float> floats = List.of(1.0f);
list.add(floats);

// notInts in reality is the same list as floats!
List<Integer> notInts = genInt.getList().get(1);

// This statement reads a Float from a List<Integer>. A ClassCastException
// is thrown. The compiler must not allow us to end up here without any
// previous type errors or unchecked cast warnings.
Integer i = notInts.get(0);

The fix that you discovered was to use the following type for list:

List<? extends List<? extends Number>> list = tt.getList();

This new type shifts the type error from the assignment of list to the call to list.add(...).

The above illustrates the whole point of wildcard types: To keep track of where it is safe to read and write values without mixing up types and getting unexpected ClassCastExceptions.
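As a small sketch of the corrected declaration in use: the Generic class is modeled on the example above, while NestedWildcardDemo and countElements are hypothetical names, and tt stands in for the variable from the question, whose declaration is not shown here. Reading through the doubly wildcarded type works, and the unsafe add is what the compiler now rejects.

```java
import java.util.ArrayList;
import java.util.List;

class NestedWildcardDemo {
    static class Generic<T> {
        private final List<List<T>> list = new ArrayList<>();

        List<List<T>> getList() {
            return list;
        }
    }

    static int countElements(Generic<? extends Number> tt) {
        // The outer wildcard lets the captured inner list type be accepted.
        List<? extends List<? extends Number>> list = tt.getList();

        int count = 0;
        for (List<? extends Number> inner : list) {
            count += inner.size(); // reading is safe
        }
        // list.add(List.of(1.0f)); // would not compile: this is where the error moves to
        return count;
    }

    public static void main(String[] args) {
        Generic<Integer> genInt = new Generic<>();
        genInt.getList().add(List.of(1, 2, 3));
        System.out.println(countElements(genInt)); // prints 3
    }
}
```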

General rule of thumb

There is a general rule of thumb for situations like this, when you have nested type arguments with wildcards:

If the inner types have wildcards in them, then the outer types often need wildcards also.

Otherwise the inner wildcard can't "take effect", in the way you have seen.

References

The Java Tutorial contains some information about capture types.

This question has answers with general information about wildcards:

What is PECS (Producer Extends Consumer Super)?

Safely cast a Generic type in C#

The as operator is reserved for reference types (and nullable value types). If T is always a reference type, the solution is to add a class constraint to TryGetAs<T>. If T can also be a value type, this is not an option. In this case you can use the is operator:

public bool TryGetAs<T>(out T value) where T : IObject
{
    if (m_obj is T)
    {
        value = (T)m_obj;
        return true;
    }
    else
    {
        value = default(T);
        return false;
    }
}

In C# 7.0 you can simplify it with pattern matching, which also improves performance because you no longer need an is check followed by a second cast:

public bool TryGetAs<T>(out T value) where T : IObject
{
    if (m_obj is T tValue)
    {
        value = tValue;
        return true;
    }
    else
    {
        value = default(T);
        return false;
    }
}

