Generic Return Type Upper Bound - Interface VS. Class - Surprisingly Valid Code

CharSequence is an interface. Therefore, even if SomeClass does not implement CharSequence, it is perfectly possible to create a class

class SubClass extends SomeClass implements CharSequence

Therefore you can write

SomeClass c = getCharSequence();

because the inferred type X is the intersection type SomeClass & CharSequence.
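
To see what this means at runtime, here is a minimal sketch. The signature of getCharSequence and the class SubClass are assumptions made for illustration; the unchecked cast is what a method like this needs in order to return anything other than null.

```java
class IntersectionInferenceDemo {
    static class SomeClass {}

    // Hypothetical subclass that is both a SomeClass and a CharSequence.
    static class SubClass extends SomeClass implements CharSequence {
        private final String text;
        SubClass(String text) { this.text = text; }
        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return text.subSequence(start, end);
        }
        @Override public String toString() { return text; }
    }

    // Assumed signature: the only declared bound on X is CharSequence.
    @SuppressWarnings("unchecked")
    static <X extends CharSequence> X getCharSequence() {
        return (X) new SubClass("hello");
    }

    static String describe() {
        // X is inferred as SomeClass & CharSequence, so this assignment compiles.
        SomeClass c = getCharSequence();
        return c.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "SubClass"
    }
}
```

The assignment only succeeds at runtime because the returned object really is a SomeClass; if getCharSequence returned a plain String, the same line would still compile but throw a ClassCastException.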

This is a bit odd in the case of Integer because Integer is final, but final doesn't play any role in these rules. For example you can write

<T extends Integer & CharSequence>

On the other hand, String is not an interface, so it would be impossible to extend SomeClass to get a subtype of String, because Java does not support multiple inheritance for classes.

With the List example, you need to remember that generics are neither covariant nor contravariant. This means that if X is a subtype of Y, List<X> is neither a subtype nor a supertype of List<Y>. Since Integer does not implement CharSequence, you cannot use List<Integer> in your doCharSequence method.

You can, however, get this to compile:

<T extends Integer & CharSequence> void foo(List<T> list) {
doCharSequence(list);
}

If you have a method that returns a List<T> like this:

static <T extends CharSequence> List<T> foo() 

you can do

List<? extends Integer> list = foo();

Again, this is because the inferred type is Integer & CharSequence and this is a subtype of Integer.
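
The two List cases can be put side by side in a sketch like the following. The shape of doCharSequence is an assumption (the text only says it cannot accept a List<Integer>), and the names viaIntersection and inferredSize are hypothetical. It is expected to compile with javac on Java 8 or later; other compilers may be stricter.

```java
import java.util.ArrayList;
import java.util.List;

class ListBoundDemo {
    // Assumed shape of the doCharSequence method discussed above.
    static <X extends CharSequence> int doCharSequence(List<X> list) {
        return list.size();
    }

    // Compiles: every T is a CharSequence by declaration, even though
    // no such T can actually exist because Integer is final.
    static <T extends Integer & CharSequence> int viaIntersection(List<T> list) {
        return doCharSequence(list);
    }

    // Returns an empty list, so no element ever has to satisfy the bound.
    static <T extends CharSequence> List<T> foo() {
        return new ArrayList<>();
    }

    static int inferredSize() {
        // T is inferred as Integer & CharSequence, which is a subtype of Integer.
        List<? extends Integer> list = foo();
        return list.size();
    }
}
```

Nothing goes wrong at runtime here because the list is empty; the trouble only starts once such a method hands back real elements through an unchecked cast.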

Intersection types occur implicitly when you specify multiple bounds (e.g. <T extends SomeClass & CharSequence>).

For further information, see the part of the JLS that explains how type bounds work. You can include multiple interfaces, e.g.

<T extends String & CharSequence & List & Comparator>

but only the first bound may be a non-interface.
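
A more everyday use of multiple bounds looks like this. The method name max is just an illustration; the point is that the class bound (Number) comes first and the interface bound (Comparable<T>) follows.

```java
class MultiBoundDemo {
    // Number is a class, so it must be listed first; Comparable<T> is an interface.
    static <T extends Number & Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));     // Integer satisfies both bounds
        System.out.println(max(2.5, 1.5)); // so does Double
    }
}
```

Swapping the order to <T extends Comparable<T> & Number> is rejected by the compiler, since a class may only appear as the first bound.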

Why can this generic method with a bound return any type?

This is actually a legitimate type inference*.

We can reduce this to the following example (Ideone):

interface Foo {
<F extends Foo> F bar();

public static void main(String[] args) {
Foo foo = null;
String baz = foo.bar();
}
}

The compiler is allowed to infer a (nonsensical, really) intersection type String & Foo because Foo is an interface. For the example in the question, Integer & IElement is inferred.

It's nonsensical because the conversion is impossible. We can't do such a cast ourselves:

// won't compile because Integer is final
Integer x = (Integer & IElement) element;

Type inference basically works with:

  • a set of inference variables, one for each of the method's type parameters.
  • a set of bounds that must be conformed to.
  • sometimes constraints, which are reduced to bounds.

At the end of the algorithm, each variable is resolved to an intersection type based on the bound set, and if they're valid, the invocation compiles.

The process begins in 18.1.3:

When inference begins, a bound set is typically generated from a list of type parameter declarations P1, ..., Pp and associated inference variables α1, ..., αp. Such a bound set is constructed as follows. For each l (1 ≤ l ≤ p):

  • […]

  • Otherwise, for each type T delimited by & in a TypeBound, the bound αl <: T[P1:=α1, ..., Pp:=αp] appears in the set […].

So, this means first the compiler starts with a bound of F <: Foo (which means F is a subtype of Foo).

Moving to 18.5.2, the return target type gets considered:

If the invocation is a poly expression, […] let R be the return type of m, let T be the invocation's target type, and then:

  • […]

  • Otherwise, the constraint formula ‹R θ → T› is reduced and incorporated with [the bound set].

The constraint formula ‹R θ → T› gets reduced to another bound of R θ <: T, so we have F <: String.

Later on these get resolved according to 18.4:

[…] a candidate instantiation Ti is defined for each αi:

  • Otherwise, where αi has proper upper bounds U1, ..., Uk, Ti = glb(U1, ..., Uk).

The bounds α1 = T1, ..., αn = Tn are incorporated with the current bound set.

Recall that our set of bounds is F <: Foo, F <: String. glb(String, Foo) is defined as String & Foo. This is apparently a legitimate type for glb, which only requires that:

It is a compile-time error if, for any two classes (not interfaces) Vi and Vj, Vi is not a subclass of Vj or vice versa.
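
As a rough model of that rule (not how javac implements it, which works on compile-time types rather than Class objects), the check can be sketched with reflection. The helper name glbIsWellFormed is hypothetical.

```java
class GlbRuleSketch {
    // Returns true when no two class (non-interface) bounds are unrelated,
    // mirroring the rule quoted above. Finality is deliberately not checked.
    static boolean glbIsWellFormed(Class<?>... bounds) {
        for (int i = 0; i < bounds.length; i++) {
            for (int j = i + 1; j < bounds.length; j++) {
                Class<?> a = bounds[i];
                Class<?> b = bounds[j];
                boolean bothClasses = !a.isInterface() && !b.isInterface();
                boolean related = a.isAssignableFrom(b) || b.isAssignableFrom(a);
                if (bothClasses && !related) {
                    return false; // two unrelated classes: no valid glb
                }
            }
        }
        return true;
    }
}
```

Under this model glbIsWellFormed(Integer.class, CharSequence.class) is true, because only one of the two is a class, while glbIsWellFormed(String.class, Integer.class) is false.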

Finally:

If resolution succeeds with instantiations T1, ..., Tp for inference variables α1, ..., αp, let θ' be the substitution [P1:=T1, ..., Pp:=Tp]. Then:

  • If unchecked conversion was not necessary for the method to be applicable, then the invocation type of m is obtained by applying θ' to the type of m.

The method is therefore invoked with String & Foo as the type of F. We can of course assign this to a String, thus impossibly converting a Foo to a String.

The fact that String/Integer are final classes is apparently not considered.
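
The runtime consequence can be observed with a small sketch. FooImpl and tryAssign are hypothetical names added for illustration; the interface is the one from the reduced example, and the code is expected to compile with javac on Java 8 or later.

```java
class ImpossibleCastDemo {
    interface Foo {
        <F extends Foo> F bar();
    }

    // Hypothetical implementation that returns itself via an unchecked cast.
    static class FooImpl implements Foo {
        @Override
        @SuppressWarnings("unchecked")
        public <F extends Foo> F bar() {
            return (F) this;
        }
    }

    static String tryAssign() {
        Foo foo = new FooImpl();
        try {
            // Compiles: F is inferred as String & Foo.
            String baz = foo.bar();
            return "assigned: " + baz;
        } catch (ClassCastException e) {
            // The cast the compiler inserts at the call site fails here.
            return "ClassCastException";
        }
    }
}
```

So the impossible conversion is accepted by the compiler but never actually happens: the assignment fails with a ClassCastException as soon as a non-null Foo is returned.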


* Note: type erasure is/was completely unrelated to the issue.

Also, while this compiles on Java 7 as well, I think it's reasonable to say we needn't worry about the specification there. Java 7's type inference was essentially a less sophisticated version of Java 8's. It compiles for similar reasons.


As an addendum, while strange, this will likely never cause a problem that was not already present. It's rarely useful to write a generic method whose return type is solely inferred from the return target, because only null can be returned from such a method without casting.

Suppose for example we have some map analog which stores subtypes of a particular interface:

interface FooImplMap {
void put(String key, Foo value);
<F extends Foo> F get(String key);
}

class Bar implements Foo {}
class Biz implements Foo {}

It's already perfectly valid to make an error such as the following:

FooImplMap m = ...;
m.put("b", new Bar());
Biz b = m.get("b"); // casting Bar to Biz

So the fact that we can also do Integer i = m.get("b"); is not a new possibility for error. If we were writing code like this, it was already potentially unsound to begin with.
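
A runnable version of that mistake might look like the following. HashFooImplMap is a hypothetical HashMap-backed stand-in for the FooImplMap interface above, written only to make the failure observable.

```java
import java.util.HashMap;
import java.util.Map;

class FooImplMapDemo {
    interface Foo {}
    static class Bar implements Foo {}
    static class Biz implements Foo {}

    // Hypothetical implementation of the map analog, backed by a HashMap.
    static class HashFooImplMap {
        private final Map<String, Foo> values = new HashMap<>();

        void put(String key, Foo value) {
            values.put(key, value);
        }

        @SuppressWarnings("unchecked")
        <F extends Foo> F get(String key) {
            return (F) values.get(key); // unchecked: nothing verifies F here
        }
    }

    static String wrongType() {
        HashFooImplMap m = new HashFooImplMap();
        m.put("b", new Bar());
        try {
            Biz b = m.get("b"); // compiles, but the stored value is a Bar
            return "assigned: " + b;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }
}
```

The unchecked warning inside get is the only hint the compiler gives; the failure itself surfaces at the caller.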

Generally, a type parameter should only be solely inferred from the target type if there is no reason to bound it, e.g. Collections.emptyList() and Optional.empty():

private static final Optional<?> EMPTY = new Optional<>();

public static<T> Optional<T> empty() {
@SuppressWarnings("unchecked")
Optional<T> t = (Optional<T>) EMPTY;
return t;
}

This is A-OK because Optional.empty() can neither produce nor consume a T.
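
For contrast, here is a short usage sketch of those two library methods, where the type argument comes purely from the target type and nothing can go wrong. The class and method names are made up for the example.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class UnboundedTargetTypingDemo {
    static int totalLength() {
        // T is inferred as String from the declared type of each variable.
        Optional<String> noName = Optional.empty();
        List<String> noNames = Collections.emptyList();

        // Neither value holds a T, so no cast to String is ever executed.
        return noName.map(String::length).orElse(0) + noNames.size();
    }
}
```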

Java generic method: inconsistency when inferring upper bound on return type from argument type

Col<Baz> col = Test.unsafe (new Foo ());

Test.unsafe infers T as Object in this case; since everything extends Object, the bounds are satisfied.

Col<Baz> col = Test.safe (new Tag<Foo> ());

T can't be inferred to be Object in this case: it's Foo, because you've said that it's Foo, without any bounds. Similarly, V is exactly Baz. Since Baz doesn't extend Foo, it's a compiler error.
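
The declarations of Test.unsafe and Test.safe are not shown here, so the following sketch assumes signatures of the form <T, V extends T> that match the behaviour described; all class bodies are placeholders. It targets Java 8 or later.

```java
class InferenceBoundsDemo {
    static class Foo {}
    static class Baz {}     // unrelated to Foo
    static class Col<V> {}
    static class Tag<T> {}

    // Assumed signatures; the original declarations are not shown.
    static <T, V extends T> Col<V> unsafe(T t) {
        return new Col<>();
    }

    static <T, V extends T> Col<V> safe(Tag<T> tag) {
        return new Col<>();
    }

    static Col<Baz> compiles() {
        // Foo <: T and Baz <: T must both hold, so T is inferred as Object.
        Col<Baz> col = unsafe(new Foo());
        return col;
    }

    // Does not compile: Tag<Foo> pins T to exactly Foo, and Baz is not a Foo.
    // Col<Baz> col = safe(new Tag<Foo>());
}
```

The difference is that a bare T argument only gives T a lower bound, whereas Tag<T> is invariant and fixes T exactly.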

Difference between using a generic type and using its upper bound type directly as the parameter type in a function

The example of run is a great one. If you define this function with Any? as the receiver of both run and its block parameter (let's call this anyRun), then this inside the lambda will be of type Any?, and you'll only be able to call toString and similar methods on it, and only after a smart cast or with a safe call:

foo.anyRun {
this?.toString() // only basic Any? methods visible here, since `this` is of type Any?
this?.hashCode()
}

This will happen regardless of what the type of foo is.

On the other hand, with the original generic implementation, you'll get back your instance with its original type inside the lambda, e.g.:

"hello".run {
this.length // String methods and properties available on this
}
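
The same contrast can be sketched in Java, which has no receiver lambdas, so the value is passed as an ordinary argument instead. The method names run and anyRun are borrowed from the discussion above; this is a loose analog, not a translation of the Kotlin standard library.

```java
import java.util.function.Function;

class GenericVsUpperBound {
    // Generic version: the block sees the receiver with its static type T.
    static <T, R> R run(T receiver, Function<? super T, ? extends R> block) {
        return block.apply(receiver);
    }

    // Upper-bound version: the block only ever sees an Object.
    static <R> R anyRun(Object receiver, Function<Object, ? extends R> block) {
        return block.apply(receiver);
    }

    public static void main(String[] args) {
        Integer length = run("hello", s -> s.length());   // s is a String
        String text = anyRun("hello", o -> o.toString()); // o is only an Object
        System.out.println(length + " " + text);          // prints "5 hello"
    }
}
```

Inside anyRun's lambda, o.length() would not compile without an explicit cast, which is exactly the information the generic parameter preserves.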

