Mutually Recursive Generic Enums

Recursive Swift Either Enum with Generic Associated Values

So I discovered that Swift 2 introduced support for recursive enums ages ago - source! While this is great, I didn't manage to find any SO questions for this specific problem i.e. Generic Recursive Enums that had also been answered (here). I will summarise the final (partial solution) code now.

It turns out that I needed to describe what the problem is in terms of data structures. Essentially we want .left to contain either a value of type Left or a RecursiveEither<Left, Right>. With this simple idea, we can create two enums - one for the wrapper enum, and one for the nested enum which can take a Value or another wrapper.

enum RecursiveEither<Left, Right> {
case left(ValueOrLeftOrRight<Left, Left, Right>)
case right(ValueOrLeftOrRight<Right, Left, Right>)

var left: Left? {
guard case .left(let leftie) = self else { return nil }
return leftie.left
}

var right: Right? {
guard case .right(let rightie) = self else { return nil }
return rightie.right
}
}

enum ValueOrLeftOrRight<Value, Left, Right> {
case value(Value)
indirect case left(ValueOrLeftOrRight<Left, Left, Right>)
indirect case right(ValueOrLeftOrRight<Right, Left, Right>)

var left: Left? {
switch self {
case .value(let left): return left as? Left
case .left(let content): return content.left
default: return nil
}
}

var right: Right? {
switch self {
case .value(let right): return right as? Right
case .right(let content): return content.right
default: return nil
}
}
}

The call site is nice with this design:

let e = RecursiveEither<Int, Int>.left(.left(.left(.left(.value(3)))))

That being said, there are still limitations to this answer - assuming Left and Right won't change even in nested Either. Also, it's possible to design this differently with only one indirect case to wrap the Either type in another associated value - that might be more memory efficient since fewer frames get pushed onto the call stack at runtime.

I have posted a gist with attempts on this problem.

If anyone has suggestions to improve this implementation, feel free to add to this.

Why does declaration order matter for generic members?

This is a subtle side-effect of how the F# type inference works. I do not think there is a better workaround for the issue than reordering your definitions.

To provide more background, the members of a class are automatically treated as mutually recursive (meaning that they can call each other). Unlike with multiple types or multiple functions in a module, you do not need to declare this explicitly using the rec keyword.

However, the issue is not limited to classes. You can get exactly the same behaviour with simple functions. The simplest possible example is:

let rec test() = 
f 3 // Warning: causes code to be less generic
f "boo" // Error: string does not match int
and f (arg:'a) = ()

In the opposite order, this works just fine:

let rec f (arg:'a) = ()
and test() =
f 3
f "boo"

The issue is that the type-checker analyzes the code from top to bottom. In the first case, it:

  • Sees f 3 and infers that f is of type int -> unit
  • Sees f "boo" and reports a type error
  • Sees f (arg:'a) and realizes it prematurely used a more specific type than needed (and reports the various warnings).

In the second case, it:

  • Sees f (arg:'a) and infers that f is of type 'a -> unit
  • Sees f 3 and f "boo" and uses appropriate type instantiation

The main reason for why the type checker cannot be more clever is that the type checker only does "generalization" (i.e. figures out that a function is generic) after it analyzes the whole body of the function (or the whole recursive block). If it encounters a more specific use inside the body, it never gets to this generalization step.

Why are recursive struct types illegal in Rust?

Data inside structs and enums (and tuples) is stored directly inline inside the memory of the struct value. Given a struct like

struct Recursive {
x: u8,
y: Option<Recursive>
}

let's compute the size: size_of::<Recursive>(). Clearly it has 1 byte from the x field, and then the Option has size 1 (for the discriminant) + size_of::<Recursive>() (for the contained data), so, in summary, the size is the sum:

size_of::<Recursive>() == 2 + size_of::<Recursive>()

That is, the size would have to be infinite.

Another way to look at it is just expanding Recursive repeatedly (as tuples, for clarity):

Recursive ==
(u8, Option<Recursive>) ==
(u8, Option<(u8, Option<Recursive>)>) ==
(u8, Option<(u8, Option<(u8, Option<Recursive>)>)>) ==
...

and all of this is stored inline in a single chunk of memory.

A Box<T> is a pointer, i.e. it has a fixed size, so (u8, Option<Box<Recursive>>) is 1 + 8 bytes. (One way to regard Box<T> is that it's a normal T with the guarantee that it has a fixed size.)

C# Generic Class Question

Yes, this sort of mutually recursive generic declaration will work, but it will make things very complicated - I know from experience. If you want an example of something similar, look at this declaration from my protocol buffers port:

public interface IMessage<TMessage, TBuilder> : IMessage<TMessage>
where TMessage : IMessage<TMessage, TBuilder>
where TBuilder : IBuilder<TMessage, TBuilder>

IBuilder<,> has the equivalent.

This declaration also demonstrates the answer to your last question: if some parts of your interface don't need to know the exact type of transaction, you can declare them in a "less generic" base interface. So you could have:

interface ITransaction<TAction, TItems> : ITransaction
where TItems : ITransactionItem<TAction, TItems>
where TAction : ITransaction<TAction, TItems>

for example, where ITransaction is a non-generic interface.

Again though, this is not for the faint of heart. In my case I can get away with it because almost no-one uses the raw interfaces - all the implementations are autogenerated, and client code uses those non-generic implementations. I would think long and hard before inflicting this on a developer to actually use day to day...

Java generics - Make Generic to extends 2 interfaces

Reimeus already pointed out that what you're asking for in your edit isn't possible. I'd just like to expand a little on why.

One would think you could use the following:

public <T, U extends T & IDisposable> void mapThis(
Class<? extends MyClass<T>> key,
Class<? extends U> value
) { ... }

In fact that's what came to my mind when I first saw this post. But this actually gives a compiler error:

a type variable may not be followed by other bounds

To help me explain why, I'd like to quote an Oracle Blogs post by Victor Rudometov about this error:

This fact is not always clear, but it is true. The following code
should not compile:

interface I {}

class TestBounds <U, T extends U & I> {

}

Because JLS Chapter 4
Types, Values, and Variables section 4.4 Type Variables states: "The
bound consists of either a type variable, or a class or interface type
T possibly followed by further interface types I1 , ..., In.
". So one
may use T extends U, T extends SomeClass & I, but not T extends U & I.
This rule applies to all cases including type variables and bounds in
methods and constructors.

The reasons for this restriction are explored in a closely related post: Why can't I use a type argument in a type parameter with multiple bounds?

To summarize, the restriction was imposed in order to "preclude certain awkward situations coming into existence" (JLS §4.9).

What kind of awkward situations? An answer by Chris Povirk describes one:

[A reason for the restriction is] the possibility of specifying illegal types. Specifically, extending a generic interface twice with different parameters. I can't come up with a non-contrived example, but:

/** Contains a Comparator<String> that also implements the given type T. */
class StringComparatorHolder<T, C extends T & Comparator<String>> {
private final C comparator;
// ...
}

void foo(StringComparatorHolder<Comparator<Integer>, ?> holder) { ... }

Now holder.comparator is a Comparator<Integer> and a Comparator<String>.

Chris also points to Sun bug 4899305, which was a bug contesting this language restriction. It was closed as Won't Fix with the following comment:

If a type variable could be followed by type variables or by (possibly
parameterized) interfaces, there would likely be more mutually
recursive type variables, which are very difficult to handle. Things
are already complicated when a bound is simply a parameterized type,
e.g. <S,R extends Comparable<S>>. Consequently, bounds are not going
to change now. Both javac and Eclipse agree that S&T and
S&Comparable<S> are illegal.

So those are the reasons behind the restriction. Addressing generic methods specifically (which your question concerns), I'd like to further point out that type inference would theoretically cause such bounds to be pointless anyway.

If we reexamine the type parameters declared in the hypothetical signature above:

<T, U extends T & IDisposable>

Assuming the caller isn't explicitly specifying T and U, this can be reduced to the following:

<T, U extends Object & IDisposable>

Or just this (subtle difference, but that's another topic):

<T, U extends IDisposable>

This is because T doesn't have any bounds, so no matter what type of arguments get passed in, T can always resolve to Object at the very least, and so then can U.

Let's go back and say T is bounded:

<T extends Foo, U extends T & IDisposable>

This can be reduced in the same way (Foo could be a class or interface):

<T extends Foo, U extends Foo & IDisposable>

Based on that reasoning, the syntax you're trying to achieve is pointless as far as restricting the caller to more specific arguments.

Pre-Java 8 addendum:

Prior to Java 8, there is a use case for what you're trying to do. Because of a limitation with how the compiler infers generic method type parameters, my above reasoning to go out the window. Take the following generic method:

class MyClass {
static <T> void foo(T t1, T t2) { }
}

This is a common beginner's mistake of trying to make a method that takes two parameters of the "same type". Of course it's pointless because of the way inheritance works:

MyClass.foo("asdf", 42); // legal

Here, T is inferred to be Object - this matches up with earlier reasoning about simplifying the mapThis type parameters. You have to manually specify the type parameters in order to achieve the intended type checking:

MyClass.<String>foo("asdf", 42); // compiler error

However, and here's where your use case starts to come in, it's a different matter with multiple type parameters with staggered bounds:

class MyClass {
static <T, U extends T> void foo(T t, U u) { }
}

Now this call errors:

MyClass.foo("asdf", 42); // compiler error

The tables have turned - we have to manually relax the type parameters to get it to compile:

MyClass.<Object, Object>foo("asdf", 42); // legal

This happens because of the limited way in which the compiler infers method type parameters. For this reason, what you wanted to achieve would've actually had an application in restricting the caller's arguments.

However, this problem appears to have been fixed in Java 8, and MyClass.foo("asdf", 42) now compiles without any error (thanks to Regent for pointing this out).

Java Generics and Inheritance (specific issue)

You can remove your A1 class and just use Comparable interface.

public class F1<T extends Comparable<T>> implements Comparable<F1<T>> {
T t;

@Override public int compareTo(F1<T> other){
return t.compareTo(other.t);
}
}

Why can't I use a type argument in a type parameter with multiple bounds?

I'm also not sure why the restriction is there. You could try sending a friendly e-mail to the designers of Java 5 Generics (chiefly Gilad Bracha and Neal Gafter).

My guess is that they wanted to support only an absolute minimum of intersection types (which is what multiple bounds essentially are), to make the language no more complex than needed. An intersection cannot be used as a type annotation; a programmer can only express an intersection when it appears as the upper bound of a type variable.

And why was this case even supported? The answer is that multiple bounds allow you to control the erasure, which allows to maintain binary compatibility when generifying existing classes. As explained in section 17.4 of the book by Naftalin and Wadler, a max method would logically have the following signature:

public static <T extends Comparable<? super T>> T max(Collection<? extends T> coll)

However, this erases to:

public static Comparable max(Collection coll)

Which does not match the historical signature of max, and causes old clients to break.
With multiple bounds, only the left-most bound is considered for the erasure, so if max is given the following signature:

public static <T extends Object & Comparable<? super T>> T max(Collection<? extends T> coll)

Then the erasure of its signature becomes:

public static Object max(Collection coll)

Which is equal to the signature of max before Generics.

It seems plausible that the Java designers only cared about this simple case and restricted other (more advanced) uses of intersection types because they were just unsure of the complexity that it might bring. So the reason for this design decision does not need to be a possible safety problem (as the question suggests).

More discussion on intersection types and restrictions of generics in an upcoming OOPSLA paper.



Related Topics



Leave a reply



Submit