Difference Between Using Generic and Protocol as Type Parameters: What Are the Pros and Cons of Implementing Them in a Function

There is actually a video from this year's WWDC about that (it was about performance of classes, structs and protocols; I don't have a link but you should be able to find it).

In your second function, where you pass any value that conforms to the protocol, you are actually passing a container that (on a 64-bit platform) has 24 bytes of storage for the passed value and 16 bytes for type-related information (used to determine which methods to call, hence dynamic dispatch). If the passed value is larger than 24 bytes in memory, it is allocated on the heap and the container stores a reference to it. That allocation is comparatively expensive and should be avoided where possible.

In your first function, where you use a generic constraint, the compiler can create another version of the function that is specialised for the concrete type it is called with. (If you use the function with lots of different types, your code size may increase significantly; see C++ code bloat for further reference.) In return, the compiler can statically dispatch the methods, inline the function where possible, and does not have to allocate any heap space for the argument. As stated in the video mentioned above, code size does not have to increase significantly, as code can still be shared. So the function with the generic constraint is generally the way to go.
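The two functions from the question are not reproduced on this page, so the following is only a minimal sketch of what they might look like. The names SomeProtocol, Foo, generic(some:) and nonGeneric(some:) are borrowed from the answer further down; the Int return values are an assumption added purely so the calls have something to print.

```swift
// Hypothetical protocol and conforming type, used only for illustration.
protocol SomeProtocol {
    var someProperty: Int { get }
}

struct Foo: SomeProtocol {
    var someProperty: Int
}

// "First function": a generic placeholder constrained by the protocol.
// In optimised builds the compiler may specialise this per concrete T.
func generic<T: SomeProtocol>(some: T) -> Int {
    return some.someProperty
}

// "Second function": a protocol-typed parameter.
// The argument is passed wrapped in an existential container.
func nonGeneric(some: SomeProtocol) -> Int {
    return some.someProperty
}

let foo = Foo(someProperty: 42)
print(generic(some: foo))     // 42
print(nonGeneric(some: foo))  // 42
```

Both calls give the same result; the difference discussed above lies in how the argument is passed and how someProperty is dispatched, not in observable behaviour.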

What is the in-practice difference between generic and protocol-typed function parameters?

(I realise that OP is asking less about the language implications and more about what the compiler does, but I feel it's also worthwhile to list the general differences between generic and protocol-typed function parameters.)

1. A generic placeholder constrained by a protocol must be satisfied with a concrete type

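This is a consequence of protocols not conforming to themselves: you cannot call generic(some:) with a SomeProtocol-typed argument. (Since Swift 5.7 the compiler can implicitly open such a value when it is passed to a generic parameter, so the error shown here applies to earlier versions.)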

struct Foo : SomeProtocol {
    var someProperty: Int
}

// of course the solution here is to remove the redundant 'SomeProtocol' type annotation
// and let foo be of type Foo, but this problem is applicable anywhere an
// 'anything that conforms to SomeProtocol' typed variable is required.
let foo : SomeProtocol = Foo(someProperty: 42)

generic(some: foo) // compiler error: cannot invoke 'generic' with an argument list
                   // of type '(some: SomeProtocol)'

This is because the generic function expects an argument of some type T that conforms to SomeProtocol – but SomeProtocol is not a type that conforms to SomeProtocol.

A non-generic function however, with a parameter type of SomeProtocol, will accept foo as an argument:

nonGeneric(some: foo) // compiles fine

This is because it accepts 'anything that can be typed as a SomeProtocol', rather than 'a specific type that conforms to SomeProtocol'.

2. Specialisation

As covered in this fantastic WWDC talk, an 'existential container' is used in order to represent a protocol-typed value.

This container consists of:

  • A value buffer to store the value itself, which is 3 words in length. Values larger than this will be heap allocated, and a reference to the value will be stored in the value buffer (as a reference is just 1 word in size).

  • A pointer to the type's metadata. Included in the type's metadata is a pointer to its value witness table, which manages the lifetime of the value in the existential container.

  • One or (in the case of protocol composition) multiple pointers to protocol witness tables for the given type. These tables keep track of the type's implementation of the protocol requirements available to call on the given protocol-typed instance.
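The layout described in this list can be observed with MemoryLayout. This is only a sketch: SomeProtocol, OtherProtocol and Big are hypothetical names, and the word counts in the comments assume non-class-constrained protocols, for which the container is laid out as described above.

```swift
// Hypothetical protocols and type, used only for illustration.
protocol SomeProtocol {
    var someProperty: Int { get }
}
protocol OtherProtocol {}

// Four Ints, so one word larger than the 3-word value buffer.
struct Big: SomeProtocol {
    var someProperty: Int
    var b = 0, c = 0, d = 0
}

// One machine word (8 bytes on a 64-bit platform).
let word = MemoryLayout<UnsafeRawPointer>.size

// 3 words of value buffer + 1 metadata pointer + 1 protocol witness table pointer.
print(MemoryLayout<SomeProtocol>.size / word)                  // 5

// A protocol composition carries one witness table pointer per protocol.
print(MemoryLayout<SomeProtocol & OtherProtocol>.size / word)  // 6

// The container has a fixed size regardless of what it holds:
// Big does not fit in the value buffer, so it is stored out of line.
let big: SomeProtocol = Big(someProperty: 1)
print(MemoryLayout<Big>.size / word)                           // 4
print(MemoryLayout.size(ofValue: big) / word)                  // 5
```

MemoryLayout.size(ofValue:) reports the size of the static type, which here is the existential container rather than Big itself.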

By default, a similar structure is used in order to pass a value into a generic placeholder typed argument.

  • The argument is stored in a 3 word value buffer (which may heap allocate), which is then passed to the parameter.

  • For each generic placeholder, the function takes a metadata pointer parameter. The metatype of the type that's used to satisfy the placeholder is passed to this parameter when calling.

  • For each protocol constraint on a given placeholder, the function takes a protocol witness table pointer parameter.

However, in optimised builds, Swift is able to specialise the implementations of generic functions, allowing the compiler to generate a new function for each type of generic placeholder that it's applied with. This allows arguments to always be simply passed by value, at the cost of increasing code size. However, as the talk then goes on to say, aggressive compiler optimisations, particularly inlining, can counteract this bloat.

3. Dispatch of protocol requirements

Because generic functions can be specialised, method calls on the generic arguments passed in can be statically dispatched (although obviously not for types that use dynamic polymorphism, such as non-final classes).

Protocol-typed functions however generally cannot benefit from this, as they don't benefit from specialisation. Therefore method calls on a protocol-typed argument will be dynamically dispatched via the protocol witness table for that given argument, which is more expensive.

That being said, simple protocol-typed functions may be able to benefit from inlining. In such cases, the compiler is able to eliminate the overhead of the value buffer and the protocol and value witness tables (this can be seen by examining the SIL emitted in a -O build), allowing it to statically dispatch methods in the same way as generic functions. However, unlike generic specialisation, this optimisation is not guaranteed for a given function (unless you apply the @inline(__always) attribute, but usually it's best to let the compiler decide this).

Therefore in general, generic functions are favoured over protocol-typed functions in terms of performance, as they can achieve static dispatch of methods without having to be inlined.

4. Overload resolution

When performing overload resolution, the compiler will favour the protocol-typed function over the generic one.

struct Foo : SomeProtocol {
    var someProperty: Int
}

func bar<T : SomeProtocol>(_ some: T) {
    print("generic")
}

func bar(_ some: SomeProtocol) {
    print("protocol-typed")
}

bar(Foo(someProperty: 5)) // protocol-typed

This is because Swift favours an explicitly typed parameter over a generic one (see this Q&A).

5. Generic placeholders enforce the same type

As already said, using a generic placeholder allows you to enforce that the same type is used for all parameters/returns that are typed with that particular placeholder.

The function:

func generic<T : SomeProtocol>(a: T, b: T) -> T {
    return a.someProperty < b.someProperty ? b : a
}

takes two arguments and has a return of the same concrete type, where that type conforms to SomeProtocol.

However the function:

func nongeneric(a: SomeProtocol, b: SomeProtocol) -> SomeProtocol {
    return a.someProperty < b.someProperty ? b : a
}

carries no promises other than the arguments and return must conform to SomeProtocol. The actual concrete types that are passed and returned do not necessarily have to be the same.
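To make the contrast concrete, here is a small self-contained sketch. The SomeProtocol declaration and the two conforming structs Foo and Bar are assumptions added for illustration; the two functions are the ones shown above.

```swift
// Hypothetical protocol and conforming types, used only for illustration.
protocol SomeProtocol {
    var someProperty: Int { get }
}
struct Foo: SomeProtocol { var someProperty: Int }
struct Bar: SomeProtocol { var someProperty: Int }

func generic<T: SomeProtocol>(a: T, b: T) -> T {
    return a.someProperty < b.someProperty ? b : a
}

func nongeneric(a: SomeProtocol, b: SomeProtocol) -> SomeProtocol {
    return a.someProperty < b.someProperty ? b : a
}

// Same concrete type in, same concrete type out: the result is a Foo, no cast needed.
let largerFoo: Foo = generic(a: Foo(someProperty: 1), b: Foo(someProperty: 2))
print(largerFoo.someProperty) // 2

// Mixed concrete types are accepted, but the static type of the result
// is only SomeProtocol; the concrete type is known at run time.
let larger = nongeneric(a: Foo(someProperty: 1), b: Bar(someProperty: 2))
print(type(of: larger)) // Bar

// generic(a: Foo(someProperty: 1), b: Bar(someProperty: 2))
// does not compile: Foo and Bar cannot both be T.
```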

Why don't associated types for protocols use generic type syntax in Swift?

This has been covered a few times on the devlist. The basic answer is that associated types are more flexible than type parameters. While you have a specific case here of one type parameter, it is quite possible to have several. For instance, Collections have an Element type, but also an Index type and a Generator type. If you specialized them entirely with type parameterization, you'd have to talk about things like Array<String, Int, Generator<String>> or the like. (This would allow me to create arrays that were subscripted by something other than Int, which could be considered a feature, but also adds a lot of complexity.)

It's possible to skip all that (Java does), but then you have fewer ways that you can constrain your types. Java in fact is pretty limited in how it can constrain types. You can't have an arbitrary indexing type on your collections in Java. Scala extends the Java type system with associated types just like Swift. Associated types have been incredibly powerful in Scala. They are also a regular source of confusion and hair-tearing.

Whether this extra power is worth it is a completely different question, and only time will tell. But associated types definitely are more powerful than simple type parameterization.
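As a rough illustration of the point about several associated types, the sketch below declares a collection-like protocol with both an Element and an Index. SimpleCollection, Letters and occurrences(of:in:) are hypothetical names invented here, not standard library API.

```swift
// Hypothetical protocol with two associated types.
protocol SimpleCollection {
    associatedtype Element
    associatedtype Index: Comparable
    var startIndex: Index { get }
    var endIndex: Index { get }
    subscript(position: Index) -> Element { get }
    func index(after i: Index) -> Index
}

// The conforming type picks Element and Index once; the compiler infers
// Element == Character and Index == Int from the members below.
struct Letters: SimpleCollection {
    let storage: [Character]
    var startIndex: Int { return 0 }
    var endIndex: Int { return storage.count }
    subscript(position: Int) -> Character { return storage[position] }
    func index(after i: Int) -> Int { return i + 1 }
}

// Callers name only the collection type; Element and Index come along with it
// and can still be constrained individually.
func occurrences<C: SimpleCollection>(of target: C.Element, in collection: C) -> Int
    where C.Element: Equatable {
    var total = 0
    var i = collection.startIndex
    while i < collection.endIndex {
        if collection[i] == target { total += 1 }
        i = collection.index(after: i)
    }
    return total
}

print(occurrences(of: "l", in: Letters(storage: Array("hello")))) // 2
```

With type parameters instead, every use site would have to spell out both Element and Index, which is the verbosity the answer describes.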

In constructor method references, difference between using generic type parameters and not?

The constructor reference TreeMap::new is the same as using diamond type inference (§15.13.1):

For convenience, when the name of a generic type is used to refer to an instance method (where the receiver becomes the first parameter), the target type is used to determine the type arguments. This facilitates usage like Pair::first in place of Pair<String,Integer>::first.

Similarly, a method reference like Pair::new is treated like a "diamond" instance creation (new Pair<>()). Because the "diamond" is implicit, this form does not instantiate a raw type; in fact, there is no way to express a reference to the constructor of a raw type.

You'd need to provide type arguments explicitly in more or less the same situations as when you would need to provide type arguments to a constructor explicitly.

For example, in the following, the call to get prevents the return assignment from being considered during inference of supplier, so T is inferred to be ArrayList<Object>:

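import java.util.ArrayList;
import java.util.function.Supplier;

class Example {
    public static void main(String[] args) {
        // The assignment target is not considered when inferring T here.
        ArrayList<String> list =
            supplier(ArrayList::new).get(); // compile error
    }
    static <T> Supplier<T> supplier(Supplier<T> s) { return s; }
}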

In that contrived example, we'd have to use ArrayList<String>::new.

Are type-erased Any... structs necessary for non-generic protocols?

A protocol that does not have an associated type can easily be used as a Type in its own right. This is often done to allow diverse concrete types to be stored in collections identifying them only by a common protocol that all the concrete types implement.

Or, to put it another way, "type erasing" is a technique for dealing with protocols that have associated types. If your protocol does not have associated types, there is no need to employ the technique.
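A short sketch of both situations follows. All names here (Shape, Square, Rectangle, Producer, Constant, Doubler, AnyProducer) are hypothetical, and the closure-based wrapper is just one common way to hand-write type erasure, not the only one.

```swift
// No associated types: the protocol can be used directly as a type.
protocol Shape {
    func area() -> Double
}
struct Square: Shape {
    var side: Double
    func area() -> Double { return side * side }
}
struct Rectangle: Shape {
    var width: Double, height: Double
    func area() -> Double { return width * height }
}

let shapes: [Shape] = [Square(side: 2), Rectangle(width: 2, height: 3)]
let totalArea = shapes.reduce(0) { $0 + $1.area() }
print(totalArea) // 10.0

// With an associated type, a wrapper can erase the concrete type
// while keeping the associated type visible as a generic parameter.
protocol Producer {
    associatedtype Output
    func produce() -> Output
}
struct Constant: Producer {
    var value: Int
    func produce() -> Int { return value }
}
struct Doubler: Producer {
    var base: Int
    func produce() -> Int { return base * 2 }
}

struct AnyProducer<Output>: Producer {
    private let _produce: () -> Output
    init<P: Producer>(_ base: P) where P.Output == Output {
        _produce = base.produce // capture the wrapped value's implementation
    }
    func produce() -> Output { return _produce() }
}

let producers = [AnyProducer(Constant(value: 1)), AnyProducer(Doubler(base: 2))]
print(producers.map { $0.produce() }) // [1, 4]
```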

Multiple Type Constraints

You could use an if statement when your method is called. Then have two different versions of the method (one for each type of constraint) and depending on which constraint you need to use, call the appropriate method.


