What Are the Differences Between "Generic" Types in C++ and Java

What are the differences between generic types in C++ and Java?

There is a big difference between them. In C++ you don't have to specify a class or an interface for the generic type. That's why you can create truly generic functions and classes, with the caveat of a looser typing.

template <typename T> T sum(T a, T b) { return a + b; }

The method above adds two objects of the same type, and can be used for any type T that has the "+" operator available.

In Java you have to specify a type if you want to call methods on the objects passed, something like:

<T extends Something> T sum(T a, T b) { return a.add ( b ); }

In C++ generic functions/classes can only be defined in headers, since the compiler generates different functions for different types (that it's invoked with). So the compilation is slower. In Java the compilation doesn't have a major penalty, but Java uses a technique called "erasure" where the generic type is erased at runtime, so at runtime Java is actually calling ...

Something sum(Something a, Something b) { return a.add ( b ); }

Nevertheless, Java's generics help with type-safety.

What are the differences between Generics in C# and Java... and Templates in C++?

I'll add my voice to the noise and take a stab at making things clear:

C# Generics allow you to declare something like this.

List<Person> foo = new List<Person>();

and then the compiler will prevent you from putting things that aren't Person into the list.

Behind the scenes the C# compiler is just putting List<Person> into the .NET dll file, but at runtime the JIT compiler goes and builds a new set of code, as if you had written a special list class just for containing people - something like ListOfPerson.

The benefit of this is that it makes it really fast. There's no casting or any other stuff, and because the dll contains the information that this is a List of Person, other code that looks at it later on using reflection can tell that it contains Person objects (so you get intellisense and so on).

The downside of this is that old C# 1.0 and 1.1 code (before they added generics) doesn't understand these new List<something>, so you have to manually convert things back to plain old List to interoperate with them. This is not that big of a problem, because C# 2.0 binary code is not backwards compatible. The only time this will ever happen is if you're upgrading some old C# 1.0/1.1 code to C# 2.0

Java Generics allow you to declare something like this.

ArrayList<Person> foo = new ArrayList<Person>();

On the surface it looks the same, and it sort-of is. The compiler will also prevent you from putting things that aren't Person into the list.

The difference is what happens behind the scenes. Unlike C#, Java does not go and build a special ListOfPerson - it just uses the plain old ArrayList which has always been in Java. When you get things out of the array, the usual Person p = (Person)foo.get(1); casting-dance still has to be done. The compiler is saving you the key-presses, but the speed hit/casting is still incurred just like it always was.

When people mention "Type Erasure" this is what they're talking about. The compiler inserts the casts for you, and then 'erases' the fact that it's meant to be a list of Person not just Object

The benefit of this approach is that old code which doesn't understand generics doesn't have to care. It's still dealing with the same old ArrayList as it always has. This is more important in the java world because they wanted to support compiling code using Java 5 with generics, and having it run on old 1.4 or previous JVM's, which microsoft deliberately decided not to bother with.

The downside is the speed hit I mentioned previously, and also because there is no ListOfPerson pseudo-class or anything like that going into the .class files, code that looks at it later on (with reflection, or if you pull it out of another collection where it's been converted into Object or so on) can't tell in any way that it's meant to be a list containing only Person and not just any other array list.

C++ Templates allow you to declare something like this

std::list<Person>* foo = new std::list<Person>();

It looks like C# and Java generics, and it will do what you think it should do, but behind the scenes different things are happening.

It has the most in common with C# generics in that it builds special pseudo-classes rather than just throwing the type information away like java does, but it's a whole different kettle of fish.

Both C# and Java produce output which is designed for virtual machines. If you write some code which has a Person class in it, in both cases some information about a Person class will go into the .dll or .class file, and the JVM/CLR will do stuff with this.

C++ produces raw x86 binary code. Everything is not an object, and there's no underlying virtual machine which needs to know about a Person class. There's no boxing or unboxing, and functions don't have to belong to classes, or indeed anything.

Because of this, the C++ compiler places no restrictions on what you can do with templates - basically any code you could write manually, you can get templates to write for you.

The most obvious example is adding things:

In C# and Java, the generics system needs to know what methods are available for a class, and it needs to pass this down to the virtual machine. The only way to tell it this is by either hard-coding the actual class in, or using interfaces. For example:

string addNames<T>( T first, T second ) { return first.Name() + second.Name(); }

That code won't compile in C# or Java, because it doesn't know that the type T actually provides a method called Name(). You have to tell it - in C# like this:

interface IHasName{ string Name(); };
string addNames<T>( T first, T second ) where T : IHasName { .... }

And then you have to make sure the things you pass to addNames implement the IHasName interface and so on. The java syntax is different (<T extends IHasName>), but it suffers from the same problems.

The 'classic' case for this problem is trying to write a function which does this

string addNames<T>( T first, T second ) { return first + second; }

You can't actually write this code because there are no ways to declare an interface with the + method in it. You fail.

C++ suffers from none of these problems. The compiler doesn't care about passing types down to any VM's - if both your objects have a .Name() function, it will compile. If they don't, it won't. Simple.

So, there you have it :-)

What are the Differences between C++ Templates and Java/C# Generics and what are the limits?

First off, you might want to read my 2009 article on this subject.

The primary difference to my mind between C++ templates and C# generics is that C++ templates actually completely recompile the code upon construction of the template. The pros and cons of the C++ approach are many:

  • PRO: You can effectively create constraints like "the type argument T must have an addition operator"; if the code contains a couple of Ts added to each other then the template will not compile if you construct it with a type argument that doesn't permit addition.

  • CON: You can accidentally create undocumented constraints like "the type argument T must have an addition operator".

In C# you have to say what the constraints are which helps the user, but you are limited to only a small set of possible constraints: interfaces, base classes, value vs reference type and default constructor constraints, and that's all.

  • PRO: Semantic analysis can be completely different for two different constructions. If you want that, that's awesome.

  • CON: Semantic analysis can be completely different for two different constructions. If you don't want that, that's a bug waiting to happen.

In C# the semantic analysis is done once no matter how many times the type is constructed, and it is therefore required to work with any type argument that meets the constraints, not just the type arguments that are actually supplied.

  • PRO: You only generate the code for exactly the constructions you need.

  • CON: You generate the code for all the constructions you use.

Templates can cause codegen to get large. In C#, the IL for a generic type is generated once, and then at runtime the jitter does codegen for all the types your program uses. This has a small performance cost, but it is mitigated somewhat by the fact that the jitter actually only generates code once for all reference type arguments. So if you have List<object> and List<string> then the jitted code is only generated once and used for both. List<int> and List<short> by contrast jits the code twice.

  • PRO: when you use a template library, you have the source code right there.

  • CON: to use a template library you have to have the source code.

In C#, generic types are first-class types. If you stick them in a library, you can use that library anywhere without having to ship the source code.

And finally:

  • PRO: Templates permit template metaprogramming.

  • CON: Template metaprogramming is hard to understand for novices.

  • CON: The template system actually does not permit some type topologies that are extremely straightforward in a generic system.

For example, I imagine that it would be difficult to do something like this in C++:

class D<T> 
{
class S { }
D<D<T>.S> ds;
}

In C# generics, no problem. At runtime the type is only built once for all reference type arguments.

But in C++ templates, what happens when you have D<int>? The interior type constructs a field of type D<D<int>.S>, so we need to construct that type. But that type constructs a field of type D<D<D<int>.S>.S>... and so on to infinity.

How are Java generics different from C++ templates? Why can't I use int as a parameter?

Java generics are so different from C++ templates that I am not going to try to list the differences here. (See What are the differences between “generic” types in C++ and Java? for more details.)

In this particular case, the problem is that you cannot use primitives as generic type parameters (see JLS §4.5.1: "Type arguments may be either reference types or wildcards.").

However, due to autoboxing, you can do things like:

List<Integer> ints = new ArrayList<Integer>();
ints.add(3); // 3 is autoboxed into Integer.valueOf(3)

So that removes some of the pain. It definitely hurts runtime efficiency, though.

C# vs Java generics

streloksi's link does a great job of breaking down the differences. The quick and dirty summary though is ...

In terms of syntax and usage. The syntax is roughly the same between the languages. A few quirks here and there (most notably in constraints). But basically if you can read one, you can likely read/use the other.

The biggest difference though is in the implementation.

Java uses the notion of type erasure to implement generics. In short the underlying compiled classes are not actually generic. They compile down to Object and casts. In effect Java generics are a compile time artifact and can easily be subverted at runtime.

C# on the other hand, by virtue of the CLR, implement generics all they way down to the byte code. The CLR took several breaking changes in order to support generics in 2.0. The benefits are performance improvements, deep type safety verification and reflection.

Again the provided link has a much more in depth breakdown I encourage you to read

What is the difference between 'E', 'T', and '?' for Java generics?

Well there's no difference between the first two - they're just using different names for the type parameter (E or T).

The third isn't a valid declaration - ? is used as a wildcard which is used when providing a type argument, e.g. List<?> foo = ... means that foo refers to a list of some type, but we don't know what.

All of this is generics, which is a pretty huge topic. You may wish to learn about it through the following resources, although there are more available of course:

  • Java Tutorial on Generics
  • Language guide to generics
  • Generics in the Java programming language
  • Angelika Langer's Java Generics FAQ (massive and comprehensive; more for reference though)

Why do generic methods and generic types have different type introduction syntax?

The answer indeed lies in the GJ Specification, which has already been linked, quote from the document, p.14:

The convention of passing parameters before the method name is made necessary by parsing constraints: with the more conventional “type parameters after method name” convention the expression f (a<b,c>(d)) would have two possible parses.

Implementation from comments:

f(a<b,c>(d)) can parsed as either of f(a < b, c > d) (two booleans from comparisons passed to f) or f(a<B, C>(d)) (call of a with type arguments B and C and value argument d passed to f). I think this might also be why Scala chose to use [] instead of <> for generics.



Related Topics



Leave a reply



Submit