What Are the Differences Between Generics in C# and Java... and Templates in C++

What are the differences between Generics in C# and Java... and Templates in C++?

I'll add my voice to the noise and take a stab at making things clear:

C# Generics allow you to declare something like this.

List<Person> foo = new List<Person>();

and then the compiler will prevent you from putting things that aren't Person into the list.

Behind the scenes the C# compiler is just putting List<Person> into the .NET dll file, but at runtime the JIT compiler goes and builds a new set of code, as if you had written a special list class just for containing people - something like ListOfPerson.

The benefit of this is that it makes it really fast. There's no casting or any other stuff, and because the dll contains the information that this is a List of Person, other code that looks at it later on using reflection can tell that it contains Person objects (so you get IntelliSense and so on).

The downside of this is that old C# 1.0 and 1.1 code (before generics were added) doesn't understand these new List<something>, so you have to manually convert things back to a plain old ArrayList to interoperate with it. This is not that big of a problem, because C# 2.0 binaries can't be loaded by the older runtimes anyway. The only time this will ever happen is if you're upgrading some old C# 1.0/1.1 code to C# 2.0.

Java Generics allow you to declare something like this.

ArrayList<Person> foo = new ArrayList<Person>();

On the surface it looks the same, and it sort-of is. The compiler will also prevent you from putting things that aren't Person into the list.

The difference is what happens behind the scenes. Unlike C#, Java does not go and build a special ListOfPerson - it just uses the plain old ArrayList which has always been in Java. When you get things out of the list, the usual Person p = (Person)foo.get(1); casting-dance still has to be done. The compiler is saving you the key-presses, but the speed hit/casting is still incurred just like it always was.

When people mention "Type Erasure" this is what they're talking about. The compiler inserts the casts for you, and then 'erases' the fact that it's meant to be a list of Person and not just Object.

The benefit of this approach is that old code which doesn't understand generics doesn't have to care. It's still dealing with the same old ArrayList as it always has. This is more important in the Java world because the designers wanted existing pre-generics code and libraries to keep working alongside new generic code without being recompiled, which Microsoft deliberately decided not to bother with.

The downside is the speed hit I mentioned previously. Also, because no ListOfPerson pseudo-class or anything like it exists at runtime, code that looks at the list object later on (with reflection, or after pulling it out of another collection where it has been converted to Object and so on) can't tell that it's meant to be a list containing only Person and not just any other ArrayList.

C++ Templates allow you to declare something like this

std::list<Person>* foo = new std::list<Person>();

It looks like C# and Java generics, and it will do what you think it should do, but behind the scenes different things are happening.

It has the most in common with C# generics in that it builds special pseudo-classes rather than just throwing the type information away like Java does, but it's a whole different kettle of fish.

Both C# and Java produce output which is designed for virtual machines. If you write some code which has a Person class in it, in both cases some information about a Person class will go into the .dll or .class file, and the JVM/CLR will do stuff with this.

C++ compiles to native machine code (x86, for example). Not everything is an object, and there's no underlying virtual machine which needs to know about a Person class. There's no boxing or unboxing, and functions don't have to belong to classes, or indeed to anything.

Because of this, the C++ compiler places almost no restrictions on what you can do with templates - basically any code you could write manually, you can get templates to write for you.

The most obvious example is adding things:

In C# and Java, the generics system needs to know what methods are available for a class, and it needs to pass this down to the virtual machine. The only way to tell it this is by either hard-coding the actual class in, or using interfaces. For example:

string addNames<T>( T first, T second ) { return first.Name() + second.Name(); }

That code won't compile in C# or Java, because it doesn't know that the type T actually provides a method called Name(). You have to tell it - in C# like this:

interface IHasName{ string Name(); };
string addNames<T>( T first, T second ) where T : IHasName { .... }

And then you have to make sure the things you pass to addNames implement the IHasName interface and so on. The Java syntax is different (<T extends IHasName>), but it suffers from the same problems.

The 'classic' case for this problem is trying to write a function which does this

string addNames<T>( T first, T second ) { return first + second; }

You can't actually write this code in Java, and until C# 11 added static abstract interface members you couldn't in C# either, because there was no way to declare an interface with the + method in it. You fail.

C++ suffers from none of these problems. The compiler doesn't care about passing types down to any VMs - if both your objects have a .Name() function, it will compile. If they don't, it won't. Simple.
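To make the C++ side concrete, here is a minimal sketch of addNames written as a template. Person and Robot are hypothetical types invented for this illustration; they share no base class or interface, only a Name() member function.

```cpp
#include <string>

// Hypothetical types for illustration: unrelated, but both provide Name().
struct Person {
    std::string name;
    std::string Name() const { return name; }
};

struct Robot {
    int serial;
    std::string Name() const { return "R" + std::to_string(serial); }
};

// No constraint is declared anywhere. The body is checked against each
// concrete T only when the template is instantiated with that T.
template <typename T>
std::string addNames(const T& first, const T& second) {
    return first.Name() + second.Name();
}
```

A call such as addNames(1, 2) is rejected at compile time, because int has no Name() member.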

So, there you have it :-)

What are the Differences between C++ Templates and Java/C# Generics and what are the limits?

First off, you might want to read my 2009 article on this subject.

The primary difference to my mind between C++ templates and C# generics is that C++ templates actually completely recompile the code upon construction of the template. The pros and cons of the C++ approach are many:

  • PRO: You can effectively create constraints like "the type argument T must have an addition operator"; if the code contains a couple of Ts added to each other then the template will not compile if you construct it with a type argument that doesn't permit addition.

  • CON: You can accidentally create undocumented constraints like "the type argument T must have an addition operator".

In C# you have to say what the constraints are which helps the user, but you are limited to only a small set of possible constraints: interfaces, base classes, value vs reference type and default constructor constraints, and that's all.

  • PRO: Semantic analysis can be completely different for two different constructions. If you want that, that's awesome.

  • CON: Semantic analysis can be completely different for two different constructions. If you don't want that, that's a bug waiting to happen.

In C# the semantic analysis is done once no matter how many times the type is constructed, and it is therefore required to work with any type argument that meets the constraints, not just the type arguments that are actually supplied.

  • PRO: You only generate the code for exactly the constructions you need.

  • CON: You generate the code for all the constructions you use.

Templates can cause codegen to get large. In C#, the IL for a generic type is generated once, and then at runtime the jitter does codegen for all the types your program uses. This has a small performance cost, but it is mitigated somewhat by the fact that the jitter actually only generates code once for all reference type arguments. So if you have List<object> and List<string> then the jitted code is only generated once and used for both. List<int> and List<short> by contrast jits the code twice.

  • PRO: when you use a template library, you have the source code right there.

  • CON: to use a template library you have to have the source code.

In C#, generic types are first-class types. If you stick them in a library, you can use that library anywhere without having to ship the source code.

And finally:

  • PRO: Templates permit template metaprogramming.

  • CON: Template metaprogramming is hard to understand for novices.

  • CON: The template system actually does not permit some type topologies that are extremely straightforward in a generic system.

For example, I imagine that it would be difficult to do something like this in C++:

class D<T>
{
    class S { }
    D<D<T>.S> ds;
}

In C# generics, no problem. At runtime the type is only built once for all reference type arguments.

But in C++ templates, what happens when you have D<int>? The interior type constructs a field of type D<D<int>.S>, so we need to construct that type. But that type constructs a field of type D<D<D<int>.S>.S>... and so on to infinity.

What makes a template different from a generic?

Hm.. if you say you understand C++ templates in depth and that you don't see/feel the difference between them and generics, well, then most probably you are right :)

There are many differences that people will cite to explain how/why generics are better than templates, listing tons of details, but that's mostly irrelevant to the core of the idea.

The idea is to allow better code reuse. Templates/generics give you a way to build a kind of higher-order class definition that abstracts over some of the actual types.

In these terms, there is no difference between them, and the only differences are those enforced by specific features and constraints of the underlying language and runtime.

One may argue that generics provide some extra features (usually when talking about dynamic introspection of an object's class tree), but very few of them (if any at all) cannot be implemented manually with C++ templates. With some effort, most of them can be implemented or emulated, hence they are no good as a distinction between 'proper generics' and 'real templates'.

Others will argue that the sheer optimization potential available thanks to C++'s copy-paste behavior is the difference. Sorry, not true. JITs in Java and C# can do it too - well, almost, but they do it very well.

There is however one thing that really does make Java/C# generics a true subset of C++ template features. And you have even mentioned it!

It is template specialization.

In C++, each specialization behaves as a completely different definition.

In C++, template<typename T> class Foo specialized for T == int may look like:

template<>
class Foo<int>
{
public:
    void hug_me();

    int hugs_count() const;
};

while "the same" template specialized for T == MyNumericType may look like:

template<>
class Foo<MyNumericType>
{
public:
    void hug_me();

    MyNumericType get_value() const;
    void reset_value();
};

FYI: these are declarations only; the primary template, MyNumericType and the member definitions are left out :)

Neither Java's nor C#'s generics can do that, because their definition states that all generic-type-materializations will have the same "user interface".
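As a compilable sketch of the idea: the names Foo, hug_me, hugs_count, get_value and reset_value come from the snippets above, while the primary template and all member bodies are assumptions added only so the example builds.

```cpp
// Primary template: the "general" interface (an assumption for this sketch).
template <typename T>
class Foo {
public:
    explicit Foo(T v) : value_(v) {}
    T get_value() const { return value_; }
    void reset_value() { value_ = T(); }

private:
    T value_;
};

// Full specialization for int: a completely different set of members.
// Nothing forces it to resemble the primary template.
template <>
class Foo<int> {
public:
    void hug_me() { ++hugs_; }
    int hugs_count() const { return hugs_; }

private:
    int hugs_ = 0;
};
```

Here Foo<double> offers get_value() and reset_value(), while Foo<int> offers only hug_me() and hugs_count(); calling get_value() on a Foo<int> does not compile.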

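On top of that, C++ has the SFINAE rule ("substitution failure is not an error"). Many "theoretically colliding" overloads and specializations may exist for a template. When the template is used, any candidate whose declaration becomes invalid after substituting the template arguments is silently dropped, and only the "actually good" ones are considered.

With classes similar to the example above, if you use:

Foo<MyNumericType> foo;
foo.reset_value();

the second specialization is the one that gets used, because it is the one that matches the type argument. The same call on a Foo<int> is a compile-time error, because that specialization has no reset_value.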

With generics, you cannot do that. You'd need to create a generic class that has all possible methods, and then that would at runtime dynamically inspect the inner objects and throw some 'not implemented' or 'not supported' exceptions for unavailable methods. That's... just awful. Such things should be possible at compile-time.
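For what the compile-time version can look like, here is a minimal SFINAE sketch. Counter, Label, try_reset and reset_if_possible are hypothetical names made up for this example; only reset_value echoes the snippets above.

```cpp
// Hypothetical types: only Counter has reset_value().
struct Counter {
    int value = 5;
    void reset_value() { value = 0; }
};

struct Label {
    int value = 5;
};

// Preferred overload. It takes part in overload resolution only if
// t.reset_value() is a valid expression; otherwise substitution fails and
// the overload is silently discarded (SFINAE).
template <typename T>
auto try_reset(T& t, int) -> decltype(t.reset_value(), true) {
    t.reset_value();
    return true;
}

// Fallback overload. Always viable, but a worse match for the literal 0
// because it needs an int -> long conversion.
template <typename T>
bool try_reset(T&, long) {
    return false;
}

// Returns true if the object was reset, false if T has no reset_value().
template <typename T>
bool reset_if_possible(T& t) {
    return try_reset(t, 0);
}
```

The choice between the two overloads is made entirely by the compiler; no runtime inspection or 'not supported' exception is involved.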

The actual power, implications, problems and overall complexity of template specialization and SFINAE are what truly differentiate generics and templates. Simply put, generics are defined in such a way that specialization is not possible, hence SFINAE is not possible, hence the whole mechanism is paradoxically much easier/simpler.

Both easier/simpler to implement in the compiler's internals, and to be understood by non-savant brains.

Although I agree with the overall benefits of generics in Java/C#, I really miss specializations, interface flexibility, and the SFINAE rule. However, it would not be fair of me not to mention one important thing related to sane OO design: if your template specialization for type xxx actually changes its client API, then most probably it should be named differently and should form a different template. All the extra goodies that templates can do were mostly added to the toolset because ... in C++ there was no reflection and it had to be emulated somehow. SFINAE is a form of compile-time reflection.

Hence, the biggest player in the world of differences gets reduced to a curious (beneficial) side effect of a hotfix applied to mask the runtime's deficiency, which is the almost complete lack of runtime introspection :))

Therefore, I say that there are no differences other than some arbitrary ones enforced by the language, or some arbitrary ones enforced by the runtime platform.

All of them are just a form of higher-order classes or functions/methods, and I think that this is the most important thing and feature.

C# vs Java generics

streloksi's link does a great job of breaking down the differences. The quick and dirty summary though is ...

In terms of syntax and usage, the two languages are roughly the same. There are a few quirks here and there (most notably in constraints), but basically if you can read one, you can likely read/use the other.

The biggest difference though is in the implementation.

Java uses the notion of type erasure to implement generics. In short the underlying compiled classes are not actually generic. They compile down to Object and casts. In effect Java generics are a compile time artifact and can easily be subverted at runtime.

C#, on the other hand, by virtue of the CLR, implements generics all the way down to the bytecode. The CLR took several breaking changes in order to support generics in 2.0. The benefits are performance improvements, deep type safety verification and reflection.

Again, the provided link has a much more in-depth breakdown that I encourage you to read.

C# generics compared to C++ templates

You can consider C++ templates to be an interpreted, functional programming language disguised as a generics system. If this doesn't scare you, it should :)

C# generics are very restricted; you can parameterize a class on a type or types, and use those types in methods. So, to take an example from MSDN, you could do:

public class Stack<T>
{
    T[] m_Items;
    public void Push(T item)
    {...}
    public T Pop()
    {...}
}

And now you can declare a Stack<int> or Stack<SomeObject> and it'll store objects of that type, safely (i.e., no worries about putting SomeOtherObject in by mistake).

Internally, the .NET runtime will specialize it into one variant per value type like int, and a single shared variant for reference types. This allows the representation for Stack<byte> to be much smaller than that of Stack<SomeObject>, for example.

C++ templates allow a similar use:

template<typename T>
class Stack
{
    T *m_Items;
public:
    void Push(const T &item)
    {...}
    T Pop()
    {...}
};

This looks similar at first glance, but there are a few important differences. First, instead of one variant for each fundamental type and one for all object types, there is one variant for each type it's instantiated against. That can be a lot of types!
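The "one variant per type" rule can be made observable with a small sketch. Box and calls are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

// Hypothetical class template: every distinct T yields a distinct type.
template <typename T>
struct Box {
    T item;
};

static_assert(!std::is_same<Box<int>, Box<long>>::value,
              "Box<int> and Box<long> are unrelated types");

// Each instantiation of a function template is a separate function with its
// own static storage, so the counter below exists once per distinct T.
template <typename T>
int calls() {
    static int count = 0;
    return ++count;
}
```

Calling calls<int>() twice and then calls<double>() once advances two independent counters, which is a small-scale picture of why heavy template use can grow the generated code.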

The next major difference is (on most C++ compilers) it will be compiled in each translation unit it's used in. That can slow down compiles a lot.

Another interesting attribute of C++ templates is that they can be applied to things other than classes - and when they are, their arguments can be deduced automatically. For example:

template<typename T>
T min(const T &a, const T &b) {
    return a > b ? b : a;
}

The type T will be deduced automatically from the arguments the function is called with.
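A short sketch of that deduction in action. The function is renamed to smaller here (a hypothetical name) purely to avoid clashing with std::min.

```cpp
#include <string>

// Same shape as the min template above, under a different name.
template <typename T>
T smaller(const T& a, const T& b) {
    return a > b ? b : a;
}

// Helper showing both call styles side by side.
inline double demo_explicit() {
    // T cannot be deduced from (int, double), so it is spelled out.
    return smaller<double>(1, 2.5);
}
```

With smaller(3, 7) the compiler deduces T = int, and with two std::string arguments it deduces T = std::string. A call like smaller(1, 2.5) fails to compile, because T would have to be both int and double; writing smaller<double>(1, 2.5) resolves the ambiguity.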

These attributes can be used to good ends, at the expense of your sanity. Because a C++ template is recompiled for each type it's used against, and the implementation of a template is always available to the compiler, C++ can do very aggressive inlining on templates. Add to that the automatic deduction of template arguments in functions, and you can make anonymous pseudo-functions in C++, using boost::lambda. Thus, an expression like:

_1 + _2 + _3

Produces an object with a seriously scary type, which has an operator() which adds up its arguments.
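For comparison, a similar object can be written without boost::lambda by using a C++14 generic lambda. This is a different technique from the placeholder expression above, shown only as a sketch; add3 is a hypothetical name.

```cpp
// A C++14 generic lambda standing in for boost::lambda's _1 + _2 + _3.
// The compiler generates an unnamed closure type whose operator() is itself
// a template over the three argument types.
auto add3 = [](auto a, auto b, auto c) { return a + b + c; };
```

The closure type still has no name you can spell, but the template machinery behind it is the same: operator() is instantiated separately for each combination of argument types it is called with.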

There are plenty of other dark corners of the C++ template system - it's an extremely powerful tool, but can be painful to think about, and sometimes hard to use - particularly when it gives you a twenty-page long error message. The C# system is much simpler - less powerful, but easier to understand and harder to abuse.

What are the differences between generic types in C++ and Java?

There is a big difference between them. In C++ you don't have to specify a class or an interface for the generic type. That's why you can create truly generic functions and classes, with the caveat of looser typing.

template <typename T> T sum(T a, T b) { return a + b; }

The method above adds two objects of the same type, and can be used for any type T that has the "+" operator available.

In Java you have to specify a type if you want to call methods on the objects passed, something like:

<T extends Something> T sum(T a, T b) { return a.add ( b ); }

In C++, generic functions/classes usually have to be defined in headers, since the compiler generates a different function for each type the template is invoked with and needs the full definition to do so. So the compilation is slower. In Java the compilation doesn't have a major penalty, but Java uses a technique called "erasure" where the generic type is erased by the compiler, so at runtime Java is actually calling ...

Something sum(Something a, Something b) { return a.add ( b ); }

Nevertheless, Java's generics help with type-safety.

C# generics vs C++ templates - need a clarification about constraints

Well, in general, C++ templates and C# generics are similar - compared to Java generics, which are completely different - but they also have large differences. For example, C# has runtime support through reflection: you can get an object describing the types used to instantiate a generic. C++ doesn't have reflection, and everything it does with types is done at compile time.

The biggest difference between C# generics and C++ templates is indeed that C# generics are better type checked. They are always constrained, in the sense that they don't allow operations that were not declared valid when the generic was defined. C#'s chief designer gave as the reason the added complexity that implied constraints would have required. I'm not well versed in C#, so I can't say more here. I'll talk about how matters are in C++ and how they are going to be improved, so that people don't think C++'s stuff is all wrong.

In C++, templates are not constrained. If you use an operation in a template definition, it is simply assumed that the operation will succeed at instantiation time. A C++ compiler is not even required to check the template syntactically when it is defined. If it contains a syntax error, that error has to be diagnosed at instantiation. Any diagnosis before that is a pure goody of the implementation.

Those implied constraints have proven easy for the template designer in the short term, because the designer doesn't have to state the valid operations in the template's interface. They put the burden on the user of the template - the user has to make sure all those requirements are fulfilled. It often happens that the user tries a seemingly valid operation and fails, with the compiler producing hundreds of lines of error messages about invalid syntax or names that were not found. Because the compiler can't know which constraint in particular was violated in the first place, it lists every code path involved around the faulty place, including details that don't matter, and the user has to crawl through the horrible error message text.

That is a fundamental problem, which can be solved by just stating in the interface of a template or generic what properties a type parameter has to have. C#, as far as I know, can constrain the parameter to implement an interface or inherit from a base class. It solves that on a type level.

The C++ committee has long seen there is need to fix these problems, and soon (next year, probably), C++ will have a way to state such explicit constraints too (see time-machine note below), as in the following case.

template<typename T> requires VariableType<T>
T f(T a, T b) {
    return a + b;
}

The compiler signals an error at that point, because the expression as written is not marked valid by the requirements. This first helps the designer of the template to write more correct code, because the code is type-checked already to some degree (well to what is possible there). The programmer can now state that requirement:

template<typename T> requires VariableType<T> && HasPlus<T, T>
T f(T a, T b) {
    return a + b;
}

Now it will compile. The compiler, seeing T appear as the return type, automatically implies that T is copyable, because that use of T appears in the interface rather than in the template's body. The other requirements were stated using requirement clauses. Now the user will get an appropriate error message if he uses a type that doesn't have an op+ defined.
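With compilers that lack this syntax, a comparable "state the requirement up front" effect can be approximated with a type trait and static_assert. This is only a sketch of that workaround, not the concepts syntax shown above; has_plus and NoPlus are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

// has_plus<T>::value is true when "a + b" is a valid expression for two Ts.
template <typename T, typename = void>
struct has_plus : std::false_type {};

template <typename T>
struct has_plus<T, decltype(void(std::declval<T>() + std::declval<T>()))>
    : std::true_type {};

struct NoPlus {};  // hypothetical type without operator+

// The requirement is stated once, at the top of the function, so a misuse
// produces a diagnostic that names the violated requirement.
template <typename T>
T f(T a, T b) {
    static_assert(has_plus<T>::value, "f<T> requires T to support operator+");
    return a + b;
}
```

Calling f with two NoPlus objects then produces diagnostics that include the static_assert message, although, unlike real constraints, this check lives inside the body rather than in the function's interface.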

C++1x decouples the requirements from the type. The above works for primitive types as well as for classes. In this sense concepts are more flexible, but also quite a bit more complex. The rules that state when requirements are satisfied are long... With the new rules you can say the following:

template<typename T> requires MyCuteType<T>
void f(T t) { *t = 10; }

And then, call f with an int! That would work by just writing a concept map for MyCuteType<int> that teaches the compiler how an int can be dereferenced. It will get quite handy in loops like this:

for_each(0, 100, doSomething());

Since the programmer can tell the compiler how an int can satisfy the concept of an input iterator, you could actually write such code in C++1x, if you only write the appropriate concept map, which really isn't all that difficult.
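Without concept maps, the same adaptation has to be written by hand as a wrapper type. The sketch below shows roughly what that looks like; IntIterator and sum_range are hypothetical names, and this is the manual workaround rather than the concept-map mechanism described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical adapter: wraps an int so that it offers the input-iterator
// operations std::for_each relies on (*, ++, !=).
struct IntIterator {
    using iterator_category = std::input_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = int;

    int current;

    int operator*() const { return current; }
    IntIterator& operator++() { ++current; return *this; }
    IntIterator operator++(int) { IntIterator old = *this; ++current; return old; }
    bool operator==(const IntIterator& other) const { return current == other.current; }
    bool operator!=(const IntIterator& other) const { return current != other.current; }
};

// Sums the integers in [first, last) by walking them with std::for_each.
inline int sum_range(int first, int last) {
    int total = 0;
    std::for_each(IntIterator{first}, IntIterator{last},
                  [&total](int i) { total += i; });
    return total;
}
```

A concept map would have let a plain int play this role directly; the wrapper spells out the same mapping explicitly.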

Ok, enough of this. I hope I could show you that having templates constrained is not all that bad, but in fact better, because the relationships between types and the operations on them within the templates are now known to the compiler. And I haven't even written about axioms, which are another nice thing in C++1x's concepts. Remember that this is future stuff; it's not out yet, but it should arrive around 2010. Then we will have to wait for some compiler to implement it all :)


UPDATE FROM "FUTURE"

C++0x concepts did not make it into the standard: they were voted out of the draft in 2009. Too bad! But perhaps we will see them again in the next C++ version? Let's all hope!

Are generics in C# treated the same way as in C++?

This is a very broad topic and can better be explained by:

http://msdn.microsoft.com/en-us/library/c6cyy67b.aspx

To summarize (from the MSDN article linked above):


