Why does C++ not have reflection?
There are several problems with reflection in C++.
- It's a lot of work to add, and the C++ committee is fairly conservative; it doesn't spend time on radical new features unless it's sure they'll pay off. (A suggestion for adding a module system similar to .NET assemblies has been made, and while I think there's general consensus that it'd be nice to have, it's not their top priority at the moment, and has been pushed back until well after C++0x. The motivation for this feature is to get rid of the #include system, but it would also enable at least some metadata.)
- You don't pay for what you don't use. That's one of the most basic design philosophies underlying C++. Why should my code carry around metadata if I may never need it? Moreover, the addition of metadata may inhibit the compiler from optimizing. Why should I pay that cost in my code if I may never need that metadata?
- Which leads us to another big point:
C++ makes very few guarantees about the compiled code. The compiler is allowed to do pretty much anything it likes, as long as the resulting functionality is what is expected. For example, your classes aren't required to actually be there. The compiler can optimize them away and inline everything they do, and it frequently does just that, because even simple template code tends to create quite a few template instantiations. The C++ standard library relies on this aggressive optimization. Functors are only performant if the overhead of instantiating and destructing the object can be optimized away. operator[] on a vector is only comparable to raw array indexing in performance because the entire operator can be inlined and thus removed entirely from the compiled code. C# and Java make a lot of guarantees about the output of the compiler. If I define a class in C#, then that class will exist in the resulting assembly. Even if I never use it. Even if all calls to its member functions could be inlined. The class has to be there, so that reflection can find it. Part of this is alleviated by C# compiling to bytecode, which means that the JIT compiler can remove class definitions and inline functions if it likes, even if the initial C# compiler can't. In C++, you only have one compiler, and it has to output efficient code. If you were allowed to inspect the metadata of a C++ executable, you'd expect to see every class it defined, which means that the compiler would have to preserve all the defined classes, even if they're not necessary.
- And then there are templates.
Templates in C++ are nothing like generics in other languages. Every template instantiation creates a new type. std::vector<int> is a completely separate class from std::vector<float>. That adds up to a lot of different types in an entire program. What should our reflection see? The template std::vector? But how can it, since that's a source-code construct, which has no meaning at runtime? It'd have to see the separate classes std::vector<int> and std::vector<float>. And std::vector<int>::iterator and std::vector<float>::iterator, same for const_iterator and so on. And once you step into template metaprogramming, you quickly end up instantiating hundreds of templates, all of which get inlined and removed again by the compiler. They have no meaning, except as part of a compile-time metaprogram. Should all these hundreds of classes be visible to reflection? They'd have to be, because otherwise our reflection would be useless if it doesn't even guarantee that the classes I defined will actually be there. And a side problem is that the template class doesn't exist until it is instantiated. Imagine a program which uses std::vector<int>. Should our reflection system be able to see std::vector<int>::iterator? On one hand, you'd certainly expect so. It's an important class, and it's defined in terms of std::vector<int>, which does exist in the metadata. On the other hand, if the program never actually uses this iterator class template, its type will never have been instantiated, and so the compiler won't have generated the class in the first place. And it's too late to create it at runtime, since it requires access to the source code.
- And finally, reflection isn't quite
as vital in C++ as it is in C#. The reason is, again, template metaprogramming. It can't solve everything, but for many cases where you'd otherwise resort to reflection, it's possible to write a metaprogram which does the same thing at compile-time. boost::type_traits is a simple example. You want to know about type T? Check its type_traits. In C#, you'd have to fish around after its type using reflection. Reflection would still be useful for some things (the main use I can see, which metaprogramming can't easily replace, is for autogenerated serialization code), but it would carry some significant costs for C++, and it's just not necessary as often as it is in other languages.
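To make the type_traits point concrete, here is a minimal sketch. It uses the standard <type_traits> header (the C++11 counterpart of boost::type_traits) rather than Boost itself, and the helper name describe is made up for illustration. It also checks at compile time that two instantiations of the same template really are unrelated types.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: classifies T using standard type traits.
// Every condition is a compile-time constant; no runtime metadata is involved.
template <typename T>
std::string describe() {
    if (std::is_integral<T>::value)       return "integral";
    if (std::is_floating_point<T>::value) return "floating point";
    if (std::is_pointer<T>::value)        return "pointer";
    if (std::is_class<T>::value)          return "class";
    return "other";
}

// Each instantiation of a template is a distinct, unrelated type.
static_assert(!std::is_same<std::vector<int>, std::vector<float>>::value,
              "vector<int> and vector<float> are different classes");

// Nested types of an instantiation can be inspected at compile time too.
static_assert(std::is_same<std::vector<int>::value_type, int>::value,
              "value_type of vector<int> is int");
```

Calling describe<int>() or describe<std::vector<int>>() answers the question "what kind of type is T?" without the executable carrying any description of those types.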
Edit:
In response to comments:
cdleary: Yes, debug symbols do something similar, in that they store metadata about the types used in the executable. But they also suffer from the problems I described. If you've ever tried debugging a release build, you'll know what I mean. There are large logical gaps where you created a class in the source code, which has gotten inlined away in the final code. If you were to use reflection for anything useful, you'd need it to be more reliable and consistent. As it is, types would be appearing and disappearing almost every time you compile. You change a tiny little detail, and the compiler decides to change which types get inlined and which ones don't, as a response. How do you extract anything useful from that, when you're not even guaranteed that the most relevant types will be represented in your metadata? The type you were looking for may have been there in the last build, but now it's gone. And tomorrow, someone will check in a small innocent change to a small innocent function, which makes the type just big enough that it won't get completely inlined, so it'll be back again. That's still useful for debug symbols, but not much more than that. I'd hate trying to generate serialization code for a class under those terms.
Evan Teran: Of course these issues could be resolved. But that falls back to my point #1. It'd take a lot of work, and the C++ committee has plenty of things they feel are more important. Is the benefit of getting some limited reflection (and it would be limited) in C++ really big enough to justify focusing on that at the expense of other features? Is there really a huge benefit in adding features to the core language which can already (mostly) be done through libraries and preprocessors like Qt's? Perhaps, but the need is a lot less urgent than if such libraries didn't exist.
For your specific suggestions though, I believe disallowing it on templates would make it completely useless. You'd be unable to use reflection on the standard library, for example. What kind of reflection wouldn't let you see a std::vector? Templates are a huge part of C++. A feature that doesn't work on templates is basically useless.
But you're right, some form of reflection could be implemented. But it'd be a major change in the language. As it is now, types are exclusively a compile-time construct. They exist for the benefit of the compiler, and nothing else. Once the code has been compiled, there are no classes. If you stretch yourself, you could argue that functions still exist, but really, all there is is a bunch of jump assembler instructions, and a lot of stack push/pop's. There's not much to go on, when adding such metadata.
But like I said, there is a proposal for changes to the compilation model, adding self-contained modules, storing metadata for select types, allowing other modules to reference them without having to mess with #includes. That's a good start, and to be honest, I'm surprised the standard committee didn't just throw the proposal out for being too big a change. So perhaps in 5-10 years? :)
Reflection support in C
Reflection in general is a means for a program to analyze the structure of some code.
This analysis is used to change the effective behavior of the code.
Reflection as analysis is generally very weak; usually it can only provide access to function and field names. This weakness comes from the language implementers essentially not wanting to make the full source code available at runtime, along with the appropriate analysis routines to extract what one wants from the source code.
Another approach is to tackle program analysis head on, by using a strong program analysis tool, e.g., one that can parse the source text exactly the way the compiler does it.
(Often people propose to abuse the compiler itself to do this, but that usually doesn't work; the compiler machinery wants to be a compiler and it is darn hard to bend it to other purposes).
What is needed is a tool that:
- Parses language source text
- Builds abstract syntax trees representing every detail of the program. (It is helpful if the ASTs retain comments and other details of the source code layout such as column numbers, literal radix values, etc.)
- Builds symbol tables showing the scope and meaning of every identifier
- Can extract control flows from functions
- Can extract data flow from the code
- Can construct a call graph for the system
- Can determine what each pointer points to
- Enables the construction of custom analyzers using the above facts
- Can transform the code according to such custom analyses (usually by revising the ASTs that represent the parsed code)
- Can regenerate source text (including layout and comments) from the revised ASTs.
Using such machinery, one implements analysis at whatever level of detail is needed, and then transforms the code to achieve the effect that runtime reflection would accomplish.
There are several major benefits:
- The detail level or amount of analysis is a matter of ambition (e.g., it isn't limited to what runtime reflection can do)
- There isn't any runtime overhead to achieve the reflected change in behavior
- The machinery involved can be general and applied across many languages, rather than be limited to what a specific language implementation provides.
- This is compatible with the C/C++ idea that you don't pay for what you don't use. If you don't need reflection, you don't need this machinery. And your language doesn't need to have the intellectual baggage of weak reflection built in.
See our DMS Software Reengineering Toolkit for a system that can do all of the above for C, Java, and COBOL, and most of it for C++.
[EDIT August 2017: Now handles C11 and C++2017]
How can I add reflection to a C++ application?
Ponder is a C++ reflection library, in answer to this question. I considered the options and decided to make my own since I couldn't find one that ticked all my boxes.
Although there are great answers to this question, I don't want to use tonnes of macros, or rely on Boost. Boost is a great library, but there are lots of small bespoke C++0x projects out there that are simpler and have faster compile times. There are also advantages to being able to decorate a class externally, like wrapping a C++ library that doesn't (yet?) support C++11. Ponder is a fork of CAMP, using C++11, that no longer requires Boost.
Reflections for C language?
You can use ANTLR to do this. There is already a C grammar you can use with ANTLR, so mainly all you have to do is pass the code into ANTLR and then walk the syntax tree looking for various attributes.
You can use ANTLR from a number of languages. While it might seem daunting at first, it's actually surprisingly easy to work with.
Is there any major programming language that doesn't support any form of reflection?
C and C++ don't have any form of reflection. What can be done is to embed debugging symbols in the executable with the compiler, and then process the symbol table from within the executable. However, this processing must be implemented by your own code (i.e. you write code in C to break down and process the symbol table in the executable). Therefore, it isn't inherent in the language.
Does C++ already have some kind of reflection?
It's compiler-dependent. Obviously, it's easy for the compiler to spot every throw, and encode the type of every thrown object into the executable. But there is no requirement that they should do this.
And thinking about it, exceptions have to be copied into a weird implementation-dependent space when they are thrown. So it makes sense that the name of their type is accessible via this mechanism to the runtime of a specific compiler.
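As a rough illustration of how much type information about a thrown object standard C++ exposes, the sketch below uses typeid from <typeinfo>. That is an assumption on my part: the answer above talks about what the runtime of a specific compiler can print, not about typeid. The helper name thrown_type_name is hypothetical, and the string returned by name() is implementation-defined (often a mangled name), which fits the "compiler-dependent" caveat.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical helper: runs f and reports the dynamic type of whatever it threw.
template <typename F>
std::string thrown_type_name(F&& f) {
    try {
        f();
    } catch (const std::exception& e) {
        // std::exception is polymorphic, so typeid yields the dynamic type.
        // The exact text of name() is implementation-defined.
        return typeid(e).name();
    } catch (...) {
        // Standard C++ offers no portable way to name this type.
        return "<non-std exception>";
    }
    return "<nothing thrown>";
}
```

Note that this only works for exceptions derived from a polymorphic base; for anything caught by catch (...), portable code learns nothing about the type.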
reflection TS - in C++23?
While the Reflection TS was officially finished and published, at the same time significant progress was being made developing an alternative syntax that made use of newer language features like consteval to express reflection information as values rather than types (as in traditional template metaprogramming). The TS was published anyway as a record of the design decisions already made and to serve as a point of reference for the new design, but so long as progress continues smoothly on it, it’s unlikely that the old version will be implemented anywhere. It’s also unlikely that the new system will be finalized in time for C++23, although experimental implementations of it might become available at about that time.
Reflection in C++
If you are looking for a totally general way to manipulate objects at runtime when you don't know their types at compile time in C++, you essentially need to:
- Define an interface (abstract base class with all pure virtual methods and no members) for each capability that a class might support.
- Each class must inherit virtually from all interfaces that it wants to implement (possibly among other classes).
Now, suppose pFoo holds an interface pointer of type IFoo* to some object x (you don't need to know x's concrete type). You can see whether this object supports interface IBar by saying:
if (IBar* pBar = dynamic_cast<IBar*>(pFoo)) {
    // Do stuff using pBar here
    pBar->endWorldHunger();
} else {
    // Object doesn't support the interface: degrade gracefully
    pFoo->grinStupidly();
}
This approach assumes you know all relevant interfaces at compile time -- if you don't, you won't be able to use normal C++ syntax for calling methods anyway. But it's hard to imagine a situation where the calling program doesn't know what interfaces it needs -- about the only case I can think of would be if you want to expose C++ objects via an interactive interpreter. Even then, you can devise an (ugly, maintenance-intensive) way of shoehorning this into the above paradigm, so that methods can be called by specifying their names and arguments as strings.
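A self-contained sketch of the interface approach might look like the following. IFoo, IBar and the two method names come from the snippet above; the concrete classes FooOnly and FooBar, the function useObject, and the string return values are invented here so the behaviour can be observed.

```cpp
#include <string>

// Interfaces: abstract base classes with only pure virtual methods and no data.
struct IFoo {
    virtual ~IFoo() = default;
    virtual std::string grinStupidly() = 0;
};
struct IBar {
    virtual ~IBar() = default;
    virtual std::string endWorldHunger() = 0;
};

// Hypothetical concrete classes, inheriting virtually from each interface.
struct FooOnly : virtual IFoo {
    std::string grinStupidly() override { return "grin"; }
};
struct FooBar : virtual IFoo, virtual IBar {
    std::string grinStupidly() override { return "grin"; }
    std::string endWorldHunger() override { return "fed"; }
};

// Only an IFoo* is available here; the concrete type is unknown.
std::string useObject(IFoo* pFoo) {
    if (IBar* pBar = dynamic_cast<IBar*>(pFoo)) {
        return pBar->endWorldHunger();  // cross-cast succeeded
    }
    return pFoo->grinStupidly();        // degrade gracefully
}
```

Passing a FooBar yields the IBar behaviour, while a FooOnly falls back to IFoo, without useObject ever naming either concrete class.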
The other aspect to consider is object creation. To accomplish this without knowing concrete types, you'll need a factory function, plus unique identifiers for classes to specify which concrete class you want. It's possible to arrange for classes to register themselves with a global factory upon startup, as described here by C++ expert Herb Sutter -- this avoids maintaining a gigantic switch statement, considerably easing maintenance. It's possible to use a single factory, though this implies that there is a single interface that every object in your system must implement (the factory will return a pointer or reference to this interface type).
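One possible shape for such a self-registering factory is sketched below. It is not necessarily the scheme from the article referred to above; the names IObject, Factory, registerClass, Widget and widgetRegistered are all hypothetical, and string identifiers are just one choice of unique class id.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// The single interface every factory-created object implements (hypothetical).
struct IObject {
    virtual ~IObject() = default;
    virtual std::string name() const = 0;
};

class Factory {
public:
    using Creator = std::function<std::unique_ptr<IObject>()>;

    // Function-local static sidesteps static-initialization-order problems.
    static Factory& instance() {
        static Factory f;
        return f;
    }
    // Returns false if the identifier was already registered.
    bool registerClass(const std::string& id, Creator c) {
        return creators_.emplace(id, std::move(c)).second;
    }
    // Returns a null pointer for an unknown identifier.
    std::unique_ptr<IObject> create(const std::string& id) const {
        auto it = creators_.find(id);
        if (it == creators_.end()) return nullptr;
        return it->second();
    }
private:
    std::map<std::string, Creator> creators_;
};

struct Widget : IObject {
    std::string name() const override { return "Widget"; }
};

// Initialized during program startup: Widget registers itself, so no
// central switch statement has to know about it.
static const bool widgetRegistered = Factory::instance().registerClass(
    "Widget", [] { return std::unique_ptr<IObject>(new Widget); });
```

Client code then calls Factory::instance().create("Widget") and receives an IObject pointer, which can be probed for further interfaces with dynamic_cast as shown earlier.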
At the end of the day, what you wind up with is basically (isomorphic to) COM -- dynamic_cast<IFoo*> does the same job as QueryInterface(IID_IFoo), and the base interface implemented by all objects is equivalent to IUnknown.
Why can reflection access protected/private member of class in C#?
This is necessary for scenarios such as remoting, serialization, materialization, etc. You shouldn't use it blindly, but note that these facilities have always been available in any system (essentially, by addressing the memory directly). Reflection simply formalises it, and places controls and checks in the way - which you aren't seeing because you are presumably running at "full trust", so you are already stronger than the system that is being protected.
If you try this in partial trust, you'll see much more control over the internal state.
Is it an anti-pattern?
Only if your code uses it inappropriately. For example, consider the following (valid for a WCF data-contract):
[DataMember]
private int foo;
public int Foo { get {return foo;} set {foo = value;} }
Is it incorrect for WCF to support this? I suspect not... there are multiple scenarios where you want to serialize something that isn't part of the public API, without having a separate DTO. Likewise, LINQ-to-SQL will materialize into private members if you so elect.