Why Does C++ Not Have Reflection

Why does C++ not have reflection?

There are several problems with reflection in C++.

  • It's a lot of work to add, and the C++ committee is fairly conservative: it doesn't spend time on radical new features unless it's sure they'll pay off. (A suggestion for adding a module system similar to .NET assemblies has been made, and while I think there's general consensus that it'd be nice to have, it's not a top priority at the moment and has been pushed back until well after C++0x. The motivation for this feature is to get rid of the #include system, but it would also enable at least some metadata.)

  • You don't pay for what you don't
    use. That's one of the most basic
    design philosophies underlying C++.
    Why should my code carry around
    metadata if I may never need it?
    Moreover, the addition of metadata
    may inhibit the compiler from
    optimizing. Why should I pay that
    cost in my code if I may never need
    that metadata?

  • Which leads us to another big point:
    C++ makes very few guarantees
    about the compiled code. The
    compiler is allowed to do pretty
    much anything it likes, as long as
    the resulting functionality is what
    is expected. For example, your
    classes aren't required to actually
    be there. The compiler can optimize them away, inline
    everything they do, and it
    frequently does just that, because
    even simple template code tends to
    create quite a few template
    instantiations. The C++ standard
    library relies on this aggressive
    optimization. Functors are only
    performant if the overhead of
    instantiating and destructing the
    object can be optimized away.
    operator[] on a vector is only comparable to raw
    array indexing in performance
    because the entire operator can be
    inlined and thus removed entirely
    from the compiled code. C# and Java
    make a lot of guarantees about the
    output of the compiler. If I define
    a class in C#, then that class will
    exist
    in the resulting assembly.
    Even if I never use it. Even if all
    calls to its member functions could
    be inlined. The class has to be
    there, so that reflection can find
    it. Part of this is alleviated by C#
    compiling to bytecode, which means
    that the JIT compiler can remove
    class definitions and inline
    functions if it likes, even if the
    initial C# compiler can't. In C++,
    you only have one compiler, and it
    has to output efficient code. If you
    were allowed to inspect the metadata
    of a C++ executable, you'd expect to
    see every class it defined, which
    means that the compiler would have
    to preserve all the defined classes,
    even if they're not necessary.

  • And then there are templates.
    Templates in C++ are nothing like
    generics in other languages. Every
    template instantiation creates a
    new type. std::vector<int> is a completely separate class from
    std::vector<float>. That adds up to
    a lot of different types in an entire
    program. What should our reflection
    see? The template std::vector? But
    how can it, since that's a
    source-code construct, which has no
    meaning at runtime? It'd have to see
    the separate classes
    std::vector<int> and
    std::vector<float>. And
    std::vector<int>::iterator and
    std::vector<float>::iterator, same
    for const_iterator and so on. And
    once you step into template
    metaprogramming, you quickly end up
    instantiating hundreds of templates,
    all of which get inlined and removed
    again by the compiler. They have no
    meaning, except as part of a
    compile-time metaprogram. Should all
    these hundreds of classes be visible
    to reflection? They'd have to be,
    because otherwise our reflection
    would be useless if it doesn't even guarantee that the classes I defined will actually be there. A side problem is that a template class doesn't exist until it is instantiated. Imagine a program which uses std::vector<int>. Should our reflection system be able to see std::vector<int>::iterator? On one hand, you'd certainly expect so. It's an important class, and it's defined in terms of std::vector<int>, which does exist in the metadata. On the other hand, if the program never actually uses this iterator class, it will never have been instantiated, and so the compiler won't have generated the class in the first place. And it's too late to create it at runtime, since that requires access to the source code.

  • And finally, reflection isn't quite
    as vital in C++ as it is in C#. The
    reason is, again, template
    metaprogramming. It can't solve
    everything, but for many cases where
    you'd otherwise resort to
    reflection, it's possible to write a
    metaprogram which does the same
    thing at compile-time.
    boost::type_traits is a simple
    example. You want to know about type
    T? Check its type_traits. In C#,
    you'd have to fish around after its
    type using reflection. Reflection
    would still be useful for some
    things (the main use I can see,
    which metaprogramming can't easily
    replace, is for autogenerated
    serialization code), but it would
    carry some significant costs for
    C++, and it's just not necessary as often as it is in other languages.

Edit:
In response to comments:

cdleary:
Yes, debug symbols do something similar, in that they store metadata about the types used in the executable. But they also suffer from the problems I described. If you've ever tried debugging a release build, you'll know what I mean. There are large logical gaps where you created a class in the source code which has been inlined away in the final code. If you were to use reflection for anything useful, you'd need it to be more reliable and consistent. As it is, types would be vanishing and reappearing almost every time you compile. You change a tiny little detail, and the compiler responds by changing which types get inlined and which ones don't. How do you extract anything useful from that, when you're not even guaranteed that the most relevant types will be represented in your metadata? The type you were looking for may have been there in the last build, but now it's gone. And tomorrow, someone will check in a small innocent change to a small innocent function, which makes the type just big enough that it won't get completely inlined, so it'll be back again. That's still useful for debug symbols, but not much more than that. I'd hate trying to generate serialization code for a class under those terms.

Evan Teran: Of course these issues could be resolved. But that falls back to my point #1. It'd take a lot of work, and the C++ committee has plenty of things they feel are more important. Is the benefit of getting some limited reflection (and it would be limited) in C++ really big enough to justify focusing on that at the expense of other features? Is there really a huge benefit in adding features to the core language which can already (mostly) be done through libraries and preprocessors like Qt's? Perhaps, but the need is a lot less urgent than if such libraries didn't exist.
For your specific suggestions though, I believe disallowing it on templates would make it completely useless. You'd be unable to use reflection on the standard library, for example. What kind of reflection wouldn't let you see a std::vector? Templates are a huge part of C++. A feature that doesn't work on templates is basically useless.

But you're right, some form of reflection could be implemented. But it'd be a major change in the language. As it is now, types are exclusively a compile-time construct. They exist for the benefit of the compiler, and nothing else. Once the code has been compiled, there are no classes. If you stretch things, you could argue that functions still exist, but really, all there is is a bunch of jump instructions and a lot of stack pushes and pops. There's not much to go on when adding such metadata.

But like I said, there is a proposal for changes to the compilation model, adding self-contained modules, storing metadata for select types, allowing other modules to reference them without having to mess with #includes. That's a good start, and to be honest, I'm surprised the standard committee didn't just throw the proposal out for being too big a change. So perhaps in 5-10 years? :)

Reflection support in C

Reflection in general is a means for a program to analyze the structure of some code.
This analysis is used to change the effective behavior of the code.

Reflection as analysis is generally very weak; usually it can only provide access to function and field names. This weakness comes from the language implementers essentially not wanting to make the full source code available at runtime, along with the appropriate analysis routines to extract what one wants from the source code.

Another approach is to tackle program analysis head on, by using a strong program analysis tool, e.g., one that can parse the source text exactly the way the compiler does it.
(Often people propose to abuse the compiler itself to do this, but that usually doesn't work; the compiler machinery wants to be a compiler and it is darn hard to bend it to other purposes).

What is needed is a tool that:

  • Parses language source text
  • Builds abstract syntax trees representing every detail of the program.
    (It is helpful if the ASTs retain comments and other details of the source
    code layout such as column numbers, literal radix values, etc.)
  • Builds symbol tables showing the scope and meaning of every identifier
  • Can extract control flows from functions
  • Can extract data flow from the code
  • Can construct a call graph for the system
  • Can determine what each pointer points-to
  • Enables the construction of custom analyzers using the above facts
  • Can transform the code according to such custom analyses
    (usually by revising the ASTs that represent the parsed code)
  • Can regenerate source text (including layout and comments) from
    the revised ASTs.

Using such machinery, one implements analysis at whatever level of detail is needed, and then transforms the code to achieve the effect that runtime reflection would accomplish.
There are several major benefits:

  • The detail level or amount of analysis is a matter of ambition (i.e., it isn't
    limited to what runtime reflection can do)
  • There isn't any runtime overhead to achieve the reflected change in behavior
  • The machinery involved can be general and applied across many languages, rather
    than be limited to what a specific language implementation provides.
  • This is compatible with the C/C++ idea that you don't pay for what you don't use.
    If you don't need reflection, you don't need this machinery. And your language
    doesn't need to have the intellectual baggage of weak reflection built in.

See our DMS Software Reengineering Toolkit for a system that can do all of the above for C, Java, and COBOL, and most of it for C++.

[EDIT August 2017: Now handles C11 and C++2017]

How can I add reflection to a C++ application?

Ponder is a C++ reflection library, in answer to this question. I considered the options and decided to make my own since I couldn't find one that ticked all my boxes.

Although there are great answers to this question, I don't want to use tonnes of macros or rely on Boost. Boost is a great library, but there are lots of small bespoke C++0x projects out there that are simpler and have faster compile times. There are also advantages to being able to decorate a class externally, like wrapping a C++ library that doesn't (yet?) support C++11. Ponder is a fork of CAMP, using C++11, that no longer requires Boost.

Reflections for C language?

You can use ANTLR to do this. There is already a C grammar you can use with ANTLR, so mainly all you have to do is pass the code to ANTLR and then walk the syntax tree looking for various attributes.

You can use ANTLR from a number of languages. While it might seem daunting at first, it's actually surprisingly easy to work with.

Is there any major programming language that doesn't support any form of reflection?

C and C++ don't have any form of reflection. What can be done is to have the compiler embed debugging symbols in the executable, and then process the symbol table from within the executable. However, this processing must be implemented by your own code (i.e., you write code in C to break down and process the symbol table in the executable). Therefore, it isn't inherent in the language.

Does C++ already have some kind of reflection?

It's compiler dependent. Obviously, it's easy for the compiler to spot every throw, and encode the type of every thrown object into the executable. But there is no requirement that they should do this.

And thinking about it, exceptions have to be copied into a weird implementation-dependent space when they are thrown. So it makes sense that the name of their type is accessible via this mechanism to the runtime of a specific compiler.

reflection TS - in C++23?

While the Reflection TS was officially finished and published, at the same time significant progress was being made developing an alternative syntax that made use of newer language features like consteval to express reflection information as values rather than types (as in traditional template metaprogramming). The TS was published anyway as a record of the design decisions already made and to serve as a point of reference for the new design, but so long as progress continues smoothly on the new design, it’s unlikely that the old version will be implemented anywhere. It’s also unlikely that the new system will be finalized in time for C++23, although experimental implementations of it might become available at about that time.

Reflection in C++

If you are looking for a totally general way to manipulate objects at runtime when you don't know their types at compile time in C++, you essentially need to:

  1. Define an interface (abstract base class with all pure virtual methods and no members) for each capability that a class might support.
  2. Each class must inherit virtually from all interfaces that it wants to implement (possibly among other classes).

Now, suppose pFoo holds an interface pointer of type IFoo* to some object x (you don't need to know x's concrete type). You can see whether this object supports interface IBar by saying:

if (IBar* pBar = dynamic_cast<IBar*>(pFoo)) {
    // Do stuff using pBar here
    pBar->endWorldHunger();
} else {
    // Object doesn't support the interface: degrade gracefully
    pFoo->grinStupidly();
}

This approach assumes you know all relevant interfaces at compile time -- if you don't, you won't be able to use normal C++ syntax for calling methods anyway. But it's hard to imagine a situation where the calling program doesn't know what interfaces it needs -- about the only case I can think of would be if you want to expose C++ objects via an interactive interpreter. Even then, you can devise an (ugly, maintenance-intensive) way of shoehorning this into the above paradigm, so that methods can be called by specifying their names and arguments as strings.

The other aspect to consider is object creation. To accomplish this without knowing concrete types, you'll need a factory function, plus unique identifiers for classes to specify which concrete class you want. It's possible to arrange for classes to register themselves with a global factory upon startup, as described here by C++ expert Herb Sutter -- this avoids maintaining a gigantic switch statement, considerably easing maintenance. It's possible to use a single factory, though this implies that there is a single interface that every object in your system must implement (the factory will return a pointer or reference to this interface type).

At the end of the day, what you wind up with is basically (isomorphic to) COM -- dynamic_cast<IFoo*> does the same job as QueryInterface(IID_IFoo), and the base interface implemented by all objects is equivalent to IUnknown.

Why can reflection access protected/private member of class in C#?

This is necessary for scenarios such as remoting, serialization, materialization, etc. You shouldn't use it blindly, but note that these facilities have always been available in any system (essentially, by addressing the memory directly). Reflection simply formalises it, and places controls and checks in the way - which you aren't seeing because you are presumably running at "full trust", so you are already stronger than the system that is being protected.

If you try this in partial trust, you'll see much more control over the internal state.

Is it an anti-pattern?

Only if your code uses it inappropriately. For example, consider the following (valid for a WCF data-contract):

[DataMember]
private int foo;

public int Foo { get {return foo;} set {foo = value;} }

Is it incorrect for WCF to support this? I suspect not... there are multiple scenarios where you want to serialize something that isn't part of the public API, without having a separate DTO. Likewise, LINQ-to-SQL will materialize into private members if you so elect.


