What Are Customization Point Objects and How to Use Them

What are customization point objects?

They are function object instances in namespace std that fulfill two objectives: first unconditionally trigger (conceptified) type requirements on the argument(s), then dispatch to the correct function in namespace std or via ADL.

In particular, why are they objects?

That's necessary to circumvent a second lookup phase that would directly bring in the user provided function via ADL (this should be postponed by design). See below for more details.

... and how to use them?

When developing an application: you mainly don't. This is a standard library feature, it will add concept checking to future customization points, hopefully resulting e.g. in clear error messages when you mess up template instantiations. However, with a qualified call to such a customization point, you can directly use it. Here's an example with an imaginary std::customization_point object that adheres to the design:

namespace a {
    struct A {};
    // Knows what to do with the argument, but doesn't check type requirements:
    void customization_point(const A&);
}

// Does concept checking, then calls a::customization_point via ADL:
std::customization_point(a::A{});

This is currently not possible with e.g. std::swap, std::begin and the like.

Explanation (a summary of N4381)

Let me try to digest the proposal behind this section in the standard. There are two issues with "classical" customization points used by the standard library.

  • They are easy to get wrong. As an example, swapping objects in generic code is supposed to look like this

    template<class T> void f(T& t1, T& t2)
    {
        using std::swap;
        swap(t1, t2);
    }

    but making a qualified call to std::swap(t1, t2) instead is all too easy, and then the user-provided swap would never be called (see N4381, Motivation and Scope).

  • More severely, there is no way to centralize (conceptified) constraints on types passed to such user provided functions (this is also why this topic gained importance with C++20). Again from N4381:

    Suppose that a future version of std::begin requires that its argument model a Range concept.
    Adding such a constraint would have no effect on code that uses std::begin idiomatically:

    using std::begin;
    begin(a);

    If the call to begin dispatches to a user-defined overload, then the constraint on std::begin
    has been bypassed.

The solution that is described in the proposal mitigates both issues
by an approach like the following, imaginary implementation of std::begin.

namespace std {
    namespace __detail {
        /* Classical definitions of function templates "begin" for
           raw arrays and ranges... */

        struct __begin_fn {
            /* Call operator template that performs concept checking and
             * invokes begin(arg). This is the heart of the technique.
             * Everything from above is already in the __detail scope, but
             * ADL is triggered, too. */

        };
    }

    /* Thanks to @cpplearner for pointing out that the global
       function object will be an inline variable: */
    inline constexpr __detail::__begin_fn begin{};
}

First, a qualified call to e.g. std::begin(someObject) always detours via std::__detail::__begin_fn,
which is desired. For what happens with an unqualified call, I again refer to the original paper:

In the case that begin is called unqualified after bringing std::begin into scope, the situation
is different. In the first phase of lookup, the name begin will resolve to the global object
std::begin. Since lookup has found an object and not a function, the second phase of lookup is not
performed. In other words, if std::begin is an object, then using std::begin; begin(a); is
equivalent to std::begin(a); which, as we’ve already seen, does argument-dependent lookup on the
users’ behalf.

This way, concept checking can be performed within the function object in the std namespace,
before the ADL call to a user provided function is performed. There is no way to circumvent this.

Customization points and ADL

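I've found two solutions to this problem, that is, letting an ADL-based extension point also pick up overloads for standard (or built-in) types, whose namespace you cannot add to. Both have their downsides.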

Declare All Std Overloads

Let overloads for standard types be found by normal lookup. This basically means declaring all of them before using the extension function. Remember: when you perform an unqualified call in a function template, normal lookup happens at the point of definition, while ADL happens at the point of instantiation. This means that normal lookup only finds overloads visible from where the template is written, whereas ADL finds stuff defined later on.

The upside of this approach is that nothing changes for the user when writing his own functions.

The downside is that you have to include the header of every standard type you want to provide an overload for, and provide that overload, in the header that just wants to define the extension point. This can mean a very heavy dependency.
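The lookup rule behind this option is easy to check with a small sketch, assuming a compiler that implements two-phase lookup conformingly. The namespace ext and the describe overloads are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace ext {
// Visible at the template's point of definition: found by normal lookup.
inline std::string describe(double) { return "describe(double)"; }

template <class T>
std::string use_describe(const T& t) {
    return describe(t);  // unqualified, dependent call
}

// Declared too late for normal lookup. int has no associated namespace,
// so ADL at the point of instantiation cannot find it either.
inline std::string describe(int) { return "describe(int)"; }
}

inline void demo() {
    // Inside the template, only describe(double) is a candidate.
    assert(ext::use_describe(42) == "describe(double)");
    // A direct call at this point sees both overloads.
    assert(ext::describe(42) == "describe(int)");
}
```

This is why every overload for a standard or built-in type has to be declared before the template that uses the extension point.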

Add Another Argument

The other option is to pass a second argument to the function. Here's how this works:

namespace your_stuff {
    namespace adl {
        struct tag {};

        void extension_point() = delete; // this is just a dummy
    }

    template <typename T>
    void use_extension_point(const T& t) {
        using adl::extension_point;
        extension_point(t, adl::tag{}); // this is the ADL call
    }

    template <typename T>
    void needs_extension_point(const T& t) {
        your_stuff::use_extension_point(t); // suppress ADL
    }
}

Now you can, at basically any point in the program, provide overloads for std (or even global or built-in) types like this:

namespace your_stuff { namespace adl {
    void extension_point(const std::string& s, tag) {
        // do stuff here
    }
    void extension_point(int i, tag) {
        // do stuff here
    }
}}

The user can, for his own types, write overloads like this:

namespace user_stuff {
    void extension_point(const user_type& u, your_stuff::adl::tag) {
        // do stuff here
    }
}

Upside: Works.

Downside: the user must add the your_stuff::adl::tag argument to his overloads. This will probably be seen as annoying boilerplate by many, and more importantly, can lead to the big puzzling "why doesn't it find my overload" problem when the user forgets to add the argument. On the other hand, the argument also clearly identifies the overloads as fulfilling a contract (being an extension point), which could be important when the next programmer comes along and renames the function to extensionPoint (to conform with naming conventions) and then freaks out when things don't compile anymore.

Why is deleting a function necessary when you're defining a customization point object?

TL;DR: It's there to keep from calling std::swap.

This is actually an explicit requirement of the ranges::swap customization point:

S is (void)swap(E1, E2) if E1 or E2 has class or enumeration type ([basic.compound]) and that expression is valid, with overload resolution performed in a context that includes this definition:

 template<class T>
void swap(T&, T&) = delete;

So what does this do? To understand the point of this, we have to remember that the ranges namespace is actually the std::ranges namespace. That's important because a lot of stuff lives in the std namespace. Including this, declared in <utility>:

template< class T >
void swap( T& a, T& b );

There's probably a constexpr and noexcept on there somewhere, but that's not relevant for our needs.

std::ranges::swap, as a customization point, has a specific way it wants you to customize it. It wants you to provide a swap function that can be found via argument-dependent lookup. Which means that ranges::swap is going to find your swap function by doing this: swap(E1, E2).

That's fine, except for one problem: std::swap exists. In the pre-C++20 days, one valid way of making a type swappable was to provide a specialization for the std::swap template. So if you called std::swap directly to swap something, your specializations would be picked up and used.

ranges::swap does not want to use those. It has one customization mechanism, and it wants you to very definitely use that mechanism, not template specialization of std::swap.

However, because std::ranges::swap lives in the std namespace, unqualified calls to swap(E1, E2) can find std::swap. To avoid finding and using this overload, it poisons the overload by making visible a version that is = deleted. So if you don't provide an ADL-visible swap for your type, the unqualified swap(E1, E2) expression is invalid rather than quietly resolving to std::swap, and ranges::swap moves on to its remaining, non-ADL cases. A proper customization is also required to be more specialized (or more constrained) than the std::swap version, so that it can be considered a better overload match.

Note that ranges::begin/end and similar functions have similar wording to shut down similar problems with similarly named std:: functions.

Does std::(customization point) invoke the most appropriate overload?

A real example is std::swap, which is a designated customization point. Does this mean since C++20, we can write std::swap(a, b) directly instead of using std::swap; swap(a, b);?

No. std::swap itself did not gain any powers. It's still just a function template, so if you call it directly, you're... calling it directly. No ADL or anything.

The point of this is to say how customization points should be opted into. That is, you write:

namespace N { // not std
    void swap(Foo&, Foo&);
}

Not:

namespace std {
    void swap(N::Foo&, N::Foo&);
}

Nor:

namespace std {
    template <>
    void swap(N::Foo&, N::Foo&);
}

However, C++20 does introduce a lot of new things called customization point objects which you can use directly to do this kind of thing. The CPO for swap is spelled std::ranges::swap (and likewise there are CPOs for all the useful ranges things... ranges::begin, ranges::end, etc.).

What is a niebloid?

The term niebloid comes from Eric Niebler's name. In simple words, they are function objects that disable ADL (argument-dependent lookup) from happening so that the overloads in std:: aren't picked up when an algorithm from std::ranges is called.

Here's a tweet (from 2018) and answer from Eric himself suggesting the name. Eric wrote an article in 2014 explaining this concept.

It can best be seen in action in the standard document itself:

25.2.2
The entities defined in the std::ranges namespace in this Clause are not found by argument-dependent name lookup (basic.lookup.argdep).
When found by unqualified (basic.lookup.unqual) name lookup for the postfix-expression in a function call, they inhibit argument-dependent name lookup.

void foo() {
    using namespace std::ranges;
    std::vector<int> vec{1,2,3};
    find(begin(vec), end(vec), 2); // #1
}

The function call expression at #1 invokes std::ranges::find, not std::find, despite that (a) the iterator type returned from begin(vec) and end(vec) may be associated with namespace std and (b) std::find is more specialized ([temp.func.order]) than std::ranges::find since the former requires its first two parameters to have the same type.

The above example has ADL turned off, so the call goes directly to std::ranges::find.

Let's create a small example to explore this further:

#include <iostream>
#include <type_traits>

namespace mystd
{
    class B{};
    class A{};
    template<typename T>
    void swap(T &a, T &b)
    {
        std::cout << "mystd::swap\n";
    }
}

namespace sx
{
    namespace impl {
        // our functor, the niebloid (named swap_fn because identifiers
        // containing a double underscore are reserved)
        struct swap_fn {
            template<typename R, typename = std::enable_if_t< std::is_same<R, mystd::A>::value > >
            void operator()(R &a, R &b) const
            {
                std::cout << "in sx::swap()\n";
                // swap(a, b);
            }
        };
    }
    inline constexpr impl::swap_fn swap{};
}

int main()
{
    mystd::B a, b;
    swap(a, b); // calls mystd::swap() via ADL

    using namespace sx;
    mystd::A c, d;
    swap(c, d); // No ADL! Calls sx::swap!

    return 0;
}

Description from cppreference:

The function-like entities described on this page are niebloids, that is:

  • Explicit template argument lists may not be specified when calling any of them.
  • None of them is visible to argument-dependent lookup.
  • When one of them is found by normal unqualified lookup for the name to the left of the function-call operator, it inhibits argument-dependent lookup.

Niebloids aren't visible to argument-dependent lookup (ADL) because they are function objects, and ADL is done only for free functions and not function objects. The third point is what happened in the example from the standard:

find(begin(vec), end(vec), 2); //unqualified call to find

The call to find() is unqualified, so when lookup starts, it finds the std::ranges::find function object, which in turn stops ADL from happening.

Searching some more, I found this, which in my opinion is the most understandable explanation of niebloids and CPOs (customization point objects):

... a CPO is an object (not a function); it’s callable; it’s constexpr-constructible, [...] it’s customizable (that’s what it means to “interact with program-defined types”); and it’s concept-constrained.

[...]

If you remove the adjectives “customizable, concept-constrained” from the above, then you have a function object that turns off ADL — but is not necessarily a customization point. The C++2a Ranges algorithms, such as std::ranges::find, are like this. Any callable, constexpr-constructible object is colloquially known as a “niebloid,” in honor of Eric Niebler.

How to write concepts that make use of ADL

The idiom using std::X; X(...); is considered a bad idea post-C++20. The standard idiom now is to create a customization point object for X. Or in your case, use the existing customization point object: std::ranges::begin.

The way you call such a customization point is by spelling it out in full; internally, it can make an ADL call (without using anything) if the type being provided has such a call.

Customisation point for alias to std types

one of the user wants to specialise my_func for his type, which is an alias to std type

This is the original sin, which is causing you all the pain. Type aliases in C++ are just aliases; they're not new types. You have a generic algorithm that uses a customization point, something like

#include <sstream>
#include <string>
#include <utility>

// stringify_pair is my generic algorithm; operator<< is my customization point
template<class K, class V>
std::string stringify_pair(K key, V value) {
    std::ostringstream oss;
    oss << key << ':' << value;
    return std::move(oss).str();
}

Your user wants to call this generic algorithm with a standard type, like

std::string mykey = "abc";
std::optional<int> myvalue = 42;
std::cout << stringify_pair(mykey, myvalue);

This doesn't work because std::optional<int> doesn't provide an operator<<.
It can't possibly be made to work, because your user doesn't own the std::optional<int> type and therefore can't add operations to it. (They can certainly try, physically speaking; but it doesn't work from a philosophical point of view, which is why you keep running into roadblocks every time you get (physically) close.)

The simplest way for the user to make their code work is for them to "take legal ownership" of the type definition, instead of relying on somebody else's type.

struct OptionalInt {
    std::optional<int> data_;
    OptionalInt(int x) : data_(x) {}
    friend std::ostream& operator<<(std::ostream&, const OptionalInt&);
};
OptionalInt myvalue = 42; // no problem now
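For completeness, here is a self-contained sketch of the wrapper together with the generic algorithm. The body of operator<< is my assumption (the declaration above leaves it open), in particular the "none" text for an empty value, and stringify_pair takes its arguments by const reference here.

```cpp
#include <cassert>
#include <optional>
#include <ostream>
#include <sstream>
#include <string>

// Generic algorithm; operator<< is the customization point.
template <class K, class V>
std::string stringify_pair(const K& key, const V& value) {
    std::ostringstream oss;
    oss << key << ':' << value;
    return oss.str();
}

struct OptionalInt {
    std::optional<int> data_;
    OptionalInt(int x) : data_(x) {}

    // Hidden friend, found by ADL on OptionalInt. The output format is an
    // assumption made for this sketch.
    friend std::ostream& operator<<(std::ostream& os, const OptionalInt& v) {
        if (v.data_) return os << *v.data_;
        return os << "none";
    }
};

inline void demo() {
    std::string mykey = "abc";
    OptionalInt myvalue = 42;
    assert(stringify_pair(mykey, myvalue) == "abc:42");
}
```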

You ask why tag_invoke doesn't have the same problem as raw ADL. I believe the answer is that when you call lib::my_func(t), which calls lib_ti::tag_invoke(*this, t), which does an ADL call to tag_invoke(lib::my_func, t), it's doing ADL with an argument list that includes both your t (which doesn't really matter) and that first argument of type lib::my_func_fn (which means lib is an associated namespace for this call). That's why it finds the tag_invoke overload you put into namespace lib.

In the raw ADL case, namespace lib is not an associated namespace of the call to my_func(t). The my_func overload you put into namespace lib is not found, because it isn't found by ADL (not in an associated namespace) and it isn't found by regular unqualified lookup either (because, waving hands vaguely, of two-phase lookup).
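A stripped-down sketch of the tag_invoke case shows the associated-namespace effect. It deliberately leaves out the usual tag_invoke helper machinery (the lib_ti layer), and IntVec and the returned strings are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace lib {
struct my_func_fn {
    template <class T>
    std::string operator()(const T& t) const {
        // Unqualified call: ADL looks in the namespaces of both arguments,
        // and my_func_fn makes lib one of them.
        return tag_invoke(*this, t);
    }
};
inline constexpr my_func_fn my_func{};
}

using IntVec = std::vector<int>;  // alias to a std type; its own namespace is std

namespace lib {
// Declared after the function object, but found through the my_func_fn argument.
inline std::string tag_invoke(my_func_fn, const IntVec& v) {
    return "IntVec of size " + std::to_string(v.size());
}
}

inline void demo() {
    assert(lib::my_func(IntVec{1, 2, 3}) == "IntVec of size 3");
}
```

With raw ADL there is no first argument of a lib type, so an overload placed in namespace lib for a std type has no way of being associated with the call.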



What is the best way to write generic algorithms and customisation points and allow clients to customise for aliases for std types?

Don't. The "interface" of a type — what operations it supports, what you're allowed to do with it — is under the control of the author of the type. If you're not the author of the type, don't add operations to it; instead, create your own type (possibly by inheritance, preferably by composition) and give it whatever operations you want.

In the worst case, you end up with two different users in different parts of the program, one doing

using IntSet = std::set<int>;
template<> struct std::hash<IntSet> {
    size_t operator()(const IntSet& s) const { return s.size(); }
};

and the other one doing

using IntSet = std::set<int>;
template<> struct std::hash<IntSet> {
    size_t operator()(const IntSet& s, size_t h = 0) const {
        for (int i : s) h += std::hash<int>()(i);
        return h;
    }
};

and then both of them try to use std::unordered_set<IntSet>, and then boom, ODR violation and undefined behavior at runtime when you pass a std::unordered_set<IntSet> from one object file to another and they agree on the name of std::hash<std::set<int>> but disagree on its meaning. It's just a huge can of worms. Don't open it.
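One way to follow that advice, sketched with composition: the namespace app, the wrapper type and the hasher name are hypothetical, and the hash function is just the second user's formula reused for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <unordered_set>

namespace app {
// A type this code owns, wrapping the std type instead of aliasing it.
struct IntSet {
    std::set<int> values;
    friend bool operator==(const IntSet& a, const IntSet& b) {
        return a.values == b.values;
    }
};

// Hasher for the owned type; nothing is added to namespace std.
struct IntSetHash {
    std::size_t operator()(const IntSet& s) const {
        std::size_t h = 0;
        for (int i : s.values) h += std::hash<int>()(i);
        return h;
    }
};

using IntSetCollection = std::unordered_set<IntSet, IntSetHash>;
}

inline void demo() {
    app::IntSetCollection c;
    c.insert(app::IntSet{{1, 2, 3}});
    c.insert(app::IntSet{{3, 2, 1}});  // same set, so not inserted again
    c.insert(app::IntSet{{4}});
    assert(c.size() == 2);
}
```

Another part of the program can define its own wrapper with a different hash, and the two no longer collide on the meaning of std::hash<std::set<int>>.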


