What Are Transparent Comparators

What are transparent comparators?

What problem does this solve,

See Dietmar's answer and remyabel's answer.

and does this change how standard containers work?

No, not by default.

The new member function template overloads of find etc. allow you to use a type that is comparable with the container's key, instead of using the key type itself. See N3465 by Joaquín Mª López Muñoz for rationale and a detailed, carefully written proposal to add this feature.

At the Bristol meeting the LWG agreed that the heteregeneous lookup feature was useful and desirable, but we could not be sure that Joaquín's proposal would be safe in all cases. The N3465 proposal would have caused serious problems for some programs (see the Impact on existing code section). Joaquín prepared an updated draft proposal with some alternative implementations with different trade-offs, which was very useful helping the LWG understand the pros and cons, but they all risked breaking some programs in some way so there was no consensus to add the feature. We decided that although it wouldn't be safe to add the feature unconditionally, it would be safe if it was disabled by default and only "opt in".

The key difference of the N3657 proposal (which was a last-minute revision by myself and STL based on N3465 and a later unpublished draft by Joaquín) was to add the is_transparent type as the protocol that can be used to opt in to the new functionality.

If you don't use a "transparent functor" (i.e. one that defines a is_transparent type) then the containers behave the same as they've always done, and that's still the default.

Iff you choose to use std::less<> (which is new for C++14) or another "transparent functor" type then you get the new functionality.

Using std::less<> is easy with alias templates:

template<typename T, typename Cmp = std::less<>, typename Alloc = std::allocator<T>>
using set = std::set<T, Cmp, Alloc>;

The name is_transparent comes from STL's N3421 which added the "diamond operators" to C++14. A "transparent functor" is one which accepts any argument types (which don't have to be the same) and simply forwards those arguments to another operator. Such a functor happens to be exactly what you want for heterogeneous lookup in associative containers, so the type is_transparent was added to all the diamond operators and used as the tag type to indicate the new functionality should be enabled in associative containers. Technically, the containers don't need a "transparent functor", just one that supports calling it with heterogeneous types (e.g. the pointer_comp type in https://stackoverflow.com/a/18940595/981959 is not transparent according to STL's definition, but defining pointer_comp::is_transparent allows it to be used to solve the problem). If you only ever lookup in your std::set<T, C> with keys of type T or int then C only needs to be callable with arguments of type T and int (in either order), it doesn't need to be truly transparent. We used that name partly because we couldn't come up with a better name (I would have preferred is_polymorphic because such functors use static polymorphism, but there's already a std::is_polymorphic type trait which refers to dynamic polymorphism).

map or set with transparent comparator and non-unique elements in heterogeneous sense

The explanatory text before the associative container requirements table says:

kl is a value such that a [sic] is partitioned ([alg.sorting])
with respect to c(r, kl), with r the key value of e and e in
a; ku is a value such that a is partitioned with respect to
!c(ku, r); ke is a value such that a is partitioned with respect
to c(r, ke) and !c(ke, r), with c(r, ke) implying !c(ke, r).

And then describes the behavior of a_tran.{find,count,equal_range}(ke), a_tran.lower_bound(kl) and a_tran.upper_bound(ku). Therefore, the requirements are:

  • For find, count, and equal_range:

    • The elements in the container must be partitioned with respect to c(r, ke) and !c(ke, r)
    • c(r, ke) must imply !c(ke, r)
  • For lower_bound, the elements in the container must be partitioned with respect to c(r, kl).
  • For upper_bound, the elements in the container must be partitioned with respect to !c(ku, r).

Provided that you meet those requirements, there's nothing wrong with using heterogeneous lookup with something that's equivalent to multiple keys in the container. The motivating example in the original proposal, after all, is about looking up everyone whose family name is "Smith" in a set of names.

Transparent comparator code minimization

You may do as following:

struct Foo {
std::string id;
};

struct FooComp {
using is_transparent = std::true_type;

template <typename LHS, typename RHS>
bool operator()(const LHS& lhs, const RHS& rhs) const
{
return ProjectAsId(lhs) < ProjectAsId(rhs);
}

private:
const std::string& ProjectAsId(const std::string& s) const { return s; }
const std::string& ProjectAsId(const Foo& foo) const { return foo.id; }
};

You write comparison once, but you have to write the projection for each type.

In C++17, it can even be

template <auto f> struct ProjLess
{
using is_transparent = std::true_type;

template <typename LHS, typename RHS>
bool operator()(const LHS& lhs, const RHS& rhs) const
{
return project(lhs) < project(rhs);
}

private:
template <typename T>
using f_t = decltype(std::invoke(f, std::declval<const T&>()));

template <typename T>
using is_f_callable = is_detected<f_t, T>;

template <typename T, std::enable_if_t<is_f_callable<T>::value>* = nullptr>
decltype(auto) project(const T& t) const { return std::invoke(f, t); }

template <typename T, std::enable_if_t<!is_f_callable<T>::value>* = nullptr>
const T& project(const T& t) const { return t; }
};

And usage:

std::set<Foo, ProjLess<&Foo::id>> s;

Demo with C++17

Using std::set with a key-based comparator

You can customize Cmp with a key policy. Minimal example:

template<class Key>
struct Compare_on {
Compare_on(Key key = Key()) : key_(key)
{}

template<class T>
bool operator()(const T& x, const T& y) const {
return key_(x) < key_(y);
}

private:
Key key_;
};

struct First3 {
std::string_view operator()(const std::string& s) const {
return std::string_view(s).substr(0, 3);
}
};

// Example:
std::set<std::string, Compare_on<First3>> set;
set.insert("abc1");
set.insert("abc2");

Demo


Compare_on can be improved by making it a transparent comparator:

template<class Key>
struct Compare_on {
using is_transparent = void;

Compare_on(Key key = Key()) : key_(key)
{}

template<class T1, class T2>
bool operator()(const T1& x, const T2& y) const {
return key_(x) < key_(y);
}

private:
Key key_;
};

struct First3 {
template<class T>
std::string_view operator()(const T& s) const {
return std::string_view(s).substr(0, 3);
}
};

Now when we do

auto pos = set.find("abc");

no temporary std::string will be constructed for the string literal "abc".

Demo 2

Why is there no std::is_transparent equivalent for unordered containers?

Keys that compare equal should produce the same hash value. Decoupling the hash function and the predicate, and at the same time making one or both heterogeneous, could be too much error prone.

Recent paper, P0919r2, brings up the following example:

std::hash<long>{}(-1L) == 18446744073709551615ULL
std::hash<double>{}(-1.0) == 11078049357879903929ULL

Although -1L and -1.0 compare equal, some heterogeneous hash function, not in line with the selected equality comparison logic, could produce different values. The paper adds heterogeneous lookup-enabled function templates --
find, count, equal_­range, and contains -- but makes them available when the below requirements are met [unord.req]/p17:

If the qualified-id Hash::transparent_­key_­equal is valid and denotes a type ([temp.deduct]), then the program is ill-formed if either:

  • qualified-id Hash::transparent_­key_­equal::is_­transparent is not valid or does not denote a type, or
  • Pred is a different type than equal_­to<Key> or Hash::transparent_­key_­equal.

The member function templates find, count, equal_­range, and contains shall not participate in overload resolution unless the qualified-id Hash::transparent_­key_equal is valid and denotes a type ([temp.deduct]).

In such a case, Hash::transparent_­key_­equal overwrites the default predicate (std::equal_to<Key>) and is used for (transparent) equality checking, together with Hash itself for (transparent) hashing.

Under these conditions, the below transparent function objects could be used to enable heterogeneous lookup:

struct string_equal
{
using is_transparent = void;

bool operator()(const std::string& l, const std::string& r) const
{
return l.compare(r) == 0;
}

bool operator()(const std::string& l, const char* r) const
{
return l.compare(r) == 0;
}

bool operator()(const char* l, const std::string& r) const
{
return r.compare(l) == 0;
}
};

struct string_hash
{
using transparent_key_equal = string_equal; // or std::equal_to<>

std::size_t operator()(const std::string& s) const
{
return s.size();
}

std::size_t operator()(const char* s) const
{
return std::strlen(s);
}
};

Both -- string_equal and std::equal_to<> -- are transparent comparators and can be used as transparent_key_equal for string_hash.

Having this type alias (or a type definition itself) within the hash function class definition makes it clear that it is a valid predicate that works fine with that particular hashing logic and the two can't diverge. Such an unordered set can be declared as:

std::unordered_set<std::string, string_hash> u;

or:

std::unordered_set<std::string, string_hash, string_hash::transparent_key_equal> u;

Either will use string_hash and string_equal.

Comparator for matching point in a range

find returns an element that compares equivalent to the argument. Equivalent means that it compares neither larger nor smaller in the strict weak ordering provided to the std::set.

Therefore, to make your use case work, you want all points in a range to compare equivalent to the range.

If two ranges overlap, then the points shared by the two ranges need to compare equivalent to both ranges. The priority doesn't matter for this, since the equivalence should presumably hold if only one of the ranges is present.

However, one of the defining properties of a strict weak ordering is that the property of comparing equivalent is transitive. Therefore in this ordering the two ranges must then also compare equal in order to satisfy the requirements of std::set.

Therefore, as long as the possible ranges are not completely separated, the only valid strict weak ordering is the one that compares all ranges and points equivalent.

This is however not an order that would give you what you want.

This analysis holds for all standard library associative containers, since they have the same requirements on the ordering.

Why no transparent C++1x std::map::at?

My guess is that std::map::at() must be a "bounds-checked" version of std::map::operator[](). Providing a transparent version of std::map::operator[]() imposes an additional requirement on std::map::key_type and the query key type K - if the query key is not in the map, it must be inserted (with default constructed value), which means that std::map::key_type must be constructible from the the query key type.



Related Topics



Leave a reply



Submit