Is Gcc Std::Unordered_Map Implementation Slow? If So - Why

Is gcc std::unordered_map implementation slow? If so - why?

I found the reason: it is a Problem of gcc-4.7!!

With gcc-4.7

inserts: 37728
get    : 2985

With gcc-4.6

inserts: 2531
get    : 1565

So std::unordered_map in gcc-4.7 is broken (or my installation, which is an installation of gcc-4.7.0 on Ubuntu - and another installation which is gcc 4.7.1 on debian testing).

I will submit a bug report.. until then: DO NOT use std::unordered_map with gcc 4.7!

Why is std::unordered_map slow, and can I use it more effectively to alleviate that?

The standard library's maps are, indeed, inherently slow (std::map especially but std::unoredered_map as well). Google's Chandler Carruth explains this in his CppCon 2014 talk; in a nutshell: std::unordered_map is cache-unfriendly because it uses linked lists as buckets.

This SO question mentioned some efficient hash map implementations - use one of those instead.

Slow compilation speed for const unordered_map as global variable

Change

#include "map_content.h"
inline const std::unordered_map<std::string, Entry_t> squaredampl{MAP_CONTENT};
#undef MAP_CONTENT

extern const std::unordered_map<std::string, Entry_t> squaredampl;

and in a new .cpp file:

#include "correspondance.h"
#include "map_content.h"
const std::unordered_map<std::string, Entry_t> squaredampl{MAP_CONTENT};
#undef MAP_CONTENT

This will cause only one definition of the map to be emitted (in contrast to inline which will emit it in every translation unit odr-using it).

It is safe as long as the map isn't used during the construction of another static storage duration object before main is entered. If that is required, it would be preferable to make the map a local static variable in a function returning it by-reference to be used by callers. This would avoid the "static initialization order fiasco".

It might also help to make a function to initialize the map with an insert statement for each entry instead of doing it all in the initializer, i.e. have a function initSquaredampl returning a std::unordered_map<std::string, Entry_t> by-value and then:

auto initSquaredampl() {
    std::unordered_map<std::string, Entry_t> squaredampl;
    // Insert items one-by-one
    return squaredampl;
}

const std::unordered_map<std::string, Entry_t> squaredampl = initSquaredampl();

This may help, because a compiler might not be optimized for compilation of very long expressions or initializers. However, they may also have trouble with very long functions.

Will std::unordered_map::clear() be slower than std::map::clear() because clear operations are carried on in the former?

When will std::map::clear() going to be faster that std::unordered_map::clear()?

Both have linear asymptotic complexity. You can time your code using both to see if either is measurably faster.

Note that if the destructor of the element is non-trivial, then clear of all containers has linear complexity. If the destructor is trivial, then the only standard containers that have less than linear complexity of clear are std::vector and std::basic_string.

Regarding the case of:

grow container large
clear
insert few
clear

The cost of latter clear being high that applies to std::unordered_map is not a problem that reproduces with std::map.

How std::unordered_map is implemented

The Standard effectively mandates that implementations of std::unordered_set and std::unordered_map - and their "multi" brethren - use open hashing aka separate chaining, which means an array of buckets, each of which holds the head of a linked list†. That requirement is subtle: it is a consequence of:

the default max_load_factor() being 1.0 (which means the table will resize whenever size() would otherwise exceed 1.0 times the bucket_count(), and
the guarantee that the table will not be rehashed unless grown beyond that load factor.

That would be impractical without chaining, as the collisions with the other main category of hash table implementation - closed hashing aka open addressing - become overwhelming as the load_factor()](https://en.cppreference.com/w/cpp/container/unordered_map/load_factor) approaches 1.

References:

23.2.5/15: The insert and emplace members shall not affect the validity of iterators if (N+n) < z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container’s bucket count, and z is the container’s maximum load factor.
amongst the Effects of the constructor at 23.5.4.2/1: max_load_factor() returns 1.0.

† To allow optimal iteration without passing over any empty buckets, GCC's implementation fills the buckets with iterators into a single singly-linked list holding all the values: the iterators point to the element immediately before that bucket's elements, so the next pointer there can be rewired if erasing the bucket's last value.

Regarding the text you quote:

No, that is not at all the most efficient way to implement a hash map for most common uses. Unfortunately, a small "oversight" in the specification of unordered_map all but requires this behavior. The required behavior is that iterators to elements must stay valid when inserting or deleting other elements

There is no "oversight"... what was done was very deliberate and done with full awareness. It's true that other compromises could have been struck, but the open hashing / chaining approach is a reasonable compromise for general use, that copes reasonably elegantly with collisions from mediocre hash functions, isn't too wasteful with small or large key/value types, and handles arbitrarily-many insert/erase pairs without gradually degrading performance the way many closed hashing implementations do.

As evidence of the awareness, from Matthew Austern's proposal here:

I'm not aware of any satisfactory implementation of open addressing in a generic framework. Open addressing presents a number of problems:
• It's necessary to distinguish between a vacant position and an occupied one.
• It's necessary either to restrict the hash table to types with a default constructor, and to construct every array element ahead of time, or else to maintain an array some of whose elements are objects and others of which are raw memory.
• Open addressing makes collision management difficult: if you're inserting an element whose hash code maps to an already-occupied location, you need a policy that tells you where to try next. This is a solved problem, but the best known solutions are complicated.
• Collision management is especially complicated when erasing elements is allowed. (See Knuth for a discussion.) A container class for the standard library ought to allow erasure.
• Collision management schemes for open addressing tend to assume a fixed size array that can hold up to N elements. A container class for the standard library ought to be able to grow as necessary when new elements are inserted, up to the limit of available memory.
Solving these problems could be an interesting research project, but, in the absence of implementation experience in the context of C++, it would be inappropriate to standardize an open-addressing container class.

Specifically for insert-only tables with data small enough to store directly in the buckets, a convenient sentinel value for unused buckets, and a good hash function, a closed hashing approach may be roughly an order of magnitude faster and use dramatically less memory, but that's not general purpose.

A full comparison and elaboration of hash table design options and their implications is off topic for S.O. as it's way too broad to address properly here.

Is there any possibility that std::unordered_map collides?

Is there any possibility that std::unordered_map collides?

It isn't the container that has collisions, but the hash function that you provide for the container.

And yes, all hash functions - when their output range is smaller than the input domain - have collisions.

is there any possibility that it prints ERROR!?

No, it isn't possible. It's completely normal for the hash function to put multiple values into a single bucket. That will happen in case of hash collision, but it also happens with different hash values. The lookup function will get the correct value using linear search. The identity of a key is determined by the equality function, not by the hash function.

Is Gcc Std::Unordered_Map Implementation Slow? If So - Why