Cost of Using Std::Map with Std::String Keys VS Int Keys

Cost of using std::map with std::string keys vs int keys?

In addition to the time complexity from comparing strings already mentioned, a string key will also cause an additional memory allocation each time an item is added to the container. In certain cases, e.g. highly parallel systems, a global allocator mutex can be a source of performance problems.

In general, you should choose the alternative that makes the most sense in your situation, and only optimize based on actual performance testing. It's notoriously hard to judge what will be a bottleneck.

Performance difference between mapstring,.. and mapint,..?

I'm assuming you meant map<string, unsigned short> for the second case, otherwise it won't even compile.

Both trigger an O(log n) lookup in the map based on comparisons of keys. Comparing 32 bit integers is generally faster than comparing strings, so the first case should be faster.

I wouldn't worry about it though, unless there's profiling data showing this to have a significant impact on performance. If you do that only when the player connects, and the session tends to last "long enough", chances are this would be an insignificant optimization - and even then, switching to unordered_map will probably be more significant than changing the type of the key.

What can bring a std::map to not find one of its keys?

You have a map of pointers to characters, not strings. The map lookup is based on the pointer value (address) and not the value of what's pointed at. In the first case, where "zero" is found in the map, you compiler has performed some string merging and is using one array of characters for both identical strings. This is not required by the language but is a common optimization. In the second case, when the string is not found, this merging has not been done (possibly your code here is in a different source module), so the address being used in the map is different from what was inserted and is then not found.

To fix this either store std::string objects in the map, or specify a comparison in your map declaration to order based on the strings and not the addresses.

Using an std::string as a key for an std::map

Error 24 error C2676: binary '<' : 'const std::string' does not define this operator or a conversion to a type acceptable to the predefined operator c:\program files\microsoft visual studio 10.0\vc\include\xfunctional 125 1 FXCMMarketDataServer

That's what VC spits into your face when you forgot to include <string>. That header definitely defines this operator.

Is there a way for std::map to give a counter of all keys = target_key in O(1) time?

This is called a rank query, and std::map does not provide any way of doing it in better than linear time.

It's true that you can find the iterator in logarithmic time but there is no way to get the distance between that iterator and begin() in better than linear time. In order for std::map to provide such functionality, it would have to keep a count of subtree size in each node of the red-black tree, which would slow down each update operation, including for users who don't care about doing rank queries.

If you do augment the tree with subtree sizes, you can do the rank query in logarithmic time. Boost.Intrusive can help (I imagine there are people who have already written the necessary libraries to do rank queries, but I can't find them on Google right now).

std::map with custom key

If you don't care about the specific order and just want to satisfy the requirements for sorting, a common and simple pattern is to use std::tie with all of the class members of the compared instances, and compare those results instead.

std::tie creates a std::tuple of references to the members, and std::tuple implements operator< which compares its elements (the members of the object in this case) lexicographically.

In your case, using member operator< :

bool operator<(const ParserKey & other) const
{
return std::tie(compno_, resno_, precinctIndex_) <
std::tie(other.compno_, other.resno_, other.precinctIndex_);
}

Live example https://godbolt.org/z/v433v54jz

Avoiding key construction for std::map::find()

Your concern is real, and there's no good workaround for C++11.

C++14 fixes this issue by adding a templated overload of std::map::find — the relevant proposal is N3657. In C++14, your program would look like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <map>
#include <algorithm>

class std_string {
char *m_s;
public:
std_string() { m_s = nullptr; }
std_string(const char* s) { puts("Oops! A new std_string was constructed!"); m_s = strdup(s); }
~std_string() { free(m_s); }
std_string(std_string&& ss) = delete;
std_string(const std_string& ss) = delete;
std_string& operator=(std_string&& ss) = delete;
std_string& operator=(const std_string& ss) = delete;

bool operator< (const char* s) const { return strcmp(m_s, s) < 0; }
bool operator< (const std_string& ss) const { return strcmp(m_s, ss.m_s) < 0; }
friend bool operator< (const char* s, const std_string& ss) { return strcmp(s, ss.m_s) < 0; }
};

int main()
{
{
puts("The C++11 way makes a copy...");
std::map<std_string, int> m;
auto it = m.find("Olaf");
}
{
puts("The C++14 way doesn't...");
std::map<std_string, int, std::less<>> m;
auto it = m.find("Olaf");
}
}

(std::less<> is the generalized "less-than" comparator, equivalent to operator<. C++03 and C++11 have a broken-by-design version of this comparator that forces both arguments to be the same type. C++14 finally does it right.)

Unfortunately the Committee seems to have decided that people should go through all their C++11 code and update every container to use std::less<> as the comparator — it doesn't just happen by default. There's no good reason for this decision; it's just The Way It Is. (See my comments above about broken-by-design. C++ has a bad habit of introducing broken versions of things before introducing the "real" versions several years later.)

For C++11, std::map::find has only one overload (the one that takes const Key&), so any workaround will necessarily involve changing the Key type to be less expensive — we can't just fiddle with the comparator, because by the time execution reaches the comparator, we've already promoted find's argument to the Key type.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <map>
#include <algorithm>

class std_string {
char *m_s;
public:
std_string() : m_s(nullptr) { }
std_string(const char* s) : m_s(strdup(s)) { puts("Oops! A new std_string was constructed!"); }
~std_string() { free(m_s); }
std_string(std_string&& ss) : m_s(nullptr) { std::swap(m_s, ss.m_s); }
std_string(const std_string& ss) : m_s(strdup(ss.data())) { puts("Oops! A new std_string was constructed!"); }
std_string& operator=(std_string&& ss) = delete;
std_string& operator=(const std_string& ss) = delete;

const char* data() const { return m_s; }

bool operator< (const char* s) const { return strcmp(m_s, s) < 0; }
bool operator< (const std_string& ss) const { return strcmp(m_s, ss.m_s) < 0; }
friend bool operator< (const char* s, const std_string& ss) { return strcmp(s, ss.m_s) < 0; }
};

struct string_or_ptr {
union {
const char* ptr;
alignas(std_string) unsigned char str[sizeof (std_string)];
} m_u;
bool m_deep;

char const* & ptr() { return m_u.ptr; }
std_string& str() { return *reinterpret_cast<std_string*>(m_u.str); }
char const* const & ptr() const { return m_u.ptr; }
std_string const& str() const { return *reinterpret_cast<const std_string*>(m_u.str); }

string_or_ptr() : m_deep(false) { ptr() = ""; }
string_or_ptr(const char* s) : m_deep(false) { ptr() = s; }
string_or_ptr(std_string&& s) : m_deep(true) { new ((void*)&str()) std_string(std::move(s)); }
string_or_ptr(const std_string& s) : m_deep(true) { new ((void*)&str()) std_string(s); }
~string_or_ptr() { if (m_deep) str().~std_string(); }
std_string& operator=(std_string&& ss) = delete;
std_string& operator=(const std_string& ss) = delete;

operator const char*() const { return m_deep ? str().data() : ptr(); }

bool operator< (const char* s) const { return strcmp((const char*)*this, s) < 0; }
bool operator< (const std_string& ss) const { return (const char*)*this < ss; }
bool operator< (const string_or_ptr& sp) const { return strcmp((const char*)*this, (const char*)sp) < 0; }
friend bool operator< (const char* s, const string_or_ptr& sp) { return strcmp(s, (const char*)sp) < 0; }
friend bool operator< (const std_string& ss, const string_or_ptr& sp) { return ss < (const char*)sp; }
};

int main()
{
{
puts("The C++11 way...");
std::map<std_string, int> m;
auto it = m.find("Olaf");
}
{
puts("The C++11 way with a custom string-or-pointer Key type...");
std::map<string_or_ptr, int> m;
auto it = m.find("Olaf");
}
}


Related Topics



Leave a reply



Submit