Implementing a HashMap in C
Well if you know the basics behind them, it shouldn't be too hard.
Generally you create an array called "buckets" that contain the key and value, with an optional pointer to create a linked list.
When you access the hash table with a key, you process the key with a custom hash function which will return an integer. You then take the modulus of the result and that is the location of your array index or "bucket". Then you check the unhashed key with the stored key, and if it matches, then you found the right place.
Otherwise, you've had a "collision" and must crawl through the linked list and compare keys until you match. (note some implementations use a binary tree instead of linked list for collisions).
Check out this fast hash table implementation:
https://attractivechaos.wordpress.com/2009/09/29/khash-h/
What is the best way to use a HashMap in C++?
The standard library includes the ordered and the unordered map (std::map
and std::unordered_map
) containers. In an ordered map the elements are sorted by the key, insert and access is in O(log n). Usually the standard library internally uses red black trees for ordered maps. But this is just an implementation detail. In an unordered map insert and access is in O(1). It is just another name for a hashtable.
An example with (ordered) std::map
:
#include <map>
#include <iostream>
#include <cassert>
int main(int argc, char **argv)
{
std::map<std::string, int> m;
m["hello"] = 23;
// check if key is present
if (m.find("world") != m.end())
std::cout << "map contains key world!\n";
// retrieve
std::cout << m["hello"] << '\n';
std::map<std::string, int>::iterator i = m.find("hello");
assert(i != m.end());
std::cout << "Key: " << i->first << " Value: " << i->second << '\n';
return 0;
}
Output:
23
Key: hello Value: 23
If you need ordering in your container and are fine with the O(log n) runtime then just use std::map
.
Otherwise, if you really need a hash-table (O(1) insert/access), check out std::unordered_map
, which has a similar to std::map
API (e.g. in the above example you just have to search and replace map
with unordered_map
).
The unordered_map
container was introduced with the C++11 standard revision. Thus, depending on your compiler, you have to enable C++11 features (e.g. when using GCC 4.8 you have to add -std=c++11
to the CXXFLAGS).
Even before the C++11 release GCC supported unordered_map
- in the namespace std::tr1
. Thus, for old GCC compilers you can try to use it like this:
#include <tr1/unordered_map>
std::tr1::unordered_map<std::string, int> m;
It is also part of boost, i.e. you can use the corresponding boost-header for better portability.
implement hash map in c
Creating a hash map requires a bit of work, but is definitely an interesting thing to do.
The general approach to build a hash map is to create an array called buckets that contains keys and values. However, this structure is not able to deal with collisions. A collision between two values may arise when different keys are assigned by the hash function to the same value.
To address this problem a third field, usually a pointer, is added to the bucket. When a collision arises, the new value is added in the data structure pointed at by the third field, usually a linked list or binary tree. For more information about how to resolve collisions, read this link.
An example of the Hash map Structure described above (note the collision at index 153):
To access the hash table, a custom hash function, which returns an integer which identifies the array index location, is applied to the key. Finally, you check if the key used to access the element and the key stored in the array at the index returned by the hash function match. If they do, you have found the right element.
This is just an example; you can find different ways to implement a hash map.
In addition, this question has already been asked here.
Why are there no hashtables in the C standard library?
There is no hashtable in the standard C library because either:
- no-one has submitted a proposal to the working group; or
- the working group has deemed it unnecessary.
That's the way ISO works. Proposals are put forward and accepted or rejected.
You have to be careful what you add to the standard library since you have two conflicting groups. As a user, you might want every data structure under the sun to be added to the standard to make the language more useful.
But, as a language implementor (as an aside, these are probably the people that tend to make up most of the various working groups so their view is likely to have more impact), you don't really want the hassle of having to implement stuff that may not be used by everyone. All the stuff that was there when C89 appeared was to do with the fact that the primary purpose was to codify existing practice rather than introduce new practices. All iterations of the standards since then have been a little freer in what they can do but backwards compatibility is still an important issue.
Myself, I also have conflicts. I'd love to have all the features of the Java, C++ or Python libraries at my disposal in C. Of course, that would make it so much harder to learn everything for newcomers and, as one commenter stated, probably make it so any code monkey can pump out useful code, reducing my value in the process :-)
And I pretty much have all the data structures I'll ever need, from my long and (mostly) illustrious career. You're not limited to the standard library for this sort of stuff. There are plenty of third-party tools you can get to do the job and (like me) you can also roll your own.
If you want to know why certain decisions were made in each iteration, ISO (and ANSI originally, before ISO took over) usually publish rationale documents. The C89 one from ANSI can be found here. It contains this little beauty in the scope:
This Rationale focuses primarily on additions, clarifications, and changes made to the language as described in the Base Documents. It is not a rationale for the C language as a whole: the Committee was charged with codifying an existing language, not designing a new one. No attempt is made in this Rationale to defend the pre-existing syntax of the language, such as the syntax of declarations or the binding of operators.
I especially enjoy the admission that they're not responsible for any unholy mess that may have predated their attempts to standardise.
But, perhaps the real answer to your question lies in this bit, one of the guiding principles:
Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. Some of the facets of the spirit of C can be summarized in phrases like:
- Trust the programmer.
- Don't prevent the programmer from doing what needs to be done.
- Keep the language small and simple.
- Provide only one way to do an operation.
- Make it fast, even if it is not guaranteed to be portable.
That third one is probably the main reason why the library wasn't massively expanded with the initial standardisation effort - that, and the fact that such an expansion from a committee would probably have resulted in ANSI C being labeled C2038 rather than C89.
C - How to assign multiple values to the same key in a hashmap?
This may not be the best possible solution, but thanks to the above comments I got something working. I am declaring multiple values as pointers to char
. Please let me know if I can improve this.
I am yet to implement collision correction.
struct map { /* creating a map structure */
struct map *next;
char *KEY;
char *value1;
char *value2;
char *value3;
};
I then pass these to the insert function.
struct map *insert(char *KEY, char *value1, char *value2, char *value3) {
/* install: put (name, defn) in */
/* Install uses lookup to determine whether the KEY being installed
is already present. Proceeds to create a new entry or update*/
struct map *np;
unsigned hashval;
if ((np = lookup(KEY)) == NULL) {
np = (struct map *) malloc(sizeof(*np));
if (np == NULL || (np->KEY = strdup(KEY)) == NULL) {
return NULL;
}
hashval = hash(KEY);
np->next = hashtab[hashval];
hashtab[hashval] = np;
}
else {
free((void *)np->value1); //type cast cannot convert from 'Value' to 'Value*'
free((void *)np->value2);
free((void *)np->value3);
}
if ((np->time = strdup(value1)) == NULL) { //map has no field value
return NULL;
}
if ((np->description = strdup(value2)) == NULL) { //map has no field value
return NULL;
}
if ((np->duration = strdup(value3)) == NULL) { //map has no field value
return NULL;
}
return np;
}
Printing these will give the key pointing the the four values.
key1->value1 value2 value3
Related Topics
Boost Zip_Iterator and Std::Sort
Can't C++ Pod Type Have Any Constructor
Reset Cuda Context After Exception
Why Using the Const Keyword Before and After Method or Function Name
Why Can a T* Be Passed in Register, But a Unique_Ptr<T> Cannot
How to Use the Mingw Gdb Debugger to Debug a C++ Program in Windows
What Exactly Is an 'Aligned Pointer'
Correct Usage(S) of Const_Cast<>
Xxxxxx.Exe Is Not a Valid Win32 Application
With a Private Modifier, Why Can the Member in Other Objects Be Accessed Directly
In C++ What Causes an Assignment to Evaluate as True or False When Used in a Control Structure
Assertion Failed (Size.Width>0 && Size.Height>0)
How to Use Nested Loops with Vectors in Cpp
Why Do We Require Requires Requires
How to Play or Open *.Mp3 or *.Wav Sound File in C++ Program