Define: What Is a HashSet?

  1. A HashSet holds a set of objects in a way that lets you quickly determine whether an object is already in the set. It does so by internally managing an array and storing each object at an index calculated from the object's hash code.

  2. HashSet is an unordered collection containing unique elements. It has the standard collection operations Add, Remove, and Contains, and because it uses a hash-based implementation these operations are O(1) on average. (By contrast, Contains and Remove on a List are O(n).) HashSet also provides standard set operations such as union, intersection, and symmetric difference.

  3. There are different implementations of sets. Some make insertion and lookup very fast by hashing elements. However, that means the order in which the elements were added is lost. Other implementations preserve insertion order at the cost of slower running times.

The HashSet class in C# takes the first approach, so it does not preserve the order of elements. In exchange, membership lookups are much faster than on a regular List. Some basic benchmarks showed that HashSet is decently faster when dealing with primitive types (int, double, bool, etc.) and a lot faster when working with class objects. So the point is that HashSet is fast.

The only catch of HashSet is that there is no access by index. To access elements you can either enumerate the set (for example with foreach) or call LINQ's ToList() to copy the HashSet into a List and iterate through that.
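The points above can be sketched in a few lines. This is only an illustration: the class name HashSetBasics and its method names are made up for the example, while Add, UnionWith, and ToList are the standard HashSet<T> and LINQ members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, named for this example only.
public static class HashSetBasics
{
    // Builds a set from the input; duplicates are silently dropped.
    public static HashSet<int> Distinct(IEnumerable<int> values)
    {
        var set = new HashSet<int>();
        foreach (var v in values)
        {
            set.Add(v); // returns false (and changes nothing) if v is already present
        }
        return set;
    }

    // Set union that leaves both inputs untouched.
    public static HashSet<int> Union(HashSet<int> a, HashSet<int> b)
    {
        var result = new HashSet<int>(a); // copy so the inputs are not modified
        result.UnionWith(b);
        return result;
    }

    // A HashSet has no indexer, so copy it to a List when positions are needed.
    // The order of the resulting list is not something to rely on.
    public static List<int> ToIndexableList(HashSet<int> set)
    {
        return set.ToList();
    }
}
```

Calling HashSetBasics.Distinct(new[] { 3, 1, 3, 2, 1 }) gives a set with three members, and Contains on that set is a hash lookup rather than a scan.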

Why would I use a HashSet over a Dictionary?

Dictionary is not better than HashSet, it's just different.

  • You use a HashSet when you want to store an unordered collection of items, and
  • You use a Dictionary when you want to associate a set of items called "keys" with another collection of items called "values"

One could think of a HashSet as a Dictionary with no associated values (in fact, HashSet is sometimes implemented using a Dictionary behind the scenes), but it is not necessary to think about it this way: thinking of the two as entirely different things works fine, too.

In your case you could potentially improve performance by making a dictionary by actor, like this:

Dictionary<ActorType,List<CachedPath>> _cachedPathsByActor;

This way your linear search would quickly choose a sub-list based on an actor, and then search linearly by target:

var cached = _cachedPathsByActor[self].FirstOrDefault(p => p.From == from && p.To == target);

or by making an equality comparer that considers all three items, and using a Dictionary with CachedPath as both keys and values, and that custom IEqualityComparer<T> as the key comparer:

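class CachedPathEqualityComparer : IEqualityComparer<CachedPath> {
    public bool Equals(CachedPath a, CachedPath b) {
        return a.Actor == b.Actor
            && a.From == b.From
            && a.To == b.To;
    }
    public int GetHashCode(CachedPath p) {
        // unchecked: let the multiplication wrap instead of throwing in a checked context
        unchecked {
            return 31*31*p.Actor.GetHashCode() + 31*p.From.GetHashCode() + p.To.GetHashCode();
        }
    }
}
...
var _cachedPaths = new Dictionary<CachedPath,CachedPath>(new CachedPathEqualityComparer());
...
// The lookup key must itself be a CachedPath; 'probe' stands for one carrying
// the Actor, From, and To being searched for (its construction is not shown).
CachedPath cached;
if (_cachedPaths.TryGetValue(probe, out cached)) {
    ...
}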

However, this approach assumes that there would be at most one item in the dictionary with identical From, To, and Actor.

When should I use the HashSet<T> type?

The important thing about HashSet<T> is right there in the name: it's a set. The only things you can do with a single set are to establish what its members are and to check whether an item is a member.

Asking if you can retrieve a single element (e.g. set[45]) is misunderstanding the concept of the set. There's no such thing as the 45th element of a set. Items in a set have no ordering. The sets {1, 2, 3} and {2, 3, 1} are identical in every respect because they have the same membership, and membership is all that matters.

It's somewhat dangerous to iterate over a HashSet<T> because doing so imposes an order on the items in the set. That order is not really a property of the set. You should not rely on it. If ordering of the items in a collection is important to you, that collection isn't a set.
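To make the "membership is all that matters" point concrete, here is a minimal sketch using HashSet<T>.SetEquals, which compares membership and ignores both order and duplicates in the other sequence. The class and method names (SetMembership, SameMembers) are invented for the example.

```csharp
using System.Collections.Generic;

// Hypothetical helper, named for this example only.
public static class SetMembership
{
    // True when both sequences contain exactly the same distinct values,
    // regardless of the order in which they are listed.
    public static bool SameMembers(IEnumerable<int> first, IEnumerable<int> second)
    {
        var set = new HashSet<int>(first);
        return set.SetEquals(second);
    }
}
```

With this, SameMembers(new[] { 1, 2, 3 }, new[] { 2, 3, 1 }) is true, which matches the claim that {1, 2, 3} and {2, 3, 1} are the same set.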

Sets are really limited: they hold unique members and answer membership questions, and that is about it. On the other hand, they're really fast.

Define custom HashSet in Scala?

The short answer is: the Scala way is the same as the Java way. That said, case classes can help you by automatically generating valid hashCode and equals methods (and toString and apply, for that matter). It goes like this:

import scala.collection.mutable

case class MyClass(myInt: Int, myString: String)
val hSet = new mutable.HashSet[MyClass]()
hSet += MyClass(2, "foo")

Note that HashSets have type parameters, just as in Java. For immutable sets the syntax varies a little, but coming from Java this will feel easier.

Also, when you define an object you are basically defining a class with a single instance (a singleton pattern). This is clearly not what you want here.

Define a hashset with customized key

a) Assuming you don't want to compare the objects by their references, you should override the GetHashCode and Equals methods of MyClass:

HashSet<MyClass> myHash = new HashSet<MyClass>();
MyClass m1 = new MyClass("1", "123");
MyClass m2 = new MyClass("1", "123");
myHash.Add(m1);
bool b = myHash.Contains(m2); // true: m1 and m2 are equal by value

public class MyClass
{
    private string m_x;
    private string m_y;

    public MyClass(string x, string y)
    {
        m_x = x;
        m_y = y;
    }

    public override int GetHashCode()
    {
        return m_x.GetHashCode() ^ m_y.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj)) return true;
        // 'as' yields null when obj is null or not a MyClass.
        var other = obj as MyClass;
        if (other == null) return false;
        return m_x == other.m_x && m_y == other.m_y;
    }
}

b) You can also use an IEqualityComparer to compare your objects, but in this case the comparer needs access to the members it compares, so they must be public (fields here, though properties work too):

public class MyClass
{
    public string m_x;
    public string m_y;

    public MyClass(string x, string y)
    {
        m_x = x;
        m_y = y;
    }
}

public class MyEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass x, MyClass y)
    {
        return x.m_x == y.m_x && x.m_y == y.m_y;
    }

    public int GetHashCode(MyClass obj)
    {
        return obj.m_x.GetHashCode() ^ obj.m_y.GetHashCode();
    }
}

Now you only need to pass the comparer to the HashSet's constructor:

HashSet<MyClass> myHash = new HashSet<MyClass>(new MyEqualityComparer());
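One thing worth knowing about XOR-based hash codes like the ones above: XOR is symmetric, so ("1", "123") and ("123", "1") hash to the same value, and two identical strings cancel to 0. That is still correct (Equals decides in the end), but it produces more collisions than necessary. A possible alternative, assuming a runtime that provides System.HashCode (.NET Core 2.1 or later), is sketched below. The Pair and PairComparer types are hypothetical stand-ins for MyClass and its comparer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value-like class used only for this sketch.
public sealed class Pair
{
    public string X { get; }
    public string Y { get; }

    public Pair(string x, string y)
    {
        X = x;
        Y = y;
    }
}

// Hypothetical comparer: equality by X and Y, with an order-sensitive hash.
public sealed class PairComparer : IEqualityComparer<Pair>
{
    public bool Equals(Pair a, Pair b)
    {
        if (ReferenceEquals(a, b)) return true;   // same instance, or both null
        if (a is null || b is null) return false; // exactly one is null
        return a.X == b.X && a.Y == b.Y;
    }

    public int GetHashCode(Pair p)
    {
        // HashCode.Combine mixes the values in order, so swapping X and Y
        // is not guaranteed to give the same hash the way XOR does.
        return HashCode.Combine(p.X, p.Y);
    }
}
```

Usage is the same as before: new HashSet<Pair>(new PairComparer()). Adding new Pair("1", "123") twice leaves one element, and Contains(new Pair("123", "1")) is false.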

When do we use HashSet

I want to know when to use HashSet<>, Dictionary<>, or List<>.

They all have different purposes and are used in different scenarios.

HashSet

Is used when you want a collection of unique elements. HashSet stores a set of unique elements and won't allow duplicates in it.

Dictionary

Is used when you want to look up a value by a unique key. Each element in a Dictionary has two parts: a (unique) key and a value. The keys are unique, just like the elements of a HashSet, and in addition you can store a value against each key.

List

Is just a simple collection of elements. You can have duplicates in it.
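As a rough illustration of the difference, the sketch below feeds the same words into each of the three collections. The class and method names (CollectionChoice, AsList, AsSet, CountOccurrences) are made up for the example.

```csharp
using System.Collections.Generic;

// Hypothetical helper, named for this example only.
public static class CollectionChoice
{
    // List: keeps every element, duplicates included, in insertion order.
    public static List<string> AsList(IEnumerable<string> words)
    {
        return new List<string>(words);
    }

    // HashSet: keeps each distinct element once.
    public static HashSet<string> AsSet(IEnumerable<string> words)
    {
        return new HashSet<string>(words);
    }

    // Dictionary: unique keys (the words), each associated with a value (its count).
    public static Dictionary<string, int> CountOccurrences(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var w in words)
        {
            counts.TryGetValue(w, out var n); // n is 0 when the key is missing
            counts[w] = n + 1;
        }
        return counts;
    }
}
```

For the input "a", "b", "a" the list has three elements, the set has two, and the dictionary maps "a" to 2 and "b" to 1.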

How can I create a constant HashSet in C#?

The best way to create a 'constant' Set is probably by exposing your HashSet as its IEnumerable interface, using the following:

public static readonly IEnumerable<string> fruits = new HashSet<string> { "Apples", "Oranges" };
  • public: everyone can access it.
  • static: there's only going to be one copy in memory, no matter how many instances of the parent class are created.
  • readonly: you can't re-assign it to a new value.
  • IEnumerable<>: you can only iterate through its contents, but not add/remove/modify (although a caller who knows the underlying type can still cast it back to HashSet<string>).

To search, you can use LINQ to call Contains() on your IEnumerable, and it is smart enough to know it's backed by a HashSet and delegate the call so that the hashed nature of your set is used. (Well, OK, it calls it via ICollection<T>, but ends up in HashSet's own Contains implementation anyway.)

Debug.WriteLine(fruits.Contains("Apples")); // True
Debug.WriteLine(fruits.Contains("Berries")); // False

fruits = new HashSet<string>(); // FAIL! readonly fields can't be re-assigned
fruits.Add("Grapes"); // FAIL! IEnumerables don't have Add()

How to look up an inner HashSet in a dictionary and return the key in C#?

You need to iterate over the dictionary, and get all the keys that point to a hashset that contains the value you're looking for:

var matches = dict.Where(kvp => kvp.Value.Contains("user3"));

Explanation:

You're asking for all the key-value pairs where the Value (which we know is a HashSet) contains the string you're looking for.

Update: to get just the keys from the key-value pairs, I believe you can do this:

var matches = dict.Where(kvp => kvp.Value.Contains("user3")).Select(kvp => kvp.Key);

Further musing: if your use cases will always prioritize finding by user, perhaps you should invert the dictionary: have the user name be the key, and the value would be the groups the user belongs to. That way, finding all groups for a user will be O(1).
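The inversion suggested above could look roughly like this. It assumes the original dictionary maps a group name to a HashSet<string> of user names, which is a guess at the shape of dict. GroupIndex and InvertByUser are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical helper, named for this example only.
public static class GroupIndex
{
    // Turns group -> users into user -> groups, so that finding all groups
    // for a user becomes a single dictionary lookup.
    public static Dictionary<string, HashSet<string>> InvertByUser(
        Dictionary<string, HashSet<string>> usersByGroup)
    {
        var groupsByUser = new Dictionary<string, HashSet<string>>();
        foreach (var kvp in usersByGroup)
        {
            foreach (var user in kvp.Value)
            {
                // Create the user's group set the first time the user is seen.
                if (!groupsByUser.TryGetValue(user, out var groups))
                {
                    groups = new HashSet<string>();
                    groupsByUser[user] = groups;
                }
                groups.Add(kvp.Key);
            }
        }
        return groupsByUser;
    }
}
```

The trade-off is that the inverted dictionary has to be rebuilt or kept in sync whenever group membership changes, so it pays off mainly when lookups by user dominate.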

C#: more elegant way to build a HashSet from the OrderedDictionary.Keys?

Sure - use the constructor, casting the Keys property sequence accordingly:

var hs = new HashSet<string>(d.Keys.Cast<string>());

(As ever with LINQ, make sure you have a using directive for the System.Linq namespace.)
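For completeness, here is the one-liner wrapped in a small self-contained sketch. OrderedKeys and KeySet are made-up names, and d is assumed to be a System.Collections.Specialized.OrderedDictionary whose keys are all strings.

```csharp
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;

// Hypothetical helper, named for this example only.
public static class OrderedKeys
{
    // OrderedDictionary.Keys is a non-generic ICollection, so the keys must be
    // cast before they can be handed to the HashSet<string> constructor.
    // Cast<string>() throws InvalidCastException if any key is not a string;
    // OfType<string>() would skip such keys instead.
    public static HashSet<string> KeySet(OrderedDictionary d)
    {
        return new HashSet<string>(d.Keys.Cast<string>());
    }
}
```

Note that the resulting HashSet no longer carries the insertion order that the OrderedDictionary maintained.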


