.NET Data Structures: ArrayList, List, Hashtable, Dictionary, SortedList, SortedDictionary -- Speed, Memory, and When to Use Each

Off the top of my head:

  • Array - represents an old-school memory array; System.Array is the base class of every normal type[] array. Can enumerate. Can't grow automatically. Reading or writing an element by index is O(1), so it is very fast.

  • ArrayList - automatically growing array. Adds more overhead. Can enumerate. Probably slower than a normal array but still pretty fast. These are used a lot in older, pre-generics .NET code.

  • List - one of my favs - can be used with generics, so you can have a strongly typed array, e.g. List<string>. Other than that, acts very much like ArrayList

  • Hashtable - plain old hashtable. Lookup is O(1) on average and O(n) in the worst case. Can enumerate the Keys and Values properties, or the key/value pairs themselves.

  • Dictionary - same as above only strongly typed via generics, such as Dictionary<string, string>

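  • SortedList - a generic collection of key/value pairs kept sorted by key (despite the name, it behaves more like a sorted dictionary than a list). Slower on insertion, since it has to find the right position and shift the elements after it (O(n) in general). Can enumerate, in key order. Retrieval by key is a binary search (O(log n)) and never needs a re-sort. Removal is O(n), because the following elements have to be shifted down.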
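
As a rough illustration of the list above, here is a minimal sketch that puts an array, a List<T>, a Dictionary<TKey, TValue> and a SortedList<TKey, TValue> side by side. The class name CollectionTour, the helper SortedKeys and the sample data are made up for this example, and StringComparer.Ordinal is passed only so that the sort order does not depend on the current culture.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionTour
{
    // Hypothetical helper: loads the pairs into a SortedList and returns
    // the keys in the order the SortedList enumerates them (sorted by key).
    public static List<string> SortedKeys(IEnumerable<KeyValuePair<string, int>> items)
    {
        var sorted = new SortedList<string, int>(StringComparer.Ordinal);
        foreach (KeyValuePair<string, int> item in items)
        {
            sorted.Add(item.Key, item.Value); // throws ArgumentException on a duplicate key
        }
        return new List<string>(sorted.Keys);
    }

    public static void Main()
    {
        // Array: the length is fixed when the array is created.
        int[] scores = new int[3];
        scores[0] = 42;

        // List<T>: strongly typed and grows automatically.
        var names = new List<string>();
        names.Add("carol");
        names.Add("alice");
        names.Add("bob");

        // Dictionary<TKey, TValue>: lookup by key, no ordering guarantee.
        var ages = new Dictionary<string, int>();
        ages["carol"] = 41;
        ages["alice"] = 30;
        ages["bob"] = 25;

        // SortedList<TKey, TValue>: the same pairs, enumerated in key order.
        Console.WriteLine(string.Join(", ", SortedKeys(ages))); // alice, bob, carol
    }
}
```

The Dictionary is handed to SortedKeys as a plain sequence of pairs, which shows that the unordered pairs come back out of the SortedList in key order.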

I tend to use List and Dictionary all the time - once you start using them strongly typed with generics, it's really hard to go back to the standard non-generic ones.

There are lots of other data structures too - there's KeyValuePair, which you can use to do some interesting things, and there's SortedDictionary, which keeps its keys sorted like SortedList but has faster insertion and removal (O(log n)).

Dictionary vs ArrayList

You should actually not use ArrayList at all, as you have the strongly typed List<T> to use.

Which you use depends on how you need to access the data. The List stores a sequential list of items, while a Dictionary stores items identified by a key. (You can still read the items from the Dictionary sequentially, but the order is not guaranteed.)

The performance is pretty much the same: both use arrays internally to store the actual data. When they reach their capacity, they allocate a new, larger array and copy the data to it. If you know how large the collection will get, you should specify the capacity when you create it, so that it doesn't have to resize itself.
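
To make the resizing visible, here is a small sketch that counts how often a List<int> changes its Capacity while items are added. The class CapacityDemo and the method CountResizes are hypothetical names for this example; watching the Capacity property is only a convenient way to observe reallocations, and the exact growth steps are an implementation detail that may differ between runtimes.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Creates a list with the given initial capacity, adds itemCount items,
    // and returns how many times the Capacity property changed on the way.
    public static int CountResizes(int initialCapacity, int itemCount)
    {
        var list = new List<int>(initialCapacity);
        int resizes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }

    public static void Main()
    {
        // No capacity given up front: the backing array is replaced several times.
        Console.WriteLine(CountResizes(0, 1000));

        // Capacity known in advance: the backing array is never replaced.
        Console.WriteLine(CountResizes(1000, 1000)); // 0

        // Dictionary accepts an initial capacity in the same way.
        var lookup = new Dictionary<int, string>(1000);
        lookup[1] = "one";
    }
}
```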

Which is better: array, ArrayList or List<T> (in terms of performance and speed)?

List<T> should generally be preferred over ArrayList

  • faster for value types as it avoids boxing.
  • strongly typed elements

If you want the lists you expose to callers to be read-only, this is supported by both List<T> and ArrayList (each returns a read-only wrapper around the original list):

List<T>.AsReadOnly()
ArrayList.ReadOnly(ArrayList list);
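
A minimal sketch of what List<T>.AsReadOnly() gives you, using made-up names (ReadOnlyDemo and its two methods). It illustrates that the result is a read-only wrapper and not a copy: callers cannot add to it, but changes made to the underlying list still show through.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ReadOnlyDemo
{
    // Returns true if adding through the read-only wrapper is rejected.
    public static bool WrapperRejectsAdd()
    {
        var inner = new List<int> { 1, 2, 3 };
        ReadOnlyCollection<int> view = inner.AsReadOnly();
        try
        {
            // Add is only reachable through the ICollection<T> interface.
            ((ICollection<int>)view).Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    // Returns the wrapper's Count after the underlying list has grown.
    public static int ViewCountAfterInnerAdd()
    {
        var inner = new List<int> { 1, 2, 3 };
        ReadOnlyCollection<int> view = inner.AsReadOnly();
        inner.Add(4); // the owner of the list can still modify it
        return view.Count;
    }

    public static void Main()
    {
        Console.WriteLine(WrapperRejectsAdd());      // True
        Console.WriteLine(ViewCountAfterInnerAdd()); // 4
    }
}
```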

Your question asks about choosing between ArrayList and List<T>, but your example shows an array, which is neither.

When to use a Hashtable

Maybe not directly related to the OP's question, but there's a useful blog post about which collection structure to use at: SortedSets

Basically, what you want to do with the collection determines what type of collection you should create.

To summarise in more detail:

  • Use IList if you want to be able to enumerate and / or modify the collection (normally adding at end of list)
  • Use IEnumerable if you just want to enumerate the collection (don't need to add / remove - usually used as a return type)
  • Use IDictionary if you want to access elements by a key (adding / removing elements quickly using a key)
  • Use SortedSet if you want the collection kept in a defined sort order (most common usage being to enumerate the elements in order)

  • Overall, use Dictionary if you want to access / modify items by key in no particular order. It is preferred over a list, which is generally accessed in order; over an enumeration, which you can't modify; over Hashtable, which is not strongly typed; and over SortedList when you don't need the keys sorted.
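
The guidelines above could look roughly like this in method signatures. This is only a sketch: the class InterfaceChoice and its methods are invented for illustration, and the generic interfaces (IEnumerable<T>, IDictionary<TKey, TValue>) are used on the assumption that strongly typed collections are wanted.

```csharp
using System;
using System.Collections.Generic;

public static class InterfaceChoice
{
    // IEnumerable<T> as a return type: callers can iterate but not add or remove.
    public static IEnumerable<int> EvenNumbers(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n % 2 == 0) yield return n;
        }
    }

    // IDictionary<TKey, TValue> as a parameter: access is by key.
    public static int QuantityOf(IDictionary<string, int> stock, string item)
    {
        int quantity;
        return stock.TryGetValue(item, out quantity) ? quantity : 0;
    }

    // SortedSet<T>: enumerates in sorted order and drops duplicates.
    public static List<int> SortedUnique(IEnumerable<int> source)
    {
        return new List<int>(new SortedSet<int>(source));
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", EvenNumbers(new[] { 1, 2, 3, 4 })));     // 2, 4
        Console.WriteLine(string.Join(", ", SortedUnique(new[] { 3, 1, 2, 3, 1 }))); // 1, 2, 3

        var stock = new Dictionary<string, int>();
        stock["apples"] = 5;
        Console.WriteLine(QuantityOf(stock, "apples")); // 5
        Console.WriteLine(QuantityOf(stock, "pears"));  // 0
    }
}
```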

Dictionary, List or Array?

It depends on which operation you want to execute. Let's assume that you want to find an object with a given ID.

  • The huge array approach is fastest: Accessing myArray[84397] is a constant-time operation O(1). Of course, this approach requires the most memory.
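  • The dictionary is almost as fast (lookups are O(1) on average) but requires less memory when the IDs are sparse, since it uses a hash table internally and only stores the entries that actually exist.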
  • The list of pairs approach is the slowest, since you might have to traverse the whole list to find your entry, which yields O(n) complexity.

Thus, in your situation, I would choose the dictionary, unless the marginally better performance of the huge array is really relevant in your case.
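
The three approaches can be sketched as follows. The names (IdLookup, FindInArray and so on) and the choice of int values are assumptions made for the example; the original question's data type is not shown here. Each method returns null when the ID is not present.

```csharp
using System;
using System.Collections.Generic;

public static class IdLookup
{
    // Huge array: the ID is the index, so a lookup is one O(1) access.
    // The array needs a slot for every possible ID, whether it is used or not.
    public static int? FindInArray(int?[] byId, int id)
    {
        return (id >= 0 && id < byId.Length) ? byId[id] : null;
    }

    // Dictionary: hash lookup, O(1) on average, stores only existing IDs.
    public static int? FindInDictionary(Dictionary<int, int> byId, int id)
    {
        int value;
        return byId.TryGetValue(id, out value) ? value : (int?)null;
    }

    // List of pairs: linear scan, O(n) in the worst case.
    public static int? FindInList(List<KeyValuePair<int, int>> pairs, int id)
    {
        foreach (KeyValuePair<int, int> pair in pairs)
        {
            if (pair.Key == id) return pair.Value;
        }
        return null;
    }

    public static void Main()
    {
        const int id = 84397;

        var array = new int?[id + 1]; // 84398 slots to hold a single entry
        array[id] = 500;

        var dictionary = new Dictionary<int, int>();
        dictionary[id] = 500;

        var list = new List<KeyValuePair<int, int>>();
        list.Add(new KeyValuePair<int, int>(id, 500));

        Console.WriteLine(FindInArray(array, id));           // 500
        Console.WriteLine(FindInDictionary(dictionary, id)); // 500
        Console.WriteLine(FindInList(list, id));             // 500
    }
}
```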

Why is Dictionary preferred over Hashtable in C#?

For what it's worth, a Dictionary is (conceptually) a hash table.

If you meant "why do we use the Dictionary<TKey, TValue> class instead of the Hashtable class?", then it's an easy answer: Dictionary<TKey, TValue> is a generic type, Hashtable is not. That means you get type safety with Dictionary<TKey, TValue>, because you can't insert any random object into it, and you don't have to cast the values you take out.
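
A short sketch of that difference, with hypothetical names (TypeSafetyDemo and its methods): the Hashtable accepts any key and value and hands back object, so mistakes only surface at run time, while the Dictionary rejects them at compile time.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetyDemo
{
    public static int ReadFromHashtable()
    {
        var table = new Hashtable();
        table["answer"] = 42;        // the int is boxed to object
        table[7] = "anything goes";  // compiles: keys and values are just object
        return (int)table["answer"]; // a cast is required on the way out
    }

    public static int ReadFromDictionary()
    {
        var dictionary = new Dictionary<string, int>();
        dictionary["answer"] = 42;   // no boxing, no cast
        // dictionary[7] = "anything goes"; // would not compile
        return dictionary["answer"];
    }

    // Returns true if a wrong cast on a Hashtable value fails at run time.
    public static bool WrongCastThrows()
    {
        var table = new Hashtable();
        table["answer"] = "forty-two";
        try
        {
            int value = (int)table["answer"];
            return value < 0 && false; // not reached: the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadFromHashtable());  // 42
        Console.WriteLine(ReadFromDictionary()); // 42
        Console.WriteLine(WrongCastThrows());    // True
    }
}
```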

Interestingly, the Dictionary<TKey, TValue> implementation in the .NET Framework is based on the Hashtable, as you can tell from this comment in its source code:

The generic Dictionary was copied from Hashtable's source

Source


