What .Net Collection Provides the Fastest Search

What .NET collection provides the fastest search

In the most general case, consider System.Collections.Generic.HashSet as your default "Contains" workhorse data structure, because it takes constant time to evaluate Contains.

The actual answer to "What is the fastest searchable collection" depends on your specific data size, ordered-ness, cost-of-hashing, and search frequency.

Performing the fastest search - which collection should i use?

The thing which is often skipped when comparing ArrayList and LinkedList is cache and memory management optimisations. ArrayList is effectively just an array which means that it is stored in a continuous space in the memory. This allows the Operating System to use optimisations such as "when a byte in memory was accessed, most likely the next byte will be accessed soon". Because of this, ArrayList is faster than LinkedList in all but one case: when inserting/deleting the element at the beginning of the list (because all elements in the array have to be shifted). Adding/deleting at the end or in the middle, iterating over, accessing the element are all faster in case of ArrayList.

If you need to search for student with given name and id, it sounds to me like a map with composite key - Map<Student, StudentData>. I would recommend to use HashMap implementation, unless you need to be able to both search the collection and retrieve all elements sorted by key in which case TreeMap may be a better idea. Although remember that HashMap has O(1) access time, while TreeMap has O(logn) access time.

Can you provide. NET Collection's sorted order list on basis of lookup time?

I think this article can be useful C#/.NET Fundamentals: Choosing the Right Collection Class

Fastest way to search in a string collection

You could consider doing the filtering task on a background thread which would invoke a callback method when it's done, or simply restart filtering if input is changed.

The general idea is to be able to use it like this:

public partial class YourForm : Form
{
private readonly BackgroundWordFilter _filter;

public YourForm()
{
InitializeComponent();

// setup the background worker to return no more than 10 items,
// and to set ListBox.DataSource when results are ready

_filter = new BackgroundWordFilter
(
items: GetDictionaryItems(),
maxItemsToMatch: 10,
callback: results =>
this.Invoke(new Action(() => listBox_choices.DataSource = results))
);
}

private void textBox_search_TextChanged(object sender, EventArgs e)
{
// this will update the background worker's "current entry"
_filter.SetCurrentEntry(textBox_search.Text);
}
}

A rough sketch would be something like:

public class BackgroundWordFilter : IDisposable
{
private readonly List<string> _items;
private readonly AutoResetEvent _signal = new AutoResetEvent(false);
private readonly Thread _workerThread;
private readonly int _maxItemsToMatch;
private readonly Action<List<string>> _callback;

private volatile bool _shouldRun = true;
private volatile string _currentEntry = null;

public BackgroundWordFilter(
List<string> items,
int maxItemsToMatch,
Action<List<string>> callback)
{
_items = items;
_callback = callback;
_maxItemsToMatch = maxItemsToMatch;

// start the long-lived backgroud thread
_workerThread = new Thread(WorkerLoop)
{
IsBackground = true,
Priority = ThreadPriority.BelowNormal
};

_workerThread.Start();
}

public void SetCurrentEntry(string currentEntry)
{
// set the current entry and signal the worker thread
_currentEntry = currentEntry;
_signal.Set();
}

void WorkerLoop()
{
while (_shouldRun)
{
// wait here until there is a new entry
_signal.WaitOne();
if (!_shouldRun)
return;

var entry = _currentEntry;
var results = new List<string>();

// if there is nothing to process,
// return an empty list
if (string.IsNullOrEmpty(entry))
{
_callback(results);
continue;
}

// do the search in a for-loop to
// allow early termination when current entry
// is changed on a different thread
foreach (var i in _items)
{
// if matched, add to the list of results
if (i.Contains(entry))
results.Add(i);

// check if the current entry was updated in the meantime,
// or we found enough items
if (entry != _currentEntry || results.Count >= _maxItemsToMatch)
break;
}

if (entry == _currentEntry)
_callback(results);
}
}

public void Dispose()
{
// we are using AutoResetEvent and a background thread
// and therefore must dispose it explicitly
Dispose(true);
}

private void Dispose(bool disposing)
{
if (!disposing)
return;

// shutdown the thread
if (_workerThread.IsAlive)
{
_shouldRun = false;
_currentEntry = null;
_signal.Set();
_workerThread.Join();
}

// if targetting .NET 3.5 or older, we have to
// use the explicit IDisposable implementation
(_signal as IDisposable).Dispose();
}
}

Also, you should actually dispose the _filter instance when the parent Form is disposed. This means you should open and edit your Form's Dispose method (inside the YourForm.Designer.cs file) to look something like:

// inside "xxxxxx.Designer.cs"
protected override void Dispose(bool disposing)
{
if (disposing)
{
if (_filter != null)
_filter.Dispose();

// this part is added by Visual Studio designer
if (components != null)
components.Dispose();
}

base.Dispose(disposing);
}

On my machine, it works pretty quickly, so you should test and profile this before going for a more complex solution.

That being said, a "more complex solution" would possibly be to store the last couple of results in a dictionary, and then only filter them if it turns out that the new entry differs by only the first of last character.

C# Fastest search object which has nearest value given

You can slightly speed up things by caching some values and inverting loop:

float target = 100.0f;
int index = -1;
float nearest = float.MaxValue;
int count = objectList.Count;
for(int i = count-1; i >= 0; i--)
{
var diff = Math.Abs(objectList[i].value - target);
if(diff < nearest)
{
nearest = diff;
index = i;
}
if(nearest == 0)
break;
}

Also, if your object list not changing too much you can Sort it and apply binary nearest search which will run in O(log(n)). There is plenty of optimisations can be done.

For example, you can put everything into sorted binary tree (RB-tree for example) and run your search on it. It will run considerably faster than plain look. Of course this will work only if you have a bunch of searches on this list.

Other way would be to split array into batches and process them simulteniously through Parallel.For. Then just process result of batches.

Concurrent Collection with fastest possible Add, Remove and Find the highest

Keys.Max() is the killer. That's O(N). No need for a dictionary if you do this.

You can't incrementally compute the max value because you are adding and removing. So you better use a data structure that is made for this. Trees usually are. The BCL has SortedList and SortedSet and SortedDictionary I believe. One of them was based on a fast tree. It has min and max operations.

Or, use a .NET collection library with a priority queue.

Bug: Add is racy. You might overwrite a non-empty collection.

Fastest method of finding words in a collection with wildcard letters

Since your wildcards can only match one letter, the problem isn't too hard. If you needed to support variable length substrings, I'd suggest you go and read some of the scientific literature on how regular expressions work.

This is a fairly basic 2nd year comp-sci "data structures and algorithms" exercise. Using a Dictionary in every Node probably isn't going to be the fastest / most memory efficient. But I would tackle the problem like this;

class Node
{
public bool endWord;
public Dictionary<char, Node> next;
}

public class Words
{
private Node root = new Node { endWord = false };
public const char wildcard = '_';

public void DefineWord(string word)
{
var node = root;
foreach (var c in word)
{
if (node.next == null)
node.next = new Dictionary<char, Node>();
if (node.next.TryGetValue(c, out var nextNode))
{
node = nextNode;
}
else
{
node = node.next[c] = new Node { endWord = false };
}
}
node.endWord = true;
}

private bool IsValid(ReadOnlySpan<char> word, Node node)
{
if (word.IsEmpty && node.endWord)
return true;
if (node.next == null)
return false;

if (word[0] == wildcard)
{
word = word.Slice(1);
foreach(var n in node.next.Values)
{
if (IsValid(word, n))
return true;
}
} else if (node.next.TryGetValue(word[0], out var nextNode))
return IsValid(word.Slice(1), nextNode);
return false;
}

public bool IsValid(string word)
=> IsValid(word, root);

public static void Test1()
{
var words = new Words();
words.DefineWord("APE");
words.DefineWord("APPLE");
words.DefineWord("BEAR");
words.DefineWord("BEER");
words.DefineWord("PEAR");
words.DefineWord("PEER");
words.DefineWord("PEERS");

Assert.True(words.IsValid("APE"));
Assert.True(words.IsValid("APPLE"));
Assert.True(words.IsValid("PEAR"));
Assert.True(words.IsValid("PEER"));
Assert.True(words.IsValid("PEERS"));
Assert.True(!words.IsValid("PLIERS"));
Assert.True(words.IsValid("PE_R"));
Assert.True(words.IsValid("_EAR"));
Assert.True(words.IsValid("_E_R"));
}
}

Best Collection for Fast String Lookup

If you're on .NET 3.5 or higher, use HashSet<String>.

Failing that, a Dictionary<string, byte> (or whatever type you want for the TValue type parameter) would be faster than a SortedList if you have a lot of entries - the latter will use a binary search, so it'll be O(log n) lookup, instead of O(1).



Related Topics



Leave a reply



Submit