Is there a Lower Bound function on a SortedListK ,V?
Binary search the SortedList.Keys
collection.
Here we go. This is O(log n):
private static int BinarySearch<T>(IList<T> list, T value)
{
if (list == null)
throw new ArgumentNullException("list");
var comp = Comparer<T>.Default;
int lo = 0, hi = list.Count - 1;
while (lo < hi) {
int m = (hi + lo) / 2; // this might overflow; be careful.
if (comp.Compare(list[m], value) < 0) lo = m + 1;
else hi = m - 1;
}
if (comp.Compare(list[lo], value) < 0) lo++;
return lo;
}
public static int FindFirstIndexGreaterThanOrEqualTo<T,U>
(this SortedList<T,U> sortedList, T key)
{
return BinarySearch(sortedList.Keys, key);
}
rationale for std::lower_bound and std::upper_bound?
If you have multiple elements in the range [first
, last
) whose value equals the value val
you are searching for, then the range [l
, u
) where
l = std::lower_bound(first, last, val)
u = std::upper_bound(first, last, val)
is precisely the range of elements equal to val
within the range [first
, last
). So l
and u
are the "lower bound" and "upper bound" for the equal range. It makes sense if you're accustomed to thinking in terms of half-open intervals.
(Note that std::equal_range
will return both the lower and upper bound in a pair, in a single call.)
Finding closest lower key of a number in a dictionary
You can filters the keys keeping only the lowers, then gets the max key:
Dictionary<int, float> delta = new Dictionary<int, float>();
var key = 16;
var maxKey = delta.Keys.Where(k => k < key).Max();
var value = delta[maxKey];
However, as noted in the comments a better approach is use classes like SortedDictionary<>
or SortedList<>
.
If all add/remove operations runs first, and later only searches are performed a solution can be convert the keys to array (O(N)) and use Array.BinarySearch()
method (O(log N)):
SortedDictionary<int, float> sortedDict = new SortedDictionary<int, float>();
// Add-Remove operations
var keys = sortedDict.Keys.ToArray();
// Search operations
int maxKey = Array.BinarySearch(keys, key);
float value = maxIndex >= 0 ? sortedDict[maxKey] : sortedDict[~maxIndex - 1];
A SortedList.IndexOfKey(key) that returns the index of the item where item.key = key
I'm afraid you're out of luck, there's nothing built-in.
If you create a binary search extension method for IList<T>
then you could use it against the Keys
property. This is a bit annoying, though not too difficult.
(The convention used by the framework's built-in binary search methods -- Array
and List<T>
-- is to return the bitwise complement of the index of the next element when the element isn't found.)
int index = yourSortedList.Keys.YourBinarySearchExtensionMethod(key);
if (index >= 0)
{
// key found
}
else
{
// key not found
}
Looking up floats with some epsilon from a data structure in C#, with search and insert both O(lg n) time
Now while this answer I'm about to give is a C++ profiled answer and not with C#, it solves the problem in a much better and faster way.
The better way to solve this is multiplying the floating point by the inverse of the epsilon. For example if your epsilon is 0.25, then you'd want to multiply all your floats/doubles by 4 and then cast it to an integer (or floor/ceil it if you care about things collecting around zero). The following uses int
as the key but this would be fine for long
s as well. My data fits in the +/- 2^31 range after quantizing (on computers with at least sizeof int being 4 bytes) so this is sufficient for me.
// Consider using std::is_floating_point_v for K
template <typename K, typename V>
class QuantizedGrid {
int quantizer;
std::unordered_map<int, V> map;
public:
explicit QuantizedGrid(const double epsilon) {
quantizer = 1.0 / epsilon;
}
V& operator[](const K k) {
return map[static_cast<int>(quantizer * k)];
}
bool contains(const K k) const {
int key = static_cast<int>(quantizer * k);
return map.count(key) > 0;
}
};
Compared to using upper/lower bound checks, the performance from that to the above code is as follows:
or rather it was 650% faster to convert to an integer and insert into a dictionary that supports O(1) amortized insertion/lookup/delete.
It is also way less code than implementing a custom upper/lower bound.
My guess is the O(lg n) BST lookup time is much worse by the O(1) dictionary time, and the cost of casting a float to and int is small enough to make this bound by data structure lookups/cache issues.
What is the fastest way to get all the keys between 2 keys in a SortedList?
I wondered why SortedList<TKey, TValue>
doesn't provide a BinarySearch
when it's already sorted by the keys. It also uses the method itself(f.e. in IndexOf
), but the array used is a private field. So i've tried to create an extension method for this. Have a look:
public static class SortedListExtensions
{
public static int BinarySearch<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey keyToFind, IComparer<TKey> comparer = null)
{
TKey[] keyArray = sortedList.GetKeyArray();
if (comparer == null) comparer = Comparer<TKey>.Default;
int index = Array.BinarySearch<TKey>(keyArray, keyToFind, comparer);
return index;
}
public static IEnumerable<TKey> GetKeyRangeBetween<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey low, TKey high, IComparer<TKey> comparer = null)
{
int lowIndex = sortedList.BinarySearch(low, comparer);
if (lowIndex < 0)
{
// list doesn't contain the key, find nearest behind
// If not found, BinarySearch returns the complement of the index
lowIndex = ~lowIndex;
}
int highIndex = sortedList.BinarySearch(high, comparer);
if (highIndex < 0)
{
// list doesn't contain the key, find nearest before
// If not found, BinarySearch returns the complement of the index
highIndex = ~highIndex - 1;
}
IList<TKey> keys = sortedList.Keys;
for (int i = lowIndex; i < highIndex; i++)
{
yield return keys[i];
}
}
private static TKey[] GetKeyArray<TKey, TValue>(this SortedList<TKey, TValue> sortedList)
{
// trying to resolve array with reflection because SortedList.keys is a private array
Type type = typeof(SortedList<TKey, TValue>);
FieldInfo keyField = type.GetField("keys", BindingFlags.NonPublic | BindingFlags.Instance);
if(keyField != null && keyField.GetValue(sortedList) is TKey[] keyArrayFromReflection)
{
return keyArrayFromReflection;
}
// fallback: fill a new array from the public Keys property, you might want to log this since you should change the reflection implementation
IList<TKey> keyList = sortedList.Keys;
TKey[] keyArray = new TKey[keyList.Count];
for (int i = 0; i < keyArray.Length; i++)
keyArray[i] = keyList[i];
return keyArray;
}
}
Create a sample SortedList
:
DateTime start = DateTime.Today.AddDays(-50);
var sortedList = new SortedList<DateTime, string>();
for(int i = 0; i < 50; i+=2)
{
var dt = start.AddDays(i);
sortedList.Add(dt, string.Format("Date #{0}: {1}", i, dt.ToShortDateString()));
}
DateTime low = start.AddDays(1); // is not in the SortedList which contains only every second day
DateTime high = start.AddDays(10);
Now you can use the extension method to get the range of keys between a low- and high-key:
IEnumerable<DateTime> dateRange = sortedList.GetKeyRangeBetween(low, high).ToList();
Result:
04/04/2014
04/06/2014
04/08/2014
04/10/2014
Note that this is built from scratch and not really tested.
The quickest way to find an integer in sorted range data
Stick with a binary search. If you have N ranges and look for K numbers, searching would take O(KlogN).
Using @Steve Wellens's suggestion of spreading everything will take a lot of setup - O(R) (with R being the last range ending - 1123 in your example). After the setup, K searches would take O(K), so you're looking at O(K+R)
Now, if the maximum number is less than KlogN, and memory is not a problem, spread out the ranges. If not (which is my guess, you said you had a lot of data), a binary search will be faster.
Efficiently find nearest dictionary key
Since SortedDictionary
is sorted on the key, you can create a sorted list of keys with
var keys = new List<DateTime>(dictionary.Keys);
and then efficiently perform binary search on it:
var index = keys.BinarySearch(key);
As the documentation says, if index
is positive or zero then the key exists; if it is negative, then ~index
is the index where key
would be found at if it existed. Therefore the index of the "immediately smaller" existing key is ~index - 1
. Make sure you handle correctly the edge case where key
is smaller than any of the existing keys and ~index - 1 == -1
.
Of course the above approach really only makes sense if keys
is built up once and then queried repeatedly; since it involves iterating over the whole sequence of keys and doing a binary search on top of that there's no point in trying this if you are only going to search once. In that case even naive iteration would be better.
Update
As digEmAll correctly points out, you could also switch to SortedList<DateTime, decimal>
so that the Keys
collection implements IList<T>
(which SortedDictionary.Keys does not). That interface provides enough functionality to perform a binary search on it manually, so you could take e.g. this code and make it an extension method on IList<T>
.
You should also keep in mind that SortedList
performs worse than SortedDictionary
during construction if the items are not inserted in already-sorted order, although in this particular case it is highly likely that dates are inserted in chronological (sorted) order which would be perfect.
Related Topics
Cannot Access a Disposed Object in ASP.NET Core When Injecting Dbcontext
How to Post a List of Items in MVC
C# "Internal" Access Modifier When Doing Unit Testing
How to View Msil/Cil Generated by C# Compiler? Why Is It Called Assembly
What's the Difference Between "Groups" and "Captures" in .Net Regular Expressions
Performance of Calling Delegates VS Methods
How to Sum Up an Array of Integers in C#
Getting Serial Port Information
Static Generic Class as Dictionary
How to Mark a Method as Obsolete or Deprecated
Posting JSON Data to ASP.NET MVC
How to Use Microsoft.Office.Interop.Excel on a MAChine Without Installed Ms Office