Is There a Lower Bound Function on a Sortedlist<K ,V>

Is there a Lower Bound function on a SortedListK ,V?

Binary search the SortedList.Keys collection.

Here we go. This is O(log n):

private static int BinarySearch<T>(IList<T> list, T value)
{
if (list == null)
throw new ArgumentNullException("list");
var comp = Comparer<T>.Default;
int lo = 0, hi = list.Count - 1;
while (lo < hi) {
int m = (hi + lo) / 2; // this might overflow; be careful.
if (comp.Compare(list[m], value) < 0) lo = m + 1;
else hi = m - 1;
}
if (comp.Compare(list[lo], value) < 0) lo++;
return lo;
}

public static int FindFirstIndexGreaterThanOrEqualTo<T,U>
(this SortedList<T,U> sortedList, T key)
{
return BinarySearch(sortedList.Keys, key);
}

rationale for std::lower_bound and std::upper_bound?

If you have multiple elements in the range [first, last) whose value equals the value val you are searching for, then the range [l, u) where

l = std::lower_bound(first, last, val)
u = std::upper_bound(first, last, val)

is precisely the range of elements equal to val within the range [first, last). So l and u are the "lower bound" and "upper bound" for the equal range. It makes sense if you're accustomed to thinking in terms of half-open intervals.

(Note that std::equal_range will return both the lower and upper bound in a pair, in a single call.)

Finding closest lower key of a number in a dictionary

You can filters the keys keeping only the lowers, then gets the max key:

Dictionary<int, float> delta = new Dictionary<int, float>();
var key = 16;

var maxKey = delta.Keys.Where(k => k < key).Max();
var value = delta[maxKey];

However, as noted in the comments a better approach is use classes like SortedDictionary<> or SortedList<>.

If all add/remove operations runs first, and later only searches are performed a solution can be convert the keys to array (O(N)) and use Array.BinarySearch() method (O(log N)):

SortedDictionary<int, float> sortedDict = new SortedDictionary<int, float>();

// Add-Remove operations

var keys = sortedDict.Keys.ToArray();

// Search operations

int maxKey = Array.BinarySearch(keys, key);
float value = maxIndex >= 0 ? sortedDict[maxKey] : sortedDict[~maxIndex - 1];

A SortedList.IndexOfKey(key) that returns the index of the item where item.key = key

I'm afraid you're out of luck, there's nothing built-in.

If you create a binary search extension method for IList<T> then you could use it against the Keys property. This is a bit annoying, though not too difficult.

(The convention used by the framework's built-in binary search methods -- Array and List<T> -- is to return the bitwise complement of the index of the next element when the element isn't found.)

int index = yourSortedList.Keys.YourBinarySearchExtensionMethod(key);
if (index >= 0)
{
// key found
}
else
{
// key not found
}

Looking up floats with some epsilon from a data structure in C#, with search and insert both O(lg n) time

Now while this answer I'm about to give is a C++ profiled answer and not with C#, it solves the problem in a much better and faster way.

The better way to solve this is multiplying the floating point by the inverse of the epsilon. For example if your epsilon is 0.25, then you'd want to multiply all your floats/doubles by 4 and then cast it to an integer (or floor/ceil it if you care about things collecting around zero). The following uses int as the key but this would be fine for longs as well. My data fits in the +/- 2^31 range after quantizing (on computers with at least sizeof int being 4 bytes) so this is sufficient for me.

// Consider using std::is_floating_point_v for K
template <typename K, typename V>
class QuantizedGrid {
int quantizer;
std::unordered_map<int, V> map;

public:
explicit QuantizedGrid(const double epsilon) {
quantizer = 1.0 / epsilon;
}

V& operator[](const K k) {
return map[static_cast<int>(quantizer * k)];
}

bool contains(const K k) const {
int key = static_cast<int>(quantizer * k);
return map.count(key) > 0;
}
};

Compared to using upper/lower bound checks, the performance from that to the above code is as follows:

Sample Image

or rather it was 650% faster to convert to an integer and insert into a dictionary that supports O(1) amortized insertion/lookup/delete.

It is also way less code than implementing a custom upper/lower bound.

My guess is the O(lg n) BST lookup time is much worse by the O(1) dictionary time, and the cost of casting a float to and int is small enough to make this bound by data structure lookups/cache issues.

What is the fastest way to get all the keys between 2 keys in a SortedList?

I wondered why SortedList<TKey, TValue> doesn't provide a BinarySearch when it's already sorted by the keys. It also uses the method itself(f.e. in IndexOf), but the array used is a private field. So i've tried to create an extension method for this. Have a look:

public static class SortedListExtensions
{
public static int BinarySearch<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey keyToFind, IComparer<TKey> comparer = null)
{
TKey[] keyArray = sortedList.GetKeyArray();
if (comparer == null) comparer = Comparer<TKey>.Default;
int index = Array.BinarySearch<TKey>(keyArray, keyToFind, comparer);
return index;
}

public static IEnumerable<TKey> GetKeyRangeBetween<TKey, TValue>(this SortedList<TKey, TValue> sortedList, TKey low, TKey high, IComparer<TKey> comparer = null)
{
int lowIndex = sortedList.BinarySearch(low, comparer);
if (lowIndex < 0)
{
// list doesn't contain the key, find nearest behind
// If not found, BinarySearch returns the complement of the index
lowIndex = ~lowIndex;
}

int highIndex = sortedList.BinarySearch(high, comparer);
if (highIndex < 0)
{
// list doesn't contain the key, find nearest before
// If not found, BinarySearch returns the complement of the index
highIndex = ~highIndex - 1;
}

IList<TKey> keys = sortedList.Keys;
for (int i = lowIndex; i < highIndex; i++)
{
yield return keys[i];
}
}

private static TKey[] GetKeyArray<TKey, TValue>(this SortedList<TKey, TValue> sortedList)
{
// trying to resolve array with reflection because SortedList.keys is a private array
Type type = typeof(SortedList<TKey, TValue>);
FieldInfo keyField = type.GetField("keys", BindingFlags.NonPublic | BindingFlags.Instance);
if(keyField != null && keyField.GetValue(sortedList) is TKey[] keyArrayFromReflection)
{
return keyArrayFromReflection;
}

// fallback: fill a new array from the public Keys property, you might want to log this since you should change the reflection implementation
IList<TKey> keyList = sortedList.Keys;
TKey[] keyArray = new TKey[keyList.Count];
for (int i = 0; i < keyArray.Length; i++)
keyArray[i] = keyList[i];
return keyArray;
}
}

Create a sample SortedList:

DateTime start = DateTime.Today.AddDays(-50);
var sortedList = new SortedList<DateTime, string>();
for(int i = 0; i < 50; i+=2)
{
var dt = start.AddDays(i);
sortedList.Add(dt, string.Format("Date #{0}: {1}", i, dt.ToShortDateString()));
}

DateTime low = start.AddDays(1); // is not in the SortedList which contains only every second day
DateTime high = start.AddDays(10);

Now you can use the extension method to get the range of keys between a low- and high-key:

IEnumerable<DateTime> dateRange = sortedList.GetKeyRangeBetween(low, high).ToList();

Result:

04/04/2014
04/06/2014
04/08/2014
04/10/2014

Note that this is built from scratch and not really tested.

The quickest way to find an integer in sorted range data

Stick with a binary search. If you have N ranges and look for K numbers, searching would take O(KlogN).

Using @Steve Wellens's suggestion of spreading everything will take a lot of setup - O(R) (with R being the last range ending - 1123 in your example). After the setup, K searches would take O(K), so you're looking at O(K+R)

Now, if the maximum number is less than KlogN, and memory is not a problem, spread out the ranges. If not (which is my guess, you said you had a lot of data), a binary search will be faster.

Efficiently find nearest dictionary key

Since SortedDictionary is sorted on the key, you can create a sorted list of keys with

var keys = new List<DateTime>(dictionary.Keys);

and then efficiently perform binary search on it:

var index = keys.BinarySearch(key);

As the documentation says, if index is positive or zero then the key exists; if it is negative, then ~index is the index where key would be found at if it existed. Therefore the index of the "immediately smaller" existing key is ~index - 1. Make sure you handle correctly the edge case where key is smaller than any of the existing keys and ~index - 1 == -1.

Of course the above approach really only makes sense if keys is built up once and then queried repeatedly; since it involves iterating over the whole sequence of keys and doing a binary search on top of that there's no point in trying this if you are only going to search once. In that case even naive iteration would be better.

Update

As digEmAll correctly points out, you could also switch to SortedList<DateTime, decimal> so that the Keys collection implements IList<T> (which SortedDictionary.Keys does not). That interface provides enough functionality to perform a binary search on it manually, so you could take e.g. this code and make it an extension method on IList<T>.

You should also keep in mind that SortedList performs worse than SortedDictionary during construction if the items are not inserted in already-sorted order, although in this particular case it is highly likely that dates are inserted in chronological (sorted) order which would be perfect.



Related Topics



Leave a reply



Submit