How to Split an IEnumerable into Two by a Boolean Criterion Without Two Queries

Can I split an IEnumerable into two by a boolean criterion without two queries?

You can use this:

var groups = allValues.GroupBy(val => val.SomeProp);

To force immediate evaluation like in your example:

var groups = allValues.GroupBy(val => val.SomeProp)
                      .ToDictionary(g => g.Key, g => g.ToList());
List<MyObj> trues = groups[true];
List<MyObj> falses = groups[false];
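One caveat with the dictionary version: if no element produces true (or none produces false), that key is missing and the indexer throws a KeyNotFoundException. A minimal sketch of the same split using ToLookup, whose indexer returns an empty sequence for a key with no elements. The class and method names, the int data, and the even/odd test are hypothetical stand-ins for allValues and SomeProp:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSplitDemo
{
    // Splits the values in a single pass; ToLookup evaluates immediately.
    public static (List<int> Trues, List<int> Falses) SplitByEven(IEnumerable<int> allValues)
    {
        ILookup<bool, int> groups = allValues.ToLookup(val => val % 2 == 0);

        // Indexing a lookup with a key that has no elements yields an empty
        // sequence instead of throwing.
        return (groups[true].ToList(), groups[false].ToList());
    }
}
```

With only odd input, Trues comes back empty rather than failing.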

Does LINQ natively support splitting a collection in two?

Does LINQ have any native support for splitting a collection in 1 linear pass?

There are no built-in methods that split a collection into two versions based on a predicate. You would need to use your own method, similar to the one you posted.

The closest built-in method would be GroupBy (or ToLookup). You could group by odd or even:

var groups = nums.GroupBy(i => IsEven(i));

This will split into two "groups" based on whether the numbers are odd or even.
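If you do write your own method, one possible shape is sketched below. Partition is a hypothetical name, not something LINQ provides; it walks the source once and evaluates the predicate exactly once per element:

```csharp
using System;
using System.Collections.Generic;

public static class PartitionExtensions
{
    // Hypothetical helper (not part of LINQ): one linear pass, two result lists.
    public static (List<T> Matches, List<T> NonMatches) Partition<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        var matches = new List<T>();
        var nonMatches = new List<T>();
        foreach (var item in source)
        {
            // Pick the target list based on the predicate, then add the item.
            (predicate(item) ? matches : nonMatches).Add(item);
        }
        return (matches, nonMatches);
    }
}

// Usage (inside any method):
// var (evens, odds) = new[] { 1, 2, 3, 4, 5 }.Partition(n => n % 2 == 0);
```

Unlike GroupBy, this is eager: both lists are fully built before the method returns.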

C# How to split a List in two using LINQ

You can do this in one statement by converting it into a Lookup table:

var splitTables = events.ToLookup(e => e.Closer_User_ID == null);

This returns an ILookup<bool, EventModel>. Enumerating it gives at most two elements, each an IGrouping<bool, EventModel>. The Key says whether the group holds the events with a null Closer_User_ID or the others.

However this looks rather mystical. My advice would be to extend LINQ with a new function.

This function takes a sequence of any kind, and a predicate that divides the sequence into two groups: the group that matches the predicate and the group that doesn't match the predicate.

This way you can use the function to divide all kinds of IEnumerable sequences into two sequences.

See Extension methods demystified

public static IEnumerable<IGrouping<bool, TSource>> Split<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    return source.ToLookup(predicate);
}

Usage:

IEnumerable<Person> persons = ...
// divide the persons into adults and non-adults:
var result = persons.Split(person => person.IsAdult);

Result has two elements: the one with Key true has all Adults.

Although usage has now become easier to read, you still have the problem that the complete sequence is processed, while in fact you might only want to use a few of the resulting items.

Let's return an IEnumerable<KeyValuePair<bool, TSource>>, where the Boolean value indicates whether the item matches or doesn't match:

public static IEnumerable<KeyValuePair<bool, TSource>> Audit<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    foreach (var sourceItem in source)
    {
        // Key: did the item match the predicate? Value: the item itself.
        yield return new KeyValuePair<bool, TSource>(predicate(sourceItem), sourceItem);
    }
}

Now you get a sequence, where every element says whether it matches or not. If you only need a few of them, the rest of the sequence is not processed:

IEnumerable<EventModel> eventModels = ...
EventModel firstOpenEvent = eventModels.Audit(e => e.Closer_User_ID == null)
    .Where(splitEvent => splitEvent.Key)
    .Select(splitEvent => splitEvent.Value)
    .FirstOrDefault();

The Where says that you only want those audited items that passed auditing (Key is true).

Because you only need the first element, the rest of the sequence is not audited.
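To see that the remainder really is left alone, here is a small self-contained sketch that counts predicate calls. AuditDemo and FirstAbove are hypothetical names used only for this illustration, and the int data stands in for the event models:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuditExtensions
{
    // Pairs every item with the result of the predicate, lazily.
    public static IEnumerable<KeyValuePair<bool, TSource>> Audit<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        foreach (var sourceItem in source)
        {
            yield return new KeyValuePair<bool, TSource>(predicate(sourceItem), sourceItem);
        }
    }
}

public static class AuditDemo
{
    // Returns the first value above the threshold and how often the predicate ran.
    public static (int First, int PredicateCalls) FirstAbove(IEnumerable<int> values, int threshold)
    {
        int calls = 0;
        int first = values
            .Audit(n => { calls++; return n > threshold; })
            .Where(pair => pair.Key)
            .Select(pair => pair.Value)
            .FirstOrDefault();
        return (first, calls);
    }
}
```

For the values 1 to 10 and a threshold of 3, the predicate runs four times (for 1, 2, 3 and 4) and the remaining six values are never inspected.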

C# split list into two lists using bool function

You can use LINQ's GroupBy():

var splitted = listOfStrings.GroupBy(s => Char.IsLetter(s[0]));

And with your predicate, it would be:

Func<string, bool> predicate;

var splitted = listOfStrings.GroupBy(predicate);

Usage:

The easiest way would be to convert the grouped data into a Dictionary<bool, string[]>, where the key is a bool that denotes whether the items in it start with a letter:

var splitted = listOfStrings.GroupBy(x => Char.IsLetter(x[0]))
                            .ToDictionary(g => g.Key, g => g.ToArray());

var startWithLetter = splitted[true];
var dontStartWithLetter = splitted[false];

Of course, there are many ways to massage the data into your desired structure, but the above is pretty concise in my opinion.

See MSDN

Split a single-use large IEnumerable<T> in half using a condition

You can try to implement a stateful iterator pattern over the IEnumerator obtained from the initial IEnumerable.

IEnumerable<T> StatefulTake<T>(IEnumerator<T> source, Func<bool> getDone, Action setDone);

This method just checks done, calls MoveNext, yields Current, and updates done if MoveNext returned false.

Then you split your set with successive calls to this method, partially enumerating each result with methods such as:
TakeWhile
Any
First
...
Then you can do any operations on top of that, but each of those must be enumerated to the end.

var source = GetThemAll();
using (var e = source.GetEnumerator())
{
    // Advance to the first element; done is true if the source is empty.
    bool done = !e.MoveNext();
    foreach (var i in StatefulTake(e, () => done, () => done = true).TakeWhile(x => x.Time < ...))
    {
        //...
    }

    var theRestAverage = StatefulTake(e, () => done, () => done = true).Average(x => x.Score);
    //...
}

It's a pattern I use often in my async toolkit.

Update: fixed the signature of the StatefulTake method; it cannot use a ref parameter. Also, the initial call to MoveNext is necessary. The three kinds of references to the done variable and the method itself should be encapsulated in a context class.
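The answer gives only the signature of StatefulTake, so the body below is an assumption: one possible reading in which the method yields the enumerator's Current element first and advances only when the consumer asks for more. That ordering is what makes the initial MoveNext necessary, and it leaves the element that ended a TakeWhile available to the next call. HeadAndTailAverage and the int data are hypothetical, chosen so the sketch can run on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StatefulTakeDemo
{
    // Assumed implementation: yield Current, then advance on the next request.
    public static IEnumerable<T> StatefulTake<T>(
        IEnumerator<T> source, Func<bool> getDone, Action setDone)
    {
        while (!getDone())
        {
            yield return source.Current;
            if (!source.MoveNext()) setDone();
        }
    }

    // Takes the leading values below the threshold, then averages whatever is left,
    // enumerating the source only once.
    public static (List<int> Head, double TailAverage) HeadAndTailAverage(
        IEnumerable<int> source, int threshold)
    {
        using (var e = source.GetEnumerator())
        {
            bool done = !e.MoveNext();
            var head = StatefulTake(e, () => done, () => done = true)
                .TakeWhile(x => x < threshold)
                .ToList();
            var tailAverage = StatefulTake(e, () => done, () => done = true)
                .DefaultIfEmpty() // yields a single 0 when nothing is left
                .Average();
            return (head, tailAverage);
        }
    }
}
```

For the values 1 to 6 and a threshold of 4, the head is 1, 2, 3 and the tail average is 5, because the value 4 that stopped TakeWhile is still the enumerator's Current element when the second call starts.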

Split an IEnumerable<T> into fixed-sized chunks (return an IEnumerable<IEnumerable<T>> where the inner sequences are of fixed length)

You could try to implement the Batch method mentioned above on your own like this:

static class MyLinqExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(
        this IEnumerable<T> source, int batchSize)
    {
        using (var enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
                yield return YieldBatchElements(enumerator, batchSize - 1);
    }

    private static IEnumerable<T> YieldBatchElements<T>(
        IEnumerator<T> source, int batchSize)
    {
        yield return source.Current;
        for (int i = 0; i < batchSize && source.MoveNext(); i++)
            yield return source.Current;
    }
}

I've grabbed this code from http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx.

UPDATE: Please note that this implementation lazily evaluates not only the batches but also the items inside each batch, which means it only produces correct results when each batch is enumerated after all previous batches have been fully enumerated. For example:

public static void Main(string[] args)
{
    var xs = Enumerable.Range(1, 20);
    Print(xs.Batch(5).Skip(1)); // should skip first batch with 5 elements
}

public static void Print<T>(IEnumerable<IEnumerable<T>> batches)
{
    foreach (var batch in batches)
    {
        Console.WriteLine($"[{string.Join(", ", batch)}]");
    }
}

will output:

[2, 3, 4, 5, 6] //only first element is skipped.
[7, 8, 9, 10, 11]
[12, 13, 14, 15, 16]
[17, 18, 19, 20]

So, if your use case guarantees that batches are evaluated sequentially, the lazy solution above will work. Otherwise, if you can't guarantee strictly sequential batch processing (e.g. when you want to process batches in parallel), you will probably need a solution that eagerly enumerates batch content, similar to the one mentioned in the question above or in MoreLINQ.
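As a rough sketch of the eager alternative (this is not the MoreLINQ implementation, and BatchEager is a hypothetical name), each batch can be buffered into its own list before it is handed out, so skipping or reordering batches no longer affects their contents:

```csharp
using System;
using System.Collections.Generic;

public static class EagerBatchExtensions
{
    // Hypothetical eager variant: batches are still produced on demand,
    // but every batch is fully materialized before it is yielded.
    public static IEnumerable<IReadOnlyList<T>> BatchEager<T>(
        this IEnumerable<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterator();

        IEnumerable<IReadOnlyList<T>> Iterator()
        {
            var batch = new List<T>(batchSize);
            foreach (var item in source)
            {
                batch.Add(item);
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<T>(batchSize); // never reuse a list already handed out
                }
            }
            // Emit the final, possibly shorter, batch.
            if (batch.Count > 0) yield return batch;
        }
    }
}
```

With this version, Enumerable.Range(1, 20).BatchEager(5).Skip(1) starts at the batch 6 to 10, at the cost of holding one full batch in memory at a time.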

How to get empty groups, lazily

The .NET platform does not contain a built-in way to produce empty IGroupings. There is no publicly accessible class that implements this interface, so we will have to create one manually:

class EmptyGrouping<TKey, TElement> : IGrouping<TKey, TElement>
{
    public TKey Key { get; }

    public EmptyGrouping(TKey key) => Key = key;

    public IEnumerator<TElement> GetEnumerator()
        => Enumerable.Empty<TElement>().GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator()
        => GetEnumerator();
}

In order to check if all required groupings are available, we will need a way to compare them based on their Key. Below is a simple IEqualityComparer implementation for IGroupings:

public class GroupingComparerByKey<TKey, TElement>
    : IEqualityComparer<IGrouping<TKey, TElement>>
{
    public bool Equals(IGrouping<TKey, TElement> x, IGrouping<TKey, TElement> y)
        => EqualityComparer<TKey>.Default.Equals(x.Key, y.Key);

    // Use the default comparer here too, so a null key does not throw.
    public int GetHashCode(IGrouping<TKey, TElement> obj)
        => EqualityComparer<TKey>.Default.GetHashCode(obj.Key);
}

With this infrastructure in place, we can now create a lazy LINQ operator that appends missing groupings to enumerables. Let's call it EnsureContains:

public static IEnumerable<IGrouping<TKey, TElement>> EnsureContains<TKey, TElement>(
    this IEnumerable<IGrouping<TKey, TElement>> source, params TKey[] keys)
{
    return source
        .Union(keys.Select(key => new EmptyGrouping<TKey, TElement>(key)),
            new GroupingComparerByKey<TKey, TElement>());
}

Usage example:

var groups = list
    .GroupBy(i => i.Item2)
    .EnsureContains(true, false);

Note: The enumerable produced by the GroupBy operator is lazy, so it is evaluated every time it is used. Evaluating this operator is relatively expensive, so it is a good idea to avoid evaluating it more than once.
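A small sketch of that re-evaluation, and of caching the result with ToList to avoid it. DeferredGroupingDemo and its sample data are hypothetical; the method counts how many times the key selector runs when the grouping is enumerated twice:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredGroupingDemo
{
    // Returns how many times the key selector ran after enumerating the groups twice.
    public static int CountKeySelectorCalls(bool materialize)
    {
        int calls = 0;
        var list = new[] { 1, 2, 3 };
        IEnumerable<IGrouping<bool, int>> groups =
            list.GroupBy(i => { calls++; return i % 2 == 0; });

        if (materialize)
            groups = groups.ToList(); // run the grouping once and cache the result

        foreach (var g in groups) { }
        foreach (var g in groups) { }
        return calls;
    }
}
```

Without ToList the selector runs six times (three elements, two enumerations); with it, three times.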


