Merge Multiple Lists into One List With Linq

c# - using SelectMany to merge multiple lists

You can use SelectMany here:

// suppose reports is of type List<List<string>> and each entry starts with a timestamp
reports.SelectMany(x => x)
    .Select(x => new { Log = x, Time = TimeSpan.Parse(x.Split(' ')[0]) })
    .OrderBy(x => x.Time)
    .ToList();

Adding a .Distinct call will make it have the same effect as Union:

reports.SelectMany(x => x)
    .Distinct()
    .Select(x => new { Log = x, Time = TimeSpan.Parse(x.Split(' ')[0]) })
    .OrderBy(x => x.Time)
    .ToList();

Alternatively, keep using Union:

// reports is a List<List<string>>, so use Count (List<T> has no Length property)
IEnumerable<string> query = reports[0];
for (int i = 1; i < reports.Count; i++) {
    query = query.Union(reports[i]);
}
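
To check that the two approaches really agree, here is a minimal sketch that runs both on the same input. The class name MergeReportsDemo, its method names, and the sample log lines are made up for illustration; entries are assumed to be strings that start with an hh:mm:ss timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeReportsDemo
{
    // Flatten the nested lists with SelectMany, then drop duplicates with Distinct.
    public static List<string> WithSelectMany(List<List<string>> reports) =>
        reports.SelectMany(r => r).Distinct().ToList();

    // Fold the lists together with Union, which removes duplicates as it goes.
    public static List<string> WithUnion(List<List<string>> reports)
    {
        IEnumerable<string> query = Enumerable.Empty<string>();
        foreach (var report in reports)
            query = query.Union(report);
        return query.ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data: "00:00:05 start" appears in both lists.
        var reports = new List<List<string>>
        {
            new List<string> { "00:00:05 start", "00:00:09 stop" },
            new List<string> { "00:00:07 pause", "00:00:05 start" },
        };

        var flattened = WithSelectMany(reports);
        var unioned = WithUnion(reports);

        // Both approaches should contain the same set of entries.
        Console.WriteLine(new HashSet<string>(flattened).SetEquals(unioned));

        // Sort by the leading timestamp, as in the answer above.
        foreach (var line in flattened.OrderBy(l => TimeSpan.Parse(l.Split(' ')[0])))
            Console.WriteLine(line);
    }
}
```

Starting the Union loop from Enumerable.Empty<string>() instead of reports[0] also avoids an exception when reports is empty.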

How to merge multiple lists based on a condition using LINQ

You need to keep track of the type of value one way or another. Given your current type definitions, something like this should work:

var AllReadings = SoundReadings.Select(x => new { x.DatetimeStamp, x.Value, Type = "SoundReadings" })
    .Union(TempReadings.Select(x => new { x.DatetimeStamp, x.Value, Type = "TempReadings" })
        .Union(HumidReadings.Select(x => new { x.DatetimeStamp, x.Value, Type = "HumidReadings" })))
    .GroupBy(obj => obj.DatetimeStamp)
    .Select(newObj => new SensorReading()
    {
        DatetimeStamp = newObj.Key,
        Sound = newObj.FirstOrDefault(x => x.Type == "SoundReadings")?.Value,
        Temperature = newObj.FirstOrDefault(x => x.Type == "TempReadings")?.Value,
        Humidity = newObj.FirstOrDefault(x => x.Type == "HumidReadings")?.Value
    }).ToList();
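
Since the question's type definitions aren't shown here, the following is only a sketch of the same idea with assumed types: a Reading class with a double Value, and a SensorReading whose three measurements are nullable doubles. ReadingMerger and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of a single-sensor reading (not taken from the question).
public class Reading
{
    public DateTime DatetimeStamp { get; set; }
    public double Value { get; set; }
}

// Assumed shape of the merged row; a null means "no reading at that time".
public class SensorReading
{
    public DateTime DatetimeStamp { get; set; }
    public double? Sound { get; set; }
    public double? Temperature { get; set; }
    public double? Humidity { get; set; }
}

public static class ReadingMerger
{
    public static List<SensorReading> Merge(
        IEnumerable<Reading> sound, IEnumerable<Reading> temp, IEnumerable<Reading> humid)
    {
        // Tag every reading with its source, then group by timestamp.
        return sound.Select(x => new { x.DatetimeStamp, x.Value, Type = "Sound" })
            .Union(temp.Select(x => new { x.DatetimeStamp, x.Value, Type = "Temp" }))
            .Union(humid.Select(x => new { x.DatetimeStamp, x.Value, Type = "Humid" }))
            .GroupBy(x => x.DatetimeStamp)
            .Select(g => new SensorReading
            {
                DatetimeStamp = g.Key,
                // FirstOrDefault returns null when the group has no such reading,
                // so ?.Value yields a null double? in that case.
                Sound = g.FirstOrDefault(x => x.Type == "Sound")?.Value,
                Temperature = g.FirstOrDefault(x => x.Type == "Temp")?.Value,
                Humidity = g.FirstOrDefault(x => x.Type == "Humid")?.Value
            })
            .OrderBy(r => r.DatetimeStamp)
            .ToList();
    }

    public static void Main()
    {
        var t0 = new DateTime(2020, 1, 1, 12, 0, 0);
        var t1 = t0.AddMinutes(1);

        var merged = Merge(
            new[] { new Reading { DatetimeStamp = t0, Value = 40 }, new Reading { DatetimeStamp = t1, Value = 42 } },
            new[] { new Reading { DatetimeStamp = t0, Value = 20.5 } },
            new[] { new Reading { DatetimeStamp = t1, Value = 55 } });

        foreach (var r in merged)
            Console.WriteLine($"{r.DatetimeStamp:HH:mm} S={r.Sound} T={r.Temperature} H={r.Humidity}");
    }
}
```

Concat would work in place of Union here as well, because FirstOrDefault picks a single reading per type either way.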

LINQ merge multiple lists to one list

You can combine them on the client side.

var filtered = entitySource.Queryable()
.Where(ent => input.Id == ent.Id);

var rawData = await
    filtered.SelectMany(e => e.table1.Select(t => new { e.Id, SubId = t.Id }))
    .Concat(filtered.SelectMany(e => e.table2.Select(t => new { e.Id, SubId = t.Id })))
    .Concat(filtered.SelectMany(e => e.table3.Select(t => new { e.Id, SubId = t.Id })))
    .ToListAsync(cancellationToken);

var entities = rawData.GroupBy(x => x.Id)
    .Select(g => new MergedList()
    {
        PrincipalId = g.Key,
        CombinedIds = g.Select(x => x.SubId).ToList()
    })
    .ToList();

Combine two list of lists in LINQ

Using the Concat and GroupBy methods produces the expected result (after making the code compile, of course):

var result = list1.Concat(list2)
    .GroupBy(kv => kv.Key, kv => kv.DoubleList)
    .Select(g => new KeyValues { Key = g.Key, DoubleList = g.SelectMany(i => i).ToList() });
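
For a version you can run as-is, here is a sketch with an assumed KeyValues class (a string Key and a List<double> DoubleList; the question's real definition may differ). KeyValuesMerger and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the element type.
public class KeyValues
{
    public string Key { get; set; }
    public List<double> DoubleList { get; set; }
}

public static class KeyValuesMerger
{
    // Concatenate both lists, group the inner lists by key,
    // and flatten each group into one combined list.
    public static List<KeyValues> Merge(IEnumerable<KeyValues> list1, IEnumerable<KeyValues> list2) =>
        list1.Concat(list2)
            .GroupBy(kv => kv.Key, kv => kv.DoubleList)
            .Select(g => new KeyValues { Key = g.Key, DoubleList = g.SelectMany(d => d).ToList() })
            .ToList();

    public static void Main()
    {
        var list1 = new List<KeyValues>
        {
            new KeyValues { Key = "a", DoubleList = new List<double> { 1, 2 } },
            new KeyValues { Key = "b", DoubleList = new List<double> { 3 } },
        };
        var list2 = new List<KeyValues>
        {
            new KeyValues { Key = "a", DoubleList = new List<double> { 4 } },
        };

        foreach (var kv in Merge(list1, list2))
            Console.WriteLine($"{kv.Key}: {string.Join(", ", kv.DoubleList)}");
    }
}
```

Keys that appear in only one of the lists pass through unchanged, since their group holds a single inner list.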

Optimizing LINQ combining multiple lists into new generic list

That is what Zip is for.

var result = FirstNames
.Zip(LastNames, (f,l) => new {f,l})
.Zip(BirthDates, (fl, b) => new {First=fl.f, Last = fl.l, BirthDate = b});

Regarding scaling:

int count = 50000000;
var FirstNames = Enumerable.Range(0, count).Select(x=>x.ToString());
var LastNames = Enumerable.Range(0, count).Select(x=>x.ToString());
var BirthDates = Enumerable.Range(0, count).Select(x=> DateTime.Now.AddSeconds(x));

var sw = new Stopwatch();
sw.Start();

var result = FirstNames
.Zip(LastNames, (f,l) => new {f,l})
.Zip(BirthDates, (fl, b) => new {First=fl.f, Last = fl.l, BirthDate = b});

foreach(var r in result)
{
var x = r;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds); // Returns 69191 on my machine.

By contrast, these indexed versions run out of memory:

int count = 50000000;
var FirstNames = Enumerable.Range(0, count).Select(x=>x.ToString());
var LastNames = Enumerable.Range(0, count).Select(x=>x.ToString());
var BirthDates = Enumerable.Range(0, count).Select(x=> DateTime.Now.AddSeconds(x));

var sw = new Stopwatch();
sw.Start();

var FirstNamesList = FirstNames.ToList(); // Throws OutOfMemoryException in 32-bit .NET
var LastNamesList = LastNames.ToList();
var BirthDatesList = BirthDates.ToList();

var result = Enumerable.Range(0, FirstNamesList.Count)
    .Select(i => new
    {
        First = FirstNamesList[i],
        Last = LastNamesList[i],
        BirthDate = BirthDatesList[i] // same property name as below, so both queries share one anonymous type
    });

result = BirthDatesList.Select((bd, i) => new
{
    First = FirstNamesList[i],
    Last = LastNamesList[i],
    BirthDate = bd
});

foreach(var r in result)
{
var x = r;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

At lower counts, converting the enumerables to lists also costs much more than the additional object creation: Zip was approximately 30% faster than the indexed versions. As you add more columns, Zip's advantage would likely shrink.

The performance characteristics are also very different. The Zip routine starts producing results almost immediately, while the indexed versions produce nothing until every enumerable has been read in full and converted to a list. So if you paginate the results with .Skip(x).Take(y), or check whether something exists with .Any(...), Zip will be orders of magnitude faster because it doesn't have to convert the entire enumerable.
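
A small sketch can make the laziness visible by counting how many source elements are generated when only the first few zipped results are requested. LazyZipDemo and ElementsPulled are hypothetical names, and the one-million-element source is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyZipDemo
{
    // Returns how many elements the counting source had to generate
    // in order to serve the first `take` zipped results.
    public static int ElementsPulled(int take)
    {
        int pulled = 0;

        // A lazy source that records every element it hands out.
        IEnumerable<int> CountingSource()
        {
            for (int i = 0; i < 1_000_000; i++)
            {
                pulled++;
                yield return i;
            }
        }

        var names = CountingSource().Select(i => i.ToString());
        var ids = Enumerable.Range(0, 1_000_000);

        // Only the requested prefix is enumerated; nothing is converted to a list up front.
        foreach (var pair in names.Zip(ids, (n, id) => new { n, id }).Take(take))
        {
            // Consume the pair; no work needed for this demo.
        }

        return pulled;
    }

    public static void Main()
    {
        Console.WriteLine($"Elements generated for the first 3 results: {ElementsPulled(3)}");
    }
}
```

Calling ToList() on the source before zipping would generate all one million elements first, which is the cost the indexed versions pay.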

Lastly, if this becomes performance critical and you need to produce many results, you could consider extending Zip to handle more than two enumerables, like this three-sequence version (shamelessly stolen from Jon Skeet - https://codeblog.jonskeet.uk/2011/01/14/reimplementing-linq-to-objects-part-35-zip/):

// Declare this inside a static class; the "this" modifier makes it callable as first.Zip(second, third, ...).
public static IEnumerable<TResult> Zip<TFirst, TSecond, TThird, TResult>(
    this IEnumerable<TFirst> first,
    IEnumerable<TSecond> second,
    IEnumerable<TThird> third,
    Func<TFirst, TSecond, TThird, TResult> resultSelector)
{
    using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
    using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
    using (IEnumerator<TThird> iterator3 = third.GetEnumerator())
    {
        // Stops as soon as the shortest sequence is exhausted.
        while (iterator1.MoveNext() && iterator2.MoveNext() && iterator3.MoveNext())
        {
            yield return resultSelector(iterator1.Current, iterator2.Current, iterator3.Current);
        }
    }
}

Then you can do this:

var result = FirstNames
.Zip(LastNames, BirthDates, (f,l,b) => new {First=f,Last=l,BirthDate=b});

And now you don't even have the issue of the middle object being created, so you get the best of all worlds.
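
Put together as a complete program, a three-way Zip might look like the sketch below. ZipExtensions, ZipUsage, Describe, and the sample names are hypothetical; the extension method follows the same shape as the one in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipExtensions
{
    // Three-sequence Zip as an extension method; stops at the shortest input.
    public static IEnumerable<TResult> Zip<TFirst, TSecond, TThird, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        IEnumerable<TThird> third,
        Func<TFirst, TSecond, TThird, TResult> resultSelector)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        using (var e3 = third.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext() && e3.MoveNext())
                yield return resultSelector(e1.Current, e2.Current, e3.Current);
        }
    }
}

public static class ZipUsage
{
    // Combines the three columns into one string per person, with no intermediate pair object.
    public static List<string> Describe(string[] firstNames, string[] lastNames, int[] birthYears) =>
        firstNames.Zip(lastNames, birthYears, (f, l, y) => $"{f} {l} ({y})").ToList();

    public static void Main()
    {
        var people = Describe(
            new[] { "Ada", "Alan" },
            new[] { "Lovelace", "Turing" },
            new[] { 1815, 1912, 2000 }); // the extra year is ignored

        foreach (var p in people)
            Console.WriteLine(p);
    }
}
```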

Or use the implementation here to handle any number generically: Zip multiple/abitrary number of enumerables in C#

How to merge multiple list by id and get specific data?

I think I'd mostly skip LINQ for this:

class Thing {
    public string Name { get; set; }
    public int Count { get; set; }
    public long LastTimestamp { get; set; }
}

...

var ids = new Dictionary<int, string>();        // id -> name
var result = new Dictionary<string, Thing>();   // name -> collected result
foreach (var g in groupNames) {
    ids[g.Id] = g.Name;
    result[g.Name] = new Thing { Name = g.Name };
}

// Each entry in counts bumps the count of the group it belongs to.
foreach (var c in counts)
    result[ids[c.Id]].Count++;

// Keep the largest timestamp seen per group.
foreach (var l in lastTime) {
    var t = result[ids[l.Id]];
    if (t.LastTimestamp < l.Timestamp) t.LastTimestamp = l.Timestamp;
}

We start off by making two dictionaries (you could use ToDictionary for this). If groupNames is already a dictionary that maps id:name, you can skip making the ids dictionary and just use groupNames directly. This gives us fast lookup from ID to name, but we actually want to collect results into a name:something mapping, so we make one of those too. Doing result[name] = thing always succeeds, even if we've seen the name before. You could save some object creation with a ContainsKey check here if you want.

Then all we need to do is enumerate our other N collections, building the result. The result we want is accessed from result[ids[some_id_value_here]], and it always exists if the ID space of groupNames is complete (that is, we never have an id in counts that we do not have in groupNames).

For counts, we don't care about any of the other data; the presence of the id alone is enough to increment the count.

For dates, it's a simple max algorithm: "if the known max is less than the new value, make the known max the new value". If you know your dates list is sorted ascending, you can skip that if as well.
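
As a rough sketch of the ToDictionary and ContainsKey variations mentioned above, with made-up input shapes: the tuples, GroupMerger, and its parameter names are assumptions, not the question's types. Unlike the loops above, this version skips ids that are missing from groupNames rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Thing
{
    public string Name { get; set; }
    public int Count { get; set; }
    public long LastTimestamp { get; set; }
}

public static class GroupMerger
{
    // Inputs are simplified to tuples and plain ids for the sake of the sketch.
    public static Dictionary<string, Thing> Merge(
        IEnumerable<(int Id, string Name)> groupNames,
        IEnumerable<int> countIds,
        IEnumerable<(int Id, long Timestamp)> lastTimes)
    {
        // ToDictionary throws if an id repeats, so ids are assumed unique here.
        var ids = groupNames.ToDictionary(g => g.Id, g => g.Name);

        var result = new Dictionary<string, Thing>();
        foreach (var groupName in ids.Values)
        {
            // ContainsKey avoids creating a second Thing for a repeated name.
            if (!result.ContainsKey(groupName))
                result[groupName] = new Thing { Name = groupName };
        }

        // Count one occurrence per id; unknown ids are ignored.
        foreach (var id in countIds)
        {
            if (ids.TryGetValue(id, out var countedName))
                result[countedName].Count++;
        }

        // Simple running max of the timestamp per group.
        foreach (var entry in lastTimes)
        {
            if (!ids.TryGetValue(entry.Id, out var timedName)) continue;
            var thing = result[timedName];
            if (thing.LastTimestamp < entry.Timestamp) thing.LastTimestamp = entry.Timestamp;
        }

        return result;
    }

    public static void Main()
    {
        var merged = Merge(
            new[] { (1, "a"), (2, "b") },
            new[] { 1, 1, 2, 3 },
            new[] { (1, 10L), (1, 5L), (2, 7L) });

        foreach (var t in merged.Values)
            Console.WriteLine($"{t.Name}: count={t.Count}, last={t.LastTimestamp}");
    }
}
```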


