LINQ Aggregate Algorithm Explained

The easiest way to understand Aggregate is that it performs an operation on each element of the list, taking into account the operations that have gone before. That is, it performs the action on the first and second elements and carries the result forward. Then it operates on the previous result and the third element and carries that forward, and so on.

Example 1. Summing numbers

var nums = new[]{1,2,3,4};
var sum = nums.Aggregate( (a,b) => a + b);
Console.WriteLine(sum); // output: 10 (1+2+3+4)

This adds 1 and 2 to make 3. Then adds 3 (result of previous) and 3 (next element in sequence) to make 6. Then adds 6 and 4 to make 10.
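
The carry-forward described above can be written out by hand. This is a minimal sketch of the idea, not the framework's actual source; the names AggregateTrace and ManualAggregate are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AggregateTrace
{
    // Hypothetical helper: a hand-written equivalent of the unseeded Aggregate.
    public static T ManualAggregate<T>(IEnumerable<T> source, Func<T, T, T> func)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements");

            T result = e.Current;                 // the first element starts the running result
            while (e.MoveNext())
                result = func(result, e.Current); // combine with the next element, carry forward
            return result;
        }
    }

    static void Main()
    {
        var nums = new[] { 1, 2, 3, 4 };
        Console.WriteLine(ManualAggregate(nums, (a, b) => a + b)); // 10
        Console.WriteLine(nums.Aggregate((a, b) => a + b));        // 10
    }
}
```

Both calls walk the sequence once, left to right, so the lambda is invoked three times for four elements.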

Example 2. Create a CSV from an array of strings

var chars = new []{"a","b","c", "d"};
var csv = chars.Aggregate( (a,b) => a + ',' + b);
Console.WriteLine(csv); // Output a,b,c,d

This works in much the same way. It concatenates a, a comma, and b to make a,b. Then it concatenates a,b with a comma and c to make a,b,c, and so on.

Example 3. Multiplying numbers using a seed

For completeness, there is an overload of Aggregate which takes a seed value.

var multipliers = new []{10,20,30,40};
var multiplied = multipliers.Aggregate(5, (a,b) => a * b);
Console.WriteLine(multiplied); //Output 1200000 ((((5*10)*20)*30)*40)

Much like the above examples, this starts with a value of 5 and multiplies it by the first element of the sequence, 10, giving a result of 50. This result is carried forward and multiplied by the next number in the sequence, 20, to give a result of 1000. This continues through the remaining 2 elements of the sequence.
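
One practical difference between the two overloads shows up with empty input: the seeded overload returns the seed, while the unseeded one throws InvalidOperationException. A small sketch, where SeedDemo and ProductWithSeed are hypothetical names used only for this illustration:

```csharp
using System;
using System.Linq;

static class SeedDemo
{
    // Hypothetical helper wrapping the seeded overload from Example 3.
    public static int ProductWithSeed(int[] values, int seed)
    {
        return values.Aggregate(seed, (a, b) => a * b);
    }

    static void Main()
    {
        Console.WriteLine(ProductWithSeed(new[] { 10, 20, 30, 40 }, 5)); // 1200000

        var empty = new int[0];
        Console.WriteLine(ProductWithSeed(empty, 5)); // 5: nothing to fold, so the seed comes back

        try
        {
            empty.Aggregate((a, b) => a * b); // no seed and no first element to start from
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Unseeded Aggregate threw on an empty sequence");
        }
    }
}
```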

Live examples: http://rextester.com/ZXZ64749

Docs: http://msdn.microsoft.com/en-us/library/bb548651.aspx


Addendum

Example 2, above, uses string concatenation to create a list of values separated by a comma. This is a simplistic way to explain the use of Aggregate which was the intention of this answer. However, if using this technique to actually create a large amount of comma separated data, it would be more appropriate to use a StringBuilder, and this is entirely compatible with Aggregate using the seeded overload to initiate the StringBuilder.

var chars = new []{"a","b","c", "d"};
var csv = chars.Aggregate(new StringBuilder(), (a,b) => {
    if(a.Length>0)
        a.Append(",");
    a.Append(b);
    return a;
});
Console.WriteLine(csv);

Updated example: http://rextester.com/YZCVXV6464
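
If a string rather than a StringBuilder is wanted as the final result, Aggregate also has a three-argument overload that takes a result selector applied once at the end. The sketch below assumes that overload; CsvDemo and ToCsv are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text;

static class CsvDemo
{
    // Hypothetical helper: seed, accumulator function, then a result selector.
    public static string ToCsv(string[] items)
    {
        return items.Aggregate(
            new StringBuilder(),                       // seed
            (sb, s) => sb.Length > 0
                ? sb.Append(',').Append(s)             // later items get a leading comma
                : sb.Append(s),                        // first item is appended as-is
            sb => sb.ToString());                      // convert the accumulator once at the end
    }

    static void Main()
    {
        Console.WriteLine(ToCsv(new[] { "a", "b", "c", "d" })); // a,b,c,d
    }
}
```

Like the version above, this uses the builder's length to detect the first item, so a leading empty string would not be followed by a comma.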

How can I use LINQ Aggregate here?

To tell the truth, either of these is enough:

var sum = Groups.Select(x => x.Width).Sum();
// or, more concisely:
var sum = Groups.Sum(x => x.Width);

But, if you want Aggregate():

var sum = Groups.Select(x => x.Width).Aggregate((current, next) => current + next);
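
The Select() step can also be dropped by using the seeded overload, which lets the accumulator be an int while the elements stay as objects. A minimal sketch, assuming a Group class with an int Width property (the question's real type is not shown, so this one is a stand-in):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's element type.
class Group
{
    public int Width { get; set; }
}

static class WidthDemo
{
    public static int TotalWidth(IEnumerable<Group> groups)
    {
        // The seed 0 fixes the accumulator type to int; each step adds one Width.
        return groups.Aggregate(0, (total, g) => total + g.Width);
    }

    static void Main()
    {
        var groups = new List<Group>
        {
            new Group { Width = 10 },
            new Group { Width = 20 },
            new Group { Width = 30 }
        };
        Console.WriteLine(TotalWidth(groups)); // 60
    }
}
```

Unlike the unseeded form, this returns 0 for an empty collection instead of throwing.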

trying to use linq aggregate function

I fixed this issue by using Aggregate as shown below:

LibrarySourceRowInputs = libraries?.Aggregate(new List<LibrarySourceRowInput>(),
    (acc, ids) =>
    {
        if (ids != null)
        {
            acc.Add(new LibrarySourceRowInput()
            {
                LibrarySourceId = ids.Id,
                SourceOfDataId = ids.SourceOfData.Id
            });
        }
        return acc;
    })

Using LINQ, how to aggregate a collection to a specified type?

You can use one of the overloads of LINQ's Aggregate to do this:

Aggregate<TSource,TAccumulate>(IEnumerable<TSource>, TAccumulate, Func<TAccumulate,TSource,TAccumulate>)

For example:

var aggregate = data.Aggregate(0, (acc, cur) => acc ^ Math.Max(cur.i, cur.j));

As described in the docs:

Applies an accumulator function over a sequence. The specified seed
value is used as the initial accumulator value.
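
To make the one-liner above concrete, here is a runnable sketch. The Pair class with fields i and j is an assumption standing in for the question's element type, which is not shown.

```csharp
using System;
using System.Linq;

// Hypothetical element type with the two fields the lambda reads.
class Pair
{
    public int i;
    public int j;
}

static class XorDemo
{
    public static int XorOfMax(Pair[] data)
    {
        // Seed 0 makes the accumulator an int, even though the elements are Pair objects.
        return data.Aggregate(0, (acc, cur) => acc ^ Math.Max(cur.i, cur.j));
    }

    static void Main()
    {
        var data = new[]
        {
            new Pair { i = 1, j = 2 },   // max 2: 0 ^ 2 = 2
            new Pair { i = 3, j = 1 },   // max 3: 2 ^ 3 = 1
            new Pair { i = 4, j = 4 }    // max 4: 1 ^ 4 = 5
        };
        Console.WriteLine(XorOfMax(data)); // 5
    }
}
```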

Aggregate and break down data using linq

Try the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("ID", typeof(int));
            dt.Columns.Add("Label", typeof(string));
            dt.Columns.Add("Type", typeof(int));
            dt.Columns.Add("Amount", typeof(int));
            dt.Columns.Add("Category", typeof(int));
            dt.Columns.Add("OriginDate", typeof(DateTime));

            dt.Rows.Add(new object[] { 1,"Foo",1,100,8, DateTime.Parse("2017-01-23")});
            dt.Rows.Add(new object[] { 2,"Bar",2,250,1, DateTime.Parse("2017-01-30")});
            dt.Rows.Add(new object[] { 3,"Foo",1,400,12, DateTime.Parse("2017-02-15")});

            // Group rows by (month, year) of OriginDate and total the Amount per group.
            var results = dt.AsEnumerable()
                .GroupBy(x => new { month = x.Field<DateTime>("OriginDate").Month, year = x.Field<DateTime>("OriginDate").Year })
                .Select(x => new {
                    month = x.Key.month,
                    year = x.Key.year,
                    amount = x.Select(y => y.Field<int>("Amount")).Sum()
                })
                .ToList();
        }
    }
}

Aggregate vs Sum Performance in LINQ

Note: My computer is running .NET 4.5 RC, so it's possible that my results are affected by this.

Measuring the time it takes to execute a method just once is usually not very useful. It can be easily dominated by things like JIT compilation, which are not actual bottlenecks in real code. Because of this, I measured executing each method 100× (in Release mode without debugger attached). My results are:

  • Aggregate(): 9 ms
  • Sum(lambda): 12 ms
  • Sum(): 6 ms

The fact that Sum() is the fastest is not surprising: it contains a simple loop without any delegate invocations, which is really fast. The difference between Sum(lambda) and Aggregate() is not nearly as prominent as what you measured, but it's still there. What could be the reason for it? Let's look at decompiled code for the two methods:

public static TAccumulate Aggregate<TSource, TAccumulate>(this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func)
{
    if (source == null)
        throw Error.ArgumentNull("source");
    if (func == null)
        throw Error.ArgumentNull("func");

    TAccumulate local = seed;
    foreach (TSource local2 in source)
        local = func(local, local2);
    return local;
}

public static int Sum<TSource>(this IEnumerable<TSource> source, Func<TSource, int> selector)
{
    return source.Select<TSource, int>(selector).Sum();
}

As you can see, Aggregate() uses a loop but Sum(lambda) uses Select(), which in turn uses an iterator. And using an iterator means there is some overhead: creating the iterator object and (probably more importantly) one more method invocation for each item.

Let's verify that using Select() is actually the reason by writing our own Sum(lambda) twice, once using Select(), which should behave the same as Sum(lambda) from the framework, and once without using Select():

public static int SlowSum<T>(this IEnumerable<T> source, Func<T, int> selector)
{
    return source.Select(selector).Sum();
}

public static int FastSum<T>(this IEnumerable<T> source, Func<T, int> selector)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (selector == null)
        throw new ArgumentNullException("selector");

    int num = 0;
    foreach (T item in source)
        num += selector(item);
    return num;
}

My measurements confirm what I thought:

  • SlowSum(lambda): 12 ms
  • FastSum(lambda): 9 ms
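
For anyone who wants to repeat the comparison, the following is a rough timing harness, not the one used for the numbers above. The data size, the warm-up call, and the names SumBenchmark and TimeMs are all assumptions, and results will differ between machines and runtime versions (newer runtimes may implement Sum(lambda) differently from the decompiled code shown earlier).

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class SumBenchmark
{
    // Hypothetical helper: runs the action once to warm up (JIT), then times repeated runs.
    public static long TimeMs(Func<int> action, int iterations)
    {
        action();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Values are kept small (0..99) so the int sum cannot overflow.
        int[] data = Enumerable.Range(0, 1000000).Select(i => i % 100).ToArray();

        Console.WriteLine("Aggregate(): " + TimeMs(() => data.Aggregate(0, (acc, x) => acc + x), 100) + " ms");
        Console.WriteLine("Sum(lambda): " + TimeMs(() => data.Sum(x => x), 100) + " ms");
        Console.WriteLine("Sum():       " + TimeMs(() => data.Sum(), 100) + " ms");
    }
}
```

Run it in Release mode without a debugger attached, as described above; a dedicated benchmarking library would give more trustworthy numbers than a Stopwatch loop.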

Linq operator to aggregate matching value and weight them

Using the indexed Select method, you can match each timeframe to its weight when it should be counted:

double[] confirmingList = new double[] { -3, -2, -1 };
var weights = new[] { 0.2, 0.25, 0.3, 0.4, 0.5, 0.8, 1.0 };

var Tdb = rDatas
    .Select(d => new {
        TScore = (new[] {
            d.T15Min,  // Let this count .2
            d.T30Min,  // .25
            d.T65Min,  // .3
            d.T130Min, // .4
            d.T195Min, // .5
            d.TDaily,  // .8
            d.TWeekly  // 1
        }).Select((t,i) => confirmingList.Contains(t) ? weights[i] : 0).Sum()
    });
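
A self-contained version of the same idea, for trying it out without the original rDatas type. Here a plain double[] stands in for one row's seven timeframe values; WeightDemo and Score are hypothetical names.

```csharp
using System;
using System.Linq;

static class WeightDemo
{
    static readonly double[] ConfirmingList = { -3, -2, -1 };
    static readonly double[] Weights = { 0.2, 0.25, 0.3, 0.4, 0.5, 0.8, 1.0 };

    // timeframes[i] is weighted by Weights[i] when its value is in ConfirmingList.
    public static double Score(double[] timeframes)
    {
        return timeframes
            .Select((t, i) => ConfirmingList.Contains(t) ? Weights[i] : 0)
            .Sum();
    }

    static void Main()
    {
        // Positions 0, 2 and 4 hold confirming values, so 0.2 + 0.3 + 0.5 is counted.
        var row = new double[] { -3, 0, -1, 5, -2, 0, 1 };
        Console.WriteLine(Score(row)); // 1
    }
}
```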

C# aggregate in a better time complexity

You can create a LINQ-like extension method that aggregates a sequence "top-down" in a hierarchical fashion like you describe. However, for efficiency this requires random access to the source sequence, so building on top of IEnumerable<T> is not the best choice. But you can use IReadOnlyList<T> as an alternative (which of course requires that your source is stored in an array or list).

static class ReadOnlyListExtensions {

    public static T HierarchicalAggregate<T>(this IReadOnlyList<T> source, Func<T, T, T> func) {
        if (source == null)
            throw new ArgumentNullException("source");
        if (func == null)
            throw new ArgumentNullException("func");
        if (source.Count == 0)
            throw new InvalidOperationException("Sequence contains no elements");
        return Recurse(source, 0, source.Count, func);
    }

    static T Recurse<T>(this IReadOnlyList<T> source, Int32 startIndex, Int32 count, Func<T, T, T> func) {
        if (count == 1)
            return source[startIndex];
        var leftCount = count/2;
        var leftAggregate = Recurse(source, startIndex, leftCount, func);
        var rightCount = count - leftCount;
        var rightAggregate = Recurse(source, startIndex + leftCount, rightCount, func);
        return func(leftAggregate, rightAggregate);
    }

}

Note that the division performed by this algorithm is slightly different compared to your example. At the first level the 10-element sequence is divided into two 5-element sequences, each of which is then divided into a 2-element and a 3-element sequence, and so on:


                 55
         15      +      40
     3  +  12        13  +  27
   1+2     3+9     6+7     8+19
           4+5             9+10
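
One way to see the grouping for yourself is to aggregate strings instead of numbers, so that every combination step leaves a pair of parentheses behind. This sketch repeats the extension method so that it compiles on its own; HierarchyDemo and Trace are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ReadOnlyListExtensions
{
    public static T HierarchicalAggregate<T>(this IReadOnlyList<T> source, Func<T, T, T> func)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (func == null) throw new ArgumentNullException("func");
        if (source.Count == 0) throw new InvalidOperationException("Sequence contains no elements");
        return Recurse(source, 0, source.Count, func);
    }

    static T Recurse<T>(IReadOnlyList<T> source, int startIndex, int count, Func<T, T, T> func)
    {
        if (count == 1)
            return source[startIndex];
        int leftCount = count / 2;                 // left half gets the smaller share
        T left = Recurse(source, startIndex, leftCount, func);
        T right = Recurse(source, startIndex + leftCount, count - leftCount, func);
        return func(left, right);
    }
}

static class HierarchyDemo
{
    // Hypothetical helper: records the shape of the recursion as a parenthesized string.
    public static string Trace(int n)
    {
        List<string> labels = Enumerable.Range(1, n).Select(i => i.ToString()).ToList();
        return labels.HierarchicalAggregate((l, r) => "(" + l + "+" + r + ")");
    }

    static void Main()
    {
        Console.WriteLine(Trace(10));
        // (((1+2)+(3+(4+5)))+((6+7)+(8+(9+10))))

        List<int> nums = Enumerable.Range(1, 10).ToList();
        Console.WriteLine(nums.HierarchicalAggregate((a, b) => a + b)); // 55
    }
}
```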

C# Linq aggregate intermediate values

What you need is a custom version of aggregate:

public static IEnumerable<R> AggregateSequence<A, R>(
    this IEnumerable<A> items,
    Func<A, R, R> aggregator,
    R initial)
{
    // Error cases go here.
    R result = initial;
    foreach(A item in items)
    {
        result = aggregator(item, result);
        yield return result;
    }
}

That is a general mechanism for solving your specific problem:

public static IEnumerable<int> MovingSum(this IEnumerable<int> items)
{
    return items.AggregateSequence( (item, sum) => item + sum, 0 );
}

And now you can solve your problem with

mySequence.MovingSum().Max();
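
Put together as a runnable sketch, with the two methods above placed in a static class. The class name SequenceExtensions, the RunningSumDemo wrapper, and the sample input are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SequenceExtensions
{
    // Yields every intermediate accumulator value instead of only the last one.
    public static IEnumerable<R> AggregateSequence<A, R>(
        this IEnumerable<A> items,
        Func<A, R, R> aggregator,
        R initial)
    {
        R result = initial;
        foreach (A item in items)
        {
            result = aggregator(item, result);
            yield return result;
        }
    }

    public static IEnumerable<int> MovingSum(this IEnumerable<int> items)
    {
        return items.AggregateSequence((item, sum) => item + sum, 0);
    }
}

static class RunningSumDemo
{
    static void Main()
    {
        var mySequence = new[] { 3, 4, -10, 2 };

        // Running sums: 3, 7, -3, -1
        Console.WriteLine(string.Join(", ", mySequence.MovingSum()));

        // The largest running sum is reached after the second element.
        Console.WriteLine(mySequence.MovingSum().Max()); // 7
    }
}
```

Because AggregateSequence is an iterator, the running sums are produced lazily; Max() consumes them in a single pass.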

