Why Is AddRange Faster Than Using a Foreach Loop

Why is AddRange faster than using a foreach loop?

Potentially, AddRange can check whether the value passed to it implements ICollection<T>. If it does, it can find out how many values are in the range, and thus how much space it needs to allocate up front... whereas the foreach loop may need to reallocate the underlying array several times.

Additionally, even after allocation, List<T> can use ICollection<T>.CopyTo to perform a bulk copy into the underlying array (for ranges which implement ICollection<T>, of course).

I suspect you'll find that if you try your test again but using Enumerable.Range(0, 100000) for fillData instead of a List<T>, the two will take about the same time.
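A minimal sketch of that re-test (fillData is the name from the original question; the Stopwatch timing here is illustrative, not a rigorous benchmark):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// A lazy sequence: no cheap Count is available up front,
// so AddRange loses its pre-allocation advantage.
IEnumerable<int> fillData = Enumerable.Range(0, 100000);

var sw = Stopwatch.StartNew();
var viaAddRange = new List<int>();
viaAddRange.AddRange(fillData);
sw.Stop();
Console.WriteLine($"AddRange: {sw.ElapsedTicks} ticks");

sw.Restart();
var viaForeach = new List<int>();
foreach (int i in fillData)
    viaForeach.Add(i);
sw.Stop();
Console.WriteLine($"foreach:  {sw.ElapsedTicks} ticks");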

How can I improve performance of an AddRange method on a custom BindingList?

You can pass in a List in the constructor and make use of List<T>.Capacity.

But I bet the most significant speedup will come from suspending events while adding the range, so I included both things in my example code.

It probably needs some fine-tuning to handle worst cases and the like.

using System;
using System.Collections.Generic;
using System.ComponentModel;

public class MyBindingList<I> : BindingList<I>
{
    private readonly List<I> _baseList;

    public MyBindingList() : this(new List<I>())
    {
    }

    public MyBindingList(List<I> baseList) : base(baseList)
    {
        if (baseList == null)
            throw new ArgumentNullException(nameof(baseList));
        _baseList = baseList;
    }

    public void AddRange(IEnumerable<I> vals)
    {
        // If the source knows its Count, pre-size the underlying list so the
        // backing array is allocated once instead of being resized repeatedly.
        ICollection<I> collection = vals as ICollection<I>;
        if (collection != null)
        {
            int requiredCapacity = Count + collection.Count;
            if (requiredCapacity > _baseList.Capacity)
                _baseList.Capacity = requiredCapacity;
        }

        bool restore = RaiseListChangedEvents;
        try
        {
            // Suspend per-item ListChanged events while adding.
            RaiseListChangedEvents = false;
            foreach (I v in vals)
                Add(v); // We can't call _baseList.Add, otherwise events won't get hooked.
        }
        finally
        {
            RaiseListChangedEvents = restore;
            if (RaiseListChangedEvents)
                ResetBindings(); // raise a single Reset instead of N ItemAdded events
        }
    }
}

You cannot use _baseList.AddRange, since BindingList<T> won't hook the PropertyChanged event then. You can bypass this only via Reflection, by calling the private method HookPropertyChanged for each item after AddRange, as sketched below. This, however, only makes sense if vals (your method parameter) is a collection; otherwise you risk enumerating the enumerable twice.
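A hedged sketch of that reflection workaround. It relies on the private HookPropertyChanged(T item) method of the .NET Framework's BindingList<T>, so it is fragile and can break across runtime versions:

using System.Reflection;

// Inside MyBindingList<I>, after calling _baseList.AddRange(collection):
MethodInfo hook = typeof(BindingList<I>).GetMethod(
    "HookPropertyChanged",
    BindingFlags.Instance | BindingFlags.NonPublic);

foreach (I item in collection)
    hook?.Invoke(this, new object[] { item });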

That's the closest you can get to "optimal" without writing your own BindingList, which shouldn't be too difficult, as you could copy the source code of BindingList<T> and alter the parts to your needs.
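For illustration, a minimal usage sketch of the class above (int as the item type is arbitrary). Because events are suspended during the add loop, ListChanged fires once as a Reset for the whole range rather than once per item:

using System;
using System.Linq;

var bindingList = new MyBindingList<int>();
bindingList.ListChanged += (s, e) => Console.WriteLine(e.ListChangedType);

bindingList.AddRange(Enumerable.Range(0, 1000)); // prints "Reset" once, not 1000 x "ItemAdded"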

C# .NET - Create a list from two object lists - Performance differences

Well, you probably won't notice a difference, so your old way with loops is perfectly fine, but with way 1 you save a few lines of code, as shown below. List<T>.AddRange can indeed be a little more efficient, because it can initialize the list (and its underlying storage, which is an array) with the correct size when you pass a type that is a collection (i.e. has a Count property). If the array is not initialized with the correct size, it must be resized repeatedly during the loop.
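For example, a sketch of combining two lists this way (int lists used for brevity):

using System.Collections.Generic;

var listA = new List<int> { 1, 2, 3 };
var listB = new List<int> { 4, 5, 6 };

// Pre-sized: the backing array is allocated once and never resized.
var combined = new List<int>(listA.Count + listB.Count);
combined.AddRange(listA); // bulk copy via ICollection<T>.CopyTo
combined.AddRange(listB);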

You can see the optimization approach in the source (InsertRange, which is called from AddRange):

public void InsertRange(int index, IEnumerable<T> collection) {
    // ...

    ICollection<T> c = collection as ICollection<T>;
    if (c != null) {    // if collection is ICollection<T>
        // this is the optimized path taken when you use AddRange with a list
        int count = c.Count;
        if (count > 0) {
            EnsureCapacity(_size + count);
            if (index < _size) {
                Array.Copy(_items, index, _items, index + count, _size - index);
            }

            // If we're inserting a List into itself, we want to be able to deal with that.
            if (this == c) {
                // Copy first part of _items to insert location
                Array.Copy(_items, 0, _items, index, index);
                // Copy last part of _items back to inserted location
                Array.Copy(_items, index + count, _items, index * 2, _size - index);
            }
            else {
                T[] itemsToInsert = new T[count];
                c.CopyTo(itemsToInsert, 0);
                itemsToInsert.CopyTo(_items, index);
            }
            _size += count;
        }
    }
    else {
        // this is the one-by-one loop path that you use
        using (IEnumerator<T> en = collection.GetEnumerator()) {
            while (en.MoveNext()) {
                Insert(index++, en.Current);
            }
        }
    }
    _version++;
}

Entity Framework 6 DbSet AddRange vs IDbSet Add - How Can AddRange be so much faster?

As Jakub answered, calling SaveChanges after every added entity does not help. But even if you move it out of the loop, you would still see performance problems, because that alone does not fix the issue caused by the Add method.

Add vs AddRange

It's a very common error to use the Add method to add multiple entities. In fact, it's the DetectChanges method that's INSANELY slow.

  • The Add method calls DetectChanges after every record added.
  • The AddRange method calls DetectChanges once, after all records are added.

See: Entity Framework - Performance Add
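A minimal sketch of the difference, assuming a typical EF 6 context (context, Customers, and customersToInsert are placeholder names):

// Slow: DetectChanges runs inside every Add call, so the cost grows quadratically.
foreach (var customer in customersToInsert)
    context.Customers.Add(customer);

// Fast: DetectChanges runs once for the whole batch.
context.Customers.AddRange(customersToInsert);

// Alternative: keep Add but suspend automatic change detection around the loop.
context.Configuration.AutoDetectChangesEnabled = false;
try
{
    foreach (var customer in customersToInsert)
        context.Customers.Add(customer);
}
finally
{
    context.Configuration.AutoDetectChangesEnabled = true;
}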


It is perhaps not SqlBulkCopy fast, but it is still a huge improvement

It's possible to get performance VERY close to SqlBulkCopy.

Disclaimer: I'm the owner of the project Entity Framework Extensions

(This library is NOT free)

This library can make your code more efficient by allowing you to save multiple entities at once. All bulk operations are supported:

  • BulkSaveChanges
  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge
  • BulkSynchronize

Example:

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
    operation.ColumnPrimaryKeyExpression = customer => customer.Code;
});

foreach loop repeats result of SearchResult ResultPropertyCollection more than once in a list of 31 users

This is likely because you're outputting the entire GroupMembers collection on every iteration of the two enclosing foreach loops.

Instead, move the code that outputs the GroupMembers information to the console so it runs after the outer foreach loop, once you've finished populating the collection:

foreach (SearchResult result in search.FindAll())
{
    ResultPropertyCollection resultProperties = result.Properties;

    foreach (string groupMemberDN in resultProperties["uniqueMember"])
    {
        DirectoryEntry directoryMember = new DirectoryEntry(
            "LDAP://123.45.678.9:389/" + groupMemberDN,
            "uid=test_user, ou=test, dc=test2,dc=test3", "test-abc",
            AuthenticationTypes.None);

        GroupMembers.Add(directoryMember.Properties["mail"][0].ToString());
    }
}

// Now that our collection is fully populated, output it to the console
foreach (string member in GroupMembers)
{
    Console.WriteLine(member);
}

Performance of Entity Framework

You should always prefer AddRange over Add when inserting multiple entities. The Add method triggers DetectChanges every time it is invoked, while AddRange triggers it only once.

public static void SaveCombiners()
{
    using (var db = new IP_dbEntities())
    {
        db.COMBINERs.RemoveRange(db.COMBINERs);

        List<COMBINER> list = new List<COMBINER>();

        foreach (var type1 in EventTypesList)
        {
            foreach (var type2 in EventTypesList)
            {
                list.Add(new COMBINER()
                {
                    EVENTS_TYPE = db.EVENTS_TYPE.Single(type => type.event_type == type1),
                    EVENTS_TYPE1 = db.EVENTS_TYPE.Single(type => type.event_type == type2),
                    combine_status = _eventTypesCombinerCollection[type1][type2].Value == true ? "+" : "-"
                });
            }
        }

        db.COMBINERs.AddRange(list);
        db.SaveChanges();
    }
}

That being said, you face another performance issue.

A database round trip is required for every record to delete or add. So if you delete 10,000 records and add 5,000 records, 15,000 database round trips are required, which is VERY slow.

Disclaimer: I'm the owner of the project Entity Framework Extensions

This library allows you to perform bulk operations within Entity Framework. You simply replace "SaveChanges" with "BulkSaveChanges" to dramatically improve performance.

public static void SaveCombiners()
{
    using (var db = new IP_dbEntities())
    {
        db.COMBINERs.RemoveRange(db.COMBINERs);
        // ... code..
        db.COMBINERs.AddRange(list);

        db.BulkSaveChanges();
    }
}

Parallel.ForEach taking more time than normal foreach in C#

You are locking inside the parallel part, so each thread waits for the active thread to release the lock. The result is nearly the same as the sequential foreach, with thread-scheduling overhead added on top.

For example:

Parallel.ForEach:

Cycle 1
    Thread1: filingListnew.Add(Object1);
    Thread2: locked
    Thread3: locked
    Thread4: locked

Cycle 2
    Thread1: locked
    Thread2: filingListnew.Add(Object2);
    Thread3: locked
    Thread4: locked

Cycle 3
    ...

"Normal" foreach:

Cycle 1
    Main thread: filingListnew.Add(Object1);

Cycle 2
    Main thread: filingListnew.Add(Object2);

Cycle 3
    ...

As you can see in the examples, you cannot gain performance the way you are using Parallel.ForEach: the lock serializes exactly the work you were hoping to parallelize.
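One way around this, sketched under assumptions (sourceItems, Filing, and Transform are hypothetical stand-ins for the input sequence, the item type, and the real per-item work): give each thread a private local list and merge under the lock once per thread rather than once per item.

using System.Collections.Generic;
using System.Threading.Tasks;

var filingListnew = new List<Filing>();
object sync = new object();

Parallel.ForEach(
    sourceItems,                  // the input sequence from the original loop
    () => new List<Filing>(),     // thread-local list: no lock needed while adding
    (item, loopState, localList) =>
    {
        localList.Add(Transform(item)); // Transform: hypothetical per-item work
        return localList;
    },
    localList =>
    {
        // Merge once per thread, so the lock is taken a handful of times total.
        lock (sync) { filingListnew.AddRange(localList); }
    });

Note that this only pays off if the per-item work is substantial; for a bare Add, the plain sequential foreach remains faster.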


