Best way to iterate over an array without blocking the UI

You have a choice of doing this with or without web workers:

Without WebWorkers

For code that needs to interact with the DOM or with lots of other state in your app, you can't use a web worker, so the usual solution is to break your work into chunks and do each chunk of work on a timer. The break between chunks gives the browser engine a chance to process other events, which not only allows user input to get processed, but also allows the screen to redraw.

Usually, you can afford to process more than one item on each timer tick, which is both more efficient and faster than doing only one per tick. This code gives the UI thread a chance to process any pending UI events between each chunk, which keeps the UI responsive.

function processLargeArray(array) {
    // set this to whatever number of items you can process at once
    var chunk = 100;
    var index = 0;
    function doChunk() {
        var cnt = chunk;
        while (cnt-- && index < array.length) {
            // process array[index] here
            ++index;
        }
        if (index < array.length) {
            // set Timeout for async iteration
            setTimeout(doChunk, 1);
        }
    }
    doChunk();
}

processLargeArray(veryLargeArray);

Here's a working example of the concept - not this same function, but a different long running process that uses the same setTimeout() idea to test out a probability scenario with a lot of iterations: http://jsfiddle.net/jfriend00/9hCVq/


You can turn the above into a more generic version that calls a callback function the way .forEach() does, like this:

// last two args are optional
function processLargeArrayAsync(array, fn, chunk, context) {
    context = context || window;
    chunk = chunk || 100;
    var index = 0;
    function doChunk() {
        var cnt = chunk;
        while (cnt-- && index < array.length) {
            // callback called with args (value, index, array)
            fn.call(context, array[index], index, array);
            ++index;
        }
        if (index < array.length) {
            // set Timeout for async iteration
            setTimeout(doChunk, 1);
        }
    }
    doChunk();
}

processLargeArrayAsync(veryLargeArray, myCallback, 100);

Rather than guessing how many to chunk at once, it's also possible to let elapsed time be the guide for each chunk and to let it process as many as it can in a given time interval. This somewhat automatically guarantees browser responsiveness regardless of how CPU intensive the iteration is. So, rather than passing in a chunk size, you can pass in a millisecond value (or just use an intelligent default):

// last two args are optional
function processLargeArrayAsync(array, fn, maxTimePerChunk, context) {
    context = context || window;
    maxTimePerChunk = maxTimePerChunk || 200;
    var index = 0;

    function now() {
        return new Date().getTime();
    }

    function doChunk() {
        var startTime = now();
        while (index < array.length && (now() - startTime) <= maxTimePerChunk) {
            // callback called with args (value, index, array)
            fn.call(context, array[index], index, array);
            ++index;
        }
        if (index < array.length) {
            // set Timeout for async iteration
            setTimeout(doChunk, 1);
        }
    }
    doChunk();
}

processLargeArrayAsync(veryLargeArray, myCallback);

With WebWorkers

If the code in your loop does not need to access the DOM, then it is possible to put all the time-consuming code into a web worker. The web worker runs independently of the main browser JavaScript and, when it's done, it can communicate back any results with postMessage().

A web worker requires separating out all the code that will run in the worker into a separate script file, but it can then run to completion without blocking the processing of other events in the browser and without any risk of triggering the "unresponsive script" prompt that may come up when running a long process on the main thread.
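As a rough sketch of that split, here is one way the main-thread side might look. The file name "worker.js" and the function sumOfSquares are hypothetical stand-ins for your own worker script and heavy computation; the Worker wiring is guarded so the pure function can also be used outside a browser:

```javascript
// sumOfSquares stands in for the CPU-heavy work we want off the main thread.
function sumOfSquares(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) {
    total += arr[i] * arr[i];
  }
  return total;
}

// Browser-only wiring (guarded so this file also loads in non-browser environments).
// Assumes a separate file "worker.js" containing the same sumOfSquares plus:
//   self.onmessage = (e) => self.postMessage(sumOfSquares(e.data));
if (typeof Worker !== "undefined") {
  const worker = new Worker("worker.js");
  worker.onmessage = function (e) {
    // runs back on the main thread when the worker posts its result
    console.log("result from worker:", e.data);
  };
  // postMessage copies the array to the worker via structured clone
  worker.postMessage([1, 2, 3, 4]);
}
```

The main thread stays free the entire time the worker is crunching; only the final postMessage() round-trip touches it.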

How to do a loop in Javascript that doesn't block the UI?

Create a variable and set it to the starting value for your counter.

Create a function that:

  1. Does whatever the body of the loop does
  2. Increments the counter
  3. Tests to see if the counter is still under the length and, if it is, calls the function again

At this point you will have functionality equivalent to what you have already.

To pause between each call of the function, and allow time for other functions to fire, replace the direct call to the function with a call to setTimeout and use the function as the first argument.

var counter = 0;
function iterator() {
    // Complex stuff for each element
    counter++;
    if (counter < array.length) {
        setTimeout(iterator, 5);
    }
}
iterator(); // kick off the first step

Asynchronously iterate through large array of objects using JS / JQuery

A small helper might help here:

function asyncForEach(arr, cb, done) {
    (function next(i) {
        if (i >= arr.length) {
            if (done) done();
            return;
        }
        cb(arr[i], i, arr);
        setTimeout(next, 0, i + 1); // a small trick to defer actions
    })(0);
}

Or, to optimize it, you can chunk the work and only yield every 1000 iterations or so:

function asyncForEach(arr, cb, done) {
    (function next(i) {
        if (i >= arr.length) {
            if (done) done();
            return;
        }
        let stop = i + 1000;
        setTimeout(next, 0, stop); // a small trick to defer actions
        while (i < arr.length && i < stop)
            cb(arr[i], i++, arr);
    })(0);
}

Which can be used like this in your case:

asyncForEach(myArray, function(el) {
    if (el.name === checkName) {
        $("#someElement").append(`${el.code} <br />`);
    }
});

However, probably the slowest part here is appending to the DOM. If you don't need "live progress", it's probably better to batch the DOM update into one single call:

let result = "";
asyncForEach(myArray, function(el) {
    if (el.name === checkName) {
        result += `${el.code} <br />`;
    }
}, function() {
    $("#someElement").append(result);
});

And then even the synchronous variant might be fast enough:

let result = "";
for (const el of myArray) {
    if (el.name === checkName)
        result += `${el.code} <br />`;
}
$("#someElement").append(result);

How to block the UI for a long running Javascript loop

JavaScript is designed so that it does not block the UI in any way, and this is one of its most important features for the browsers. The only exceptions are the popup message boxes (i.e. alert(), confirm(), and prompt()). Even where blocking the UI is possible, it is highly discouraged.

There are many alternative ways to prevent the user from firing actions that shouldn't be fired until something else happens. Examples:

  • Disable the action's button until your processing ends then enable it back.
  • Set a flag (e.g. var processing = true) and check that flag in the click event of the action's button so it displays a message (e.g. "still processing, please wait...") when flag is true and execute the action when flag is false. Remember not to use alert() for the message otherwise you'll block the processing. Use a popup div instead.
  • Set the event handler at the beginning of the processing to a function that displays a message (e.g. "still processing, please wait...") and at the end of the processing, set the event handler to the function that will do the action. Remember not to use alert() for the message otherwise you'll block the processing. Use a popup div instead.
  • Show a modal popup div at the beginning of the processing with a message (e.g. "still processing, please wait..."), or progress bar, or some animation. The modal popup prevents the user from interacting with the page so they cannot click anything. For that to work, the modal popup must not have a close button or any other way to close it. At the end of processing, close the modal popup so the user can now continue.

Important Note: You mentioned in your comment to the other answer that the overlay (which is similar to the modal popup in my last point) is not displayed until the end of processing. That's because your processing is occupying the processor and preventing it from handling the UI thread. What you can do is delay your processing: first display the modal popup (or overlay), then use setTimeout() to start processing 1 second later (maybe 500 milliseconds or even less is enough). This gives the processor enough time to handle the UI thread before it starts your long processing.

Edit Here is an example of the last method:

function start() {
    disableUI();
    setTimeout(function() {
        process();
    }, 1000);
}

function process() {
    var s = (new Date()).getTime();
    var x = {};
    for (var i = 0; i < 99999; i++) {
        x["x" + i] = i * i + i;
    }
    var e = new Date().getTime();
    $("#results").text("Execution time: " + (e - s));
    enableUI();
}

function disableUI() {
    $("#uiOverlay").dialog({
        modal: true,
        closeOnEscape: false,
        dialogClass: "dialog-no-close",
    });
}

function enableUI() {
    $("#uiOverlay").dialog("close");
}

.dialog-no-close .ui-dialog-titlebar {
    display: none;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<link rel="stylesheet" href="https://ajax.googleapis.com/ajax/libs/jqueryui/1.11.4/themes/smoothness/jquery-ui.css">
<script src="https://ajax.googleapis.com/ajax/libs/jqueryui/1.11.4/jquery-ui.min.js"></script>

<button type="button" onclick="start()">Start</button>
<div id="results"></div>
<div id="uiOverlay" style="display: none;">Processing... Please wait...</div>

How to wait in loop without blocking UI in C#

From what I understand of your question, maybe you want:

private async void MassInvoiceExecuted()
{
    foreach (Invoice invoice in Invoices)
    {
        DoStuff(invoice);
        RefreshExecuted();
        await Task.Delay(8000);
    }
}

But I really don't know whether you have any reason to update the UI only at the end of all processing.

How to iterate large arrays in NodeJS

Here are some considerations for manipulating large data sets in nodejs which come out of my experiences dealing with data sets in the billions and single arrays of 100,000,000 items.

1. Minimize Garbage Collection Work. To the best of your ability, avoid creating temporary objects in the main loop that is processing the large data set. This includes locally scoped variables (where a new variable is created through each invocation of the loop) and includes any functions/methods that return objects. If your code creates 10 objects each time through the loop and the array has 1.2 million items in it, that's 10.2 million objects the GC has to deal with. Besides all the CPU that it takes the GC to process those, it's also a lot of peak memory usage since the GC lets things accumulate until memory gets scarce or until it finds some idle time.
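To illustrate the point, here is a hedged before/after sketch (the doubling logic is just a hypothetical stand-in for real per-item work). Both functions compute the same result, but the first one allocates a throwaway object on every pass while the second allocates nothing inside the loop:

```javascript
// Wasteful version: creates a fresh temporary object on every iteration,
// so a 1.2 million item array produces 1.2 million objects for the GC.
function totalsWithGarbage(records) {
  let sum = 0;
  for (let i = 0; i < records.length; i++) {
    const tmp = { value: records[i] * 2 }; // new object every pass
    sum += tmp.value;
  }
  return sum;
}

// Leaner version: same arithmetic, zero per-iteration allocations.
function totalsLean(records) {
  let sum = 0;
  for (let i = 0; i < records.length; i++) {
    sum += records[i] * 2;
  }
  return sum;
}
```

The transformation is mechanical, but on arrays with millions of items the difference in GC pressure and peak memory can be substantial.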

2. Measure the time it takes to process your worst-case array and improve it as much as you can. Work on the performance of the loop processing with specific performance tests so you know exactly what the maximum array processing time is.

3. Decide what latency delay is acceptable in your server. This really depends upon the application and how often this delay will be encountered so you will have to figure out what will work for you. An occasional 100ms delay is probably no big deal for lots of applications, but if that occurs frequently it becomes a problem or if you have some sort of responsiveness-critical aspect to your server (such as gaming), then 100ms would be way too long.

4. Move processing to Worker Threads. If your best performance is worse than your acceptable latency, then you will probably want to move the processing to nodejs Worker Threads. It probably makes sense to create a pool of threads (one per actual CPU core in your server) and then create a work queue that is serviced in FIFO order. When a large array job needs to be done, you put it in the queue and return a promise. If a worker thread is available, the job is immediately sent off to the Worker Thread. If all worker threads are busy, it sits in the queue until a thread finishes and is free. At that point, the oldest item in the queue (FIFO order) is sent off to the Worker Thread. When a worker thread finishes the job, the result is communicated back, a promise is resolved, and the code waiting for the result gets the resolved promise notification.
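The queue portion of that design can be sketched independently of the actual thread plumbing. In this minimal sketch (names like JobQueue and runJob are my own, not a real library API), the dispatch mechanism is injected as an async function, so the same FIFO logic works whether runJob posts to a worker_threads Worker, a child process, or a stub:

```javascript
// Minimal promise-based FIFO job queue in the spirit described above.
// runJob is an async function (job) => result; poolSize is the number of
// concurrent "slots", typically one per worker thread in the pool.
class JobQueue {
  constructor(runJob, poolSize) {
    this.runJob = runJob;
    this.poolSize = poolSize;
    this.active = 0;       // jobs currently running
    this.pending = [];     // FIFO: oldest job first
  }
  // returns a promise that resolves with the job's result
  enqueue(job) {
    return new Promise((resolve, reject) => {
      this.pending.push({ job, resolve, reject });
      this._drain();
    });
  }
  // start queued jobs while free slots remain
  _drain() {
    while (this.active < this.poolSize && this.pending.length) {
      const { job, resolve, reject } = this.pending.shift();
      this.active++;
      this.runJob(job)
        .then(resolve, reject)
        .finally(() => { this.active--; this._drain(); });
    }
  }
}

// Usage with a stubbed-out "worker": two jobs run at once, the rest wait.
const queue = new JobQueue(async (n) => n * n, 2);
queue.enqueue(7).then((result) => console.log(result)); // 49
```

In a real pool, runJob would post the job (ideally a SharedArrayBuffer reference, per the next points) to an idle Worker Thread and resolve when its result message comes back.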

5. Use SharedArrayBuffer if possible. You don't want to be copying large amounts of data back and forth between Worker Threads as that will eat up CPU and cause lots of work for the CPU. A key technique to processing large amounts of data in Worker Threads is to put that data in a SharedArrayBuffer that can be directly passed to the Worker Thread as a reference without any copying. This is hugely more efficient for CPU, GC and peak memory use.

6. Understand concurrency consequences of using a SharedArrayBuffer. A SharedArrayBuffer being operated on by Worker Threads is one place in node.js where you can be exposed to multi-thread race conditions. So, you need a design model for how you're going to do it. The simplest model is to set things up so that only one thread EVER has access to the same SharedArrayBuffer. You create it in the main thread and then, when you pass it off to the Worker Thread for processing, you pass the SharedArrayBuffer reference to the Worker Thread and you completely forget about it in the main thread (store it nowhere else). This means that the main thread essentially passes temporary ownership of it to the Worker Thread. When the Worker Thread finishes, it passes ownership back (returning the SharedArrayBuffer reference in the result message it sends). This model is simple because you can't accidentally access it from two threads if you make sure that no more than one thread EVER has a reference to it at the same time.

7. Use Atomics to protect shared data. If you can't use a simple access model for the SharedArrayBuffer as discussed above, then you may need to use Atomics to protect the integrity of data.


Some other design options to consider:

1. Break up the data and process it in chunks. You can write the processing in chunks such that you program a short delay between chunks, so the main thread has an opportunity to process messages between chunks. This is how we were forced to do things before we had access to threads. See Best way to iterate over an array without blocking the UI for an example. How practical this is, or how much of a rewrite it would cause, really depends upon the problem and the data. On a server, I would probably tend to use threads these days rather than try to break the processing into tiny chunks.

2. Consider if a database can help you. Databases are for managing large sets of data and they typically do it in a separate process (which helps with the server responsiveness issue).

3. Worker List class. Here's a WorkerList class that I used in order to queue up data to use a worker pool. This is part of a larger crypto test app that used multiple threads to offload large amounts of crypto work. The whole repository is here on Github.

4. Work on the data incrementally as it arrives. You mentioned "prepare them for database insertion". Depending upon the specific problem, you may not have to accumulate large amounts of data at all. Maybe you can process the data more incrementally as it arrives and, by doing it as you go, you never accumulate data to the point where you have 1.2 million item arrays, and never end up with a giant job that interferes with your main server work.


