How to Run for Loops in Parallel in JavaScript (Node.js)

Javascript Running two loops in parallel

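Without workers, only async functions can interleave. Each await pauses the function and lets other queued code run, so the two loops below take turns on a single thread rather than running truly in parallel.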

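async function loop1() {
  for (let i = 10000; i < 20000; i++) {
    // console.log() returns undefined; awaiting it only pauses this loop briefly
    await console.log(i);
  }
}

async function loop2() {
  for (let i = 0; i < 10000; i++) {
    await console.log(i);
  }
}

// Neither call is awaited, so both loops start and their output alternates
loop1();
loop2();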

How does NodeJS handle parallel execution for same functions in a loop?

Node.js runs your JavaScript in a single thread, so it does not apply more than one CPU to your code unless you specifically design it to put CPU-intensive tasks in Worker threads, or you create separate processes with the cluster module or the child_process module and farm work out to them. In a plain node.js program, a CPU-intensive operation (like your long sorting loop) will hog the CPU and block the event loop from processing other requests. It will not involve other CPUs in the sorting, and no other CPU will run your JavaScript.
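As a rough illustration of moving a CPU-heavy sort off the main thread, here is a minimal sketch using the built-in worker_threads module. The names sortInWorker and workerSource are made up for this example, and the worker code is passed as a string (eval: true) only so that everything fits in one file. A real project would more likely keep the worker in its own file.

```javascript
const { Worker } = require('worker_threads');

// Worker code as a string so the sketch is self-contained (assumption:
// in a real project this would live in a separate file).
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // The CPU-heavy part runs here, off the main thread.
  const sorted = workerData.slice().sort((a, b) => a - b);
  parentPort.postMessage(sorted);
`;

// Hypothetical helper: sorts an array of numbers in a worker thread.
function sortInWorker(numbers) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: numbers });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}
```

While the worker sorts, the main thread's event loop stays free to handle other requests. Note that workerData is copied to the worker, so for very large arrays the copy itself has a cost.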

When you run an asynchronous operation, there will be native code behind that operation and that native code may or may not use additional threads or processes. For example, file I/O uses a thread pool. Networking uses native OS asynchronous support (no threads). spawn() or exec() in child_process start new processes.

If you show us the actual code for your specific situation, we can answer more specifically about how that particular operation works.

How does Node handle such parallel executions?

It depends upon what the operation is.

Are both the CPUs being used perfectly?

Probably not, but we'd need to see your specific code.

Do I need to manually fork process for each function in the loop?

It depends upon the specific situation. To apply multiple CPUs to your actual JavaScript (not asynchronous operations), you would need multiple processes running your JavaScript, or the newer Worker threads API. If the parallelism is all in asynchronous operations, then the event-driven nature of node.js is particularly good at managing many asynchronous operations at once, and you may not even benefit from getting multiple CPUs involved, because most of the time node.js is just waiting for I/O to complete and many, many requests can already be in flight at the same time very efficiently.

To actually get multiple CPUs applied to running your JavaScript itself, node.js has the cluster module, which is pretty much purpose-built for that. You can fire up a cluster process for each actual CPU core in your computer. Or you can use the Worker threads API.
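To make the "one unit of work per CPU core" idea concrete for a plain loop, here is a hedged sketch that splits a summing loop across Worker threads (rather than cluster processes, which are more suited to servers). The names parallelSum and sumChunk are hypothetical, and the worker code is inlined as a string purely to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');
const os = require('os');

// Each worker sums the integers in [start, end).
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = workerData.start; i < workerData.end; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Hypothetical helper: runs one chunk of the loop in its own worker.
function sumChunk(start, end) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { start, end } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Splits the range 0..n-1 into roughly one chunk per CPU core.
async function parallelSum(n, workerCount = Math.max(1, os.cpus().length)) {
  const chunkSize = Math.ceil(n / workerCount);
  const jobs = [];
  for (let start = 0; start < n; start += chunkSize) {
    jobs.push(sumChunk(start, Math.min(start + chunkSize, n)));
  }
  const partialSums = await Promise.all(jobs);
  return partialSums.reduce((total, part) => total + part, 0);
}
```

For example, parallelSum(1000) resolves to 499500. Starting a worker has a real cost, so splitting only pays off when each chunk does a substantial amount of work.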

Also see these answers that discuss how to address CPU intensive code in node.js:

How to process huge array of objects in nodejs

How to apply clustering/spawing child process techniques for Node.js application having bouth IO bound and CPU bound tasks?

node.js socket.io server long latency

Is it possible somehow do multithreading in NodeJS?

How cpu intensive is too much for node.js (worried about blocking event loop)

node.js for loop parallel array processing with only one callback

I would personally prefer the async library, as these sorts of workflows are easy to handle with it.

const async = require('async'); // npm install async

// Sentinel passed through the "error" channel to stop iterating on the first match.
var FOUND = {
  code: 'Custom',
  item: null,
  pattern: null
};

function matchingPattern(patterns, object, onMatch, onNoMatch) {
  async.each(patterns, function (pattern, callback) {
    // check pattern as per your business logic.
    // assuming matchPattern is async
    matchPattern(pattern, object, function (match) {
      if (match) {
        FOUND.item = object;
        FOUND.pattern = pattern;
        return callback(FOUND);
      } else {
        return callback();
      }
    });
  },
  function (error) {
    if (error) {
      if (error.code === 'Custom') {
        // it's not a real error. We have used it as an error only to stop early.
        return onMatch();
      }
      // handle the real error here, and do not fall through to onNoMatch().
      return;
    }
    // all items done and we have not found any pattern matching.
    return onNoMatch();
  }); // end of async.each()
} // end of matchingPattern()

Most optimal way to execute timed functions in parallel using node?

Nodejs runs your Javascript in a single thread unless you explicitly create a WorkerThread and run some code in that. True parallel execution, where both jobs run CPU-bound code at the same time, is only possible if you move each task out of the main thread into a WorkerThread or a child process.

Let me repeat, true parallel execution requires more than one thread or process in nodejs and nodejs does not do that by default so you will have to create a WorkerThread or child_process.

So, if you have code that takes more than a few ms to do its work and you want it to run at a fairly precise time, then you can't count on the main Javascript thread to do that because it might be busy at that precise time. Timers in Javascript will run your code no earlier than the scheduled time, and when that scheduled time comes around, the event loop is ready to run them, but they won't actually run until whatever was running before finishes and returns control back to the event loop so the event loop can run the code attached to your timer.
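A small sketch can make the timer behavior visible. The helper name measureTimerDelay is made up for this example. It schedules a 10 ms timer, then deliberately blocks the main thread with a busy loop and reports how late the timer actually fired.

```javascript
// Hypothetical helper: resolves with the number of milliseconds between
// scheduling a 10 ms timer and the moment its callback actually ran.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 10);

    // Simulate CPU-heavy work: the event loop cannot run the timer
    // callback until this loop finishes and control is returned.
    const blockUntil = Date.now() + blockMs;
    while (Date.now() < blockUntil) {
      // busy wait
    }
  });
}
```

Calling measureTimerDelay(100) resolves with a value of at least 100, even though the timer was set for 10 ms, because the callback has to wait for the busy loop to finish.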

So, if you're mostly doing I/O kind of work (reading/writing files or network), then your actual Javascript execution time is probably only milliseconds and nodejs can be very, very responsive and run your timers pretty close to "on time". But, if you have computationally expensive things that keep the CPU busy for much longer, then you can't count on your timers to run "on time" if you run that CPU-heavy stuff in the main thread.

What you can do, is start up a WorkerThread, set the timer in the WorkerThread and run your code in the worker thread. As long as you don't ask that WorkerThread to run anything else, it should be ready to run that timer pretty much "on time".
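Here is one way that could look, as a sketch only. The helper name runTimerInWorker is hypothetical, and the worker code is inlined as a string (eval: true) just to keep the example in a single file. The timer lives inside the worker, and the worker reports how long it actually waited.

```javascript
const { Worker } = require('worker_threads');

// The timer is set and fired inside the worker, independent of how busy
// the main thread is.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const scheduledAt = Date.now();
  setTimeout(() => {
    // ... the time-sensitive work would go here ...
    parentPort.postMessage(Date.now() - scheduledAt);
  }, workerData.delayMs);
`;

// Hypothetical helper: resolves with the elapsed milliseconds measured
// inside the worker between scheduling the timer and running it.
function runTimerInWorker(delayMs) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { delayMs } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

The main thread only receives the message once it is free again, so anything that must happen "on time" should be done inside the worker's timer callback, not in the main thread's message handler.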

Now WorkerThreads do share some resources with the main thread so they aren't 100% independent (though they are close to independent). If you want 100% independence, then you can start a nodejs child process that runs a node script, sets its own timers and runs its own work in that other process.
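A minimal sketch of the child process variant follows. The helper name runInChildProcess is hypothetical. To stay self-contained it passes the script inline with node's -e flag via execFile. With a real script file you would more typically use child_process.fork() and exchange messages with the child.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: starts a separate node process that sets its own
// timer, does its work when the timer fires, and prints a result.
function runInChildProcess(delayMs) {
  // Inline script for illustration only; normally this is a .js file.
  const script = `setTimeout(() => console.log('done'), ${Number(delayMs)});`;
  return new Promise((resolve, reject) => {
    // process.execPath is the path of the node binary currently running.
    execFile(process.execPath, ['-e', script], (error, stdout) => {
      if (error) return reject(error);
      resolve(stdout.trim());
    });
  });
}
```

For example, runInChildProcess(10) resolves to the string 'done' once the child has exited. The child has its own event loop and memory, so nothing the parent does can delay its timers, at the price of a slower startup than a worker thread.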


All that said, the single threaded model works very, very well at reasonably high scale for code that is predominantly I/O code because nodejs uses non-blocking I/O so while it's waiting to read or write from file or network, the main thread is free and available to run other things. So, it will often give the appearance of running things in parallel because progress is being made on multiple fronts. The I/O itself inside the nodejs library is either natively non-blocking (network I/O) or is happening in an OS-native thread (file I/O) and the programming interface to Javascript is callback or promise based so it is also non-blocking.
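To illustrate how waits overlap on a single thread, here is a small sketch where fakeIo is a made-up stand-in for a real non-blocking operation such as a network request (it just waits on a timer). Three 100 ms operations started together finish in roughly 100 ms total, not 300 ms.

```javascript
// Promise-based delay used to simulate waiting on I/O.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for a non-blocking I/O call.
async function fakeIo(name, ms) {
  await sleep(ms); // the main thread is free while this waits
  return name;
}

// Starts all three operations first, then waits for all of them together.
async function runConcurrently() {
  const startedAt = Date.now();
  const results = await Promise.all([
    fakeIo('a', 100),
    fakeIo('b', 100),
    fakeIo('c', 100),
  ]);
  return { results, elapsedMs: Date.now() - startedAt };
}
```

No extra threads are created by this code. The speedup comes purely from not waiting for one operation to finish before starting the next.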

I mention all this because you don't say what your two operations that you want to run in parallel are (including your actual code allows us to write more complete answers). If they are I/O or even some crypto, then they may already be non-blocking and you may achieve desired parallelism without having to use additional threads or processes.
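As an example of the crypto case, the asynchronous form of crypto.pbkdf2 does its work on the libuv thread pool, so several calls can make progress at once without any Worker threads in your own code. The helper name hashAll, the salt and the iteration count below are placeholders chosen for this sketch, not recommendations.

```javascript
const crypto = require('crypto');
const { promisify } = require('util');

// Promise-returning version of the callback-based crypto.pbkdf2.
const pbkdf2 = promisify(crypto.pbkdf2);

// Hypothetical helper: derives a 32-byte key for every password. The calls
// are all started before any is awaited, so they can run concurrently.
async function hashAll(passwords) {
  const salt = 'example-salt'; // placeholder; use a random per-user salt in practice
  return Promise.all(
    passwords.map((password) => pbkdf2(password, salt, 10000, 32, 'sha256'))
  );
}
```

Each resolved value is a Buffer of 32 bytes. The synchronous variant, crypto.pbkdf2Sync, would instead run on the main thread and block the event loop for the duration of each call.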


