Compile Programs on Multicore or Distributed System

On distributed-memory systems, you can use distcc to farm out compile jobs to other machines. This takes a little bit of setup, but it can really speed up your build if you happen to have some extra machines around.
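For example, a minimal distcc setup looks like this (the hostnames box1 and box2 are placeholders for your own machines): list the compile hosts in the DISTCC_HOSTS environment variable, then tell make to invoke the compiler through distcc:

$ export DISTCC_HOSTS='localhost box1 box2'
$ make -j8 CC="distcc gcc"

The -j count can usefully exceed your local core count here, since the extra jobs run on the remote machines (set CXX="distcc g++" the same way for C++).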

On shared-memory multicore systems, you can just use make -j, which spawns parallel build jobs according to the dependency graph in your makefiles. You can run it like this:

$ make -j

which will impose no limit on the number of jobs spawned, or you can run with an integer parameter:

$ make -j8

which caps the number of concurrent build jobs; here the limit is 8. Usually you want this to be close to the number of cores on your system.
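On Linux with GNU coreutils, you can match the job count to the core count automatically (nproc prints the number of available cores):

$ make -j"$(nproc)"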

Compiling with g++ using multiple cores

You can do this with make: with GNU make it is the -j flag (this will also help on a uniprocessor machine, since a job waiting on disk I/O can overlap with one that is compiling).

For example, if you want 4 parallel jobs from make:

make -j 4

You can also tell gcc to use pipes:

gcc -pipe

This makes gcc feed one compilation stage into the next through pipes rather than temporary files, so the stages overlap, which also helps keep the cores busy.
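For example, you can combine -pipe with a parallel make (CXXFLAGS here is the conventional make variable for g++ flags; your build may spell it differently):

$ make -j4 CXXFLAGS="-pipe -O2"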

If you have additional machines available too, you might check out distcc, which will farm compiles out to those as well.

How to compile C# for multiple processor machines? (With VS 2010 or csc.exe)

Since it is a cluster, you have to rely on some form of message-passing parallelism; no compiler will transform your code automatically. At least good old MPI is supported: http://osl.iu.edu/research/mpi.net/
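As a sketch of what that rank/size message-passing model looks like (shown here with the classic MPI C API rather than the C# bindings, to keep to one language on this page; MPI.NET wraps the same concepts, with a world communicator exposing a rank and size per process):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);                 /* start the MPI runtime */

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total process count */

        printf("Hello from rank %d of %d\n", rank, size);

        MPI_Finalize();                         /* shut down the MPI runtime */
        return 0;
    }

You would typically build this with mpicc and launch it across the cluster with something like mpiexec -n 4 ./hello.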

What is the difference between multicore programming in Erlang and other languages?

No, it's not impossible, but Erlang does make it much easier. The key is not sharing state among processes. Erlang achieves this by virtue of being a functional language: a function should have no side effects, nor should it access any mutable state (other than the arguments passed on the stack). With these properties, the computation of any function in the system can be moved to another processor with a separate memory space, and Erlang will do this for you. Erlang only needs to replicate the function's arguments and results between the memory spaces (note: this would not be suitable for every kind of application; a function that needs to operate on a very large body of input state might present performance issues when replicating that state).
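A rough C++ analogue of that copy-in, copy-out model (a sketch of the idea, not of how Erlang is implemented): a pure function whose argument is copied into the task and whose result comes back through a future, so the two instruction streams never share state:

    #include <future>
    #include <iostream>
    #include <vector>

    // A pure function: its result depends only on its argument,
    // and it touches no shared state.
    long sum(std::vector<long> xs)   // taken by value, so the data is copied in
    {
        long total = 0;
        for (long x : xs) total += x;
        return total;
    }

    int main()
    {
        std::vector<long> data(1000, 2);

        // std::async copies the argument into the task; the result is
        // "replicated" back through the future. Nothing is shared, so
        // the work can safely run on another core.
        std::future<long> f = std::async(std::launch::async, sum, data);

        std::cout << f.get() << "\n";   // prints 2000
        return 0;
    }

Note the same caveat the answer raises: if data were very large, copying it would dominate the cost.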

A naive use of threads in a C++ application might have different processors (in a multi-core system) trying to access the same shared memory at the same time. The system then has to do a lot of work to make sure the local caches associated with each core remain consistent. This is where you can suffer huge performance hits. We have an application at work that degrades in performance when you have more than a couple of cores, for this very reason.

In fact, I'd go so far as to say you'd be better off designing your applications to use threads only where you need asynchronous I/O; where you need processes doing real work and not blocked on I/O, use full processes. By using full processes, you guarantee that each process has its own memory space and no two threads will use the same memory at the same time (of course, you then need a good means of communicating between those processes). With a system like this, and a discipline around managing state and distributing processing, you could achieve results similar to what Erlang provides, but you have to build a lot of infrastructure that Erlang already gives you.
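Here is a concrete sketch of that cache effect, often called false sharing (the 64 in alignas(64) assumes a 64-byte cache line, which is common but hardware-dependent):

    #include <atomic>
    #include <thread>

    struct Shared {
        std::atomic<long> a{0};   // these two counters usually land in the
        std::atomic<long> b{0};   // same cache line, so the cores fight over it
    };

    struct Padded {
        alignas(64) std::atomic<long> a{0};   // each counter gets its own
        alignas(64) std::atomic<long> b{0};   // cache line
    };

    template <typename T>
    void bump(T &c)
    {
        // Two threads, two logically independent counters.
        std::thread t1([&c] { for (int i = 0; i < 10000000; ++i) c.a++; });
        std::thread t2([&c] { for (int i = 0; i < 10000000; ++i) c.b++; });
        t1.join();
        t2.join();
    }

    int main()
    {
        Shared s; bump(s);   // cores keep invalidating each other's cache line
        Padded p; bump(p);   // no shared line, so this runs markedly faster
        return 0;
    }

Even though the threads never touch the same variable, the Shared version pays the full cache-coherence cost the answer describes.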

What's the difference between parallel and multicore programming?

Multi-core is a kind of parallel programming. In particular, it is a kind of MIMD setup where the processing units aren't distributed, but rather share a common memory area, and can even share data like a MISD setup if need be. I believe it is even distinct from multi-processing, in that a multi-core setup can share some levels of cache, and thus cooperate more efficiently than CPUs in different sockets.

General parallel programming would also include SIMD systems (like your GPU), and distributed systems.
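To make the taxonomy concrete, here is a multicore (MIMD, shared-memory) sketch in C++: each thread runs its own instruction stream over a different slice of one shared array, and the inner accumulation loop is exactly the part a SIMD unit or GPU would instead process as one instruction over many elements:

    #include <numeric>
    #include <thread>
    #include <vector>

    int main()
    {
        std::vector<float> data(1 << 20, 1.0f);
        const int nthreads = 4;
        std::vector<double> partial(nthreads, 0.0);
        std::vector<std::thread> pool;

        // MIMD: independent instruction streams over shared memory.
        for (int t = 0; t < nthreads; ++t) {
            pool.emplace_back([&, t] {
                std::size_t chunk = data.size() / nthreads;
                auto begin = data.begin() + t * chunk;
                auto end = (t == nthreads - 1) ? data.end() : begin + chunk;
                // SIMD would apply inside this loop: one instruction,
                // many data elements.
                partial[t] = std::accumulate(begin, end, 0.0);
            });
        }
        for (auto &th : pool) th.join();

        double total = std::accumulate(partial.begin(), partial.end(), 0.0);
        return total == data.size() ? 0 : 1;   // sanity check
    }

In a distributed-memory version of the same computation, each node would hold its own slice and the partial sums would travel over the network (e.g. via MPI) instead of through shared memory.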


