R Package That Automatically Uses Several Cores

R system() process always uses same CPU, not multi-threaded/multi-core

Following on @agstudy's comment, you should get parallel to work first. On my system, this uses multiple cores:

# Launch a CPU-bound shell pipeline; wait = FALSE returns immediately,
# so the pipelines forked by mclapply run concurrently on separate cores.
f <- function(x) system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null",
                        ignore.stdout = TRUE, ignore.stderr = TRUE, wait = FALSE)
library(parallel)
mclapply(1:4, f, mc.cores = 4)

I would have written this as a comment, but it is too long. I know you said you have tried the parallel package, but I wanted to confirm that you are using it correctly. If it doesn't work, can you confirm that mclapply works correctly with a non-system call, like this one?

a <- mclapply(rep(1e8, 4), rnorm, mc.cores = 4)

Reading your comments, I suspect that your Linux pthreads package is out of date and broken. On my system, I am using libpthread-2.15.so (not 2.13). If you're on Ubuntu, you can grab the latest with apt-get install libpthread-stubs0.

Also, note that you should be using parallel, not multicore. If you look at the docs for parallel, you'll see that it has incorporated the work from multicore.


Reading your next set of comments, I must insist that it is parallel and not multicore that has been included in R since 2.14. You can read about this on the CRAN Task View.

Getting parallel to work is crucial. I previously told you that you could compile it directly from source, but this is not correct. I guess the only way to recompile it would be to compile R from source.

Can you also verify that your CPU affinity is set correctly, and check whether R detects the number of cores? Just run:

library(parallel)
mcaffinity()
# Should be c(1,2,3,4) for you.
detectCores()
# Should be 4 for you.

Should I use every core when doing parallel processing in R?

Any time you do parallel processing there is some overhead (which can be nontrivial, especially with locking data structures and blocking calls). For small batch jobs, running on one or two cores can be much faster because you are not paying that overhead.

I don't know the size of your job, but you should probably run some scaling experiments where you time your job on 1 processor, 2 processors, 4 processors, and so on, until you hit the max core count for your system (typically, you double the processor count at each step). EDIT: It looks like you're only using 4 cores, so time with 1, 2, and 4.
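
For example, here is a minimal sketch of such a scaling experiment; the job function below is a hypothetical stand-in for your real workload:

library(parallel)

# Hypothetical stand-in for your real workload: 8 independent tasks.
job <- function(i) sum(rnorm(1e6))

# Time the same fixed workload at each core count.
for (cores in c(1, 2, 4)) {
  elapsed <- system.time(mclapply(1:8, job, mc.cores = cores))["elapsed"]
  cat(cores, "core(s):", round(elapsed, 2), "seconds\n")
}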

Run ~32 timing trials for each core count and compute a confidence interval; then you can say with some certainty whether running on all cores is right for you. If your job takes a long time, reduce the number of trials, down to 5 or so, but remember that more trials will give you a higher degree of confidence.
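
A sketch of collecting those trials and a confidence interval, building on the job function and core-count loop above (the trial count and workload are placeholders):

# Collect n_trials elapsed times for one core count (job as defined above).
time_trials <- function(cores, n_trials = 32) {
  replicate(n_trials,
            system.time(mclapply(1:8, job, mc.cores = cores))["elapsed"])
}

# Confidence interval for the mean elapsed time (default 95%).
ci <- function(x, alpha = 0.05) {
  m <- mean(x)
  half <- qt(1 - alpha / 2, df = length(x) - 1) * sd(x) / sqrt(length(x))
  c(lower = m - half, mean = m, upper = m + half)
}

ci(time_trials(2))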

To elaborate:

Student's t-test:

The Student's t-test essentially says: "you calculated an average time for this core count, but that is not the true average. We could only obtain the true average with an infinite number of data points. The true average actually lies in some interval around your computed average."

The t-test for significance then essentially compares the intervals around the true averages for two sets of measurements and says whether they are significantly different or not. So one average time may be less than another, but because the standard deviation is sufficiently high, we cannot say for certain that it is actually less; the true averages may be identical.

So, to compute this test for significance:

  • Run your timing experiments.
  • For each core count, compute your mean and standard deviation. The standard deviation should be the population standard deviation: the square root of the population variance, which is (1/N) * sum over all N data points of (datapoint_i - mean)^2.

Now you will have a mean and a standard deviation for each core count: (m_1, s_1), (m_2, s_2), etc.

  • For every pair of core counts, compute a t-value: t = (m_1 - m_2) / (s_1 / sqrt(N)).

The example t-value above tests whether the mean timing result for a core count of 1 is significantly different from the mean timing result for a core count of 2. You could test the other way around by computing:

t = (m_2 - m_1) / (s_2 / sqrt(N))
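
A small sketch of these steps in R, following the formulas exactly as written above (trials_1 and trials_2 stand in for your timing results at two core counts; here they are simulated):

# Population standard deviation: sqrt((1/N) * sum((x - mean)^2)).
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

# t-value as defined above: t = (m_1 - m_2) / (s_1 / sqrt(N)).
t_value <- function(x1, x2) {
  (mean(x1) - mean(x2)) / (pop_sd(x1) / sqrt(length(x1)))
}

# Simulated timing results for two core counts, 32 trials each.
set.seed(1)
trials_1 <- rnorm(32, mean = 10, sd = 0.5)  # e.g. 1 core
trials_2 <- rnorm(32, mean = 6, sd = 0.5)   # e.g. 4 cores
t_value(trials_1, trials_2)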

After you have computed these t-values, you can tell whether they are significant by looking them up in a table of critical values. Before you do, you need to know about two more things:

Degrees of Freedom

This is related to the number of data points you have. The more data points you have, the smaller the interval around the mean probably is. Degrees of freedom loosely measures your computed mean's freedom to vary, and it equals N - 1 (the v in critical value tables).

Alpha

Alpha is a probability threshold. In the Gaussian (normal, bell-curved) distribution, alpha cuts off the bell curve on both the left and the right. Any probability that falls between the cutoffs is an insignificant result. A lower alpha makes it harder to get a significant result: alpha = 0.01 means only the most extreme 1% of outcomes are significant, while alpha = 0.05 means the most extreme 5%. Most people use alpha = 0.05.

In a critical value table, 1 - alpha determines the column you go down when looking for a critical value (so alpha = 0.05 gives 0.95, or a 95% confidence level), and v, your degrees of freedom, gives the row.

If the absolute value of your computed t is greater than the critical value, then your result is significant. If it is less than the critical value, you do not have statistical significance.
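
Rather than reading a printed table, you can pull the critical value straight from R's qt() function. A sketch, continuing from the t-value code above:

alpha <- 0.05                      # significance threshold
v <- length(trials_1) - 1          # degrees of freedom (N - 1)
crit <- qt(1 - alpha / 2, df = v)  # two-sided critical value

t <- t_value(trials_1, trials_2)
abs(t) > crit                      # TRUE means the difference is significant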

Edit: The Student's t-test assumes that the variances and standard deviations are the same between the two means being compared. That is, it assumes the distribution of data points around the true mean is equal for both. If you don't want to make this assumption, then you're looking for Welch's t-test, which is slightly different. The Wikipedia page has a good formula for computing t-values for this test.
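
In practice you do not have to compute any of this by hand: R's built-in t.test() performs Welch's test by default, and var.equal = TRUE gives the classic Student's test. With the simulated trials from above:

t.test(trials_1, trials_2)                    # Welch's t-test (the default)
t.test(trials_1, trials_2, var.equal = TRUE)  # classic Student's t-test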

Microsoft R Open (Revolution R) Using Two CPUs, Each with Multiple Cores

If you are using the Microsoft RevoUtilsMath package, you will get multi-threaded math operations "for free" on a multi-processor, multicore machine. There are also CRAN packages available that support multicore execution.

If you use Microsoft R Server, the RevoScaleR functions are parallelized.


