Why is proc upload so slow?
FTP, if available from the source server, is much faster than proc upload or proc copy. These both operate on a record-by-record basis and can be CPU-bound over fast network connections, especially for very wide datasets. A single FTP transfer will attempt to use all available bandwidth, with negligible CPU cost.
This assumes that the destination server can use the unmodified transferred file - if not, the time required to make it usable might negate the increased transfer speed of FTP.
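As a rough illustration of what a single binary FTP transfer looks like, here is a minimal sketch using Python's standard `ftplib` rather than SAS. The function name `upload_binary`, the 1 MiB block size, and the host, credentials, and file names in the comments are assumptions for illustration only; the transfer direction (client to server) is also assumed.

```python
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name, blocksize=1 << 20):
    """Send one file as a single binary stream over an open FTP connection.

    `ftp` is expected to behave like an ftplib.FTP object that is already
    connected and logged in. The file is streamed in large blocks, with no
    per-record processing on the client side.
    """
    with open(local_path, "rb") as fh:
        ftp.storbinary(f"STOR {remote_name}", fh, blocksize=blocksize)


# Hypothetical usage (host, credentials and file names are placeholders):
#
#     with FTP("ftp.example.com") as ftp:
#         ftp.login("user", "password")
#         upload_binary(ftp, "mydata.sas7bdat", "mydata.sas7bdat")
```

The file arrives byte for byte as it was on the source, which is why the caveat above matters: the destination must be able to read that file format directly.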
Multiprocessing in Python not faster than doing it sequentially
Multiprocessing is faster if you have multiple cores and parallelize the work properly. Your example creates 3000 processes, which causes an enormous amount of context switching between them. Instead, use Pool to schedule the jobs across a fixed number of worker processes:
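from multiprocessing import Pool

# getRandomSample() and myArray are the ones defined in the question.

def bubbleSort(alist):
    # Sort a random sample of 100 elements, not the full list
    sample_list = getRandomSample(alist, 100)
    for passnum in range(len(sample_list) - 1, 0, -1):
        for i in range(passnum):
            if sample_list[i] > sample_list[i + 1]:
                # Swap neighbours within the sample
                sample_list[i], sample_list[i + 1] = sample_list[i + 1], sample_list[i]
    return sample_list

if __name__ == '__main__':
    # Four workers share the 3000 jobs instead of one process per job
    pool = Pool(processes=4)
    for x in pool.imap_unordered(bubbleSort, (myArray for x in range(3000))):
        pass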
I removed all the output and ran some tests on my 4-core machine. As expected, the code above was about 4 times faster than your sequential example.
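To reproduce that kind of comparison, here is a self-contained sketch that times the sequential loop against a pool of four workers. Several parts are assumptions, not taken from the question: `random.sample` stands in for `getRandomSample`, a list of 1000 random floats stands in for `myArray`, and the `chunksize` of 50 is an arbitrary choice to reduce inter-process overhead. Actual speedup depends on the machine and its core count.

```python
import random
import time
from multiprocessing import Pool

SAMPLE_SIZE = 100  # sample size used in the answer above


def bubble_sort_sample(data):
    """Bubble-sort a random sample of `data` and return the sorted sample."""
    # Assumption: random.sample replaces the question's getRandomSample().
    sample = random.sample(data, SAMPLE_SIZE)
    for passnum in range(len(sample) - 1, 0, -1):
        for i in range(passnum):
            if sample[i] > sample[i + 1]:
                sample[i], sample[i + 1] = sample[i + 1], sample[i]
    return sample


def run_sequential(data, jobs):
    """Run all jobs one after another in the current process."""
    return [bubble_sort_sample(data) for _ in range(jobs)]


def run_pool(data, jobs, workers):
    """Distribute the jobs over a fixed number of worker processes."""
    with Pool(processes=workers) as pool:
        tasks = (data for _ in range(jobs))
        return list(pool.imap_unordered(bubble_sort_sample, tasks, chunksize=50))


if __name__ == "__main__":
    data = [random.random() for _ in range(1000)]  # stand-in for myArray
    jobs = 3000

    t0 = time.perf_counter()
    run_sequential(data, jobs)
    t1 = time.perf_counter()
    run_pool(data, jobs, workers=4)
    t2 = time.perf_counter()

    print(f"sequential: {t1 - t0:.2f} s")
    print(f"pool of 4:  {t2 - t1:.2f} s")
```

Each task here sends the whole list to a worker, so for much larger inputs the pickling cost can eat into the gain; sending only indices or pre-drawn samples would be one way around that.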
MPI with C slower if more processes are used
The new code using MPI_Reduce() is faster and simpler than the previous one:
/*
Based on the code presented at http://condor.cc.ku.edu/~grobe/docs/intro-MPI-C.shtml
Code that calculates the sum of a vector using parallel computation.
If the main vector does not split equally among all processes, the leftover is passed to process id 0.
Process id 0 is the root process. However, it also performs part of the calculations.
Each process generates its part of the vector and calculates the partial sum of those values. MPI_Reduce() is then used to calculate the total sum.
Since the processes are independent, the printing order will differ from run to run.
compile as: mpicc -o vector_sum vector_sum.c -lm
run as: time mpirun -n x vector_sum
x = number of processes, including the root process. For example: if x = 3, the vector will be split in three, because process 0 also computes a part.
Acknowledgements: I would like to thank Gilles Gouaillardet (https://stackoverflow.com/users/8062491/gilles-gouaillardet) for the helpful suggestion.
*/
#include <stdio.h>
#include <mpi.h>
#include <math.h>

#define vec_len 100000000
double vec2[vec_len];

int main(int argc, char* argv[]){
    // defining program variables
    int i;
    double sum, partial_sum;

    // defining parallel step variables
    int my_id, num_proc, ierr, root_process;
    int vec_size, rows_per_proc, leftover, num_2_gen, start_point;

    vec_size = vec_len; // defining the main vector size

    ierr = MPI_Init(&argc, &argv);
    root_process = 0;
    ierr = MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
    ierr = MPI_Comm_rank(MPI_COMM_WORLD, &my_id);

    rows_per_proc = vec_size/num_proc; // elements per process; integer division already rounds down
    leftover = vec_size - num_proc*rows_per_proc; // counting the leftover

    if(my_id == 0){
        num_2_gen = rows_per_proc + leftover; // if there is a leftover, it is calculated in process 0
        start_point = my_id*num_2_gen; // the corresponding position in the main vector
    }
    else{
        num_2_gen = rows_per_proc;
        start_point = my_id*num_2_gen + leftover; // the corresponding position in the main vector
    }

    partial_sum = 0;
    for(i = start_point; i < start_point + num_2_gen; i++){
        vec2[i] = pow(i,2) + 1.0; // defining vector values
        partial_sum += vec2[i]; // calculating partial sum
    }
    printf("Partial sum of process id %d: %f.\n", my_id, partial_sum);

    // combining the partial sums of all processes into sum on the root process
    MPI_Reduce(&partial_sum, &sum, 1, MPI_DOUBLE, MPI_SUM, root_process, MPI_COMM_WORLD);

    if(my_id == root_process){
        printf("Total sum is %f.\n", sum);
    }

    ierr = MPI_Finalize();
    return 0;
}
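The way the C code above divides the vector among processes can be checked without MPI. The following sketch mirrors the same arithmetic in plain Python for a tiny vector; the function names `chunk_bounds` and `partial_sum` and the sizes (10 elements, 3 processes) are chosen for illustration only, and the final `sum(...)` plays the role that MPI_Reduce() with MPI_SUM plays in the C program.

```python
def chunk_bounds(vec_size, num_proc, rank):
    """Return (start, count) for `rank`, giving the leftover to rank 0."""
    rows_per_proc = vec_size // num_proc
    leftover = vec_size - num_proc * rows_per_proc
    if rank == 0:
        return 0, rows_per_proc + leftover
    return rank * rows_per_proc + leftover, rows_per_proc


def partial_sum(start, count):
    """Sum i^2 + 1 over one chunk, as each process does in the C code."""
    return sum(i * i + 1 for i in range(start, start + count))


if __name__ == "__main__":
    vec_size, num_proc = 10, 3
    parts = [chunk_bounds(vec_size, num_proc, r) for r in range(num_proc)]
    print(parts)  # [(0, 4), (4, 3), (7, 3)]

    # Adding the partial sums corresponds to the MPI_SUM reduction.
    total = sum(partial_sum(start, count) for start, count in parts)
    print(total)  # 295
```

The chunks are contiguous and cover every index exactly once, so the reduced total matches the sum over the whole vector.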