Using Pthread in C++

Using pthread in C?

It is not clear why you create a new thread every 10 seconds rather than just letting the original continue. Since the original thread exits, you aren't directly accumulating threads, but you aren't waiting for any of them, so there are some resources unreleased. You also aren't error checking, so you won't know when anything does go wrong; monitoring will simply stop.

You will eventually run out of space, one way or another. You have a couple of options.

  1. Don't create a new thread every 10 seconds. Leave the thread running by making a loop in the soilMoisture() function and do away with funct() — or at least the pthread_create() call in it.
  2. If you must create new threads, make them detached. You'll need to create a non-default pthread_attr_t using the functions outlined and linked to in When pthread_attr_t is not NULL.

There are myriad issues you've not yet dealt with, notably synchronization between the two threads. If you don't have any such synchronization, you'd be better off with two separate programs; the Unix mantra of "each program does one job but does it well" still applies. You'd have one program to do the soil moisture reading, and the other to do the water level reading. You'll need to decide whether data is stored in a database or otherwise logged, and for how long such data is kept. You'll need to think about rotating logs. What should happen if sensors go off-line? How can you restart threads or processes? How can you detect when threads or processes lock up or exit unexpectedly? Etc.

I assume the discrepancy between 10-15 minutes mentioned in the question and 10 seconds in the code is strictly for practical testing rather than a misunderstanding of the POSIX sleep() function.

What is the meaning of using -pthread when compiling in C?

What is that and why do we need to use this (and why other libraries don't require this)?

It depends on whether you want to use threading in test.c. The flag is not mandatory: you only need to specify it if your application uses threads.

-pthread is a compiler flag. More about -pthread from the gcc man page:

-pthread Add support for multithreading using the POSIX threads library. This option sets flags for both the preprocessor and linker. It does not affect the thread safety of object code produced by the compiler or that of libraries supplied with it.

How does the stack work in multithreaded programs using Pthread?

"share the process's memory space between all threads, that includes, stack"

Well, yes and no.

There's a difference between sharing a memory address space and a specific region of memory.

While it is true that all threads share an address space, each thread has its own stack (allocation of memory).

A shared address space means that a given virtual address (pointer value) refers to the same physical memory in all threads.

But a dedicated stack means that each thread's stack pointer starts off at a different place in that address space, so that the stacks do not conflict with each other.

Multithreading have NO improvement on speed - Using pthread in C - Why?

The answer in short is that the model works but you need to give each thread enough work to do to make it worthwhile to absorb the overhead of starting, stopping, and synchronizing the threads. And you must run on a computer capable of having multiple simultaneous threads running (multi-core machine).

I took the application you provided and modified it to actually compile. If I run this on a Linux machine that has many CPU cores available and give the myDo2 worker thread enough work to do, then I see results similar to the following:

./test width height num_threads
./test 10000 10000 1
Dauer: 17,660,185 Mikrosekunden

./test 10000 10000 2
Dauer: 7,864,508 Mikrosekunden

./test 10000 10000 8
Dauer: 1,100,126 Mikrosekunden

This means that with 8 threads the overall wall-clock time dropped from 17.7 seconds to 1.1 seconds, roughly a 16-fold improvement and more than the 8-fold gain the thread count alone would suggest (probably due to better memory and cache usage).

Yet if I give each thread too little work, then my times don't seem to be improving and actually get worse at some point.

./test 10 10 1
Dauer: 70 Mikrosekunden

./test 10 10 2
Dauer: 60 Mikrosekunden

./test 10 10 4
Dauer: 205 Mikrosekunden

Here you see that the overhead of starting a thread, then stopping and synchronizing with that thread is greater than the amount of work being done inside of the thread.

So the programming model works but you need to utilize it correctly.

I compiled the code below on RedHat using

gcc -std=gnu99 test.c -o test -l pthread

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <pthread.h>
#include <string.h>

typedef struct _threadinfo
{
    int from;
    int to;
    int width;
    int height;
    int blocksizeperthread;
    char **results;
    int threadno;
} threadinfo;

typedef struct _cplx
{
    float re;
    float im;
} Complex;

void *myDo2(void *tiptr)
{
    threadinfo *mythread = (threadinfo *)tiptr;
    // copy the parameters from the struct into this thread
    int l_from = mythread->from;
    int l_to = mythread->to;
    int l_width = mythread->width;
    int l_height = mythread->height;
    char **container = malloc(l_height * sizeof(char *));
    for (int i = 0; i < l_height; i++)
    {
        container[i] = malloc(l_width * 3 * sizeof(char));
    }

    int x, y;
    char iterate = 0;
    Complex c = { 0, 0 };
    Complex newz = { 0, 0 };
    float imageRelation = (float)l_width / (float)l_height;
    // zero-initialized because the toRGB() call below is commented out
    char blueGreenRed[3] = { 0, 0, 0 };
    // buffer for one whole row: width * 3 because of the 3 bytes per pixel
    char zeile[l_width * 3]; //1000*3
    int counter = 0;
    float zoom = 1.0;
    float colorLimit = 10.0;
    float quadLimit = 10.0;

    for (y = l_from; y <= l_to; ++y) //1..500
    {
        for (x = 1; x <= l_width; ++x) //1..1000
        {
            Complex z = { 0, 0 };
            float quad = 0;

            c.re = zoom * (-1.0 + imageRelation * ((x - 1.0) / (l_width - 1.0)));
            c.im = zoom * (0.5 - (y - 1.0) / (l_height - 1.0));

            // iterate
            for (iterate = 1; iterate < colorLimit && quad < quadLimit; ++iterate)
            {
                quad = z.re * z.re + z.im * z.im;

                newz.re = (z.re * z.re) - (z.im * z.im) + c.re;
                newz.im = z.re * z.im * 2.0 + c.im;

                z = newz;
            }
            //toRGB(iterate, blueGreenRed);
            // copy 3 bytes from blueGreenRed to zeile + (x-1)*3
            // note: zeile is a character array, so (x-1)*3 is used
            // to place one 3-byte packet per pixel into the row
            memcpy((zeile + (x - 1) * 3), blueGreenRed, 3);
        }
        memcpy(container[counter], zeile, l_width * 3);
        counter++;
    }

    mythread->blocksizeperthread = counter - 1;
    mythread->results = container;
    // status messages: thread number, row count, first and last row computed
    fprintf(stderr, "Ich bin Thread-Nr. %d\n", mythread->threadno);
    fprintf(stderr, "und habe eine Menge Zeilen von %d\n", mythread->blocksizeperthread);
    fprintf(stderr, "und habe berechnet von %d\n", l_from);
    fprintf(stderr, "und habe berechnet bis %d\n", l_to);
    return NULL;
}

int main(int argc, char **argv, char **envp)
{
    if (argc != 4)
    {
        printf("Bitte genau 3 Argumente eingeben.\n");
        return 1;
    }
    // structs and variables for the stopwatch
    struct timeval start, ende;
    long ttlende, ttlstart;
    int width;
    int height;

    width = atoi(argv[1]);
    height = atoi(argv[2]);

    int y;

    // BMP lines must be of lengths divisible by 4
    char span[4] = "\0\0\0\0";
    int spanBytes = 4 - ((width * 3) % 4);
    if (spanBytes == 4) spanBytes = 0;
    int psize = ((width * 3) + spanBytes) * height;

    //create chunks
    int threads = atoi(argv[3]);
    if (width < 1 || threads < 1 || threads > height)
    {
        // otherwise blocksize would be 0 and the chunking below breaks
        fprintf(stderr, "need width >= 1 and 1 <= num_threads <= height\n");
        return 1;
    }
    int i;
    int reminder = height % threads;
    int blocksize = height / threads;
    // one chunk per thread; the last chunk also takes the remainder rows
    int rounds = threads;
    int begin = 1;

    //init structs
    threadinfo *tinfoptr = malloc(sizeof(threadinfo) * rounds);
    if (tinfoptr == NULL)
    {
        fprintf(stderr, "Error allocating thread info\n");
        return 1;
    }
    for (i = 1; i <= rounds; ++i)
    {
        int res = blocksize * i;
        if (i == rounds)
        {
            res = res + reminder;
        }

        //update parameters of tinfo
        tinfoptr[i - 1].from = begin;
        tinfoptr[i - 1].to = res;
        tinfoptr[i - 1].width = width;
        tinfoptr[i - 1].height = res - begin + 1;
        tinfoptr[i - 1].results = NULL;
        tinfoptr[i - 1].threadno = i;
        tinfoptr[i - 1].blocksizeperthread = -1;
        // the old end is the start of the next block
        // (so the boundary row is computed by both neighbours)
        begin = res;
    }

    fprintf(stderr, "inti abgeschlossen, starte threads\n");

    // start the stopwatch before creating the threads,
    // so that thread start-up is part of the measurement
    gettimeofday(&start, NULL);

    pthread_t myThread[rounds];
    for (i = 1; i <= rounds; ++i)
    {
        fprintf(stderr, "Rufe Thread %d auf\n", i);
        if (pthread_create(&myThread[i - 1], NULL, myDo2,
                           (void *)(tinfoptr + (i - 1))))
        {
            fprintf(stderr, "Error creating thread\n");
            return 1;
        }
    }

    for (i = 1; i <= rounds; ++i)
    {
        /* wait for each thread to finish */
        if (pthread_join(myThread[i - 1], NULL))
        {
            fprintf(stderr, "Error joining thread\n");
            return 2;
        }
    }
    // stop the stopwatch, i.e. take a time stamp (NULL as per the docs)
    gettimeofday(&ende, NULL);

    ttlende = ende.tv_sec * 1000000 + ende.tv_usec;
    ttlstart = start.tv_sec * 1000000 + start.tv_usec;
    // "Dauer ... Mikrosekunden" = duration in microseconds
    fprintf(stderr, "\nDauer: %ld Mikrosekunden\n", (ttlende - ttlstart));

    return 0;
}

Using pthreads to speed up the processing of counting prime numbers from 0 to N. Am I using them correctly?

I see a few problems with the code:

  1. You never call pthread_join() on the threads, which means your program will exit immediately after spawning the threads, rather than waiting for them to complete -- probably not what you want. You should add a second for loop like this one to the bottom of your main() function:

     for (int i = 0; i < NUM_THREADS; i++) {
         pthread_join(threads[i], NULL);
     }

  2. Once the threads are joined, the call to pthread_exit() in main() is unnecessary and you can get rid of it. (It is mainly meant to be called from within a spawned thread to make that thread exit. Calling it from the main thread only keeps the process alive until the other threads finish, which the joins already ensure.)

  3. Calling printf() from within your threads' computation loop is going to slow down the computations greatly (to the point where you are no longer measuring the performance of your actual computations at all, but rather the speed at which printf() and the stdout subsystem execute).

  4. Keeping a shared/global counter that you have to guard with a mutex every time you find a new prime number isn't terribly efficient; better to declare a local/non-shared counter variable for each thread, and increment that. Then at the end of the thread's execution, you can add the thread's local counter to the shared/global counter just one time, and thereby avoid paying the synchronization penalty that comes with a lock()/unlock() sequence more than once per thread.

How to use pthread in C to prevent simultaneous read and write to a file on disk?

According to the description, your problem is about protecting the file on disk, not the stream handle (FILE *) that represents it. You can use a pthread read-write lock (rwlock) to synchronize concurrent access between multiple threads:

FILE *fp = fopen(...);
pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;

// read thread:
char buf[BUFLEN];
pthread_rwlock_rdlock(&rwlock);
fread(buf, sizeof buf, 1, fp);
pthread_rwlock_unlock(&rwlock);

// write thread:
pthread_rwlock_wrlock(&rwlock);
fwrite(buf, sizeof buf, 1, fp);
pthread_rwlock_unlock(&rwlock);

Note that this protects the file from concurrent access by multiple threads in the same process; it does not protect it from access by other processes on the system.

Why MultiThread is slower than Single?? (Linux,C,using pthread)

"I know MultiThread is faster"

This isn't always the case. A program is generally CPU-bound in some way, whether the limit comes from the core count, from how the OS schedules it, or from the hardware.

Deciding how many threads are worth giving to a process is a balancing act: with too many, you may run into an old Linux problem where more time is spent scheduling them than actually running them.

As this is very hardware and OS dependent, it is difficult to say exactly what the issue may be. Make sure you have the appropriate microcode for your CPU installed. It is generally installed by default in Ubuntu, but just in case, try:

sudo apt-get install intel-microcode 

Otherwise, look at what other processes are running; it may be that a lot of other things are running on the cores that are allocated to your process.

Can I use pthreads over the same function on C?

Yes, this can be done by making use of the argument to the thread function.

Each thread needs to loop over a range of values. So create a struct definition to contain the min and max values:

struct args {
    int min;
    int max;
};

Define a single thread function which converts the void * argument to a pointer to this type and reads it:

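void *thread_func(void *arg)
{
    // a and a_mutex are assumed to be the globals from the question;
    // sqrt() needs #include <math.h>
    struct args *myargs = arg;
    int i;
    long int localA = 0;
    for (i = myargs->min; i < myargs->max; i++)
    {
        localA = localA + i*a*sqrt(a);
    }
    // merge this thread's partial result once, under the lock
    pthread_mutex_lock(&a_mutex);
    a = a + localA;
    pthread_mutex_unlock(&a_mutex);
    return NULL;
}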

(Note that the function needs to return a void * to conform to the interface pthread_create expects.)

Then in your main function create an instance of this struct for each set of arguments, and pass that to pthread_create:

int main(int argc, char **argv)
{
    pthread_t one, two;
    struct args args1 = { 1, 50000000 };
    struct args args2 = { 50000000, 100000000 };
    pthread_create(&one, NULL, thread_func, &args1);
    pthread_create(&two, NULL, thread_func, &args2);
    pthread_join(one, NULL);
    pthread_join(two, NULL);
    return 0;
}

