How Can One Use Multi Threading in PHP Applications

Why is it not a good idea to use multithreading in PHP?

Does forking create a Thread ?

No. When we fork a process, the process space, that is to say the region of memory where the libraries and code the process requires to execute reside, is duplicated. The distinct but related processes then continue to execute at the will of the operating system's scheduler, in different regions of memory.
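To make the duplication concrete, here is a minimal sketch, assuming the pcntl extension is loaded (CLI on a Unix-like system only). The helper name demoForkIsolation is made up for illustration. The child changes a variable and reports what it saw through its exit status; the parent's copy is untouched.

```php
<?php
// Hypothetical helper (not from any library): forks once and reports
// what each process saw in its own copy of $counter.
// Assumes ext-pcntl, which is only available on the CLI on Unix-like systems.
function demoForkIsolation(): array
{
    $counter = 1;

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }

    if ($pid === 0) {
        // Child process: this assignment only touches the child's copy.
        $counter = 100;
        // Report the value through the exit status (must fit in 0..255).
        exit($counter);
    }

    // Parent process: wait for the child, then read its exit status.
    pcntl_waitpid($pid, $status);

    return [
        'parent_counter' => $counter,
        'child_counter'  => pcntl_wexitstatus($status),
    ];
}

$seen = demoForkIsolation();
printf("parent sees %d, child saw %d\n", $seen['parent_counter'], $seen['child_counter']);
```

The parent still sees 1 even though the child wrote 100, because after the fork each process works on its own region of memory.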

What is the difference between a Forked Process and a Thread ?

When we create a Thread we are telling the operating system that we want another unit of execution that can operate in the same region of memory as the Process that created it.

How different operating systems actually implement threads and processes is beyond the scope of this answer, and is unimportant.

Why is Forking a bad idea at the frontend ?

When you copy the whole address space, you duplicate the region of memory that the webserver is operating in too. This can obviously cause havoc for your operating system.

Why is Threading a bad idea at the frontend ?

If a client script instructs the operating system to create 8 threads in direct response to a web request, and 100 clients simultaneously request the script, you will be instructing your operating system to execute 800 threads concurrently.

CPUs and operating systems would need to look very very different to make that a good idea!

Where is Threading a good idea?

Multi-threaded software, and extremely capable hardware, is ubiquitous; computing would not be what it is without it.

In the context of Web infrastructure, MySQL and other database servers are multi-threaded; indeed, Apache can deploy PHP in a multi-threaded infrastructure, though I wouldn't recommend it.

When we look at how enterprise applications like MySQL actually provide their extremely complex services, we can see that their processes (and therefore threads) are completely isolated from the infrastructure of your web application.

This is how we use Threads in languages that support them; we design systems whose means of providing their services is via some sane form of IPC, we isolate our complex infrastructure, completely, from that which should be simple: our web applications.
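As a rough sketch of that separation, the script below has a "frontend" talk to a "backend" purely over a socket. Everything here is an assumption for illustration: the one-JSON-object-per-line wire format, the function names handleRequest and callBackend, and the use of stream_socket_pair plus pcntl_fork as a stand-in for a real, separately deployed service (Unix-like CLI only).

```php
<?php
// Hypothetical wire format: one JSON object per line in each direction.

// Backend side: the complex work (possibly multi-threaded) would live here.
function handleRequest(string $line): string
{
    $request = json_decode($line, true);
    $sum = array_sum($request['numbers']);
    return json_encode(['id' => $request['id'], 'sum' => $sum]) . "\n";
}

// Frontend side: knows nothing about how the backend does its work.
function callBackend($socket, array $request): array
{
    fwrite($socket, json_encode($request) . "\n");
    return json_decode(fgets($socket), true);
}

// A connected socket pair stands in for a real network connection.
[$frontend, $backend] = stream_socket_pair(
    STREAM_PF_UNIX,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child plays the isolated backend service until the peer disconnects.
    fclose($frontend);
    while (($line = fgets($backend)) !== false) {
        fwrite($backend, handleRequest($line));
    }
    exit(0);
}

// Parent plays the simple web application.
fclose($backend);
$reply = callBackend($frontend, ['id' => 1, 'numbers' => [1, 2, 3]]);
fclose($frontend);
pcntl_waitpid($pid, $status);

printf("backend replied: sum=%d\n", $reply['sum']);
```

In a real deployment the backend would be a long-running service started independently of the web server, and the frontend would only ever open a connection to it.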

Is PHP really suitable for Threads ?

The memory model for PHP is shared nothing: this means that each interpreter context, in the sense of the structures and memory PHP requires to operate, is isolated from any other context.

This always has to be true for PHP to work as intended; an implementation of threading for PHP that was ignorant of the way PHP worked simply would not function.

pthreads goes to great lengths to ensure the memory model is not broken: no Thread shares memory directly with any other Thread.

Are Threads really suitable for me ?

Firstly, seriously think about the following questions:

  • Is Threading really required ?
  • What other ways can you find to achieve whatever it is you are setting out to do ?

Multi-threaded software is complex by nature; something being complicated is no kind of excuse for avoiding it, in my opinion.

But be aware that multi-threaded software is fundamentally different to your average PHP application. You have to think about things you have never had to think about before, and be aware of things that didn't matter before you started your first Thread.

You should not guess at what these things are. You should seek to educate yourself in the subject as thoroughly as possible, and be prepared to fail and to persevere.

The complexity of anything decreases as your knowledge increases; that's how learning works. Here is where it begins:

https://gist.github.com/krakjoe/6437782

It continues in the manual, in the many examples distributed with pthreads, in Stack Overflow searches and questions, and results in glory, in my opinion.

Multi Threading / Multi Tasking in PHP

PHP has had a threading model for a very long time, since the first release of PHP4, May 22nd 2000.

Threading at the frontend

Creating user threads at the frontend of a web application doesn't make any sense; it is extremely difficult to scale. The thread-per-client model that the Apache Worker MPM binary and mod_php employ is not really something you want to use to serve your websites, and if you are using it, you certainly do not want to create additional threads in direct response to any web request.

Why are threads at the frontend a bad idea ?

You may often hear developers say threads at the frontend do not make sense, without providing the rationale for such an assertion. When you learn to think about systems in the required way the problem becomes obvious:

If a client script creates 8 threads in direct response to a web request, and 100 clients request the script simultaneously, you are requesting that your hardware execute 800 threads concurrently.

CPUs would have to look and work very, very differently indeed to make that a good idea.

What can we do about it ?

Enterprise solutions might well have a PHP website facing the public, but the actual brains of the system are written in languages that have good support for the things you need to build enterprise solutions, such as Java, C#, C++ or whatever the language-of-the-day is.

You should use pthreads in the same way: by designing systems whose component parts are separated from one another and connected only by well-designed, high-performance (RPC) APIs, such that the complexity inherent in designing a multi-threaded architecture is isolated completely from your public-facing websites, and from the simple, scalable setup that such a website requires.

U can now haz codes

Let's start at the beginning with Hello World:

<?php
class My extends Thread {
    public function run() {
        printf("Hello World\n");
    }
}

/* create a new Thread */
$my = new My();

/* start the Thread */
$my->start();

/* do not allow PHP to manage the shutdown of your Threads */
/* if a variable goes out of scope in PHP it is destroyed */
/* joining explicitly ensures integrity of the data contained in an object's */
/* members while other contexts may be accessing them */
$my->join();
?>

Boring, but I hope you read it ;)

So in a real system, you don't really want to be creating threads so explicitly; you surely want to just submit tasks to some executor service. All of the complex systems I have ever seen, complex in the sense of their multi-tasking requirements, use such things ...

<?php
class My extends Threaded {
    public function run() {
        printf("Hello World from %s#%lu\n",
            __CLASS__, Thread::getCurrentThreadId());
    }
}

/* create a Pool of four threads */
/* threads in a pool are created when required */
$pool = new Pool(4);

/* submit a few tasks to the pool */
$tasks = 100;
while ($tasks--) {
    $pool->submit(new My());
}

/* shutting down the pool is tantamount to joining all workers */
/* remember what I said about joining ? */
$pool->shutdown();
?>

I have given you very brief explanations of complicated things; you should endeavor to read all you can:

  • https://gist.github.com/krakjoe/6437782
  • https://gist.github.com/krakjoe/9384409
  • http://php.net/pthreads

Many examples can be found here: https://github.com/krakjoe/pthreads/tree/master/examples

Disclaimer: There's nothing really wrong with a server architecture that uses threading, but the moment you start to create additional threads, you restrict its scalability and its ability to perform as it was designed. I can imagine well-designed architectures that do have the ability to thread at the frontend, but it is not an easy thing to aim for. Additionally, threading is not the only thing in the toolbox when it comes to high-performance web-targeted applications; research all your options.

Does PHP have threading?

There is nothing available that I'm aware of. The next best thing would be to simply have one script execute another via CLI, but that's a bit rudimentary. Depending on what you are trying to do and how complex it is, this may or may not be an option.
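A minimal sketch of that "one script executes another via CLI" approach, using proc_open so that several child PHP processes run at the same time. The helper name runInParallel is made up for illustration, and the array form of the command assumes PHP 7.4 or later. Inline code passed with -r keeps the example self-contained; in practice you would pass the path of a worker script instead.

```php
<?php
// Hypothetical helper: starts one PHP CLI process per snippet, lets them all
// run concurrently, then collects each process's standard output.
// Assumes PHP >= 7.4 (array command form of proc_open) and the CLI SAPI.
function runInParallel(array $snippets): array
{
    $running = [];

    // Start every child first so they execute at the same time.
    foreach ($snippets as $key => $code) {
        $pipes = [];
        $process = proc_open(
            [PHP_BINARY, '-r', $code],   // PHP_BINARY: path of the running PHP
            [1 => ['pipe', 'w']],        // capture the child's stdout
            $pipes
        );
        if (!is_resource($process)) {
            throw new RuntimeException("could not start child '$key'");
        }
        $running[$key] = [$process, $pipes[1]];
    }

    // Then read each child's output until it closes its stdout.
    $results = [];
    foreach ($running as $key => [$process, $stdout]) {
        $results[$key] = stream_get_contents($stdout);
        fclose($stdout);
        proc_close($process);
    }

    return $results;
}

$out = runInParallel([
    'a' => 'usleep(200000); echo "A done";',
    'b' => 'usleep(200000); echo "B done";',
]);
print_r($out);
```

Because both children are started before either is read, the total wall time is roughly that of the slowest child rather than the sum of all of them.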

Is it possible to speed up scripts using multi-threading?

Multi-threading is certainly possible in DM-scripting, and it is documented in the F1 help here:

F1 help on threading

Whether or not a speed improvement can be achieved depends on various things, most importantly whether or not the individual threads need access to the same resource (like the same data, or some GMS resource which is only available via the main thread, e.g. the UI).

Also, a lot of data processing is already multi-threaded internally when you use commands on image-expressions. You might achieve a lot more speedup by rephrasing your analytical processing in a way that doesn't require for-loops in the scripting language but uses image-expressions instead.

Finally, doing things multi-threaded is a great way of introducing bugs and unexpected behavior which can be really hard to debug. Don't be frustrated if you run into those things while learning stuff.

That said, the below example demonstrates (at least on my machine) a speed improvement by "chunking" some data-analysis over multiple parallel background threads.

// Example showing the explicit use of multi-threading in DM scripting

class CMultiThreadtest
{
    image data, keep
    number sx,sy,sz
    number nChunks, chunksize, lastChunk, doneChunk

    object SetData(object self, image img, number nChunks_)
    {
        if ( img.imagegetNumdimensions() != 3 ) throw( "only 3D data for testing please")
        img.ImageGetDimensionSizes(sx,sy,sz)
        nChunks = nChunks_
        if ( sz % nChunks != 0 ) Throw( "Z-size needs to be integer multiple of nChunks for this test.")
        chunksize = sz / nChunks

        data:=img
        keep = data

        return self
    }

    void CompareResult(object self){
        image dif := keep*2-data
        number ok = 0==sum(dif)
        Result("\n\t Result is " + (ok?"correct":"wrong"))
    }

    void RunOnData(object self){

        // For extra caution regarding thread safety, the two lines below should be guarded with critical sections
        // but given the near-atomic nature of the call, this is omitted here.
        number chunkIndex = lastChunk
        lastChunk++

        image work := data.slice3(0,0,chunkIndex*chunksize, 0,sx,1, 1,sy,1, 2,chunksize,1)
        number startp = GetHighresTickCount()
        for( number z=0;z<chunksize;z++)
            for( number y=0;y<sy;y++ )
                for( number x=0;x<sx;x++ ){
                    work[x,y,z] *= 2
                }
        number endp = GetHighresTickCount()
        Result("\n\t\t Process (chunk "+chunkIndex+") done with " + sx*sy*chunksize + " steps in " + (endp-startp)/GetHighResTicksPerSecond())

        // For extra caution regarding thread safety, the line below should be guarded with critical sections
        // but given the near-atomic nature of the call, this is omitted here.
        doneChunk++
    }

    void RunWithSubsets(object self, image src, number nChunks_, number inbackground){
        self.SetData(src, nChunks_)
        lastChunk = 0
        doneChunk = 0
        Result("\n.....\n Running with "+nChunks+" chunks of size "+chunksize+ " on " + (inbackground?" multiple background threads":" single main thread") +":")
        number startp = GetHighresTickCount()
        for( number i=0; i<nChunks; i++){
            if ( inbackground )
                self.StartThread("runondata")
            else
                self.RunOnData()
        }

        // Wait until all chunks are done; hold SHIFT to abort.
        // doEvents() sits outside the if-block so that it is actually reached.
        while( doneChunk != nChunks ){
            if ( ShiftDown() )
                Throw("abort")
            doEvents()
        }
        number endp = GetHighresTickCount()
        Result("\n Total duration:" + (endp-startp)/GetHighResTicksPerSecond())
        self.CompareResult();
        Result("\n.....")

    }
};

void Test(){
    image img := RealImage("test cub",4,50,50,10)
    img = random()
    clearresults()
    object tester = Alloc(CMultiThreadtest)
    tester.RunWithSubsets(img, 1, 0)
    tester.RunWithSubsets(img, 1, 1)
    tester.RunWithSubsets(img, 5, 0)
    tester.RunWithSubsets(img, 5, 1)
}
test()

