PHP Multiple Curl Requests

  • Reuse the same cURL handle ($ch) without calling curl_close() after each request. This speeds things up a little, since a kept-alive connection can be reused (see the sketch after this list).
  • Use curl_multi_init to run the requests in parallel. This can have a tremendous effect.
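
Here is a minimal sketch of the first point, reusing one handle across sequential requests ($urls is just a placeholder list):

$urls = array('http://site1.com/', 'http://site2.com/');

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

foreach ($urls as $url) {
    // Only the URL changes; the handle (and any kept-alive connection) is reused.
    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
}

curl_close($ch); // Close once, at the very end.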

PHP - multiple curl requests curl_multi speed optimizations

The current implementation of curl_multi_select() in PHP doesn't block and doesn't respect the timeout parameter; maybe that will be fixed later. The proper way of waiting is not implemented in your code: it has to be two loops. Here is some tested code from my bot as an example:

# $this->murl is the bot's curl_multi handle; BotError is its exception class.
$wait    = 0.02; # select timeout in seconds (<1 sec); tune to taste
$running = 1;
while ($running)
{
    # execute request
    if ($a = curl_multi_exec($this->murl, $running)) {
        throw BotError::text("curl_multi_exec[$a]: ".curl_multi_strerror($a));
    }
    # check finished
    if (!$running) {
        break;
    }
    # wait for activity
    while (!$a)
    {
        if (($a = curl_multi_select($this->murl, $wait)) < 0)
        {
            throw BotError::text(
                ($a = curl_multi_errno($this->murl))
                ? "curl_multi_select[$a]: ".curl_multi_strerror($a)
                : 'system select failed'
            );
        }
        usleep($wait * 1000000); # wait for some time, <1 sec
    }
}

Get all the URLs using multi curl

Here is a function I put together that properly utilizes the curl_multi_init() function. It is more or less the same function you will find on PHP.net, with some minor tweaks. I have had great success with this.

function multi_thread_curl($urlArray, $optionArray, $nThreads) {

    //Group your urls into batches of $nThreads.
    $curlArray = array_chunk($urlArray, $nThreads, true);

    //Collected responses, keyed like the input array.
    $results = array();

    //Iterate through each batch of urls.
    $ch = 'ch_';
    foreach ($curlArray as $threads) {

        //Create your cURL resources.
        foreach ($threads as $thread => $value) {
            ${$ch . $thread} = curl_init();
            curl_setopt_array(${$ch . $thread}, $optionArray); //Set your main curl options.
            curl_setopt(${$ch . $thread}, CURLOPT_URL, $value); //Set url.
        }

        //Create the multiple cURL handle.
        $mh = curl_multi_init();

        //Add the handles.
        foreach ($threads as $thread => $value) {
            curl_multi_add_handle($mh, ${$ch . $thread});
        }

        $active = null;

        //Execute the handles.
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);

        while ($active && $mrc == CURLM_OK) {
            //Wait for activity; back off briefly if select fails,
            //so we don't spin in a busy loop.
            if (curl_multi_select($mh) == -1) {
                usleep(100);
            }
            do {
                $mrc = curl_multi_exec($mh, $active);
            } while ($mrc == CURLM_CALL_MULTI_PERFORM);
        }

        //Get your data and close the handles.
        foreach ($threads as $thread => $value) {
            $results[$thread] = curl_multi_getcontent(${$ch . $thread});
            curl_multi_remove_handle($mh, ${$ch . $thread});
            curl_close(${$ch . $thread});
        }

        //Close the multi handle.
        curl_multi_close($mh);

    }

    return $results;

}

//Add whatever options here. The CURLOPT_URL is left out intentionally;
//it will be added in later from the url array.
$optionArray = array(
    CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0', //Pick your user agent.
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_TIMEOUT        => 10
);

//Create an array of your urls.
$urlArray = array(
    'http://site1.com/',
    'http://site2.com/',
    'http://site3.com/'
);

//Play around with this number and see what works best.
//This is how many urls it will try to do at one time.
$nThreads = 20;

//To use, run the function.
$results = multi_thread_curl($urlArray, $optionArray, $nThreads);

Once this is complete, you will have an array containing all of the HTML from your list of websites. At this point I would loop through them and pull out all of the URLs.

Like so:

foreach ($results as $page) {

    $dom = new DOMDocument();
    @$dom->loadHTML($page); //Suppress warnings from malformed HTML.
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");

    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url  = $href->getAttribute('href');
        $url  = filter_var($url, FILTER_SANITIZE_URL);
        //Validate url.
        if (!filter_var($url, FILTER_VALIDATE_URL) === false) {
            echo '<a href="'.$url.'">'.$url.'</a><br />';
        }
    }

}

It is also worth keeping in the back of your head the ability to increase the run time of your script. This is done with:

ini_set('max_execution_time', 120);

You can always try more time, but you'll never know till you time it. Note that if you're using a hosting service, you may be restricted to something in the ballpark of two minutes regardless of what you set your max execution time to. Just food for thought.

Hope it helps.

Making multiple curl requests without timeout.

Is there a way to modify the endpoint API so it can process multiple ids per request? If yes, that is preferred, because if you run several thousand requests at the same time, you are effectively doing something like a DDoS attack.
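
As a rough sketch of that idea, assuming a hypothetical endpoint that accepts a comma-separated ids parameter:

$ids = range(1, 5000);

// One request per batch of ids instead of one request per id.
foreach (array_chunk($ids, 100) as $batch) {
    $ch = curl_init('https://api.example.com/items?ids='.implode(',', $batch));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);
    // ...decode and process $response...
}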

However, you may want to check PHP's curl_multi_* functions (http://us3.php.net/manual/en/function.curl-multi-exec.php).

Another link that can be useful: http://www.onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/

PHP Multi-cURL requests delayed until timeout

My current theory is that there is a bug in Facebook's fileserver that means the connection is sometimes not closed even though the data has been sent, so the connection stays open until the timeout. In the absence of the (optional) Content-Length header from Facebook's fileserver, cURL can't know whether the payload is complete, and so it hangs.

My current solution is to 'prime' the fileserver by first making a request for the image without a body, like this:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_NOBODY, 1); //Headers only; no body is transferred.
curl_exec($ch);
curl_close($ch);

This is a pretty quick process, since there is no image being returned. I actually do this in the background using asynchronous multi curl, so I can get on with doing some other processing.
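
A minimal sketch of that background priming, assuming $imageUrls holds the file URLs:

$mh = curl_multi_init();
$primeHandles = array();
foreach ($imageUrls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, 1); // Headers only, no image body.
    curl_multi_add_handle($mh, $ch);
    $primeHandles[] = $ch;
}

// Kick the transfers off.
curl_multi_exec($mh, $active);

// ...do other processing here, then let the priming finish.
while ($active) {
    curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh);
    }
}

foreach ($primeHandles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);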

After priming the fileserver, subsequent requests for the files are even quicker than before, as the content-length is known.

This is a bit of a clumsy approach, but in the absence of any response from Facebook so far I'm not sure what else to do.

php multiple curl urls with while loop

Try the following:

<?php

function callCURL($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $combined = curl_exec($ch);
    curl_close($ch);
    return $combined;
}

function getResult($urls) {
    $return = array();
    foreach ($urls as $url) {
        $response = callCURL($url);
        if ($response !== false && strlen($response) !== 0) {
            $return[] = $response;
            break; //Stop at the first url that returns a non-empty response.
        }
    }
    return $return;
}

$urls = array(
    "example.com/value1.php?process=$param",
    "example.com/value2.php?process=$param",
    "example.com/value3.php?process=$param"
);

$result = getResult($urls);

Access multiple URL at once in Curl PHP

You can use PHP's multi cURL functions: https://www.php.net/manual/en/function.curl-multi-init.php

Below is code that opens parallel requests.

$time_Start = microtime(true);

$ids = array(1,2,3,4,5,6); // Your forex currency ids.
$response = php_curl_multi($ids);

echo "Time: ".(microtime(true) - $time_Start)." sec";
// Time: 0.7 sec

Function

function php_curl_multi($ids) {
    $parameters = "/api/forex/indicators?period=1d&access_key=API_KEY&id="; // The id is appended dynamically.
    $url = "https://fcsapi.com".$parameters;

    $ch_index = array(); // Store all curl handles.
    $response = array();

    // Create a cURL resource for each id.
    foreach ($ids as $key => $id) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url.$id);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $ch_index[] = $ch;
    }

    // Create the multiple cURL handle.
    $mh = curl_multi_init();

    // Add the handles.
    foreach ($ch_index as $ch) {
        curl_multi_add_handle($mh, $ch);
    }

    // Execute the multi handle, waiting for activity between runs.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh);
        }
    } while ($active && $status == CURLM_OK);

    // Collect the responses and close the handles.
    foreach ($ch_index as $ch) {
        $response[] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $response;
}

