How to Post a Large Amount of Data Within PHP Curl Without Memory Overhead

How to POST a large amount of data within PHP curl without memory overhead?

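Use CURLOPT_INFILE. curl then reads the request body from an open file handle in chunks, instead of holding the whole payload in a PHP string.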

$curl = curl_init();
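curl_setopt( $curl, CURLOPT_PUT, 1 ); // upload mode: read the body from CURLOPT_INFILE
curl_setopt( $curl, CURLOPT_INFILESIZE, filesize($tmpFile) ); // sent as Content-Length
curl_setopt( $curl, CURLOPT_INFILE, ($in=fopen($tmpFile, 'rb')) ); // 'b' keeps the bytes untouched on Windows
curl_setopt( $curl, CURLOPT_CUSTOMREQUEST, 'POST' ); // switch the verb back from PUT to POST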
curl_setopt( $curl, CURLOPT_HTTPHEADER, [ 'Content-Type: application/json' ] );
curl_setopt( $curl, CURLOPT_URL, $url );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, 1 );
$result = curl_exec($curl);
curl_close($curl);
fclose($in);
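
When the payload is generated on the fly rather than sitting in a file, the same streaming idea can probably be sketched with CURLOPT_READFUNCTION. This is a minimal sketch, not the method from the answer above: make_read_callback() and stream_post() are hypothetical names, and it assumes the server accepts a chunked request body, because no CURLOPT_INFILESIZE is set.

```php
<?php
// Hypothetical helper: wraps any iterable of strings in a CURLOPT_READFUNCTION callback.
function make_read_callback(iterable $chunks): Closure
{
    // Turn arrays and generators alike into one generator we can advance manually.
    $iterator = (function () use ($chunks) { yield from $chunks; })();
    $buffer = '';

    return function ($curl, $stream, int $length) use ($iterator, &$buffer): string {
        // Refill the buffer until the request can be satisfied or the data runs out.
        while (strlen($buffer) < $length && $iterator->valid()) {
            $buffer .= $iterator->current();
            $iterator->next();
        }
        $out = substr($buffer, 0, $length);
        $buffer = (string) substr($buffer, $length);
        return $out; // an empty string tells curl the body is finished
    };
}

// Hypothetical helper: POST the chunks to $url without building one big string.
function stream_post(string $url, iterable $chunks)
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_UPLOAD, true);          // take the body from the read callback
    curl_setopt($curl, CURLOPT_CUSTOMREQUEST, 'POST'); // keep the POST verb
    curl_setopt($curl, CURLOPT_READFUNCTION, make_read_callback($chunks));
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);  // the response is assumed to be small
    $result = curl_exec($curl);
    curl_close($curl);
    return $result; // string on success, false on failure
}
```

Only one chunk plus a small buffer is held in memory at a time, so the callback can be fed from a generator that reads database rows or builds JSON piece by piece.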

PHP cUrl - Unable to fetch large files

What do you think this line does? curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); It tells PHP to capture everything curl_exec() would otherwise output and hold it all in memory until the transfer finishes. That is both slow (nothing is written to disk until the download is 100% complete, and unless you're running on SSDs, disks are slow) and extremely memory hungry (the entire file sits in memory at once). Neither is desirable. Instead, do $fp=fopen(basename($url),'wb');curl_setopt($ch,CURLOPT_FILE,$fp); Now curl writes the content directly to disk as it is downloaded, which is much faster and uses only a small amount of RAM, no matter how big the file is.

  • Also note: if you're going to run a large number of slow downloads simultaneously, PHP behind a web server is simply a bad tool for the job. The number of concurrent PHP processes you can run is usually very limited, and your entire website stops loading when all of them are busy. PHP aborts if the client disconnects for some reason (see ignore_user_abort()), many web servers time out if the script takes too long (see nginx proxy_read_timeout, for example), and PHP often kills itself for timeout reasons (see set_time_limit()). If that's the case, consider writing the downloader in another language; for example, Go's goroutines should be able to handle a massive number of concurrent slow downloads with little resource usage, unlike PHP.
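
The CURLOPT_FILE approach described above could be wrapped up roughly as follows. This is a sketch under assumptions: download_to_file() is a hypothetical helper name, and the choice to follow redirects and to delete the partial file on failure is mine, not part of the original answer.

```php
<?php
// Hypothetical helper: download $url straight into $destination without buffering it in PHP.
function download_to_file(string $url, string $destination): bool
{
    $fp = fopen($destination, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // curl writes each received chunk to the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumption: redirects should be followed
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    $ok = curl_exec($ch);                           // true on success when CURLOPT_FILE is used
    curl_close($ch);
    fclose($fp);                                    // flush and release the file handle

    if ($ok !== true) {
        unlink($destination);                       // do not leave a truncated file behind
        return false;
    }
    return true;
}
```

Because CURLOPT_RETURNTRANSFER is never set, the response body does not pass through a PHP string at all.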

Reading POST data in PHP from cUrl

If you need the zip file from the response, you could write the curl response to a temp file and stream that as a workaround.
I have never tried this with multipart responses, but I guess it should work.

$fh = fopen('/tmp/foo', 'w'); 
$cUrl = curl_init('http://example.com/foo');
curl_setopt($cUrl, CURLOPT_FILE, $fh); // redirect output to filehandle
curl_exec($cUrl);
curl_close($cUrl);
fclose($fh); // close filehandle or the file will be corrupted

If you do NOT need anything but the XML part of the response, you might want to disable headers

curl_setopt($cUrl, CURLOPT_HEADER, FALSE);

and add a header so that only XML is accepted as the response

curl_setopt($cUrl, CURLOPT_HTTPHEADER, array('Accept: application/xml'));
// This is a workaround: curl has no dedicated option for this, but HTTP allows it via the Accept header

[EDIT]

A shot in the dark...
Can you test with these curlopt settings to see if modifying them helps?

$headers = array (
'Content-Type: multipart/form-data; boundary=' . $boundary,
'Content-Length: ' . strlen($requestBody),
'X-EBAY-API-COMPATIBILITY-LEVEL: ' . $compatLevel, // API version
'X-EBAY-API-DEV-NAME: ' . $devID,
'X-EBAY-API-APP-NAME: ' . $appID,
'X-EBAY-API-CERT-NAME: ' . $certID,
'X-EBAY-API-CALL-NAME: ' . $verb,
'X-EBAY-API-SITEID: ' . $siteID,
);

$cUrl = curl_init();
curl_setopt($cUrl, CURLOPT_URL, $serverUrl);
curl_setopt($cUrl, CURLOPT_TIMEOUT, 30 );
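curl_setopt($cUrl, CURLOPT_SSL_VERIFYPEER, 0); // disables certificate verification; for debugging only
curl_setopt($cUrl, CURLOPT_SSL_VERIFYHOST, 0); // disables host name verification; for debugging only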
curl_setopt($cUrl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($cUrl, CURLOPT_POST, 1);
curl_setopt($cUrl, CURLOPT_POSTFIELDS, $requestBody);
curl_setopt($cUrl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($cUrl, CURLOPT_FAILONERROR, 0 );
curl_setopt($cUrl, CURLOPT_FOLLOWLOCATION, 1 );
curl_setopt($cUrl, CURLOPT_HEADER, 0 );
curl_setopt($cUrl, CURLOPT_USERAGENT, 'ebatns;xmlstyle;1.0' );
curl_setopt($cUrl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0 ); // HTTP version must be 1.0
$response = curl_exec($cUrl);

if ( !$response ) {
print "curl error " . curl_errno($cUrl ) . PHP_EOL;
}
curl_close($cUrl);

[EDIT II]

This is just an attempt; as mentioned, I cannot get my own test pages to respond with multipart form data, so be gentle with me here ;)

$content_type = ""; // use the last known content-type as a trigger
$tmp_cnt_file = "tmp/tmpfile";
$xml_response = ""; // this will hold the "usable" curl response
$hidx = 0; //header index.. counting the number of different headers received

function read_header($cUrl, $string)// this will be called once for every line of each header received
{
global $content_type, $hidx;
$length = strlen($string);
if (preg_match('/Content-Type:(.*)/i', $string, $match)) // header names are case-insensitive
{
$content_type = $match[1];
$hidx++;
}
/*
should set $content_type to 'application/xop+xml; charset=utf-8; type="text/xml"' for the first
and to 'application/zip' for the second response body

echo "Header: $string<br />\n";
*/
return $length;
}

function read_body($cUrl, $string)
{
global $content_type, $xml_response, $tmp_cnt_file, $hidx; // $content_type is set by read_header()
$length = strlen($string);
if(stripos ( $content_type , "xml") !== false)
$xml_response .= $string;
elseif(stripos ($content_type, "zip") !== false)
{
$handle = fopen($tmp_cnt_file."-".$hidx.".zip", "a");
fwrite($handle, $string);
fclose($handle);
}
/*
elseif {...} else{...}
depending on your needs

echo "Received $length bytes<br />\n";
*/
return $length;
}

and of course set the proper curlopts

// Set callback function for header
curl_setopt($cUrl, CURLOPT_HEADERFUNCTION, 'read_header');
// Set callback function for body
curl_setopt($cUrl, CURLOPT_WRITEFUNCTION, 'read_body');

Don't forget NOT to save the curl response to a variable, because of the memory issues;
hopefully all you need will be in the $xml_response above anyway.

//$response = curl_exec($cUrl);
curl_exec($cUrl);

For parsing the response you can then refer to $xml_response and the temp files you created, starting with tmp/tmpfile-2 in this scenario. Again, I have not been able to test the code above in any way, so this might not work (but it should, imho ;))
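
The same two callbacks can be sketched without globals by keeping the state in an object. ResponseSplitter is a hypothetical class name, and the sketch shares the untested assumption of the code above, namely that a Content-Type line for each part actually reaches the header callback.

```php
<?php
// Hypothetical class: keeps XML parts in memory and appends ZIP parts to files on disk.
class ResponseSplitter
{
    public $xml = '';          // accumulated XML response
    private $contentType = ''; // last Content-Type seen by onHeader()
    private $index = 0;        // number of Content-Type headers seen so far
    private $prefix;           // path prefix for the ZIP temp files

    public function __construct(string $prefix)
    {
        $this->prefix = $prefix;
    }

    // For CURLOPT_HEADERFUNCTION: called once per header line.
    public function onHeader($curl, string $line): int
    {
        if (preg_match('/^Content-Type:\s*(.*)$/i', trim($line), $match)) {
            $this->contentType = $match[1];
            $this->index++;
        }
        return strlen($line); // curl aborts unless the full length is returned
    }

    // For CURLOPT_WRITEFUNCTION: called once per received chunk.
    public function onBody($curl, string $chunk): int
    {
        if (stripos($this->contentType, 'xml') !== false) {
            $this->xml .= $chunk;
        } elseif (stripos($this->contentType, 'zip') !== false) {
            $file = $this->prefix . '-' . $this->index . '.zip';
            file_put_contents($file, $chunk, FILE_APPEND);
        }
        return strlen($chunk);
    }
}

// Wiring, assuming $cUrl is an existing curl handle:
//   $splitter = new ResponseSplitter('tmp/tmpfile');
//   curl_setopt($cUrl, CURLOPT_HEADERFUNCTION, [$splitter, 'onHeader']);
//   curl_setopt($cUrl, CURLOPT_WRITEFUNCTION, [$splitter, 'onBody']);
```

After curl_exec() the XML is available in $splitter->xml, and the ZIP data sits in files named after the prefix and the header index.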

[EDIT III]

Say we want curl to write all incoming data directly to another (outgoing) stream, in this case a socket connection

I'm not sure if it is as easy as this:

$fs = fsockopen($host, $port, $errno, $errstr);
$cUrl = curl_init('http://example.com/foo');
curl_setopt($cUrl, CURLOPT_FILE, $fs); // redirect output to sockethandle
curl_exec($cUrl);
curl_close($cUrl);
fclose($fs); // close handle

else we will have to use our known write and header functions with just a little trick

//first open the socket (before initiating curl)
$fs = fsockopen($host, $port, $errno, $errstr);
// now for the new callback function
function socket_pipe($cUrl, $string)
{
global $fs;
$length = strlen($string);
fputs($fs, $string); // add NOTHING to the received line just send it to $fs; that was easy wasn't it?
return $length;
}
// and of course for the CURLOPT part
// Set callback function for header
curl_setopt($cUrl, CURLOPT_HEADERFUNCTION, 'socket_pipe');
// Set the same callback function for body
curl_setopt($cUrl, CURLOPT_WRITEFUNCTION, 'socket_pipe');

// do not forget to
fclose($fs); //when we're done

The catch is that piping the result to $fs unedited requires Apache to be listening on a certain port that you then assign your script to.
Otherwise you will need to write ONE request line directly after fsockopen

fputs($fs, "POST $path HTTP/1.0\r\n"); // where $path is your script, of course; note the handle is $fs
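
The piping callback can also be written as a closure, so that no global $fs is needed. This is only a sketch: make_pipe_callback() is a hypothetical name, and it assumes the target stream is blocking, so that fwrite() normally writes the whole chunk.

```php
<?php
// Hypothetical helper: returns a callback that forwards every chunk to $target unchanged.
function make_pipe_callback($target): Closure
{
    return function ($curl, string $data) use ($target): int {
        $written = fwrite($target, $data);
        // Returning fewer bytes than curl passed in makes curl abort the transfer.
        return $written === false ? 0 : $written;
    };
}

// Wiring, assuming $cUrl is a curl handle and $fs an open socket:
//   $pipe = make_pipe_callback($fs);
//   curl_setopt($cUrl, CURLOPT_HEADERFUNCTION, $pipe);
//   curl_setopt($cUrl, CURLOPT_WRITEFUNCTION, $pipe);
```

Any writable stream works as the target, which makes the callback easy to try out against php://memory before pointing it at a real socket.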

PHP cURL: how to set body to binary data?

You can just set your body in CURLOPT_POSTFIELDS.

Example:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://url/url/url" );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt($ch, CURLOPT_POST, 1 );
curl_setopt($ch, CURLOPT_POSTFIELDS, "body goes here" );
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/plain'));

$result=curl_exec ($ch);

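Of course, set your own Content-Type header, and use file_get_contents('/path/to/file') for the body. Note that this reads the whole file into memory, so it only suits files that fit comfortably within memory_limit.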
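
To get a feeling for what file_get_contents() costs with a large body, a small measurement sketch may help. memory_cost_of_loading() is a hypothetical helper, and the exact figure depends on the PHP build and its allocator.

```php
<?php
// Hypothetical helper: reports how many extra bytes PHP holds after loading $path as a string.
function memory_cost_of_loading(string $path): int
{
    $before = memory_get_usage();
    $body = file_get_contents($path); // the whole file becomes one PHP string
    $after = memory_get_usage();
    unset($body);                     // release the string again
    return $after - $before;
}
```

The reported cost is at least the size of the file, which is why the streaming options (CURLOPT_INFILE for uploads, CURLOPT_FILE for downloads) are preferable once files get large.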

How to stop PHP cURL upload inserting Boundary into the Content-Type field?

Maybe the API doesn't expect a multipart POST, but the actual contents in the body itself:

Ref: How to POST a large amount of data within PHP curl without memory overhead?

You need to use curl's PUT (upload) mode for the actual contents of the file to go inside the body; if you use a regular POST, curl will try to send it as a form.

$authorization = "Authorization: Bearer [token]"; 
$file = 'C:\example\example.mp4';
$infile = fopen($file, 'rb'); // binary mode, so the video bytes are not translated on Windows

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://api.mendeley.com/file_contents");
curl_setopt($ch, CURLOPT_PUT, 1 ); // needed for file upload
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($file));
curl_setopt($ch, CURLOPT_INFILE, $infile);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST' );
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: video/mp4', 'Accept: application/vnd.mendeley-content-ticket.1+json', $authorization));

curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$result=curl_exec ($ch);

PHP cURL 'Fatal error: Allowed memory size' for large data sets

Settled for:

ini_set("memory_limit","30M");

