How to Read and Echo File Size of Uploaded File Being Written At Server in Real Time Without Blocking At Both Server and Client

How to read and echo file size of uploaded file being written at server in real time without blocking at both server and client?

You need to call clearstatcache() to get the real file size. With a few other bits fixed, your stream.php may look like the following:

<?php

header("Content-Type: text/event-stream");
header("Cache-Control: no-cache");
header("Connection: keep-alive");
// Check if the header has been sent, to avoid `PHP Notice: Undefined index: HTTP_LAST_EVENT_ID`
// php 7+
//$lastId = $_SERVER["HTTP_LAST_EVENT_ID"] ?? 0;
// php < 7
$lastId = isset($_SERVER["HTTP_LAST_EVENT_ID"]) ? intval($_SERVER["HTTP_LAST_EVENT_ID"]) : 0;

$upload = $_GET["filename"];
$filesize = intval($_GET["filesize"]);
$data = 0;
// if the file already exists, its initial size can be bigger than the new one, so we need to ignore it
$wasLess = $lastId != 0;
while ($data < $filesize || !$wasLess) {
    // system calls are expensive and are cached on the assumption that in most cases file stats do not change often,
    // so we clear the cache to get the most up-to-date data
    clearstatcache(true, $upload);
    $data = filesize($upload);
    $wasLess |= $data < $filesize;
    // don't send a stale filesize
    if ($wasLess) {
        sendMessage($lastId, $data);
        $lastId++;
    }
    // a sleep is not strictly necessary here, though without one thousands of `message` events will be dispatched,
    // millions on a poor connection with large files. sleep(1) might be too much, but 50 messages a second should be okay
    usleep(20000);
}

function sendMessage($id, $data)
{
    echo "id: $id\n";
    echo "data: $data\n\n";
    ob_flush();
    // no need to flush() here; it would add the content length of the chunk to the stream
    // flush();
}
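For reference, each iteration writes a plain Server-Sent Events frame to the wire, which the browser dispatches as a `message` event (the byte counts below are made-up examples):

id: 0
data: 524288

id: 1
data: 1048576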

A few caveats:

Security. Or rather, the lack of it. As I understand it, this is a proof of concept, and security is the least of your concerns, yet the disclaimer should be there. This approach is fundamentally flawed, and should only be used if you don't care about DoS attacks or about information on your files leaking out.

CPU. Without usleep the script will consume 100% of a single core. With a long sleep you risk uploading the whole file within a single iteration, so the exit condition will never be met. If you are testing locally, the usleep should be removed completely, since locally it is a matter of milliseconds to upload MBs.

Open connections. Both Apache and nginx/fpm have a finite number of PHP processes that can serve requests. A single file upload occupies two of them for the time required to upload the file. With slow bandwidth or forged requests this time can be quite long, and the web server may start to reject requests.

Client-side part. You need to analyse the response and eventually stop listening to the events once the file is fully uploaded, e.g. by calling source.close() in the message handler when the reported size reaches the expected one.

EDIT:

To make it more or less production friendly, you will need in-memory storage such as Redis or Memcached to store the file metadata.

When making the POST request, add a unique token which identifies the file, along with the file size.

In your javascript:

const fileId = Math.random().toString(36).substr(2); // or anything more unique
...

const [request, source] = [
    new Request(`${url}?fileId=${fileId}&size=${filesize}`, {
        method: "POST", headers: headers, body: file
    }),
    new EventSource(`${stream}?fileId=${fileId}`)
];
....

In data.php, register the token and report progress in chunks:

....

$fileId = $_GET['fileId'];
$fileSize = $_GET['size'];

setUnique($fileId, $fileSize);

$uploaded = 0;
while ($bytes = stream_copy_to_stream($input, $file, 1024)) {
    $uploaded += $bytes;
    updateProgress($fileId, $uploaded);
}
....
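The $input and $file handles are elided above. A minimal sketch of how they might be opened, assuming the file arrives as the raw POST body (as in the JavaScript Request above) and $uploadDir is a hypothetical configured scratch directory:

// the JS sends the raw file as the POST body, so we read it from php://input
$input = fopen('php://input', 'r');
$file = fopen($uploadDir . '/' . basename($fileId), 'w');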


/**
* Check if Id is unique, and store processed as 0, and full_size as $size
* Set reasonable TTL for the key, e.g. 1hr
*
* @param string $id
* @param int $size
* @throws Exception if id is not unique
*/
function setUnique($id, $size) {
    // implement with your storage of choice
}

/**
* Updates uploaded size for the given file
*
* @param string $id
* @param int $processed
*/
function updateProgress($id, $processed) {
    // implement with your storage of choice
}

This way your stream.php doesn't need to hit the disk at all, and can sleep for as long as is acceptable for the UX:

....
list($progress, $size) = getProgress('non_existing_key_to_init_default_values');
$lastId = 0;

while ($progress < $size) {
    list($progress, $size) = getProgress($_GET["fileId"]);
    sendMessage($lastId, $progress);
    $lastId++;
    sleep(1);
}
.....


/**
* Get progress of the file upload.
* If id is not there yet, returns [0, PHP_INT_MAX]
*
* @param $id
* @return array $bytesUploaded, $fileSize
*/
function getProgress($id) {
    // implement with your storage of choice
}
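For reference, here is one possible implementation of these three helpers, assuming the phpredis extension and a local Redis instance. The key layout (a hash upload:{id} with processed and size fields) and the one hour TTL are my own choices:

<?php

function redisClient() {
    static $redis = null;
    if ($redis === null) {
        $redis = new Redis();
        $redis->connect('127.0.0.1', 6379);
    }
    return $redis;
}

function setUnique($id, $size) {
    $redis = redisClient();
    // hSetNx only succeeds if the field does not exist yet,
    // which doubles as the uniqueness check
    if (!$redis->hSetNx("upload:$id", 'size', $size)) {
        throw new Exception("id is not unique: $id");
    }
    $redis->hSet("upload:$id", 'processed', 0);
    $redis->expire("upload:$id", 3600); // reasonable TTL, e.g. 1hr
}

function updateProgress($id, $processed) {
    redisClient()->hSet("upload:$id", 'processed', $processed);
}

function getProgress($id) {
    $data = redisClient()->hMGet("upload:$id", ['processed', 'size']);
    if ($data['processed'] === false || $data['size'] === false) {
        // id not registered yet: the defaults keep the stream loop spinning
        return [0, PHP_INT_MAX];
    }
    return [intval($data['processed']), intval($data['size'])];
}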

The problem with the two open connections cannot be solved unless you give up EventSource for good old polling. The response time of stream.php without the loop is a matter of milliseconds, and it is quite wasteful to keep the connection open all the time, unless you need hundreds of updates a second. A one-shot polling endpoint might look like the sketch below.
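This is only a sketch; the poll.php name and the JSON shape are my own, and getProgress() is the helper from above:

<?php
// poll.php: a one-shot replacement for stream.php; the client requests it
// every second or two instead of holding an EventSource connection open
header("Content-Type: application/json");
header("Cache-Control: no-cache");

list($progress, $size) = getProgress($_GET["fileId"]);

echo json_encode([
    "progress" => $progress,
    "size" => $size,
    "done" => $progress >= $size,
]);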

Simple AJAX/jQuery application to show files in real time during copy

Some general ideas for the client:

  • It needs to establish a session. This is both so that it can see the copy progress of the files it is copying, and so that no other (possibly malicious) third party can see it. This can be done with some sort of token, stored as a cookie, which the server can read to tell which session the request is from.
  • You need the client to keep requesting the state at a steady interval. This is called polling.

So, all your client has to do is establish a session, request the directory tree if needed, make a request specifying which folder needs to be copied where, and then keep requesting the progress every few seconds or minutes until it is done. In the meantime, if the user wishes to cancel, send a cancel request to some endpoint.

On the server side, there are many technologies to do this. Django is the most popular, but this seems like a smaller project, so might I recommend Flask.

As for the actual task, shutil.copytree() is what you are looking for. It takes a custom copy function, which you can use to update a session's "currently copying" file whenever a new file starts being copied:

import shutil

def copy_dir(session_id, source, destination):
    def copy_fn(src, dest):
        if sessions[session_id]['data'].aborted:
            return  # Stop copying once requested.
        # Or however your session data is structured
        sessions[session_id]['data'].current_copy = [src, dest]
        shutil.copy2(src, dest)  # Actual copy work
    shutil.copytree(source, destination, copy_function=copy_fn)

To get the percentage of how much of the current file has been copied, compare the size of the destination file to the size of the source file.

Another way to get the copying percentage: os.walk over the directory to collect all the file names, then open each file and copy it in chunks, updating the progress every few chunks. Note that this is very error prone.

Get bytes transferred using PHP5 for POST request

You can check out this PHP File Upload Progress Bar that may help you get started, in case you insist on using PHP to display progress. It uses the PECL extension APC to get uploaded file progress details. The number of bytes received by the server can be calculated from the response of apc_fetch(), as per the first link; roughly like the sketch below.
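This assumes apc.rfc1867 is enabled and the upload form contains a hidden APC_UPLOAD_PROGRESS field placed before the file input; the progress.php name is my own:

<?php
// progress.php: report APC upload progress for a given tracking key
$key = 'upload_' . $_GET['key']; // 'upload_' is the default apc.rfc1867_prefix
$status = apc_fetch($key);
if ($status !== false) {
    // $status contains 'current' and 'total' byte counts, among other fields
    echo json_encode([
        'current' => $status['current'],
        'total' => $status['total'],
        'done' => $status['done'],
    ]);
} else {
    echo json_encode(null);
}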

Another interesting option is the Track Upload Progress tutorial, which uses PHP's native Session Upload Progress feature; a rough sketch of that feature follows.
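This assumes session.upload_progress.enabled is on and the upload form submits a hidden field named PHP_SESSION_UPLOAD_PROGRESS before the file field; the endpoint name and JSON shape are my own:

<?php
// session-progress.php: report progress via PHP's Session Upload Progress
session_start();
$key = ini_get('session.upload_progress.prefix') . $_GET['key'];
if (isset($_SESSION[$key])) {
    $progress = $_SESSION[$key];
    echo json_encode([
        'current' => $progress['bytes_processed'],
        'total' => $progress['content_length'],
        'done' => $progress['done'],
    ]);
} else {
    echo json_encode(null);
}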

Lastly, if you are a little open to using JavaScript (or a JS library), that would be ideal. An easy-to-use, easy-to-set-up, well-known and maintained library that I know of is FineUploader.


