How to Detect the MIME Type of a Data URL

Correct MIME type for data URL

Since there might not be a correct answer to this question yet, and this SO question might itself define a reasonable answer for others asking this question, I'll document here my own proposal that I'm using:

application/dataurl

This is inspired by the accepted MIME-type for JSON, which is application/json. This seems appropriate to me as both are data formats which can wrap arbitrary content.

Is it possible to use common MIME-type for Data URI?

This data URI is not valid.

Neither the docs nor the RFC specify anything about a possible /* media subtype or any default subtype, so you'll have to write a full media type.

That is understandable: if a program knows that a file contains an image but doesn't know whether it's encoded as .jpg, .png or something else, it won't be able to open it. Similarly, browsers won't be able to understand a base-64-encoded image without a subtype.

Depending on your use case, you might want to use this library to infer the content type from the raw data directly.
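
As a rough illustration of inferring the type from the raw data, the first bytes of a base64 data URL can be decoded and compared against known file signatures. This is a minimal sketch and not the library mentioned above: `dataUrlHeaderHex` is a hypothetical helper name, it only handles base64 payloads without embedded whitespace, and it falls back to Node's `Buffer` when `atob` is unavailable.

```javascript
// Hypothetical helper: returns the first `count` decoded bytes of a
// base64 data URL as a lowercase, zero-padded hex string.
function dataUrlHeaderHex(dataUrl, count = 4) {
  const comma = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || comma === -1) {
    throw new Error("Not a data URL");
  }
  // Everything between "data:" and the first comma is metadata.
  const meta = dataUrl.slice(5, comma);
  if (!/;base64$/i.test(meta)) {
    throw new Error("Only base64 data URLs are handled in this sketch");
  }
  // 4 base64 characters encode 3 bytes, so decode only as many
  // characters as are needed to cover `count` bytes.
  const chars = Math.ceil(count / 3) * 4;
  const chunk = dataUrl.slice(comma + 1, comma + 1 + chars);
  const binary = typeof atob === "function"
    ? atob(chunk)
    : Buffer.from(chunk, "base64").toString("binary");
  let hex = "";
  for (let i = 0; i < Math.min(count, binary.length); i++) {
    hex += binary.charCodeAt(i).toString(16).padStart(2, "0");
  }
  return hex;
}

// A PNG payload starts with the bytes 89 50 4E 47, whatever the URL declares.
console.log(dataUrlHeaderHex("data:application/octet-stream;base64,iVBORw0KGgo="));
```

The resulting hex string can then be matched against a signature table, independently of the media type written in the URL.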

How to check file MIME type with JavaScript before upload?

You can easily determine the file MIME type with JavaScript's FileReader before uploading it to a server. I agree that we should prefer server-side checking over client-side, but client-side checking is still possible. I'll show you how and provide a working demo at the bottom.


Check that your browser supports both FileReader and Blob. All major ones should.

if (window.FileReader && window.Blob) {
// All the File APIs are supported.
} else {
// File and Blob are not supported
}

Step 1:

You can retrieve the File information from an <input> element like this (ref):

<input type="file" id="your-files" multiple>
<script>
var control = document.getElementById("your-files");
control.addEventListener("change", function(event) {
// When the control has changed, there are new files
var files = control.files;
for (var i = 0; i < files.length; i++) {
console.log("Filename: " + files[i].name);
console.log("Type: " + files[i].type);
console.log("Size: " + files[i].size + " bytes");
}
}, false);
</script>

Here is a drag-and-drop version of the above (ref):

<div id="your-files"></div>
<script>
var target = document.getElementById("your-files");
target.addEventListener("dragover", function(event) {
event.preventDefault();
}, false);

target.addEventListener("drop", function(event) {
// Cancel default actions
event.preventDefault();
var files = event.dataTransfer.files;
for (var i = 0; i < files.length; i++) {
console.log("Filename: " + files[i].name);
console.log("Type: " + files[i].type);
console.log("Size: " + files[i].size + " bytes");
}
}, false);
</script>

Step 2:

We can now inspect the files and tease out headers and MIME types.

✘ Quick method

You can naïvely ask Blob for the MIME type of whatever file it represents using this pattern:

var blob = files[i]; // See step 1 above
console.log(blob.type);

For images, MIME types come back like the following:

image/jpeg

image/png

...

Caveat: The MIME type is detected from the file extension and can be fooled or spoofed. One can rename a .jpg to a .png and the MIME type will be reported as image/png.


✓ Proper header-inspecting method

To get the bona fide MIME type of a client-side file, we can go a step further and inspect the first few bytes of the given file to compare against so-called magic numbers. Be warned that it's not entirely straightforward because, for instance, JPEG has a few "magic numbers". This is because the format has evolved since 1991. You might get away with checking only the first two bytes, but I prefer checking at least 4 bytes to reduce false positives.

Example file signatures of JPEG (first 4 bytes):

FF D8 FF E0 (SOI + APP0)

FF D8 FF E1 (SOI + APP1)

FF D8 FF E2 (SOI + APP2)
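
All of the signatures listed above share their first three bytes. One alternative to enumerating every possible fourth byte (a different trade-off from the 4-byte check used in this answer) is to compare only that common prefix. This accepts JPEG files whose second marker is something else, at the cost of a slightly higher chance of false positives. A minimal sketch, where `looksLikeJpeg` is a hypothetical helper name:

```javascript
// Hypothetical helper: true when the bytes begin with the JPEG
// Start Of Image marker (FF D8) followed by the FF byte that opens
// the next marker segment.
function looksLikeJpeg(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xff &&
    bytes[1] === 0xd8 &&
    bytes[2] === 0xff;
}

// FF D8 FF E1 is one of the signatures listed above.
console.log(looksLikeJpeg(new Uint8Array([0xff, 0xd8, 0xff, 0xe1])));
// 89 50 4E 47 is the start of a PNG file.
console.log(looksLikeJpeg(new Uint8Array([0x89, 0x50, 0x4e, 0x47])));
```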

Here is the essential code to retrieve the file header:

var blob = files[i]; // See step 1 above
var fileReader = new FileReader();
fileReader.onloadend = function(e) {
  var arr = (new Uint8Array(e.target.result)).subarray(0, 4);
  var header = "";
  for (var j = 0; j < arr.length; j++) {
    // Zero-pad each byte so that e.g. 0x0a becomes "0a", not "a"
    header += ("0" + arr[j].toString(16)).slice(-2);
  }
  console.log(header);

  // Check the file signature against known types

};
fileReader.readAsArrayBuffer(blob);

You can then determine the real MIME type like so (more file signatures here and here):

switch (header) {
case "89504e47":
type = "image/png";
break;
case "47494638":
type = "image/gif";
break;
case "ffd8ffe0":
case "ffd8ffe1":
case "ffd8ffe2":
case "ffd8ffe3":
case "ffd8ffe8":
type = "image/jpeg";
break;
default:
type = "unknown"; // Or you can use the blob.type as fallback
break;
}

Accept or reject file uploads as you like based on the MIME types expected.
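
The two steps above (reading the header, then mapping it to a type) can also be folded into one small helper. The sketch below is an assumption-laden variant rather than the code from this answer: the names `SIGNATURES`, `headerHex`, `mimeFromBytes` and `detectBlobType` are hypothetical, it relies on `Blob.prototype.arrayBuffer()` being available (recent browsers), and it reads only the first 4 bytes via `blob.slice()` instead of loading the whole file.

```javascript
// Signature table copied from the switch statement above.
const SIGNATURES = {
  "89504e47": "image/png",
  "47494638": "image/gif",
  "ffd8ffe0": "image/jpeg",
  "ffd8ffe1": "image/jpeg",
  "ffd8ffe2": "image/jpeg",
  "ffd8ffe3": "image/jpeg",
  "ffd8ffe8": "image/jpeg",
};

// Hex string of the first 4 bytes of a Uint8Array, each byte zero-padded.
function headerHex(bytes) {
  return Array.from(bytes.subarray(0, 4), (b) =>
    b.toString(16).padStart(2, "0")
  ).join("");
}

// Look up the MIME type for the given bytes, or return the fallback.
function mimeFromBytes(bytes, fallback = "unknown") {
  return SIGNATURES[headerHex(bytes)] || fallback;
}

// Read only the first 4 bytes of the blob, then fall back to the
// (spoofable) blob.type when the signature is not in the table.
async function detectBlobType(blob) {
  const buffer = await blob.slice(0, 4).arrayBuffer();
  return mimeFromBytes(new Uint8Array(buffer), blob.type || "unknown");
}
```

In an upload handler, something like `detectBlobType(files[i]).then(function (type) { ... })` could then decide whether to accept the file.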


Demo

Here is a working demo for local files and remote files (I had to bypass CORS just for this demo). Open the snippet, run it, and you should see three remote images of different types displayed. At the top you can select a local image or data file, and the file signature and/or MIME type will be displayed.

Notice that even if an image is renamed, its true MIME type can be determined. See below.



// Return the first few bytes of the file as a hex string
function getBLOBFileHeader(url, blob, callback) {
  var fileReader = new FileReader();
  fileReader.onloadend = function(e) {
    var arr = (new Uint8Array(e.target.result)).subarray(0, 4);
    var header = "";
    for (var i = 0; i < arr.length; i++) {
      // Zero-pad each byte so that e.g. 0x0a becomes "0a", not "a"
      header += ("0" + arr[i].toString(16)).slice(-2);
    }
    callback(url, header);
  };
  fileReader.readAsArrayBuffer(blob);
}

function getRemoteFileHeader(url, callback) {
  var xhr = new XMLHttpRequest();
  // Bypass CORS for this demo - naughty, Drakes
  xhr.open('GET', '//cors-anywhere.herokuapp.com/' + url);
  xhr.responseType = "blob";
  xhr.onload = function() {
    callback(url, xhr.response);
  };
  xhr.onerror = function() {
    alert('A network error occurred!');
  };
  xhr.send();
}

function headerCallback(url, headerString) {
  printHeaderInfo(url, headerString);
}

function remoteCallback(url, blob) {
  printImage(blob);
  getBLOBFileHeader(url, blob, headerCallback);
}

function printImage(blob) {
  // Add this image to the document body for proof of GET success
  var fr = new FileReader();
  fr.onloadend = function() {
    $("hr").after($("<img>").attr("src", fr.result))
      .after($("<div>").text("Blob MIME type: " + blob.type));
  };
  fr.readAsDataURL(blob);
}

// Add more from http://en.wikipedia.org/wiki/List_of_file_signatures
function mimeType(headerString) {
  var type;
  switch (headerString) {
    case "89504e47":
      type = "image/png";
      break;
    case "47494638":
      type = "image/gif";
      break;
    case "ffd8ffe0":
    case "ffd8ffe1":
    case "ffd8ffe2":
      type = "image/jpeg";
      break;
    default:
      type = "unknown";
      break;
  }
  return type;
}

function printHeaderInfo(url, headerString) {
  $("hr").after($("<div>").text("Real MIME type: " + mimeType(headerString)))
    .after($("<div>").text("File header: 0x" + headerString))
    .after($("<div>").text(url));
}

/* Demo driver code */
var imageURLsArray = ["http://media2.giphy.com/media/8KrhxtEsrdhD2/giphy.gif", "http://upload.wikimedia.org/wikipedia/commons/e/e9/Felis_silvestris_silvestris_small_gradual_decrease_of_quality.png", "http://static.giantbomb.com/uploads/scale_small/0/316/520157-apple_logo_dec07.jpg"];

// Check for FileReader support
if (window.FileReader && window.Blob) {
  // Load all the remote images from the urls array
  for (var i = 0; i < imageURLsArray.length; i++) {
    getRemoteFileHeader(imageURLsArray[i], remoteCallback);
  }

  /* Handle local files */
  $("input").on('change', function(event) {
    var file = event.target.files[0];
    if (file.size >= 2 * 1024 * 1024) {
      alert("File size must be at most 2MB");
      return;
    }
    remoteCallback(escape(file.name), file);
  });
} else {
  // File and Blob are not supported
  $("hr").after($("<div>").text("It seems your browser doesn't support FileReader"));
} /* Drakes, 2015 */

img {
  max-height: 200px
}
div {
  height: 26px;
  font-family: Arial;
  font-size: 12pt
}
form {
  height: 40px;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<form>
  <input type="file" />
  <div>Choose an image to see its file signature.</div>
</form>
<hr/>

How to check a resource's MIME type in Javascript?

As has been said in the comments, the only way to check that a resource points to some playable media is to actually try to play it.

A MIME type won't tell you much, especially for audio/video media:

you can't be 100% sure what the container (which is what the MIME type declares) really contains. It could very well be encoded with a codec that the browser can't decode, even if it is wrapped in a container the browser should know.

This means that even though a simple HEAD request could tell you, for some URLs, what the server would send in the Content-Type header, that wouldn't tell you whether the browser would be able to play it. And given that most servers won't let you read these headers anyway, it isn't worth the cost of trying to implement this.

For your case, the best option is thus to try to play it.

So you can rewrite your code to

const urlParams = new URLSearchParams(window.location.search);
const audioURL = urlParams.get('audiourl'); // get audiourl from query (example.com/?audiourl=https://example.com/audio.wav)

const audio = new Audio(audioURL);
audio.onerror = (evt) => {
//tell user invalid audio url, or unsupported audio type
};
audio.play();

Note that we could also have attached a catch() handler to the Promise returned by the HTMLMediaElement.play() method, but it could actually reject for another reason (e.g. if the user never interacted with the page before in Chrome, or if this code is not in direct response to a user gesture in Safari). If you are sure that only invalid URLs could be the reason for a failure to play, then you could do

const audio = new Audio(audioURL);
audio.play().catch( () => {
//tell user invalid audio url, or unsupported audio type
});

Is there a way to find out the mime type from a Google Drive link in JS or PHP?

In PHP you can use cURL's curl_getinfo function with the CURLINFO_CONTENT_TYPE flag:

<?php
// the request
$ch = curl_init($documentURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);

// get the content type
echo curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
?>

Your output will be along the lines of text/html; charset=ISO-8859-1, which you could explode to only get the MIME-type:

<?php 
$mime = explode(";", curl_getinfo($ch, CURLINFO_CONTENT_TYPE));

echo $mime[0];
?>
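
Since the question also asks about JS, the same idea can be sketched there: request the headers and strip the parameters from Content-Type. This is only an illustration under assumptions. The names `mimeFromContentType` and `fetchMimeType` are hypothetical, `fetch` must be available, and in a browser the request is subject to CORS, so it only works for servers that allow cross-origin reads.

```javascript
// Hypothetical helper: keep only the "type/subtype" part of a
// Content-Type header value such as "text/html; charset=ISO-8859-1".
function mimeFromContentType(headerValue) {
  if (!headerValue) return null;
  return headerValue.split(";")[0].trim().toLowerCase();
}

// Hypothetical helper: issue a HEAD request and return the MIME type
// reported by the server, or null when no Content-Type is sent.
async function fetchMimeType(url) {
  const response = await fetch(url, { method: "HEAD" });
  return mimeFromContentType(response.headers.get("Content-Type"));
}

console.log(mimeFromContentType("text/html; charset=ISO-8859-1"));
```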

Pull a mime type out of a URI using PHP

Not an elegant solution, but you could do:

// assume you've set $image_uri to be the URI from the database
$image_parts = explode(";", $image_uri); // split on the ; after the mime type
$mime_type = substr($image_parts[0], 5); // get the information after the data: text

It could be done with regular expressions, but I'm not good enough at them to come up with it.
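
For reference, a regular expression for this could look like the sketch below. It is written in JavaScript rather than PHP, and `declaredMimeType` is a hypothetical helper name; the pattern itself is plain enough that it should port to preg_match, though that is untested here. Unlike splitting on ";", it also handles data URIs that have no parameters before the comma, and it falls back to text/plain, which RFC 2397 defines as the default when the media type is omitted.

```javascript
// Hypothetical helper: extract the media type declared in a data URI.
// Returns null when the input is not a data URI.
function declaredMimeType(dataUri) {
  // Capture everything after "data:" up to the first ";" or ",".
  const match = /^data:([^;,]*)/i.exec(dataUri);
  if (!match) return null;
  // An omitted media type defaults to text/plain (RFC 2397).
  return match[1] || "text/plain";
}

console.log(declaredMimeType("data:image/png;base64,iVBORw0KGgo="));
console.log(declaredMimeType("data:,hello"));
```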


