PDF.js: Rendering a PDF File Using a Base64 File Source Instead of a URL

PDF.js: rendering a PDF file using a base64 file source instead of a URL

From the source code at
http://mozilla.github.com/pdf.js/build/pdf.js

/**
* This is the main entry point for loading a PDF and interacting with it.
* NOTE: If a URL is used to fetch the PDF data a standard XMLHttpRequest(XHR)
* is used, which means it must follow the same origin rules that any XHR does
* e.g. No cross domain requests without CORS.
*
* @param {string|TypedAray|object} source Can be an url to where a PDF is
* located, a typed array (Uint8Array) already populated with data or
* and parameter object with the following possible fields:
* - url - The URL of the PDF.
* - data - A typed array with PDF data.
* - httpHeaders - Basic authentication headers.
* - password - For decrypting password-protected PDFs.
*
* @return {Promise} A promise that is resolved with {PDFDocumentProxy} object.
*/

So a standard XMLHttpRequest (XHR) is used to retrieve the document.
The problem is that XMLHttpRequest does not support data: URIs (e.g. data:application/pdf;base64,JVBERi0xLjUK...).

However, you can pass a JavaScript typed array to the function instead.
All you need to do is convert the base64 string to a Uint8Array. You can use this function, found at https://gist.github.com/1032746:

var BASE64_MARKER = ';base64,';

function convertDataURIToBinary(dataURI) {
  // Skip the "data:<mime type>;base64," prefix and keep only the payload
  var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
  var base64 = dataURI.substring(base64Index);
  // atob returns a binary string with one character per byte
  var raw = window.atob(base64);
  var rawLength = raw.length;
  var array = new Uint8Array(new ArrayBuffer(rawLength));

  for (var i = 0; i < rawLength; i++) {
    array[i] = raw.charCodeAt(i);
  }
  return array;
}
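
As a quick sanity check of this kind of conversion, the sketch below decodes the short sample prefix used in this answer (JVBERi0xLjUK) and confirms that the resulting bytes spell the PDF header. The helper names decodeBase64ToBytes and looksLikePdf, and the Buffer fallback for environments without atob, are my own additions and not part of the gist.

```javascript
// Hypothetical helper: decode a base64 string into a Uint8Array.
// Uses atob where available (browsers, recent Node.js) and falls back to Buffer.
function decodeBase64ToBytes(base64) {
  var raw = typeof atob === 'function'
    ? atob(base64)
    : Buffer.from(base64, 'base64').toString('binary');
  var bytes = new Uint8Array(raw.length);
  for (var i = 0; i < raw.length; i++) {
    bytes[i] = raw.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: a PDF file starts with the ASCII characters "%PDF-".
function looksLikePdf(bytes) {
  var header = String.fromCharCode.apply(null, bytes.subarray(0, 5));
  return header === '%PDF-';
}

var sampleBytes = decodeBase64ToBytes('JVBERi0xLjUK');
console.log(looksLikePdf(sampleBytes)); // true: the sample decodes to "%PDF-1.5\n"
```

If this check fails for your own data, the string is probably not plain base64 (for example, it may still carry the data: prefix or be URL-encoded).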

tl;dr

var pdfAsDataUri = "data:application/pdf;base64,JVBERi0xLjUK..."; // shortened
var pdfAsArray = convertDataURIToBinary(pdfAsDataUri);
PDFJS.getDocument(pdfAsArray)

pdf.js rendering as PDF with base64

There are no end-to-end answers on this topic in the community, so here is my attempt to put one together (maybe it will help others).

PDF.js is one way of showing a PDF in the browser, especially when you don't want to rely on a PDF plugin being installed. In my case, my application generates a report as a PDF that can be previewed before downloading, but this did not work on handheld devices because they lacked a PDF viewer plugin.

In my case the PDF was sent to the browser as a base64 string, which I can use to display the PDF with <object data="base64-data"...></object>. This works like a charm on Chrome / FF, but switch to mobile view and it stops working.

<object type="application/pdf" id="pdfbin" width="100%" height="100%" title="Report.pdf">
  <p class="text-center">Looks like there is no PDF viewer plugin installed, try one of the approaches below...</p>
</object>

In the code above, the browser tries to show the PDF and otherwise falls back to the <p> element and shows the error message. I was planning to add a PDF viewer at this point. PDF.js was my choice, but I was not able to get it to display the document. One example of PDF.js with base64 data shows how to do this, but it renders the PDF as an image rather than as a PDF, and I could not find a solution for that, hence the question. Here is what I did:

  1. First, add the JavaScript code to convert base64 to an array

  2. Convert it to a blob and use the viewer.html file packaged with PDF.js to display it as a PDF

In case you are wondering why I use base64 data, the answer is simple: I can create the PDF, read it, send the data to the client and delete the file, so I don't have to run any cleanup service or cron job to delete generated PDF files.

A Few Things to Note

  1. The code below uses Flask + Jinja2; change the way the base64 string is read in the HTML if you are using something else.
  2. viewer.html needs to be changed so that the required JS and CSS files are loaded from the proper location (by default their paths are relative; you need them to be referenced from the static folder).
  3. viewer.js looks for pdf.worker.js in a predefined location; change that if it throws a "file not found" error for that file.
  4. viewer.js might throw a "file origin does not match viewer" error. As a quick fix, comment out the code that throws this error and see if that solves the issue (search for the error message in viewer.js).
  5. I am not the author of the code below; I have just put it together from different places.

Now to the code (the PDF will be displayed when the user clicks the button with id="open_id").


jQuery

var pdfDataX = '{{ base64Pdf }}'; // base64 string injected by the Jinja2 template
var BASE64_MARKER = ';base64,';
PDFJS.workerSrc = "{{ url_for('static', filename='js/pdf.worker.js') }}";

$('#open_id').click(function() {
  PDFJS.disableWorker = true;
  var pdfAsDataUri = "data:application/pdf;base64," + pdfDataX;

  // Try to show the PDF in viewer.html
  var blob = base64toBlob(pdfDataX, 'application/pdf');
  var url = URL.createObjectURL(blob);
  var viewerUrl = "{{ url_for('static', filename='viewer.html') }}" + '?file=' + encodeURIComponent(url);
  $('#pdfViewer').attr('src', viewerUrl);

  // Also hand the data URI to the <object> element for browsers with a PDF plugin
  var mdObj = $('#pdfbin');
  mdObj.hide();
  mdObj.attr('data', pdfAsDataUri);
  mdObj.show();

  $('#myModal').modal();
});

// Convert a base64 string to a Blob, decoding it in slices of sliceSize characters
var base64toBlob = function(b64Data, contentType, sliceSize) {
  contentType = contentType || '';
  sliceSize = sliceSize || 512;

  var byteCharacters = atob(b64Data);
  var byteArrays = [];

  for (var offset = 0; offset < byteCharacters.length; offset += sliceSize) {
    var slice = byteCharacters.slice(offset, offset + sliceSize);

    var byteNumbers = new Array(slice.length);
    for (var i = 0; i < slice.length; i++) {
      byteNumbers[i] = slice.charCodeAt(i);
    }

    var byteArray = new Uint8Array(byteNumbers);

    byteArrays.push(byteArray);
  }
  var blob = new Blob(byteArrays, {type: contentType});
  return blob;
};

$('.save').click(function(e) {
  e.preventDefault();
  var blob = base64toBlob(pdfDataX, 'application/pdf');
  saveAs(blob, 'abcd.pdf'); // requires https://github.com/eligrey/FileSaver.js/
  return false;
});
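
The viewer URL built in the click handler can be factored into a small helper, which makes the encoding step easier to see. This is only a sketch: buildViewerUrl is a hypothetical name, and the viewer path and blob URL in the example are made up for illustration.

```javascript
// Hypothetical helper: build the URL that points viewer.html at a given file URL.
// The file URL is percent-encoded because it travels inside a query string.
function buildViewerUrl(viewerPath, fileUrl) {
  return viewerPath + '?file=' + encodeURIComponent(fileUrl);
}

// Example with a made-up blob URL of the kind URL.createObjectURL returns.
var exampleViewerUrl = buildViewerUrl('/static/viewer.html', 'blob:http://localhost/1234');
console.log(exampleViewerUrl);
// /static/viewer.html?file=blob%3Ahttp%3A%2F%2Flocalhost%2F1234
```

Once the viewer has been closed and the blob is no longer needed, calling URL.revokeObjectURL(url) on the blob URL lets the browser release the memory.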

HTML

<object type="application/pdf" id="pdfbin" width="100%" height="100%" title="Resume.pdf">
  <p class="text-center">Looks like there is no PDF viewer plugin installed, try one of the approaches below...</p>
  <iframe id="pdfViewer" style="width: 100%; height: 100%;" allowfullscreen="" webkitallowfullscreen=""></iframe>
</object>

pdf.js - base64 string instead of URL? (FF works, Safari and Chrome don't)

The PDFView.open call accepts a typed array. Copy the "Base64 / binary data / UTF-8 strings utilities" from https://developer.mozilla.org/en-US/docs/Web/JavaScript/Base64_encoding_and_decoding and decode the base64 content using the base64DecToArr function. You are already modifying the viewer, so adding a couple of functions should not be a problem.

Passing raw data to PDF.js instead of a URL?

The PDFJS.getDocument method is defined pretty clearly, and you can read it for yourself: http://mozilla.github.io/pdf.js/api/draft/PDFJS.html#getDocument

You can open a PDF using a URL:

PDFJS.getDocument('/url/to/file.pdf').then(function(pdf) {

  var pageNumber = 1;

  pdf.getPage(pageNumber).then(function (page) {
    var scale = 1;
    var viewport = page.getViewport(scale);
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    // Size the canvas to the page before rendering into it
    canvas.height = viewport.height;
    canvas.width = viewport.width;
    page.render({canvasContext: context, viewport: viewport});
  });
});

In this case, PDFJS makes a call to the URL for you, but if you already have the PDF file, you can just give the data to PDFJS, like this:

var docInitParams = { data: myPdfContent };
PDFJS.getDocument(docInitParams).then(function(pdf) {
  // render a page here
});
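
The snippets in this answer use the older PDFJS global. As far as I know, more recent PDF.js releases expose the library as pdfjsLib, and getDocument returns a loading task whose promise property resolves with the document. Below is a minimal sketch under that assumption; the library object is passed in as a parameter so that nothing depends on a particular global, and openPdfFromBytes is a hypothetical name of my own.

```javascript
// Sketch: open a PDF from in-memory bytes.
// Assumption: `pdfLib` is a PDF.js build whose getDocument() returns either a
// loading task with a `promise` property or a promise directly.
function openPdfFromBytes(pdfLib, bytes) {
  var task = pdfLib.getDocument({ data: bytes });
  // Support both shapes: a loading task or a plain promise.
  return task && task.promise ? task.promise : task;
}

// Usage (in a page where PDF.js is loaded):
// openPdfFromBytes(pdfjsLib, myPdfContent).then(function (pdf) {
//   // render a page here
// });
```

Check the documentation of the PDF.js version you ship before relying on either shape.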

The question is what format is your raw PDF data in?

If you just retrieved the file from your webserver using an Ajax call, then perhaps the data is in an ArrayBuffer. If so, you'll need to make it accessible to PDFJS by putting it in a Uint8Array, as follows:

// However you get the data, let's say it ends up in this variable.
// getPdfArrayBuffer is a hypothetical placeholder for your own code.
var arrayBufferOfPdfData = getPdfArrayBuffer();

var myData = new Uint8Array(arrayBufferOfPdfData); // put it in a Uint8Array

var docInitParams = {data: myData};
PDFJS.getDocument(docInitParams).then(function(pdf) {
  // render a page here
});

In the end, you need to know what format your data is in so that you can get it into an acceptable format for PDFJS to make use of. I didn't describe what to do if your data is encoded as a Base64 string, but there is another SO question that answers exactly that.
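
To make the "know your format" point concrete, here is a rough sketch of a helper that accepts the representations mentioned in this answer and returns a Uint8Array. The name toPdfBytes is hypothetical, and treating every string as base64 (optionally wrapped in a data URI) is an assumption that may not hold for your data.

```javascript
// Hypothetical helper: normalise common in-memory PDF representations
// into the Uint8Array that getDocument({ data: ... }) accepts.
function toPdfBytes(input) {
  if (input instanceof Uint8Array) {
    return input; // already in the right shape
  }
  if (input instanceof ArrayBuffer) {
    return new Uint8Array(input); // e.g. an Ajax response read as an ArrayBuffer
  }
  if (ArrayBuffer.isView(input)) {
    // Other typed arrays or a DataView: view the same underlying bytes.
    return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
  }
  if (typeof input === 'string') {
    // Assumed to be base64, optionally prefixed like "data:application/pdf;base64,".
    var marker = ';base64,';
    var index = input.indexOf(marker);
    var base64 = index === -1 ? input : input.substring(index + marker.length);
    var raw = typeof atob === 'function'
      ? atob(base64)
      : Buffer.from(base64, 'base64').toString('binary');
    var bytes = new Uint8Array(raw.length);
    for (var i = 0; i < raw.length; i++) {
      bytes[i] = raw.charCodeAt(i);
    }
    return bytes;
  }
  throw new TypeError('Unsupported PDF data format');
}
```

The result can then be passed along as { data: toPdfBytes(whateverYouHave) }.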

My answer is much more descriptive than your one-liner question. So, if I haven't addressed your question well enough, then you should provide more detail.

PDF.JS: Render PDF using an ArrayBuffer or Blob instead of URL

You're not passing the response data to PDF.js, but an instance of the resource:

var myPdf = myService.$getPdf({ Id: 123 });
myPdf.$promise.then(function() {
  var docInitParams = {
    data: myPdf

You haven't shown your code for $getPdf, but I guess that it is something equivalent to

var myService = $resource('/foo', {}, {$getPdf:{responseType: 'arraybuffer'}});
var myPdf = myService.$getPdf();

By default, an AngularJS $resource treats the response as an object (deserialized from JSON) and copies any properties from the object to the resource instance (myPdf in the previous snippet).
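
The effect can be illustrated in plain JavaScript. The sketch below is not AngularJS code; copyOntoInstance is a hypothetical stand-in for "copy the response's properties onto the resource instance", used only to show what happens when the response is an ArrayBuffer and why placing the buffer inside a wrapper object, the approach described next, avoids the problem.

```javascript
// Rough stand-in for copying a response's properties onto a resource instance.
// This is NOT AngularJS code, just a shallow copy of enumerable properties.
function copyOntoInstance(response) {
  var instance = {};
  for (var key in response) {
    instance[key] = response[key];
  }
  return instance;
}

var pdfBuffer = new Uint8Array([37, 80, 68, 70]).buffer; // the bytes of "%PDF"

// An ArrayBuffer has no enumerable properties, so nothing is copied.
var direct = copyOntoInstance(pdfBuffer);
console.log(Object.keys(direct).length); // 0

// Wrapped in a plain object, the buffer survives as a single property.
var wrapped = copyOntoInstance({ data: pdfBuffer });
console.log(wrapped.data === pdfBuffer); // true
```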

Obviously, since your response is an array buffer (or Blob, typed array or whatever), this is not going to work. One of the ways to get the desired response is to use transformResponse to wrap the response object in an object:

var myService = $resource('/foo', {}, {
  $getPdf: {
    responseType: 'arraybuffer',
    transformResponse: function(data, headersGetter) {
      // Stores the ArrayBuffer object in a property called "data"
      return { data : data };
    }
  }
});
var myPdf = myService.$getPdf();
myPdf.$promise.then(function() {
  var docInitParams = {
    data: myPdf.data
  };

  PDFJS.getDocument(docInitParams).then(function (pdf) {
    // ...
  });
});

Or simply the following (avoided unnecessary local variables):

myService.$getPdf().$promise.then(function(myPdf) {
  PDFJS.getDocument({
    data: myPdf.data
  }).then(function (pdf) {
    // ...
  });
});

