PhantomJS: pipe input
You can do what you're looking for very simply (it's just not really documented) directly in PhantomJS.
var page = require('webpage').create(),
    fs = require('fs');

page.viewportSize = { width: 600, height: 600 };
page.paperSize = { format: 'Letter', orientation: 'portrait', margin: '1cm' };

// Read the HTML document from stdin.
page.content = fs.read('/dev/stdin');

// Give the page a moment to settle, then render the PDF to stdout.
window.setTimeout(function() {
    page.render('/dev/stdout', { format: 'pdf' });
    phantom.exit();
}, 1);
(May need to increase the timeout if you have images that need loading, etc.)
HTML comes in stdin, PDF binary goes out stdout. You can test it like:
echo "<b>test</b>" | phantomjs makepdf.js > test.pdf && open test.pdf
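If you would rather drive the same pipe from another program than from the shell, something like the following Node.js sketch may help. The helper name pipeThrough is made up for illustration, and the usage lines assume phantomjs is on your PATH and that makepdf.js is the script above.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: feed `input` to a command's stdin and return its
// stdout as a Buffer (raw bytes, no text decoding applied).
function pipeThrough(command, args, input) {
  const result = spawnSync(command, args, {
    input: input,                 // written to the child's stdin
    maxBuffer: 64 * 1024 * 1024   // arbitrary limit, large enough for most PDFs
  });
  if (result.error) {
    throw result.error;           // e.g. the command could not be started
  }
  if (result.status !== 0) {
    throw new Error(command + ' exited with status ' + result.status);
  }
  return result.stdout;           // a Buffer, because no `encoding` option was set
}

// Usage (assumes phantomjs is installed and makepdf.js is the script above):
// const pdf = pipeThrough('phantomjs', ['makepdf.js'], '<b>test</b>');
// require('fs').writeFileSync('test.pdf', pdf);
```

Keeping the result as a Buffer matters here: decoding the PDF bytes as text anywhere along the way is likely to corrupt the file.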
PhantomJS: exported PDF to stdout
As pointed out by Niko, you can use renderBase64() to render the web page to an image buffer and return the result as a base64-encoded string. But for now this will only work for PNG, JPEG and GIF.
To write something from a phantomjs script to stdout just use the filesystem API.
I use something like this for images:
var base64image = page.renderBase64('PNG');
var fs = require("fs");
fs.write("/dev/stdout", base64image, "w");
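Whatever consumes that output has to turn the base64 text back into bytes. Here is a rough Node.js sketch of the receiving side; the function names and the image.b64 file name are only illustrative and not part of PhantomJS.

```javascript
// Decode base64 text (as produced by renderBase64) back into raw image bytes.
function decodeBase64Image(base64Text) {
  // Trim surrounding whitespace such as a trailing newline from a pipeline.
  return Buffer.from(base64Text.trim(), 'base64');
}

// A PNG file always starts with these eight signature bytes.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Cheap sanity check that the decoded bytes look like a PNG.
function looksLikePng(bytes) {
  return bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE);
}

// Usage (hypothetical file name):
// const fs = require('fs');
// const bytes = decodeBase64Image(fs.readFileSync('image.b64', 'utf8'));
// if (looksLikePng(bytes)) fs.writeFileSync('image.png', bytes);
```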
I don't know if the PDF format for renderBase64() will be in a future version of phantomjs, but as a workaround something along these lines may work for you:
page.render(output);
var fs = require("fs");
var pdf = fs.read(output);
fs.write("/dev/stdout", pdf, "w");
fs.remove(output);
Where output is the path to the PDF file.
phantomjs pdf to stdout
When writing output to /dev/stdout or /dev/stderr on Windows, PhantomJS goes through the following steps (as seen in the render method in \phantomjs\src\webpage.cpp):
- In the absence of /dev/stdout and /dev/stderr, a temporary file path is allocated.
- Call renderPdf with the temporary file path.
- Render the web page to this file path.
- Read the contents of this file into a QByteArray.
- Call QString::fromAscii on the byte array and write to stdout or stderr.
- Delete the temporary file.
To begin with, I built the source for PhantomJS, but commented out the file deletion. On the next run, I was able to examine the temporary file it had rendered, which turned out to be completely fine. I also tried running phantomjs.exe rasterize.js http://google.com > test.png with the same results. This immediately ruled out a rendering issue, or anything specifically to do with PDFs, meaning that the problem had to be related to the way data is written to stdout.
By this stage I suspected that some text encoding shenanigans were going on. From previous runs, I had both a valid and an invalid version of the same file (a PNG in this case).
Using some C# code, I ran the following experiment:
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;

// Read the contents of the known good file.
byte[] bytesFromGoodFile = File.ReadAllBytes("valid_file.png");

// Read the contents of the known bad file.
byte[] bytesFromBadFile = File.ReadAllBytes("invalid_file.png");

// Take the bytes from the valid file and convert to a string
// using the Latin-1 encoding.
string iso88591String = Encoding.GetEncoding("iso-8859-1").GetString(bytesFromGoodFile);

// Take the Latin-1 encoded string and retrieve its bytes using the UTF-8 encoding.
byte[] bytesFromIso88591String = Encoding.UTF8.GetBytes(iso88591String);

// If the bytes from the Latin-1 string are all the same as the ones from the
// known bad file, we have an encoding problem.
Debug.Assert(bytesFromBadFile
    .Select((b, i) => b == bytesFromIso88591String[i])
    .All(c => c));
Note that I used ISO-8859-1 encoding because Qt uses it as the default encoding for C strings. As it turned out, all those bytes were the same. The point of that exercise was to see if I could mimic the encoding steps that caused valid data to become invalid.
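The same two steps as the C# experiment above can be reproduced in a few lines of Node.js, which makes it easier to see what actually happens to the bytes. This is only an illustration of the encoding mix-up, not PhantomJS code, and the function name is invented for the example.

```javascript
// Reinterpret raw bytes as Latin-1 text, then encode that text as UTF-8.
// This mirrors the two steps in the C# experiment above.
function latin1ThenUtf8(bytes) {
  const text = bytes.toString('latin1');  // one character per input byte
  return Buffer.from(text, 'utf8');       // characters >= 0x80 become two bytes
}

// The first four bytes of any PNG file: 0x89 'P' 'N' 'G'.
const pngStart = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
const mangled = latin1ThenUtf8(pngStart);

console.log(pngStart.toString('hex'));  // 89504e47
console.log(mangled.toString('hex'));   // c289504e47
```

Every byte with the high bit set picks up an extra leading byte, which is enough to invalidate a PNG or PDF, while plain ASCII content passes through unchanged.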
For further evidence, I investigated \phantomjs\src\system.cpp and \phantomjs\src\filesystem.cpp.
- In system.cpp, the System class holds references to, among other things, File objects for stdout, stdin and stderr, which are set up to use UTF-8 encoding.
- When writing to stdout, the write function of the File object is called. This function supports writing to both text and binary files, but because of the way the System class initializes them, all writing will be treated as though it were going to a text file.
So the problem boils down to this: we need to be performing a binary write to stdout, yet our writes end up being treated as text and having an encoding applied to them that causes the resulting file to be invalid.
Given the problem described above, I can't see any way to get this working the way you want on Windows without making changes to the PhantomJS code. So here they are:
This first change will provide a function we can call on File objects to explicitly perform a binary write.
Add the following function prototype in \phantomjs\src\filesystem.h:
bool binaryWrite(const QString &data);
And place its definition in \phantomjs\src\filesystem.cpp (the code for this method comes from the write method in this file):
bool File::binaryWrite(const QString &data)
{
    if ( !m_file->isWritable() ) {
        qDebug() << "File::binaryWrite - " << "Couldn't write:" << m_file->fileName();
        return true;
    }

    // Copy each character into exactly one byte so that no text codec is applied.
    QByteArray bytes(data.size(), Qt::Uninitialized);
    for(int i = 0; i < data.size(); ++i) {
        bytes[i] = data.at(i).toAscii();
    }
    return m_file->write(bytes);
}
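To see why a one-byte-per-character write avoids the corruption, here is a loose Node.js analogy of that loop. It assumes the string was built with one character per input byte (which is what a Latin-1 decode gives you); Qt's toAscii may behave differently for characters outside that range, and the function name is invented for this sketch.

```javascript
// Write one byte per character, the same idea as the toAscii() loop above.
function stringToRawBytes(text) {
  const out = Buffer.alloc(text.length);
  for (let i = 0; i < text.length; i++) {
    out[i] = text.charCodeAt(i) & 0xff;  // keep only the low byte
  }
  return out;
}

// Round trip: bytes -> Latin-1 string -> bytes, with no codec in between.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0xff, 0x00]);
const roundTripped = stringToRawBytes(original.toString('latin1'));

console.log(roundTripped.equals(original));  // true
```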
At around line 920 of \phantomjs\src\webpage.cpp you'll see a block of code that looks like this:
if( fileName == STDOUT_FILENAME ){
#ifdef Q_OS_WIN32
    _setmode(_fileno(stdout), O_BINARY);
#endif
    ((File *)system->_stdout())->write(QString::fromAscii(ba.constData(), ba.size()));
#ifdef Q_OS_WIN32
    _setmode(_fileno(stdout), O_TEXT);
#endif
}
Change it to this:
if( fileName == STDOUT_FILENAME ){
#ifdef Q_OS_WIN32
    _setmode(_fileno(stdout), O_BINARY);
    ((File *)system->_stdout())->binaryWrite(QString::fromAscii(ba.constData(), ba.size()));
#else
    ((File *)system->_stdout())->write(QString::fromAscii(ba.constData(), ba.size()));
#endif
#ifdef Q_OS_WIN32
    _setmode(_fileno(stdout), O_TEXT);
#endif
}
So what that code replacement does is call our new binaryWrite function, guarded by an #ifdef Q_OS_WIN32 block. I did it this way to preserve the old functionality on non-Windows systems, which don't seem to exhibit this problem (or do they?). Note that this fix only applies to writing to stdout; if you want to, you could always apply it to stderr, but it may not matter quite so much in that case.
In case you just want a pre-built binary (who wouldn't?), you can find phantomjs.exe with these fixes on my SkyDrive. My version is around 19MB whereas the one I downloaded earlier was only about 6MB, though I followed the instructions here, so it should be fine.
Execute a script in phantomjs interactive (REPL) mode
Looks like REPL mode is borked and an overhaul is underway:
https://github.com/ariya/phantomjs/issues/11180
Parse output of spawned node.js child process line by line
Try this:
cspr.stdout.setEncoding('utf8');
cspr.stdout.on('data', function(data) {
    // No capture group in the pattern, so the line breaks themselves
    // are not returned as array items.
    var str = data.toString(), lines = str.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        // Process the line, noting it might be incomplete.
    }
});
Note that the "data" event might not necessarily break evenly between lines of output, so a single line might span multiple data events.
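One way to cope with lines that span several data events is to hold back the trailing partial line until the rest of it arrives. The helper below is a minimal sketch of that idea; createLineSplitter is a made-up name, not a Node.js API.

```javascript
// Hypothetical helper: returns a function that accepts chunks of text and
// calls onLine once per complete line, holding back any trailing partial line.
function createLineSplitter(onLine) {
  let pending = '';
  return function feed(chunk) {
    pending += chunk;
    const parts = pending.split(/\r?\n/);
    pending = parts.pop();  // the last piece has no line break yet
    for (const line of parts) {
      onLine(line);
    }
  };
}

// Usage with the child process from the snippet above:
// cspr.stdout.setEncoding('utf8');
// cspr.stdout.on('data', createLineSplitter(function (line) {
//   console.log('line:', line);
// }));
```

Note that this sketch never flushes whatever is left in the buffer when the stream ends, so a final line without a trailing newline would need separate handling (for example in an 'end' handler).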