Casperjs/Phantomjs Doesn't Load Https Page

CasperJS/PhantomJS doesn't load https page

The problem may be related to the recently discovered SSLv3 vulnerability (POODLE). Website owners were forced to remove SSLv3 support from their websites. Since PhantomJS versions before 1.9.8 use SSLv3 by default, you should switch to TLSv1:

casperjs --ssl-protocol=tlsv1 yourScript.js

The catch-all solution is to use any, so that newer PhantomJS versions can pick other SSL/TLS protocols as they become available. However, this leaves the POODLE vulnerability exploitable on sites that haven't yet disabled SSLv3.

casperjs --ssl-protocol=any yourScript.js

Alternative method: Update to PhantomJS 1.9.8 or higher. Note that updating to PhantomJS 1.9.8 leads to a new bug, which is especially annoying for CasperJS.

How to verify: Add a resource.error event handler like this at the beginning of your script:

casper.on("resource.error", function (resourceError) {
    // Log which resource failed, then the error code and description
    console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
    console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
});

If it is indeed a problem with SSLv3, the error will be something like:

Error code: 6. Description: SSL handshake failed
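
To make that case easier to spot in a long log, the handler can translate the error into a hint. The following is only a minimal sketch: describeResourceError is a hypothetical helper, not part of CasperJS or PhantomJS, and it assumes that error code 6 always means the handshake failure shown above.

```javascript
// Hypothetical helper (not a CasperJS/PhantomJS API): turns a resourceError
// object into a one-line message, with a hint for the SSL handshake case.
function describeResourceError(resourceError) {
    var SSL_HANDSHAKE_FAILED = 6; // assumption: the code from the sample output above
    if (resourceError.errorCode === SSL_HANDSHAKE_FAILED) {
        return 'SSL handshake failed for ' + resourceError.url +
            '; try --ssl-protocol=tlsv1 or --ssl-protocol=any';
    }
    return 'Error ' + resourceError.errorCode + ' for ' + resourceError.url +
        ': ' + resourceError.errorString;
}

// Possible use inside a CasperJS script:
// casper.on("resource.error", function (resourceError) {
//     console.log(describeResourceError(resourceError));
// });
```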


As an aside, you might also want to run with the --ignore-ssl-errors=true command-line option when there is something wrong with the certificate.

Cant open https web using Slimerjs, casperjs, phantomjs

The issue is not caused by PhantomJS as such. The site you are checking is protected by an F5 network protection mechanism:

https://devcentral.f5.com/articles/these-are-not-the-scrapes-youre-looking-for-session-anomalies

So it's not that the page doesn't load. Rather, the protection mechanism detects that PhantomJS is a bot, based on checks it has implemented.

The easiest fix is to use Chrome instead of PhantomJS. Otherwise, expect to spend a decent amount of time investigating.

Some similar questions from the past, both answered and unanswered:

Selenium and PhantomJS : webpage thinks Javascript is disabled

PhantomJS get no real content running on AWS EC2 CentOS 6

file_get_contents while bypassing javascript detection

Python POST Request Not Returning HTML, Requesting JavaScript Be Enabled

I will update this post with more details as I find them. But in my experience, it is better to go with what works than to waste time on sites that don't work under PhantomJS.

Update-1

I tried importing the browser cookies into PhantomJS and it still didn't work, which means there are some hard checks in place.

CasperJS not loading page

Okay, so after playing around with this for quite some time, Artom B.'s comment finally led me in the right direction. CasperJS requires PhantomJS version 1.8.2 or higher, but lower than 2.0.0.

So I uninstalled PhantomJS and installed version 1.9.8, but it still didn't work. Next, I uninstalled CasperJS, installed the development version, and ran my script with:

casperjs --ssl-protocol=tlsv1 --ignore-ssl-errors=true --cookies-file=/tmp/cookies.txt JScraper.js

This did the trick.

PhantomJS failing to open HTTPS site

I tried Fred's and Cameron Tinker's answers, but only the --ssl-protocol=any option seemed to help me:

phantomjs --ssl-protocol=any test.js

Also, I think it should be much safer to use --ssl-protocol=any, since you are still using encryption, whereas --ignore-ssl-errors=true will ignore (duh) all SSL errors, including malicious ones.

PhantomJS 2.0.0 doesn't wait for page to load

PhantomJS doesn't define at which point in the page load process the page.open callback is called, so the behavior you see doesn't contradict anything it claims.

You could add a static wait with setTimeout(), which should help for dynamic sites. There are also approaches that check for pending requests by counting how many requests were sent (page.onResourceRequested) and how many have finished (page.onResourceReceived, page.onResourceTimeout, page.onResourceError).
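
The request-counting approach could look roughly like the sketch below. trackPendingRequests and pendingCount are hypothetical names made up for illustration. The function only installs the four callbacks mentioned above on whatever page object it is given. It assumes that each callback's argument carries the request id, and that onResourceReceived reports a stage of 'end' once a response is complete.

```javascript
// Hypothetical helper: counts in-flight requests on a PhantomJS page object.
function trackPendingRequests(page) {
    var pending = {}; // request id -> true while the request is in flight
    var count = 0;

    // Mark a request as finished; safe to call more than once per id.
    function finish(id) {
        if (pending[id]) {
            delete pending[id];
            count--;
        }
    }

    page.onResourceRequested = function (requestData) {
        if (!pending[requestData.id]) {
            pending[requestData.id] = true;
            count++;
        }
    };
    page.onResourceReceived = function (response) {
        // Assumption: only the 'end' stage means the response is complete.
        if (response.stage === 'end') {
            finish(response.id);
        }
    };
    page.onResourceTimeout = function (request) { finish(request.id); };
    page.onResourceError = function (resourceError) { finish(resourceError.id); };

    return {
        pendingCount: function () { return count; }
    };
}

// Possible use in a PhantomJS script (the URL and the 100 ms poll are placeholders):
// var page = require('webpage').create();
// var tracker = trackPendingRequests(page);
// page.open('https://example.com/', function () {
//     var timer = setInterval(function () {
//         if (tracker.pendingCount() === 0) {
//             clearInterval(timer);
//             // the page is quiet; continue with the script here
//         }
//     }, 100);
// });
```

A count of zero only means nothing is in flight at that moment. A page that starts new requests later (for example from a timer) may still need an additional static wait.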

If it is actually a PhantomJS bug, then there is not much you can do besides trying some of the command-line switches.


