CasperJS/PhantomJS doesn't load https page
The problem may be related to the recent discovery of an SSLv3 vulnerability (POODLE). Website owners were forced to remove SSLv3 support from their websites. Since PhantomJS < v1.9.8 uses SSLv3 by default, you should use TLSv1:
casperjs --ssl-protocol=tlsv1 yourScript.js
The catch-all solution would be to use any, which also covers newer PhantomJS versions that come along with other SSL protocols. But this would leave the POODLE vulnerability exploitable on sites which haven't yet disabled SSLv3.
casperjs --ssl-protocol=any yourScript.js
Alternative method: Update to PhantomJS 1.9.8 or higher. Note that updating to PhantomJS 1.9.8 leads to a new bug, which is especially annoying for CasperJS.
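If a script has to run on machines with different PhantomJS versions, it can check at runtime whether it is on a version older than 1.9.8. The following is only a sketch: defaultsToSslv3 is a made-up helper name, and it assumes the version object has the major/minor/patch shape that phantom.version exposes.

```javascript
// Hypothetical helper (not part of PhantomJS or CasperJS):
// returns true when the given version is older than 1.9.8,
// i.e. when SSLv3 is still the default protocol.
function defaultsToSslv3(version) {
  if (version.major !== 1) { return version.major < 1; }
  if (version.minor !== 9) { return version.minor < 9; }
  return version.patch < 8;
}

// Inside a PhantomJS/CasperJS script, phantom.version looks like
// { major: 1, minor: 9, patch: 7 }. The guard keeps this snippet
// harmless when it is run outside PhantomJS.
if (typeof phantom !== 'undefined' && defaultsToSslv3(phantom.version)) {
  console.log('PhantomJS < 1.9.8 detected; consider --ssl-protocol=tlsv1');
}
```

The check only prints a hint. The protocol itself still has to be chosen on the command line, because it is fixed before the script starts.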
How to verify: Add a resource.error event handler like this at the beginning of your script:
casper.on("resource.error", function(resourceError){
    // Log which resource failed and why
    console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
    console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
});
If it is indeed a problem with SSLv3 the error will be something like:
Error code: 6. Description: SSL handshake failed
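To react to this case in the script instead of reading the log by eye, the check can be wrapped in a small function. This is a sketch only: isSslHandshakeFailure is a made-up name, and it relies purely on the error code 6 shown above.

```javascript
// Hypothetical helper: recognises the handshake failure shown above.
// It assumes the object has the errorCode field used by the
// resource.error handler; 6 is the code reported for
// "SSL handshake failed".
function isSslHandshakeFailure(resourceError) {
  return resourceError.errorCode === 6;
}

// Possible use inside a CasperJS script:
// casper.on("resource.error", function(resourceError){
//     if (isSslHandshakeFailure(resourceError)) {
//         console.log('Try running with --ssl-protocol=tlsv1');
//     }
// });
```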
As an aside, you also might want to run with the --ignore-ssl-errors=true command-line option when there is something wrong with the certificate.
Cant open https web using Slimerjs, casperjs, phantomjs
The issue is not caused by PhantomJS as such. The site you are checking is protected by an F5 network protection:
https://devcentral.f5.com/articles/these-are-not-the-scrapes-youre-looking-for-session-anomalies
So it's not that the page doesn't load. The protection mechanism detects that PhantomJS is a bot, based on checks they have implemented.
The easiest fix is to use Chrome instead of PhantomJS. Anything else means a decent amount of investigation time.
Some similar questions from the past, answered and unanswered:
Selenium and PhantomJS : webpage thinks Javascript is disabled
PhantomJS get no real content running on AWS EC2 CentOS 6
file_get_contents while bypassing javascript detection
Python POST Request Not Returning HTML, Requesting JavaScript Be Enabled
I will update this post with more details that I find. But my experience says, go with what works instead of wasting time on such sites which don't work under PhantomJS
Update-1
I have tried to import the browser cookies into PhantomJS and it still won't work, which means there are some hard checks.
CasperJS not loading page
Okay, so after playing around with this for quite some time, Artom B.'s comment finally led me in the right direction. CasperJS requires a PhantomJS version of 1.8.2 or higher, but lower than 2.0.0.
So I uninstalled PhantomJS, installed version 1.9.8 and it still didn't work. So, next I uninstalled CasperJS and installed the development version and ran my script with
casperjs --ssl-protocol=tlsv1 --ignore-ssl-errors=true --cookies-file=/tmp/cookies.txt JScraper.js
This did the trick.
PhantomJS failing to open HTTPS site
I tried Fred's and Cameron Tinker's answers, but only the --ssl-protocol=any option seemed to help me:
phantomjs --ssl-protocol=any test.js
Also I think it should be way safer to use --ssl-protocol=any, as you are still using encryption, whereas --ignore-ssl-errors=true will ignore (duh) all SSL errors, including malicious ones.
PhantomJS 2.0.0 doesn't wait for page to load
PhantomJS doesn't define at which point in the page load process the page.open callback is called. So nothing is actually claimed wrongly.
You may be able to add a static wait with setTimeout(), which should help for dynamic sites. There are also approaches where you check for pending requests by counting how many requests were sent with page.onResourceRequested and how many requests finished with page.onResourceReceived/page.onResourceTimeout/page.onResourceError.
If it is actually a PhantomJS bug, then there is not much you can do besides trying some of the command-line switches.
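The request-counting idea could look roughly like the sketch below. trackPendingRequests is a made-up helper name, and this is only one way to do it. Instead of two counters it remembers request ids, so a request that reports both an error and a final "received" event is only finished once.

```javascript
// Hypothetical helper: tracks in-flight requests of a PhantomJS page
// object by hooking its resource callbacks.
function trackPendingRequests(page) {
  var pending = {}; // request id -> true while the request is in flight

  function finish(id) {
    delete pending[id]; // deleting a missing key is harmless
  }

  page.onResourceRequested = function (requestData) {
    pending[requestData.id] = true;
  };
  page.onResourceReceived = function (response) {
    // Called for several stages; only "end" means the resource is complete.
    if (response.stage === 'end') { finish(response.id); }
  };
  page.onResourceTimeout = function (request) { finish(request.id); };
  page.onResourceError = function (resourceError) { finish(resourceError.id); };

  return {
    pendingCount: function () { return Object.keys(pending).length; }
  };
}

// Possible use in a PhantomJS script (the 200 ms interval is an
// arbitrary choice for this sketch):
// var tracker = trackPendingRequests(page);
// page.open(url, function () {
//     var timer = setInterval(function () {
//         if (tracker.pendingCount() === 0) {
//             clearInterval(timer);
//             // page is probably done loading; continue here
//         }
//     }, 200);
// });
```

Zero pending requests at one moment does not guarantee the page is finished, since a script may start new requests later. Combining this check with a short static wait is a common compromise.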