htmlunit always gives multiple javascript exceptions after running the project
setJavaScriptEnabled(false) must be called before getting the page; otherwise the JavaScript will be executed.
So your code should be:
final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED);
webClient.getOptions().setJavaScriptEnabled(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setRedirectEnabled(true);
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
final HtmlPage page1 = webClient.getPage("https://login.yahoo.com/account/create?specId=yidReg&lang=en-JO&src=ym&done=https%3A%2F%2Fmail.yahoo.com&display=login&intl=xa");
How to overcome error from a single javascript while enabling others in HTMLUnit?
You can use
webClient.getOptions().setThrowExceptionOnScriptError(false);
For more details, have a look at https://htmlunit.sourceforge.io/javascript-howto.html.
BTW: this is an error in the page itself; you get the same error in real browsers.
java htmlunit throwing ScriptException on Clicking Submit Button of a Page
First, tell the WebClient not to throw an exception on script errors by setting this option to false:
webClient.getOptions().setThrowExceptionOnScriptError(false);
Java HTMLUnit WebClient ScriptException errors
Without knowing the page and more details about your code, I can only offer some advice:
- Your HtmlUnit version is really outdated (2.19 is from Nov 12, 2015), and we are now at 2.35.0. Please use the latest one.
- Check the browser log in real browsers to see whether the error appears there as well.
- webClient.getOptions().setThrowExceptionOnScriptError(false); changes the behavior of HtmlUnit so that it does not throw an exception when an unhandled JS exception is detected. This is more or less how real browsers handle JS exceptions. But (as in real browsers) HtmlUnit still logs these exceptions. If you don't want to be informed about them, you have to configure the logger.
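Configuring the logger, as the last point suggests, could look roughly like the sketch below. It uses only java.util.logging and assumes that HtmlUnit's log output ends up there (HtmlUnit logs through Apache Commons Logging, which only falls back to java.util.logging when no other backend is configured). The class name QuietHtmlUnitLogging and the child logger name are made up for illustration; the "com.gargoylesoftware" logger name is the one used in the answers on this page.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietHtmlUnitLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // configured logger could otherwise be garbage-collected and lose its level.
    private static final Logger HTMLUNIT_LOGGER = Logger.getLogger("com.gargoylesoftware");

    /** Turns off all log output at and below the com.gargoylesoftware namespace. */
    public static Logger silenceHtmlUnit() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        return HTMLUNIT_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = silenceHtmlUnit();
        // Hypothetical child logger name; it has no level of its own,
        // so it inherits OFF from its parent.
        Logger child = Logger.getLogger("com.gargoylesoftware.htmlunit.javascript");
        System.out.println(logger.getLevel());
        System.out.println(child.isLoggable(Level.SEVERE));
    }
}
```

Call silenceHtmlUnit() before creating the WebClient. If you still want to see real failures, setting the level to Level.SEVERE instead of Level.OFF is a less drastic choice.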
HtmlUnit ScriptException errors
Many questions are asked about this kind of issue. The ScriptException is raised because you have a syntax error in your JavaScript. Most browsers manage to interpret the JS even with some kinds of errors, but HtmlUnit is a bit inflexible in that sense.
Your options are:
- Correct JS code
- Disable JS in the WebClient
- Don't use HtmlUnit. Use a different framework with better JS support such as PhantomJS (note it is not a Java-based framework)
htmlunit Cannot read property push from undefined
I've encountered a similar problem before. This is an issue with HtmlUnit being designed as a test harness framework rather than a web scraping one. Are you running the latest version of HtmlUnit?
I was able to run your code by adding both the setThrowExceptionOnScriptError(false) line (as mentioned in Coffee Converter's answer) and
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
at the top of the method to disable the log dump. This yielded the following output:
Royal Filmpalast München München | kinoheld.de
Full code is as follows:
public static void main(String[] args) throws IOException {
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);
String url = "https://www.kinoheld.de/kino-muenchen/royal-filmpalast/vorstellung/280823/?mode=widget&showID=280828#panel-seats";
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
HtmlPage response = webClient.getPage(url);
// Wait up to 9 seconds for background JavaScript started by the page to finish.
webClient.waitForBackgroundJavaScript(9000);
System.out.println(response.getTitleText());
}
This was run on RedHat command line with HTML Unit 2.2.1. Hope this helps.
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method “appendChild” of null
If result.html does not contain this head section:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
<title>login</title>
</head>
or if a body tag is added to login.html:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<body>
<script>
function getContent(){
var url= "result.html";
var xhr=new (window.XMLHttpRequest||window.ActiveXObject)("Microsoft.XMLHTTP");
xhr.onreadystatechange = function() {
if (xhr.readyState == 4 && xhr.status == 200) {
document.write(xhr.responseText);
document.close();
}
};
xhr.open("GET",url,false); xhr.send();
}
getContent();
</script>
</body>
</html>
then it succeeds. Can anyone tell me why?
JavaScript added to result.html:
d.write("d:" + d + "<br/>");
d.write("b:" + b + "<br/>");
Output:
d:[object HTMLDocument]
<br/>
b:[object HTMLBodyElement]
<br/>
<div>
<div>
I was appended...
</div>
</div>