How to get rendered HTML (processed by JavaScript) in WebBrowser control?
Here is one solution I found for getting the rendered HTML (DOM) after the page's JavaScript has run:
Place a WebBrowser control named webBrowser1 on the Form of class Form1.
[Form1.cs[Design]]
Then for code use:
[Form1.cs]
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace WebBrowserTest
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
            // Expose MyScript to the page's JavaScript as window.external.
            this.webBrowser1.ObjectForScripting = new MyScript();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            webBrowser1.Navigate("http://localhost:6489/Default.aspx");
        }

        private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Once the page has loaded, have it call back into C# via window.external.
            webBrowser1.Navigate("javascript: window.external.CallServerSideCode();");
        }

        [ComVisible(true)]
        public class MyScript
        {
            public void CallServerSideCode()
            {
                // doc is the live DOM, including changes made so far by the page's JavaScript.
                var doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
            }
        }
    }
}
In Form1_Load, change the URL passed to webBrowser1.Navigate("http://localhost:6489/Default.aspx") to the page whose JavaScript-processed DOM you want to obtain.
You can access the modified DOM in the CallServerSideCode() method, for example:
doc.GetElementById("myDataTable");
Or you can access the rendered HTML like this:
var renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
How to get HTML from WebBrowser control
Your samples refer to the WinForms WebBrowser control. For the WPF WebBrowser control, the steps are:
Add a reference to Microsoft.mshtml to your project (search for it in the Add Reference dialog).
Cast the Document property to HTMLDocument in order to access its methods and properties (as stated on MSDN).
See also my GitHub-Sample:
private void WebBrowser_Navigated(object sender, NavigationEventArgs e) {
    var document = (HTMLDocument)_Browser.Document;
    _Html.Text = document.body.outerHTML;
}
Get dynamically generated (rendered) HTML from IE
I'm using Internet Explorer 9, but the process should be the same or very similar for IE 11:
- Navigate to the webpage
- Press F12 to launch the developer tools (also available from the Tools menu)
- Right-click the opening HTML tag
- Select copy outerHTML
You should now have all the dynamic HTML in your clipboard, ready to paste wherever you like.
Get HTML Source after JavaScript manipulations
The trick is finding a way to tell the control when the JS has finished running. You might be able to do that by having the JS set a form element's value (isJSComplete) when it has completed, and polling for that value from the web browser control.
Use the following code to check a form value to see if it is ready:
MyBrowserControl.document.getElementById('isJSComplete');
Use the following code to pull the HTML from the page.
MyBrowserControl.Document.documentElement.OuterHTML
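As a rough C# WinForms sketch of the polling idea above: a System.Windows.Forms.Timer checks a hidden isJSComplete field until the page's script sets it to "true", then reads the rendered HTML. The inline test page, the PollingForm class, the IsReady helper, the "true" flag value, and the 250 ms interval are all assumptions made for illustration; none of them come from the original answer.

```csharp
using System;
using System.Windows.Forms;

public class PollingForm : Form
{
    // Hypothetical stand-in page: its script fills the div and sets the flag after 500 ms.
    private const string TestPage =
        "<html><body><input type='hidden' id='isJSComplete' value='false' />" +
        "<div id='content'></div>" +
        "<script>setTimeout(function () {" +
        "document.getElementById('content').innerHTML = '<b>rendered by script</b>';" +
        "document.getElementById('isJSComplete').value = 'true';" +
        "}, 500);</script></body></html>";

    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };
    private readonly Timer pollTimer = new Timer { Interval = 250 }; // assumed interval

    public PollingForm()
    {
        Controls.Add(browser);
        // Start polling only once the initial document has loaded.
        browser.DocumentCompleted += (s, e) => pollTimer.Start();
        pollTimer.Tick += OnPollTick;
        // For a real page, call browser.Navigate(url) here instead.
        Load += (s, e) => browser.DocumentText = TestPage;
    }

    // Readiness rule kept in one place; "true" is an assumed convention for the flag.
    public static bool IsReady(string flagValue)
    {
        return string.Equals(flagValue, "true", StringComparison.OrdinalIgnoreCase);
    }

    private void OnPollTick(object sender, EventArgs e)
    {
        HtmlDocument doc = browser.Document;
        if (doc == null) return;

        HtmlElement flag = doc.GetElementById("isJSComplete");
        if (flag == null || !IsReady(flag.GetAttribute("value"))) return;

        // The flag is set: stop polling and read the DOM as it stands now.
        pollTimer.Stop();
        string renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
        MessageBox.Show(renderedHtml);
    }
}

public static class Program
{
    [STAThread]
    public static void Main()
    {
        Application.Run(new PollingForm());
    }
}
```

Polling is simple but wasteful if the script takes long, and a real implementation would probably also want a timeout so the timer does not run forever when the flag is never set.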
Better yet, here is an article showing how to wire up JS events to be handled by the WebBrowser control. You could just fire an event when the JS is done and have your code trap that event and then pull the HTML using the above approach.
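One way to sketch this event-based variant in C# WinForms (not necessarily the approach taken in the article mentioned above) is to let the page call back through window.external once its script has finished. ScriptBridge, NotifyJsComplete, NotifyingForm, and the inline test page are hypothetical names invented for this example; a real page would have to make the window.external call itself.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Object exposed to the page as window.external; it must be public and COM-visible.
[ComVisible(true)]
public class ScriptBridge
{
    public event EventHandler JsCompleted;

    // Called from the page's JavaScript when it has finished modifying the DOM.
    public void NotifyJsComplete()
    {
        EventHandler handler = JsCompleted;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public class NotifyingForm : Form
{
    // Hypothetical stand-in page: it updates the DOM after 500 ms, then notifies the host.
    private const string TestPage =
        "<html><body><div id='content'></div>" +
        "<script>setTimeout(function () {" +
        "document.getElementById('content').innerHTML = '<b>rendered by script</b>';" +
        "window.external.NotifyJsComplete();" +
        "}, 500);</script></body></html>";

    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

    public NotifyingForm()
    {
        var bridge = new ScriptBridge();
        bridge.JsCompleted += OnJsCompleted;
        browser.ObjectForScripting = bridge;
        Controls.Add(browser);
        // For a real page, call browser.Navigate(url) here instead.
        Load += (s, e) => browser.DocumentText = TestPage;
    }

    private void OnJsCompleted(object sender, EventArgs e)
    {
        // Defer the work so the script's call into window.external can return first.
        BeginInvoke(new MethodInvoker(ShowRenderedHtml));
    }

    private void ShowRenderedHtml()
    {
        string renderedHtml = browser.Document.GetElementsByTagName("HTML")[0].OuterHtml;
        MessageBox.Show(renderedHtml);
    }
}

public static class Program
{
    [STAThread]
    public static void Main()
    {
        Application.Run(new NotifyingForm());
    }
}
```

Compared with polling, this avoids the timer entirely, but it only works when you control the page's script or can inject the notification call.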
Get HTML from Frame using WebBrowser control - UnauthorizedAccessException
Thanks to Noseratio's comments, I managed to do that with the WebBrowser control. Here are some major points that might help others who have similar questions:
1) The DocumentCompleted event should be used. In the Navigated event, the body of the document is still null.
2) Following answer helped a lot: WebBrowserControl: UnauthorizedAccessException when accessing property of a Frame
3) I was not aware of IHTMLWindow2 and similar interfaces. For them to work correctly, I added references to the following COM libraries: Microsoft Internet Controls (SHDocVw) and Microsoft HTML Object Library (MSHTML).
4) I grabbed the html of the frame with the following code:
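void WebBrowserMain_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (e.Url.OriginalString == Constants.FINAL_URL)
    {
        try
        {
            var doc = (IHTMLDocument2) WebBrowserMain.Document.DomDocument;
            var frame = (IHTMLWindow2) doc.frames.item(0);
            // CrossFrameIE is not a framework type; it appears to be the helper from the answer referenced in point 2.
            var document = CrossFrameIE.GetDocumentFromWindow(frame);
            var html = document.body.outerHTML;
            var dataParser = new DataParser(html);
            //my logic here
        }
        catch (Exception ex)
        {
            // The original snippet was cut off before this point; this handler is a placeholder.
            System.Diagnostics.Debug.WriteLine(ex);
        }
    }
}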
5) To work with the HTML, I used the HTML Agility Pack, which has good XPath search support.