Getting Unparsed (Raw) HTML with JavaScript

Getting unparsed (raw) HTML with JavaScript

What you have should work:

Element test:

<div id="myE">How to&nbsp;fix</div>

JavaScript test:

alert(document.getElementById("myE").innerHTML); // alerts "How to&nbsp;fix"

Make sure that wherever you're using the result isn't showing &nbsp; as a space, which is likely the case. If you want to show it somewhere that's designed for HTML, you'll need to escape it.
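As a minimal sketch of the escaping step mentioned above: the helper name `escapeHtml` is hypothetical (it is not a built-in), and it only covers the five characters that matter in HTML text and attribute contexts.

```javascript
// Hypothetical helper: escape a string so an HTML context shows it literally.
function escapeHtml(raw) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  // Replace each special character with its character reference.
  return String(raw).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

console.log(escapeHtml('How to&nbsp;fix')); // How to&amp;nbsp;fix
```

Assigning the escaped string to innerHTML would then display the text "How to&nbsp;fix" rather than a non-breaking space.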

Getting raw text content of HTML element with HTML uninterpreted

To quote bobince:

When you ask the browser for an element node's innerHTML, it doesn't
give you the original HTML source that was parsed to produce that
node, because it no longer has that information. Instead, it generates
new HTML from the data stored in the DOM. The browser decides on how
to format that HTML serialisation; different browsers produce
different HTML, and chances are it won't be the same way you formatted
it originally.

In summary: none of innerHTML, innerText, text, textContent, nodeValue, or indexOf will give you the unparsed text.

The only possible way to do this is with a regex over the original source, or with an Ajax request to the page itself to re-fetch that source, but that is bad practice.
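For illustration only, here is a rough sketch of the regex route, assuming the page source is already available as a string (in a browser it could be obtained with something like `fetch(location.href).then((r) => r.text())`). The names `escapeRegExp` and `extractRawById` are hypothetical, and the pattern is fragile: it does not handle nested elements of the same tag name, unquoted id attributes, or comments.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return the unparsed content of the first element
// whose quoted id attribute equals `id`, or null when nothing matches.
function extractRawById(source, id) {
  const pattern = new RegExp(
    '<([a-zA-Z][\\w-]*)[^>]*\\sid=(["\'])' + escapeRegExp(id) + '\\2[^>]*>' +
      '([\\s\\S]*?)<\\/\\1\\s*>',
    'i'
  );
  const match = pattern.exec(source);
  return match ? match[3] : null;
}

const pageSource = '<p>intro</p>\n<div id="myE">How to&nbsp;fix</div>';
console.log(extractRawById(pageSource, 'myE')); // How to&nbsp;fix
```

Because the match stops at the first closing tag with the same name, this is only suitable for simple, known markup.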

Get unparsed element content

The simplest approach would be to use a back-end language to convert the character, since most of them (for example PHP and .NET) provide a built-in function for that.

Unfortunately, there is no built-in function for this in vanilla JavaScript (at least not to my knowledge).

For example, in PHP this would be:

string htmlentities(string $string....)

For JavaScript you could create your own solution.
Basically, create a list of the HTML entities you need, or take a predefined list like this one (https://raw.githubusercontent.com/w3c/html/master/entities.json), and work from there.

Iterate over each entry and check whether the character you are searching for (é) matches it.

To speed up the process, I'd save the JSON file to your own web host and maybe reduce its size by removing the entities you don't need.

It may not be the most beautiful solution, but it definitely does the job pretty well.

let element = document.getElementById('test').textContent;

fetch("https://raw.githubusercontent.com/w3c/html/master/entities.json")
  .then((resp) => resp.json()) // Transform the data into JSON
  .then(function (data) {
    // Each key is an entity name such as "&eacute;"; log every name whose character matches
    Object.keys(data).forEach(function (key) {
      if (data[key].characters === element) {
        console.log(key);
      }
    });
  });

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="test">é</div>
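If many characters need to be looked up, scanning every key each time becomes wasteful. A possible variation, sketched here under assumptions, builds a reverse lookup once from a reduced entity list. The `entities` object below is a hand-picked subset written in the same shape as entities.json, and the names `buildReverseLookup` and `entityFor` are hypothetical.

```javascript
// Assumption: a trimmed, hand-picked subset in the same shape as entities.json.
const entities = {
  '&eacute;': { codepoints: [233], characters: '\u00E9' },
  '&eacute': { codepoints: [233], characters: '\u00E9' },
  '&nbsp;': { codepoints: [160], characters: '\u00A0' },
  '&amp;': { codepoints: [38], characters: '&' },
  '&lt;': { codepoints: [60], characters: '<' },
};

// Build a Map from character to entity name, done once up front.
function buildReverseLookup(entityTable) {
  const lookup = new Map();
  for (const [name, info] of Object.entries(entityTable)) {
    if (!name.endsWith(';')) continue; // skip legacy names without a semicolon
    if (!lookup.has(info.characters)) lookup.set(info.characters, name);
  }
  return lookup;
}

// Return the named entity if known, otherwise a numeric character reference.
function entityFor(character, lookup) {
  if (lookup.has(character)) return lookup.get(character);
  return '&#' + character.codePointAt(0) + ';';
}

const lookup = buildReverseLookup(entities);
console.log(entityFor('\u00E9', lookup)); // &eacute;
console.log(entityFor('\u20AC', lookup)); // &#8364; (not in the subset)
```

The numeric fallback means a character missing from the reduced list still produces a valid reference.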

How to display raw HTML code in PRE or something like it but without escaping it

You can use the xmp element; see "What was the <XMP> tag used for?". It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (it also tells authors not to use it, but it cannot really prevent you).

Everything inside xmp is taken as is: no markup (tags or character references) is recognized there, except, for obvious reasons, the end tag of the element itself, </xmp>.

Otherwise xmp is rendered like pre.

When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:

<pre><![CDATA[
This is a demo, tags like <p> will
appear literally.
]]></pre>

I don’t see how you could combine xmp and a CDATA section to achieve so-called polyglot markup.
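One caveat with the CDATA approach is that the text itself must not contain the terminator `]]>`. A small sketch of the usual workaround, which splits the terminator across two adjacent sections, is shown below; the function name `toCdataSection` is hypothetical.

```javascript
// Hypothetical helper: wrap arbitrary text in a CDATA section, splitting any
// "]]>" in the text so it cannot end the section early.
function toCdataSection(text) {
  return '<![CDATA[' + String(text).split(']]>').join(']]]]><![CDATA[>') + ']]>';
}

console.log(toCdataSection('tags like <p> appear literally'));
// <![CDATA[tags like <p> appear literally]]>
console.log(toCdataSection('a]]>b'));
// <![CDATA[a]]]]><![CDATA[>b]]>
```

This only applies to XHTML served with an XML media type, as described above.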

Keep indentation structure when putting HTML into a variable

$.ajax({
  url: document.location,
  dataType: "html" // get plain source
}).done(function(text) {
  $(function() {
    // .text() escapes the source, so the tags show up as &lt;td&gt; in the old HTML
    $("body").text(text).html(function(_, old) {
      return old.replace(/&lt;\/?td&gt;/g, '<span class="red">$&</span>');
    }).css({"white-space": "pre-wrap", "text-align": "left", "font-family": "monospace"});
  });
});

Get raw HTTP response in NodeJS

I'm curious what problem you're really trying to solve because there's probably a better way.

But, if you just want to hack into a given response to see exactly what is being sent over that socket, you can monkey patch the socket.write() method to do something like this:

const app = require('express')();

app.get("/", (req, res) => {
    // monkey patch socket.write
    // so we can log everything sent over the socket
    const socket = req.socket;
    // patch only once: a keep-alive socket can serve several requests
    if (!socket.origWrite) {
        socket.origWrite = socket.write;
        socket.write = function(data, encoding, callback) {
            if (Buffer.isBuffer(data)) {
                console.log(data.toString());
            } else {
                console.log(data);
            }
            return socket.origWrite(data, encoding, callback);
        };
    }
    res.cookie("color", "blue");
    res.send("Hi. This is my http response.");
});

app.listen(80);

When I ran that and made a browser request to that route, I saw this in my console:

HTTP/1.1 200 OK
X-Powered-By: Express
Set-Cookie: color=blue; Path=/
Content-Type: text/html; charset=utf-8
Content-Length: 30
ETag: W/"1e-eJoRAEkyvi+cvBVvRkYOHolFbNc"
Date: Wed, 15 Dec 2021 19:43:20 GMT
Connection: keep-alive
Keep-Alive: timeout=5

Hi. This is my http response.

This matches exactly what the Chrome debugger shows for the HTTP response on the receiving end.
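The same idea can be factored into a reusable helper. The sketch below is an assumption-laden variation rather than anything built into Node.js: the name `logSocketWrites` and the marker symbol are hypothetical, and it patches a given socket at most once, because a keep-alive connection can reuse one socket for several requests.

```javascript
// Hypothetical marker so a socket is never patched twice.
const PATCHED = Symbol('rawWriteLogger');

// Wrap socket.write so every chunk is passed to `log` before being sent.
function logSocketWrites(socket, log = console.log) {
  if (socket[PATCHED]) return socket; // already wrapped, e.g. keep-alive reuse
  const origWrite = socket.write;
  socket.write = function (data, encoding, callback) {
    log(Buffer.isBuffer(data) ? data.toString() : data);
    // Call the original write with the socket as `this`.
    return origWrite.call(this, data, encoding, callback);
  };
  socket[PATCHED] = true;
  return socket;
}

// Demonstration with a stand-in object instead of a real network socket.
const sent = [];
const fakeSocket = { write(data) { sent.push(data); return true; } };
logSocketWrites(fakeSocket);
logSocketWrites(fakeSocket); // second call is a no-op
fakeSocket.write(Buffer.from('HTTP/1.1 200 OK')); // logs the text once
```

In a route handler this would be called as `logSocketWrites(req.socket)` before sending the response.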

I spent a fair amount of time looking for debug flags built into Node.js that would output this automatically, but could not find any. The 'net' module does have some debugging, but it has to do with socket events and lifetime, not with the actual data being sent or received.

FYI, you could also inspect the raw network data using a network analyzer such as Wireshark (there are many others), which hooks into your network adapter and can be configured to show you exactly what data is being sent and received.


