Fastest Method to Escape HTML Tags as HTML Entities

Fastest method to escape HTML tags as HTML entities?

You could try passing a callback function to perform the replacement:

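// Map each special character to its HTML entity.
var tagsToReplace = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;'
};

// Callback for String.prototype.replace: returns the entity for a matched character.
function replaceTag(tag) {
    return tagsToReplace[tag] || tag;
}

function safe_tags_replace(str) {
    return str.replace(/[&<>]/g, replaceTag);
}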
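
As a quick illustration of the callback approach, here is a minimal usage sketch. The sample string is made up for the example, and the callback is inlined as an arrow function, which is a stylistic assumption rather than part of the original answer.

```javascript
// Lookup table from special character to its HTML entity.
const tagsToReplace = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;'
};

function safe_tags_replace(str) {
  // The callback receives each matched character and returns its entity.
  return str.replace(/[&<>]/g, (tag) => tagsToReplace[tag] || tag);
}

const sample = '<b>Tom & Jerry</b>'; // hypothetical input
console.log(safe_tags_replace(sample));
// prints: &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```

Because the string is scanned once, an ampersand that is part of a freshly inserted entity is never looked at again.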

Here is a performance test: http://jsperf.com/encode-html-entities to compare with calling the replace function repeatedly, and using the DOM method proposed by Dmitrij.

Your way seems to be faster...

Why do you need it, though?

Can I escape HTML special chars in JavaScript?

Here's a solution that will work in practically every web browser:

function escapeHtml(unsafe)
{
    return unsafe
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#039;");
}

If you only support modern web browsers (2020+), then you can use the new replaceAll function:

const escapeHtml = (unsafe) => {
    return unsafe
        .replaceAll('&', '&amp;')
        .replaceAll('<', '&lt;')
        .replaceAll('>', '&gt;')
        .replaceAll('"', '&quot;')
        .replaceAll("'", '&#039;');
}
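
One detail worth spelling out for the chained versions: the ampersand has to be replaced first. The sketch below contrasts the correct order with a deliberately wrong one. `escapeHtmlWrongOrder` is a hypothetical name used only for this illustration.

```javascript
// Correct order: ampersands are escaped before anything else.
function escapeHtml(unsafe) {
  return unsafe
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Hypothetical wrong order, for illustration only: '&' is escaped last,
// so the '&' inside the freshly inserted '&lt;' gets escaped again.
function escapeHtmlWrongOrder(unsafe) {
  return unsafe
    .replace(/</g, '&lt;')
    .replace(/&/g, '&amp;');
}

console.log(escapeHtml('a < b'));           // a &lt; b
console.log(escapeHtmlWrongOrder('a < b')); // a &amp;lt; b
```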

Escaping < and > in JavaScript code when assigning it to innerHTML

You can either use the HTML entities for > and <, which are &gt; and &lt;

OR

You can wrap your script within CDATA like this:

<script>
<![CDATA[
--YOUR SCRIPT--
]]>
</script>

Encode HTML entities in JavaScript

You can use regex to replace any character in a given unicode range with its html entity equivalent. The code would look something like this:

var encodedStr = rawStr.replace(/[\u00A0-\u9999<>\&]/g, function(i) {
return '&#'+i.charCodeAt(0)+';';
});

This code will replace all characters in the given range (unicode 00A0 - 9999, as well as ampersand, greater & less than) with their html entity equivalents, which is simply &#nnn; where nnn is the unicode value we get from charCodeAt.
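
Note that `charCodeAt` works on UTF-16 code units, and the range above stops at U+9999, so characters beyond it (including emoji, which are stored as surrogate pairs) pass through unchanged. If those should be encoded as well, one possible variant is sketched below. The function name `encodeNonAscii` and the widened range are assumptions made for this example, not part of the original answer.

```javascript
// Sketch: encode everything from U+00A0 upward, plus <, > and &.
function encodeNonAscii(rawStr) {
  // With the 'u' flag the regex matches whole code points, so a
  // surrogate pair (such as an emoji) reaches the callback as one match.
  return rawStr.replace(/[\u00A0-\u{10FFFF}<>&]/gu, (ch) => {
    // codePointAt returns the full code point, not just the first code unit.
    return '&#' + ch.codePointAt(0) + ';';
  });
}

console.log(encodeNonAscii('café <b> 😀'));
// prints: caf&#233; &#60;b&#62; &#128512;
```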

See it in action here: http://jsfiddle.net/E3EqX/13/ (this example uses jQuery for element selectors used in the example. The base code itself, above, does not use jQuery)

Making these conversions does not solve every problem: make sure you're using UTF-8 character encoding, and make sure your database stores the strings as UTF-8. You may still see cases where characters do not display correctly, depending on system font configuration and other issues outside your control.

Documentation

  • String.charCodeAt - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt
  • HTML Character entities - http://www.chucke.com/entities.html

Escaping HTML strings with jQuery

Since you're using jQuery, you can just set the element's text property:

// before:
// <div class="someClass">text</div>
var someHtmlString = "<script>alert('hi!');</script>";

// set a DIV's text:
$("div.someClass").text(someHtmlString);
// after:
// <div class="someClass">&lt;script&gt;alert('hi!');&lt;/script&gt;</div>

// get the text in a string:
var escaped = $("<div>").text(someHtmlString).html();
// value:
// &lt;script&gt;alert('hi!');&lt;/script&gt;

HTML Entity Decode

You could try something like:

var Title = $('<textarea />').html("Chris&#39; corner").text();
console.log(Title);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
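
Where no DOM is available (for example outside the browser), a small hand-written decoder can cover the common cases. The following is only a sketch under stated assumptions: `decodeBasicEntities` and the `NAMED` table are hypothetical names, and only numeric references plus five named entities are handled. Anything else is left untouched.

```javascript
// Hypothetical lookup table; real HTML defines many more named entities.
const NAMED = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

function decodeBasicEntities(str) {
  // Matches &#123; (decimal), &#x7B; (hex) or &name; in a single pass.
  return str.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references as they are instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    }
    // Unknown names are returned unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeBasicEntities('Chris&#39; corner')); // Chris' corner
console.log(decodeBasicEntities('&lt;b&gt; &amp;lt;')); // <b> &lt;
```

The single pass matters here too: `&amp;lt;` decodes to `&lt;`, not to `<`.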

What's the easiest way to escape HTML in Python?

html.escape is the correct answer now; before Python 3.2 it was cgi.escape. It escapes:

  • < to &lt;
  • > to &gt;
  • & to &amp;

That is enough for all HTML.

EDIT: If you have non-ascii chars you also want to escape, for inclusion in another encoded document that uses a different encoding, like Craig says, just use:

data.encode('ascii', 'xmlcharrefreplace')

Don't forget to decode data to unicode first, using whatever encoding it was encoded with.

However, in my experience that kind of encoding is unnecessary if you work with unicode from the start. Just encode at the end to the encoding specified in the document header (utf-8 for maximum compatibility).

Example:

>>> cgi.escape(u'<a>bá</a>').encode('ascii', 'xmlcharrefreplace')
'&lt;a&gt;b&#225;&lt;/a&gt;'

Also worth noting (thanks Greg) is the extra quote parameter cgi.escape takes. With it set to True, cgi.escape also escapes double quote chars (") so you can use the resulting value in an XML/HTML attribute.

EDIT: Note that cgi.escape has been deprecated in Python 3.2 in favor of html.escape, which does the same except that quote defaults to True.

Escape HTML entities and render URL dynamically

Here you have your sandbox working: https://iframe-dynamic-src-pmxqbb.stackblitz.io

I've fixed it by:

<iframe
    src={decodeURIComponent(
        encodeURIComponent(brokeUrl.replace(/&amp;/g, "&"))
    )}
    width="800"
    height="600"
    frameborder="0"
    scrolling="no"
    content
/>

This decodes an encoded URL after globally replacing every &amp; entity with a plain ampersand (&).
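
To see which step does the actual work, here is a small sketch outside of JSX. The URL is a made-up example. For a well-formed string, `decodeURIComponent(encodeURIComponent(s))` returns `s` unchanged, so the fix appears to come from the `replace` call alone.

```javascript
// Hypothetical URL whose query separator arrived as an HTML entity.
const brokeUrl = 'https://example.com/embed?a=1&amp;b=2';

// Turn every &amp; back into a plain ampersand.
const fixed = brokeUrl.replace(/&amp;/g, '&');

// Encoding and then decoding is a round trip that changes nothing here.
const roundTripped = decodeURIComponent(encodeURIComponent(fixed));

console.log(fixed);                  // https://example.com/embed?a=1&b=2
console.log(roundTripped === fixed); // true
```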


