Encode HTML entities in JavaScript
You can use a regex to replace any character in a given Unicode range with its HTML entity equivalent. The code would look something like this:
var encodedStr = rawStr.replace(/[\u00A0-\u9999<>\&]/g, function(i) {
return '&#'+i.charCodeAt(0)+';';
});
This code will replace all characters in the given range (Unicode 00A0 - 9999, as well as the ampersand, greater-than and less-than signs) with their HTML entity equivalents, which is simply &#nnn; where nnn is the Unicode value we get from charCodeAt.
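As a quick illustration of the replacement described above, here is a minimal sketch that wraps the same regex in a function. The name encodeNonAscii is hypothetical and not part of the original answer.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function encodeNonAscii(rawStr) {
  // Each matched character becomes &#nnn; where nnn is its UTF-16 code unit value.
  return rawStr.replace(/[\u00A0-\u9999<>&]/g, function (ch) {
    return '&#' + ch.charCodeAt(0) + ';';
  });
}

console.log(encodeNonAscii('<p>café & crème</p>'));
// prints "&#60;p&#62;caf&#233; &#38; cr&#232;me&#60;/p&#62;"
```

Plain ASCII characters other than the three listed in the character class (for example the slash) pass through unchanged.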
See it in action here: http://jsfiddle.net/E3EqX/13/ (this example uses jQuery for element selectors used in the example. The base code itself, above, does not use jQuery)
Making these conversions does not solve all the problems -- make sure you're using UTF-8 character encoding and that your database is storing the strings in UTF-8. You may still see instances where the characters do not display correctly, depending on system font configuration and other issues out of your control.
Documentation
- String.charCodeAt - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt
- HTML Character entities - http://www.chucke.com/entities.html
Native JavaScript or ES6 way to encode and decode HTML entities?
There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalent.
Here is the beginning of a solution and an easy trick that you may like:
HTML Entity Decode
You could try something like:
var Title = $('<textarea />').html("Chris&apos; corner").text();
console.log(Title);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
Javascript encode HTML entities on server
Since I asked this question, I learned JavaScript and AJAX. So, my suggestion will be using AJAX and JSON for communication between browser and server-side.
Convert special characters to HTML in JavaScript
You need a function that does something like
return mystring.replace(/&/g, "&amp;").replace(/>/g, "&gt;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
But taking into account your desire for different handling of single/double quotes.
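A minimal sketch of such a function, assuming the "different handling" of quotes means single quotes are escaped only on request. The name escapeHtml and the escapeSingleQuotes option are hypothetical, chosen for illustration.

```javascript
// Lookup table of characters and their entity replacements.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical helper: double quotes are always escaped,
// single quotes only when escapeSingleQuotes is true (the default here).
function escapeHtml(str, options) {
  const escapeSingleQuotes = !options || options.escapeSingleQuotes !== false;
  const pattern = escapeSingleQuotes ? /[&<>"']/g : /[&<>"]/g;
  // A single pass avoids re-escaping the ampersands produced by earlier replacements.
  return str.replace(pattern, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

console.log(escapeHtml('<a href="x">Tom\'s & Co</a>'));
// prints "&lt;a href=&quot;x&quot;&gt;Tom&#39;s &amp; Co&lt;/a&gt;"
```

Doing everything in one replace call means the order of replacements no longer matters, unlike the chained version where the ampersand has to be handled first.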
Encode HTML entities
Basically you should encode your html entities into html as such:
var encodedStr = data['1']['result']['content'];
var a = $("#content").html(encodedStr).text();
Then get the encoded text and apply it as html() as such:
$("#content").html(a);
That should work.
Demo: http://jsbin.com/ihadam/9/edit
Unescape HTML entities in JavaScript?
EDIT: You should use the DOMParser API as Wladimir suggests, I edited my previous answer since the function posted introduced a security vulnerability.
The following snippet is the old answer's code with a small modification: using a textarea instead of a div reduces the XSS vulnerability, but it is still problematic in IE9 and Firefox.
function htmlDecode(input){
var e = document.createElement('textarea');
e.innerHTML = input;
// handle case of empty input
return e.childNodes.length === 0 ? "" : e.childNodes[0].nodeValue;
}
htmlDecode("&lt;img src='myimage.jpg'&gt;");
// returns "<img src='myimage.jpg'>"
Basically I create a DOM element programmatically, assign the encoded HTML to its innerHTML and retrieve the nodeValue from the text node created on the innerHTML insertion. Since it just creates an element but never adds it, no site HTML is modified.
It will work cross-browser (including older browsers) and accept all the HTML Character Entities.
EDIT: The old version of this code did not work on IE with blank inputs, as evidenced here on jsFiddle (view in IE). The version above works with all inputs.
UPDATE: It appears this doesn't work with large strings, and it also introduces a security vulnerability; see the comments.
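The DOMParser approach recommended in the first edit is not shown in this answer. One way it might look is sketched below; the function name htmlDecodeWithDomParser is hypothetical, and the code assumes a browser-like environment where DOMParser exists (it is not available in plain Node.js).

```javascript
// Hypothetical helper; a sketch, not the code from the referenced answer.
function htmlDecodeWithDomParser(input) {
  if (typeof DOMParser === 'undefined') {
    throw new Error('DOMParser is not available in this environment');
  }
  // Parse the string as an HTML document that is separate from the page.
  const doc = new DOMParser().parseFromString(input, 'text/html');
  // textContent returns the decoded text without any markup.
  return doc.documentElement.textContent;
}

// Only run the demo where DOMParser exists (for example, in a browser console).
if (typeof DOMParser !== 'undefined') {
  console.log(htmlDecodeWithDomParser("&lt;img src='myimage.jpg'&gt;"));
}
```

Because the input is parsed into a detached document and only its text is read back, nothing is inserted into the live page.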
A plain JavaScript way to decode HTML entities, works on both browsers and Node
There are many similar questions and useful answers on Stack Overflow, but I couldn't find a way that works in both browsers and Node.js, so I'd like to share my approach.
For HTML codes like &lt; &gt; &#39; and even Chinese characters, I suggest using this function (inspired by some other answers):
function decodeEntities(encodedString) {
var translate_re = /&(nbsp|amp|quot|lt|gt);/g;
var translate = {
"nbsp":" ",
"amp" : "&",
"quot": "\"",
"lt" : "<",
"gt" : ">"
};
return encodedString.replace(translate_re, function(match, entity) {
return translate[entity];
}).replace(/&#(\d+);/gi, function(match, numStr) {
var num = parseInt(numStr, 10);
return String.fromCharCode(num);
});
}
This implementation also works in a Node.js environment.
decodeEntities("&#21704;&#21704;&nbsp;&#39;&#36825;&#20010;&#39;&amp;&quot;&#37027;&#20010;&quot;&#22909;&#29609;&lt;&gt;") //哈哈 '这个'&"那个"好玩<>
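For comparison, here is a hedged variant of the same idea that decodes everything in a single pass, so text such as &amp;lt; is not decoded twice, and that also accepts hexadecimal references and code points above U+FFFF. The names NAMED_ENTITIES, fromCodePointOr and decodeEntitiesOnce are hypothetical, and the table of named entities is deliberately small.

```javascript
// Small, incomplete table of named entities (an assumption for this sketch).
const NAMED_ENTITIES = { nbsp: '\u00A0', amp: '&', quot: '"', lt: '<', gt: '>', apos: "'" };

// Return the character for a code point, or the fallback if it is out of range.
function fromCodePointOr(codePoint, fallback) {
  return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : fallback;
}

function decodeEntitiesOnce(encodedString) {
  // One regex covers hex (&#x..;), decimal (&#..;) and named (&name;) references.
  return encodedString.replace(/&(?:#[xX]([0-9a-fA-F]+)|#(\d+)|([a-z]+));/g,
    function (match, hex, dec, name) {
      if (hex !== undefined) return fromCodePointOr(parseInt(hex, 16), match);
      if (dec !== undefined) return fromCodePointOr(parseInt(dec, 10), match);
      // Unknown names are left untouched rather than replaced with "undefined".
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
        ? NAMED_ENTITIES[name]
        : match;
    });
}

console.log(decodeEntitiesOnce('&lt;b&gt; &amp;lt; &#39;&#x1F600;&#21704;'));
// prints "<b> &lt; '😀哈"
```

This sketch maps nbsp to the actual no-break space (U+00A0) instead of a regular space, which is a design choice that differs from the function above.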
As a new user, I only have 1 reputation :(
I can't comment on or answer existing posts, so this is the only thing I can do for now.
Edit 1
I think this answer works even better than mine, although no one has upvoted it.
How to encode html tag entities - JavaScript
Using String prototype
A possible solution is defining a replaceAll function, e.g. in the prototype of String:
String.prototype.replaceAll = function(search, replace) {
return this.replace(new RegExp('[' + search + ']', 'g'), replace);
};
After this, you only need to iterate over the properties of charsToReplace:
for (var prop in charsToReplace) {
if (charsToReplace.hasOwnProperty(prop)) {
str = str.replaceAll(prop, charsToReplace[prop]);
}
}
The final str can be assigned to the innerHTML.
Using normal function
If for some reason, you do not want to mess with the prototype, you may define a normal JavaScript function for the same task:
var replaceAll = function (str, search, replace) {
return str.replace(new RegExp('[' + search + ']', 'g'), replace);
}
This works the same way; you just need to pass the string instance to it:
str = replaceAll(str, prop, charsToReplace[prop]);
Another approach
If you use these methods often, you might consider storing the regex patterns in your charsToReplace object, like this:
var charsToReplace = {
'&': {
pattern: new RegExp('[' + '&' + ']', 'g'),
replace: '&amp;'
}
...
}
So your replaceAll function would look like this:
var replaceAll = function (str, replacement) {
return str.replace(replacement.pattern, replacement.replace);
}
This way you would not need to recreate the regular expressions every time, which could save some processing time.
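Putting the precompiled-pattern idea together, a minimal runnable sketch could look like the following. The entries for < and > and the function name encodeTags are assumptions added for illustration; the original charsToReplace object elides its remaining entries.

```javascript
// Precompiled patterns; '&' comes first so entities added later are not re-encoded.
// The '<' and '>' entries are assumed for this example.
var charsToReplace = {
  '&': { pattern: /&/g, replace: '&amp;' },
  '<': { pattern: /</g, replace: '&lt;' },
  '>': { pattern: />/g, replace: '&gt;' }
};

var replaceAll = function (str, replacement) {
  return str.replace(replacement.pattern, replacement.replace);
};

// Hypothetical driver that applies every stored replacement in insertion order.
function encodeTags(str) {
  Object.keys(charsToReplace).forEach(function (key) {
    str = replaceAll(str, charsToReplace[key]);
  });
  return str;
}

console.log(encodeTags('<b>a & b</b>'));
// prints "&lt;b&gt;a &amp; b&lt;/b&gt;"
```

The order of the keys matters here: if the ampersand were replaced last, the ampersands introduced by &lt; and &gt; would be encoded a second time.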