JavaScript: Unicode string to hex
Remember that a JavaScript code unit is 16 bits wide. Therefore the hex string form will be 4 digits per code unit.
usage:
var str = "\u6f22\u5b57"; // "\u6f22\u5b57" === "漢字"
alert(str.hexEncode().hexDecode());
String to hex form:
String.prototype.hexEncode = function(){
    var hex, i;
    var result = "";
    for (i=0; i<this.length; i++) {
        hex = this.charCodeAt(i).toString(16);
        result += ("000"+hex).slice(-4);
    }
    return result;
}
Back again:
String.prototype.hexDecode = function(){
    var j;
    var hexes = this.match(/.{1,4}/g) || [];
    var back = "";
    for(j = 0; j<hexes.length; j++) {
        back += String.fromCharCode(parseInt(hexes[j], 16));
    }
    return back;
}
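For readers who would rather not extend String.prototype, the same round trip can be sketched with two standalone functions. The names toHex16 and fromHex16 are made up for illustration, and padStart assumes an ES2017 or later environment.

```javascript
// Hypothetical helper names; not part of the answer above.
// Encode each UTF-16 code unit as exactly 4 hex digits.
function toHex16(str) {
  let result = "";
  for (let i = 0; i < str.length; i++) {
    result += str.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return result;
}

// Decode a string of 4-digit hex groups back into UTF-16 code units.
function fromHex16(hex) {
  const groups = hex.match(/.{4}/g) || [];
  return groups.map((g) => String.fromCharCode(parseInt(g, 16))).join("");
}

const encoded = toHex16("\u6f22\u5b57");
console.log(encoded);            // "6f225b57"
console.log(fromHex16(encoded)); // "漢字"
```

Because each code unit always takes 4 digits, a character outside the Basic Multilingual Plane (a surrogate pair) encodes to 8 digits and still decodes correctly.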
Encode String to HEX
I solved it by downloading utf8.js
https://github.com/mathiasbynens/utf8.js
and then using the String2Hex function from above:
alert(String2Hex(utf8.encode('守护村子')));
It gives me the output I want:
e5ae88e68aa4e69d91e5ad90
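If pulling in utf8.js is not an option, a similar result can probably be had with the built-in TextEncoder, which produces UTF-8 bytes directly. This is only a minimal sketch, not the String2Hex function referred to above; the name utf8HexOf is made up for illustration, and it assumes TextEncoder is available as a global (modern browsers and recent Node.js).

```javascript
// Hypothetical helper: UTF-8 encode a string and return the bytes as hex.
function utf8HexOf(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  // Two hex digits per byte, zero-padded.
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

console.log(utf8HexOf("守护村子")); // "e5ae88e68aa4e69d91e5ad90"
```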
Convert 16-bit Unicode hex string in JavaScript
String.prototype.substr() is a legacy method that is discouraged in new code, so it is best avoided. The following example indexes into the string directly instead:
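const str = "680065006C006C006F00", strLen = str.length;
let decoded = "";
// The loop counter must be declared with let; a const cannot be incremented.
for (let i = 0; i < strLen; i+=4) {
    // Swap the two bytes of each pair before parsing.
    let c = parseInt(str[i+2] + str[i+3] + str[i] + str[i+1], 16);
    decoded += String.fromCodePoint(c);
}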
Although I would define some functions to handle it instead:
const decodeUTF16BytePair = (p) => String.fromCodePoint(parseInt(p[2]+p[3]+p[0]+p[1], 16));
const decodeUTF16ByteString = (str) => {
    let decoded = "", strLen = str.length;
    if (strLen % 4 != 0) {
        throw new Error("Unexpected byte string length");
    }
    for (let i = 0; i < strLen; i+=4) {
        decoded += decodeUTF16BytePair(str.slice(i, i+4));
    }
    return decoded;
}
Because String.fromCodePoint() can coerce its argument to a number, we could also use:
const decodeUTF16ByteString = (str) => {
    let decoded = "", strLen = str.length;
    if (strLen % 4 != 0) {
        throw new Error("Unexpected byte string length");
    }
    for (let i = 0; i < strLen; i+=4) {
        decoded += String.fromCodePoint("0x"+str[i+2]+str[i+3]+str[i]+str[i+1]);
    }
    return decoded;
}
Importantly, the approach above swaps the byte order (endianness) of each input pair, but some platforms may not need this inversion.
Live Example
const decodeUTF16ByteString = (str) => {
    let decoded = "", strLen = str.length;
    if (strLen % 4 != 0) {
        throw new Error("Unexpected byte string length");
    }
    for (let i = 0; i < strLen; i+=4) {
        decoded += String.fromCodePoint("0x"+str[i+2]+str[i+3]+str[i]+str[i+1]);
    }
    return decoded;
}
const input = "680065006C006C006F00";
const output = decodeUTF16ByteString(input);
console.log({ input, output });
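Since the byte swap is only right for little-endian input, one way to handle both cases is to make the byte order a parameter. The sketch below is an assumption about how that could look, not part of the answer above; decodeUTF16Hex and its littleEndian flag are hypothetical names.

```javascript
// Hypothetical helper: decode a hex string of UTF-16 code units.
// littleEndian = true means each 4-digit group stores the low byte first,
// so the two bytes are swapped before parsing.
function decodeUTF16Hex(hex, littleEndian = true) {
  if (hex.length % 4 !== 0) {
    throw new Error("Unexpected byte string length");
  }
  let decoded = "";
  for (let i = 0; i < hex.length; i += 4) {
    const first = hex.slice(i, i + 2);
    const second = hex.slice(i + 2, i + 4);
    const unit = parseInt(littleEndian ? second + first : first + second, 16);
    decoded += String.fromCharCode(unit); // one code unit per group
  }
  return decoded;
}

console.log(decodeUTF16Hex("680065006C006C006F00"));        // "hello"
console.log(decodeUTF16Hex("00680065006C006C006F", false)); // "hello"
```

String.fromCharCode is used here because each group is a single code unit; surrogate pairs are simply appended one unit at a time and recombine in the resulting string.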
JavaScript - Encode/Decode UTF8 to Hex and Hex to UTF8
Your utf8toHex uses encodeURIComponent, which does not turn every character into hex.
So I've slightly modified your utf8toHex to handle hex.
Update
I forgot that toString(16) does not zero-pad the hex, so values less than 16 (line feeds, for example) would fail.
So I added a leading 0 and sliced to make sure.
Update 2
Use TextEncoder; it handles UTF-8 much better than charCodeAt.
function hexToUtf8(s)
{
    return decodeURIComponent(
        s.replace(/\s+/g, '') // remove spaces
         .replace(/[0-9a-f]{2}/gi, '%$&') // add '%' before each 2 hex digits (upper or lower case)
    );
}
const utf8encoder = new TextEncoder();
function utf8ToHex(s)
{
    const rb = utf8encoder.encode(s);
    let r = '';
    for (const b of rb) {
        r += ('0' + b.toString(16)).slice(-2); // zero-pad each byte to 2 digits
    }
    return r;
}
var hex = "d7a452656c6179204f4e214f706572617465642062792030353232";
var utf8 = hexToUtf8(hex);
var hex2 = utf8ToHex(utf8);
console.log("Hex: " + hex);
console.log("UTF8: " + utf8);
console.log("Hex2: " + hex2);
console.log("Is conversion OK: " + (hex == hex2));
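As a counterpart to using TextEncoder for encoding, the decoding direction could plausibly use TextDecoder instead of decodeURIComponent. The following is a sketch under that assumption; hexToUtf8Strict is a hypothetical name, and the fatal option (which makes malformed UTF-8 throw) is assumed to be supported by the runtime.

```javascript
// Hypothetical helper: parse hex into bytes, then decode the bytes as UTF-8.
function hexToUtf8Strict(hex) {
  const clean = hex.replace(/\s+/g, ""); // remove spaces
  if (clean.length % 2 !== 0 || /[^0-9a-f]/i.test(clean)) {
    throw new Error("Invalid hex string");
  }
  const bytes = new Uint8Array(clean.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  // fatal: true asks the decoder to throw on malformed UTF-8 input.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(hexToUtf8Strict("e5ae88 E5AD90")); // "守子"
```

Validating the input up front means stray non-hex characters are reported as an error rather than passed through unchanged.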
Converting unicode to hex
Note that your output call prints the two values code and codeHex with no space between them (document.write, as used in the snippet below, concatenates its arguments), so you see a seemingly big value (1040410).
So separate them like this:
console.log(code, ' ', codeHex);
and if you want nice hex formatting, do this:
console.log(code, ' ', '0x' + ('0000' + codeHex).slice(-4));
Snippet:
var code = "А".charCodeAt(0);
var codeHex = code.toString(16).toUpperCase();
document.write(code, ' ', '0x' + ('0000' + codeHex).slice(-4));
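In ES2017 and later, the zero-padding can also be written with padStart, which may read more clearly than prepending zeros and cutting off the tail. A minimal sketch, with formatHex4 as a made-up helper name:

```javascript
// Hypothetical helper: format a code unit as 0x-prefixed, 4-digit uppercase hex.
function formatHex4(code) {
  return "0x" + code.toString(16).toUpperCase().padStart(4, "0");
}

const code = "\u0410".charCodeAt(0); // Cyrillic capital letter A
console.log(code, formatHex4(code)); // 1040 0x0410
```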
How to convert decimal to hexadecimal in JavaScript
Convert a number to a hexadecimal string with:
hexString = yourNumber.toString(16);
And reverse the process with:
yourNumber = parseInt(hexString, 16);
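A short round trip shows both calls together, along with a few edge cases worth knowing about. The values are arbitrary examples chosen for illustration.

```javascript
const yourNumber = 255;
const hexString = yourNumber.toString(16);
console.log(hexString);               // "ff"
console.log(parseInt(hexString, 16)); // 255

// toString(16) does not zero-pad, so pad explicitly when a fixed width is needed.
console.log((10).toString(16).padStart(2, "0")); // "0a"

// Negative numbers keep a minus sign rather than using two's complement.
console.log((-255).toString(16)); // "-ff"

// With radix 16, parseInt accepts an optional "0x" prefix and either letter case.
console.log(parseInt("0xFF", 16)); // 255
```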
convert string or character to hex codes in javascript
As I searched and searched and tested, the only way I found was to convert character by character in a very long if-else-if statement! The result was also printed backward, so I had to reverse the string first. Then the numbers and English letters came out reversed, so I had to reverse them again! It was a headache, but it worked!
Convert hex value to unicode character
Most emojis require two code units, including that one. fromCharCode works in code units (JavaScript's "characters" are UTF-16 code units, except that invalid surrogate pairs are tolerated), not code points (actual Unicode characters).
In modern environments, you'd use String.fromCodePoint or just a Unicode code point escape sequence (\u{XXXXX} rather than \uXXXX, which is for code units). There's also no need for parseInt:
console.log(String.fromCodePoint(0x1f600));
console.log("\u{1f600}");
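The difference between code units and code points can be seen directly by going the other way, from a string to hex. This is a small sketch; codePointsToHex is a hypothetical helper name, and it relies on string iteration stepping through code points rather than code units.

```javascript
// Hypothetical helper: list the hex value of each code point in a string.
function codePointsToHex(str) {
  // Array.from iterates a string by code point, so a surrogate pair
  // arrives as one element.
  return Array.from(str, (ch) => ch.codePointAt(0).toString(16));
}

const emoji = "\u{1f600}";
console.log(emoji.length);                 // 2 (two UTF-16 code units)
console.log(codePointsToHex("A" + emoji)); // [ '41', '1f600' ]

// fromCharCode keeps only the low 16 bits, so it cannot produce the emoji.
console.log(String.fromCharCode(0x1f600) === emoji); // false
```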