PHP: Convert unicode codepoint to UTF-8
$utf8string = html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", $string), ENT_NOQUOTES, 'UTF-8');
is probably the simplest solution.
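If you would rather skip the detour through HTML entities, here is a minimal sketch of the same conversion done directly. It assumes PHP 7.2+ with the mbstring extension; the helper name codepoints_to_utf8 is made up for illustration, and accepting 4 to 6 hex digits is an assumption that goes beyond the 4-digit pattern above.

```php
<?php
// Hypothetical helper: replaces each "U+XXXX" token (4 to 6 hex digits)
// with the UTF-8 character for that code point.
function codepoints_to_utf8(string $text): string
{
    return preg_replace_callback(
        '/U\+([0-9A-Fa-f]{4,6})/',
        static function (array $m): string {
            $char = mb_chr(hexdec($m[1]), 'UTF-8');
            // mb_chr() returns false for invalid code points (for example
            // surrogates); in that case leave the token untouched.
            return $char === false ? $m[0] : $char;
        },
        $text
    );
}

echo codepoints_to_utf8('U+0411. U+0426'), PHP_EOL; // Б. Ц
```

Note that the pattern is greedy, so hex digits that directly follow a token (as in "U+0041BC") are read as part of the code point.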
Convert unicode special characters to UTF-8
At the very least your regular expression is looking for an uppercase U, while all your escape sequences use lower-case.
But your conversion script goes from JavaScript-escaped Unicode characters, to HTML entities, and back to a PHP string. This might be a saner solution (for this string):
$unicode = '\u0411. \u0426\u044d\u0446\u044d\u0433\u0441\u04af\u0440\u044d\u043d';
echo json_decode('"' . $unicode . '"');
Be careful though, as this might break if the input string contains newlines or quotes.
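If quotes or newlines in the input are a concern, one option is to decode only the \uXXXX escapes and leave everything else alone. This is a rough sketch, assuming the mbstring extension is available; decode_js_unicode_escapes is a hypothetical name. Runs of consecutive escapes are converted together as UTF-16BE so that surrogate pairs survive.

```php
<?php
// Hypothetical helper: decodes \uXXXX escapes without touching quotes,
// newlines, or any other part of the input.
function decode_js_unicode_escapes(string $text): string
{
    return preg_replace_callback(
        // One or more consecutive \uXXXX escapes, matched as a single run.
        '/(?:\\\\u[0-9A-Fa-f]{4})+/',
        static function (array $m): string {
            // Dropping the "\u" prefixes leaves the hex of UTF-16BE code units.
            $hex = str_replace('\\u', '', $m[0]);
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $text
    );
}

$unicode = '\u0411. \u0426\u044d\u0446\u044d\u0433\u0441\u04af\u0440\u044d\u043d';
echo decode_js_unicode_escapes($unicode), PHP_EOL;
```

This sketch does not treat an escaped backslash (\\u0041 in JavaScript source) specially, so it is only appropriate when every \uXXXX in the input really is an escape.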
How to Convert string to utf-8 codepoint in php
I found an answer, but it returns an array.
I edited the function to return a string.
function utf8_to_unicode($str) {
    $unicode = array();
    $values = array();   // bytes collected for the current multi-byte character
    $lookingFor = 1;     // number of bytes expected for the current character
    for ($i = 0; $i < strlen($str); $i++) {
        $thisValue = ord($str[$i]);
        if ($thisValue < 128) {
            // Single-byte (ASCII) character; uppercase to match the multi-byte branch
            $unicode[] = str_pad(strtoupper(dechex($thisValue)), 4, "0", STR_PAD_LEFT);
        } else {
            // A lead byte below 224 starts a 2-byte sequence, otherwise 3 bytes are assumed
            // (4-byte sequences, i.e. code points above U+FFFF, are not handled)
            if (count($values) == 0) $lookingFor = ($thisValue < 224) ? 2 : 3;
            $values[] = $thisValue;
            if (count($values) == $lookingFor) {
                $number = ($lookingFor == 3) ?
                    (($values[0] % 16) * 4096) + (($values[1] % 64) * 64) + ($values[2] % 64) :
                    (($values[0] % 32) * 64) + ($values[1] % 64);
                $unicode[] = str_pad(strtoupper(dechex($number)), 4, "0", STR_PAD_LEFT);
                $values = array();
                $lookingFor = 1;
            }
        }
    }
    // Concatenate the 4-digit hex code points into one string
    return implode("", $unicode);
} // utf8_to_unicode
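On newer PHP versions the byte arithmetic can be left to mbstring. The following is a sketch, not a drop-in replacement: it assumes PHP 7.4+ (for mb_str_split), valid UTF-8 input, and a made-up function name. Unlike the function above it joins the code points with a space, so that 5-digit code points above U+FFFF stay unambiguous.

```php
<?php
// Hypothetical helper: returns the code points of a UTF-8 string as
// space-separated uppercase hex, padded to at least 4 digits.
function utf8_to_codepoints(string $str): string
{
    $hex = [];
    foreach (mb_str_split($str, 1, 'UTF-8') as $char) {
        // mb_ord() gives the code point of a single character as an integer.
        $hex[] = sprintf('%04X', mb_ord($char, 'UTF-8'));
    }
    return implode(' ', $hex);
}

echo utf8_to_codepoints("a\u{0411}\u{1F600}"), PHP_EOL; // 0061 0411 1F600
```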
UTF-8 to Unicode Code Points
Converting one character set to another can be done with iconv:
http://php.net/manual/en/function.iconv.php
Note that UTF-8 is already a Unicode encoding.
Another option is htmlentities with the right character set, although it only converts characters that have a named HTML entity:
http://php.net/manual/en/function.htmlentities.php
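The iconv route mentioned above can be sketched as follows: convert the string to UTF-32BE, where every character occupies exactly four bytes, and unpack those as big-endian integers. This assumes the iconv extension is enabled and the input is valid UTF-8; codepoints_via_iconv is a hypothetical name.

```php
<?php
// Hypothetical helper: returns the code points of a UTF-8 string as integers.
function codepoints_via_iconv(string $str): array
{
    if ($str === '') {
        return [];
    }
    // UTF-32BE is fixed-width: 4 bytes per code point, no byte order mark.
    $utf32 = iconv('UTF-8', 'UTF-32BE', $str);
    if ($utf32 === false) {
        return []; // conversion failed, e.g. on invalid UTF-8
    }
    // 'N*' reads unsigned 32-bit big-endian integers until the data ends.
    return array_values(unpack('N*', $utf32));
}

// Print each code point in U+XXXX notation.
foreach (codepoints_via_iconv("A\u{0411}") as $cp) {
    printf("U+%04X\n", $cp);
}
```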
PHP Unicode codepoint to character
You don't need to convert the integer to a hexadecimal string; instead use IntlChar::chr:
echo IntlChar::chr(127468);
Directly from the docs of IntlChar::chr: "Return Unicode character by code point value"
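If the code point does arrive as a hex string, a small wrapper covers that case too. This is only a sketch: it assumes the intl extension is installed, and hex_codepoint_to_char is a name invented for this example.

```php
<?php
// Hypothetical helper: turns a hex code point such as "1F1EC" into a
// UTF-8 character using the intl extension.
function hex_codepoint_to_char(string $hex): ?string
{
    // hexdec() yields the numeric value; IntlChar::chr() returns null on failure.
    return IntlChar::chr((int) hexdec($hex));
}

$char = hex_codepoint_to_char('1F1EC');
echo $char, PHP_EOL;
// IntlChar::ord() goes the other way and returns the integer code point.
echo IntlChar::ord($char), PHP_EOL; // 127468
```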
How to convert a UTF-8 string to HEX codepoint in PHP?
I use json_encode for multibyte characters and build the escape sequence by hand for ASCII characters.
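function utf8toUnicode($str){
    $unicode = "";
    $len = mb_strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $utf8char = mb_substr($str, $i, 1);
        // Multibyte characters: let json_encode produce the \uXXXX escape.
        // Single-byte (ASCII) characters: build the escape from the byte's hex value.
        $unicode .= strlen($utf8char) > 1
            ? trim(json_encode($utf8char), '"')
            : ('\\u00' . bin2hex($utf8char));
    }
    return $unicode;
}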
$str = 'sÆs';
echo utf8toUnicode($str); // \u0073\u00c6\u0073
PHP - convert unicode to character
"%uXXXX" is a non-standard scheme for URL-encoding Unicode characters. Apparently it was proposed but never really used. As such, there's hardly any standard function that can decode it into an actual UTF-8 sequence.
It's not too difficult to do it yourself though:
$string = '%u05E1%u05E2';
$string = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $string);
echo html_entity_decode($string, ENT_COMPAT, 'UTF-8');
This converts the %uXXXX notation to HTML entity notation &#xXXXX;, which can be decoded to actual UTF-8 by html_entity_decode. The above outputs the characters "סע" in UTF-8 encoding.