PHP Unicode codepoint to character
You don't need to convert the integer to a hexadecimal string; use IntlChar::chr instead:
echo IntlChar::chr(127468);
Directly from the docs of IntlChar::chr: Return Unicode character by code point value
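As a small sketch of the same call with a hexadecimal literal, assuming the intl extension is loaded (IntlChar is part of it), the value can be round-tripped through IntlChar::ord to check it:

```php
<?php
// Assumes the intl extension is available; IntlChar is not defined without it.
$codePoint = 0x1F1EC;                // same value as decimal 127468
$char = IntlChar::chr($codePoint);   // UTF-8 encoded string for U+1F1EC
$back = IntlChar::ord($char);        // integer code point again

printf("U+%04X\n", $back);
```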
How to get the character from unicode code point in PHP?
header('Content-Type: text/html; charset=UTF-8');
function mb_html_entity_decode($string)
{
if (extension_loaded('mbstring') === true)
{
mb_language('Neutral');
mb_internal_encoding('UTF-8');
mb_detect_order(array('UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII'));
return mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
}
return html_entity_decode($string, ENT_COMPAT, 'UTF-8');
}
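// Note: PHP 7.2+ ships native mb_ord() and mb_chr(); rename these polyfills there to avoid a redeclaration error.
function mb_ord($string)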
{
if (extension_loaded('mbstring') === true)
{
mb_language('Neutral');
mb_internal_encoding('UTF-8');
mb_detect_order(array('UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII'));
$result = unpack('N', mb_convert_encoding($string, 'UCS-4BE', 'UTF-8'));
if (is_array($result) === true)
{
return $result[1];
}
}
return ord($string);
}
function mb_chr($string)
{
return mb_html_entity_decode('&#' . intval($string) . ';');
}
var_dump(hexdec('010F'));
var_dump(mb_ord('ó')); // 243
var_dump(mb_chr(243)); // ó
Unicode character in PHP string
Because JSON directly supports the \uxxxx syntax, the first thing that comes to mind is:
$unicodeChar = '\u1000';
echo json_decode('"'.$unicodeChar.'"');
Another option would be to use mb_convert_encoding():
echo mb_convert_encoding('&#x1000;', 'UTF-8', 'HTML-ENTITIES');
or make use of the direct mapping between UTF-16BE (big endian) and the Unicode codepoint:
echo mb_convert_encoding("\x10\x00", 'UTF-8', 'UTF-16BE');
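The two-byte UTF-16BE mapping only holds for code points up to U+FFFF. For an arbitrary integer code point, a minimal sketch (assuming the mbstring extension; codepoint_to_utf8 is a made-up helper name) can go through UTF-32BE, where every code point is simply a 32-bit big-endian integer:

```php
<?php
// Hypothetical helper; requires the mbstring extension.
function codepoint_to_utf8(int $codePoint): string
{
    // pack('N') writes a 32-bit big-endian integer, which is the UTF-32BE form.
    return mb_convert_encoding(pack('N', $codePoint), 'UTF-8', 'UTF-32BE');
}

echo bin2hex(codepoint_to_utf8(0x1000)), "\n";   // e18080
echo bin2hex(codepoint_to_utf8(0x1F606)), "\n";  // f09f9886
```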
How to get code point number for a given character in a utf-8 string?
Scott Reynen wrote a function to convert UTF-8 into Unicode. I found it looking at the PHP documentation.
function utf8_to_unicode( $str ) {
$unicode = array();
$values = array();
$lookingFor = 1;
for ($i = 0; $i < strlen( $str ); $i++ ) {
$thisValue = ord( $str[ $i ] );
if ( $thisValue < ord('A') ) {
// exclude 0-9
if ($thisValue >= ord('0') && $thisValue <= ord('9')) {
// number
$unicode[] = chr($thisValue);
}
else {
$unicode[] = '%'.dechex($thisValue);
}
} else {
if ( $thisValue < 128)
$unicode[] = $str[ $i ];
else {
if ( count( $values ) == 0 ) $lookingFor = ( $thisValue < 224 ) ? 2 : 3;
$values[] = $thisValue;
if ( count( $values ) == $lookingFor ) {
$number = ( $lookingFor == 3 ) ?
( ( $values[0] % 16 ) * 4096 ) + ( ( $values[1] % 64 ) * 64 ) + ( $values[2] % 64 ):
( ( $values[0] % 32 ) * 64 ) + ( $values[1] % 64 );
$number = dechex($number);
$unicode[] = (strlen($number)==3)?"%u0".$number:"%u".$number;
$values = array();
$lookingFor = 1;
} // if
} // if
}
} // for
return implode("",$unicode);
} // utf8_to_unicode
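Note that the function above only handles sequences of up to three bytes and returns a %u-escaped string. If plain integer code points are wanted instead, one possible sketch (assuming the mbstring extension; utf8_to_codepoints is a made-up name) converts to UTF-32BE and unpacks the 32-bit values:

```php
<?php
// Hypothetical helper; requires the mbstring extension.
// Expects a non-empty, valid UTF-8 string.
function utf8_to_codepoints(string $str): array
{
    // In UTF-32BE each character is one 32-bit big-endian integer.
    $utf32 = mb_convert_encoding($str, 'UTF-32BE', 'UTF-8');
    // unpack() returns a 1-based array; reindex it from 0.
    return array_values(unpack('N*', $utf32));
}

print_r(utf8_to_codepoints("a\u{F3}\u{1F606}"));
```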
Can I get the Unicode value of a character or vice versa with PHP?
function _uniord($c) {
if (ord($c[0]) >=0 && ord($c[0]) <= 127)
return ord($c[0]);
if (ord($c[0]) >= 192 && ord($c[0]) <= 223)
return (ord($c[0])-192)*64 + (ord($c[1])-128);
if (ord($c[0]) >= 224 && ord($c[0]) <= 239)
return (ord($c[0])-224)*4096 + (ord($c[1])-128)*64 + (ord($c[2])-128);
if (ord($c[0]) >= 240 && ord($c[0]) <= 247)
return (ord($c[0])-240)*262144 + (ord($c[1])-128)*4096 + (ord($c[2])-128)*64 + (ord($c[3])-128);
if (ord($c[0]) >= 248 && ord($c[0]) <= 251)
return (ord($c[0])-248)*16777216 + (ord($c[1])-128)*262144 + (ord($c[2])-128)*4096 + (ord($c[3])-128)*64 + (ord($c[4])-128);
if (ord($c[0]) >= 252 && ord($c[0]) <= 253)
return (ord($c[0])-252)*1073741824 + (ord($c[1])-128)*16777216 + (ord($c[2])-128)*262144 + (ord($c[3])-128)*4096 + (ord($c[4])-128)*64 + (ord($c[5])-128);
if (ord($c[0]) >= 254 && ord($c[0]) <= 255) // error
return FALSE;
return 0;
} // function _uniord()
and
function _unichr($o) {
if (function_exists('mb_convert_encoding')) {
return mb_convert_encoding('&#'.intval($o).';', 'UTF-8', 'HTML-ENTITIES');
} else {
return chr(intval($o));
}
} // function _unichr()
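The chr() fallback in _unichr() only gives a correct result for ASCII values. If mbstring really is unavailable, a rough sketch of an extension-free encoder could look like this (codepoint_to_utf8_manual is a made-up name, not part of the answer above):

```php
<?php
// Hypothetical helper; uses no extensions.
function codepoint_to_utf8_manual(int $cp): string
{
    // Reject values outside Unicode and the UTF-16 surrogate range.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException("Not a valid code point: $cp");
    }
    if ($cp < 0x80) {            // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(codepoint_to_utf8_manual(243)), "\n"; // c3b3
```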
PHP: Convert unicode codepoint to UTF-8
$utf8string = html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", $string), ENT_NOQUOTES, 'UTF-8');
is probably the simplest solution.
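The pattern above matches exactly four hex digits, so something like U+1F606 would not be converted correctly. A possible variant, sketched here under the assumption of PHP 7.2+ with mbstring (for the native mb_chr()), accepts four to six digits and converts each match in a callback:

```php
<?php
// Requires PHP 7.2+ with mbstring for the native mb_chr().
$string = 'Smile: U+1F606, letter: U+00F3';

$utf8string = preg_replace_callback(
    '/U\+([0-9A-Fa-f]{4,6})/',
    function (array $m): string {
        $char = mb_chr(hexdec($m[1]), 'UTF-8');
        // mb_chr() returns false for invalid code points; keep the text as-is then.
        return $char === false ? $m[0] : $char;
    },
    $string
);

echo $utf8string, "\n";
```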
Unicode codepoint escape syntax
The "\u{...}" escape only accepts literal hex digits written inside a double-quoted string, so it cannot be built from a variable; use mb_chr() instead.
Example:
$unicode= 0x1f606;
echo mb_chr($unicode);
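To make the difference concrete, here is a short sketch (assuming PHP 7.2+ with mbstring) that puts the literal escape next to the runtime conversion:

```php
<?php
// Literal escape: the hex digits must be written out in the source (PHP 7.0+).
$fromLiteral = "\u{1f606}";

// Runtime value: the code point comes from a variable, so mb_chr() is needed (PHP 7.2+).
$unicode = 0x1f606;
$fromVariable = mb_chr($unicode, 'UTF-8');

var_dump($fromLiteral === $fromVariable);
```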
PHP - convert unicode to character
"%uXXXX" is a non-standard scheme for URL-encoding Unicode characters. Apparently it was proposed but never really used. As such, there's hardly any standard function that can decode it into an actual UTF-8 sequence.
It's not too difficult to do it yourself though:
$string = '%u05E1%u05E2';
$string = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $string);
echo html_entity_decode($string, ENT_COMPAT, 'UTF-8');
This converts the %uXXXX notation to HTML entity notation &#xXXXX;, which can be decoded to actual UTF-8 by html_entity_decode. The above outputs the characters "סע" in UTF-8 encoding.
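If the input comes from JavaScript's escape(), each %uXXXX is a UTF-16 code unit, so characters outside the BMP arrive as two units (a surrogate pair). Under that assumption, and with mbstring available, a sketch that decodes whole runs as UTF-16BE might look like this (decode_percent_u is a made-up name):

```php
<?php
// Hypothetical helper; requires the mbstring extension.
// Assumes each %uXXXX is one UTF-16 code unit, as produced by JavaScript's escape().
function decode_percent_u(string $input): string
{
    return preg_replace_callback(
        '/(?:%u[0-9A-Fa-f]{4})+/',
        function (array $m): string {
            // Drop the "%u" markers, leaving raw UTF-16BE bytes in hex form.
            $utf16 = hex2bin(str_replace('%u', '', $m[0]));
            return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
        },
        $input
    );
}

echo decode_percent_u('%u05E1%u05E2'), "\n";
echo decode_percent_u('%uD83D%uDE06'), "\n";
```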
convert any string or a character in php to unicode code points
Use this function:
function Unicode_decode($text) {
return implode(unpack('H*', iconv("UTF-8", "UCS-4BE", $text)));
}
If you want the U+0000 format, use this:
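$result = '';
// Iterate over characters, not bytes, so multibyte characters stay intact.
foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $char)
{
    $wordConvert = Unicode_decode($char);
    // Note: keeping only the last four hex digits truncates code points above U+FFFF.
    $result .= "U+" . strtoupper(substr($wordConvert, -4, 4)) . "<br/>";
}
echo $result;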
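On PHP 7.2+ with mbstring, a shorter route is possible with the native mb_ord(). This is only a sketch (to_codepoint_notation is a made-up name), but it also keeps code points above U+FFFF intact:

```php
<?php
// Hypothetical helper; requires PHP 7.2+ with mbstring for the native mb_ord().
function to_codepoint_notation(string $text): array
{
    $out = [];
    // Split on character boundaries (the /u flag treats the input as UTF-8).
    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // %04X pads to at least four hex digits and grows for larger code points.
        $out[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }
    return $out;
}

echo implode(' ', to_codepoint_notation("a\u{F3}\u{1F606}")), "\n";
```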