Why does the PHP json_encode function convert UTF-8 strings to hexadecimal entities?
Since PHP 5.4.0, there is an option called JSON_UNESCAPED_UNICODE. Check it out:
https://php.net/function.json-encode
Therefore you should try:
json_encode( $text, JSON_UNESCAPED_UNICODE );
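To make the difference concrete, here is a minimal sketch comparing the default output with the flagged output. The sample value is an assumption chosen for illustration, and it is written with byte escapes so the result does not depend on the encoding of the source file.

```php
<?php
// Hypothetical sample value: "á" spelled out as its two UTF-8 bytes.
$text = "\xC3\xA1";

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($text);

// With the flag (PHP 5.4+): the UTF-8 bytes are emitted unchanged.
$plain = json_encode($text, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n"; // "\u00e1"
echo $plain, "\n";   // "á"
```

Both outputs are valid JSON for the same string, so the flag only changes how the document looks, not what it means.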
json_encode return encoded utf-8 string
Just use the JSON_UNESCAPED_UNICODE flag:
echo json_encode($str, JSON_UNESCAPED_UNICODE);
Any way to return PHP `json_encode` with encode UTF-8 and not Unicode?
{"a":"\u00e1"} and {"a":"á"} are different ways to write the same JSON document; the JSON decoder will decode the Unicode escape.
In PHP 5.4+, json_encode does have the JSON_UNESCAPED_UNICODE option for plain output. On older PHP versions, you can roll your own JSON encoder that does not escape non-ASCII characters, or use PEAR's JSON encoder and remove lines 349 to 433.
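As a quick check that the escaped and unescaped forms really are the same document, the sketch below decodes both and compares the results. The variable names are made up for the example.

```php
<?php
// The same JSON document written two ways.
$escapedJson = '{"a":"\u00e1"}';          // Unicode escape (single quotes keep the backslash literal)
$plainJson   = "{\"a\":\"\xC3\xA1\"}";    // raw UTF-8 bytes for "á"

$fromEscaped = json_decode($escapedJson, true);
$fromPlain   = json_decode($plainJson, true);

// Both decode to the same PHP array.
var_dump($fromEscaped === $fromPlain); // bool(true)
```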
json_encode returns JSON_ERROR_UTF8 when converting array to JSON
"If I try to connect to the database using charset=utf8 or charset=utf8mb4, it retrieves CartÃ£o (bad codification), instead of Cartão (good codification)"
You are using latin1 as the display encoding, so text that is correctly encoded as UTF-8 gets displayed incorrectly.
Add charset=utf8 to the connection string and also set the response charset to UTF-8:
header('Content-Type: text/html;charset=utf-8');
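To see why correct UTF-8 data can still look broken, here is a small sketch that reproduces the effect without a database. It assumes the iconv extension is available, and it reinterprets each UTF-8 byte as a separate Latin-1 character, which is roughly what a client does when it assumes the wrong charset.

```php
<?php
// "Cartão" as UTF-8 bytes: the "ã" takes two bytes (0xC3 0xA3).
$utf8 = "Cart\xC3\xA3o";

// Pretend the bytes are Latin-1 and convert them to UTF-8 for display.
// Each of the two bytes of "ã" becomes its own character.
$misread = iconv('ISO-8859-1', 'UTF-8', $utf8);

echo $misread, "\n";                              // CartÃ£o
echo strlen($utf8), ' ', strlen($misread), "\n";  // 7 9
```

The stored data is fine in this scenario; only the interpretation is wrong, which is why declaring UTF-8 on both the connection and the response fixes it.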
How to keep json_encode() from dropping strings with invalid characters
PHP does try to raise an error, but only if you turn display_errors off. This is odd because the display_errors setting is only meant to control whether errors are printed to standard output, not whether an error is triggered. I want to emphasize that when display_errors is on, even though you may see all kinds of other PHP errors, PHP doesn't just hide this error; it doesn't trigger it at all. It will not show up in any error log, and no custom error handler will be called. The error just never occurs.
Here's some code that demonstrates this:
error_reporting(-1);//report all errors
$invalid_utf8_char = chr(193);
ini_set('display_errors', 1);//display errors to standard output
var_dump(json_encode($invalid_utf8_char));
var_dump(error_get_last());//nothing
ini_set('display_errors', 0);//do not display errors to standard output
var_dump(json_encode($invalid_utf8_char));
var_dump(error_get_last());// json_encode(): Invalid UTF-8 sequence in argument
That bizarre and unfortunate behavior is related to this bug https://bugs.php.net/bug.php?id=47494 and a few others, and doesn't look like it will ever be fixed.
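A way to detect the failure that does not depend on display_errors is to inspect the return value and json_last_error(). This is a sketch that assumes PHP 5.5 or later, where json_encode() returns false on failure and json_last_error_msg() is available.

```php
<?php
// A lone 0xC1 byte is never valid UTF-8.
$invalid_utf8_char = chr(193);

$json = json_encode($invalid_utf8_char);

// Capture the error state right away, before any other json_* call resets it.
$errorCode = json_last_error();
$errorMsg  = json_last_error_msg();

if ($json === false) {
    // Encoding failed; report it however suits the application.
    echo "json_encode failed (code $errorCode): $errorMsg\n";
}
```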
workaround:
Cleaning the string before passing it to json_encode may be a workable solution.
$stripped_of_invalid_utf8_chars_string = iconv('UTF-8', 'UTF-8//IGNORE', $orig_string);
if ($stripped_of_invalid_utf8_chars_string !== $orig_string) {
    // one or more chars were invalid, and so they were stripped out.
    // if you need to know where in the string the first stripped character was,
    // then see http://stackoverflow.com/questions/7475437/find-first-character-that-is-different-between-two-strings
}
$json = json_encode($stripped_of_invalid_utf8_chars_string);
http://php.net/manual/en/function.iconv.php
The manual says //IGNORE silently discards characters that are illegal in the target charset.
So by first removing the problematic characters, in theory json_encode() shouldn't get anything it will choke on. I haven't verified that the output of iconv with the //IGNORE flag is perfectly compatible with json_encode's notion of what valid UTF-8 characters are, so buyer beware, as there may be edge cases where it still fails. Ugh, I hate character set issues.
Edit
In PHP 7.2+, there are some new flags for json_encode: JSON_INVALID_UTF8_IGNORE and JSON_INVALID_UTF8_SUBSTITUTE.
There's not much documentation yet, but for now, this test should help you understand expected behavior:
https://github.com/php/php-src/blob/master/ext/json/tests/json_encode_invalid_utf8.phpt
And in PHP 7.3+ there's the new flag JSON_THROW_ON_ERROR. See http://php.net/manual/en/class.jsonexception.php
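Here is a short sketch of how these three flags behave on a string containing one invalid byte. It assumes PHP 7.3 or later so that all of the flags exist; the input string is invented for the example.

```php
<?php
// "a", an invalid byte (0xC1), then "b".
$invalid = "a\xC1b";

// Drop the invalid byte and keep the rest.
$ignored = json_encode($invalid, JSON_INVALID_UTF8_IGNORE);

// Replace the invalid byte with U+FFFD (escaped, since JSON_UNESCAPED_UNICODE is not set).
$substituted = json_encode($invalid, JSON_INVALID_UTF8_SUBSTITUTE);

// Turn the failure into an exception instead of a false return value.
$thrownCode = null;
try {
    json_encode($invalid, JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    $thrownCode = $e->getCode(); // same value json_last_error() would report
}

echo $ignored, "\n";      // "ab"
echo $substituted, "\n";  // "a\ufffdb"
var_dump($thrownCode === JSON_ERROR_UTF8); // bool(true)
```

Which flag fits depends on whether silently losing data is acceptable; the exception variant is the only one of the three that forces the caller to notice.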
Æøå in returned JSON result - the data doesn't look like it's supposed to
What you receive in $result is a UTF-8 string that seems to represent a URL of some sort. In any case, json_encode will escape every non-ASCII character as a \uXXXX sequence, such as \u008E.
If you don't want UTF-8 characters to be escaped, this question is relevant to you: Why does the PHP json_encode function convert UTF-8 strings to hexadecimal entities?
Everything seems to work fine from what I can see. The string you provided seems to be truncated, though, but I guess that is an error on your part.
difficulty passing Japanese characters(UTF-8) via json_encode
\u611b\u77e5\u770c = 愛知県 (Aichi Prefecture)
\u611b\u5a9b\u770c = 愛媛県 (Ehime Prefecture)
Both are correct Japanese prefecture names.
So the string conversion part has no problem.
The culprit is hiding in a later phase.
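The claim that the escapes are correct can be checked by decoding them. This sketch wraps the two escaped names in a JSON array (an arrangement made up for the example) and prints the decoded strings.

```php
<?php
// The two escaped prefecture names from above, as a JSON array.
// Single quotes keep the \u sequences literal so json_decode() sees them.
$json = '["\u611b\u77e5\u770c","\u611b\u5a9b\u770c"]';

$names = json_decode($json, true);
echo implode(', ', $names), "\n"; // 愛知県, 愛媛県

// Encoding again with default flags reproduces the escaped form.
$reEncoded = json_encode($names);
var_dump($reEncoded === $json); // bool(true)
```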