PHP UTF-8 to Windows Command Line Encoding

You put me on the right track, but there was kind of a problem (I love Windows \o/):

C:\php>chcp 65001
Page de codes active : 65001
C:\php>php -c C:\WINDOWS\php.ini -f mysqldump.php | more
Mémoire insuffisante.

"Mémoire insuffisante" is French for "Not enough memory".

If I try

C:\php>chcp 1252
C:\php>php -c C:\WINDOWS\php.ini -f mysqldump.php
C:\php>ééîîïïÂÂÂÂâûü

it works. Only God knows why, but it works. Thanks for putting me on the right track!

By the way, the PHP code to go properly from UTF-8 to the command prompt is:

  echo mb_convert_encoding($utf8_string, "pass", "auto");
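
If the console is on code page 1252 (as after the chcp 1252 call above), the same conversion can also be written with explicit source and target encodings. This is only a minimal sketch: the helper name to_console_1252 is made up for illustration, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: convert a UTF-8 string to Windows-1252 bytes
// so that a console set to code page 1252 can display the accents.
function to_console_1252(string $utf8): string
{
    return mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
}

// "\xC3\xA9\xC3\xAE\xC3\xBC" is the UTF-8 byte sequence for "éîü".
// Hex escapes are used so the example does not depend on how this file is saved.
echo to_console_1252("\xC3\xA9\xC3\xAE\xC3\xBC"), PHP_EOL;
```

Characters that do not exist in Windows-1252 cannot survive this conversion, so it only suits text that fits that code page.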

How to send UTF-8 command line data from PHP to Java for correct encoding

The problem has been solved using the solution provided in the question "Unicode to PHP exec".

Everyone's help got me on the right track. It was indeed a locale issue, but not at the OS level. Instead it was with PHP's locale.

Another user had a similar issue, and it was fixed by adding the following code to the PHP script before executing the command line that calls the Java program:

$locale = 'en_US.utf-8';
setlocale(LC_ALL, $locale);
putenv('LC_ALL='.$locale);

So now, in the Java code, when I view the args[0] parameter, it is displayed correctly, and the processed text is stored in a file, then sent back to and received by the PHP script properly. It took a bit of looking up byte values and the corresponding UTF-8 encodings before I could see that PHP was translating what was a correct string just before exec() into a different string during the exec() call. During this call, the UTF-8 bytes 0xC3 0xA9 for "é" (Unicode U+00E9) were turned into 0x3F 0x3F (two ASCII question marks).
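
The byte-level check described above can be reproduced with a small hex dump. This is a sketch, not the code used in the original investigation; hex_bytes is a hypothetical helper name.

```php
<?php
// Hypothetical debugging helper: show the raw bytes of a string as
// space-separated hex pairs, to compare a value before and after exec().
function hex_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

$ok  = "\xC3\xA9"; // UTF-8 bytes for "é" (U+00E9)
$bad = "??";       // what arrived on the Java side before the locale fix

echo hex_bytes($ok), PHP_EOL;  // c3 a9
echo hex_bytes($bad), PHP_EOL; // 3f 3f
```

Dumping the argument right before the exec() call and again in the receiving program shows at which step the bytes change.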

While searching here on Stack Overflow I saw a warning not to use literals (e.g. "Présentation"), and once I traced the data back to the caller it became evident that the issue involved the actual call to exec().

Hopefully someone else who is new to Unicode processing can benefit from this information.

Thanks for everyone's input which pointed me in the right direction.

How to convert output of windows shell exec in php to utf8

The function sapi_windows_cp_get() accepts a string argument ('ansi' or 'oem'), so the console output can be converted like this:

$output = sapi_windows_cp_conv(sapi_windows_cp_get('oem'), 65001, $output);

PHP: How to configure windows shell codepage so that STDOUT of proc_open() does not garble?

TL;DR

Use the PHP function sapi_windows_cp_conv as follows.

$res = stream_get_contents($pipes[1]);
$res = sapi_windows_cp_conv(sapi_windows_cp_get('ansi'), 65001, $res);

Long Answer

The solution refers to this SO answer. In fact, PHP communicates with the default command shell of Windows (cmd.exe, pwsh.exe, ...), whose code page might be set to an ANSI code page instead of UTF-8.

First, try to modify the default codepage of cmd.exe following these steps. However, if encoding issues persist, you might need to look at the next alternative.

To enforce conversion from one codepage to another directly from PHP, use sapi_windows_cp_conv(sapi_windows_cp_get($kind), 65001, $res), where 65001 refers to UTF-8 encoding (see chcp). Please refer to the sapi_windows_cp_conv documentation here. Note that $kind needs to be specified as 'ansi' or 'oem' as per the documentation.
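
Putting the pieces together, a proc_open() call with the conversion applied could look roughly like the sketch below. The wrapper name run_utf8 is hypothetical, and the function_exists() guard is an added assumption so the same code also runs on non-Windows builds, where the sapi_windows_* functions are not available.

```php
<?php
// Hypothetical wrapper: run a command and return its STDOUT as UTF-8.
// $kind is 'ansi' or 'oem', as accepted by sapi_windows_cp_get().
function run_utf8(string $command, string $kind = 'oem'): string
{
    $spec = [1 => ['pipe', 'w']]; // capture STDOUT only
    $proc = proc_open($command, $spec, $pipes);
    if (!is_resource($proc)) {
        throw new RuntimeException("Could not start: $command");
    }
    $res = (string) stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($proc);

    // The sapi_windows_* functions exist only in Windows builds of PHP;
    // elsewhere the output is returned unchanged.
    if (function_exists('sapi_windows_cp_conv')) {
        $converted = sapi_windows_cp_conv(sapi_windows_cp_get($kind), 65001, $res);
        if ($converted !== null) {
            $res = $converted; // null signals a failed conversion
        }
    }
    return $res;
}
```

A call such as run_utf8('dir') would then return the directory listing as UTF-8; whether 'oem' or 'ansi' is the right kind depends on the program being run.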

EDIT: To set ANSI/OEM of cmd/powershell to UTF-8 at the system level, check out these steps.

ANSI encoded file converting to UTF-8 encoded file with php script?

First, "ANSI" is not a single character encoding: on Windows it means whatever legacy code page the system uses (for example Windows-1252), so you need to find out which encoding the particular file actually uses. Start by checking whether the file is already UTF-8 encoded; if it is not, convert it. Below, we check the encoding and, if successful, return the file contents.

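// $myFile holds the file contents as a string.
$output = $myFile; // already valid UTF-8: keep as-is
if( !mb_check_encoding( $myFile, 'UTF-8' ) ):
    // Assumption: the "ANSI" source is Windows-1252; use the file's real code page.
    $output = mb_convert_encoding( $myFile, 'UTF-8', 'Windows-1252' );
endif;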

Then simply check if the encoding worked.

return $output ? $output : 'Failed encoding file!';
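
For a whole file on disk, the same check-then-convert idea might be wrapped as follows. This is a sketch under assumptions: convert_file_to_utf8 is a hypothetical name, and the default source encoding Windows-1252 is only a guess that has to match the file's real code page.

```php
<?php
// Hypothetical helper: rewrite a file in place as UTF-8.
// Assumption: $from names the file's real legacy encoding.
function convert_file_to_utf8(string $path, string $from = 'Windows-1252'): bool
{
    $data = file_get_contents($path);
    if ($data === false) {
        return false; // unreadable file
    }
    if (mb_check_encoding($data, 'UTF-8')) {
        return true; // already valid UTF-8, leave it alone
    }
    $utf8 = mb_convert_encoding($data, 'UTF-8', $from);
    return file_put_contents($path, $utf8) !== false;
}
```

The function returns false only when the file cannot be read or written; it cannot detect a wrong $from value, so the result is worth checking by eye.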

