Getting Data with Utf-8 Charset from Mssql Server Using PHP Freetds Extension

Getting data with UTF-8 charset from MSSQL server using PHP FreeTDS extension

MSSQL and UTF-8 are quite a pain in the ... sometimes. I had to convert it manually.
The problem: MSSQL doesn't actually know and support UTF-8.

Convert from database value to UTF-8:

mb_detect_encoding($value, mb_detect_order(), true) === 'UTF-8' ? $value : mb_convert_encoding($value, 'UTF-8');

Converting from UTF-8 to database value:

mb_convert_encoding($value, 'UCS-2LE', mb_detect_encoding($value, mb_detect_order(), true));

Fortunately I was using Doctrine so all I had was to create a custom StringType implementation.

PHP + SQL Server - How to set charset for connection?

Client charset is necessary but not sufficient:

ini_set('mssql.charset', 'UTF-8');

I searched for two days how to insert UTF-8 data (from web forms) into MSSQL 2008 through PHP. I read everywhere that you can't, you need to convert to UCS2 first (like cypher's solution recommends).
On Windows SQLSRV said to be a good solution, which I couldn't try, since I am developing on Mac OSX.

However, FreeTDS manual (what PHP mssql uses on OSX) says to add a letter "N" before the opening quote:

mssql_query("INSERT INTO table (nvarcharField) VALUES (N'űáúőűá球最大的采购批发平台')", +xon);

According to this discussion, N character tells the server to convert to Unicode.

How to store and retrieve extended ASCII characters in MSSQL

You might try base64 encoding the input, this is fairly trivial to handle with PHP's base64_encode() and base64_decode() and it should handle what ever your users throw at it.

(edit: You can apparently also do the base64 encoding on the SQL Server side. This doesn't seem like something it should be responsible for imho, but it's an option.)

FreeTDS: How to set charset of parameters running stored procedure

After a lot of attempts, I couldn't figure out why freetds.conf settings (client charset and tds version) are not being respected. At least, when I append TDS_Version=8.0;ClientCharset=UTF-8 into the connection string, it works!

Record is stored correctly when changed the connection string


Also, the header of freetds log file is changed, mentioning UTF-8 conversion:

log.c:196:Starting log file for FreeTDS 0.91
on 2016-05-18 15:58:49 with debug flags 0x4fff.
iconv.c:330:tds_iconv_open(0xaeb19118, UTF-8)
iconv.c:353:Using trivial iconv
iconv.c:187:local name for ISO-8859-1 is ISO-8859-1
iconv.c:187:local name for UTF-8 is UTF-8
iconv.c:187:local name for UCS-2LE is UCS-2LE
iconv.c:187:local name for UCS-2BE is UCS-2BE
iconv.c:349:setting up conversions for client charset "UTF-8"
iconv.c:351:preparing iconv for "UTF-8" <-> "UCS-2LE" conversion
iconv.c:391:preparing iconv for "ISO-8859-1" <-> "UCS-2LE" conversion
iconv.c:394:tds_iconv_open: done

Related Topics

Leave a reply