Getting data with UTF-8 charset from MSSQL server using PHP FreeTDS extension

MSSQL and UTF-8 are quite a pain in the ... sometimes. I had to convert the values manually.
The problem: MSSQL doesn't actually understand UTF-8. Its Unicode column types (nchar, nvarchar) store UCS-2/UTF-16 data instead.

Convert from database value to UTF-8:

mb_detect_encoding($value, mb_detect_order(), true) === 'UTF-8' ? $value : mb_convert_encoding($value, 'UTF-8');

Converting from UTF-8 to database value:

mb_convert_encoding($value, 'UCS-2LE', mb_detect_encoding($value, mb_detect_order(), true));

Fortunately I was using Doctrine, so all I had to do was create a custom StringType implementation.
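
The two conversions above can be wrapped in a pair of helpers, for example to call from a custom Doctrine type's convertToPHPValue() and convertToDatabaseValue() methods. This is only a sketch under assumptions: the function names are hypothetical, and it swaps mb_detect_encoding() for mb_check_encoding(), because mb_detect_encoding() returns false when nothing matches and false cannot be used as a source encoding. The Windows-1252 fallback is also an assumption; use whatever encoding your server actually sends.

```php
<?php
// Hypothetical helpers; the names and the Windows-1252 fallback are assumptions.

/**
 * Database value -> UTF-8.
 * Strings that are already valid UTF-8 pass through unchanged.
 */
function dbValueToUtf8(string $value, string $fallbackEncoding = 'Windows-1252'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Not valid UTF-8: assume the fallback single-byte encoding.
    return mb_convert_encoding($value, 'UTF-8', $fallbackEncoding);
}

/**
 * UTF-8 -> database value (UCS-2LE, as in the snippet above).
 */
function utf8ToDbValue(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Expected a valid UTF-8 string.');
    }
    return mb_convert_encoding($value, 'UCS-2LE', 'UTF-8');
}
```

For example, utf8ToDbValue('A') yields the two bytes "A\x00", and dbValueToUtf8("\xE9") (a Windows-1252 "é") yields the UTF-8 bytes "\xC3\xA9".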

PHP + SQL Server - How to set charset for connection?

Client charset is necessary but not sufficient:

ini_set('mssql.charset', 'UTF-8');

I searched for two days for a way to insert UTF-8 data (from web forms) into MSSQL 2008 through PHP. Everywhere I read that you can't, and that you need to convert to UCS-2 first (as cypher's solution recommends).
On Windows, SQLSRV is said to be a good solution, but I couldn't try it since I am developing on Mac OS X.

However, the FreeTDS manual (FreeTDS is what PHP's mssql extension uses on OS X) says to add the letter "N" before the opening quote:

mssql_query("INSERT INTO table (nvarcharField) VALUES (N'űáúőűá球最大的采购批发平台')", $conn);

According to this discussion, the N prefix tells the server to treat the literal as a Unicode (nvarchar) string:
https://softwareengineering.stackexchange.com/questions/155859/why-do-we-need-to-put-n-before-strings-in-microsoft-sql-server
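
If the value comes from a web form rather than being typed into the query, the N prefix has to be added when the SQL text is built. Below is a minimal sketch, assuming a hypothetical helper named unicodeLiteral() and a hypothetical table myTable; neither is from the answer above. It doubles single quotes, which is how a quote is escaped inside a T-SQL string literal.

```php
<?php
// Hypothetical helper: wraps a UTF-8 string in an N-prefixed T-SQL literal.
function unicodeLiteral(string $value): string
{
    // A single quote inside a T-SQL string literal is escaped by doubling it.
    return "N'" . str_replace("'", "''", $value) . "'";
}

// myTable and nvarcharField are placeholder names for illustration.
$sql = 'INSERT INTO myTable (nvarcharField) VALUES ('
     . unicodeLiteral("O'Brien 球")
     . ')';
```

Here $sql ends up as INSERT INTO myTable (nvarcharField) VALUES (N'O''Brien 球'). Where the driver supports parameterized queries, those are usually a safer choice than assembling literals by hand.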

How to store and retrieve extended ASCII characters in MSSQL

You might try base64-encoding the input. This is fairly trivial to handle with PHP's base64_encode() and base64_decode(), and it should handle whatever your users throw at it.

(edit: You can apparently also do the base64 encoding on the SQL Server side. This doesn't seem like something it should be responsible for imho, but it's an option.)
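
A minimal round-trip sketch of the PHP-side approach, with variable names chosen only for illustration. The encoded value is plain ASCII, so it should pass through any connection charset unchanged.

```php
<?php
// Input with non-ASCII characters plus a raw 0xFF byte.
$original = "Ünïcödé and \xFF raw bytes";

// Encode before writing: the result contains only A-Z, a-z, 0-9, +, / and =.
$stored = base64_encode($original);

// Decode after reading; strict mode returns false on invalid input.
$restored = base64_decode($stored, true);
```

The trade-off is that the stored value is about one third larger than the input and can no longer be searched or sorted meaningfully in SQL.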

FreeTDS: How to set charset of parameters running stored procedure

After a lot of attempts, I couldn't figure out why the freetds.conf settings (client charset and tds version) were not being respected. At least when I append TDS_Version=8.0;ClientCharset=UTF-8 to the connection string, it works!

The record is stored correctly after changing the connection string:

"Driver={FreeTDS};Server=%s;Port=%s;Database=%s;UID=%s;PWD=%s;APP=%s;TDS_Version=8.0;ClientCharset=UTF-8"

Also, the header of the FreeTDS log file changes and now mentions the UTF-8 conversion:

log.c:196:Starting log file for FreeTDS 0.91
on 2016-05-18 15:58:49 with debug flags 0x4fff.
iconv.c:330:tds_iconv_open(0xaeb19118, UTF-8)
iconv.c:353:Using trivial iconv
iconv.c:187:local name for ISO-8859-1 is ISO-8859-1
iconv.c:187:local name for UTF-8 is UTF-8
iconv.c:187:local name for UCS-2LE is UCS-2LE
iconv.c:187:local name for UCS-2BE is UCS-2BE
iconv.c:349:setting up conversions for client charset "UTF-8"
iconv.c:351:preparing iconv for "UTF-8" <-> "UCS-2LE" conversion
iconv.c:391:preparing iconv for "ISO-8859-1" <-> "UCS-2LE" conversion
iconv.c:394:tds_iconv_open: done
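
The answer does not say which language fills in the %s placeholders of the connection string shown earlier. As a sketch in PHP, assuming a hypothetical helper buildFreeTdsConnectionString() and made-up example values, sprintf() can do it:

```php
<?php
// Hypothetical helper; the function and parameter names are assumptions.
function buildFreeTdsConnectionString(
    string $server,
    int $port,
    string $database,
    string $user,
    string $password,
    string $app
): string {
    // Same template as the connection string in the answer, including the
    // TDS_Version and ClientCharset keys that made the difference.
    return sprintf(
        'Driver={FreeTDS};Server=%s;Port=%s;Database=%s;UID=%s;PWD=%s;APP=%s;'
        . 'TDS_Version=8.0;ClientCharset=UTF-8',
        $server,
        $port,
        $database,
        $user,
        $password,
        $app
    );
}

// Example values are placeholders, not real credentials.
$connectionString = buildFreeTdsConnectionString(
    'db.example.com', 1433, 'mydb', 'user', 'secret', 'myapp'
);
```

This sketch does no escaping, so values containing ";" or "}" would need extra handling before being placed in the string.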
