Freetds and Unixodbc Character Converting

FREETDS and UNIXODBC character converting

You need to set

clientcharset = UTF-8

in your freetds.conf file and read it from odbc.ini by setting the Servername option to the connection definition in freetds.com. See this explanation and my blog post.

Difference between FreeTDS and unixodbc?

unixODBC is a 'DriverManager' for ODBC. You can use unixODBC when on a Linux or *nix system to connect to any ODBC-capable database. Doing so means that you can write one lot of database queries which you should be able to use between different databases. If you were not on Unix, you would use a different Driver Manager, for example the built-in MS Office one.

To make all the components clear: if you're using a language, let's say Python, to connect to SQL Server, your connection might pass from Python's pyodbc (translates python objects to and from unixODBC), to unixODBC (manages drivers, such as FreeTDS), to FreeTDS (translates unixODBC objects to and from the TDS protocol, which Microsoft embraces) to SQL Server.

The unixODBC website http://www.unixodbc.org/ says:

An ODBC application makes ODBC calls to the DriverManager. The DriverManager carries out a number of tasks for the application such as:

  • ensuring the proper driver is loaded/unloaded
  • validation tasks
  • 3.5 to 3.0 to 2.0 call and data mapping

Most calls to the DriverManager get passed onto the loaded Driver to be further processed but that is of little concern to the application.

Some advantages to using an ODBC DriverManager include:

  • portable data access code
  • runtime binding to a Data Source
  • ability to easily change the Data Source

Briefly, it is the Driver Manager which reads your DSN, looks at the configured data sources, and decides where and how to connect.

Depending on which database you use, you will need a different driver. This piece of code 'translates' your requests made using ODBC to the right protocol for the relevant database management system. This is the component that would need to be different for different data sources. In your case, TDS is the protocol used by MS SQL Server. FreeTDS is a free software implementation of this protocol.

See also Wikipedia https://en.wikipedia.org/wiki/Open_Database_Connectivity (emphasis kept):

ODBC accomplishes DBMS independence by using an ODBC driver as a translation layer between the application and the DBMS. The application uses ODBC functions through an ODBC driver manager with which it is linked, and the driver passes the query to the DBMS. An ODBC driver can be thought of as analogous to a printer driver or other driver, providing a standard set of functions for the application to use, and implementing DBMS-specific functionality. An application that can use ODBC is referred to as "ODBC-compliant". Any ODBC-compliant application can access any DBMS for which a driver is installed. Drivers exist for all major DBMSs, many other data sources like address book systems and Microsoft Excel, and even for text or CSV files.

Issue with returning Cyrillic symbols from MSSQL via unixODBC and FreeTDS

Ok, I have made all this modules chain work:

MSSQL <-> FreeTDS <-> unixODBC <-> pyodbc <-> django-pyodbc

I just have added 'unicode_results':True in DATABASES options in django settings:

DATABASES = {
'default': {
'ENGINE': 'sql_server.pyodbc', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
'NAME': 'name', # Or path to database file if using sqlite3.
'USER': 'user', # Not used with sqlite3.
'PASSWORD': 'pwd', # Not used with sqlite3.
'HOST': 'server-name', # Set to empty string for localhost. Not used with sqlite3.
'PORT': 'port', # Set to empty string for default. Not used with sqlite3.
'OPTIONS': {
'unicode_results':True,
'driver': 'FreeTDS',
'host_is_server': True,
'extra_params': 'TDS_VERSION=8.0'
}

But pyodbc and isql still doesn't work correctly - maybe I have missed other unicode-specific parameters. Going to check how odbc and pyodbc use this unicode_results parameter later. Anyway site now able to show Cyrillic symbols.

FreeTDS: How to set charset of parameters running stored procedure

After a lot of attempts, I couldn't figure out why freetds.conf settings (client charset and tds version) are not being respected. At least, when I append TDS_Version=8.0;ClientCharset=UTF-8 into the connection string, it works!

Record is stored correctly when changed the connection string

"Driver={FreeTDS};Server=%s;Port=%s;Database=%s;UID=%s;PWD=%s;APP=%s;TDS_Version=8.0;ClientCharset=UTF-8"

Also, the header of freetds log file is changed, mentioning UTF-8 conversion:

log.c:196:Starting log file for FreeTDS 0.91
on 2016-05-18 15:58:49 with debug flags 0x4fff.
iconv.c:330:tds_iconv_open(0xaeb19118, UTF-8)
iconv.c:353:Using trivial iconv
iconv.c:187:local name for ISO-8859-1 is ISO-8859-1
iconv.c:187:local name for UTF-8 is UTF-8
iconv.c:187:local name for UCS-2LE is UCS-2LE
iconv.c:187:local name for UCS-2BE is UCS-2BE
iconv.c:349:setting up conversions for client charset "UTF-8"
iconv.c:351:preparing iconv for "UTF-8" <-> "UCS-2LE" conversion
iconv.c:391:preparing iconv for "ISO-8859-1" <-> "UCS-2LE" conversion
iconv.c:394:tds_iconv_open: done

Issue with saving cyrillic symbols to MSSQL via unixODBC and FreeTDS

I finally solved a problem.
The correct database configuration in settings.py file is:

DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'NAME': 'db',
'USER': 'user',
'PASSWORD': 'pass',
'HOST': '192.168.11.11',
'PORT': '3306',
},
'mssql': {
'ATOMIC_REQUESTS': True,
'NAME': 'DB',
'ENGINE': 'django_pyodbc',
'USER': 'admin',
'PASSWORD': 'pass',
'OPTIONS': {
'dsn': 'DBDSN',
'host_is_server': True,
'collation': 'Cyrillic_General_CI_AS',
'driver_supports_utf8': True,
'unicode_results': True,
'autocommit': True,
},
}
}

/etc/freetds/freetds.conf

[global]
tds version = 7.3
text size = 64512

[SRV]
host = mssql.local
port = 1433

/etc/odbc.ini

[ODBC Data Sources]
DBDSN = Microsoft SQL Server

[DBDSN]
Driver = FreeTDS
Servername = SRV
Trace = No
Database = DB

/etc/odbcinst.ini

[FreeTDS]
Description = FreeTDS
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
UsageCount = 1

The important point is that odbc.ini and odbsinst.ini files should NOT contain spaces or tabs in the begining of the lines.
These settings allows cyrillic symbols to be normally inserted into database.

ODBC query on MS SQL Server returning first 255 characters only in PHP PDO (FreeTDS)

According to the FreeTDS User Guide, the issue seems to be that FreeTDS can only handle varchar up to 255 characters when talking to SQL Server "due to limitations inherent in the protocol definition". Anything bigger than that needs to be data type text.

You can resolve the issue either by modifying your schema accordingly, or converting the data type during your query, like this:

SELECT CAST(mycol as TEXT) FROM mytable


Related Topics



Leave a reply



Submit