PHP PDO: Charset, SET NAMES

PHP PDO: charset, set names?

You'll have it in your connection string like:

"mysql:host=$host;dbname=$db;charset=utf8mb4"

HOWEVER, prior to PHP 5.3.6, the charset option was ignored. If you're running an older version of PHP, you must do it like this:

$dbh = new PDO("mysql:host=$host;dbname=$db", $user, $password);
$dbh->exec("set names utf8mb4");
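
If the same code has to run on both sides of 5.3.6, the choice can be wrapped in a small helper. This is only a sketch: the function name mysqlCharsetPlan and the shape of its return value are made up for illustration, and it only decides what to do without opening a connection.

```php
<?php
// Hypothetical helper (not part of PDO): decides how to set the connection
// charset depending on the PHP version, without opening a connection.
function mysqlCharsetPlan($host, $db, $charset = 'utf8mb4', $phpVersion = PHP_VERSION)
{
    // Allow only simple charset names so the value is safe to put in SQL.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }

    // Since PHP 5.3.6 the charset option in the DSN is honoured.
    $dsnSupportsCharset = version_compare($phpVersion, '5.3.6', '>=');

    $dsn = "mysql:host=$host;dbname=$db";
    if ($dsnSupportsCharset) {
        $dsn .= ";charset=$charset";
    }

    return array(
        'dsn' => $dsn,
        // Statement to run with $dbh->exec() after connecting, or null if not needed.
        'fallback_sql' => $dsnSupportsCharset ? null : "SET NAMES $charset",
    );
}
```

To use it, pass $plan['dsn'] to new PDO(...) and, if $plan['fallback_sql'] is not null, run it with $dbh->exec() right after connecting.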

Is PDO ... SET NAMES utf8 dangerous?

Are you really still using a PHP version older than 5.3.6?

Assuming you have 5.3.6 or later...

The PHP manual pages "Character sets" and "PDO_MYSQL DSN" say that you should use

$pdo = new PDO("mysql:host=localhost;dbname=mydb;charset=utf8",
'my_user', 'my_pass');

They also imply (not clearly enough) that utf8 should be replaced by utf8mb4 if appropriate.

PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8' is not as good, but was the alternative before 5.3.6.

I think "dangerous" is too strong a word, even pre-5.3.6.

A related technique: using init_command = SET NAMES ... in my.cnf is bad because init_command is not executed for users who have the SUPER privilege, which typically includes root.

utf8mb4 is the preferred CHARACTER SET for UTF-8 because it includes Emoji and some Chinese characters that were missing from utf8. That charset is available starting with MySQL version 5.5.3.
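
To make the difference concrete: MySQL's utf8 stores at most 3 bytes per character, while code points above U+FFFF (Emoji among them) take 4 bytes in UTF-8. A quick sketch for checking whether a string needs utf8mb4; the function name needsUtf8mb4 is made up for this example.

```php
<?php
// Hypothetical helper: true if $text contains a code point above U+FFFF,
// which UTF-8 encodes in 4 bytes and MySQL's 3-byte utf8 cannot store.
function needsUtf8mb4($text)
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$emoji = "\xF0\x9F\x98\x80"; // U+1F600, written as raw UTF-8 bytes
$plain = "caf\xC3\xA9";      // "cafe" with an accented e (U+00E9, 2 bytes)

var_dump(strlen($emoji));        // int(4): four bytes for one character
var_dump(needsUtf8mb4($emoji));  // bool(true)
var_dump(needsUtf8mb4($plain));  // bool(false)
```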

How to specify collation with PDO without SET NAMES?

Here is a two-in-one answer.

You can set this in the DSN or as MYSQL_ATTR_INIT_COMMAND (connection options).

The DSN is better, I think.

$connect = new PDO(
    "mysql:host=$host;dbname=$db;charset=utf8",
    $user,
    $pass,
    array(
        PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
    )
);

If you specify UTF-8 you are working with the default collation of utf8_general_ci, unless your db table or field uses something different.

If you want the whole server to respond with this default collation then use configuration directives:

collation_server=utf8_unicode_ci
character_set_server=utf8

That way you don't have to specify it on every connection.

The collation affects the sorting and comparison of characters, and it is set on the tables and fields in your database.
These settings are respected when querying the table. Make sure they are set.
Use SET NAMES with UTF-8 together with the collation set in your db.


Your comment:

"People should know char set and collation are 2 different things."

Let's quote the MySQL manual to prove this:

A SET NAMES 'charset_name' statement is equivalent to these three
statements:

SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;

Setting character_set_connection to charset_name also implicitly sets collation_connection to the default collation for
charset_name.

My answer: It works implicitly, unless your tables change this explicitly.
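
If the implicit default is not the collation you want on the connection, SET NAMES also accepts an optional COLLATE clause. A minimal sketch that builds such a statement; the function name buildSetNames is hypothetical, and utf8_unicode_ci is just the collation used elsewhere in this answer.

```php
<?php
// Hypothetical helper: builds a SET NAMES statement, optionally with COLLATE.
function buildSetNames($charset, $collation = null)
{
    // Accept only plain identifiers, since the values are placed into SQL.
    foreach (array($charset, $collation) as $name) {
        if ($name !== null && !preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Invalid name: $name");
        }
    }

    $sql = "SET NAMES '$charset'";
    if ($collation !== null) {
        $sql .= " COLLATE '$collation'";
    }
    return $sql;
}

echo buildSetNames('utf8', 'utf8_unicode_ci'), "\n";
// prints: SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'
```

The resulting string could be passed as PDO::MYSQL_ATTR_INIT_COMMAND or run with exec() after connecting.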


Question from comment:

How to make sure I don't mess things up as my tables are not the
default collation utf8_general_ci?

Example: Column collation overrides table collation

CREATE TABLE t1
(
    col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci
) CHARACTER SET latin1 COLLATE latin1_bin;

If both CHARACTER SET X and COLLATE Y are specified on a column, character set X and collation Y are used. The column has character set utf8 and collation utf8_unicode_ci as specified in the table column, while the table is in latin1 + latin1_bin.

Example: in general table collation is used

If a collation is not explicitly specified on a column (field), then the table collation is used:

CREATE TABLE t1
(
    col1 CHAR(10)
) CHARACTER SET latin1 COLLATE latin1_bin;

col1 has collation latin1_bin.

If you want the utf8_unicode_ci collation, set it on your tables in general or on the individual columns/fields.
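
For an existing table, one way to do that is ALTER TABLE ... CONVERT TO CHARACTER SET, which changes the table default and converts its text columns. A rough sketch that only builds the statement; the function name buildConvertTable is made up here, and you would want to try the statement on a copy of the table first.

```php
<?php
// Hypothetical helper: builds an ALTER TABLE statement that converts a table
// (and its text columns) to the given character set and collation.
function buildConvertTable($table, $charset, $collation)
{
    // Identifiers cannot be bound as parameters, so validate them strictly.
    foreach (array($table, $charset, $collation) as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Invalid identifier: $name");
        }
    }
    return "ALTER TABLE `$table` CONVERT TO CHARACTER SET $charset COLLATE $collation";
}

echo buildConvertTable('t1', 'utf8', 'utf8_unicode_ci'), "\n";
// prints: ALTER TABLE `t1` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci
```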

PHP What is the default charset for pdo mysql

The option character_set_client is what MySQL uses for the character set of queries and data that the client sends.

The default is utf8 in MySQL 5.5, 5.6, and 5.7, and utf8mb4 in 8.0.

It can also be changed globally in your my.cnf options file, or per session by a SET NAMES statement.

It's good to set the option explicitly when you connect, so you don't have to assume its default value.


Re your comment:

I'm afraid you're confusing two different cases of SQL injection. There is a risk when using those specific five character sets, but it is not related to second-order SQL injection.

The character set risk is due to some multi-byte character sets. It's common to insert a backslash to escape a literal quote character. But in some character sets, the backslash byte gets merged into the preceding byte, forming a multi-byte character. That leaves the quote unescaped.
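
Here is a byte-level illustration of that merging, using GBK as the example character set. No database is involved, and addslashes() only stands in for a naive escaping step that does not know the connection charset.

```php
<?php
// Byte-level illustration only. addslashes() stands in for a naive,
// charset-unaware escaping step; no database is involved.
$input   = "\xBF\x27";          // byte 0xBF followed by a single quote (0x27)
$escaped = addslashes($input);  // inserts a backslash (0x5C) before the quote

echo bin2hex($escaped), "\n";   // prints: bf5c27

// Read as GBK, the bytes 0xBF 0x5C form one two-byte character,
// so the trailing 0x27 is left as a bare, unescaped quote.
```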

Second-order SQL injection is totally different. It can occur with any character set. This is when an attacker adds data to your database through legitimate means, like filling out a form. Inserting the data is handled without error. But the values they insert contain syntax designed to exploit some later SQL query.

It relies on developers believing that data that has already been saved safely to their database is somehow "safe" for use without proper parameterization.

An example of second-order SQL injection that is merely accidental instead of malicious could be that a person has the last name "O'Reilly," and the name is read by the code and used in a subsequent query.

$name = $db->query("SELECT last_name FROM people WHERE id = 123")->fetchColumn();
$sql = "SELECT * FROM accounts WHERE account_owner_last_name = '$name'";

If the name contains a literal apostrophe, it would mess up the second query in that example.
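
The fix is the same as for any other input: bind the value as a parameter. The sketch below uses an in-memory SQLite database purely so that it runs without a server (it assumes the pdo_sqlite driver is installed); the same prepare() and execute() calls apply to a MySQL connection. The table layout is invented to match the example above.

```php
<?php
// Sketch using an in-memory SQLite database so it runs without a server;
// the same prepare()/execute() calls apply to a MySQL connection.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec("CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT)");
$db->exec("CREATE TABLE accounts (id INTEGER PRIMARY KEY, account_owner_last_name TEXT)");

// The initial inserts are handled safely with bound parameters.
$db->prepare("INSERT INTO people (id, last_name) VALUES (?, ?)")
   ->execute(array(123, "O'Reilly"));
$db->prepare("INSERT INTO accounts (account_owner_last_name) VALUES (?)")
   ->execute(array("O'Reilly"));

// Data read back from the database is still untrusted input for the next query.
$name = $db->query("SELECT last_name FROM people WHERE id = 123")->fetchColumn();

// Bind it as a parameter instead of interpolating it into the SQL string.
$stmt = $db->prepare("SELECT * FROM accounts WHERE account_owner_last_name = ?");
$stmt->execute(array($name));
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

echo count($rows), "\n"; // prints: 1
```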

How to make PDO run SET NAMES utf8 each time I connect, In ZendFramework

Itay,

A very good question. Fortunately for you the answer is very simple:

database.params.driver_options.1002 = "SET NAMES utf8"

1002 is the value of the constant PDO::MYSQL_ATTR_INIT_COMMAND.

You can't use the constant in the config.ini, so the numeric value is used instead.

PDO charset from iso-8859-2 to UTF8

Prior to PHP 5.3.6, this is how you should set the connection character set:

$options = array(
    PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',
);

$pdo = new PDO('mysql:host=localhost;dbname=xx', 'xxx', 'xxx', $options);

Note that the encoding must be ASCII compatible.

PHP, PDO and legacy DB - change charset on the fly?

When using PDO to connect to MySQL it is wise to explicitly set the character set to utf8 (only if utf8 is actually the charset you use, of course). With the MySQL or MySQLi extension I would normally execute the query SET NAMES utf8 to set it.

In PDO the charset can be specified in the connection string:

$conn = new PDO("mysql:host=$host;dbname=$db;charset=utf8", $user, $pass);

The charset option is only used since PHP 5.3.6, so take this into account when running an older version of PHP. In that case you should run the following statement after constructing the PDO object:

$conn->exec('SET NAMES utf8');

But you shouldn't be running such an old version of PHP anyway.

PDO connection : UTF-8 declaration with SET NAMES / CHARACTER SET?

Setting it in the DSN is the only proper way (although it is only supported since PHP 5.3.6).

You can use this one and SET NAMES at the same time.

All the other ways will make the infamous, half-fictional GBK injection possible.

Please note that your setting for error_reporting() is utterly wrong. It has to be -1, unconditionally.
If you are concerned about displaying errors, there is a proper ini setting for that, called display_errors, which can be set at runtime.

error_reporting, on the other hand, sets the level of errors that get reported and should be at the maximum all the time.
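
Put together, that could look something like the sketch below. The function name configureErrors and its $showErrors flag are made up for illustration, and turning on log_errors is my own addition rather than something required by the above.

```php
<?php
// Hypothetical bootstrap helper: always report every error, and let a
// separate switch decide whether errors are shown to the visitor.
function configureErrors($showErrors)
{
    error_reporting(-1);                                 // report everything, unconditionally
    ini_set('display_errors', $showErrors ? '1' : '0');  // display is a separate setting
    ini_set('log_errors', '1');                          // assumption: keep a log either way
}

configureErrors(false); // e.g. on a production server
```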


