Accented Characters in MySQL Table

Accented characters in mySQL table

I experienced that same problem before, and what I did are the following

1) Use notepad++(can almost adapt on any encoding) or eclipse and be sure in to save or open it in UTF-8 without BOM.

2) set the encoding in PHP header, using header('Content-type: text/html; charset=UTF-8');

3) remove any extra spaces on the start and end of my PHP files.

4) set all my table and columns encoding to utf8mb4_general_ci or utf8mb4_unicode_ci via PhpMyAdmin or any mySQL client you have. A comparison of the two encodings are available here

5) set mysql connection charset to UTF-8 (I use PDO for my database connection )

  PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET utf8"

or just execute the SQL queries before fetching any data

6) use a meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

7) use a certain language code for French
<meta http-equiv="Content-language" content="fr" />

8) change the html element lang attribute to the desired language

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">

and will be updating this more because I really had a hard time solving this problem before because I was dealing with Japanese characters in my past projects

9) Some fonts are not available in the client PC, you need to use Google fonts to include it on your CSS

10) Don't end your PHP source file with ?>

NOTE:

but if everything I said above doesn't work, try to adjust your encoding depending on the character-set you really want to display, for me I set everything to SHIFT-JIS to display all my japanese characters and it really works fine. But using UFT-8 must be your priority

Accented characters and MySQL searching

Here are some results from my tests. You can compare to yours:

CREATE TABLE `test` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(32) default NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8

Table contents:

mysql> select * from test;
+----+---------+
| id | name |
+----+---------+
| 1 | bla |
| 2 | blater |
| 3 | bláter |
| 4 | bhei |
+----+---------+
4 rows in set (0.00 sec)

Search results;

mysql> select * from test where name like '%bla%';
+----+---------+
| id | name |
+----+---------+
| 1 | bla |
| 2 | blater |
| 3 | bláter |
+----+---------+
3 rows in set (0.00 sec)

Search with accent:

mysql> select * from test where name like '%blá%';;
+----+---------+
| id | name |
+----+---------+
| 3 | bláter |
+----+---------+
1 row in set (0.00 sec)

I get the same results even with COLLATE=utf8_hungarian_ci

how to deal with accents and strange characters in a database?

Collation affects text sorting only, it has no effect on actual character set of stored data.

I would recommend this configuration:

  1. Set the character set for the whole DB only, so you don't have to set it for each table separately. Character set is inherited from DB to tables to columns. Use utf8 as the character set.

  2. Set the character set for the DB connection. Execute these queries after you connect to the database:

    SET CHARACTER SET 'utf8'
    SET NAMES 'utf8'
  3. Set the character set for the page, using HTTP header and/or HTML meta tag. One of these is enough. Use utf-8 as the charset.

This should be enough.

If you want to have proper sorting of Spanish strings, set collation for the whole database. utf8_spanish_ci should work (ci means Case Insensitive). Without proper collation, accented Spanish characters would be sorted always last.

Note: it's possible that the character set of data you already have in a table is broken, because you character set configuration was wrong previously. You should check it using some DB client first to exclude this case. If it's broken, just re-insert your data with the right character set configuration.

How does character set work in a database

  • objects have a character set attribute, which can be set explicitly or it's inherited (server > database > table > column), so the best option is to set it for the whole database

  • client connection has also a character set attribute and it's telling the database in which encoding you're sending the data

If client connection's and target object's character sets are different, the data you're sending to the database are automatically converted from the connection's character set to the object's character set.

So if you have for example the data in utf8, but client connection set to latin1, the database will break the data, because it'll try to convert utf8 like it's latin1.

Accented characters stored in MySQL database

Maybe you could take a look to utf8_encode() and utf8_decode()

Replace accented letters in mysql column not working

Follow this -> Accented characters in mySQL table

I think that's what you are needing to do for your DB to accept accented characters.

How to conduct an Accent Sensitive search in MySql

If your searches on that field are always going to be accent-sensitive, then declare the collation of the field as utf8_bin (that'll compare for equality the utf8-encoded bytes) or use a language specific collation that distinguish between the accented and un-accented characters.

col_name varchar(10) collate utf8_bin

If searches are normally accent-insensitive, but you want to make an exception for this search, try;

WHERE col_name = 'abád' collate utf8_bin

Update for MySQL 8.0, plus addressing some of the Comments and other Answers:

  • The CHARACTER SET matches the beginning of the COLLATION.
  • Any COLLATION name ending in _bin will ignore both upper/lower case and accents. Examples: latin1_bin, utf8mb4_bin.
  • Any COLLATION name containing _as_ will ignore accents, but do case folding or not based on _ci vs _cs.
  • To see the collations available (on any version), do SHOW COLLATION;.
  • utf8mb4 is now the default charset. You should be using that instead of utf8.
  • It is better to have the CHARACTER SET and COLLATION set 'properly' on each column (or defaulted by the table definition) than to dynamically use any conversion routine such as CONVERT().

How to filter a column with Non-accented characters using select query

The utf8_bin collation is what you need for your requirement to handle accents

I don't want to use 'BINARY' OR 'COLLATE utf8_bin' because it returns only case sensitive search.

This is easier (and more performant) to solve with utf8_bin than solving the accent issue with another collation

SELECT * FROM test WHERE LOWER(name) like '%aa%' COLLATE utf8_bin

-> added after comments

The query above assumes that the query parameters are minuscule but if you cant modify the params to always be minuscules then you can also use this variation

SELECT * FROM test WHERE LOWER(name) like LOWER('%ÚÙ%') COLLATE utf8_bin

Mysql & PHP - Query doesn't find keywords with accented characters even under utf8_general_ci collation

You ned to set the correct charset.

mysql_set_charset('utf8');
mysqli_set_charset('utf8');

etc. If everything still fails, try use RLIKE without %% instead of LIKE.

How to insert special characters in MySQL

Almost all the characters you list can be directly inserted in to a text or char field without problems.

Depending on the quote character you use you only need to "escape" the same character, so when using single quotes ' you don't have to escape double quotes ". With the quote you can either escape it with a back slash \ or double up the character ''.

Back slash characters also need to be escaped or doubled up.

Here's an example of how to handle the different characters: https://www.db-fiddle.com/f/qiV1AsQdRYLZkUE4HAzWgu/1

create table accident_report (id integer, description text);

insert accident_report(id, description) values
(1, '@#$%^&*()_+[{}]|:;"<>,.?/'),
(2, '\\'),
(3, '\''),
(4, '''');

select * from accident_report;

Interestingly the markdown code colouring doesn't quite get it right with the double quoted characters.



Related Topics



Leave a reply



Submit