Save Accents in MySQL Database

Personally, I solved the same issue by adding this line right after the MySQL connection code (this uses the legacy mysql extension, which was removed in PHP 7):

mysql_set_charset("utf8");

or for mysqli:

mysqli_set_charset($conn, "utf8");

or the mysqli OOP equivalent:

$conn->set_charset("utf8");

And sometimes you'll also have to define PHP's internal charset by adding this code:

mb_internal_encoding('UTF-8');

On the client HTML side, you have to add the following meta tag:

<meta http-equiv="Content-type" content="text/html;charset=utf-8" />

To return JSON AJAX results (e.g. when using jQuery), you should set the header before printing the encoded data:

header("Content-Type: application/json; charset=utf-8");
// json_encode() only returns the string, so it has to be echoed
echo json_encode($some_data);

This should do the trick

Accented characters stored in MySQL database

Maybe you could take a look at utf8_encode() and utf8_decode(), which convert between ISO-8859-1 and UTF-8 (both are deprecated as of PHP 8.2).

how to deal with accents and strange characters in a database?

Collation affects only how text is compared and sorted; it has no effect on the actual character set of stored data.

I would recommend this configuration:

  1. Set the character set for the whole DB only, so you don't have to set it for each table separately. Character set is inherited from DB to tables to columns. Use utf8 as the character set.

  2. Set the character set for the DB connection. Execute these queries after you connect to the database:

    SET CHARACTER SET 'utf8'
    SET NAMES 'utf8'

  3. Set the character set for the page, using HTTP header and/or HTML meta tag. One of these is enough. Use utf-8 as the charset.

This should be enough.

If you want proper sorting of Spanish strings, set the collation for the whole database. utf8_spanish_ci should work (ci means case-insensitive). Without a proper collation, accented Spanish characters may always be sorted last.

Note: it's possible that the data you already have in a table is broken, because your character set configuration was wrong previously. You should check it with a DB client first to rule this out. If it's broken, just re-insert your data with the right character set configuration.

How does character set work in a database

  • objects have a character set attribute, which can be set explicitly or inherited (server > database > table > column), so the best option is to set it for the whole database

  • the client connection also has a character set attribute, which tells the database in which encoding you're sending the data

If the client connection's and the target object's character sets are different, the data you're sending to the database is automatically converted from the connection's character set to the object's character set.

So if, for example, your data is in utf8 but the client connection is set to latin1, the database will break the data, because it will convert the utf8 bytes as if they were latin1.
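
The mismatch described above can be reproduced without a database. The following is a minimal Python sketch (Python is used here only for illustration; the function name store_via_connection is hypothetical, and Python's latin-1 codec is an assumed stand-in for MySQL's latin1) of what happens when UTF-8 bytes arrive over a connection declared as latin1 and are then converted for a utf8 column:

```python
def store_via_connection(client_bytes: bytes, connection_charset: str,
                         column_charset: str = "utf-8") -> bytes:
    """Mimic the server-side conversion: interpret the incoming bytes using the
    connection's character set, then re-encode them in the column's character set."""
    text_as_seen_by_server = client_bytes.decode(connection_charset)
    return text_as_seen_by_server.encode(column_charset)


original = "ó"
sent = original.encode("utf-8")   # the client really sends UTF-8: b'\xc3\xb3'

# Connection correctly declared as UTF-8: the bytes are stored unchanged.
good = store_via_connection(sent, "utf-8")

# Connection wrongly declared as latin1: each UTF-8 byte is treated as a
# separate latin1 character and re-encoded, so the data ends up double encoded.
bad = store_via_connection(sent, "latin-1")

print(good)                  # b'\xc3\xb3'
print(bad)                   # b'\xc3\x83\xc2\xb3'
print(bad.decode("utf-8"))   # Ã³
```

The last line is the familiar two-character garbage that shows up in place of a single accented letter.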

how to store accent marks over characters in my database

UTF-8 is (generally) a “safe” encoding for any character set in the world. (Not always the most efficient, and there are some arguments to be made that Unicode under-represents the CJK scripts with its “unified han” model, but moving on…)

However, it's likely that your interface program(s) are not translating to/from UTF-8 properly. For example, ó => Ã³ looks like the UTF-8 data (where one character can be spread across a varying number of bytes) is being presented to you using a single-byte European encoding, like ISO-8859-15 or Windows-1252 (CP-1252) or similar.

You are probably storing the data correctly, but loading it incorrectly. If you're just using the mysql terminal program or similar, make sure that your terminal is set to use UTF-8 (on a Unix/Linux system, locale should probably be something ending in .utf8, e.g. mine has LANG=en_US.utf8)

If you're pulling data using a GUI tool or similar, check its Settings/Preferences panel for the character set.

If you're getting the mistranslated characters back into an application you've written, look at your language's tools for setting the locale. (Perhaps, the INSERT routines have it right, but the SELECT routines have it wrong?)

And, if this is being sent to the Web, make sure your (XML|HTML|XHTML) files have charset=utf-8 declared in the appropriate place(s), or translate back from UTF-8 to the character set of your document (if possible) using something like iconv when inserting text from the database. (Most non-Unicode character sets can only represent a subset of Unicode, of course; e.g. the ISO-8859-15 set does a decent job at covering European languages, but has no support for Cyrillic, Arabic, or CJK writing systems, so it's possible to fail to translate a character.) In Perl, you can pass arguments to open or use binmode to set up a transparent character set translation layer on a "filehandle" stream.
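
The "possible to fail to translate a character" case is easy to see in a small sketch. This one uses Python's built-in codecs as an assumed stand-in for iconv, and the helper name to_document_charset is hypothetical:

```python
def to_document_charset(text: str, charset: str) -> bytes:
    """Encode text for a non-Unicode document. Raises UnicodeEncodeError when a
    character has no representation in the target character set."""
    return text.encode(charset)


# A Western European word fits into ISO-8859-15.
print(to_document_charset("café", "iso-8859-15"))   # b'caf\xe9'

# Cyrillic text does not, so the translation fails.
try:
    to_document_charset("Москва", "iso-8859-15")
except UnicodeEncodeError as exc:
    print("cannot translate:", exc.object[exc.start:exc.end])

# One possible fallback: substitute a placeholder instead of failing.
print("Москва café".encode("iso-8859-15", errors="replace"))
```

Whether to fail loudly or to substitute a placeholder is a design choice; failing is usually safer while debugging, because silent replacement hides exactly the characters you are trying to track down.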

Accented characters in mySQL table

I experienced the same problem before, and this is what I did:

1) Use Notepad++ (it can handle almost any encoding) or Eclipse, and be sure to save and open the files as UTF-8 without BOM.

2) set the encoding in the PHP header, using header('Content-type: text/html; charset=UTF-8');

3) remove any extra spaces at the start and end of my PHP files.

4) set the character set of all my tables and columns to utf8mb4, with the collation utf8mb4_general_ci or utf8mb4_unicode_ci, via phpMyAdmin or any MySQL client you have.

5) set the MySQL connection charset to UTF-8 (I use PDO for my database connection) with one of these options (the options array can hold only one value for this key, so pick one):

PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET utf8"

or just execute the SQL queries before fetching any data

6) use a meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

7) use a language code for French

<meta http-equiv="Content-language" content="fr" />

8) change the html element lang attribute to the desired language

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">

I will be updating this further, because I had a really hard time solving this problem when I was dealing with Japanese characters in past projects.

9) Some fonts are not available on the client PC, so you may need to include them in your CSS, for example from Google Fonts

10) Don't end your PHP source file with ?>

NOTE:

If none of the above works, try adjusting your encoding to match the character set you actually want to display. For me, setting everything to Shift-JIS displayed all my Japanese characters correctly. But using UTF-8 must be your priority.

How to conduct an Accent Sensitive search in MySql

If your searches on that field are always going to be accent-sensitive, then declare the collation of the field as utf8_bin (which compares the utf8-encoded bytes for equality) or use a language-specific collation that distinguishes between accented and unaccented characters.

col_name varchar(10) collate utf8_bin

If searches are normally accent-insensitive, but you want to make an exception for this search, try:

WHERE col_name = 'abád' collate utf8_bin

Update for MySQL 8.0, plus addressing some of the Comments and other Answers:

  • The CHARACTER SET name matches the beginning of the COLLATION name.
  • Any COLLATION name ending in _bin does neither case folding nor accent stripping, so it distinguishes both upper/lower case and accents. Examples: latin1_bin, utf8mb4_bin.
  • Any COLLATION name containing _as_ is accent-sensitive (it does not strip accents), and does case folding or not based on _ci vs _cs.
  • To see the collations available (on any version), do SHOW COLLATION;.
  • utf8mb4 is now the default charset. You should be using that instead of utf8.
  • It is better to have the CHARACTER SET and COLLATION set 'properly' on each column (or defaulted by the table definition) than to dynamically use any conversion routine such as CONVERT().

How to save the accent character é as é in a mysql database when it is inserted in a form in Symfony2 ?

I do not know why you want to do that, but if you do, simply comment out the line "charset: UTF8":

doctrine:
    dbal:
        driver: "%database_driver%"
        host: "%database_host%"
        port: "%database_port%"
        dbname: "%database_name%"
        user: "%database_user%"
        password: "%database_password%"
        # charset: UTF8

How to make an accent insensitive `MATCH() AGAINST()` sentence?

Solved!

After some discussion in the comments, I realized that the error was not in the MATCH() AGAINST() statement, since it does not distinguish diacritics by default.

So the problem had to do with how the diacritics were stored in MySQL. In my case, they were stored like this: COLÁGENO -> COLÃ<0x81>GENO. Therefore, it was necessary to find out how to save the accented characters correctly without corrupting the table.

Encodings

I tried changing the encoding by running this statement in phpMyAdmin:

ALTER TABLE products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

However, there was no change in behavior.

I then tried changing the encoding in the exported table file by changing DEFAULT CHARSET=latin1 to DEFAULT CHARSET=utf8mb4, but that did not change the results either.

Modify the accents manually

My other attempt was, once the encoding change was made, to manually replace garbled characters like Ã<0x81> in the cells with the corresponding accented character Á. But sadly this seemed to corrupt queries to the table (I was still able to access the other tables normally).

So I thought about what masterguru said in the comments about encoding changes altering the way scripts connect to the table. Apparently, when I manually replaced a character with its accented form, the scripts kept accessing the table with the previous encoding.

Solution

The scripts were in PHP so I had to find the solution in that language.

I found an answer on the English Stack Overflow that explained how to save accented characters in the database correctly. To do this, you had to write...

mysqli_set_charset($connection, "utf8");

...this after the connection to the database. Finally, I had to change the strange characters in my database to their corresponding accented characters for the MATCH AGAINST to work, and voila!
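
That last clean-up step, turning the strange characters back into accented ones, can be sketched in Python. This assumes the damage came from UTF-8 bytes being read as latin1 exactly once; the helper name repair_mojibake is hypothetical and Python's latin-1 codec is an assumed stand-in for MySQL's latin1:

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as latin1'.

    Returns the input unchanged when it cannot be that kind of damage, so
    values that are already correct are normally left alone.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Reproduce the damage described above: Á becomes Ã followed by byte 0x81.
broken = "COLÁGENO".encode("utf-8").decode("latin-1")
print(repr(broken))                  # 'COLÃ\x81GENO'

print(repair_mojibake(broken))       # COLÁGENO
print(repair_mojibake("COLÁGENO"))   # COLÁGENO (already correct, left as-is)
```

The try/except matters because a table often contains a mix of damaged and correct rows; re-decoding a correct value would otherwise raise an error or corrupt it.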


Many thanks to masterguru, Triby and aeportugal for the help provided in the comments!

Original post: https://es.stackoverflow.com/questions/511745/como-hacer-que-la-b%c3%basqueda-match-against-ignore-los-tildes-o-acentos


