MySQL Diacritic Insensitive Search (Spanish Accents)

MySQL diacritic insensitive search (spanish accents)

Character sets & collations, not my favorites, but they DO work:

mysql> SET NAMES latin1;
mysql> SELECT 'lápiz' LIKE 'lapiz';
+-----------------------+
| 'lápiz' LIKE 'lapiz' |
+-----------------------+
| 0 |
+-----------------------+
1 row in set (0.01 sec)

mysql> SET NAMES utf8;
mysql> SELECT 'lápiz' LIKE 'lapiz';
+-----------------------+
| 'lápiz' LIKE 'lapiz' |
+-----------------------+
| 1 |
+-----------------------+

mysql> SET NAMES latin1;
mysql> SELECT _utf8'lápiz' LIKE _utf8'lapiz' ;
+---------------------------------+
| _utf8'lápiz' LIKE _utf8'lapiz' |
+---------------------------------+
| 1 |
+---------------------------------+

A nice chapter to read in the manual:Character Set Support

MySQL: Why can I search some accent marks but not others?

Well, the reason might be because ñ is not exactly n.

"The character “ñ” is a precomposed character because it is treated as an individual Unicode character and has a Unicode code point of U+00F1. Technically, this character can be decomposed into an equivalent string of a base character “n” (U+006E) and a combining tilde “~” (U+0303)."
Reference

There is a way to make ñ recognizable as n by adding COLLATE in the query, such as:

SELECT * FROM baptism WHERE place LIKE '%n%' COLLATE utf8mb4_general_ci;

Try here

Accents insensitive search in MySql

I believe you need to specify COLLATE as part of the SQL statement.

SELECT  *
FROM peliculas
WHERE nombre LIKE '%circulo%' COLLATE utf8_unicode_ci;

example which returns:

+-----+------------+
| ID | NOMBRE |
+-----+------------+
| 2 | El círculo |
+-----+------------+

(Not I didn't use í within the query, but still was able to return El círculo)

I should note, if you're database encoding isn't setup and you're actually storing every "accent" as à you're going to have a lot more problems. This only works on the basis that the database itself has the original unreplaced value.

Alternatively, a full-text search would probably do the trick.

p.s. Look in to using PDO and stop placing $var in a query.

Accent insensitive search on a problematic database

The problem being that the data was inserted using the wrong connection encoding, you can fix it by

  1. Exporting the data using the wrong connection encoding, just like you have used it thus far, followed by
  2. Importing the data using the correct utf8 connection encoding.

That will fix the encoding problem, after which search will work as expected.

Diacritic insensitive mysql search?

It's a bit like case-insensitivity problem.

SELECT * FROM blah WHERE UPPER(foo) = "THOMAS"

Just convert both strings to diacritic-free before comparing.

MySQL Case Insensitive but Accent Sensitive UTF8 Unique Key

The only thing I can think of (without finding a collation that fits your needs) is to change something at the application layer (outside of MySQL) that will take care of the differentiation.

For instance, since you don't care about case, you can do something programmatically to lower the case of all the rows in the database. Then change the collation to utf8_bin.

Then you can, in the application, convert everything to lowercase before it enters the database (I'm guessing this will not affect the diacritic characters). That way, you will still get errors if people try to enter multiple cases, you should only have to change a few lines of code to precondition stuff entering the table, and you won't have the diacritic problem.

Search with accented characters

If you can't get the COLLATE Latin1_general_CI_AI or pg setup to work, try this. Duplicate columns, one accented, one not. Credit goes to this and this answer.

class Song < ApplicationRecord
before_save :i18n

def self.search(query)
return all unless query.present?

like = Rails.env.production? ? 'ILIKE' : 'LIKE'

q = ["%#{query}%"] * 3

search_phrase = "number #{like} ? OR ai_title #{like} ? OR ai_lyrics #{like} ?"

where([search_phrase] + q)
end

def i18n
self.ai_title = I18n.transliterate title
self.ai_lyrics = I18n.transliterate lyrics
end
end

With my migration looking like:

class CreateSongs < ActiveRecord::Migration[5.0]
def change
create_table :songs do |t|
...
t.string :title
t.string :ai_title, index: true
t.text :lyrics
t.text :ai_lyrics, index: true
...
end
end
end

It works for many database setups. I find this real useful.



Related Topics



Leave a reply



Submit