Character encoding for French Accents
If intérêt shows up as intÃ©rÃªt, you likely (i.e. short of corruption due to double encoding) have UTF-8 encoded text being displayed as if it were ISO-8859-1.
Make sure the headers are correctly formed and present the content as being UTF-8 encoded.
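The symptom is easy to reproduce. Here is a minimal Python sketch (the variable names are illustrative and not part of the original answer) that encodes the word as UTF-8, decodes the bytes as ISO-8859-1, and then reverses the mistake:

```python
# Illustrative only: reproduce the "UTF-8 read as ISO-8859-1" symptom.
original = "intérêt"
utf8_bytes = original.encode("utf-8")

# Wrong step: interpret the UTF-8 bytes as ISO-8859-1,
# so every byte becomes one character.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)   # intÃ©rÃªt

# Reversing the wrong step recovers the text, provided it was garbled only once.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired)  # intérêt
```

If the round trip does not give back clean text, the data was probably damaged more than once (the double-encoding case mentioned above).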
Why French characters don't work using utf-8 with Java?
You have to know the encoding of the text file before you read it. Apparently, it is originally an HTML file without meta charset.
You guessed UTF-8. It isn't UTF-8: the reader hit bytes that don't form valid UTF-8 and replaced them with the Unicode replacement character U+FFFD (�), which you are then displaying(?) using the incorrect encoding, turning � into the mojibake "ï¿½".
So, you'd have to go back to the sender/writer to find out what the encoding is. Then you can write a program to read it.
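Both steps of that failure can be sketched in Python. This assumes, purely for illustration, that the file was really ISO-8859-1 and that the display side uses windows-1252; the original question does not say which encodings were involved.

```python
# Assumption: the source bytes are ISO-8859-1. On its own, the byte for "é"
# is not valid UTF-8.
latin1_bytes = "é".encode("iso-8859-1")   # b'\xe9'

# Step 1: decoding as UTF-8 with replacement yields U+FFFD.
decoded = latin1_bytes.decode("utf-8", errors="replace")
print(repr(decoded))                      # '\ufffd'

# Step 2: U+FFFD is written out as UTF-8 (EF BF BD) and then shown as
# windows-1252, giving three unrelated characters.
mojibake = decoded.encode("utf-8").decode("cp1252")
print(mojibake)                           # ï¿½
```

Note that after step 1 the original byte is gone, which is why the only real fix is to reread the file with the correct encoding.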
UTF-8 Charset displaying french characters incorrectly.
A common issue when collecting Unicode data is leaving the connection and database/table/column character set configured as ISO-8859-1, but then inserting data that is actually UTF-8. The database is essentially told, "here's some 8859-1-encoded data, store it in this 8859-1 table". It doesn't do any conversion because it doesn't realize the data isn't 8859-1. So the data is UTF-8, but the database has been told it's 8859-1.
It's an insidious problem because, as you say, the database will convert the data wrongly if you change your charset to UTF-8: it will convert the "8859-1" data (remember, the database thinks it's 8859-1) to UTF-8, a conversion that of course fails because the data is already UTF-8.
So basically the problem is that phpMyAdmin is set to 8859-1: you told it to store the data as 8859-1, told it you were providing 8859-1 data, and then gave it UTF-8 data. The database thinks it's 8859-1, so the only easy ways to solve the problem are to a) keep acting like it's 8859-1 even though it's not, and hope you never have to deal with sorting, searching, collation, etc. (may work in your case), or b) pull out the data as 8859-1 (leaving it unconverted), then re-insert it after setting the database and connection to UTF-8, so the database knows what character set the data really is in.
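To see why changing the charset alone makes things worse, the conversion can be simulated in Python. This is only a sketch of the byte-level effect, not what the database literally executes, and the variable names are hypothetical.

```python
# What the application actually sent: UTF-8 bytes for "é".
stored = "é".encode("utf-8")              # b'\xc3\xa9'

# The database believes these bytes are ISO-8859-1, so "converting" the
# column to UTF-8 re-encodes each byte as if it were its own character.
double_encoded = stored.decode("iso-8859-1").encode("utf-8")
print(double_encoded)                      # b'\xc3\x83\xc2\xa9'

# Option (b): take the bytes out unconverted and label them correctly.
recovered = stored.decode("utf-8")
print(recovered)                           # é
```

The four-byte result is the double-encoded form: it is valid UTF-8, but it spells "Ã©" rather than "é".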
Hope that makes sense. Let me know if it doesn't. This is a hard one to wrap your head around.
accented French characters
What's the problem, exactly? Have you set @Codepage=65001 in the page directives at the top of your file? Have you marked the content-type with the correct encoding so that the client knows what it's getting?
If you see question marks, it's probable that you haven't set the response code page correctly. If you see two unrelated characters in place of a single character with a diacritic, you haven't told the client what it needs to know to treat the page as UTF-8, e.g.
Response.CodePage = 65001 ;
Response.CharSet = "utf-8" ;
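The two symptoms have different causes, which a short Python analogy can illustrate. This is not ASP code; it only mimics the byte-level behaviour, and the choice of ISO-8859-1 and windows-1252 as the "wrong" encodings is an assumption.

```python
text = "cœur"  # œ (U+0153) has no ISO-8859-1 code point

# Symptom 1: the output code page cannot represent the character,
# so it is replaced with a question mark before it is ever sent.
as_latin1 = text.encode("iso-8859-1", errors="replace")
print(as_latin1)        # b'c?ur'

# Symptom 2: correct UTF-8 bytes are sent, but the client assumes
# windows-1252 and shows two characters for one.
seen_by_client = "é".encode("utf-8").decode("cp1252")
print(seen_by_client)   # Ã©
```

In the first case the character is lost on the server; in the second the bytes are intact and only the declared charset is wrong.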
There are slight differences between asp.net and asp handling of encoding, so it would also be helpful if you were more specific about which technology you're using, but that should get you most of the way there.
In ASP.Net, you can set the encoding site-wide in your web.config file, so you can avoid messing with Response.CodePage and Request.CodePage on every page. You still want to mark the Response Charset using the meta http-equiv content-type element in your HTML or using Response.Charset.
<globalization
requestEncoding="utf-8"
responseEncoding="utf-8" />
If you don't want to use web.config for this for some reason, you'd use <%@CodePage=65001 %> in your .aspx file before you output any text, in the page directives.
It looks like the page in question contains incorrectly encoded UTF-8. Is the content coming straight from the .aspx file or is it being pulled from a database or something?
UTF-8 French accented characters issue
This is quite a common charset issue. You need to set the connection encoding manually for the MySQL connection (these should be the first queries you execute after establishing the connection):
SET NAMES utf8;
SET CHARACTER SET utf8;
And also make sure every table has CHARACTER SET set to UTF-8.
Or you could also update the server configuration.
Encoding in MySQL with french accents
The reason your ALTER statements are not working is that they only set rules for how newly created tables will encode their text. For your tables which already exist, the ALTER statements won't change anything.
I found this great blog post which describes how to use iconv to convert an existing MySQL database from latin1 to utf8. Here is the command:
mysqldump --add-drop-table my_database | replace CHARSET=latin1 CHARSET=utf8 | iconv -f latin1 -t utf8 | mysql my_database
The other answers which mentioned the distinction between LENGTH() and CHAR_LENGTH() are correct and you should also pay attention to this.
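In MySQL, LENGTH() counts bytes while CHAR_LENGTH() counts characters, so the two differ for accented text stored as UTF-8. A rough Python analogy (the variable names are illustrative):

```python
word = "intérêt"

# Analogous to CHAR_LENGTH(): number of characters.
char_length = len(word)

# Analogous to LENGTH() on UTF-8 data: number of bytes.
# Each of é and ê takes two bytes in UTF-8.
byte_length = len(word.encode("utf-8"))

print(char_length, byte_length)  # 7 9
```

A byte count well above the character count for short French words can also hint at double-encoded data, since each accented letter then takes four bytes instead of two.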