PHP messing with HTML Charset Encoding
You have probably mixed encoding types.
For example, a page that is sent as ISO-8859-1 but receives UTF-8 encoded text from MySQL or XML will typically display garbled characters.
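To make the failure concrete, here is a small sketch of what happens to a single character. It assumes the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// "é" as UTF-8 is two bytes; as ISO-8859-1 it is one byte.
$utf8   = "\xC3\xA9";
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Simulate the mistake: treat the UTF-8 bytes as if they were ISO-8859-1.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";     // c3a9
echo bin2hex($latin1), "\n";   // e9
echo bin2hex($mojibake), "\n"; // c383c2a9, which a UTF-8 page renders as "Ã©"
```

The last line is the typical symptom: one accented character turning into two unrelated ones.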
To solve this problem, you must keep track of the encoding of every input relative to the encoding you have chosen to use internally.
If you send the page as ISO-8859-1, the input from the user will also be ISO-8859-1.
header("Content-Type: text/html; charset=iso-8859-1");
And if MySQL sends latin1, you do not have to do anything.
But if your input is not ISO-8859-1, you must convert it before sending it to the user, or adapt it to MySQL's encoding before storing it.
$text = mb_convert_encoding($text, mb_internal_encoding(), 'UTF-8'); // Convert UTF-8 input to the internal encoding
In short, you must always convert input to the internal encoding and convert output to match the external encoding.
This is the internal encoding I have chosen to use.
mb_internal_encoding('iso-8859-1'); // Internal encoding
This is the code I use.
mb_language('uni'); // Mail encoding
mb_internal_encoding('iso-8859-1'); // Internal encoding
mb_http_output('pass'); // Skip
function convert_encoding($text, $from_code='', $to_code='')
{
    if (empty($from_code))
    {
        $from_code = mb_detect_encoding($text, 'auto');
        if ($from_code == 'ASCII')
        {
            $from_code = 'iso-8859-1';
        }
    }
    if (empty($to_code))
    {
        return mb_convert_encoding($text, mb_internal_encoding(), $from_code);
    }
    return mb_convert_encoding($text, $to_code, $from_code);
}
function encoding_html($text, $code='')
{
    if (empty($code))
    {
        return htmlentities($text, ENT_NOQUOTES, mb_internal_encoding());
    }
    return mb_convert_encoding(htmlentities($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}
function decoding_html($text, $code='')
{
    if (empty($code))
    {
        return html_entity_decode($text, ENT_NOQUOTES, mb_internal_encoding());
    }
    return mb_convert_encoding(html_entity_decode($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}
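Note that mb_detect_encoding() only guesses, and almost any byte sequence is valid ISO-8859-1. A stricter variant, sketched here under the assumption that input is either UTF-8 or one known fallback encoding, checks for valid UTF-8 with mb_check_encoding() first. The function name to_internal and its parameters are hypothetical and not part of the code above.

```php
<?php
// Hypothetical helper: converts $text to $internal, treating it as UTF-8
// only when its bytes form valid UTF-8, and as $fallback otherwise.
function to_internal(string $text, string $internal = 'ISO-8859-1', string $fallback = 'ISO-8859-1'): string
{
    $from = mb_check_encoding($text, 'UTF-8') ? 'UTF-8' : $fallback;
    if (strcasecmp($from, $internal) === 0) {
        return $text; // already in the internal encoding
    }
    return mb_convert_encoding($text, $internal, $from);
}

$fromUtf8   = to_internal("caf\xC3\xA9"); // UTF-8 "café"
$fromLatin1 = to_internal("caf\xE9");     // ISO-8859-1 "café"

echo bin2hex($fromUtf8), "\n";   // 636166e9
echo bin2hex($fromLatin1), "\n"; // 636166e9
```

This still misreads the rare ISO-8859-1 string whose bytes happen to form valid UTF-8, so it is a heuristic, not a guarantee.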
HTML Website Character Encoding Mess
I found a similar post here
- Make sure the database charset/collation is UTF-8.
- On the page where you enter these Russian characters (the form, textarea), make sure the encoding is UTF-8 by setting Content-Type to text/html; charset=utf-8. Enter the Russian text directly into the form input.
- On the processing page that handles this form and inserts it into the database, make sure to run SET NAMES utf8 in a separate query before you insert the data, so it is stored as UTF-8.
- When you render the content from the database in a view, make sure the Content-Type is text/html; charset=utf-8.
Make sure that the Content-Type is not windows-1251 or iso-8859-1/latin1. Make sure the database charset/collation is NOT ISO-8859-1/Latin1.
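As a rough sketch of the connection step using PDO: the charset part of the DSN plays the role of SET NAMES for that connection. The helper names, host, database name, and credentials below are placeholders, and utf8mb4 is my substitution for the utf8 mentioned above (in MySQL, utf8 stores at most three bytes per character, while utf8mb4 covers all of UTF-8).

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that fixes the connection charset.
function utf8_dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical connection function; only call it with real credentials.
function connect_utf8(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, // throw on SQL errors
    ]);
}

echo utf8_dsn('localhost', 'example_db'), "\n";
// mysql:host=localhost;dbname=example_db;charset=utf8mb4
```

The tables and columns still need a UTF-8 charset/collation of their own; the DSN only controls how the client and server talk to each other.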
Utf8 in html correct and php html output messed up
To close this question myself (because I feel rather stupid right now), the one who actually solved this is Marc B as his comments made me understand the process of text encoding.
After setting the header (Content-Type and charset) as well as the meta tag in the HTML, I discovered, just as Marc suspected, that my IDE had saved the PHP file in an encoding other than UTF-8. Saving the file as UTF-8 and replacing the messed-up special characters fixed my issue.
Please excuse this, I wasn't fully aware of what I was doing.
Using PHP in HTML pages without messing up Character Encoding
I think the problem may now be the type of editor you are using.
I created a file with plain Windows Notepad, and it did not show the characters correctly.
However, when I pasted my code into Notepad++ and saved it as "Encoding/Encode in UTF-8 without BOM" (byte order mark), it displayed correctly.
Visit notepad-plus-plus.org to download it. It has different encoding formats.
Content-Type charset won't change
It appeared I had a Chrome encoding extension installed, which was set to Windows-1251: https://chrome.google.com/webstore/detail/set-character-encoding/bpojelgakakmcfmjfilgdlmhefphglae
Charset=utf8 not working in my PHP page
Sounds like you don't serve your content as UTF-8. Do this by setting the correct header:
header('Content-Type: text/html; charset=utf-8');
In addition, to be really sure the browser understands, add a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Note that, depending on where the text comes from, you might have to check some other things too (database connection, source file encoding, ...). I've listed a lot of them in one of my answers to a similar question.
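Putting the header and the meta tag together, a minimal sketch could look like this. The function name render_utf8_page is hypothetical, and the script itself is assumed to be saved as UTF-8 so that the Cyrillic literal is stored as UTF-8 bytes.

```php
<?php
// Hypothetical helper: returns a minimal UTF-8 HTML page around $text.
function render_utf8_page(string $text): string
{
    // Escape with an explicit charset so multibyte characters are kept intact.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return "<!DOCTYPE html>\n<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n"
        . "</head>\n<body><p>{$safe}</p></body>\n</html>\n";
}

// Send the header before any output; skipped if output has already started.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
echo render_utf8_page("Привет <мир>");
```

The HTTP header is what browsers consult first; the meta tag mainly helps when the page is opened from disk or the server overrides the header.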
Fix incorrectly displayed encoding on an html document with php
- You need to save the page with UTF-8 without BOM encoding.
- Add this header at the top of your script:
header("Content-Type: text/html; charset=UTF-8");
[EDIT]: How to Save Files as UTF-8 without BOM :
At the OP's request, here's how you can do it on Windows:
- Download Notepad++. It is an awesome text-editor that you should be using.
- Install it.
- Open the PHP script that contains this code in Notepad++. The page where you are doing all the coding. Yes, that file on your computer.
- In Notepad++, from the Encoding menu at the top, select "Convert to UTF-8 without BOM".
- Save the file.
- Upload to your webserver by FTP or whatever you use.
- Now, run that script.
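To verify that a saved file really has no BOM, a short check like the following can help. A BOM matters because its three bytes (EF BB BF) are sent as output before header() runs, which can trigger "headers already sent" warnings. The function name has_utf8_bom is hypothetical, and the temporary files exist only for the demonstration.

```php
<?php
// Hypothetical helper: true when the file starts with the UTF-8 BOM (EF BB BF).
function has_utf8_bom(string $path): bool
{
    // Read only the first three bytes of the file.
    $head = file_get_contents($path, false, null, 0, 3);
    return $head === "\xEF\xBB\xBF";
}

// Demonstration with two temporary files.
$withBom = tempnam(sys_get_temp_dir(), 'bom');
$noBom   = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($withBom, "\xEF\xBB\xBF<?php echo 'hi';");
file_put_contents($noBom, "<?php echo 'hi';");

var_dump(has_utf8_bom($withBom)); // bool(true)
var_dump(has_utf8_bom($noBom));   // bool(false)

unlink($withBom);
unlink($noBom);
```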