<meta charset="utf-8"> vs <meta http-equiv="Content-Type">

meta charset=utf-8 vs meta http-equiv=Content-Type

In HTML5, they are equivalent. Use the shorter one, as it is easier to remember and type. Browser support is fine since it was designed for backwards compatibility.

What is <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />?

According to HTML Dog:

The charset attribute can be used as a shorthand method to define an HTML document's character set, which is always a good thing to do. <meta charset="utf-8"> is the same as <meta http-equiv="content-type" content="text/html; charset=utf-8">.

So it's basically used to define the charset of your HTML document.

Visual Studio 2017 may add both meta tags so that your HTML is as compatible as possible with older browsers.

<meta http-equiv="content-type" content="text/html; charset=utf-8"> is the old way to define the charset.

<meta charset="utf-8"> is the new and shorter way to do the same thing.
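
To illustrate that both forms carry the same information, here is a minimal Python sketch that reads the declared charset from either tag. The class name MetaCharsetFinder and the function declared_charset are made up for this example; only the standard html.parser module is used.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Collects the charset declared by either form of the meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Only the first charset declaration is kept.
        if tag != "meta" or self.charset is not None:
            return
        attrs = {name: (value or "") for name, value in attrs}
        if "charset" in attrs:
            # Short HTML5 form: <meta charset="utf-8">
            self.charset = attrs["charset"].strip().lower()
        elif attrs.get("http-equiv", "").lower() == "content-type":
            # Long form: the charset is a parameter inside content="..."
            for part in attrs.get("content", "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset":
                    self.charset = value.strip().lower()


def declared_charset(html):
    """Return the charset declared in a meta tag, or None if there is none."""
    finder = MetaCharsetFinder()
    finder.feed(html)
    return finder.charset
```

Feeding it '<meta charset="utf-8">' or the longer http-equiv form yields the same value, "utf-8".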

IE 10 does not load page in UTF-8

What really causes the problem is that the HTTP headers sent by the server include

Content-Type: text/html; charset=windows-1251

This overrides any meta tags. You should of course fix the errors in the meta tag, as pointed out in other answers, and run a markup validator to check your code, but to fix the actual problem you need to fix the .htaccess file. Without seeing the file and the other server-side settings, it is impossible to tell how to fix it (e.g., server settings might disable per-directory .htaccess files and apply one global file set by the server admin). Note that the file name must have two c's, not one (.htaccess, not .htacess).

You can check which headers are actually sent, e.g. using Rex Swain’s HTTP Viewer.
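
If you would rather check from a script, the charset can be pulled out of a Content-Type value with Python's standard library. This is only a sketch; the helper name charset_from_content_type is an assumption of this example.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset()
```

For the header quoted above, charset_from_content_type("text/html; charset=windows-1251") gives "windows-1251". For a live page, urllib.request.urlopen(url).headers.get_content_charset() reads the same parameter from the real response, where url stands for your own page address.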

The reason why things work on other browsers is that they apply the modern HTML5 principle “BOM wins them all”. That is, an HTTP header takes precedence over a meta tag in specifying the character encoding, but if the actual data begins with the three bytes that constitute the UTF-8 encoded form of the Byte Order Mark (BOM), the data will be interpreted as UTF-8 no matter what. For some unknown reason, IE 10 does not do that (and neither does IE 11).
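
The precedence described here can be sketched in a few lines of Python. This is a simplification under stated assumptions: it covers only the UTF-8 BOM, the function name effective_encoding is invented for the example, and the fallback value is a guess, since real browsers choose their default by locale.

```python
import codecs


def effective_encoding(body, header_charset=None, meta_charset=None,
                       default="windows-1252"):
    """Pick an encoding following the precedence described in the text.

    1. A UTF-8 BOM at the start of the data wins over everything.
    2. Otherwise the charset from the HTTP Content-Type header applies.
    3. Otherwise the charset from the meta tag applies.
    4. Otherwise fall back to a default (an assumption of this sketch).
    """
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if header_charset:
        return header_charset.lower()
    if meta_charset:
        return meta_charset.lower()
    return default
```

With a BOM present, a header of windows-1251 is ignored and the result is "utf-8"; without the BOM, the same header wins over a meta tag that says utf-8. According to the answer, IE 10 and IE 11 behave as if step 1 were missing.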

But this won’t be a problem if you just make the server send an HTTP header that declares UTF-8.

If the server has been set to declare windows-1251 and you cannot possibly change that, then you just need to live with it. Transcode your HTML files to windows-1251 then, and declare windows-1251 in a meta tag. This means that if you need any characters outside the limited repertoire representable in windows-1251, you need to represent them using character references.
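
A minimal sketch of that transcoding step in Python, assuming the text is already available as a string. The function name to_windows_1251 is made up here; the xmlcharrefreplace error handler is a standard Python feature that writes unrepresentable characters as numeric character references.

```python
def to_windows_1251(text):
    """Encode text as windows-1251, writing characters the encoding
    cannot represent as numeric character references such as &#233;."""
    return text.encode("cp1251", errors="xmlcharrefreplace")
```

Cyrillic letters come out as single windows-1251 bytes, while a character such as é, which is outside the repertoire, becomes the reference &#233; in the output.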

How to save Russian characters in a UTF-8 encoded file

If your doctype is HTML, declare <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>; if your doctype is XHTML, declare <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.

Never assume in your design that the end user will act correctly.

If you already have a document, edit its meta tag to declare the charset, then in Notepad++ use Encoding > Convert to UTF-8 without BOM and save the document. From then on you can safely continue with your multilingual content.

The php tag is irrelevant to your question, since you don't mention any database charset settings.
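
The same conversion can be scripted instead of done by hand in an editor. The sketch below works on raw bytes; the function name to_utf8_without_bom and the default source encoding cp1251 are assumptions of this example, so pass whatever encoding your file is really in.

```python
import codecs


def to_utf8_without_bom(data, source_encoding="cp1251"):
    """Re-encode raw file bytes as UTF-8 and make sure no BOM is written."""
    if data.startswith(codecs.BOM_UTF8):
        # Already UTF-8 with a BOM: just drop the three BOM bytes.
        return data[len(codecs.BOM_UTF8):]
    # Otherwise decode from the source encoding and encode as plain UTF-8.
    return data.decode(source_encoding).encode("utf-8")
```

To convert a file, read it with pathlib.Path.read_bytes(), pass the bytes through this function, and write the result back with write_bytes().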

Do I need to declare content type/charset in both HTML and PHP?

If you are sending the charset in the headers, there is no need to repeat it in the HTML markup.

It is better to send this information in one place (DRY principle): if the charsets conflict (e.g. a header with UTF-8 and a meta with iso-8859-1), the HTTP header takes precedence, and a page actually saved in the meta tag's encoding may be displayed with garbled characters.

Having said that, some automated tools (web scrapers) may not look at the header and deduce the page encoding only by the meta tag.

It is important to keep both the header and meta tag the same for each page - mixing different charsets may confuse browsers and cause display issues.
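
A quick way to check that the two declarations agree is to compare them after normalizing aliases. This sketch relies on Python's own codec registry, whose alias table is not identical to the label list browsers use, so treat it as a rough check. The function name same_charset is hypothetical.

```python
import codecs


def same_charset(header_charset, meta_charset):
    """True when two charset labels name the same encoding.

    codecs.lookup() normalizes aliases, so "UTF-8" and "utf8" compare equal.
    It raises LookupError for a label Python does not know.
    """
    return codecs.lookup(header_charset).name == codecs.lookup(meta_charset).name
```

For example, same_charset("UTF-8", "utf8") is true, while same_charset("utf-8", "iso-8859-1") is false and signals the kind of mismatch described above.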

meta charset = UTF-8 vs charset = iso-8859-1

UTF-8

UTF-8 (UCS Transformation Format 8) is the World Wide Web's most common character encoding. UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character. The first 128 UTF-8 characters precisely match the first 128 ASCII characters (numbered 0-127), meaning that existing ASCII text is already valid UTF-8. After the first 128 code points, it utilizes a multibyte approach for additional characters.

ISO-8859-1

By contrast, ISO-8859-1 is a single-byte encoding scheme. The major downfall of this type of encoding is that one byte can distinguish only 256 values, so it cannot accommodate languages whose characters fall outside that small repertoire.

Source: MDN entry on UTF-8
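
The difference is easy to see by encoding a few characters both ways. The helper byte_lengths below is invented for this illustration and uses only built-in string methods.

```python
def byte_lengths(ch):
    """Return (UTF-8 length, ISO-8859-1 length or None if unrepresentable)."""
    utf8_len = len(ch.encode("utf-8"))
    try:
        latin1_len = len(ch.encode("iso-8859-1"))
    except UnicodeEncodeError:
        # The character has no code in ISO-8859-1.
        latin1_len = None
    return utf8_len, latin1_len


for ch in ["A", "é", "Я", "€"]:
    print(repr(ch), byte_lengths(ch))
```

ASCII letters take one byte in both encodings, é takes two bytes in UTF-8 but one in ISO-8859-1, and Я and € cannot be written in ISO-8859-1 at all.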


