Question Mark Characters Display Within Text. Why Is This?

Question mark characters display within text. Why is this?

The following sections of the MySQL Reference Manual will be useful:

10.3 Specifying Character Sets and Collations

10.4 Connection Character Sets and Collations

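After you connect to the database, issue the following command so that the connection itself uses UTF-8 (in MySQL, 'utf8' covers only characters of up to three bytes; 'utf8mb4' covers the full Unicode range):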

SET NAMES 'utf8';

Ensure that your web page also uses the UTF-8 encoding:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

PHP also offers several functions that will be useful for conversions:

  • iconv
  • mb_convert_encoding
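
The same kind of conversion can be sketched outside PHP. The following is a minimal Python illustration, not a drop-in replacement for iconv or mb_convert_encoding. It assumes the incoming bytes are Windows-1252, and the helper name to_utf8 is hypothetical:

```python
def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes from source_encoding into UTF-8."""
    # Decode with the encoding the data was actually written in ...
    text = raw.decode(source_encoding)
    # ... then encode the resulting text as UTF-8.
    return text.encode("utf-8")


legacy = b"caf\xe9"            # "café" stored as Windows-1252
converted = to_utf8(legacy)
print(converted)               # b'caf\xc3\xa9'
```

The conversion is only correct if the source encoding is known. Guessing it wrong produces different garbage rather than an error.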

Why does a diamond with a question mark in it � appear in my HTML?

This specific character � is usually the sign of an invalid (non-UTF-8) byte sequence showing up in output (such as a page) that has been declared to be UTF-8. It often happens when:

  • a database connection is not UTF-8 encoded (even if the tables are)

  • an HTML or script source file is stored in the wrong encoding (e.g. Windows-1252 instead of UTF-8). Make sure it's saved as a UTF-8 file; the setting is often in the "Save as..." dialog.

  • an online source (like a widget or an RSS feed) is fetched that isn't serving UTF-8
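
The effect described above can be reproduced in a few lines. This is a small Python sketch, assuming text that was saved as Windows-1252 and is then read as if it were UTF-8. The variable names are made up for illustration:

```python
# "café" written out in Windows-1252: the é becomes the single byte 0xE9.
stored_bytes = "caf\u00e9".encode("cp1252")

# A consumer that believes the data is UTF-8 cannot interpret 0xE9 on its own,
# so a lenient decoder substitutes U+FFFD, the diamond with a question mark.
shown = stored_bytes.decode("utf-8", errors="replace")
print(shown)                   # caf�
```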

Why does a question mark show up in the web browser?

A question mark appears because the encoding process recognizes that the target encoding can't represent the character, and substitutes a question mark instead. By "if you're really good," Joel means "if you have a newer browser and proper font support." In that case you'll get a fancier substitution character, a box.

In Joel's case, he isn't trying to display a real character; he literally included the Unicode replacement character, U+FFFD REPLACEMENT CHARACTER.
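
The two situations above can be told apart in code. This is a minimal Python sketch: the first part shows an encoder substituting "?" for characters it cannot represent, and the second shows U+FFFD written into the text deliberately. The sample string is an arbitrary choice:

```python
# Encoding to a character set that lacks the characters: each one becomes "?".
text = "na\u00efve \u65e5\u672c\u8a9e"          # "naïve 日本語"
ascii_bytes = text.encode("ascii", errors="replace")
print(ascii_bytes)             # b'na?ve ???'

# Including the replacement character literally, as a real code point.
replacement = "\N{REPLACEMENT CHARACTER}"
print(hex(ord(replacement)))   # 0xfffd
```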

What does it mean when my text is displayed as question marks?

In Windows there are two common problems that occur when trying to display Unicode characters:

  1. text sometimes appears as question marks

    • This occurs when Unicode data is converted to an 8-bit (or, more precisely, multi-byte) character set encoding, usually via the system code page, although other code pages can be specified in the conversion calls. Any characters that the target character set cannot represent are converted to question marks.
  2. text sometimes appears as boxes

    • This is a problem with the font not having a glyph for a particular character. Boxes show up when the document contains Unicode characters that the selected font does not support.
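
For the first problem, it can help to find out in advance which characters a given code page cannot represent. The following is a rough Python sketch, assuming Windows-1252 as the target code page. The function name unrepresentable is hypothetical. The second problem (missing glyphs) depends on the installed fonts and is not covered here:

```python
from typing import List


def unrepresentable(text: str, codepage: str = "cp1252") -> List[str]:
    """Return the characters of text that codepage cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            # This character would turn into "?" in a lossy conversion.
            missing.append(ch)
    return missing


sample = "Price: 5 \u20ac for \u03a9"   # contains the euro sign and Greek Omega
print(unrepresentable(sample))          # ['Ω']
```

The euro sign survives because Windows-1252 includes it, while the Greek capital omega does not exist in that code page.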

