PHP: Convert any string to UTF-8 without knowing the original character set, or at least try
What you're asking for is extremely hard. If possible, having the user specify the encoding is the best option. Preventing an attack shouldn't be any easier or harder that way.
However, you could try doing this:
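$text = iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text);
Passing true as the third argument (strict mode) to mb_detect_encoding() might help you get a better result. Note that mb_detect_encoding() returns false when the string is not valid in any of the listed encodings, so check its result before handing it to iconv().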
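A slightly more defensive version of the same idea might look like the sketch below. The function name to_utf8() and the default candidate list are assumptions made for illustration, not part of PHP, and mb_convert_encoding() is swapped in for iconv() so that the encoding name returned by mb_detect_encoding() is passed back to the same extension.

```php
<?php
// Hypothetical helper: to_utf8() is not a built-in PHP function.
// $candidates is an assumed default; order matters, and ISO-8859-1
// accepts any byte sequence, so it acts as a catch-all when listed last.
function to_utf8(string $text, array $candidates = ['UTF-8', 'ISO-8859-1']): ?string
{
    // Already valid UTF-8: return the input unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Strict detection returns false if no candidate encoding fits.
    $from = mb_detect_encoding($text, $candidates, true);
    if ($from === false) {
        return null; // let the caller decide what to do
    }

    return mb_convert_encoding($text, 'UTF-8', $from);
}

// Usage: "caf\xE9" is "café" in ISO-8859-1.
$converted = to_utf8("caf\xE9");
```

This is still a guess: many single-byte encodings are indistinguishable at the byte level, so a string in, say, ISO-8859-15 would be silently treated as ISO-8859-1 here.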
How to convert any character encoding to UTF8 on PHP
Rather than blindly trying to detect the encoding, you should first check if the page that you downloaded has a listed character set. The character set may be set in the HTTP response header, for example:
Content-Type: text/html; charset=utf-8
Or in the HTML as a meta tag, for example:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Only if neither is available should you try to guess the encoding with mb_detect_encoding() or other methods.
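The lookup order described above could be sketched roughly as follows. The function name find_declared_charset() and its parameters are hypothetical, and the regular expressions are a simplification that assumes reasonably well-formed markup; a real HTML parser would be more robust.

```php
<?php
// Hypothetical helper: find_declared_charset() is not a built-in function.
// $contentType is the raw Content-Type header value (or null if absent),
// $html is the downloaded page body.
function find_declared_charset(?string $contentType, string $html): ?string
{
    // 1. Prefer the charset parameter of the HTTP Content-Type header.
    if ($contentType !== null
        && preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }

    // 2. Fall back to a <meta> tag, either the http-equiv form
    //    or the shorter <meta charset="..."> form.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }

    // 3. Nothing declared: the caller may fall back to guessing.
    return null;
}

// Usage with the header from the example above.
$charset = find_declared_charset('text/html; charset=utf-8', '');
```

When this returns null, that is the point at which guessing with mb_detect_encoding() becomes the remaining option.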
Set character set and convert to utf-8 without bom
PHP does not have any concept of character encodings; strings are binary data. The trick that makes everything seem to work is setting the output device, whether it's a web page or a terminal, to the correct character encoding.
If you are generating a web page, you can send the content-type header to tell the browser how the page is encoded.
header("Content-type: text/html;charset=utf-8");
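To illustrate the point that PHP strings are just bytes, and to touch on the BOM part of the question, here is a small sketch. The helper strip_utf8_bom() is a hypothetical name, and it only handles the UTF-8 byte order mark (bytes EF BB BF), not the UTF-16 or UTF-32 ones.

```php
<?php
// "é" encoded as UTF-8 occupies two bytes. Byte escapes are used here
// so the example does not depend on the encoding of the source file.
$eAcute = "\xC3\xA9";
$byteCount = strlen($eAcute);             // strlen() counts bytes
$charCount = mb_strlen($eAcute, 'UTF-8'); // mb_strlen() counts characters

// Hypothetical helper: remove a leading UTF-8 byte order mark if present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Usage: the BOM is dropped, the rest of the string is untouched.
$clean = strip_utf8_bom("\xEF\xBB\xBFhello");
```

Stripping the BOM before output matters because any bytes emitted ahead of header() would cause PHP to send the headers early.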