Email from PHP Has Broken Subject Header Encoding

Update: For a more practical and up-to-date answer, have a look at Palec’s answer.


The character encoding specified in Content-Type describes only the message body, not the headers. For non-ASCII header text you need the encoded-word syntax, with either the quoted-printable-style ("Q") encoding or the Base64 ("B") encoding:

encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

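You can use imap_8bit for the quoted-printable ("Q") encoding and base64_encode for the Base64 ("B") encoding. Note that imap_8bit produces plain quoted-printable, while the Q encoding used in headers additionally requires spaces to be written as _ (or =20) and literal ? and _ characters to be escaped, so the Base64 form is the safer choice: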

"Subject: =?UTF-8?B?".base64_encode($subject)."?="
"Subject: =?UTF-8?Q?".imap_8bit($subject)."?="

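As a minimal sketch of the Base64 variant, assuming the subject is already a UTF-8 string: encode_header_value is a hypothetical helper (not a PHP built-in) that leaves plain ASCII untouched and wraps everything else in a single encoded-word. It does not split long subjects, although an encoded-word is limited to 75 characters, so treat it as an illustration rather than a complete implementation.

```php
<?php
// Hypothetical helper (not part of PHP): wraps a UTF-8 string as a
// Base64 encoded-word only when it contains non-ASCII or control bytes.
function encode_header_value(string $text): string
{
    // Pure printable ASCII needs no encoding.
    if (preg_match('/^[\x20-\x7E]*$/', $text) === 1) {
        return $text;
    }
    return '=?UTF-8?B?' . base64_encode($text) . '?=';
}

echo 'Subject: ' . encode_header_value('Plain text'), "\n";
echo 'Subject: ' . encode_header_value("Caf\xC3\xA9"), "\n"; // "Café" as UTF-8 bytes
```
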
Problems with PHP Mailer header encoding

Use

$mail->CharSet = 'UTF-8';

instead of

$mail->CharSet = "utf8";

PHP mail special characters in subject field

Try for subject:

$sub = '=?UTF-8?B?'.base64_encode($subject).'?=';

And then:

mail($to, $sub, $message, $headers);
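
The surrounding code might look roughly like the sketch below, assuming a plain-text UTF-8 message. build_utf8_mail is a hypothetical helper and the address is a placeholder; the point is only that the subject and the body each get their own encoding, and that the headers declare what was done to the body.

```php
<?php
// Hypothetical helper: prepares the arguments for mail() so that the
// subject and the body are both encoded and declared as UTF-8.
function build_utf8_mail(string $to, string $subject, string $message): array
{
    $headers = implode("\r\n", [
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
        'Content-Transfer-Encoding: base64',
    ]);

    return [
        $to,
        '=?UTF-8?B?' . base64_encode($subject) . '?=',
        // Base64 body, wrapped at 76 characters per line.
        chunk_split(base64_encode($message)),
        $headers,
    ];
}

$args = build_utf8_mail('user@example.com', "Gr\xC3\xBC\xC3\x9Fe", "Hallo W\xC3\xB6rld");
// mail(...$args); // uncomment to actually send
print_r($args);
```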

UTF-8 character encoding for email headers with PHP

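This looks like a bug that ignores the second parameter, but that parameter only names the charset of the output; the input string is interpreted using the mbstring internal encoding. I get the correct result when I set the internal encoding: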

<?php
$rarr = "\xe2\x86\x92";
mb_internal_encoding( "UTF-8");
echo mb_encode_mimeheader($rarr, 'UTF-8'); //=?UTF-8?B?4oaS?=

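But without it, when the internal encoding is not UTF-8 (for example ISO-8859-1, which older PHP versions commonly defaulted to), the bytes are converted a second time:

<?php
$rarr = "\xe2\x86\x92";

echo mb_encode_mimeheader($rarr, 'UTF-8'); //=?UTF-8?B?w6LChsKS?=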

Just setting internal encoding is enough:

<?php
$rarr = "\xe2\x86\x92";
mb_internal_encoding( "UTF-8");
echo mb_encode_mimeheader($rarr); //=?UTF-8?B?4oaS?=
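
If changing the internal encoding globally is undesirable, one option is to set it only for the duration of the call. The sketch below assumes the mbstring extension is loaded; mime_header_utf8 is a hypothetical wrapper, not a PHP built-in.

```php
<?php
// Hypothetical wrapper: encode a UTF-8 string as a MIME header value
// without permanently changing the mbstring internal encoding.
function mime_header_utf8(string $text): string
{
    $previous = mb_internal_encoding();   // remember the current setting
    mb_internal_encoding('UTF-8');        // input is interpreted as UTF-8
    try {
        return mb_encode_mimeheader($text, 'UTF-8', 'B');
    } finally {
        mb_internal_encoding($previous);  // restore whatever was set before
    }
}

echo mime_header_utf8("\xe2\x86\x92"), "\n"; // the answer above reports =?UTF-8?B?4oaS?=
```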

PHP mail utf-8 issues

Email is a very archaic, outdated, and difficult technology. This is especially true with character encoding. I would suggest using a library, such as PHPMailer. If you insist upon doing this from scratch, the true answer will go beyond the scope of your question. Browsers, clients, and much more complicate this issue. I can't tell you the cause of your issue, but the solution is PHPMailer.

Escape email subject line

&...; are HTML/XML entities and have nothing to do with email. You will not be able to reliably have these translated into the desired symbol, and I would consider anything that did translate them to be the result of a bug.

Also, there is no such thing as an "ASCII è". ASCII is a 7-bit encoding with no accented characters, and "extended ASCII" is a loose name for ISO 8859 and/or Microsoft cp12XX encodings. If your client can't support anything other than unaccented English text, then that's all you can use.

That said, while all email headers must, according to spec, be 7-bit-safe ASCII text, there is a provision for encoding header text in other charsets: UTF-8, ISO 8859, MS code pages, etc.

function encode_subject($input, $charset, $method='B') {
    switch ($method) {
        case 'B':
            $encoded = base64_encode($input);
            break;
        case 'Q':
            // Header "Q" encoding is quoted-printable plus escaping of "?" and "_",
            // with spaces written as "_".
            $encoded = str_replace(
                ['?', '_', ' '],
                ['=3F', '=5F', '_'],
                quoted_printable_encode($input)
            );
            break;
        default:
            throw new Exception('Unknown encoding method: ' . $method);
    }

    return sprintf('=?%s?%s?%s?=', $charset, $method, $encoded);
}

$input = 'Welcome to the fancy è club!'; // utf8
$utf8 = $input;
$iso8859_1 = mb_convert_encoding($input, 'iso-8859-1', 'utf-8');
$cp1252 = mb_convert_encoding($input, 'cp1252', 'utf-8');

var_dump(
    $utf8,
    encode_subject($utf8, 'utf-8', 'B'),
    encode_subject($utf8, 'utf-8', 'Q'),
    $iso8859_1,
    encode_subject($iso8859_1, 'iso-8859-1', 'B'),
    encode_subject($iso8859_1, 'iso-8859-1', 'Q'),
    $cp1252,
    encode_subject($cp1252, 'cp1252', 'B'),
    encode_subject($cp1252, 'cp1252', 'Q')
);

Output:

string(29) "Welcome to the fancy è club!"
string(52) "=?utf-8?B?V2VsY29tZSB0byB0aGUgZmFuY3kgw6ggY2x1YiE=?="
string(45) "=?utf-8?Q?Welcome_to_the_fancy_=C3=A8_club!?="

string(28) "Welcome to the fancy � club!"
string(57) "=?iso-8859-1?B?V2VsY29tZSB0byB0aGUgZmFuY3kg6CBjbHViIQ==?="
string(47) "=?iso-8859-1?Q?Welcome_to_the_fancy_=E8_club!?="

string(28) "Welcome to the fancy � club!"
string(53) "=?cp1252?B?V2VsY29tZSB0byB0aGUgZmFuY3kg6CBjbHViIQ==?="
string(43) "=?cp1252?Q?Welcome_to_the_fancy_=E8_club!?="

So whatever charset you're sending your emails as, use that to encode the subject as well. If your recipients are using old, busted mail clients that can't properly decode text in the language that they probably speak, then they have much larger problems that you have nothing to do with.
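
One way to sanity-check an encoded subject before sending is to decode it again and compare it with the original. The sketch below assumes the iconv extension is available and reuses the UTF-8 Base64 subject from the output above.

```php
<?php
// Round-trip check: decode an encoded subject back to UTF-8 and compare.
$original = "Welcome to the fancy \xC3\xA8 club!"; // UTF-8 bytes for è
$encoded  = '=?utf-8?B?' . base64_encode($original) . '?=';

// iconv_mime_decode() converts the encoded-word into the requested charset.
$decoded = iconv_mime_decode($encoded, 0, 'UTF-8');

var_dump($encoded);
var_dump($decoded === $original);
```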

Hot Take

UTF-8 everywhere, for everything. Anything that doesn't support UTF-8 in 2020 is defective and not your problem. Unless your target market is people using Windows ME or a Palm Pilot from 2004, use UTF-8.

PHP mail: normal characters from link are being replaced

This is RFC 2045 quoted-printable encoding and is entirely normal. The problem is that you're declaring a content-transfer-encoding, but not encoding the content to match, so anything that looks like QP-encoding is getting decoded incorrectly. You need to apply it to the entire MIME part (which in your case is the whole message), not just the URL, using quoted_printable_encode like this:

mail($email, $subject, quoted_printable_encode($message), $headers);

Calling this will also wrap your text to 76 character lines, but that will not affect the appearance of the delivered message as the encoding is lossless.
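
To illustrate why the mismatch corrupts links, here is a small sketch with a made-up URL: encoding escapes the "=" so that it survives decoding, whereas decoding an unencoded string treats "=AB" as a single byte.

```php
<?php
// Hypothetical link; any URL containing "=" followed by two hex digits
// is affected in the same way.
$link = 'https://example.com/reset?id=AB12';

// Proper encoding: "=" becomes "=3D", so decoding restores the original.
$encoded = quoted_printable_encode($link);

// What a mail client effectively does to an unencoded body that is
// declared as quoted-printable: "=AB" is read as the single byte 0xAB.
$corrupted = quoted_printable_decode($link);

var_dump($encoded);
var_dump(quoted_printable_decode($encoded) === $link);
var_dump($corrupted === $link);
```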

And please don't tag your questions as PHPMailer if you're not using PHPMailer.


