Send emails with international accents and special characters
You need to use MIME. Add mail headers:
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
(If you are already using a MIME multipart/alternative to put HTML and text in the same mail, put the Content-Type: text/plain;charset=utf-8 in the sub-headers of the text part instead.)
This assumes that the encoding you'll be sending your “international” characters in is UTF-8. If you expect to cater for multiple countries, UTF-8 is the only reasonable choice of encoding to use throughout your application, but if you haven't really thought about that yet, your site may be defaulting to a Western European encoding. Check that things like Chinese characters work correctly in your site and database before worrying about them in mail.
Derail: there are locales where sending mail in UTF-8 isn't the most effective thing. I don't know about China, but in Japan there are still some backwards and ridiculous mail systems (especially webmail) that can't cope with Unicode and have to be given a locale-specific encoding such as Shift-JIS instead. If you are concentrating on those markets you'll often end up having to use iconv to create specially-encoded versions of the mail. Unpleasant.
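As a rough sketch of that conversion, assuming the iconv extension is loaded and Shift_JIS is the target: the helper name to_legacy_charset is made up for illustration, and how unconvertible characters are reported varies between iconv implementations.

```php
<?php
// Hypothetical helper: convert a UTF-8 string to a locale-specific encoding.
// Returns null when iconv() reports that the conversion failed.
function to_legacy_charset(string $utf8Text, string $charset = 'SHIFT_JIS'): ?string
{
    // @ silences the notice iconv() can raise on unconvertible input;
    // the false return value is checked instead.
    $converted = @iconv('UTF-8', $charset, $utf8Text);
    return $converted === false ? null : $converted;
}

// U+65E5 U+672C U+8A9E is the word for "Japanese (language)".
$body = to_legacy_charset("\u{65E5}\u{672C}\u{8A9E}");
if ($body !== null) {
    // The declared charset has to match the bytes actually sent.
    $contentType = 'Content-Type: text/plain; charset=Shift_JIS';
}
```

The same conversion would have to be applied to any header text before it is encoded, since the charset named there must match too.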
Now, because many mail servers can't cope with non-ASCII characters in the mail body, you'll have to encode them. You can choose quoted-printable or base64 for this; quoted-printable is generally smaller and more readable for content that has ASCII characters in it too:
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hello! An a-acute is =C3=A1
The function to encode in this format is quoted_printable_encode. However, it requires PHP 5.3.0 or later; if you don't have it you could set the Content-Transfer-Encoding to base64 instead and use base64_encode.
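A minimal sketch of putting those headers and the encoded body together, assuming a PHP version that has quoted_printable_encode. The helper name encode_utf8_body is hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper: encode a UTF-8 body as quoted-printable.
// Returns [extra headers as one string, encoded body].
function encode_utf8_body(string $text): array
{
    $headers = implode("\r\n", [
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=utf-8',
        'Content-Transfer-Encoding: quoted-printable',
    ]);
    return [$headers, quoted_printable_encode($text)];
}

// "\xC3\xA1" is the UTF-8 byte sequence for an a-acute.
[$headers, $body] = encode_utf8_body("Hello! An a-acute is \xC3\xA1");
echo $body, "\n"; // Hello! An a-acute is =C3=A1
```

For the base64 variant, the body would instead be chunk_split(base64_encode($text)) and the Content-Transfer-Encoding header would read base64.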
Finally, if you want to include non-ASCII characters in the headers (for example in From, To or Subject), there is a completely different syntax:
Subject: =?utf-8?b?QW4gYS1hY3V0ZSBpcyDDoQ==?=
Where that QW...== mess in the middle is the base64_encode of “An a-acute is á” in UTF-8.
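The header syntax above can be produced with a few lines of PHP. This is only a sketch; encode_header_text is a hypothetical name.

```php
<?php
// Hypothetical helper: wrap UTF-8 text in a base64 "encoded-word" for use in a header.
function encode_header_text(string $text): string
{
    return '=?utf-8?b?' . base64_encode($text) . '?=';
}

// "\xC3\xA1" is the UTF-8 byte sequence for an a-acute.
echo 'Subject: ' . encode_header_text("An a-acute is \xC3\xA1"), "\n";
// Subject: =?utf-8?b?QW4gYS1hY3V0ZSBpcyDDoQ==?=
```

This sketch ignores the 75-character limit on a single encoded-word, so long text would need to be split across several of them. If the mbstring extension is available, mb_encode_mimeheader is probably the more robust choice. In From and To, only the display name should be encoded this way, not the address itself.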
What characters are allowed in an email address?
See RFC 5322: Internet Message Format and, to a lesser extent, RFC 5321: Simple Mail Transfer Protocol.
RFC 822 also covers email addresses, but it deals mostly with their structure:
addr-spec   = local-part "@" domain        ; global address
local-part  = word *("." word)             ; uninterpreted
                                           ; case-preserved
domain      = sub-domain *("." sub-domain)
sub-domain  = domain-ref / domain-literal
domain-ref  = atom                         ; symbolic reference
And as usual, Wikipedia has a decent article on email addresses:
The local-part of the email address may use any of these ASCII characters:
- uppercase and lowercase Latin letters A to Z and a to z;
- digits 0 to 9;
- special characters !#$%&'*+-/=?^_`{|}~;
- dot ., provided that it is not the first or last character unless quoted, and provided also that it does not appear consecutively unless quoted (e.g. John..Doe@example.com is not allowed but "John..Doe"@example.com is allowed);
- space and "(),:;<>@[\] characters are allowed with restrictions (they are only allowed inside a quoted string, as described in the paragraph below, and in addition, a backslash or double-quote must be preceded by a backslash);
- comments are allowed with parentheses at either end of the local-part; e.g. john.smith(comment)@example.com and (comment)john.smith@example.com are both equivalent to john.smith@example.com.
In addition to ASCII characters, as of 2012 you can use international characters above U+007F, encoded as UTF-8 as described in the RFC 6532 spec and explained on Wikipedia. Note that as of 2019, these standards are still marked as Proposed, but are being rolled out slowly. The changes in this spec essentially added international characters as valid alphanumeric characters (atext) without affecting the rules on allowed & restricted special characters like !# and @:.
For validation, see Using a regular expression to validate an email address.
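To make the unquoted local-part rules concrete, here is a small sketch. The function name is hypothetical, and it deliberately ignores quoted strings, comments and the international characters from RFC 6532.

```php
<?php
// Hypothetical helper: checks an unquoted local-part against the ASCII rules listed above.
// Quoted strings, comments, and international characters are deliberately not handled.
function is_valid_unquoted_local_part(string $local): bool
{
    // Letters, digits, and the permitted special characters (atext).
    $atext = 'A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-';
    // One or more atext runs separated by single dots: no leading, trailing, or doubled dot.
    $pattern = '/^[' . $atext . ']+(?:\.[' . $atext . ']+)*$/D';
    return preg_match($pattern, $local) === 1;
}

var_dump(is_valid_unquoted_local_part('john.smith')); // bool(true)
var_dump(is_valid_unquoted_local_part('John..Doe'));  // bool(false)
```

For real-world validation, PHP's built-in filter_var($address, FILTER_VALIDATE_EMAIL) is usually a better starting point than a hand-written pattern.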
The domain part is defined as follows:
The Internet standards (Request for Comments) for protocols mandate that component hostname labels may contain only the ASCII letters a through z (in a case-insensitive manner), the digits 0 through 9, and the hyphen (-). The original specification of hostnames in RFC 952 mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits. No other symbols, punctuation characters, or blank spaces are permitted.
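The label rules quoted above translate into a short check. This is a sketch only: is_valid_hostname is a hypothetical name, and length limits, domain literals and internationalized domain names are out of scope.

```php
<?php
// Hypothetical helper: checks each dot-separated label against the rules quoted above
// (letters, digits, hyphen; no leading or trailing hyphen; a leading digit is fine per RFC 1123).
// Length limits, domain literals, and internationalized names are not checked here.
function is_valid_hostname(string $host): bool
{
    foreach (explode('.', $host) as $label) {
        if (preg_match('/^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/D', $label) !== 1) {
            return false;
        }
    }
    return true;
}

var_dump(is_valid_hostname('3com.example.com')); // bool(true)
var_dump(is_valid_hostname('-bad.example.com')); // bool(false)
```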
php mail special characters utf8
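Did you try iconv_set_encoding?
Note that the strings should not be passed through utf8_decode: it converts them to ISO-8859-1, which contradicts the declared UTF-8 charset and turns characters outside ISO-8859-1 (such as the dash and apostrophes below) into ?. Something like this should work:
<?php
// Deprecated since PHP 5.6; newer versions rely on the default_charset setting instead.
iconv_set_encoding("internal_encoding", "UTF-8");
$subject = "Testmail — Special Characters";
$msg = "Hi there,\n\nthis isn’t something easy.\n\nI haven’t thought that it’s that complicated!";
// $to is the recipient; $from is assumed to hold a full header line such as "From: me@example.com".
$headers = $from . "\r\nMIME-Version: 1.0\r\nContent-Type: text/plain; charset=UTF-8\r\nContent-Transfer-Encoding: 8bit";
// No utf8_decode(): the strings stay UTF-8, and the subject becomes an RFC 2047 encoded-word.
mail($to, '=?UTF-8?B?' . base64_encode($subject) . '?=', $msg, $headers);
?>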