ucfirst() function for multibyte character encodings
Before PHP 8.4 there was no built-in mb_ucfirst function, as you've already noticed. On older versions you can fake an mb_ucfirst with two mb_substr calls:
function mb_ucfirst($string, $encoding)
{
    $firstChar = mb_substr($string, 0, 1, $encoding);
    $then = mb_substr($string, 1, null, $encoding);
    return mb_strtoupper($firstChar, $encoding) . $then;
}
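Since PHP 8.4 ships a native mb_ucfirst(), declaring a function with that name fails there. A minimal sketch of one way around this, assuming a wrapper name (mb_ucfirst_compat is hypothetical, not part of PHP or of the answer above) and a UTF-8 default encoding:

```php
<?php
// mb_ucfirst_compat() is a hypothetical name chosen for this sketch.
function mb_ucfirst_compat(string $string, string $encoding = 'UTF-8'): string
{
    // Prefer an existing mb_ucfirst() (native on PHP 8.4+) when there is one.
    if (function_exists('mb_ucfirst')) {
        return mb_ucfirst($string, $encoding);
    }

    // Fallback: uppercase the first character and append the untouched rest.
    $firstChar = mb_substr($string, 0, 1, $encoding);
    $rest = mb_substr($string, 1, null, $encoding);

    return mb_strtoupper($firstChar, $encoding) . $rest;
}

echo mb_ucfirst_compat('élan vital'), PHP_EOL; // Élan vital
```

The wrapper keeps calling code identical on old and new PHP versions; the source file itself has to be saved as UTF-8 for the literal to work.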
UTF-8 ucfirst doesn't work
This works (I know it does; I'm using it in my own projects):
function mb_ucfirst($string, $encoding = 'UTF-8') {
    $firstChar = mb_substr($string, 0, 1, $encoding);
    $then = mb_substr($string, 1, mb_strlen($string, $encoding) - 1, $encoding);
    return mb_strtoupper($firstChar, $encoding) . $then;
} // end function mb_ucfirst
Use it as mb_ucfirst($string);
Complete example:
<?php
$string = mb_ucfirst("ååååeee");
echo $string;

function mb_ucfirst($string, $encoding = 'UTF-8') {
    $firstChar = mb_substr($string, 0, 1, $encoding);
    $then = mb_substr($string, 1, mb_strlen($string, $encoding) - 1, $encoding);
    return mb_strtoupper($firstChar, $encoding) . $then;
} // end function mb_ucfirst
?>
ucfirst() not working properly with Scandinavian characters
Your problem here is not ucfirst(); it's strtolower(). You have to use mb_strtolower() to get your string in lower case, e.g.
echo ucfirst(mb_strtolower($str));
//           ^^^^^^^^^^^^^^ See here
You can also find a multibyte version of ucfirst() in the comments from the manual:
Simple multi-bytes ucfirst():
<?php
function my_mb_ucfirst($str) {
    $fc = mb_strtoupper(mb_substr($str, 0, 1));
    return $fc . mb_substr($str, 1);
}
Code by plemieux, from a comment in the PHP manual.
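To show where each approach applies: ucfirst() on top of mb_strtolower() is enough when the first letter is plain ASCII, while a first letter that is itself multibyte needs the mb_ variant. A small sketch, assuming UTF-8 input (capitalize_utf8 is a hypothetical helper name, not from the manual comment):

```php
<?php
// capitalize_utf8() is a hypothetical helper name used only for this sketch.
function capitalize_utf8(string $str): string
{
    $lower = mb_strtolower($str, 'UTF-8');      // lowercase every character
    $first = mb_substr($lower, 0, 1, 'UTF-8');  // first character, not first byte

    return mb_strtoupper($first, 'UTF-8') . mb_substr($lower, 1, null, 'UTF-8');
}

// ASCII first letter: plain ucfirst() on the lowercased string is sufficient.
echo ucfirst(mb_strtolower('BLÅBÆR', 'UTF-8')), PHP_EOL; // Blåbær

// Multibyte first letter: the mb_ functions are needed for the capital too.
echo capitalize_utf8('ÆBLE'), PHP_EOL; // Æble
```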
PHP function UTF-8 characters issue
The most straightforward way to make your code UTF-8 aware is to use mbstring functions instead of their plain byte-oriented counterparts in the three places where the latter appear:
function sentenceCase($str)
{
    $cap = true;
    $ret = '';
    for ($x = 0; $x < mb_strlen($str); $x++) { // mb_strlen instead
        $letter = mb_substr($str, $x, 1); // mb_substr instead
        if ($letter == "." || $letter == "!" || $letter == "?") {
            $cap = true;
        } elseif ($letter != " " && $cap == true) {
            $letter = mb_strtoupper($letter); // mb_strtoupper instead
            $cap = false;
        }
        $ret .= $letter;
    }
    return $ret;
}
You can then configure mbstring to work with UTF-8 strings and you are ready to go:
mb_internal_encoding('UTF-8');
echo sentenceCase ("üias skdfnsknka");
Bonus solution
Specifically for UTF-8 you can also use a regular expression, which will result in less code:
$str = "üias skdfnsknka";
echo preg_replace_callback(
'/((?:^|[!.?])\s*)(\p{Ll})/u',
function($match) { return $match[1].mb_strtoupper($match[2], 'UTF-8'); },
$str);
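As a rough illustration of how the pattern behaves across several sentences, it can be wrapped in a function. This is only a sketch: sentence_case_regex is a hypothetical name, and falling back to the original string on a regex error is an assumption, not part of the answer above.

```php
<?php
// sentence_case_regex() is a hypothetical wrapper around the pattern shown above.
function sentence_case_regex(string $str): string
{
    $result = preg_replace_callback(
        '/((?:^|[!.?])\s*)(\p{Ll})/u',
        // $m[1]: start of string or end punctuation, plus any whitespace
        // $m[2]: the lowercase letter that follows it
        function (array $m): string {
            return $m[1] . mb_strtoupper($m[2], 'UTF-8');
        },
        $str
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8 input).
    return $result ?? $str;
}

echo sentence_case_regex('üias skdfnsknka. élan! ok?'), PHP_EOL;
// Üias skdfnsknka. Élan! Ok?
```

The /u modifier makes the pattern and subject be treated as UTF-8, and \p{Ll} matches any lowercase letter, so non-ASCII letters such as ü are picked up.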
strtolower() for unicode/multibyte strings
Have you tried using mb_strtolower()?
Encoding of Danish letters
Use mb_strtoupper and specify the character encoding in mb_substr:
echo mb_strtoupper(mb_substr('ølstykke', 0, 1, 'UTF-8'), 'UTF-8'); // Ø
If you want the whole string back with a capital first letter, not just the first character on its own, the mb_convert_case function can help. Note that MB_CASE_TITLE capitalizes the first letter of every word.
echo mb_convert_case('ølstykke', MB_CASE_TITLE, "UTF-8");//Ølstykke
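The difference between the two calls shows up once the string has more than one word. A short comparison, using a made-up input string:

```php
<?php
$place = 'ølstykke station'; // illustrative input, not from the question

// MB_CASE_TITLE capitalizes each word and lowercases the remaining letters.
echo mb_convert_case($place, MB_CASE_TITLE, 'UTF-8'), PHP_EOL; // Ølstykke Station

// Uppercasing only the first character leaves the rest of the string as it was.
$first = mb_strtoupper(mb_substr($place, 0, 1, 'UTF-8'), 'UTF-8');
echo $first . mb_substr($place, 1, null, 'UTF-8'), PHP_EOL; // Ølstykke station
```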
How to make first letter of a word capital?
You may want to use ucfirst().
For multibyte strings, please see this snippet.