Regex to ignore accents? PHP
I don't think there is such a way. It would be locale-dependent, and you would probably want the "/u" modifier first to enable UTF-8 in pattern strings.
I would probably do something like this:
function prepare($pattern)
{
    // Each class also contains the base letter, so the unaccented form still matches.
    $replacements = array("a" => "[aáàäâ]",
                          "e" => "[eéèëê]" /* ... */);
    return str_replace(array_keys($replacements), $replacements, $pattern);
}
preg_replace("/(" . prepare($word) . ")/ui", "<b>\\1</b>", $str);
In your case, the index was different because, unless you used the mbstring functions, you were probably dealing with UTF-8, which uses more than one byte per character.
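To make the byte-versus-character point concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the sample text is made up for illustration.

```php
<?php
// Hypothetical sample text; 'é' occupies two bytes in UTF-8.
$text = 'café au lait';

$byteLength = strlen($text);                      // counts bytes
$charLength = mb_strlen($text, 'UTF-8');          // counts characters

$byteIndex = strpos($text, 'au');                 // offset in bytes
$charIndex = mb_strpos($text, 'au', 0, 'UTF-8');  // offset in characters

printf("bytes: %d, characters: %d\n", $byteLength, $charLength); // bytes: 13, characters: 12
printf("strpos: %d, mb_strpos: %d\n", $byteIndex, $charIndex);   // strpos: 6, mb_strpos: 5
```

Every accented letter before the match shifts the byte offset by one more position, which is why the two indexes drift apart.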
Regex to match string with and without special/accented characters?
You can use the \p{L} pattern to match any letter.
You have to use the u modifier after the regular expression to enable Unicode mode.
Example: /\p{L}+/u
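A small sketch of what that pattern does in practice, next to an ASCII-only class for comparison. The sample text is made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample text containing letters outside ASCII.
$text = 'Ærøskøbing, Köln & São Paulo 2024';

// \p{L}+ with the u modifier matches whole words, accents included.
preg_match_all('/\p{L}+/u', $text, $matches);
print_r($matches[0]); // Ærøskøbing, Köln, São, Paulo

// An ASCII-only class breaks the words apart at every accented letter.
preg_match_all('/[a-zA-Z]+/', $text, $asciiMatches);
print_r($asciiMatches[0]);
```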
Edit:
Try something like this. It should replace every accented letter in the search term with a sub-pattern that matches the accented letter (both as a single precomposed character and as a base letter followed by a Unicode combining mark) and the unaccented letter. You can then use the resulting search pattern to highlight your text.
function mbStringToArray($string)
{
    // Split a UTF-8 string into an array of characters (not bytes).
    $array = array();
    $strlen = mb_strlen($string, "UTF-8");
    while ($strlen)
    {
        $array[] = mb_substr($string, 0, 1, "UTF-8");
        $string = mb_substr($string, 1, $strlen, "UTF-8");
        $strlen = mb_strlen($string, "UTF-8");
    }
    return $array;
}
// I had to use this ugly function to remove accents as iconv didn't work properly on my test server.
// Note: utf8_encode() and utf8_decode() are deprecated as of PHP 8.2.
function stripAccents($stripAccents){
    return utf8_encode(strtr(utf8_decode($stripAccents),utf8_decode('àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ'),'aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY'));
}
$clientName = 'céra';
$clientNameNoAccent = stripAccents($clientName);
$clientNameArray = mbStringToArray($clientName);
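foreach($clientNameArray as $pos => &$char)
{
    $charNA = $clientNameNoAccent[$pos];
    if($char != $charNA)
    {
        // accented letter, plain letter, or plain letter plus a combining mark
        $char = "(?:$char|$charNA|$charNA\p{M})";
    }
}
unset($char); // break the reference left over from the foreach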
$clientSearchPattern = implode($clientNameArray); // c(?:é|e|e\p{M})ra
$text = 'the client name is Céra but it could be Cera or céra too.';
$search = preg_replace('/(.*?)(' . $clientSearchPattern . ')(.*?)/iu', '$1<span class="highlight">$2</span>$3', $text);
echo $search; // the client name is <span class="highlight">Céra</span> but it could be <span class="highlight">Cera</span> or <span class="highlight">céra</span> too.
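Why the \p{M} alternative matters: the same visible "é" can be stored either as one code point or as a plain "e" followed by a combining accent. The sketch below uses made-up sample strings and the "\u{...}" escape, which needs PHP 7.0 or later, so the encodings are explicit.

```php
<?php
// Two encodings of the same visible word (hypothetical sample data).
$precomposed = "C\u{00E9}ra";   // é as the single code point U+00E9
$decomposed  = "Ce\u{0301}ra";  // e followed by U+0301 COMBINING ACUTE ACCENT
$plain       = "Cera";

// Same shape as the generated pattern: accented | plain + mark | plain.
$withMark    = "/c(?:\u{00E9}|e\\p{M}|e)ra/iu";
// For comparison, a pattern without the combining-mark alternative.
$withoutMark = "/c(?:\u{00E9}|e)ra/iu";

var_dump($precomposed === $decomposed); // bool(false): different byte sequences

foreach ([$precomposed, $decomposed, $plain] as $candidate) {
    var_dump(preg_match($withMark, $candidate) === 1); // bool(true) for all three
}

var_dump(preg_match($withoutMark, $decomposed) === 1); // bool(false)
```

Without the combining-mark branch, text that arrives in decomposed form would silently go unhighlighted.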
regex to also match accented characters
$search = str_replace(
    ['a','e','i','o','u','ñ'],
    ['[aá]','[eé]','[ií]','[oó]','[uú]','[nñ]'],
    $search);
This, and the same for upper case, will fulfil your request. A side note: the ñ replacement sounds invalid to me, as 'niño' is totally different from 'nino'.
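To see what the side note means in practice, here is a rough sketch comparing the mapping with and without the ñ entry. The search term and the text are made up; the file is assumed to be saved as UTF-8.

```php
<?php
$vowels       = ['a', 'e', 'i', 'o', 'u'];
$vowelClasses = ['[aá]', '[eé]', '[ií]', '[oó]', '[uú]'];

// With the ñ entry: a search for 'niño' also accepts a plain 'n'.
$loose = str_replace(array_merge($vowels, ['ñ']), array_merge($vowelClasses, ['[nñ]']), 'niño');
// Without it: the ñ has to be present in the text.
$strict = str_replace($vowels, $vowelClasses, 'niño');

echo $loose, "\n";   // n[ií][nñ][oó]
echo $strict, "\n";  // n[ií]ñ[oó]

var_dump(preg_match('/' . $loose . '/u', 'el nino') === 1);   // bool(true)
var_dump(preg_match('/' . $strict . '/u', 'el nino') === 1);  // bool(false)
var_dump(preg_match('/' . $strict . '/u', 'el niño') === 1);  // bool(true)
```

Whether the loose behaviour is wanted depends on the language of the text; for Spanish, leaving ñ out of the mapping is probably the safer choice.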
PHP-REGEX: accented letters matches non-accented ones, and vice versa. How to achieve this?
You can write a function that builds the regular expression from your txt_search, replacing each letter that has accented variants with a character class listing all of those variants, like this:
function search_term($txt_search) {
    // Escape regex metacharacters, including the "/" delimiter.
    $search = preg_quote($txt_search, '/');
    $search = preg_replace('/[aàáâãåäæ]/iu', '[aàáâãåäæ]', $search);
    $search = preg_replace('/[eèéêë]/iu', '[eèéêë]', $search);
    $search = preg_replace('/[iìíîï]/iu', '[iìíîï]', $search);
    $search = preg_replace('/[oòóôõöø]/iu', '[oòóôõöø]', $search);
    $search = preg_replace('/[uùúûü]/iu', '[uùúûü]', $search);
    // add any other character
    return $search;
}
Then use the result as the pattern in your preg_replace call, together with the u modifier (and i if the match should be case-insensitive).
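A usage sketch, with a compact loop-based variant of the function so the block runs on its own. The "/" delimiter, the b tag and the sample text are assumptions for illustration, not part of the original answer.

```php
<?php
// Loop-based variant of search_term(): one character class per letter group.
function search_term(string $txt_search): string
{
    $search = preg_quote($txt_search, '/'); // escape metacharacters and the delimiter
    foreach (['aàáâãåäæ', 'eèéêë', 'iìíîï', 'oòóôõöø', 'uùúûü'] as $group) {
        $search = preg_replace('/[' . $group . ']/iu', '[' . $group . ']', $search);
    }
    return $search;
}

$term = search_term('cafe');
echo $term, "\n"; // c[aàáâãåäæ]f[eèéêë]

// Hypothetical text to highlight; $0 stands for the whole match.
$text = 'Un café, un CAFÉ y un cafe.';
$highlighted = preg_replace('/' . $term . '/iu', '<b>$0</b>', $text);
echo $highlighted, "\n"; // Un <b>café</b>, un <b>CAFÉ</b> y un <b>cafe</b>.
```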
How to match with regex unicode text ignoring diacritics on characters (Á É Í)
I finally found a working solution thanks to Tibor's answer to "Regex to ignore accents? PHP".
My function highlights text while ignoring diacritics, spaces, apostrophes and dashes:
function highlight($pattern, $string)
{
    //split into characters, not bytes, so multibyte letters in the search term stay intact
    $array = preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY);
    //add or remove characters to be ignored
    $pattern = implode('[\s\'\-]*', $array);
    //list of letters with diacritics
    $replacements = Array("a" => "[áa]", "e"=>"[ée]", "i"=>"[íi]", "o"=>"[óo]", "u"=>"[úu]", "A" => "[ÁA]", "E"=>"[ÉE]", "I"=>"[ÍI]", "O"=>"[ÓO]", "U"=>"[ÚU]");
    $pattern = str_replace(array_keys($replacements), $replacements, $pattern);
    //instead of <u> you can use <b>, <i> or even <div> or <span> with css class
    return preg_replace("/(" . $pattern . ")/ui", "<u>\\1</u>", $string);
}
Replacing accented characters php
I tried many of the variations listed in the answers, but only the following worked:
$unwanted_array = array( 'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );
$str = strtr( $str, $unwanted_array );
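A short sketch of how the two-argument form of strtr behaves, using a deliberately abbreviated map and a made-up input string. The array form replaces whole keys, so multibyte UTF-8 letters are handled safely; the three-argument form of strtr translates byte by byte and is not suitable for UTF-8 input.

```php
<?php
// Abbreviated, hypothetical subset of the full map.
$unwanted_array = ['É' => 'E', 'é' => 'e', 'ñ' => 'n', 'ü' => 'u', 'ß' => 'Ss'];

$str = 'Égalité, señor Müller, Straße';
$str = strtr($str, $unwanted_array);

echo $str, "\n"; // Egalite, senor Muller, StraSse
```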
Remove special characters in regex PHP that allow accented words and chinese language
$string = preg_replace('/\PL/u', '', $string);
- L is a character attribute meaning letter
- \P means does not match the attribute
- /u is the Unicode modifier; you need this if you want to handle Unicode characters
- make sure $string is encoded in UTF-8
So this matches all non-letters and removes them. I can only guess that this matches what you want. See http://www.php.net/manual/en/regexp.reference.unicode.php for more attributes you could match by, e.g. /[^\pL\pS]/u would match everything except letters and "symbols".
Regex that checks upper or lower case characters with or without accents
I see no reason why adding \s to that regex would not work. \s should match all whitespace characters.
$foo = preg_replace("/[^áéíóúÁÉÍÓÚñÑa-zA-Z\s]/u", "", $_REQUEST["bar"]);
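One caveat worth sketching: a character class that lists accented letters is only safe when the pattern carries the u modifier. Without it, PCRE treats the class as a set of single bytes and can cut a multibyte character in half. The input below is made up, and mb_check_encoding assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical input; 'ã' is not in the allowed set.
$input = 'São Paulo';

// Byte mode: only the second byte of 'ã' is removed, leaving broken UTF-8.
$withoutU = preg_replace("/[^áéíóúÁÉÍÓÚñÑa-zA-Z\s]/", "", $input);
var_dump(mb_check_encoding($withoutU, 'UTF-8')); // bool(false)

// UTF-8 mode: 'ã' is removed as one whole character.
$withU = preg_replace("/[^áéíóúÁÉÍÓÚñÑa-zA-Z\s]/u", "", $input);
var_dump($withU);                                // string(8) "So Paulo"
var_dump(mb_check_encoding($withU, 'UTF-8'));    // bool(true)
```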