How to Iterate a UTF-8 String in PHP

How to iterate a UTF-8 string in PHP?

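Use preg_split. With the "u" modifier, the pattern and the subject are treated as UTF-8, so an empty pattern splits the string between characters rather than between bytes.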

$chrArray = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
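A minimal sketch of how that split can drive a loop. The helper name utf8Chars is hypothetical; the check for false is there because preg_split fails on input that is not valid UTF-8 when the "u" modifier is used.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
function utf8Chars(string $str): array
{
    // The "u" modifier makes PCRE treat the subject as UTF-8.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() returns false on malformed UTF-8 input.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $chars;
}

// Print each character with its position in the string.
foreach (utf8Chars("h\u{00E9}llo") as $i => $char) {
    echo $i, ': ', $char, PHP_EOL;
}
```

The "\u{00E9}" escape (é) is used only so the example does not depend on the encoding of the source file.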

How to Iterate a UTF-8 String to MySQL

If you want to generate raw SQL queries, you can do so with find and replace in your text editor (it looks like Notepad++). I'm guessing that your delimiters are tabs.

  1. Replace every tab character with a comma. Nothing needs to be quoted, because all of your fields are integers.
  2. Replace every newline with the text that closes one INSERT statement and opens the next.

Execute these commands in regular expression mode:

Columns

Find: \t

Replace: ,

Rows

Find: \r\n (if that doesn't find anything, look for \n)

Replace: );\r\nINSERT INTO Rating (user_id, item_id, rating, timestamp) VALUES (

On the first row, insert the text INSERT INTO Rating (user_id, item_id, rating, timestamp) VALUES ( to make the row a valid SQL statement.

On the last row, remove any trailing portion of the SQL query after the last semicolon.

Copy and paste the result into phpMyAdmin and it should be all good.
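The same conversion can also be scripted instead of done by hand. This is only a sketch under the assumptions stated above (tab-delimited rows, integer fields, the Rating table and its four columns); the helper name tsvToInserts is hypothetical.

```php
<?php
// Hypothetical helper: turn tab-separated integer rows into INSERT statements.
function tsvToInserts(string $tsv): array
{
    $prefix = 'INSERT INTO Rating (user_id, item_id, rating, timestamp) VALUES (';
    $statements = [];

    // Split on \r\n or \n, the two newline patterns mentioned above.
    foreach (preg_split('/\r?\n/', trim($tsv)) as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        $fields = explode("\t", $line);
        foreach ($fields as $field) {
            // Values are emitted unquoted, so accept integers only.
            if (!preg_match('/^-?\d+$/D', $field)) {
                throw new InvalidArgumentException("Non-integer field: $field");
            }
        }
        $statements[] = $prefix . implode(',', $fields) . ');';
    }
    return $statements;
}

// Sample rows invented for illustration.
echo implode(PHP_EOL, tsvToInserts("1\t10\t5\t978300760\r\n2\t20\t3\t978302109")), PHP_EOL;
```

The integer check stands in for quoting: anything that is not a plain integer is rejected rather than written into the query.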

PHP - iterate on string characters

Use str_split to iterate ASCII strings (since PHP 5.0)

If your string contains only ASCII (i.e. plain "English") characters, use str_split, which splits the string byte by byte.

$str = 'some text';
foreach (str_split($str) as $char) {
    var_dump($char);
}

Use mb_str_split to iterate Unicode strings (since PHP 7.4)

If your string might contain multibyte Unicode (i.e. "non-English") characters, use mb_str_split, which splits by character rather than by byte.

$str = 'μυρτιὲς δὲν θὰ βρῶ';
foreach (mb_str_split($str) as $char) {
    var_dump($char);
}
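To see why the choice matters, here is a small sketch that runs both functions on the same string. The string is written with \u{...} escapes so the byte counts do not depend on how the source file is saved; UTF-8 is assumed as the encoding.

```php
<?php
// delta, epsilon with varia, nu: the word "δὲν" as three code points
$str = "\u{03B4}\u{1F72}\u{03BD}";

// str_split() cuts the string into single bytes.
$bytes = str_split($str);

// mb_str_split() cuts it into characters (requires PHP 7.4 or later).
$chars = mb_str_split($str, 1, 'UTF-8');

printf("str_split:    %d pieces\n", count($bytes)); // 7 pieces (2 + 3 + 2 bytes)
printf("mb_str_split: %d pieces\n", count($chars)); // 3 pieces
```

Each piece from str_split is a lone byte, so printing one on its own produces a broken symbol; the pieces from mb_str_split are complete characters.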

How do I get the length of a UTF-8 string in PHP?

It sounds like the input ($string) is in another encoding - probably iso-8859-1 (especially if mb_strlen() == strlen()).

If $string has come from a form input, you need to ensure that the form is posting in UTF-8. Unless specified, the browser submits the form in the page's own encoding, which is often ISO-8859-1.

In any reasonably modern browser, you can request UTF-8 with the accept-charset attribute:

<form action="form.php" method="POST" accept-charset="utf-8">
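On the PHP side, the encoding can be checked before measuring the length. The sketch below assumes that anything that is not valid UTF-8 is ISO-8859-1, which is a guess in the spirit of the answer above rather than something PHP can detect reliably; ensureUtf8 is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return $string as UTF-8, converting from ISO-8859-1
// when the bytes are not already valid UTF-8 (the fallback is an assumption).
function ensureUtf8(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";           // "café" encoded as ISO-8859-1: 4 bytes
$utf8   = ensureUtf8($latin1); // "café" encoded as UTF-8: 5 bytes

var_dump(strlen($utf8));             // int(5), length in bytes
var_dump(mb_strlen($utf8, 'UTF-8')); // int(4), length in characters
```

Passing 'UTF-8' to mb_strlen explicitly avoids depending on the configured internal encoding.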

How to get a UTF-8 string using a COM object in PHP?

Well, the answer was right in front of my eyes; I just overlooked it:

COM::__construct ( string $module_name [, mixed $server_name [, int $codepage [, string $typelib ]]] )

There is a codepage parameter. If it is set to CP_UTF8, it works.

$server_name should be NULL if no server is used.
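A minimal sketch of that call. The COM extension is only available on Windows, so the hypothetical helper createUtf8Com returns null when the class is missing; 'Scripting.FileSystemObject' is just an example module name, not something taken from the question.

```php
<?php
// Hypothetical helper: create a COM object that exchanges strings as UTF-8.
// Returns null when the COM extension (Windows only) is not loaded.
function createUtf8Com(string $moduleName): ?object
{
    if (!class_exists('COM')) {
        return null;
    }
    // NULL server name: no remote server is used.
    // CP_UTF8 as the codepage: strings are converted to and from UTF-8.
    return new COM($moduleName, null, CP_UTF8);
}

// Example module name only; substitute the component you actually use.
$fso = createUtf8Com('Scripting.FileSystemObject');
var_dump($fso !== null);
```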

Why does a string with encoding 'UTF-8' show broken symbols when I loop over it?

Within UTF-8, "ö" is encoded using more than one byte.

PHP strings are dumb byte arrays; PHP is not aware of "characters" or such at all.

Accessing string offsets using $str[x] accesses one specific byte; strlen reports the length in bytes, not "characters".

Put all this together and the result is that you're accessing individual bytes rather than characters, and in the case of "ö" that results in outputting half of a character/nonsensical bytes.

Use the mb_ functions to iterate and access strings properly by character, not by byte count: mb_strlen, mb_substr.
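The difference can be sketched with a short word containing "ö". The string is built with a \u{...} escape so the example does not depend on the file's encoding, and 'UTF-8' is passed explicitly as an assumption about the data.

```php
<?php
$str = "sch\u{00F6}n"; // "schön"; the ö occupies two bytes in UTF-8

// Byte-wise loop: $str[$i] yields single bytes, so ö is split in two.
for ($i = 0; $i < strlen($str); $i++) {
    echo bin2hex($str[$i]), ' ';
}
echo PHP_EOL;

// Character-wise loop: mb_strlen() and mb_substr() count characters.
$chars = [];
$length = mb_strlen($str, 'UTF-8');
for ($i = 0; $i < $length; $i++) {
    $chars[] = mb_substr($str, $i, 1, 'UTF-8');
}
echo implode(' ', $chars), PHP_EOL;
```

The first loop runs six times and the second five. For long strings, calling mb_substr at every index rescans the string each time, so splitting once up front is usually the cheaper option.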


