PHP: How to Explode a String by Commas, But Not Where the Commas Are Within Quotes

PHP: How can I explode a string by commas, but not where the commas are within quotes?

Since you are using comma-separated values, you can use str_getcsv.

str_getcsv($line, ",", "'");

This returns:

Array
(
[0] => TRUE
[1] => 59
[2] => A large number is 10,000
)
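
The call above does not show its input. As a minimal sketch, the $line value below is an assumption inferred from the printed result, not something taken from the original question. The fourth argument (the escape character) is passed explicitly rather than relying on the default.

```php
<?php
// Assumed input, reconstructed from the output shown above.
$line = "TRUE,59,'A large number is 10,000'";

// Delimiter ",", enclosure "'", escape character "\".
$fields = str_getcsv($line, ",", "'", "\\");

print_r($fields);
```

The enclosure characters are stripped from the returned values, so this approach fits when the quotes themselves are not needed afterwards.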

explode commas but ignore commas within brackets php

We can make a slight correction to your current regex splitting logic by using the following pattern:

,(?![^(]+\))

This says to split on a comma, but only if that comma does not occur inside a term in parentheses. It works by using a negative lookahead that checks we do not see a ) without first seeing an opening (, which would imply that the comma is inside a (...) term.

$string = "Beer - Domestic,Food - Snacks (chips,dips,nuts),Beer - Imported,UNCATEGORIZED";
$keywords = preg_split("/,(?![^(]+\))/", $string);
print_r($keywords);

This prints:

Array
(
[0] => Beer - Domestic
[1] => Food - Snacks (chips,dips,nuts)
[2] => Beer - Imported
[3] => UNCATEGORIZED
)
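
The lookahead pattern above does not track nesting depth, so it is suited to a single level of parentheses. If nested parentheses are possible, one alternative is to scan the string and count depth. This is only a sketch: splitOutsideParens is a hypothetical helper name, not a PHP built-in, and it assumes the parentheses in the input are balanced.

```php
<?php
// Hypothetical helper: split on commas that sit at parenthesis depth 0.
function splitOutsideParens(string $input): array
{
    $parts = [];
    $current = '';
    $depth = 0;
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($char === '(') {
            $depth++;
        } elseif ($char === ')' && $depth > 0) {
            $depth--;
        }

        // A comma outside all parentheses ends the current part.
        if ($char === ',' && $depth === 0) {
            $parts[] = $current;
            $current = '';
            continue;
        }

        $current .= $char;
    }

    // Whatever remains after the last comma is the final part.
    $parts[] = $current;

    return $parts;
}

print_r(splitOutsideParens('a,b (c,(d,e)),f'));
```

The loop works byte by byte, which is safe for UTF-8 input because the comma and the parentheses are single-byte ASCII characters.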

PHP split comma-separated values but retain quotes

This pushed the limits of my regex knowledge, and I was unable to come up with an elegant regex that covers all possible cases for the input string (e.g. if the string ends with a comma) without leaving empty matches at the end.

$parts = preg_split('/(?:"[^"]*"|)\K\s*(,\s*|$)/', $string);

By itself, this gives:

Array
(
[0] => Hello
[1] => "San Diego, California"
[2] =>
)

And you can clean up the empty elements like this:

$result = array_filter($parts, function ($value) {
    return ($value !== '');
});

Note: The regex trims white-space from the start/end of each match. Remove the \s* parts if you don't want that.
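
An alternative that avoids the empty trailing element is to match the fields instead of splitting on the separators. The sketch below is not the approach from the answer above. It assumes that each field is either fully wrapped in double quotes or contains no quotes at all, and the $string value is an assumption reconstructed from the output shown earlier.

```php
<?php
// Assumed input, reconstructed from the output shown above.
$string = 'Hello, "San Diego, California",';

// Match either a complete double-quoted field (quotes kept) or a run of
// non-comma characters that starts with a non-whitespace character.
preg_match_all('/"[^"]*"|[^,\s][^,]*/', $string, $matches);

// Unquoted fields may carry trailing spaces, so trim every match.
$fields = array_map('trim', $matches[0]);

print_r($fields);
```

Because only non-empty fields are matched, a trailing comma produces no extra element and no array_filter pass is needed.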

Explode except if there is

Use str_getcsv() for PHP >= 5.3:

var_dump( str_getcsv( 'hey,how,are,"yo how,are you?"'));

This prints:

array(4) { [0]=> string(3) "hey" [1]=> string(3) "how" [2]=> string(3) "are" [3]=> string(15) "yo how,are you?" }

Explode does not work with multiple commas inside the string

If you also want to remove empty entries and duplicate tags, combine array_unique with array_filter:

$textArray = array_unique(array_filter($textArray));

Note that this does not remove the "." entry from the result. Here is a better way to filter the results:

$text = "brazil,banks,home,,uk,,,,test,financial times,.,ipad,,banks,,Two words,,";
$textArray = array_unique(preg_split("/[,.]+/", $text));
$textArray = array_filter($textArray);
echo implode(",", $textArray);

Output

brazil,banks,home,uk,test,financial times,ipad,Two words
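
One caveat with the code above: array_filter without a callback also drops the string "0", because PHP treats it as falsy. A variation, offered only as a sketch, lets preg_split discard the empty pieces through its PREG_SPLIT_NO_EMPTY flag so that array_filter is not needed. The $tags variable name is illustrative.

```php
<?php
$text = "brazil,banks,home,,uk,,,,test,financial times,.,ipad,,banks,,Two words,,";

// Split on runs of commas and dots; the flag drops empty pieces.
$tags = preg_split('/[,.]+/', $text, -1, PREG_SPLIT_NO_EMPTY);

// Trim stray spaces, remove duplicates, and reindex from 0.
$tags = array_values(array_unique(array_map('trim', $tags)));

echo implode(',', $tags);
```

array_values is there because array_unique keeps the original keys, which leaves gaps in the numbering once duplicates are removed.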

Add double quotes around text separated by commas

This should work:

$parts = explode(', ', 'starbucks, KFC, McDonalds');

echo('"' . join('", "', $parts) . '"');

Note: As pointed out in the comments (thanks, nodeffect), the "split" function was deprecated as of PHP 5.3.0 and removed in PHP 7.0.0, so "explode" is used here instead.

regexp split string by commas and spaces, but ignore the inside quotes and parentheses

This will work only for non-nested parentheses:

$regex = <<<HERE
/ " ( (?:[^"\\\\]++|\\\\.)*+ ) \"
| ' ( (?:[^'\\\\]++|\\\\.)*+ ) \'
| \( ( [^)]* ) \)
| [\s,]+
/x
HERE;

$tags = preg_split($regex, $str, -1,
    PREG_SPLIT_NO_EMPTY
    | PREG_SPLIT_DELIM_CAPTURE);

The possessive quantifiers ++ and *+ consume as much as they can and give nothing back for backtracking. This technique is described in perlre(1) as the most efficient way to do this kind of matching.
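
To see what the split produces, here is a minimal usage sketch. The sample $str is an assumption made up for illustration. The pattern is the same one written as a nowdoc, so no escape processing takes place and each backslash reaches the regex engine exactly as typed.

```php
<?php
// Nowdoc: the pattern text is taken literally.
$regex = <<<'RE'
/ " ( (?:[^"\\]++|\\.)*+ ) "
| ' ( (?:[^'\\]++|\\.)*+ ) '
| \( ( [^)]* ) \)
| [\s,]+
/x
RE;

// Hypothetical input mixing all four cases.
$str = 'php, "regular expressions" (split, quotes) \'single quoted\' arrays';

// Quoted and parenthesized chunks act as delimiters, and their captured
// contents are returned thanks to PREG_SPLIT_DELIM_CAPTURE.
$tags = preg_split($regex, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

print_r($tags);
```

Note that the quotes and parentheses themselves are dropped, and an empty quoted string such as "" disappears entirely because of PREG_SPLIT_NO_EMPTY.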

How to parse Telegram message by comma, but keep commas that are in parentheses

You can chain a couple of regexes together and do some data transformation to get the result as a PHP array.

This output looks like it was printed by Python. Instead of writing a script to parse the Python dump, I'd recommend modifying the Python script to output something easier to parse.

$raw_message = "Message(id=1650, peer_id=PeerChannel(channel_id=1286966173), date=datetime.datetime(2022, 4, 15, 17, 14, 25, tzinfo=datetime.timezone.utc), message='Please check your email inbox', out=False, mentioned=False, media_unread=False, silent=False, post=True, from_scheduled=False, legacy=False, edit_hide=False, pinned=False, from_id=None, fwd_from=None, via_bot_id=None, reply_to=None, media=None, reply_markup=None, entities=[], views=382, forwards=0, replies=None, edit_date=None, post_author=None, grouped_id=None, restriction_reason=[], ttl_period=None)04-18-2022 01:25am";

$strip_message = [];
// Capture everything between "Message(" and the last ")".
preg_match("/Message\((.*)\)/", $raw_message, $strip_message);

// Split on commas that are not followed by a closing parenthesis of the same group.
// Note: a comma inside the quoted message text would also be split here.
$split_by_comma = array_map('trim', preg_split("/,(?![^()]*\))/", $strip_message[1]));

$message = [];

foreach ($split_by_comma as $element) {
    // The key is the text before the first "="; the value may itself contain "=".
    $split_by_equals = explode('=', $element);
    $key = array_shift($split_by_equals);
    $value = implode('=', $split_by_equals);
    if ($key !== 'message' && $key !== 'date') {
        $message[$key] = $value;
        continue;
    }

    if ($key === 'date') {
        // Pull the arguments out of datetime.datetime(...).
        $date_tmp = [];
        preg_match("/\((.*)\)/", $value, $date_tmp);
        $date_split = explode(', ', $date_tmp[1]);
        $date = $date_split[0] . '-' . $date_split[1] . '-' . $date_split[2] . ' ' . $date_split[3] . ':' . $date_split[4] . ':' . $date_split[5];
        $message[$key] = $date;
    }

    if ($key === 'message') {
        // Strip the Python string quotes.
        $message[$key] = str_replace("'", '', $value);
    }
}

var_dump($message);

Results in:

array(28) {
["id"]=>
string(4) "1650"
["peer_id"]=>
string(34) "PeerChannel(channel_id=1286966173)"
["date"]=>
string(18) "2022-4-15 17:14:25"
["message"]=>
string(29) "Please check your email inbox"
["out"]=>
string(5) "False"
["mentioned"]=>
string(5) "False"
["media_unread"]=>
string(5) "False"
["silent"]=>
string(5) "False"
["post"]=>
string(4) "True"
["from_scheduled"]=>
string(5) "False"
["legacy"]=>
string(5) "False"
["edit_hide"]=>
string(5) "False"
["pinned"]=>
string(5) "False"
["from_id"]=>
string(4) "None"
["fwd_from"]=>
string(4) "None"
["via_bot_id"]=>
string(4) "None"
["reply_to"]=>
string(4) "None"
["media"]=>
string(4) "None"
["reply_markup"]=>
string(4) "None"
["entities"]=>
string(2) "[]"
["views"]=>
string(3) "382"
["forwards"]=>
string(1) "0"
["replies"]=>
string(4) "None"
["edit_date"]=>
string(4) "None"
["post_author"]=>
string(4) "None"
["grouped_id"]=>
string(4) "None"
["restriction_reason"]=>
string(2) "[]"
["ttl_period"]=>
string(4) "None"
}
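
The earlier recommendation to change the Python side could look like this on the PHP end. This is a sketch under a stated assumption: the Python script is modified to emit JSON, which is not something the original output does. The $json payload and its field layout are hypothetical and abbreviated.

```php
<?php
// Hypothetical payload: assumes the Python script now prints JSON
// instead of the default object representation.
$json = '{"id": 1650, "peer_id": {"channel_id": 1286966173}, '
      . '"date": "2022-04-15T17:14:25+00:00", '
      . '"message": "Please check your email inbox, thanks", '
      . '"post": true, "views": 382, "entities": []}';

// Decode into nested associative arrays.
$message = json_decode($json, true);

if ($message === null) {
    // json_last_error_msg() describes why decoding failed.
    echo 'Invalid JSON: ' . json_last_error_msg();
} else {
    echo $message['message'];
}
```

With this format, commas inside the message text need no special handling, and numbers, booleans and nested objects arrive as native PHP types instead of strings.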

