Using Preg_Split with Multiple Spaces

php preg_split explode line by multiple spaces and tabs in fields that may contain spaces too

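You can use str_getcsv() with a space as the delimiter; it keeps double-quoted fields such as "AB5672HMKL OLD" together as a single value.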

<?php
$line = 'A 4 "AB5672HMKL OLD" B 9500 8150 39 0000 L XFN "ProductPN"';

print_r(str_getcsv($line, ' '));

Output: https://3v4l.org/6Qs5e
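One caveat, sketched here under the assumption that fields may be separated by runs of spaces: str_getcsv() treats every single space as a delimiter, so consecutive spaces yield empty fields. Dropping the empty entries afterwards is one possible way to handle that. The sample line and variable names are made up for illustration, and the enclosure and escape arguments are passed explicitly only so the example does not rely on their defaults.

```php
<?php
// Hypothetical sample line: two spaces between A and the quoted field.
$line = 'A  "B C" D';

// Split on single spaces; the quoted field "B C" stays in one piece.
$fields = str_getcsv($line, ' ', '"', '\\');

// Consecutive delimiters produce empty fields, so filter them out
// and reindex the array.
$fields = array_values(array_filter(
    $fields,
    function ($field) {
        return $field !== null && $field !== '';
    }
));

print_r($fields);
```

This does not cover tabs as separators; for those, a preg_split() pattern is probably the better fit.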

preg_split not splitting on space

The problem is that you passed PREG_SPLIT_NO_EMPTY as the third parameter instead of the fourth, so it was interpreted as a limit rather than as a flag; see the manual on preg_split().

You should use:

preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
// 3rd parameter (-1): the limit; -1 is the default value and means no limit
// 4th parameter: flags such as PREG_SPLIT_NO_EMPTY go here

or, on older PHP versions (passing NULL for the limit is deprecated as of PHP 8.1, so prefer -1):

preg_split('/\s+/', $input, NULL, PREG_SPLIT_NO_EMPTY);
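To make the difference concrete, here is a small sketch comparing the two calls. The input string and variable names are illustrative only; the behaviour relies on PREG_SPLIT_NO_EMPTY having the integer value 1.

```php
<?php
$input = ' foo   bar baz ';

// Flag passed as the 3rd argument: PREG_SPLIT_NO_EMPTY has the value 1,
// so this asks for at most one piece and returns the whole string.
$wrong = preg_split('/\s+/', $input, PREG_SPLIT_NO_EMPTY);

// Flag passed as the 4th argument, with -1 (no limit) as the 3rd.
$right = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($wrong); // one element: the untouched input
print_r($right); // foo, bar, baz
```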

How can I explode a string by more than one space, but not by exactly one space in php

Just use preg_split('/  +/', $line); the pattern is a blank followed by one or more blanks. Note that there are 2 blanks between the / and the +, even if it looks like one. Alternatively, you could write it as '/ {2,}/', which means the previous expression (the blank) repeated at least 2 times.
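A quick check that both spellings behave the same, using a made-up line in which the values themselves contain single spaces:

```php
<?php
// Hypothetical sample: columns separated by 2+ spaces,
// values containing single spaces.
$line = 'New York  Los Angeles   Boston';

$a = preg_split('/  +/', $line);   // two blanks before the +
$b = preg_split('/ {2,}/', $line); // same meaning, easier to read

print_r($a);
var_dump($a === $b);
```

The {2,} form is arguably the safer choice, since a doubled blank is easy to lose when the code is copied or reformatted.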

Better way to convert string with multiple spaces into array

Use preg_split() with two or more whitespace characters as the delimiter. Note that \s matches tabs and newlines as well as spaces.

$array = preg_split('/\s{2,}/', $line);
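The choice between a literal space and \s matters when the input mixes tabs and spaces. The following sketch uses an invented line to show the difference; which behaviour you want depends on your data.

```php
<?php
// Hypothetical sample: a tab followed by a space between "b" and "c",
// and two spaces between "c" and "d".
$line = "a b\t c  d";

// Only runs of 2+ literal spaces count as delimiters.
$spacesOnly = preg_split('/ {2,}/', $line);

// Any run of 2+ whitespace characters (space, tab, newline, ...) counts.
$anyWhitespace = preg_split('/\s{2,}/', $line);

print_r($spacesOnly);
print_r($anyWhitespace);
```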

preg_split a string by spaces not between single quotes

Remove the g (global) modifier from your regular expression; PHP's preg functions do not support it.

preg_split("/\x20(?=[^']*('[^']*'[^']*)*$)/", "'physical memory %'=92%;99;100 'physical memory'=29.69GB;31.68;32;0;32");

Although your regular expression will work, you could use the following instead, which makes it much easier to ignore the space characters inside quotations.

$str = "'physical memory %'=92%;99;100 'physical memory'=29.69GB;31.68;32;0;32";
$results = preg_split("/'[^']*'(*SKIP)(*F)|\x20/", $str);
print_r($results);

Explanation:

The idea is to skip content in single quotations. The first alternative matches an opening single quote, any characters except ', and the closing single quote. The backtracking control verbs (*SKIP) and (*F) then make that subpattern fail and prevent the regular expression engine from retrying the skipped substring with the other alternative, so only spaces outside the quotations are matched as delimiters.

Output

Array
(
    [0] => 'physical memory %'=92%;99;100
    [1] => 'physical memory'=29.69GB;31.68;32;0;32
)

preg_split by space and tab outside quotes

Just split your input with the regex below. \h+ matches one or more horizontal whitespace characters, i.e. spaces and tabs.

(?:'[^']*'|"[^"]*")(*SKIP)(*F)|\h+

(?:'[^']*'|"[^"]*") matches all single-quoted and double-quoted strings. (*SKIP)(*F) then makes that match fail and stops the engine from retrying inside the quoted text, so the only matches left are those of the pattern after the |. In our case, that is \h+, which matches one or more horizontal whitespace characters.

$str = 'microsoft.com.      3600    IN  TXT "v=spf1 include:_spf-a.microsoft.com include:_spf-b.microsoft.com include:_spf-c.microsoft.com -all"';
$match = preg_split('~(?:\'[^\']*\'|"[^"]*")(*SKIP)(*F)|\h+~', $str);
print_r($match);

Output:

Array
(
    [0] => microsoft.com.
    [1] => 3600
    [2] => IN
    [3] => TXT
    [4] => "v=spf1 include:_spf-a.microsoft.com include:_spf-b.microsoft.com include:_spf-c.microsoft.com -all"
)
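The example above only exercises double quotes and spaces. As a further sketch, the same pattern can be tried on an invented string that mixes a tab, a single-quoted field and a double-quoted field:

```php
<?php
// Hypothetical input: tab after "name", three spaces after the
// single-quoted field, one space after the double-quoted field.
$str = "name\t'John Smith'   \"New York\" 42";

// Quoted strings are matched and discarded via (*SKIP)(*F);
// only horizontal whitespace outside quotes acts as a delimiter.
$parts = preg_split('~(?:\'[^\']*\'|"[^"]*")(*SKIP)(*F)|\h+~', $str);

print_r($parts);
```

The quotes are kept as part of each field; strip them afterwards with trim() if you only need the contents.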

Split String at Last Space before Delimiter Using preg_split?

You can use a lookahead assertion:

$string = 'DZ9243/XSHAGT FFGD JERSE XS2 DZ9232/MHAGT SUUMTE KNI M10 DZ9232/LHAGT SUMMER KNI L6';
$pieces = preg_split('@ (?=[^ ]*/)@', $string);

print_r($pieces);

The output is:

Array
(
    [0] => DZ9243/XSHAGT FFGD JERSE XS2
    [1] => DZ9232/MHAGT SUUMTE KNI M10
    [2] => DZ9232/LHAGT SUMMER KNI L6
)

The regex

@ (?=[^ ]*/)@
  • @ is the regex delimiter; usually / is used as the delimiter, but this regex has to match a literal /, so a different delimiter avoids having to escape it;
  • it is followed by a space character; this is the space you want to use as the delimiter for splitting the input string;
  • ( starts a group; the group is needed by the assertion;
  • ?= makes the group a positive lookahead assertion; it requires the group to match the input string, but it does not consume the characters that match the group;
  • [^ ]*/ is the group content; it matches any non-space character any number of times, followed by a /; this is the word that contains the slash (/);
  • ) ends the group.

All in all, the regex matches each space that is followed by a word containing a slash, but the word itself is not consumed; preg_split() does not include it in the delimiter, so only the space character is removed.
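To see why the lookahead matters, the sketch below compares it with the same pattern written without the assertion. The short input string and the variable names are invented for illustration.

```php
<?php
$string = 'A1/x foo B2/y bar';

// With the lookahead: only the space is the delimiter,
// so the word containing the slash is preserved.
$kept = preg_split('@ (?=[^ ]*/)@', $string);

// Without it: the matched " B2/" becomes part of the delimiter
// and is removed from the result.
$lost = preg_split('@ [^ ]*/@', $string);

print_r($kept);
print_r($lost);
```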


