PHP explode the string, but treat words in quotes as a single word
You could use preg_match_all(...):
$text = 'Lorem ipsum "dolor sit amet" consectetur "adipiscing \\"elit" dolor';
preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $text, $matches);
print_r($matches);
which will produce:
Array
(
[0] => Array
(
[0] => Lorem
[1] => ipsum
[2] => "dolor sit amet"
[3] => consectetur
[4] => "adipiscing \"elit"
[5] => dolor
)
)
And as you can see, it also accounts for escaped quotes inside quoted strings.
EDIT
A short explanation:
" # match the character '"'
(?: # start non-capture group 1
\\ # match the character '\'
. # match any character except line breaks
| # OR
[^\\"] # match any character except '\' and '"'
)* # end non-capture group 1 and repeat it zero or more times
" # match the character '"'
| # OR
\S+ # match a non-whitespace character: [^\s] and repeat it one or more times
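If the surrounding quotes and the escaping backslashes are not wanted in the result, they can be removed in a second pass. The following is only a sketch built on the pattern explained above: split_quoted() is a hypothetical helper name, not part of the original answer, and it assumes that a token starting and ending with a double quote is a quoted string.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// tokenize with the pattern above, then clean up the quoted tokens.
function split_quoted(string $text): array
{
    preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $text, $matches);

    return array_map(static function (string $token): string {
        $quoted = strlen($token) >= 2
            && $token[0] === '"'
            && substr($token, -1) === '"';
        if (!$quoted) {
            return $token;
        }
        // Drop the outer quotes, then turn each \x sequence into x.
        return preg_replace('/\\\\(.)/s', '$1', substr($token, 1, -1));
    }, $matches[0]);
}

$tokens = split_quoted('Lorem "dolor sit amet" "adipiscing \\"elit" dolor');
print_r($tokens);
```

Unquoted words pass through unchanged, so only the quoted tokens lose their delimiters.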
And in case of matching %22 instead of double quotes, you'd do:
preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $text, $matches);
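As a quick check of the %22 variant, here is a minimal sketch. The sample text is made up for illustration and is not from the original answer.

```php
<?php
// Sample input is an assumption: %22 stands in for the double quote.
$text = 'Lorem %22dolor sit amet%22 consectetur';

preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $text, $matches);
print_r($matches[0]);
```

The negative lookahead (?!%22) plays the role that the character class [^\\"] plays in the double-quote version: it stops the repetition at the next %22 delimiter.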
PHP explode strings, but treat words in quotes as a single word
You may use
if (preg_match_all('~(?|"([^\\\\"]*(?:\\\\.[^"\\\\]*)*)"|([^\s"]+))~s', $s, $matches))
{
print_r($matches[1]);
}
See the regex demo.
Details
(?| - starts a branch reset group:
  " - a " char
  ([^\\\\"]*(?:\\\\.[^"\\\\]*)*) - Group 1: any 0+ chars other than \ and " followed with 0 or more repetitions of any escaped char and then any 0+ chars other than \ and "
  " - a " char
  | - or
  ([^\s"]+) - Group 1: one or more chars other than whitespace and "
) - end of the branch reset group.
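To see what the branch reset group buys you, the sketch below runs the same alternation with and without (?|...). The patterns are simplified (no escape handling) and the sample string is an assumption, chosen only to keep the output small.

```php
<?php
$s = '"foo bar" baz';

// Plain alternation: each alternative gets its own group number,
// so the results are spread over groups 1 and 2.
preg_match_all('~"([^"]*)"|([^\s"]+)~', $s, $plain);

// Branch reset: both alternatives write into group 1.
preg_match_all('~(?|"([^"]*)"|([^\s"]+))~', $s, $reset);

print_r($plain[1]); // quoted content, empty string where the word branch matched
print_r($plain[2]); // unquoted words, empty string where the quoted branch matched
print_r($reset[1]); // everything in one list
```

With the branch reset group there is no need to merge two capture lists afterwards, which is why the answer can simply print $matches[1].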
See the PHP demo:
$s = '"foo bar"ANDbar"foo"AND"foofoo" lorem "impsum"';
if (preg_match_all('~(?|"([^\\\\"]*(?:\\\\.[^"\\\\]*)*)"|([^\s"]+))~s', $s, $matches))
{
print_r($matches[1]);
}
// => Array ( [0] => foo bar [1] => ANDbar [2] => foo [3] => AND [4] => foofoo [5] => lorem [6] => impsum )
PHP explode the string, but treat words in quotes as a single word and ignore brackets
The reason is that the regex you use is meant to keep standalone " in the matches.
If you are sure the unescaped double quotes are always paired in your input, use
'/"(?:\\\\.|[^\\\\"])*"|[^\s"]+/'
                        ^^^^^^
Exclude the " from \S by turning it into a negated character class [^\s] and adding the double quote inside.
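The difference between the two patterns shows up as soon as a quoted string touches other characters, such as brackets. A small sketch follows; the input string is a shortened version of the one used in this thread.

```php
<?php
$str = 'ipsum ("dolor sit amet") dolor';

// \S+ swallows the opening bracket together with the quote,
// so the quoted alternative never gets a chance to match.
preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $str, $loose);

// [^\s"]+ stops in front of a quote, so the quoted string stays whole.
preg_match_all('/"(?:\\\\.|[^\\\\"])*"|[^\s"]+/', $str, $strict);

print_r($loose[0]);
print_r($strict[0]);
```

Note that with [^\s"]+ an unpaired double quote is not matched at all, which is why this pattern assumes the quotes are always paired.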
To include single quoted substrings, you may use
'~"(?:\\\\.|[^\\\\"])*"|\'(?:\\\\.|[^\\\\\'])*\'|[^\s"\']+~'
See the regex demo and a PHP demo:
$re = '~"(?:\\\\.|[^\\\\"])*"|\'(?:\\\\.|[^\\\\\'])*\'|[^\s"\']+~';
$str = 'Lorem ipsum ("dolor sit amet") consectetur "adipiscing \\"elit" dolor \'something \\\'here\'';
preg_match_all($re, $str, $matches);
print_r($matches[0]);
// => Array ( [0] => Lorem [1] => ipsum [2] => ( [3] => "dolor sit amet" [4] => )
// [5] => consectetur [6] => "adipiscing \"elit" [7] => dolor [8] => 'something \'here' )
R: Explode string but keep quoted text as a single word
A simple option would be to use scan:
> x <- scan(what = "", text = mystr)
Read 11 items
> x
[1] "preceded by itself in quotation marks forms a complete sentence"
[2] "preceded"
[3] "by"
[4] "itself"
[5] "in"
[6] "quotation"
[7] "marks"
[8] "forms"
[9] "a"
[10] "complete"
[11] "sentence"
Split string on spaces except words in quotes
You can use:
$string = 'Some of "this string is" in quotes';
$arr = preg_split('/("[^"]*")|\h+/', $string, -1,
PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
print_r ( $arr );
Output:
Array
(
[0] => Some
[1] => of
[2] => "this string is"
[3] => in
[4] => quotes
)
RegEx Breakup
("[^"]*") # match quoted text and group it so that it can be used in output using
# PREG_SPLIT_DELIM_CAPTURE option
| # regex alternation
\h+ # match 1 or more horizontal whitespace
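The role of PREG_SPLIT_DELIM_CAPTURE is easier to see by running the same split with and without it. This is a minimal sketch that reuses the pattern and input from above; the variable names are chosen for illustration.

```php
<?php
$string  = 'Some of "this string is" in quotes';
$pattern = '/("[^"]*")|\h+/';

// Without DELIM_CAPTURE the quoted text is treated as a delimiter and dropped.
$without = preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY);

// With DELIM_CAPTURE the captured group is put back into the result.
$with = preg_split($pattern, $string, -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

print_r($without);
print_r($with);
```

PREG_SPLIT_NO_EMPTY is needed in both cases because the quoted text and the whitespace around it are adjacent delimiters, which would otherwise leave empty strings between them.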
An explode() function that ignores characters inside quotes?
str_getcsv($str, '/')
There's a recipe for PHP < 5.3 on the linked page.
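A minimal sketch of that call, with a made-up input string. The enclosure and escape arguments are passed explicitly here, which is an addition to the one-liner above; recent PHP versions warn when the escape argument is left out.

```php
<?php
// Sample input is an assumption: '/' is the delimiter, '"' the enclosure.
$str = 'a/"b/c"/d';

$parts = str_getcsv($str, '/', '"', '\\');
print_r($parts);
```

Unlike the regex approaches, str_getcsv() removes the enclosing quotes from the returned fields.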
Exploding string on space but not spaces in quotation marks
You could use regex:
$string = 'test1 test2 "test3 test4"';
// Match a quoted run (lazily, so separate quoted runs stay separate) or a run of non-whitespace.
preg_match_all('/("[\s\S]+?")|(\S+)/', $string, $matches);
print_r($matches);
Alternatively, you could try using str_getcsv()
PHP string explode on space, except when in quotes
You may use
preg_split('~(?<!\\\\)(?:\\\\{2})*"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"(*SKIP)(*F)|\s+~s', $s)
See the regex demo
Details
(?<!\\) - no \ allowed immediately to the left of the current location
(?:\\{2})* - zero or more double backslashes
" - a quote
[^"\\]* - 0+ chars other than " and \
(?:\\.[^"\\]*)* - 0+ sequences of
  \\. - any escape sequence
  [^"\\]* - 0+ chars other than " and \
" - a quote
(*SKIP)(*F) - skipping the match and proceeding to the next match from the current match end location
| - or
\s+ - 1+ whitespaces in any other contexts.
See the PHP demo:
$s = 'title:"tab system" color:="blue" price:>10';
print_r(preg_split('~(?<!\\\\)(?:\\\\{2})*"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"(*SKIP)(*F)|\s+~s', $s));
Output:
Array
(
[0] => title:"tab system"
[1] => color:="blue"
[2] => price:>10
)
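The same pattern also keeps a quoted string together when it contains an escaped quote. A short sketch with an input made up for illustration:

```php
<?php
// In the single-quoted PHP literal, \\" produces the two characters \" in $s.
$s = 'a "b \\" c" d';

$parts = preg_split(
    '~(?<!\\\\)(?:\\\\{2})*"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"(*SKIP)(*F)|\s+~s',
    $s
);
print_r($parts);
```

The quoted substring is consumed by the first alternative and then discarded by (*SKIP)(*F), so the whitespace inside it is never offered to the \s+ alternative.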