How do I include the split delimiter in results for preg_split()?
Here you go:
preg_split('/([^.:!?]+[.:!?]+)/', 'good:news.everyone!', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
How it works: The pattern actually turns everything into a delimiter. Then, to include these delimiters in the array, you can use the PREG_SPLIT_DELIM_CAPTURE constant. This will return an array like:
array (
0 => '',
1 => 'good:',
2 => '',
3 => 'news.',
4 => '',
5 => 'everyone!',
6 => '',
);
To get rid of the empty values, use PREG_SPLIT_NO_EMPTY. To combine two or more of these constants, we use the bitwise | operator. The result:
array (
0 => 'good:',
1 => 'news.',
2 => 'everyone!'
);
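As a quick check of the explanation above, here is a minimal sketch that runs the same pattern three ways: with no flags, with PREG_SPLIT_DELIM_CAPTURE only, and with both flags combined. The variable names are illustrative and not part of the original answer.

```php
<?php
$pattern = '/([^.:!?]+[.:!?]+)/';
$subject = 'good:news.everyone!';

// No flags: every chunk is consumed as a delimiter, so only the empty gaps remain.
$gapsOnly = preg_split($pattern, $subject);

// DELIM_CAPTURE: the captured delimiters are interleaved with the empty gaps.
$withGaps = preg_split($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE);

// Both flags, combined with bitwise OR: the empty gaps are dropped.
$clean = preg_split($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

print_r($gapsOnly);
print_r($withGaps);
print_r($clean);
```

Comparing the three printouts makes it easier to see which flag is responsible for which part of the final result.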
preg_split - split by white space and by chosen character but keep the character in array
the problem is that i want to keep that comma in array
Then just use the flag PREG_SPLIT_DELIM_CAPTURE
PREG_SPLIT_DELIM_CAPTURE
If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.
http://php.net/manual/en/function.preg-split.php
So you will split it like this
$split = preg_split('/(,)\s|\s/', $string, null, PREG_SPLIT_DELIM_CAPTURE);
You can test it here
https://3v4l.org/Eq8uS
For the limit argument, null reads more cleanly than -1 because we just want to skip to the flags argument: null means nothing, whereas -1 may look like it carries some important value (in this case it doesn't), so it is clearer for someone who doesn't know preg_split well that we are just ignoring that argument. Note, however, that since PHP 8.1 passing null to a non-nullable built-in parameter such as $limit raises a deprecation notice, so -1 (the documented default) is the safer choice on current PHP versions.
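To make the behaviour concrete, here is a small sketch with a made-up input (the original $string is not shown in the answer). It uses -1 for the limit, which is the documented default and means "no limit".

```php
<?php
// Hypothetical input; the original $string is not shown.
$string = 'one, two three';

// Split on ", " (keeping the comma through the capture group) or on plain whitespace.
$split = preg_split('/(,)\s|\s/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($split);
```

Only the comma sits inside the parentheses, so the whitespace itself is discarded while the comma becomes its own array element.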
I am trying to split/explode/preg_split a string but I want to keep the delimiter
You can use preg_match_all like so:
$matches = array();
preg_match_all('/(\/block\/[0-9]+\/page\/[0-9]+)/', '/block/2/page/2/block/3/page/4', $matches);
var_dump( $matches[0]);
Output:
array(2) {
[0]=>
string(15) "/block/2/page/2"
[1]=>
string(15) "/block/3/page/4"
}
Edit: This is the best I could do with preg_split.
$array = preg_split('#(/block/)#', '/block/2/page/2/block/3/page/4', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
$result = array();
for( $i = 0, $count = count( $array); $i < $count; $i += 2)
{
$result[] = $array[$i] . $array[$i + 1];
}
It's not worth the overhead to use a regular expression if you still need to loop to prepend the delimiter. Just use explode and prepend the delimiter yourself:
$delimiter = '/block/';
$results = array();
foreach( explode( $delimiter, '/block/2/page/2/block/3/page/4') as $entry)
{
if( !empty( $entry))
{
$results[] = $delimiter . $entry;
}
}
Final Edit: Solved! Here is the solution using one regex, preg_split, and PREG_SPLIT_DELIM_CAPTURE:
$regex = '#(/block/(?:\w+/?)+(?=/block/))#';
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
preg_split( $regex, '/block/2/page/2/block/3/page/4', -1, $flags);
preg_split( $regex, '/block/2/page/2/order/title/sort/asc/block/3/page/4', -1, $flags);
Output:
array(2) {
[0]=>
string(15) "/block/2/page/2"
[1]=>
string(15) "/block/3/page/4"
}
array(2) {
[0]=>
string(36) "/block/2/page/2/order/title/sort/asc"
[1]=>
string(15) "/block/3/page/4"
}
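A different technique that may be worth trying for the same inputs is to split on a zero-width lookahead, so the delimiter is never consumed and no capture group is needed. This is a sketch of that alternative, not the answer's own method; the variable names are illustrative.

```php
<?php
// Split at every position that is immediately followed by "/block/".
// The lookahead matches an empty string, so nothing is removed from the pieces.
$regex = '#(?=/block/)#';

$short = preg_split($regex, '/block/2/page/2/block/3/page/4', -1, PREG_SPLIT_NO_EMPTY);
$long  = preg_split($regex, '/block/2/page/2/order/title/sort/asc/block/3/page/4', -1, PREG_SPLIT_NO_EMPTY);

var_dump($short);
var_dump($long);
```

PREG_SPLIT_NO_EMPTY is still needed here because the first split point sits at offset 0 and would otherwise produce a leading empty string.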
Regex (preg_split): how do I split based on a delimiter, excluding delimiters included in a pair of quotes?
You can use the following.
$text = '1 2 3 4/5/6 "7/8 9" 10';
$results = preg_split('~"[^"]*"(*SKIP)(*F)|[ /]+~', $text);
print_r($results);
Explanation:
On the left side of the alternation operator we match anything in quotations and then make the subpattern fail with (*F); the backtracking control verb (*SKIP) forces the regular expression engine not to retry the substring it has just consumed. The right side of the alternation operator matches one or more space or forward slash characters that are not in quotations.
Output
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => "7/8 9"
[7] => 10
)
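To see the same pattern on another input, here is a minimal sketch. The sample text and the helper name splitOutsideQuotes are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer above.
function splitOutsideQuotes(string $text): array
{
    // Left branch: consume a quoted run, then skip it and fail the match.
    // Right branch: the real delimiter, one or more spaces or slashes.
    return preg_split('~"[^"]*"(*SKIP)(*F)|[ /]+~', $text);
}

$pieces = splitOutsideQuotes('a/b "c d/e" f');
print_r($pieces);
```

The quoted segment keeps its surrounding quotes, exactly as "7/8 9" does in the output above; strip them afterwards with trim($piece, '"') if that is not wanted.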
Split a string just before each occurrence of 3 specific delimiters
Try this:
$ar = preg_split('/(\$[^#]+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
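The original $str is not shown, so the following sketch uses an invented input purely to illustrate what this pattern does: each run that starts with $ and continues up to the next # is captured as its own element.

```php
<?php
// Invented input; single quotes keep "$def" from being read as a variable.
$str = '#abc$def#ghi$jkl';

$ar = preg_split('/(\$[^#]+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

print_r($ar);
```

With this input the pieces starting with # are the text between delimiters, and the pieces starting with $ are the captured delimiters themselves.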
preg_split : splitting a string according to a very specific pattern
Here's an attempt with preg_match:
$pattern = "/^([^\[]+)\[([^\]]+)\]\s+\(([^,]+),\s+([^,]+),\s+([^,]+),\s+([^,]+)\)\s+(.+)$/i";
$string = "CADAVRES [FILM] (Canada : Québec, Érik Canuel, 2009, long métrage) FICTION";
preg_match($pattern, $string, $keywords);
array_shift($keywords);
print_r($keywords);
Output:
Array
(
[0] => CADAVRES
[1] => FILM
[2] => Canada : Québec
[3] => Érik Canuel
[4] => 2009
[5] => long métrage
[6] => FICTION
)
Regex breakdown:
^ anchor to start of string
( begin capture group 1
[^\[]+ one or more non-left bracket characters
) end capture group 1
\[ literal left bracket
( begin capture group 2
[^\]]+ one or more non-right bracket characters
) end capture group 2
\] literal right bracket
\s+ one or more spaces
\( literal open parenthesis
( open capture group 3
[^,]+ one or more non-comma characters
) end capture group 3
,\s+ literal comma followed by one or more spaces
([^,]+),\s+([^,]+),\s+([^,]+) repeats of the above
\) literal closing parenthesis
\s+ one or more spaces
( begin capture group 7
.+ everything else
) end capture group 7
$ anchor to end of string
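The breakdown above can also live inside the pattern itself. This sketch uses the x (extended) modifier so whitespace and # comments are ignored, plus named groups; the group names (title, type, place, director, year, format, genre) are guesses at what the fields mean and are not from the original answer.

```php
<?php
$pattern = '/
    ^
    (?<title>    [^\[]+ )         # text before the first left bracket
    \[ (?<type>  [^\]]+ ) \]      # text between the square brackets
    \s+ \(
    (?<place>    [^,]+ ) ,\s+     # four comma-separated fields
    (?<director> [^,]+ ) ,\s+
    (?<year>     [^,]+ ) ,\s+
    (?<format>   [^,]+ )
    \) \s+
    (?<genre>    .+ )             # everything else
    $
/xu';

$string = "CADAVRES [FILM] (Canada : Québec, Érik Canuel, 2009, long métrage) FICTION";
preg_match($pattern, $string, $m);

// Keep only the named keys; preg_match() also returns numeric ones.
$fields = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
print_r($fields);
```

Note that the title group still carries the trailing space before the bracket, so trim() it if that matters.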
This assumes your structure to be static and is not particularly pretty, but on the other hand, should be robust to delimiters creeping into fields where they're not supposed to be. For example, the title having a : or , in it seems plausible and would break a "split on these delimiters anywhere"-type solution. For example,
"Matrix:, Trilogy() [FILM, reviewed: good] (Canada() : Québec , \t Érik Canuel , ): 2009 , long ():():[][]métrage) FICTIO , [(:N";
correctly parses as:
Array
(
[0] => Matrix:, Trilogy()
[1] => FILM, reviewed: good
[2] => Canada() : Québec
[3] => Érik Canuel
[4] => ): 2009
[5] => long ():():[][]métrage
[6] => FICTIO , [(:N
)
Additionally, if your parenthesized comma region is variable length, you might want to extract that first and parse it, then handle the rest of the string.
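A minimal sketch of that two-step idea, assuming the parenthesized region is the only part whose field count varies: grab the whole region with one capture group first, then split it on commas. The array keys are illustrative names, not from the original.

```php
<?php
$string = "CADAVRES [FILM] (Canada : Québec, Érik Canuel, 2009, long métrage) FICTION";

$record = null;
// Step 1: capture the title, the bracketed type, the whole parenthesized region, and the rest.
if (preg_match('/^([^\[]+)\[([^\]]+)\]\s+\((.*)\)\s+(.+)$/u', $string, $m)) {
    // Step 2: split the region on commas, however many fields it has.
    $details = preg_split('/\s*,\s*/', $m[3]);

    $record = [
        'title'   => trim($m[1]),
        'type'    => $m[2],
        'details' => $details,
        'genre'   => $m[4],
    ];
}

print_r($record);
```

The trade-off is that a comma inside one of the parenthesized fields would now produce an extra element, which the fixed four-field pattern above guards against.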
php preg_split(). Not the right pattern and converts period to comma?
The biggest part of the solution is from @Rarst in this post.
I've ended up with this code:
function calcCssNewValue( $string, $scale ){
$stringArray = preg_split( '/([a-zA-Z]+)/', $string, 2, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );
$returnVal = number_format( ( $stringArray[0] * $scale ), 2, '.', '');
if ( count( $stringArray ) > 1 ) {
$returnVal .= $stringArray[1];
}
return $returnVal;
}
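The $stringArray line is by @Rarst. For $returnVal I've added number_format() and forced the decimal point to an actual point. Somehow, and I don't know why, it changed the decimal point to a comma, but only when doing math. (A likely cause is a locale whose decimal separator is a comma: before PHP 8.0, converting a float to a string was locale-dependent.)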
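For reference, here is a compact restatement of the function with explicit types and a few sample calls. The name scaleCssValue and the sample inputs are hypothetical; the splitting and formatting logic is the same as above.

```php
<?php
// Hypothetical restatement of calcCssNewValue() with explicit types.
function scaleCssValue(string $value, float $scale): string
{
    // Split "12px" into ["12", "px"]; a unitless value yields a single element.
    $parts = preg_split('/([a-zA-Z]+)/', $value, 2, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // number_format() with an explicit "." separator does not depend on the locale.
    $scaled = number_format((float) $parts[0] * $scale, 2, '.', '');

    return $scaled . ($parts[1] ?? '');
}

$px   = scaleCssValue('12px', 1.5);
$em   = scaleCssValue('1.5em', 2);
$bare = scaleCssValue('3', 2);

echo $px, "\n", $em, "\n", $bare, "\n";
```

This sketch assumes the value starts with a number; an empty string or a value such as "auto" would need its own check before the multiplication.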
use preg_split but keep delimiter
Use a zero-width assertion (a lookbehind here):
$result = preg_split('~(?<=\.)\s~', $text, -1, PREG_SPLIT_NO_EMPTY);
or you can use the \K feature that removes all on the left from the whole match:
$result = preg_split('~\.\K\s~', $text, -1, PREG_SPLIT_NO_EMPTY);
Without regex (if whitespaces are only spaces, and if the last dot is not followed by a space):
$chunks = explode('. ', $text);
$last = array_pop($chunks);
$result = array_map(function ($i) { return $i . '.'; }, $chunks);
$result[] = $last;
or better:
$result = explode(' #&§', strtr($text, ['. '=>'. #&§']));
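To compare the approaches side by side, here is a short sketch with a made-up $text (the original one is not shown). It runs the lookbehind version, the \K version, and the strtr/explode version on the same input.

```php
<?php
// Hypothetical sample text; the last dot is not followed by a space.
$text = 'First one. Second one. Third one.';

// Lookbehind: split on whitespace that is preceded by a dot.
$viaLookbehind = preg_split('~(?<=\.)\s~', $text, -1, PREG_SPLIT_NO_EMPTY);

// \K: match the dot, then reset the match start so only the whitespace is removed.
$viaKeepOut = preg_split('~\.\K\s~', $text, -1, PREG_SPLIT_NO_EMPTY);

// No regex: insert a marker after each ". " and explode on it.
// This assumes the marker string never occurs in the text.
$viaMarker = explode(' #&§', strtr($text, ['. ' => '. #&§']));

print_r($viaLookbehind);
```

On this input all three variants give the same array; they start to differ once the whitespace after a dot is a tab or a newline, which only the regex versions handle.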
PHP preg_split keeping delimiter in a different element
This will get you pretty close
$page_content = 'the quick brown fox [[random text here]] and then [[a different text here]]';
print_r(preg_split('/(\[\[[^\]]+\]\])/', $page_content, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY));
The thing to remember is that this is the delimiter (\[\[[^\]]+\]\])
Output:
Array
(
[0] => the quick brown fox
[1] => [[random text here]]
[2] => and then
[3] => [[a different text here]]
)
When I say pretty close, I do mean really pretty close...
The regex is pretty straightforward: capture 2 [ then anything but a ] then 2 of those ]. That makes our delimiter, which we then capture. The no-empty flag is nice too.
Enjoy!
UPDATE
but it fails on " here is my table [[{"widget":"table","id":"1","title": "Views Table", "columns": []}]] and this is more text"...Note the "[]" under the 'columns'
To handle that you will need a recursive regex pattern using (?R), like this:
$page_content = 'here is my table [[{"widget":"table","id":"1","title": "Views Table", "columns": []}]] and this is more text [someother bracket]';
print_r(preg_split('/(\[(?:[^\[\]]|(?R))*\])/', $page_content, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY));
Output:
Array
(
[0] => here is my table
[1] => [[{"widget":"table","id":"1","title": "Views Table", "columns": []}]]
[2] => and this is more text
[3] => [someother bracket] //single bracket capture
)
I won't pretend: this is at the edge of my knowledge of regex. I should note that this matches single brackets, not specifically double ones. You could try something like /(\[(\[(?:[^\[\]]|(?2))*\])\])/ where (?2) is like (?R) but recurses into a specific capture group. This matches only [[ ... ]] while keeping the inner nesting. The issue is that the capture is then duplicated, so you wind up with this:
Array
(
[0] => here is my table
[1] => [[{"widget":"table","id":"1","title": "Views Table", "columns": []}]]
[2] => [{"widget":"table","id":"1","title": "Views Table", "columns": []}]
[3] => and this is more text [someother bracket]
)
Notice how it doesn't capture [someother bracket], but it captures the other one 2 times. There may be a way around that, but I can't think of it.
Whether or not capturing single bracket pairs is an issue, I don't know.
I have used this before, mainly for matching balanced pairs of " or ( ), but it's the same concept.
The only other solution would be to make a lexer/parser for it; I have some examples of how to do that on my GitHub account. Regex (by itself) is not suited to nested elements. Most any regex solution will fail on nesting.
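As a rough idea of what a hand-written scanner could look like, here is a minimal sketch that walks the string once and tracks bracket depth. The function name splitDoubleBrackets is hypothetical, and the sketch assumes brackets inside quoted JSON strings are balanced, since it does not parse quotes.

```php
<?php
// Hypothetical helper: split $text into plain-text pieces and [[...]] blocks
// by counting bracket depth instead of using a recursive regex.
function splitDoubleBrackets(string $text): array
{
    $parts = [];
    $buffer = '';
    $depth = 0;
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        $char = $text[$i];

        if ($depth === 0) {
            // A block only opens on two consecutive left brackets.
            if ($char === '[' && $i + 1 < $length && $text[$i + 1] === '[') {
                if ($buffer !== '') {
                    $parts[] = $buffer;
                }
                $buffer = '[[';
                $depth = 2;
                $i++; // skip the second "["
            } else {
                $buffer .= $char;
            }
            continue;
        }

        // Inside a block: count every bracket so nested pairs stay balanced.
        $buffer .= $char;
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']') {
            $depth--;
            if ($depth === 0) {
                $parts[] = $buffer;
                $buffer = '';
            }
        }
    }

    if ($buffer !== '') {
        $parts[] = $buffer; // trailing text, or an unclosed block
    }
    return $parts;
}

$page_content = 'here is my table [[{"widget":"table","id":"1","title": "Views Table", "columns": []}]] and this is more text [someother bracket]';
$parts = splitDoubleBrackets($page_content);
print_r($parts);
```

Unlike the (?2) pattern, this keeps [someother bracket] as ordinary text and returns the [[ ... ]] block only once.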
PHP preg_split delimiter pattern, split at character chain
If it can only be ,a, and ,a,,a, then this should be enough:
preg_split("/(,a,)+/", $str);
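A quick sketch with an invented $str shows the effect of the + quantifier: one ,a, and a doubled ,a,,a, are each treated as a single split point.

```php
<?php
// Invented input containing both a single and a doubled delimiter.
$str = 'x,a,y,a,,a,z';

// Without PREG_SPLIT_DELIM_CAPTURE the parentheses only group; nothing is captured into the result.
$chunks = preg_split("/(,a,)+/", $str);

print_r($chunks);
```

Since the group is used only for repetition, a non-capturing (?:,a,)+ would behave the same way here.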