How to remove square brackets and anything between them with a regex?
[ and ] are special characters in a regex. They are used to list characters of a match. [a-z] matches any lowercase letter between a and z. [03b] matches a "0", "3", or "b". To match the characters [ and ], you have to escape them with a preceding \.
Your code currently says "replace any character of [](). with an empty string" (reordered from the order in which you typed them for clarity).
Greedy match:
preg_replace('/\[.*\]/', '', $str); // Replace from one [ to the last ]
A greedy match could match multiple [s and ]s. That expression would take an example [of "sneaky"] text [with more "sneaky"] here and turn it into an example here.
Perl has a syntax for a non-greedy match (you most likely don't want to be greedy):
preg_replace('/\[.*?\]/', '', $str);
Non-greedy matches try to catch as few characters as possible. Using the same example: an example [of "sneaky"] text [with more "sneaky"] here becomes an example text here.
Only up to the first following ]:
preg_replace('/\[[^\]]*\]/', '', $str); // Find a [, look for non-] characters, and then a ]
This is more explicit, but harder to read. Using the same example text, you'd get the output of the non-greedy expression.
Note that none of these deal explicitly with white space. The spaces on either side of [ and ] will remain.
Also note that all of these can fail for malformed input. Multiple [s and ]s without matches could cause a surprising result.
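The three variants above can be compared side by side. The following is a minimal sketch in Python rather than PHP (re.sub standing in for preg_replace); the variable names and the extra whitespace-collapsing step are assumptions added for illustration, not part of the original answer.

```python
import re

text = 'an example [of "sneaky"] text [with more "sneaky"] here'

# Greedy: runs from the first [ to the last ]
greedy = re.sub(r'\[.*\]', '', text)

# Non-greedy: each [ ... ] pair is removed separately
lazy = re.sub(r'\[.*?\]', '', text)

# Negated character class: same result as the non-greedy form here
negated = re.sub(r'\[[^\]]*\]', '', text)

# Optional cleanup (an assumption, not in the original answer):
# collapse the runs of spaces left behind by the removals
tidy = re.sub(r'\s{2,}', ' ', lazy).strip()

print(repr(greedy))
print(repr(lazy))
print(repr(negated))
print(repr(tidy))
```

Printing with repr makes the leftover double spaces visible, which is the whitespace caveat mentioned above.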
Regular expression to extract text between square brackets
You can use the following regex globally:
\[(.*?)\]
Explanation:
\[ : [ is a meta char and needs to be escaped if you want to match it literally.
(.*?) : match everything in a non-greedy way and capture it.
\] : ] is a meta char and needs to be escaped if you want to match it literally.
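To illustrate what "use the regex globally" means in practice, here is a small Python sketch. The sample string is made up for the example; re.findall returns the contents of capture group 1 for every match.

```python
import re

# Hypothetical input, chosen to include an empty pair of brackets
text = "alpha [one] beta [two] gamma []"

# findall applies the pattern globally and returns group 1 of each match
captured = re.findall(r'\[(.*?)\]', text)

print(captured)
```

Note that an empty pair of brackets yields an empty string in the result, since .*? is allowed to match zero characters.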
How to remove square parentheses and text within from strings in R
I would use:
input <- c("6.77[9]", "5.92[10]", "2.98[103]")
gsub("\\[.*?\\]", "", input)
[1] "6.77" "5.92" "2.98"
The regex pattern \[.*?\] should match any quoted terms in square brackets, and using gsub would tell R to replace all such terms.
Remove Square Brackets in Text and its contents
\[[^]]*\]
Try this. Replace with an empty string. See the demo:
http://regex101.com/r/xT7yD8/2
How to remove text inside brackets and parentheses at the same time with any whitespace before if present?
There are four main points here:
- String between parentheses can be matched with \([^()]*\)
- String between square brackets can be matched with \[[^][]*] (or \[[^\]\[]*\] if you prefer to escape literal [ and ]; in PCRE, it is stylistic, but in some other regex flavors, it might be a must)
- You need alternation to match either this or that pattern and account for any whitespaces before these patterns
- Since after removing these strings you may get leading and trailing spaces, you need to trim the string.
You may use
$string = "Deadpool 2 [Region 4](Blu-ray)";
echo trim(preg_replace("/\s*(?:\[[^][]*]|\([^()]*\))/","", $string));
See the regex demo and a PHP demo.
The \[[^][]*] part matches strings between [ and ] having no other [ and ] inside and \([^()]*\) matches strings between ( and ) having no other parentheses inside. trim removes leading/trailing whitespace.
Regex explanation:
\s* - 0+ whitespaces
(?: - start of a non-capturing group:
\[[^][]*] - [, zero or more chars other than [ and ], and then a ] (note you may keep these brackets inside a character class unescaped in a PCRE pattern if ] comes right after the opening [^; in JS, you would have to escape ] by all means, [^\][]*)
| - or (an alternation operator)
\([^()]*\) - (, any 0+ chars other than ( and ), and then a )
) - end of the non-capturing group.
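The same approach can be sketched in Python. This is an illustrative port, not the original answer's code: re.sub and str.strip stand in for preg_replace and trim, and the ] inside the character class is escaped, which is also accepted by flavors that require it.

```python
import re

title = "Deadpool 2 [Region 4](Blu-ray)"

# \s*            optional whitespace before the bracketed part
# \[[^\][]*\]    [ ... ] with no nested square brackets
# \([^()]*\)     ( ... ) with no nested parentheses
pattern = re.compile(r'\s*(?:\[[^\][]*\]|\([^()]*\))')

cleaned = pattern.sub('', title).strip()

print(cleaned)
```

The strip call handles the case where a bracketed part sits at the very start of the string and leaves a leading space behind.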
Remove text between square brackets at the end of string
Note that \[.*?\]$ won't work as it will match the first [ (because a regex engine processes the string from left to right), and then will match all the rest of the string up to the ] at its end. So, it will match [something][something2] in input[something][something2].
You may specify the end of string anchor and use [^\][]* (matching zero or more chars other than [ and ]) instead of .*?:
\[[^\][]*]$
See the JS demo:
console.log(
"input[something][something2]".replace(/\[[^\][]*]$/, '')
);
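For comparison, here is a short Python sketch showing both patterns on the same input, so the difference between the lazy dot and the negated character class is visible. The variable names are only for illustration.

```python
import re

s = "input[something][something2]"

# Lazy dot plus $: the match still starts at the FIRST [ and is
# stretched until ]$ succeeds, so both bracket groups are removed
too_much = re.sub(r'\[.*?\]$', '', s)

# Negated character class: the match cannot cross another [ or ],
# so only the last bracket group is removed
last_only = re.sub(r'\[[^\][]*\]$', '', s)

print(too_much)
print(last_only)
```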
Using Regex to delete contents between repeating brackets
You can use a word boundary in combination with a negated character class [^
\[[^][]*\bDontDeleteMe\b[^][]*\]
Regex demo
If the word is DeleteMe, you can match it using word boundaries and replace it with an empty string.
\[[^][]*\bDeleteMe\b[^][]*\]
Regex demo
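A small Python sketch of the DeleteMe pattern follows. The sample string is hypothetical, and the ] in the character class is escaped here; the point is that \b keeps DontDeleteMe from matching as if it were DeleteMe.

```python
import re

# Hypothetical input with one group to keep and one to delete
s = "keep [a DontDeleteMe b] and [x DeleteMe y] end"

# Brackets containing the whole word DeleteMe, with no nested brackets
pattern = re.compile(r'\[[^\][]*\bDeleteMe\b[^\][]*\]')

result = pattern.sub('', s)

print(repr(result))
```

There is no word boundary between the "t" and the "D" in DontDeleteMe, so the first group is left alone.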
Remove square brackets that don't have spaces between them
The new version of stringr may be of use to you; it has a nice widget for testing out regex matching.
stringr::str_view_all(c("[please]", "[help me]"), "(\\[)\\S*(\\])")
matches [, then any number of non-space characters, then ], with the [ and ] as capture groups. I'm not sure what you want to do with them.
Update: To remove brackets, you actually want to capture what's inside and then substitute with it.
stringr::str_replace_all(c("[please]", "[help me]"), "\\[(\\S*)\\]", "\\1")
#> [1] "please" "[help me]"
(capture any run of non-space characters between brackets, and substitute the capture for the entire match wherever one is found)
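The same capture-and-substitute idea, sketched in Python for readers not using R. This is an illustrative equivalent, with re.sub playing the role of str_replace_all.

```python
import re

inputs = ["[please]", "[help me]"]

# Keep the bracket contents (group 1) only when they contain no whitespace
unwrapped = [re.sub(r'\[(\S*)\]', r'\1', item) for item in inputs]

print(unwrapped)
```

The second string is unchanged because \S* cannot cross the space, so the pattern never reaches a closing ].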
How can I remove the closing square bracket using regex in Python?
You can use
cleaned = re.sub(r'^\[+[A-Z\d-]+:\s*|]+$', '', string)
See the Python demo and the regex demo.
Alternatively, to make sure the string starts with [[word: and ends with ]s, you may use
cleaned = re.sub(r'^\[+[A-Z\d-]+:\s*(.*?)\s*]+$', r'\1', string)
See this regex demo and this Python demo.
And, in case you simply want to extract that text inside, you may use
# First match only
m = re.search(r'\[+[A-Z\d-]+:\s*(.*?)\s*]', string)
if m:
    print(m.group(1))
# All matches
matches = re.findall(r'\[+[A-Z\d-]+:\s*(.*?)\s*]', string)
See this regex demo and this Python demo.
Details
^ - start of string
\[+ - one or more [ chars
[A-Z\d-]+ - one or more uppercase ASCII letters, digits or - chars
: - a colon
\s* - zero or more whitespaces
| - or
]+$ - one or more ] chars at the end of string.
Also, (.*?) is a capturing group with ID 1 that matches any zero or more chars other than line break chars, as few as possible. \1 in the replacement refers to the value stored in this group memory buffer.
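The snippets above can be put together into one runnable sketch. The sample string is an assumption (the original question's input is not shown here); it is only chosen to fit the [[WORD: text]] shape the patterns expect.

```python
import re

# Hypothetical sample; the real input from the question is not shown
string = "[[ABC-1: some text]]"

# Strip the leading "[[WORD:" prefix and the trailing "]" chars
stripped = re.sub(r'^\[+[A-Z\d-]+:\s*|]+$', '', string)

# Stricter: require both the prefix and the trailing "]" chars,
# then keep only the captured middle part
strict = re.sub(r'^\[+[A-Z\d-]+:\s*(.*?)\s*]+$', r'\1', string)

# Extraction only: every "[WORD: text]" occurrence in the string
found = re.findall(r'\[+[A-Z\d-]+:\s*(.*?)\s*]', string)

print(stripped)
print(strict)
print(found)
```

With the stricter pattern, a string that lacks either the prefix or the closing brackets is returned unchanged, because the whole pattern has to match before anything is replaced.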