Regex, Get String Value Between Two Characters

Regular expression to get a string between two strings in Javascript

A lookahead (that (?= part) does not consume any input. It is a zero-width assertion (as are boundary checks and lookbehinds).

You want a regular match here, to consume the cow portion. To capture the portion in between, use a capturing group (just put the part of the pattern you want to capture inside parentheses):

cow(.*)milk

No lookaheads are needed at all.
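As a minimal sketch of the capturing-group approach (in Python rather than JavaScript, and with a made-up sample string), the text between the two words is read from group 1 rather than from the whole match:

```python
import re

text = "cow gives fresh milk"  # hypothetical sample input

# The parentheses capture whatever sits between "cow" and "milk".
match = re.search(r"cow(.*)milk", text)
between = match.group(1) if match else None

print(repr(between))
```

The whole match still includes cow and milk; only group 1 holds the part in between.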

Regex Match all characters between two strings

For example

(?<=This is)(.*)(?=sentence)


I used a lookbehind (?<=) and a lookahead (?=) so that "This is" and "sentence" are not included in the match, but this depends on your use case; you can also simply write This is(.*)sentence and read the captured group.

The important thing here is to activate the "dotall" mode of your regex engine, so that . also matches newlines. How you do this depends on your regex engine.

The next question is whether to use .* or .*?. The first is greedy and will match up to the last "sentence" in your string; the second is lazy and will match up to the next "sentence" in your string.
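The difference between greedy, lazy, and dotall matching can be sketched in Python, where dotall mode is enabled with re.DOTALL. The sample text is an assumption made up for illustration:

```python
import re

# Hypothetical input: two "This is ... sentence" spans, the first broken by a newline.
text = "This is the first\nsentence. This is the second sentence."

# Greedy: runs to the LAST "sentence" in the string.
greedy = re.search(r"This is(.*)sentence", text, re.DOTALL).group(1)

# Lazy: stops at the NEXT "sentence".
lazy = re.search(r"This is(.*?)sentence", text, re.DOTALL).group(1)

# Without dotall, "." cannot cross the newline, so the first span is skipped
# and the search falls through to the second one.
no_dotall = re.search(r"This is(.*)sentence", text).group(1)

print(repr(greedy), repr(lazy), repr(no_dotall))
```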

Update


This is(?s)(.*)sentence

Here (?s) turns on the dotall modifier, so that . also matches newline characters.
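Where the inline modifier may appear depends on the engine. As a hedged sketch in Python: the re module expects a global flag such as (?s) at the very start of the pattern (Python 3.11 and later reject it mid-pattern), and it behaves the same as passing re.DOTALL. The sample text is made up:

```python
import re

text = "This is a multi-line\nexample sentence"  # hypothetical input

# (?s) at the start of the pattern is equivalent to the re.DOTALL flag.
inline = re.search(r"(?s)This is(.*)sentence", text).group(1)
flagged = re.search(r"This is(.*)sentence", text, re.DOTALL).group(1)

print(repr(inline))
```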

Update 2:

(?<=is \()(.*?)(?=\s*\))

matches your example "This is (a simple) sentence". See it on Regexr.
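A quick check of this pattern in Python (an illustration only; the second input with a stray space is an added assumption) shows that the whole match is the parenthesised text and that \s* in the lookahead trims whitespace before the closing parenthesis:

```python
import re

pattern = r"(?<=is \()(.*?)(?=\s*\))"

# The lookarounds keep "is (" and ")" out of the match itself.
plain = re.search(pattern, "This is (a simple) sentence").group(0)

# Hypothetical variant with a space before ")": the lookahead's \s* absorbs it.
padded = re.search(pattern, "This is (a simple ) sentence").group(0)

print(repr(plain), repr(padded))
```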

Regular Expression to find a string included between two characters while EXCLUDING the delimiters

Easily done:

(?<=\[)(.*?)(?=\])

Technically that's using lookaheads and lookbehinds. See Lookahead and Lookbehind Zero-Width Assertions. The pattern consists of:

  • a lookbehind requiring the match to be preceded by a [, which is not captured;
  • a non-greedy capturing group, non-greedy so that it stops at the first ]; and
  • a lookahead requiring the match to be followed by a ], which is not captured.

Alternatively you can just capture what's between the square brackets:

\[(.*?)\]

and return the first captured group instead of the entire match.
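Both variants can be compared in a short Python sketch; the input string is an invented example. With lookarounds the whole match already excludes the brackets, while with the plain pattern the brackets are part of the match and group 1 holds the inner text:

```python
import re

text = "values [alpha] and [beta]"  # hypothetical input

# Lookaround version: the brackets are asserted but not consumed.
with_lookarounds = re.findall(r"(?<=\[)(.*?)(?=\])", text)

# Plain version: the brackets are consumed; findall returns group 1 only.
with_group = re.findall(r"\[(.*?)\]", text)

first = re.search(r"\[(.*?)\]", text)
whole_match, inner = first.group(0), first.group(1)

print(with_lookarounds, with_group, whole_match, inner)
```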

Get string between two characters using regex in typescript

If your identifiers are comprised of word characters only:

this regex works: (?<=id=")\w+(?="). The full match is the id. However, this can cause compatibility issues in older versions of Safari (before 16.4), which don't support lookbehinds ((?<=...)).

Test it on Regex101

You can also use capture groups with id="(\w+)". The only captured group will be the id you're looking for.

Test it on Regex101
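The capture-group variant avoids lookbehinds altogether, so it does not depend on engine support for them. A minimal illustration in Python (not TypeScript), using an invented HTML snippet:

```python
import re

# Hypothetical markup; the ids consist of word characters only.
html = '<div id="main_panel"><span id="title1"></span></div>'

# findall returns the contents of the single capturing group for each match.
ids = re.findall(r'id="(\w+)"', html)

print(ids)
```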

regex to get string between two % characters

Your pattern is fine but your code didn't compile. Try this instead:

Swift 4

let query = "Hello %test% how do you do %test1%"
let regex = try! NSRegularExpression(pattern:"%(.*?)%", options: [])
var results = [String]()

regex.enumerateMatches(in: query, options: [], range: NSMakeRange(0, query.utf16.count)) { result, flags, stop in
    if let r = result?.range(at: 1), let range = Range(r, in: query) {
        results.append(String(query[range]))
    }
}

print(results) // ["test", "test1"]

NSString uses UTF-16 encoding so NSMakeRange is called with the number of UTF-16 code units.
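To see why the UTF-16 count matters, here is a small illustration in Python (used only because it makes the two counts easy to compare; the string is a made-up example). A character outside the Basic Multilingual Plane is one code point but two UTF-16 code units, so a range based on the character count would be too short:

```python
# Hypothetical query containing one emoji (U+1F600), which needs a surrogate pair in UTF-16.
text = "Hello %t\U0001F600st%"

code_points = len(text)                           # Python counts code points
utf16_units = len(text.encode("utf-16-le")) // 2  # two bytes per UTF-16 code unit

print(code_points, utf16_units)
```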

Swift 2

let query = "Hello %test% how do you do %test1%"
let regex = try! NSRegularExpression(pattern:"%(.*?)%", options: [])
let tmp = query as NSString
var results = [String]()

regex.enumerateMatchesInString(query, options: [], range: NSMakeRange(0, tmp.length)) { result, flags, stop in
    if let range = result?.rangeAtIndex(1) {
        results.append(tmp.substringWithRange(range))
    }
}

print(results) // ["test", "test1"]

Getting a substring out of Swift's native String type is somewhat of a hassle. That's why I cast query to an NSString.

Regular expression: extract string between two characters/strings

You may use

xx <- "gee(formula = breaks ~ tension, id = wool, data = warpbreaks)"
sub(".*\\bid\\s*=\\s*(\\w+).*", "\\1", xx)
## or, if the value extracted may contain any chars but commas
sub(".*\\bid\\s*=\\s*([^,]+).*", "\\1", xx)

See the R demo and the regex demo.

Details

  • .* - any 0+ chars, as many as possible
  • \\bid - a whole word id (\b is a word boundary)
  • \\s*=\\s* - a = enclosed with 0+ whitespaces
  • (\\w+) - Capturing group 1 (\\1 in the replacement pattern refers to this value): one or more letters, digits or underscores (or [^,]+ matches 1+ chars other than a comma)
  • .* - the rest of the string.

Other alternative solutions:

> xx <- "gee(formula = breaks ~ tension, id = wool, data = warpbreaks)"
> regmatches(xx, regexpr("\\bid\\s*=\\s*\\K[^,]+", xx, perl=TRUE))
[1] "wool"

The pattern matches id, then = enclosed with 0+ whitespaces; \K then discards the text matched so far, so only the 1+ chars other than , end up in the match value.

Or, a capturing approach with stringr::str_match is also valid here:

> library(stringr)
> str_match(xx, "\\bid\\s*=\\s*([^,]+)")[,2]
[1] "wool"
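For comparison, the same capturing approach can be sketched in Python. Note that Python's standard re module does not support \K, so a capturing group is the natural choice there:

```python
import re

xx = "gee(formula = breaks ~ tension, id = wool, data = warpbreaks)"

# Group 1 holds everything after "id =" up to the next comma.
any_chars = re.search(r"\bid\s*=\s*([^,]+)", xx).group(1)

# Stricter variant: only letters, digits, and underscores.
word_chars = re.search(r"\bid\s*=\s*(\w+)", xx).group(1)

print(any_chars, word_chars)
```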

Match text between two strings with regular expression

Use re.search

>>> import re
>>> s = 'Part 1. Part 2. Part 3 then more text'
>>> re.search(r'Part 1\.(.*?)Part 3', s).group(1)
' Part 2. '
>>> re.search(r'Part 1(.*?)Part 3', s).group(1)
'. Part 2. '

Or use re.findall if there is more than one occurrence.
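A short sketch of the re.findall variant, assuming an input with two delimited spans (the string is invented for illustration):

```python
import re

# Hypothetical input with two "Part 1. ... Part 3" spans.
s = "Part 1. alpha Part 3, Part 1. beta Part 3"

# With one capturing group, findall returns a list of the group's contents.
parts = re.findall(r"Part 1\.(.*?)Part 3", s)

print(parts)
```

The lazy .*? matters here: a greedy .* would run from the first Part 1. to the last Part 3 and return a single item.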

How to use REGEXEXTRACT to extract certain characters between two strings

You can use

=ARRAYFORMULA(REGEXEXTRACT(B3:B, "\s-\s+([^.]*?)\s*\."))

See the regex demo. Details:

  • \s-\s+ - a whitespace, -, one or more whitespaces
  • ([^.]*?) - Group 1: zero or more chars other than a . as few as possible
  • \s* - zero or more whitespaces
  • \. - a . char.
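The behaviour of this pattern can be checked outside Google Sheets with a small Python sketch. The cell value below is an assumption made up for illustration, not data from the original question:

```python
import re

pattern = r"\s-\s+([^.]*?)\s*\."

# Hypothetical cell content: extra spaces after the hyphen and before the dot.
cell = "Order 17 -  Blue widget . Shipped."

# Group 1 is the text between the hyphen and the first dot, with surrounding
# whitespace left outside the group by \s+ and \s*.
extracted = re.search(pattern, cell).group(1)

print(repr(extracted))
```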

