Swift Extract Regex Matches

Even though the matches(in:options:range:) method (matchesInString() in Swift 2)
takes a String as its first argument, it works internally with NSString, so the
range parameter must be given in terms of the NSString length (UTF-16 code units)
and not the Swift string length. Otherwise it will fail for "extended grapheme
clusters" such as flag emoji.
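
To make the length mismatch concrete, here is a small sketch. The sample string is made up for illustration; it shows that a flag emoji counts as one Character but as four UTF-16 code units, so a range built from the character count stops short of the digit at the end.

```swift
import Foundation

// Illustrative sample: flag emoji, euro sign, digit.
let sample = "🇩🇪€4"

let characterCount = sample.count             // extended grapheme clusters
let utf16Count = sample.utf16.count           // UTF-16 code units
let nsLength = (sample as NSString).length    // the unit NSRange is measured in

print(characterCount, utf16Count, nsLength)
// 3 6 6

// A range built from the character count covers only UTF-16 offsets 0..<3,
// so the digit (at offset 5) lies outside the searched range.
let tooShort = NSRange(location: 0, length: characterCount)
let correct = NSRange(location: 0, length: utf16Count)
```

Passing tooShort to NSRegularExpression would search only part of the flag, whereas correct spans the whole string.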

As of Swift 4 (Xcode 9), the Swift standard
library provides functions to convert between Range<String.Index>
and NSRange.

import Foundation

func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        // Build the NSRange from String indices so it is measured in UTF-16 code units.
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range, in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matched = matches(for: "[0-9]", in: string)
print(matched)
// ["4", "9"]

Note: The forced unwrap Range($0.range, in: text)! is safe because
the NSRange refers to a substring of the given string text.
However, if you want to avoid it, use

        return results.compactMap {
            Range($0.range, in: text).map { String(text[$0]) }
        }

instead. (Before Swift 4.1 this method was called flatMap; that spelling is now deprecated for closures returning an Optional.)


(Older answer for Swift 3 and earlier:)

Before Swift 4, you should convert the given Swift string to an NSString and then
extract the ranges. The result is bridged to a Swift string array automatically.

(The code for Swift 1.2 can be found in the edit history.)

Swift 2 (Xcode 7.3.1):

func matchesForRegexInText(regex: String, text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex, options: [])
        let nsString = text as NSString
        let results = regex.matchesInString(text,
                                            options: [], range: NSMakeRange(0, nsString.length))
        return results.map { nsString.substringWithRange($0.range) }
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matches = matchesForRegexInText("[0-9]", text: string)
print(matches)
// ["4", "9"]

Swift 3 (Xcode 8):

func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let nsString = text as NSString
        let results = regex.matches(in: text, range: NSRange(location: 0, length: nsString.length))
        return results.map { nsString.substring(with: $0.range) }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matched = matches(for: "[0-9]", in: string)
print(matched)
// ["4", "9"]

Swift - Regex to extract value

I guess you want to remove the tags.

If the backslash is not actually part of the string (for example, it only appears because of escaping), the pattern is simple: <em> with an optional slash, /?

let trimmedString = string.replacingOccurrences(of: "</?em>", with: "", options: .regularExpression)

If the string really contains the backslash, the pattern becomes

let trimmedString = string.replacingOccurrences(of: "<\\\\?/?em>", with: "", options: .regularExpression)
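
As a quick check of both patterns, the following sketch applies them to two made-up inputs, one with a plain closing tag and one containing a literal backslash. The variable names are illustrative only.

```swift
import Foundation

let plain = "Fully <em>Furni</em>shed"
let escaped = "Fully <em>Furni<\\/em>shed"   // contains a literal backslash

// "</?em>" matches <em> and </em>.
let strippedPlain = plain.replacingOccurrences(
    of: "</?em>", with: "", options: .regularExpression)

// "<\\\\?/?em>" additionally allows one optional backslash after "<".
let strippedEscaped = escaped.replacingOccurrences(
    of: "<\\\\?/?em>", with: "", options: .regularExpression)

print(strippedPlain)    // Fully Furnished
print(strippedEscaped)  // Fully Furnished
```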

If you want to extract only "Furnished", you need two capture groups: one for the string between the tags and one for everything after the closing tag up to the next whitespace character.

let string = "Fully <em>Furni<\\/em>shed |Downtown and Canal Views"
let pattern = "<em>(.*)<\\\\?/em>(\\S+)"
do {
    let regex = try NSRegularExpression(pattern: pattern)
    if let match = regex.firstMatch(in: string, range: NSRange(string.startIndex..., in: string)) {
        let part1 = string[Range(match.range(at: 1), in: string)!]
        let part2 = string[Range(match.range(at: 2), in: string)!]
        print(String(part1 + part2))
    }
} catch { print(error) }

(Swift) How to find all strings that matches regex in string

You need to find all occurrences in your string manually, using a while loop that restarts the search after each match, and collect the matched substrings instead of their ranges:

func findSrcs(_ content: String) -> [Substring] {
    let pattern = #"(?<=src=")[^"]+"#
    var srcs: [Substring] = []
    var startIndex = content.startIndex
    while let range = content[startIndex...].range(of: pattern, options: .regularExpression) {
        srcs.append(content[range])
        startIndex = range.upperBound
    }
    return srcs
}

Playground testing:

let content = """
<span>whatever</span>
<img src="smiley.gif" alt="Smiley face">
<span>whatever</span>
<img src="stackoverflow.jpg" alt="Stack Overflow">
"""

print(findSrcs(content))

This will print

["smiley.gif", "stackoverflow.jpg"]

Regular expressions in swift

Split the string on every character that is neither alphanumeric nor whitespace, then trim the whitespace from each element.

import Foundation

extension String {
    // Trim before filtering so that whitespace-only pieces are dropped as well.
    func words() -> [String] {
        return self.components(separatedBy: CharacterSet.alphanumerics.inverted.subtracting(.whitespaces))
            .map({ $0.trimmingCharacters(in: .whitespaces) })
            .filter({ !$0.isEmpty })
    }
}

let string1 = "(name,john,string for user name)"
let string2 = "(name, john,name of john)"
let string3 = "key = value // comment"

print(string1.words()) // ["name", "john", "string for user name"]
print(string2.words()) // ["name", "john", "name of john"]
print(string3.words()) // ["key", "value", "comment"]
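
Since the question is about regular expressions, here is a rough regex-based counterpart. It is only a sketch: wordRuns(in:) is a hypothetical helper, and the pattern is an approximation of the CharacterSet logic (it joins runs on spaces only, not tabs), so it is not an exact equivalent of words().

```swift
import Foundation

// Hypothetical helper: collects runs of letters/digits that may contain inner spaces.
func wordRuns(in text: String) -> [String] {
    // One run of letters or digits, optionally followed by further
    // space-separated runs (so "name of john" stays one element).
    let pattern = "[\\p{L}\\p{N}]+(?: +[\\p{L}\\p{N}]+)*"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap {
        Range($0.range, in: text).map { String(text[$0]) }
    }
}

print(wordRuns(in: "(name, john,name of john)"))
// ["name", "john", "name of john"]
```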

Make sure regex matches the entire string with Swift regex

You need to use anchors, ^ (start of string anchor) and $ (end of string anchor), with range(of:options:range:locale:), passing the .regularExpression option:

import Foundation

let phoneNumber = "123-456-789"
let result = phoneNumber.range(of: "^\\d{3}-\\d{3}-\\d{3}$", options: .regularExpression) != nil
print(result)

Alternatively, pass the options [.regularExpression, .anchored]. The .anchored option anchors the pattern at the start of the string only, so you can omit ^, but $ is still required to anchor the match at the end of the string:

let result = phoneNumber.range(of: "\\d{3}-\\d{3}-\\d{3}$", options: [.regularExpression, .anchored]) != nil
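
To see what the anchors change, the sketch below compares an unanchored search with a fully anchored one on a string that merely contains a phone number. matchesEntirely(_:pattern:) is a hypothetical helper name, and wrapping the pattern in (?:...) is an extra precaution so that an alternation inside the pattern stays anchored on both sides.

```swift
import Foundation

let digits = "\\d{3}-\\d{3}-\\d{3}"

// Hypothetical helper: true only if the whole string matches the pattern.
func matchesEntirely(_ text: String, pattern: String) -> Bool {
    return text.range(of: "^(?:\(pattern))$", options: .regularExpression) != nil
}

// Without anchors, a match anywhere inside the string is enough.
let containsNumber = "x123-456-789y".range(of: digits, options: .regularExpression) != nil

print(containsNumber)                                     // true
print(matchesEntirely("x123-456-789y", pattern: digits))  // false
print(matchesEntirely("123-456-789", pattern: digits))    // true
```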

Also, using NSPredicate with MATCHES is an alternative here:

The left hand expression equals the right hand expression using a regex-style comparison according to ICU v3 (for more details see the ICU User Guide for Regular Expressions).

MATCHES actually anchors the regex match both at the start and end of the string (note this might not work in all Swift 3 builds):

let pattern = "\\d{3}-\\d{3}-\\d{3}"
let predicate = NSPredicate(format: "self MATCHES [c] %@", pattern)
let result = predicate.evaluate(with: "123-456-789")

Retrieve array of substring matched with regex in swift

Your fundamental issue, as @jnpdx hinted at in a comment, is that your regexp string contains control elements from another language. The following should solve your issue:

let regexp = "@\\w*"

You also get bogged down in unnecessary try-catch statements and outdated APIs based on Objective-C and their related type conversions. The following should do:

func matches(for regex: String, in text: String) -> [String] {
    var result = [String]()
    var startIndex = text.startIndex
    let endIndex = text.endIndex
    while let range = text.range(of: regex,
                                 options: .regularExpression,
                                 range: startIndex ..< endIndex)
    {
        result.append(String(text[range]))
        startIndex = range.upperBound
    }
    return result
}
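
One caveat with this kind of loop: as far as I can tell, it can spin forever if the pattern is able to match an empty string (for example "\\w*"), because the search start then never advances. A defensive variant might look like the sketch below; allMatches(of:in:) is a hypothetical name, and skipping zero-length matches is an assumption about what the caller wants.

```swift
import Foundation

// Hypothetical variant of the loop above that skips zero-length matches.
func allMatches(of pattern: String, in text: String) -> [String] {
    var result: [String] = []
    var searchStart = text.startIndex
    while searchStart < text.endIndex,
          let range = text.range(of: pattern,
                                 options: .regularExpression,
                                 range: searchStart ..< text.endIndex) {
        if range.isEmpty {
            // Nothing was consumed: step one character forward, or stop at the end.
            guard range.lowerBound < text.endIndex else { break }
            searchStart = text.index(after: range.lowerBound)
            continue
        }
        result.append(String(text[range]))
        searchStart = range.upperBound
    }
    return result
}

print(allMatches(of: "@\\w*", in: "hi @alice and @bob"))
// ["@alice", "@bob"]
```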

Swift 3 - How do I extract captured groups in regular expressions?

but I don't know how to do that in Swift 3.

When you receive a match from NSRegularExpression, what you get is an NSTextCheckingResult. You call rangeAt(_:) to get the range of a specific capture group. (In Swift 4 and later the method is named range(at:).)

Example:

let s = "hey ho ha"
let pattern = "(h).*(h).*(h)"
// our goal is capture group 3, "h" in "ha"
let regex = try! NSRegularExpression(pattern: pattern)
let result = regex.matches(in:s, range:NSMakeRange(0, s.utf16.count))
let third = result[0].rangeAt(3) // <-- !!
third.location // 7
third.length // 1
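
For Swift 4 and later, a sketch of the same idea that returns the captured text rather than NSRange values could look like this. captureGroups(of:in:) is a hypothetical helper; it looks only at the first match and silently drops groups that did not take part in it.

```swift
import Foundation

// Hypothetical helper: texts of capture groups 1...n of the first match.
func captureGroups(of pattern: String, in text: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, range: fullRange) else { return [] }
    // Range 0 is the whole match; the capture groups start at index 1.
    return (1..<match.numberOfRanges).compactMap { index in
        Range(match.range(at: index), in: text).map { String(text[$0]) }
    }
}

let groups = try! captureGroups(of: "(\\d+)-(\\d+)", in: "call 555-1234")
print(groups)
// ["555", "1234"]
```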

