What Is the Ruby Regex for Including Apostrophes

I want to match all punctuation in my regexp except apostrophes. How do I do that in Ruby?

string = "jack. o'reilly? mike??!?"
puts string.gsub(/[\p{P}&&[^']]/, '')
# => jack o'reilly mike

Docs:

A character class may contain another character class. By itself this isn’t useful because [a-z[0-9]] describes the same set as [a-z0-9]. However, character classes also support the && operator which performs set intersection on its arguments.

So, [\p{P}&&[^']] is "any character that is punctuation and also not an apostrophe".
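
To make the quoted documentation concrete, here is a small sketch of nesting versus intersection. The sample strings and variable names are made up for illustration; only the last pattern comes from the answer above.

```ruby
# Nested classes are a plain union: both patterns accept the same characters.
sample = "a1-Z"
nested = sample.scan(/[a-z[0-9]]/)  # => ["a", "1"]
flat   = sample.scan(/[a-z0-9]/)    # => ["a", "1"]

# && intersects: lowercase letters that are also not vowels.
consonants = "regexp".scan(/[a-z&&[^aeiou]]/)  # => ["r", "g", "x", "p"]

# The pattern from the answer: punctuation, minus the apostrophe.
cleaned = "jack. o'reilly? mike??!?".gsub(/[\p{P}&&[^']]/, '')
puts cleaned  # prints: jack o'reilly mike
```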

select quotes but NOT apostrophes in REGEX

Simplest one for your situation could be something like this.

Regex: /\s'|'\s/ and replace with a space.

You can also go with /(['"])([A-Za-z]+)\1/ and replace with \2, i.e. the second captured group.
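
Both suggestions can be tried in Ruby with gsub. This is only a sketch on a made-up sample string; the variable names are illustrative.

```ruby
text = %q{say 'hello' and "bye" but don't}

# First pattern: a quote next to whitespace becomes a single space.
spaced = text.gsub(/\s'|'\s/, ' ')
# => "say hello and \"bye\" but don't"

# Second pattern: a matching pair of quotes around letters is replaced by
# the letters alone (\2 is the second capture group).
unquoted = text.gsub(/(['"])([A-Za-z]+)\1/, '\2')
# => "say hello and bye but don't"
```

Note that the first pattern does not touch a quote at the very start or end of the string, and it would also remove a trailing possessive apostrophe such as the one in "dogs' bowls", so it probably needs adjusting for real text.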

ruby regex extract word between single quotes

You can use the following expression:

/\w+(?:'\w+)*/

See the Rubular demo

The expression matches:

  • \w+ - 1 or more word chars
  • (?:'\w+)* - zero or more repetitions of the following sequence ((?:...) is a non-capturing group, and the * quantifier lets it match 0 or more times):

    • ' - apostrophe
    • \w+ - 1 or more word chars.

See a short Ruby demo here:

"ciao: c'iao 'ciao'".scan(/\w+(?:'\w+)*/)
# => ["ciao", "c'iao", "ciao"]

Regex to allow alphanumeric characters and should allow . (dot) ' (apostrophe) and - (dash)

A few things were missing:

  • Escape the last dash in the set. The - symbol denotes a range in a set, such as with a-z.
  • After the set add +, so that the characters are matched one or more times.

Expression

^[a-zA-Z0-9\.'\-]+$

You could also revise it to something like ^[a-zA-Z0-9\.'\-]{5,}$, where the {5,} requires at least 5 consecutive characters from the set. Usually user names have to be longer than 1 character.
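
If this check is done in Ruby, one detail is worth a sketch: ^ and $ match at line boundaries there, so a multi-line string can slip through. The version below swaps in \A and \z, which anchor to the whole string. The method name valid_username? is hypothetical, and the 5-character minimum is just the example value from above.

```ruby
# Hypothetical helper: whole-string anchors instead of line anchors.
def valid_username?(name)
  /\A[a-zA-Z0-9.'\-]{5,}\z/.match?(name)
end

valid_username?("o'brien-jr.")         # => true
valid_username?("abc")                 # => false (shorter than 5)
valid_username?("good.name\nbad one")  # => false

# With ^ and $ the first line alone is enough to satisfy the pattern:
line_anchored = /^[a-zA-Z0-9\.'\-]{5,}$/.match?("good.name\nbad one")  # => true
```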

How do I replace all the apostrophes that come right before or right after a comma?

This answers the question, "I want to replace the apostrophes that come right before or right after a comma".

r = /
(?<=,) # match a comma in a positive lookbehind
\' # match an apostrophe
| # or
\' # match an apostrophe
(?=,) # match a comma in a positive lookahead
/x # free-spacing regex definition mode

aString = "old_tag1,x'old_tag2'x,x'old_tag3','new_tag1','new_tag2'"

aString.gsub(r, '')
#=> "old_tag1,x'old_tag2'x,x'old_tag3,new_tag1,new_tag2'"

If the objective is instead to remove single quotes enclosing a substring when the left quote is at the beginning of the string or is immediately preceded by a comma and the right quote is at the end of the string or is immediately followed by a comma, several approaches are possible. One is to use a single, modified regex, as @Dimitry has done. Another is to split the string on commas, process each string in the resulting array and then join the modified substrings, separated by commas.

r = /
\A # match beginning of string
\' # match single quote
.* # match zero or more characters
\' # match single quote
\z # match end of string
/x # free-spacing regex definition mode

aString.split(',').map { |s| (s =~ r) ? s[1..-2] : s }.join(',')
#=> "old_tag1,x'old_tag2'x,x'old_tag3',new_tag1,new_tag2"

Note:

arr = aString.split(',')
#=> ["old_tag1", "x'old_tag2'x", "x'old_tag3'", "'new_tag1'", "'new_tag2'"]
"old_tag1" =~ r #=> nil
"x'old_tag2'x" =~ r #=> nil
"x'old_tag3'" =~ r #=> nil
"'new_tag1'" =~ r #=> 0
"'new_tag2'" =~ r #=> 0

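For comparison, the "single, modified regex" route mentioned above could look something like the following. This is a sketch of one possible pattern, not necessarily the one @Dimitry posted; the names a_string, field_quotes and result are illustrative.

```ruby
a_string = "old_tag1,x'old_tag2'x,x'old_tag3','new_tag1','new_tag2'"

field_quotes = /
  (?<![^,])  # preceded by a comma or by the start of the string
  '          # opening single quote
  ([^,]*)    # the field body, captured as group 1
  '          # closing single quote
  (?![^,])   # followed by a comma or by the end of the string
/x

result = a_string.gsub(field_quotes, '\1')
#=> "old_tag1,x'old_tag2'x,x'old_tag3',new_tag1,new_tag2"
```

The double negatives (?<![^,]) and (?![^,]) are a compact way of saying "a comma or the string boundary" without needing an alternation inside the lookarounds.
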
Ruby regex: exclude apostrophe but include it if it's escaped

A simpler way (assumes you don't need to match anything past your captures):

AjouterRDV\((\d+),(\d+),(\d+),(\d+),'(.+?)',

See Rubular example
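
A quick way to try the pattern in Ruby is shown below. The input line is invented for illustration (only the AjouterRDV call name comes from the pattern), so treat it as a sketch rather than the asker's real data.

```ruby
# Hypothetical input containing a backslash-escaped apostrophe.
line = "AjouterRDV(12,30,14,45,'Rendez-vous chez l\\'ami',0)"

pattern = /AjouterRDV\((\d+),(\d+),(\d+),(\d+),'(.+?)',/
m = line.match(pattern)

numbers = m.captures[0, 4].map(&:to_i)  # => [12, 30, 14, 45]
label   = m[5]                          # escaped apostrophe is kept as-is
puts label  # prints: Rendez-vous chez l\'ami
```

The lazy (.+?) stops at the first apostrophe that is directly followed by a comma, which is why the escaped apostrophe inside the text is skipped. The flip side is that text containing the sequence ', itself would be cut short.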

How do I match `:punct:` except for some character?

From Ruby docs:

A character class may contain another character class. By itself this isn't useful because [a-z[0-9]] describes the same set as [a-z0-9]. However, character classes also support the && operator which performs set intersection on its arguments.

So, "punctuation but not apostrophe" is:

[[:punct:]&&[^']]

EDIT: At revo's request in the question comments, I benchmarked the alternatives. On my machine, the lookahead version is ~10% slower than the intersection, and the lookbehind version ~20% slower:

require 'benchmark'

N = 1_000_000
STR = "Mr. O'Brien! Please don't go, Mr. O'Brien!"

def test(bm, re)
  N.times {
    STR.scan(re).size
  }
end

Benchmark.bm do |bm|
  bm.report("intersection") { test(bm, /[[:punct:]&&[^']]/) }
  bm.report("lookahead") { test(bm, /(?!')[[:punct:]]/) }
  bm.report("lookbehind") { test(bm, /[[:punct:]](?<!')/) }
end
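
Before comparing timings it can help to confirm that the three patterns really select the same characters. A minimal check on the benchmark string (the variable names here are illustrative):

```ruby
text = "Mr. O'Brien! Please don't go, Mr. O'Brien!"

patterns = {
  intersection: /[[:punct:]&&[^']]/,
  lookahead:    /(?!')[[:punct:]]/,
  lookbehind:   /[[:punct:]](?<!')/
}

# Collect the matches of each pattern so they can be compared side by side.
results = patterns.transform_values { |re| text.scan(re) }
results.each { |name, found| puts "#{name}: #{found.inspect}" }
# each line lists [".", "!", ",", ".", "!"]
```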

