I want to match all punctuation in my regexp except apostrophes. How do I do that in Ruby?
string = "jack. o'reilly? mike??!?"
puts string.gsub(/[\p{P}&&[^']]/, '')
# => jack o'reilly mike
Docs:
A character class may contain another character class. By itself this isn’t useful because [a-z[0-9]] describes the same set as [a-z0-9]. However, character classes also support the && operator which performs set intersection on its arguments.
So, [\p{P}&&[^']] is "any character that is punctuation and also not an apostrophe".
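The two claims in the quoted docs can be checked directly. The following is a minimal sketch; the sample strings and variable names are assumptions chosen for illustration, not part of the original answer.

```ruby
# Nested class: a plain union, so it matches the same set as the flat class.
nested = "abc123-xyz".scan(/[a-z[0-9]]/)
flat   = "abc123-xyz".scan(/[a-z0-9]/)
puts nested == flat

# Intersection: lowercase letters that are also NOT vowels.
consonants = "hello world".scan(/[a-z&&[^aeiou]]/).join
puts consonants

# Same idea as the answer: punctuation that is also NOT an apostrophe.
cleaned = "jack. o'reilly? mike??!?".gsub(/[\p{P}&&[^']]/, '')
puts cleaned
```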
select quotes but NOT apostrophes in REGEX
The simplest option for your situation could be something like this.
Regex: /\s'|'\s/ and replace with a space.
You can also go with /(['"])([A-Za-z]+)\1/ and replace with \2, i.e. the second captured group.
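Both suggestions can be tried in Ruby with gsub. This is a sketch only; the sample sentence is an assumption, and the second pattern only handles quoted runs made purely of ASCII letters.

```ruby
# Hypothetical sample: one quoted word plus a contraction that must survive.
text = "he said 'hello' and don't"

# Option 1: a quote next to whitespace is treated as a quotation mark
# and replaced with a single space.
spaced = text.gsub(/\s'|'\s/, ' ')
puts spaced

# Option 2: match an opening quote, letters, and the same closing quote,
# then keep only the letters (the second captured group).
unwrapped = text.gsub(/(['"])([A-Za-z]+)\1/, '\2')
puts unwrapped
```

In both cases the apostrophe in "don't" is left alone, because it has letters on both sides.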
ruby regex extract word between single quotes
You can use the following expression:
/\w+(?:'\w+)*/
See the Rubular demo
The expression matches:
- \w+ - 1 or more word chars
- (?:'\w+)* - zero or more sequences (as (?:...)* is a non-capturing group that groups a sequence of subpatterns quantified with the * quantifier matching 0 or more occurrences) of:
  - ' - apostrophe
  - \w+ - 1 or more word chars.
See a short Ruby demo here:
"ciao: c'iao 'ciao'".scan(/\w+(?:'\w+)*/)
# => ["ciao", "c'iao", "ciao"]
Regex to allow alphanumeric characters and should allow . (dot) ' (apostrophe) and - (dash)
A few things were missing:
- Escape the last dash in the set. The - symbol denotes a range in a set, such as with a-z.
- After the set add +, so that the characters are matched one or more times.
Expression
^[a-zA-Z0-9\.'\-]+$
You could also revise it to something like ^[a-zA-Z0-9\.'\-]{5,}$, where the {5,} requires at least 5 consecutive characters from the set. Usernames usually have to be longer than 1 character.
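If this expression is used for validation in Ruby, note that ^ and $ match at line boundaries there, so a multi-line input can slip through; \A and \z anchor to the whole string. The sketch below assumes a hypothetical helper named valid_username? and hypothetical sample inputs.

```ruby
# Whole-string anchors (\A, \z) instead of line anchors (^, $).
USERNAME_RE = /\A[a-zA-Z0-9.'\-]{5,}\z/

# Hypothetical helper: true when the name uses only allowed characters
# and is at least 5 characters long.
def valid_username?(name)
  USERNAME_RE.match?(name)
end

ok_names  = ["o'reilly", "j.doe-1"].map { |n| valid_username?(n) }
bad_names = ["abc", "jack smith", "jack.\n!!"].map { |n| valid_username?(n) }

# The line-anchored version accepts the multi-line input,
# because its first line alone satisfies the pattern.
line_anchored = /^[a-zA-Z0-9.'\-]{5,}$/.match?("jack.\n!!")

p ok_names, bad_names, line_anchored
```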
How do I replace all the apostrophes that come right before or right after a comma?
This answers the question, "I want to replace the apostrophes that come right before or right after a comma".
r = /
(?<=,) # match a comma in a positive lookbehind
\' # match an apostrophe
| # or
\' # match an apostrophe
(?=,) # match a comma in a positive lookahead
/x # free-spacing regex definition mode
aString = "old_tag1,x'old_tag2'x,x'old_tag3','new_tag1','new_tag2'"
aString.gsub(r, '')
#=> "old_tag1,x'old_tag2'x,x'old_tag3,new_tag1,new_tag2'"
If the objective is instead to remove single quotes enclosing a substring when the left quote is at the beginning of the string or is immediately preceded by a comma, and the right quote is at the end of the string or is immediately followed by a comma, several approaches are possible. One is to use a single, modified regex, as @Dimitry has done. Another is to split the string on commas, process each string in the resulting array and then join the modified substrings, separated by commas.
r = /
\A # match beginning of string
\' # match single quote
.* # match zero or more characters
\' # match single quote
\z # match end of string
/x # free-spacing regex definition mode
aString.split(',').map { |s| (s =~ r) ? s[1..-2] : s }.join(',')
#=> "old_tag1,x'old_tag2'x,x'old_tag3',new_tag1,new_tag2"
Note:
arr = aString.split(',')
#=> ["old_tag1", "x'old_tag2'x", "x'old_tag3'", "'new_tag1'", "'new_tag2'"]
"old_tag1" =~ r #=> nil
"x'old_tag2'x" =~ r #=> nil
"x'old_tag3'" =~ r #=> nil
"'new_tag1'" =~ r #=> 0
"'new_tag2'" =~ r #=> 0
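For comparison, the single-regex approach could look like the following. This is a sketch written for illustration, not necessarily the regex @Dimitry used; the names a_string and single are assumptions, and the pattern assumes the quoted fields contain no commas.

```ruby
a_string = "old_tag1,x'old_tag2'x,x'old_tag3','new_tag1','new_tag2'"

single = /
  (?:\A|(?<=,))  # at the start of the string or right after a comma
  '              # opening single quote
  ([^,]*)        # field contents (no commas), captured as group 1
  '              # closing single quote
  (?=,|\z)       # followed by a comma or the end of the string
/x

# Keep only the captured contents, dropping the enclosing quotes.
result = a_string.gsub(single, '\1')
puts result
```

On this input it produces the same string as the split/map/join version.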
Ruby regex: exclude apostrophe but include it if it's escaped
A simpler way (assumes you don't need to match anything past your captures):
AjouterRDV\((\d+),(\d+),(\d+),(\d+),'(.+?)',
See Rubular example
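A quick way to try the expression in Ruby is shown below. The input line is a hypothetical sample (the original question's data is not reproduced here), and the approach assumes the quoted text never contains the sequence quote-comma.

```ruby
# Hypothetical input: four numbers, then a quoted text with an escaped apostrophe.
line = "AjouterRDV(1,2,3,4,'l\\'ami',5)"

re = /AjouterRDV\((\d+),(\d+),(\d+),(\d+),'(.+?)',/

m = line.match(re)
numbers = m.captures.first(4)  # the four numeric arguments, as strings
text    = m[5]                 # lazy match stops at the first quote-comma pair

p numbers
puts text
```

The lazy .+? skips the escaped apostrophe because it is not followed by a comma.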
How do I match `:punct:` except for some character?
From Ruby docs:
A character class may contain another character class. By itself this isn't useful because [a-z[0-9]] describes the same set as [a-z0-9]. However, character classes also support the && operator which performs set intersection on its arguments.
So, "punctuation but not apostrophe" is:
[[:punct:]&&[^']]
EDIT: At revo's request in the question comments, I benchmarked this; on my machine lookahead is ~10% slower and lookbehind is ~20% slower than the intersection:
require 'benchmark'
N = 1_000_000
STR = "Mr. O'Brien! Please don't go, Mr. O'Brien!"
def test(bm, re)
  N.times {
    STR.scan(re).size
  }
end
Benchmark.bm do |bm|
  bm.report("intersection") { test(bm, /[[:punct:]&&[^']]/) }
  bm.report("lookahead") { test(bm, /(?!')[[:punct:]]/) }
  bm.report("lookbehind") { test(bm, /[[:punct:]](?<!')/) }
end
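Before comparing timings, it may be worth confirming that the three patterns agree on the benchmark string. A small sketch, with the hash keys and variable names chosen here for illustration:

```ruby
str = "Mr. O'Brien! Please don't go, Mr. O'Brien!"

patterns = {
  intersection: /[[:punct:]&&[^']]/,  # punctuation AND not an apostrophe
  lookahead:    /(?!')[[:punct:]]/,   # next char is not ', then punctuation
  lookbehind:   /[[:punct:]](?<!')/   # punctuation, and it was not '
}

# Collect the matches of each pattern on the same input.
results = patterns.transform_values { |re| str.scan(re) }

results.each { |name, found| puts "#{name}: #{found.inspect}" }
puts results.values.uniq.size == 1
```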