How to change case of letters in string using RegEx in Ruby
@sawa has the simple answer, and you've edited your question with another mechanism. However, to answer two of your questions:
Is there a way to do this within the regex though?
No, Ruby's regex does not support a case-changing feature as some other regex flavors do. You can "prove" this to yourself by reviewing the official Ruby regex docs for 1.9 and 2.0 and searching for the word "case":
- https://github.com/ruby/ruby/blob/ruby_1_9_3/doc/re.rdoc
- https://github.com/ruby/ruby/blob/ruby_2_0_0/doc/re.rdoc
I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work?
Your use of \1 is a kind of backreference. Backreferences such as \1 can also be used in the search pattern itself. For example, the regular expression /f(.)\1/ will find the letter f, followed by any character, followed by that same character (e.g. "foo" or "f!!").
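As a small illustration of a backreference inside the pattern itself, the sketch below checks a few sample strings against /f(.)\1/. The helper name and the sample words are assumptions made up for this example.

```ruby
# Hypothetical helper: true when str contains "f" followed by a doubled character.
def doubled_after_f?(str)
  # \1 must repeat exactly what the first group (.) captured.
  str.match?(/f(.)\1/)
end

puts doubled_after_f?("foo")  # "oo" repeats the captured "o"
puts doubled_after_f?("f!!")  # "!!" repeats the captured "!"
puts doubled_after_f?("fog")  # "o" is not followed by another "o"
```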
In this case, within a replacement string passed to a method like String#gsub, the backreference does refer to the previous capture. From the docs:
"If replacement is a String it will be substituted for the matched text. It may contain back-references to the pattern’s capture groups of the form \d, where d is a group number, or \k<n>, where n is a group name. If it is a double-quoted string, both back-references must be preceded by an additional backslash."
In practice, this means:
"hello world".gsub( /([aeiou])/, '_\1_' ) #=> "h_e_ll_o_ w_o_rld"
"hello world".gsub( /([aeiou])/, "_\1_" ) #=> "h_\u0001_ll_\u0001_ w_\u0001_rld"
"hello world".gsub( /([aeiou])/, "_\\1_" ) #=> "h_e_ll_o_ w_o_rld"
Now, you have to understand when code runs. In your original code…
string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)
…what you are doing is calling upcase on the string '\1' (which has no effect) and then calling the gsub! method, passing in a regex and a string as parameters.
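The order of evaluation can be checked directly. The sketch below is only an illustration; the method name and the sample string "aBC dEF " are assumptions, and it uses the non-mutating gsub so the input is left alone.

```ruby
# '\1' is a backslash followed by the digit 1; neither character has an
# uppercase form, so upcase returns an equal string.
puts '\1'.upcase == '\1'

# gsub therefore receives plain '\1' and replaces each match with only the
# first capture, which discards whatever the second group matched.
def original_attempt(str)
  str.gsub(/([a-z])([A-Z]+ )/, '\1'.upcase)
end

puts original_attempt("aBC dEF ").inspect
```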
Finally, another way to achieve this same goal is with the block form like so:
# Take your pick of which you prefer:
string.gsub!(/([a-z])([A-Z]+ )/){ $1.upcase << $2.downcase }
string.gsub!(/([a-z])([A-Z]+ )/){ [$1.upcase,$2.downcase].join }
string.gsub!(/([a-z])([A-Z]+ )/){ "#{$1.upcase}#{$2.downcase}" }
In the block form of gsub the captured patterns are set to the global variables $1, $2, etc. and you can use those to construct the replacement string.
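Here is the block form run against a concrete input, as a minimal sketch. The wrapper method fix_case and the sample string are hypothetical; note the pattern requires a space after the uppercase run, so the sample ends with one.

```ruby
# Hypothetical wrapper around the block form (non-mutating gsub).
def fix_case(str)
  # The block runs once per match, after $1 and $2 have been set for that match.
  str.gsub(/([a-z])([A-Z]+ )/) { $1.upcase + $2.downcase }
end

puts fix_case("aBC dEF ").inspect
```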
Ruby: break string into words by capital letters and acronyms
Use
s.split(/(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})/)
See proof.
Explanation
(?<=              look behind to see if there is:
  \p{Ll}          any lowercase letter
)                 end of look-behind
(?=               look ahead to see if there is:
  \p{Lu}          any uppercase letter
)                 end of look-ahead
|                 OR
(?<=              look behind to see if there is:
  \p{Lu}          any uppercase letter
)                 end of look-behind
(?=               look ahead to see if there is:
  \p{Lu}\p{Ll}    any uppercase letter, any lowercase letter
)                 end of look-ahead
Ruby code:
str = 'QuickFoxReadingPDF'
p str.split(/(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})/)
Results: ["Quick", "Fox", "Reading", "PDF"]
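The second alternative matters when an acronym sits in the middle of the string. A short sketch, using a made-up method name and a made-up input:

```ruby
# Hypothetical helper wrapping the split shown above.
def split_camel(str)
  # Split where lowercase meets uppercase, or where an uppercase run
  # meets the first letter of a capitalized word.
  str.split(/(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})/)
end

p split_camel('parseHTMLDocument')
```

Here the break between "HTML" and "Document" comes from the second alternative, since "L" is uppercase and is followed by "Do".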
Regex for uppercase word in ruby
Use the string.scan function instead of string.gsub to grab the particular string you want.
> "sign me up for LUNCH".scan(/\b[A-Z]+\b/)[0]
=> "LUNCH"
\b is called a word boundary; it matches between a word character and a non-word character.
OR
> "sign me up for LUNCH".scan(/(?<!\S)[A-Z]+(?!\S)/)[0]
=> "LUNCH"
(?<!\S) Negative lookbehind which asserts that the match is not preceded by a non-space character.
[A-Z]+ Matches one or more uppercase letters.
(?!\S) Negative lookahead which asserts that the match is not followed by a non-space character.
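The two patterns differ when punctuation touches the word. The sketch below compares them on an invented sentence; the method names are assumptions for illustration.

```ruby
# Uses \b: punctuation such as "-" counts as a boundary.
def upper_words_by_boundary(str)
  str.scan(/\b[A-Z]+\b/)
end

# Uses whitespace lookarounds: the word must be delimited by spaces
# or by the start/end of the string.
def upper_words_by_whitespace(str)
  str.scan(/(?<!\S)[A-Z]+(?!\S)/)
end

sample = "meet for LUNCH-BREAK at NOON"
p upper_words_by_boundary(sample)
p upper_words_by_whitespace(sample)
```

With \b, "LUNCH" and "BREAK" are returned separately; with the lookarounds, the hyphenated token is skipped entirely and only "NOON" is returned.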
Regular Expression for matching words containing both upper and lowercase letters
Since you're using Ruby, an answer could benefit from lookaheads to assert letters shouldn't be all uppercase or lowercase:
\b(?![a-z]+\b|[A-Z]+\b)[a-zA-Z]+
Live demo
Breakdown:
\b Match a word boundary
(?! Start of negative lookahead
[a-z]+\b Match a lowercased word
| Or
[A-Z]+\b Match an uppercased word
) End of lookahead
[a-zA-Z]+ Match letters
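A quick way to try the pattern in Ruby is String#scan. This is a minimal sketch; the method name and the sample sentence are assumptions.

```ruby
# Hypothetical helper: returns words that mix upper- and lowercase letters.
def mixed_case_words(str)
  # The lookahead rejects words that are entirely lowercase or entirely uppercase.
  str.scan(/\b(?![a-z]+\b|[A-Z]+\b)[a-zA-Z]+/)
end

p mixed_case_words("foo BAR Baz qUx")
```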
Regex to validate string having only lower case, first char must be a letter
Why don't you just stick to your requirements?
- first char must be a lowercase letter: [a-z]
- remaining characters must match: [a-z0-9_.]
-> your regex: /^[a-z][a-z0-9_.]*$/
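If this pattern is used from Ruby, one detail is worth checking: in Ruby, ^ and $ match at the start and end of any line, not only of the whole string, so multi-line input can pass validation. The sketch below uses \A and \z instead; the method name and the sample inputs are assumptions for illustration.

```ruby
# Hypothetical validator: anchors to the whole string with \A and \z.
def valid_name?(str)
  str.match?(/\A[a-z][a-z0-9_.]*\z/)
end

puts valid_name?("user_1.name")   # starts with a letter, allowed characters only
puts valid_name?("1user")         # starts with a digit
puts valid_name?("ok\nNOT OK")    # second line contains disallowed characters

# The line-anchored version accepts the multi-line input because "ok" is a valid line.
puts "ok\nNOT OK".match?(/^[a-z][a-z0-9_.]*$/)
```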
change case with regex
Match the dash followed by a single character, and use a replacer function that returns that character converted with toUpperCase:
const dashToCamel = str => str.replace(/-(\w)/g, (_, g1) => g1.toUpperCase());
console.log(dashToCamel("font-size-18")); // logs "fontSize18"
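The snippet above is JavaScript. A rough Ruby counterpart, sketched here with a hypothetical method name, uses the block form of gsub in place of the replacer function:

```ruby
# Hypothetical Ruby counterpart of dashToCamel.
def dash_to_camel(str)
  # Drop each dash and upcase the word character that follows it.
  str.gsub(/-(\w)/) { $1.upcase }
end

puts dash_to_camel("font-size-18")
```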
Select a string in regex with ruby
You can try an alternative approach: matching everything you want to keep then joining the result.
You can use this regex to match everything you want to keep:
[A-Z\d+| ^]|<?=>
As you can see, this just uses | and [] to list the strings that you want to keep: uppercase letters, digits, +, |, space, ^, => and <=>.
Example:
"aA azee + B => C=".scan(/[A-Z\d+| ^]|<?=>/).join()
Output:
"A  + B => C"
Note that there are 2 consecutive spaces between "A" and "+". If you don't want that you can call String#squeeze.
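Putting the scan, join, and squeeze steps together might look like the following sketch. The method name is an assumption, and squeeze is given " " so that only runs of spaces are collapsed.

```ruby
# Hypothetical helper combining the steps described above.
def keep_symbols(str)
  kept = str.scan(/[A-Z\d+| ^]|<?=>/).join
  # Collapse runs of spaces left behind by the removed characters.
  kept.squeeze(" ")
end

p keep_symbols("aA azee + B => C=")
```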
Replace with uppercase characters with gsub
"som".gsub(/[aeiou]/, &:upcase)
# => "sOm"
or
"som".tr("aeiou", "AEIOU")
# => "sOm"
How can I use regex in Ruby to split a string into an array of the words it contains?
You may use a matching approach to extract chunks of 2 or more uppercase letters, or a letter followed by 0 or more lowercase letters:
s.scan(/\p{Lu}{2,}|\p{L}\p{Ll}*/).map(&:downcase)
See the Ruby demo and the Rubular demo.
The regex matches:
\p{Lu}{2,} - 2 or more uppercase letters
| - or
\p{L} - any letter
\p{Ll}* - 0 or more lowercase letters.
With map(&:downcase), the items you get with .scan() are turned to lower case.
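A short sketch of this approach on concrete input follows. The method name and the sample strings are assumptions; the original question's input is not shown here.

```ruby
# Hypothetical helper wrapping the scan/map shown above.
def words_from(str)
  str.scan(/\p{Lu}{2,}|\p{L}\p{Ll}*/).map(&:downcase)
end

p words_from("getUserID")

# Limitation: an uppercase run directly followed by a capitalized word is
# matched greedily, so the next word's first letter joins the acronym.
p words_from("parseHTMLDocument")
```

For input where acronyms are followed by capitalized words, a split with lookarounds may be a better fit than this scan.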