How to Change Case of Letters in String Using Regex in Ruby

How to change case of letters in string using RegEx in Ruby

@sawa has the simple answer, and you've edited your question with another mechanism. However, to answer two of your questions:

Is there a way to do this within the regex though?

No, Ruby's regex does not support a case-changing feature as some other regex flavors do. You can "prove" this to yourself by reviewing the official Ruby regex docs for 1.9 and 2.0 and searching for the word "case":

https://github.com/ruby/ruby/blob/ruby_1_9_3/doc/re.rdoc
https://github.com/ruby/ruby/blob/ruby_2_0_0/doc/re.rdoc

I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work?

Yes, \1 is a backreference. Inside a search pattern, \1 matches the same text that the first capture group matched. For example, the regular expression /f(.)\1/ will find the letter f, followed by any character, followed by that same character (e.g. "foo" or "f!!").
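
A quick way to check the in-pattern backreference, as a minimal sketch. The helper name doubled_after_f? and the third sample word are made up for illustration:

```ruby
# Hypothetical helper: true when the string contains "f" followed by
# the same character twice (\1 must repeat whatever group 1 captured).
def doubled_after_f?(str)
  !(str =~ /f(.)\1/).nil?
end

puts doubled_after_f?("foo")  # true:  "o" is captured, then repeated
puts doubled_after_f?("f!!")  # true:  "!" is captured, then repeated
puts doubled_after_f?("fob")  # false: "o" is captured, but "b" follows
```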

In your case, inside a replacement string passed to a method like String#gsub, the backreference refers to the text captured by the corresponding group. From the docs:

"If replacement is a String it will be substituted for the matched text. It may contain back-references to the pattern’s capture groups of the form \d, where d is a group number, or \k<n>, where n is a group name. If it is a double-quoted string, both back-references must be preceded by an additional backslash."

In practice, this means:

"hello world".gsub( /([aeiou])/, '_\1_' )  #=> "h_e_ll_o_ w_o_rld"
"hello world".gsub( /([aeiou])/, "_\1_" )  #=> "h_\u0001_ll_\u0001_ w_\u0001_rld"
"hello world".gsub( /([aeiou])/, "_\\1_" ) #=> "h_e_ll_o_ w_o_rld"

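The quoted docs also mention the named form \k<n>. A small sketch of that variant, assuming a group name (vowel) and a helper name (underscore_vowels) that are not in the original question:

```ruby
# Hypothetical helper wrapping the named-group form of the same replacement.
def underscore_vowels(str)
  # Single quotes keep \k<vowel> intact for gsub to interpret.
  str.gsub(/(?<vowel>[aeiou])/, '_\k<vowel>_')
end

p underscore_vowels("hello world")  #=> "h_e_ll_o_ w_o_rld"
```
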
Now, you have to understand when code runs. In your original code…

string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)

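…what you are doing is calling upcase on the string '\1' (which has no effect, since neither character has an uppercase form) and only then calling the gsub! method, passing in a regex and the unchanged string '\1' as parameters. The upcase call runs before any match exists, so it never sees the captured text.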
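
This order of evaluation can be checked directly. The sketch below uses a made-up input string ("aBC def") with the regex from the question:

```ruby
# The argument expression is evaluated before gsub ever runs.
replacement = '\1'.upcase
p replacement == '\1'  #=> true (backslash and "1" have no uppercase form)

# Hypothetical sample input: the whole match is replaced by group 1 only,
# still in its original case, and group 2 is dropped.
p "aBC def".gsub(/([a-z])([A-Z]+ )/, replacement)  #=> "adef"
```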

Finally, another way to achieve this same goal is with the block form like so:

# Take your pick of which you prefer:
string.gsub!(/([a-z])([A-Z]+ )/){ $1.upcase << $2.downcase }
string.gsub!(/([a-z])([A-Z]+ )/){ [$1.upcase,$2.downcase].join }
string.gsub!(/([a-z])([A-Z]+ )/){ "#{$1.upcase}#{$2.downcase}" }

In the block form of gsub, the captures are available through the special variables $1, $2, etc., and you can use those to construct the replacement string.
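
To see the block form on concrete input, here is a minimal sketch. The helper name fix_case and the sample strings are assumptions for illustration. It uses the non-mutating gsub because gsub! returns nil when nothing was substituted:

```ruby
# Hypothetical helper; the name and sample inputs are illustrative only.
def fix_case(str)
  # The block runs once per match, after $1 and $2 have been set.
  str.gsub(/([a-z])([A-Z]+ )/) { $1.upcase + $2.downcase }
end

p fix_case("aBC def")   #=> "Abc def"
p fix_case("no match")  #=> "no match"
```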

Ruby: break string into words by capital letters and acronyms

Use

s.split(/(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})/)


Explanation

--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    \p{Ll}                 any lowercase letter
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \p{Lu}                 any uppercase letter
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    \p{Lu}                 any uppercase letter
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \p{Lu}\p{Ll}           any uppercase letter, any lowercase letter
--------------------------------------------------------------------------------
  )                        end of look-ahead

Ruby code:

str = 'QuickFoxReadingPDF'
p str.split(/(?<=\p{Ll})(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})/)

Results: ["Quick", "Fox", "Reading", "PDF"]

Regex for uppercase word in ruby

Use String#scan instead of String#gsub to grab the particular string you want.

> "sign me up for LUNCH".scan(/\b[A-Z]+\b/)[0]
=> "LUNCH"

\b is a word boundary: it matches between a word character and a non-word character.

> "sign me up for LUNCH".scan(/(?<!\S)[A-Z]+(?!\S)/)[0]
=> "LUNCH"

(?<!\S) Negative lookbehind asserting that the match is not preceded by a non-space character.
[A-Z]+ Matches one or more uppercase letters.
(?!\S) Negative lookahead asserting that the match is not followed by a non-space character.
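
The two patterns differ once punctuation is involved. A short sketch, with hypothetical helper names and a made-up sample string:

```ruby
# Hypothetical helpers contrasting the two patterns.
def caps_by_word_boundary(text)
  text.scan(/\b[A-Z]+\b/)
end

def caps_by_whitespace(text)
  text.scan(/(?<!\S)[A-Z]+(?!\S)/)
end

sample = "RSVP-NOW for LUNCH"    # made-up input
p caps_by_word_boundary(sample)  #=> ["RSVP", "NOW", "LUNCH"]
p caps_by_whitespace(sample)     #=> ["LUNCH"]
```

The hyphen counts as a non-word character, so \b accepts "RSVP" and "NOW", while the whitespace-based lookarounds reject both.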

Regular Expression for matching words containing both upper and lowercase letters

Since you're using Ruby, an answer could benefit from lookaheads to assert letters shouldn't be all uppercase or lowercase:

\b(?![a-z]+\b|[A-Z]+\b)[a-zA-Z]+


Breakdown:

\b Match a word boundary
(?! Start of negative lookahead
- [a-z]+\b Match a lowercased word
- | Or
- [A-Z]+\b Match an uppercased word
) End of lookahead
[a-zA-Z]+ Match letters
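
Applied in Ruby, the pattern might be used as follows. This is a sketch; the constant name MIXED_CASE_WORD and the sample sentence are assumptions:

```ruby
# Hypothetical constant holding the pattern from above.
MIXED_CASE_WORD = /\b(?![a-z]+\b|[A-Z]+\b)[a-zA-Z]+/

# "foo" is all lowercase and "BAR" all uppercase, so both are skipped.
p "foo BAR Baz qUx".scan(MIXED_CASE_WORD)  #=> ["Baz", "qUx"]
```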

Regex to validate string having only lower case, first char must be a letter

Why don't you just stick to your requirements?

first char must be a lowercase letter: [a-z]
remaining characters must match: [a-z0-9_.]

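-> your regex: /^[a-z][a-z0-9_.]*$/ (in Ruby, ^ and $ match at line boundaries, so use \A and \z to validate the whole string: /\A[a-z][a-z0-9_.]*\z/)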
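
A sketch of this validation in Ruby, assuming a hypothetical helper name valid_name? and made-up inputs. It anchors with \A and \z because Ruby's ^ and $ match at every line boundary, not just at the ends of the string:

```ruby
# Hypothetical helper; \A and \z anchor to the whole string.
def valid_name?(str)
  !(str =~ /\A[a-z][a-z0-9_.]*\z/).nil?
end

p valid_name?("user_1.name")  #=> true
p valid_name?("1user")        #=> false (starts with a digit)
p valid_name?("User")         #=> false (uppercase letter)
p valid_name?("ok\nNOT OK!")  #=> false (the second line is checked too)

# With ^ and $ the same multi-line input matches, because "ok" alone is a valid line:
p("ok\nNOT OK!" =~ /^[a-z][a-z0-9_.]*$/)  #=> 0
```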

change case with regex

Match the dash followed by a single character, and use a replacer function that returns that character toUpperCase:

const dashToCamel = str => str.replace(/-(\w)/g, (_, g1) => g1.toUpperCase());
console.log(dashToCamel("font-size-18"));
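
The snippet above is JavaScript. A rough Ruby counterpart, using the block form of gsub, could look like this. The method name dash_to_camel is an assumption, not part of the original answer:

```ruby
# Hypothetical Ruby counterpart of dashToCamel.
def dash_to_camel(str)
  # The block receives each "-x" match; $1 is the character after the dash.
  str.gsub(/-(\w)/) { $1.upcase }
end

puts dash_to_camel("font-size-18")  # prints fontSize18
```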

Select a string in regex with ruby

You can try an alternative approach: matching everything you want to keep then joining the result.

You can use this regex to match everything you want to keep:

[A-Z\d+| ^]|<?=>

As you can see, this just uses | and [] to list the strings that you want to keep: uppercase letters, digits, +, |, space, ^, => and <=>.

Example:

"aA azee + B => C=".scan(/[A-Z\d+| ^]|<?=>/).join()

Output:

"A  + B => C"

Note that there are 2 consecutive spaces between "A" and "+". If you don't want that you can call String#squeeze.
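
A brief sketch of the squeeze step, reusing the example input from above. The variable name kept is arbitrary:

```ruby
kept = "aA azee + B => C=".scan(/[A-Z\d+| ^]|<?=>/).join
p kept               #=> "A  + B => C"

# squeeze(" ") collapses each run of spaces into a single space.
p kept.squeeze(" ")  #=> "A + B => C"
```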

Replace with uppercase characters with gsub

"som".gsub(/[aeiou]/, &:upcase)
# => "sOm"

"som".tr("aeiou", "AEIOU")
# => "sOm"

How can I use regex in Ruby to split a string into an array of the words it contains?

You may use a matching approach to extract chunks of 2 or more uppercase letters, or a single letter followed by 0 or more lowercase letters:

s.scan(/\p{Lu}{2,}|\p{L}\p{Ll}*/).map(&:downcase)


The regex matches:

\p{Lu}{2,} - 2 or more uppercase letters
| - or
\p{L} - any letter
\p{Ll}* - 0 or more lowercase letters.

With map(&:downcase), the items you get with .scan() are turned to lower case.
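
A sketch of the approach on sample input. The method name split_words and the inputs are assumptions, since the original question's string is not shown here:

```ruby
# Hypothetical wrapper around the scan-based approach.
def split_words(s)
  s.scan(/\p{Lu}{2,}|\p{L}\p{Ll}*/).map(&:downcase)
end

p split_words("QuickFoxReadingPDF")  #=> ["quick", "fox", "reading", "pdf"]
p split_words("hello world FOO")     #=> ["hello", "world", "foo"]

# Caveat: an acronym directly followed by a capitalized word is grouped
# greedily, so the first letter of the next word joins the acronym.
p split_words("PDFReader")           #=> ["pdfr", "eader"]
```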
