How to convert CamelCase to not.camel.case in R
Not clear what the entire set of rules is here but we have assumed that
- we should lower case any upper case character after a lower case one and insert a dot between them and also
- lower case the first character of the string if succeeded by a lower case character.
To do this we can use perl regular expressions with sub
and gsub
:
# test data
camelCase <- c("ThisText", "NextText", "DON'T_CHANGE")
s <- gsub("([a-z])([A-Z])", "\\1.\\L\\2", camelCase, perl = TRUE)
sub("^(.[a-z])", "\\L\\1", s, perl = TRUE) # make 1st char lower case
giving:
[1] "this.text" "next.text" "DON'T_CHANGE"
How to convert not.camel.case to CamelCase in R
Here's one approach but with regex there's probably better ones:
t1 <- c('this.text', 'next.text')
camel <- function(x){ #function for camel case
capit <- function(x) paste0(toupper(substring(x, 1, 1)), substring(x, 2, nchar(x)))
sapply(strsplit(x, "\\."), function(x) paste(capit(x), collapse=""))
}
camel(t1)
This yields:
> camel(t1)
[1] "ThisText" "NextText"
EDIT: As a curiosity I microbenchmarked the 4 answers (TOM=original poster, TR=myself, JMS=jmsigner & SB=sebastion; commented on jmsigner's post) and found the non regex answers to be faster. I would have assumed them slower.
expr min lq median uq max
1 JMS() 183.801 188.000 197.796 201.762 349.409
2 SB() 93.767 97.965 101.697 104.963 147.881
3 TOM() 75.107 82.105 85.370 89.102 1539.917
4 TR() 70.442 76.507 79.772 83.037 139.484
Splitting CamelCase in R
string.to.split = "thisIsSomeCamelCase"
gsub("([A-Z])", " \\1", string.to.split)
# [1] "this Is Some Camel Case"
strsplit(gsub("([A-Z])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this" "Is" "Some" "Camel" "Case"
Looking at Ramnath's and mine I can say that my initial impression that this was an underspecified question has been supported.
And give Tommy and Ramanth upvotes for pointing out [:upper:]
strsplit(gsub("([[:upper:]])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this" "Is" "Some" "Camel" "Case"
Camel Case format conversion using regular expressions in R
We can use sub
. We match one or more punctuation characters ([[:punct:]]+
) followed by a single character which is captured as a group ((.)
). In the replacement, the backreference for the capture group (\\1
) is changed to upper case (\\U
).
sub("[[:punct:]]+(.)", "\\U\\1", str1, perl = TRUE)
#[1] "DrDre" "CaptainSpock" "spiderMan"
For the second case, we use regex lookarounds i.e. match a letter ((?<=[A-Za-z])
) followed by a capital letter and replace with _
.
gsub("(?<=[A-Za-z])(?=[A-Z])", "_", str2, perl = TRUE)
#[1] "End_Of_File" "Camel_Case" "A_B_C"
data
str1 <- c("Dr_dre", "Captain.Spock", "spider-man")
str2 <- c("EndOfFile", "CamelCase", "ABC")
Elegant R function: mixed case separated by periods to underscore separated lower case and/or camel case
Try this. These at least work on the examples given:
toUnderscore <- function(x) {
x2 <- gsub("([A-Za-z])([A-Z])([a-z])", "\\1_\\2\\3", x)
x3 <- gsub(".", "_", x2, fixed = TRUE)
x4 <- gsub("([a-z])([A-Z])", "\\1_\\2", x3)
x5 <- tolower(x4)
x5
}
underscore2camel <- function(x) {
gsub("_(.)", "\\U\\1", x, perl = TRUE)
}
#######################################################
# test
#######################################################
u <- toUnderscore(as.Given)
u
## [1] "icu_days" "sex_code" "max_of_mld" "age_group"
underscore2camel(u)
## [1] "icuDays" "sexCode" "maxOfMld" "ageGroup"
Convert camelCaseText to Title Case Text
const text = 'helloThereMister';
const result = text.replace(/([A-Z])/g, " $1");
const finalResult = result.charAt(0).toUpperCase() + result.slice(1);
console.log(finalResult);
How to use `strsplit` before every capital letter of a camel case?
It seems that by adding (?!^)
you can obtained the desired result.
strsplit('AaaBbbCcc', "(?!^)(?=[A-Z])", perl=TRUE)
For the camel case we may do
strsplit('AaaABbbBCcc', '(?!^)(?=\\p{Lu}\\p{Ll})', perl=TRUE)[[1]]
strsplit('AaaABbbBCcc', '(?!^)(?=[A-Z][a-z])', perl=TRUE)[[1]] ## or
# [1] "AaaA" "BbbB" "Ccc"
Elegant Python function to convert CamelCase to snake_case?
Camel case to snake case
import re
name = 'CamelCaseName'
name = re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()
print(name) # camel_case_name
If you do this many times and the above is slow, compile the regex beforehand:
pattern = re.compile(r'(?<!^)(?=[A-Z])')
name = pattern.sub('_', name).lower()
To handle more advanced cases specially (this is not reversible anymore):
def camel_to_snake(name):
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower()
print(camel_to_snake('camel2_camel2_case')) # camel2_camel2_case
print(camel_to_snake('getHTTPResponseCode')) # get_http_response_code
print(camel_to_snake('HTTPResponseCodeXYZ')) # http_response_code_xyz
To add also cases with two underscores or more:
def to_snake_case(name):
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
name = re.sub('__([A-Z])', r'_\1', name)
name = re.sub('([a-z0-9])([A-Z])', r'\1_\2', name)
return name.lower()
Snake case to pascal case
name = 'snake_case_name'
name = ''.join(word.title() for word in name.split('_'))
print(name) # SnakeCaseName
Related Topics
Replace Every Single Character at the Start of String That Matches a Regex Pattern
Error in Bind_Rows_(X, .Id):Column Can't Be Converted from Factor to Numeric
How to Set Factor Levels to the Order They Appear in a Data Frame
Adding Multiple Shadows/Rectangles to Ggplot2 Graph
Applying Over a Vector of Functions
Remove Some of the Axis Labels in Ggplot Faceted Plots
How to Represent Polynomials with Numeric Vectors in R
How to Print the Name of Current Row When Using Apply in R
Is There a Fast Parser for Date
How to Install 2 Different R Versions on Debian
R Cmd Check Latex Error: Fatal PDFlatex - Gui Framework Cannot Be Initialized
Determining Minimum Values in a Vector in R
Flatten Nested Lists in a List
Ess to Call Different Installations of R