How to Convert Camelcase to Not.Camel.Case in R

How to convert CamelCase to not.camel.case in R

Not clear what the entire set of rules is here but we have assumed that

  • we should lower case any upper case character after a lower case one and insert a dot between them and also
  • lower case the first character of the string if succeeded by a lower case character.

To do this we can use perl regular expressions with sub and gsub:

# test data
camelCase <- c("ThisText", "NextText", "DON'T_CHANGE")

s <- gsub("([a-z])([A-Z])", "\\1.\\L\\2", camelCase, perl = TRUE)
sub("^(.[a-z])", "\\L\\1", s, perl = TRUE) # make 1st char lower case

giving:

[1] "this.text"    "next.text"    "DON'T_CHANGE"

How to convert not.camel.case to CamelCase in R

Here's one approach but with regex there's probably better ones:

t1 <- c('this.text', 'next.text')

camel <- function(x){ #function for camel case
capit <- function(x) paste0(toupper(substring(x, 1, 1)), substring(x, 2, nchar(x)))
sapply(strsplit(x, "\\."), function(x) paste(capit(x), collapse=""))
}

camel(t1)

This yields:

> camel(t1)
[1] "ThisText" "NextText"

EDIT: As a curiosity I microbenchmarked the 4 answers (TOM=original poster, TR=myself, JMS=jmsigner & SB=sebastion; commented on jmsigner's post) and found the non regex answers to be faster. I would have assumed them slower.

   expr     min      lq  median      uq      max
1 JMS() 183.801 188.000 197.796 201.762 349.409
2 SB() 93.767 97.965 101.697 104.963 147.881
3 TOM() 75.107 82.105 85.370 89.102 1539.917
4 TR() 70.442 76.507 79.772 83.037 139.484

Sample Image

Splitting CamelCase in R

string.to.split = "thisIsSomeCamelCase"
gsub("([A-Z])", " \\1", string.to.split)
# [1] "this Is Some Camel Case"

strsplit(gsub("([A-Z])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this" "Is" "Some" "Camel" "Case"

Looking at Ramnath's and mine I can say that my initial impression that this was an underspecified question has been supported.

And give Tommy and Ramanth upvotes for pointing out [:upper:]

strsplit(gsub("([[:upper:]])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this" "Is" "Some" "Camel" "Case"

Camel Case format conversion using regular expressions in R

We can use sub. We match one or more punctuation characters ([[:punct:]]+) followed by a single character which is captured as a group ((.)). In the replacement, the backreference for the capture group (\\1) is changed to upper case (\\U).

sub("[[:punct:]]+(.)", "\\U\\1", str1, perl = TRUE)
#[1] "DrDre" "CaptainSpock" "spiderMan"

For the second case, we use regex lookarounds i.e. match a letter ((?<=[A-Za-z])) followed by a capital letter and replace with _.

gsub("(?<=[A-Za-z])(?=[A-Z])", "_", str2, perl = TRUE)
#[1] "End_Of_File" "Camel_Case" "A_B_C"

data

str1 <- c("Dr_dre", "Captain.Spock", "spider-man")
str2 <- c("EndOfFile", "CamelCase", "ABC")

Elegant R function: mixed case separated by periods to underscore separated lower case and/or camel case

Try this. These at least work on the examples given:

toUnderscore <- function(x) {
x2 <- gsub("([A-Za-z])([A-Z])([a-z])", "\\1_\\2\\3", x)
x3 <- gsub(".", "_", x2, fixed = TRUE)
x4 <- gsub("([a-z])([A-Z])", "\\1_\\2", x3)
x5 <- tolower(x4)
x5
}

underscore2camel <- function(x) {
gsub("_(.)", "\\U\\1", x, perl = TRUE)
}

#######################################################
# test
#######################################################

u <- toUnderscore(as.Given)
u
## [1] "icu_days" "sex_code" "max_of_mld" "age_group"

underscore2camel(u)
## [1] "icuDays" "sexCode" "maxOfMld" "ageGroup"

Convert camelCaseText to Title Case Text

const text = 'helloThereMister';
const result = text.replace(/([A-Z])/g, " $1");
const finalResult = result.charAt(0).toUpperCase() + result.slice(1);
console.log(finalResult);

How to use `strsplit` before every capital letter of a camel case?

It seems that by adding (?!^) you can obtained the desired result.

strsplit('AaaBbbCcc', "(?!^)(?=[A-Z])", perl=TRUE)

For the camel case we may do

strsplit('AaaABbbBCcc', '(?!^)(?=\\p{Lu}\\p{Ll})', perl=TRUE)[[1]]
strsplit('AaaABbbBCcc', '(?!^)(?=[A-Z][a-z])', perl=TRUE)[[1]] ## or
# [1] "AaaA" "BbbB" "Ccc"

Elegant Python function to convert CamelCase to snake_case?

Camel case to snake case

import re

name = 'CamelCaseName'
name = re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()
print(name) # camel_case_name

If you do this many times and the above is slow, compile the regex beforehand:

pattern = re.compile(r'(?<!^)(?=[A-Z])')
name = pattern.sub('_', name).lower()

To handle more advanced cases specially (this is not reversible anymore):

def camel_to_snake(name):
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower()

print(camel_to_snake('camel2_camel2_case')) # camel2_camel2_case
print(camel_to_snake('getHTTPResponseCode')) # get_http_response_code
print(camel_to_snake('HTTPResponseCodeXYZ')) # http_response_code_xyz

To add also cases with two underscores or more:

def to_snake_case(name):
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
name = re.sub('__([A-Z])', r'_\1', name)
name = re.sub('([a-z0-9])([A-Z])', r'\1_\2', name)
return name.lower()

Snake case to pascal case

name = 'snake_case_name'
name = ''.join(word.title() for word in name.split('_'))
print(name) # SnakeCaseName


Related Topics



Leave a reply



Submit