Adding Prefix or Suffix to Most Data.Frame Variable Names in Piped R Workflow

You can pass a function to rename_at, so you can do:

means14 <- dat14 %>%
  group_by(class) %>%
  select(-ID) %>%
  summarise_all(mean) %>%
  rename_at(vars(-class), function(x) paste0(x, "_2014"))
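In recent dplyr versions, rename_at and summarise_all are superseded by rename_with and across. The following is a minimal sketch of the same pipeline in that style; the contents of dat14 are a hypothetical stand-in, since the original data is not shown.

```r
library(dplyr)

# Hypothetical stand-in for dat14 (the real data is not shown in the answer)
dat14 <- data.frame(
  ID    = 1:4,
  class = c("a", "a", "b", "b"),
  score = c(1, 3, 5, 7),
  age   = c(10, 12, 14, 16)
)

means14 <- dat14 %>%
  select(-ID) %>%
  group_by(class) %>%
  summarise(across(everything(), mean)) %>%   # mean of every non-grouping column
  rename_with(~ paste0(.x, "_2014"), -class)  # suffix everything except class

names(means14)
# "class" "score_2014" "age_2014"
```

The grouping column is excluded from across(everything()) automatically, so only the summarised columns need the -class exclusion at the renaming step.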

Add a suffix to a select group of column names in a dataset

Depending on your data, this could work:

 colnames(CTDB)[9:83] <- paste(colnames(CTDB)[9:83], "CHILD", sep = "_")

If you don't want to set the indices manually, you can use "which()" to find them.
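As a rough sketch of the which() approach: the data frame below and the rule "columns starting with q" are made up for illustration, so substitute whatever condition identifies your own columns.

```r
# Hypothetical stand-in for CTDB
CTDB <- data.frame(id = 1:2, q1 = c(3, 4), q2 = c(5, 6), site = c("x", "y"))

# Find the positions of the columns to rename instead of hard-coding 9:83
idx <- which(startsWith(colnames(CTDB), "q"))

colnames(CTDB)[idx] <- paste(colnames(CTDB)[idx], "CHILD", sep = "_")

colnames(CTDB)
# "id" "q1_CHILD" "q2_CHILD" "site"
```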

Adding a suffix to a selection of column names in tidyverse

You can select the columns inside rename_with (note that paste0("a", .x) adds a prefix; use paste0(.x, "a") to add a suffix instead):

library(dplyr)

df %>% rename_with(~paste0("a", .x), c(x, y))

#   ax ay z
# 1  1  3 5
# 2  2  4 6
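Since the question asks for a suffix, here is a small sketch of the suffix variant. The data frame is an assumption reconstructed from the printed output, and the "_a" suffix is arbitrary.

```r
library(dplyr)

# Assumed input, reconstructed from the printed output
df <- data.frame(x = 1:2, y = 3:4, z = 5:6)

# Put .x first in paste0 to append rather than prepend
suffixed <- df %>% rename_with(~ paste0(.x, "_a"), c(x, y))

names(suffixed)
# "x_a" "y_a" "z"
```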

setNames suffix to prefix

A single sub call should be enough:

sub("^(.*)_(.*)$", "\\2_\\1", names(df))
#[1] "loc_home" "loc_work" "x1" "act_walk" "act_bike" "x2" "yest_happy" "yest_sad"

And of course to change the names, assign it back:

names(df) <- sub("^(.*)_(.*)$", "\\2_\\1", names(df))

And in a dplyr-pipe you could use setNames:

df %>% setNames(sub("^(.*)_(.*)$", "\\2_\\1", names(.)))

The pattern "^(.*)_(.*)$" creates two capturing groups, one before the underscore and one after it. In the replacement "\\2_\\1" we tell R to write the second group first, then an underscore, and finally the first group, which turns suffixes into prefixes. If an entry contains no underscore, the pattern does not match and the entry is left unchanged.
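One detail worth checking on your own names: the first (.*) is greedy, so a name with several underscores is split at the last one. A quick sketch with made-up names:

```r
# Made-up names: one simple case, one without an underscore, one with two
nms <- c("home_loc", "x1", "a_b_c")

swapped <- sub("^(.*)_(.*)$", "\\2_\\1", nms)

swapped[1]  # "loc_home": suffix moved to the front
swapped[2]  # "x1": no underscore, so no match and no change
swapped[3]  # "c_a_b": only the part after the last underscore moves
```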

Update after Question update:

For the slightly more complicated case, you can do the following:

1) store all suffixes that need to be changed to prefixes:

suf <- c("act", "loc", "yest")

2) create a regular expression pattern based on the suffixes:

pat <- paste0("^(.*)_(", paste(suf, collapse = "|"), ")$")
pat
#[1] "^(.*)_(act|loc|yest)$"

3) proceed as before:

sub(pat, "\\2_\\1", names(df))
# [1] "loc_home" "loc_work" "x_1" "act_walk" "act_bike" "x_2" "yest_happy" "yest_sad" "free_time"

or

df %>% setNames(sub(pat, "\\2_\\1", names(.)))

Add a prefix and suffix to each row of a dataframe but no suffix to the last row and then collapse all

I think the best approach is to add the prefix and suffix to every element,
then use string extraction to remove the last OR.

library(tidyverse)
library(glue)

x <- data.frame(products = c("foo","bar","foobar"))

x$products <- glue("BRAND_NAME LIKE '% {x$products} %' OR ")

Now collapse to a single string:

glue_collapse(x$products) %>%

and pipe it into the string extraction:

str_extract(., ".+(?= OR $)")

The last statement uses a lookahead for the space-OR-space at the end ($) of the string and matches all characters up to, but not including, it.
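A different route, not the one used in the answer above, is to avoid the trailing OR in the first place by using " OR " as the separator when collapsing. A base R sketch with the same products:

```r
products <- c("foo", "bar", "foobar")

# Build one clause per product, without any trailing OR
clauses <- paste0("BRAND_NAME LIKE '% ", products, " %'")

# The separator only appears between elements, so nothing needs removing
where_clause <- paste(clauses, collapse = " OR ")

where_clause
# "BRAND_NAME LIKE '% foo %' OR BRAND_NAME LIKE '% bar %' OR BRAND_NAME LIKE '% foobar %'"
```

glue_collapse also takes a sep argument, so the same idea should carry over if you prefer to stay with glue.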

Removing a prefix from a subset of column names using the str_remove function

You put the ~ symbol in the wrong place. It should be:

df %>%
  rename_with(.cols = ends_with("_end"),
              .fn = ~ str_remove(string = .x, pattern = "^ATH_"))

        ATH_V1       V2_end     V3_end      ATH_V4     ATH_V5     ATH_V6      ATH_V7
1   1.50743299 -0.445307241  0.8299688  0.17539549 -0.1327284 -0.3396151  0.51307888
2  -1.41938708  0.778638127 -0.2813838 -0.32856970  0.1652872 -0.3049578  0.94609307
3   0.67968358 -1.424279034  0.4743970  0.07742006  0.1302074  0.2824700 -0.62150878
4   1.37265457  0.626442526 -0.9043668 -1.26182381 -2.0965678  1.5024311 -0.13721899
5   1.56945505 -0.808444575 -0.6629072 -1.05412193  2.2763880 -2.0970344 -1.67471537
6  -1.33771537  1.610411569  0.3740234  1.08666291  0.4914622  0.2749874  3.37133643
7  -0.02463483 -0.008389356  0.7068729 -0.03796850  0.3389535  0.9763993 -0.34287204
8   0.31237309  0.011720063  0.1572582 -0.17382867  0.3284980  0.2716920 -0.07771273
9  -1.20628787 -0.654695991 -0.3015155  0.32320577  2.1091207 -0.2484013 -1.46188370
10 -0.56686265 -0.279659749  0.1913190 -1.58601761 -0.3031979 -1.2062704 -0.26730244

A more concise expression is:

df %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))

or even:

df %>%
  rename_with(str_remove, ends_with("_end"), "^ATH_")
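To try this without the original data, here is a minimal sketch on a made-up one-row data frame whose column names mirror the ones in the question:

```r
library(dplyr)
library(stringr)

# Made-up data; only the column names matter here
df <- data.frame(ATH_V1 = 1, ATH_V2_end = 2, ATH_V3_end = 3, ATH_V4 = 4)

# Only columns ending in "_end" are passed to str_remove
renamed <- df %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))

names(renamed)
# "ATH_V1" "V2_end" "V3_end" "ATH_V4"
```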

Trouble passing in variable names for R user defined functions

object$variable does not do a substitution on variable.
Rather it assumes there is something already called variable (not the value of variable, but the actual string "variable") in your object.
However, the following will work:

data <- data.frame(A=1:4, B=c(1,1,1,2))
variable <- "A"
data[[variable]] # Same as data[["A"]] or data$A
# [1] 1 2 3 4
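For contrast, a short sketch of what happens with $ on the same data: it looks for a column literally named "variable", finds none, and returns NULL.

```r
data <- data.frame(A = 1:4, B = c(1, 1, 1, 2))
variable <- "A"

via_dollar  <- data$variable     # looks for a column called "variable"
via_bracket <- data[[variable]]  # uses the value stored in variable, i.e. "A"

is.null(via_dollar)                 # TRUE
identical(via_bracket, data$A)      # TRUE
```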

So, your function should be:

recode_4scale <- function(var, name, skip, df) {
  df[[name]] <- df[[var]]                          # generate new variable
  df[which(df[[skip]] == 2), name] <- 5            # replace with 5 if skip pattern
  df[is.na(df[[var]]), name] <- 6                  # replace with 6 if missing
  df[[name]] <- df[[name]] == 3 | df[[name]] == 4  # code as true if 3 or 4
  df[[name]] <- as.factor(df[[name]])
  return(df)
}
data1 <- recode_4scale("A", "new", "B", data)
data1
#   A B   new
# 1 1 1 FALSE
# 2 2 1 FALSE
# 3 3 1  TRUE
# 4 4 2 FALSE

subset a vector of column names by a particular sample prefix

Try using grepl:

> Names <- colnames(data)
> Names[grepl("^ca", Names)]
[1] "ca01" "ca02" "ca03"
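A self-contained sketch with a made-up data frame, showing two base R equivalents as well: grep with value = TRUE returns the matching names directly, and startsWith avoids regular expressions altogether.

```r
# Made-up data; only the column names matter
data <- data.frame(ca01 = 1, ca02 = 2, ca03 = 3, cb01 = 4, id = 5)
Names <- colnames(data)

by_grepl  <- Names[grepl("^ca", Names)]        # logical index, as above
by_grep   <- grep("^ca", Names, value = TRUE)  # returns the names directly
by_prefix <- Names[startsWith(Names, "ca")]    # fixed prefix, no regex

by_grepl
# "ca01" "ca02" "ca03"
```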

dplyr- renaming sequence of columns with select function

I think you'll have an easier time creating such an expression with the select_ function:

library(dplyr)

test <- data.frame(x=rep(1:3, each=2),
                   group=rep(c("Group 1", "Group 2"), 3),
                   y1=c(22, 8, 11, 4, 7, 5),
                   y2=c(22, 18, 21, 14, 17, 15),
                   y3=c(23, 18, 51, 44, 27, 35),
                   y4=c(21, 28, 311,24, 227, 225))

# build out our select "translation" named vector
DQ <- paste0("y", 1:4)
names(DQ) <- paste0("DQ", seq(0, 3, 1))

# take a look
DQ

##  DQ0  DQ1  DQ2  DQ3
## "y1" "y2" "y3" "y4"

test %>%
  select_("AC"="x", "AR"="group", .dots=DQ)

##   AC      AR DQ0 DQ1 DQ2 DQ3
## 1  1 Group 1  22  22  23  21
## 2  1 Group 2   8  18  18  28
## 3  2 Group 1  11  21  51 311
## 4  2 Group 2   4  14  44  24
## 5  3 Group 1   7  17  27 227
## 6  3 Group 2   5  15  35 225
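select_ is deprecated in current dplyr releases. A possible equivalent, sketched here with the same data, passes the named character vector through all_of(), which selects the columns and renames them in one go:

```r
library(dplyr)

test <- data.frame(x = rep(1:3, each = 2),
                   group = rep(c("Group 1", "Group 2"), 3),
                   y1 = c(22, 8, 11, 4, 7, 5),
                   y2 = c(22, 18, 21, 14, 17, 15),
                   y3 = c(23, 18, 51, 44, 27, 35),
                   y4 = c(21, 28, 311, 24, 227, 225))

# Named vector: names are the new column names, values the old ones
DQ <- setNames(paste0("y", 1:4), paste0("DQ", 0:3))

res <- test %>%
  select(AC = x, AR = group, all_of(DQ))

names(res)
# "AC" "AR" "DQ0" "DQ1" "DQ2" "DQ3"
```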

