How to Drop Columns by Name Pattern in R

How to drop columns with column names that contain specific string?

Using dplyr:

library(dplyr)

df %>%
  select(-contains(c("epm", "enn", "jkk")))
#>    name agelk
#> 1   Jon    23
#> 2  Bill    41
#> 3 Maria    32
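
For readers without dplyr, the same selection can be sketched in base R. The data frame below is hypothetical: only the name and agelk columns appear in the output above, and the other three column names are invented so that each contains one of the substrings.

```r
# Hypothetical data: only `name` and `agelk` come from the output above;
# the remaining column names are invented for illustration.
df <- data.frame(
  name   = c("Jon", "Bill", "Maria"),
  agelk  = c(23, 41, 32),
  agepm  = c(1, 2, 3),    # contains "epm"
  agenn  = c(4, 5, 6),    # contains "enn"
  agejkk = c(7, 8, 9)     # contains "jkk"
)

# TRUE for every column whose name contains any of the substrings
to_drop <- grepl("epm|enn|jkk", names(df))

# drop = FALSE keeps a data frame even if only one column remains
result <- df[, !to_drop, drop = FALSE]
result
```

One difference to keep in mind: contains() ignores case by default, whereas grepl() is case-sensitive unless ignore.case = TRUE is passed.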

Drop data frame columns by name

There's also the subset command, useful if you know which columns you want:

df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))

UPDATED after a comment by @hadley: to drop columns a and c instead, you could do:

df <- subset(df, select = -c(a, c))
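
subset() takes unquoted column names, which is awkward when the names are stored in a character vector. A minimal sketch of that case, using the same example data frame; the variable name drop_cols is hypothetical:

```r
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)

# Names to remove, held as strings (e.g. built programmatically)
drop_cols <- c("a", "c")

# Keep every column whose name is not in drop_cols
df <- df[, !(names(df) %in% drop_cols), drop = FALSE]
names(df)
```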

Drop columns and order the data by specific column names

Solution with data.table

Since you're using data.table, here is a full data.table solution:

library(data.table)

# get files this way: it is preferable not to use setwd()
# escape the dot so that it matches a literal "." and not any character
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# fastest way to read your csv.
# drop here the column you don't want (I assumed it was ExamTitle)
# add id to reshape later
dt <- rbindlist(lapply(files, fread, drop = "ExamTitle"), idcol = "id")

# reshape with `data.table`
dcast(dt, id ~ ParametersName, value.var = "ParametersValue")
#>    id CervicalLordosisDepth_SAG SagittalImbalance_SAG TrunkInclination_VPDM_SAG
#> 1:  1                        30                    -4                     -0,49
#> 2:  2                        30                    -4                     -0,49
#>    TrunkLenght_VPDM_SAG
#> 1:                  446
#> 2:                  446

Solution with `tidyverse`

The same steps also work with the tidyverse. Which one to use depends on your preferences and your project.

library(tidyverse)

# read and bind dataframes, add id
map_df(files, read_csv2, .id = "id") %>%

  # remove column
  select(-ExamTitle) %>%

  # reshape
  pivot_wider(names_from = ParametersName, values_from = ParametersValue)
#> # A tibble: 2 x 5
#>   id    TrunkLenght_VPDM_SAG TrunkInclination~ SagittalImbalan~ CervicalLordosi~
#>   <chr>                <dbl>             <dbl>            <dbl>            <dbl>
#> 1 1                      446             -0.49               -4               30
#> 2 2                      446             -0.49               -4               30

Solution with Base R

Finally, base R can solve the problem in one line:

unstack(Reduce(rbind, lapply(files, read.csv2)), form = ParametersValue ~ ParametersName)
#>   CervicalLordosisDepth_SAG SagittalImbalance_SAG TrunkInclination_VPDM_SAG TrunkLenght_VPDM_SAG
#> 1                        30                    -4                     -0.49                  446
#> 2                        30                    -4                     -0.49                  446
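
The one-liner above does not keep track of which file each row came from. A step-by-step base R sketch that keeps an id per file and drops ExamTitle by name might look like this. It writes its own two small files so that it runs on its own; the names csv, tables, long and wide are hypothetical.

```r
# Self-contained setup: two small files in the layout used by this answer
dir <- tempfile("exams")
dir.create(dir)
csv <- "ExamTitle;ParametersName;ParametersValue
Titolo nuovo esame;TrunkLenght_VPDM_SAG;446
Titolo nuovo esame;TrunkInclination_VPDM_SAG;-0,49"
writeLines(csv, file.path(dir, "tmp1.csv"))
writeLines(csv, file.path(dir, "tmp2.csv"))

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file (read.csv2 expects ';' separators and ',' decimals)
# and tag its rows with the file's position in `files`
tables <- lapply(seq_along(files), function(i) {
  data.frame(id = i, read.csv2(files[i]))
})
long <- do.call(rbind, tables)

# Drop the unwanted column by name
long <- long[, names(long) != "ExamTitle", drop = FALSE]

# One row per id, one column per parameter
wide <- reshape(long, idvar = "id", timevar = "ParametersName",
                direction = "wide")

# reshape() prefixes the new columns with "ParametersValue."; strip it
names(wide) <- sub("^ParametersValue\\.", "", names(wide))
wide
```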

Reproducible example

Here, I'll leave a simple reproducible example to run my code.

dir <- tempdir()
write("ExamTitle;ParametersName;ParametersValue
Titolo nuovo esame;TrunkLenght_VPDM_SAG;446
Titolo nuovo esame;TrunkInclination_VPDM_SAG;-0,49
Titolo nuovo esame;SagittalImbalance_SAG;-4
Titolo nuovo esame;CervicalLordosisDepth_SAG;30",
          file = file.path(dir, "tmp1.csv"))
write("ExamTitle;ParametersName;ParametersValue
Titolo nuovo esame;TrunkLenght_VPDM_SAG;446
Titolo nuovo esame;TrunkInclination_VPDM_SAG;-0,49
Titolo nuovo esame;SagittalImbalance_SAG;-4
Titolo nuovo esame;CervicalLordosisDepth_SAG;30",
          file = file.path(dir, "tmp2.csv"))

R delete columns from data frame matching regex pattern

We can use grep to match a regular expression against the column names. Here, the pattern checks for the letter 'd' at the start (^) of the string, followed by one or more digits (\\d+) up to the end ($) of the string. Setting invert = TRUE (the default is FALSE) returns the indices of the names that do not match, and we subset the columns with that numeric index:

df[grep("^d\\d+$", names(df), invert = TRUE)]
#  b c
#1 3 4
#2 4 5
#3 5 3
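
A self-contained version of the call above. The data frame is an assumption reconstructed from the printed output (columns b and c), with two invented columns, d1 and d2, that match the pattern:

```r
# Hypothetical input: b and c follow the output above, d1 and d2 are invented
df <- data.frame(d1 = 1:3, d2 = 4:6, b = c(3, 4, 5), c = c(4, 5, 3))

# invert = TRUE returns the indices of the names that do NOT match
keep <- grep("^d\\d+$", names(df), invert = TRUE)
kept <- df[keep]

# Equivalent logical form
kept2 <- df[!grepl("^d\\d+$", names(df))]
```

Both forms stay safe when nothing matches. A negative index such as df[-grep(pattern, names(df))] does not: with no match, grep() returns integer(0), and indexing by -integer(0) selects zero columns.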

R: Dropping columns with names containing a substring anywhere except the start using regular expressions in dplyr

You need at least one character before the substring, so the pattern can start with ^.{1,} (equivalent to ^.+):

df %>% dplyr::select(-matches("^.{1,}foo1"))
#         bar       foo1
# 1 -1.077056 -0.5649875
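
To see which names the pattern catches, it can be checked against a plain character vector with grepl(). The names below are hypothetical, apart from bar and foo1, which appear in the output above:

```r
col_names <- c("bar", "foo1", "bar_foo1", "x.foo1.y")

# "^.{1,}foo1" requires at least one character before "foo1",
# so a name that starts with "foo1" is not matched
dropped <- grepl("^.{1,}foo1", col_names)

# Names that survive the drop
col_names[!dropped]
```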
