How to drop columns with column names that contain a specific string?
Using dplyr:
library(dplyr)
df %>%
select(-contains(c("epm", "enn", "jkk")))
#> name agelk
#> 1 Jon 23
#> 2 Bill 41
#> 3 Maria 32
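The df used above is not shown in the answer. As a minimal sketch, assuming a hypothetical data frame (only name and agelk come from the output above; the column names schoepm, ennage and jkkl are invented so that each contains one of the three substrings), the same selection can be reproduced in base R with grepl():

```r
# Hypothetical data: only `name` and `agelk` appear in the output above;
# the remaining column names are assumptions made for illustration.
df <- data.frame(
  name    = c("Jon", "Bill", "Maria"),
  agelk   = c(23, 41, 32),
  schoepm = c(1, 2, 3),
  ennage  = c(4, 5, 6),
  jkkl    = c(7, 8, 9)
)

# Combine the substrings into one alternation pattern: "epm|enn|jkk"
drop_pattern <- paste(c("epm", "enn", "jkk"), collapse = "|")

# Keep the columns whose names do NOT match the pattern;
# drop = FALSE keeps the result a data frame
kept <- df[, !grepl(drop_pattern, names(df)), drop = FALSE]
names(kept)
```

One difference to keep in mind: contains() ignores case by default, while grepl() is case-sensitive unless ignore.case = TRUE is passed.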
Drop data frame columns by name
There's also the subset command, useful if you know which columns you want:
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)
df <- subset(df, select = c(a, c))
UPDATED after a comment by @hadley: to drop columns a and c, you could do:
df <- subset(df, select = -c(a, c))
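subset() is intended mainly for interactive use. When the names to drop are stored in a character vector, a sketch like the following may be more convenient; it uses only base R, and drop_cols is a hypothetical variable name introduced for illustration:

```r
df <- data.frame(a = 1:10, b = 2:11, c = 3:12)

# Hypothetical character vector of column names to remove
drop_cols <- c("a", "c")

# Keep every column whose name is not in drop_cols;
# drop = FALSE keeps the result a data frame even when one column remains
df_kept <- df[, setdiff(names(df), drop_cols), drop = FALSE]
names(df_kept)
```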
Drop columns and order the data by specific column names
Solution with data.table
Since you're using data.table, here you can find a full data.table solution:
library(data.table)
# get files this way: it is preferable not to use setwd()
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
# fastest way to read your csv.
# drop here the column you don't want (I assumed it was ExamTitle)
# add id to reshape later
dt <- rbindlist(lapply(files, fread, drop = "ExamTitle"), idcol = "id")
# reshape with `data.table`
dcast(dt, id ~ ParametersName, value.var = "ParametersValue")
#> id CervicalLordosisDepth_SAG SagittalImbalance_SAG TrunkInclination_VPDM_SAG
#> 1: 1 30 -4 -0,49
#> 2: 2 30 -4 -0,49
#> TrunkLenght_VPDM_SAG
#> 1: 446
#> 2: 446
Solution with tidyverse
You can also use tidyverse. It depends on you and your project.
library(tidyverse)
# read and bind dataframes, add id
map_df(files, read_csv2, .id = "id") %>%
# remove column
select(-ExamTitle) %>%
# reshape
pivot_wider(names_from = ParametersName, values_from = ParametersValue)
#> # A tibble: 2 x 5
#> id TrunkLenght_VPDM_SAG TrunkInclination~ SagittalImbalan~ CervicalLordosi~
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 1 446 -0.49 -4 30
#> 2 2 446 -0.49 -4 30
Solution with Base R
And to conclude, you can also solve your problem with a one-line solution from base R:
unstack(Reduce(rbind, lapply(files, read.csv2)), form = ParametersValue ~ ParametersName)
#> CervicalLordosisDepth_SAG SagittalImbalance_SAG TrunkInclination_VPDM_SAG TrunkLenght_VPDM_SAG
#> 1 30 -4 -0.49 446
#> 2 30 -4 -0.49 446
Reproducible example
Here, I'll leave a simple reproducible example to run my code.
dir <- tempdir()
write("ExamTitle;ParametersName;ParametersValue
Titolo nuovo esame;TrunkLenght_VPDM_SAG;446
Titolo nuovo esame;TrunkInclination_VPDM_SAG;-0,49
Titolo nuovo esame;SagittalImbalance_SAG;-4
Titolo nuovo esame;CervicalLordosisDepth_SAG;30",
file = file.path(dir, "tmp1.csv"))
write("ExamTitle;ParametersName;ParametersValue
Titolo nuovo esame;TrunkLenght_VPDM_SAG;446
Titolo nuovo esame;TrunkInclination_VPDM_SAG;-0,49
Titolo nuovo esame;SagittalImbalance_SAG;-4
Titolo nuovo esame;CervicalLordosisDepth_SAG;30",
file = file.path(dir, "tmp2.csv"))
R delete columns from data frame matching regex pattern
We can use grep for regex matching of a pattern in the column names. Here, the pattern checks for the letter 'd' at the start (^) of the string, followed by one or more digits (\\d+) until the end ($) of the string. Set invert = TRUE (it is FALSE by default) to get the indices of the names that do not match, and subset the columns with that numeric index:
df[grep("^d\\d+$", names(df), invert = TRUE)]
# b c
#1 3 4
#2 4 5
#3 5 3
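The input df is not shown in the answer. A sketch with a hypothetical data frame makes the behaviour easy to check: columns b and c mirror the output above, while d1, d22 and dx (and their values) are assumptions. The dx column is included to show that a name starting with 'd' but not followed only by digits is kept.

```r
# Hypothetical input; only `b` and `c` are taken from the output above
df <- data.frame(
  d1  = 1:3,
  d22 = 4:6,
  b   = c(3, 4, 5),
  c   = c(4, 5, 3),
  dx  = 7:9
)

# invert = TRUE returns the positions of the names that do NOT match
idx <- grep("^d\\d+$", names(df), invert = TRUE)

# Single-bracket indexing with a numeric vector selects columns
result <- df[idx]
names(result)
```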
R: Dropping columns with names containing a substring anywhere except the start using regular expressions in dplyr
You need one or more of any character (.) at the beginning, so you could write ^.{1,} (equivalent to ^.+).
df %>% dplyr::select(-matches("^.{1,}foo1"))
# bar foo1
# 1 -1.077056 -0.5649875
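To see which names the pattern catches without loading dplyr, a base R sketch can be used. The column names here are hypothetical: only bar and foo1 come from the output above, and the other two are invented to contain foo1 after at least one other character.

```r
# Hypothetical column names; `bar_foo1` and `x.foo1.y` are assumptions
cols <- c("bar", "foo1", "bar_foo1", "x.foo1.y")

# TRUE where "foo1" is preceded by at least one character
matched <- grepl("^.{1,}foo1", cols)

# Names that would survive the drop
kept_cols <- cols[!matched]
kept_cols
```

The name foo1 itself is kept because nothing precedes the substring, so ^.{1,} has no character to consume.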