How do I remove duplicated columns from a data frame in R?
One option is to drop every column whose contents duplicate an earlier column:
df[!duplicated(as.list(df))]
Or
df[!duplicated(unclass(df))]
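To see what this does, here is a minimal sketch on a made-up data frame (the column names and values are assumptions for illustration only). Note that it compares column contents, not column names:

```r
# Toy data (hypothetical): columns a and c hold identical values
df <- data.frame(a = 1:3, b = c("x", "y", "z"), c = 1:3)

# as.list() turns the data frame into a list of columns;
# duplicated() then flags any column whose contents repeat an earlier one
dup <- duplicated(as.list(df))

# Keep only the columns that were not flagged
deduped <- df[!dup]
names(deduped)
```

Here column c should be dropped because it repeats a, even though the two names differ.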
How to remove duplicate column names in R?
Your real data frame is of class data.table, while your small example is not. You can try:
df[, !duplicated(colnames(df)), with = FALSE]
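A small sketch of the same call on a made-up data.table (the table and its values are assumptions, not the asker's data). data.table() allows repeated column names by default, which makes it easy to reproduce the situation:

```r
library(data.table)

# Hypothetical table with the name "a" used twice
dt <- data.table(a = 1:2, b = 3:4, a = 5:6)

# with = FALSE tells data.table to treat j as a column selector
# rather than as an expression evaluated inside the table
res <- dt[, !duplicated(colnames(dt)), with = FALSE]
names(res)
```

This keeps the first column for each name, so the second "a" (holding 5:6) is the one that goes.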
R remove duplicated columns
df[!duplicated(as.list(df))]
X3 X4 X5 X6 X7
1 step1 step2 step3 step4 step10
remove first occurrence of duplicate column names data.table
Here is a more flexible way:
g <- as.integer(ave(names(dt), names(dt), FUN = length))
# for duplicated column names, keep the 1st occurrence
dt[, g == 1 | (rowid(names(dt)) == 1), with = FALSE]
# keep the 2nd occurrence
dt[, g == 1 | (rowid(names(dt)) == 2), with = FALSE]
# keep the 2nd and 3rd occurrences
dt[, g == 1 | (rowid(names(dt)) %in% c(2, 3)), with = FALSE]
# keep the last occurrence
dt[, g == rowid(names(dt)), with = FALSE]
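To make the logic concrete, here is a sketch on a made-up table (the names and values are assumptions): g holds how many times each column's name occurs, and rowid(names(dt)) numbers the occurrences of each name from left to right.

```r
library(data.table)

# Hypothetical table: "a" appears three times, "b" once
dt <- data.table(a = 1, b = 2, a = 3, a = 4)

# Number of columns sharing each column's name
g <- as.integer(ave(names(dt), names(dt), FUN = length))
# Running occurrence index within each name
occ <- rowid(names(dt))

# A column is the last occurrence of its name when its index equals the count
last <- dt[, g == rowid(names(dt)), with = FALSE]

# Keep the 2nd occurrence of repeated names; unique names pass through
second <- dt[, g == 1 | (rowid(names(dt)) == 2), with = FALSE]
```

In this example g is 3, 1, 3, 3 and occ is 1, 1, 2, 3, so last keeps b and the final a, while second keeps b and the middle a.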
How to delete duplicated columns in a tibble in the tidyverse
Building off the answer provided by Ronak, if you want to do this in dplyr, you can use his solution with select_if.
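library(dplyr)
# data.frame() runs make.names() on these names, so the spaces become dots
# (e.g. "SYC.SJ.Equity...406"); the "...406" suffix is left in place
df <- data.frame("x" = runif(3),
                 "SYC SJ Equity...406" = c("a", "a", "b"),
                 "SYC SJ Equity...407" = c("a", "a", "b"),
                 "y" = runif(3))
# Strip everything from "..." onward, then keep the first column per base name
df %>%
  select_if(!duplicated(sub("\\.\\.\\..*", "", names(.))))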
How to remove duplicated (by name) column in data.tables in R?
How about
dt[, .SD, .SDcols = unique(names(dt))]
This selects the first occurrence of each name (I'm not sure how you want to handle this).
As @DavidArenburg suggests in the comments above, you could use check.names=TRUE in data.table() or fread().
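A short sketch of the check.names route, again on a made-up table. Note that this takes a different approach from the .SDcols answer: rather than dropping the repeated column, it renames it so that every name is unique.

```r
library(data.table)

# Hypothetical table built with a repeated name; check.names = TRUE
# asks data.table() to make the names unique at construction time
dt <- data.table(a = 1:2, a = 3:4, check.names = TRUE)

names(dt)                  # both columns survive, under distinct names
anyDuplicated(names(dt))   # 0 means no repeated names remain
```

Which route fits depends on whether the second column carries data you still need.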