How to filter for unique combination of columns from an R dataframe
The following should do it:
unique(df[,c('session','first','last')])
where df is your data frame.
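As a minimal sketch of that call: the data below is made up, and the extra column `score` is a hypothetical addition used only to show that columns outside the selection are not returned.

```r
# Hypothetical example data; only the column names session/first/last
# come from the answer above.
df <- data.frame(
  session = c(1, 1, 2, 2),
  first   = c("Ann", "Ann", "Bob", "Bob"),
  last    = c("Lee", "Lee", "Ray", "Kim"),
  score   = c(10, 20, 30, 40)
)

# One row per distinct (session, first, last) combination.
combos <- unique(df[, c("session", "first", "last")])
print(combos)
```

Rows 1 and 2 share the same combination, so only the first of them is kept.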
How can I find the unique combinations based on two columns?
For the given data set, it is enough to check the column "Genus" for values appearing twice and then to remove the corresponding rows from the dataframe.
library(dplyr)
countGenus <- df %>% count(Genus)  # number of rows per Genus
filter(df, Genus %in% filter(countGenus, n == 1)$Genus)  # keep Genus values that occur once
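The same idea can be sketched in base R with table, which may help if dplyr is not available. The data frame and its Species column here are hypothetical; only the column name Genus comes from the answer.

```r
# Hypothetical example data.
df <- data.frame(
  Genus   = c("Rosa", "Rosa", "Acer", "Quercus"),
  Species = c("canina", "rugosa", "rubrum", "robur")
)

# Count how often each Genus occurs, then keep the ones seen exactly once.
genus_counts <- table(df$Genus)
singletons <- names(genus_counts)[genus_counts == 1]

# Keep only rows whose Genus is not repeated anywhere in the data.
result <- df[df$Genus %in% singletons, ]
print(result)
```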
Select unique combinations of some columns in R, and a random value for another column
I figured out a fast and simple solution.
First, randomly permute the rows:
myD <- myD[sample(1:dim(myD)[1],replace=FALSE),]
Next, keep only the first row for each unique combination of x and y:
myD <- myD[!duplicated(myD[,c("x","y")]),]
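A small self-contained sketch of the two steps, assuming a made-up data frame with a third column z whose value should be picked at random per (x, y) pair. The call to set.seed is only there to make the sketch reproducible and is not part of the method.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility of this sketch

# Hypothetical example data.
myD <- data.frame(
  x = c(1, 1, 1, 2, 2),
  y = c("a", "a", "b", "b", "b"),
  z = c(10, 20, 30, 40, 50)
)

# Step 1: shuffle the rows.
shuffled <- myD[sample(nrow(myD)), ]

# Step 2: keep the first row of each (x, y) combination in the shuffled order,
# which is a random row of that combination in the original order.
picked <- shuffled[!duplicated(shuffled[, c("x", "y")]), ]
print(picked)
```

Each run without the fixed seed can return a different z for combinations that occur more than once.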
Select rows from dataframe with unique combination of values from multiple columns
Have you tried the distinct function from dplyr? For your case, it can be something like
library(dplyr)
# add .keep_all = TRUE to also keep the other columns of the first matching row
df %>% distinct(team, opponent_team, date)
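Another alternative is to use the duplicated function from base R inside dplyr's filter function, like below. Note that duplicated takes a single object, so the columns are combined into a data frame first.
df %>% filter(!duplicated(data.frame(team, opponent_team, date)))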
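For comparison, here is a sketch of the same selection in base R only, which keeps every column of the first row found for each combination. The data and the goals column are hypothetical.

```r
# Hypothetical example data; team, opponent_team and date are the
# column names used in the answer above.
df <- data.frame(
  team          = c("A", "A", "B"),
  opponent_team = c("B", "B", "A"),
  date          = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01")),
  goals         = c(1, 3, 2)
)

key_cols <- c("team", "opponent_team", "date")

# duplicated() on a data frame flags rows that repeat an earlier row,
# so negating it keeps the first occurrence of each combination.
first_rows <- df[!duplicated(df[, key_cols]), ]
print(first_rows)
```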
Creating a df of unique combinations of columns in R where order doesn't matter
A base R method is to create all combinations of political_spectrum_values taking 3 at a time using expand.grid, sort them by row and select the unique rows.
df <- expand.grid(first_person = political_spectrum_values,
second_person = political_spectrum_values,
third_person = political_spectrum_values)
df[] <- t(apply(df, 1, sort))
unique(df)
If needed as a single string
unique(apply(df, 1, function(x) paste0(sort(x), collapse = "_")))
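To see the effect on a small input, the sketch below assumes three made-up values for political_spectrum_values. With 3 values and 3 positions there are 27 ordered rows, which collapse to 10 order-insensitive combinations.

```r
# Assumption: example values, not taken from the original question.
political_spectrum_values <- c("left", "centre", "right")

# All ordered triples (27 rows).
df <- expand.grid(first_person  = political_spectrum_values,
                  second_person = political_spectrum_values,
                  third_person  = political_spectrum_values,
                  stringsAsFactors = FALSE)

# Sort within each row so that triples differing only in order become equal.
sorted_rows <- t(apply(df, 1, sort))

# unique() on a matrix removes duplicated rows.
unique_sets <- unique(sorted_rows)
print(nrow(df))
print(nrow(unique_sets))
```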
Numbering rows based on unique combinations of multiple columns in R
We can use rowid from data.table
library(data.table)
df1$Id <- with(df1, rowid(Treatments, Replicates))
Or using data.table syntax
setDT(df1)[, Id := rowid(Treatments, Replicates)]
If we need the group id, use .GRP
setDT(df1)[, Id := .GRP, .(Treatments, Replicates)]
Or using dplyr
df1 %>%
group_by(Treatments, Replicates) %>%
mutate(Id = row_number())
To get the group indices, use cur_group_id(), available in dplyr 1.0.0 and later (the devel version when this answer was written)
df1 %>%
group_by(Treatments, Replicates) %>%
mutate(Id = cur_group_id())
Or, in dplyr versions before 1.0.0
df1 %>%
mutate(Id = group_indices(., Treatments, Replicates))
In base R, this can be done using ave
df1$Id <- with(df1, ave(seq_along(Treatments), Treatments,
Replicates, FUN = seq_along))
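A short sketch of the ave approach on a smaller, made-up data frame (named toy here to keep it apart from df1). The GroupId column is an addition that is not part of the answer: it is one possible base R counterpart to the group id, built with paste and match.

```r
# Hypothetical example data.
toy <- data.frame(
  Treatments = c(1L, 1L, 1L, 2L, 2L),
  Replicates = c(1L, 1L, 2L, 1L, 1L)
)

# Running counter within each (Treatments, Replicates) group.
toy$Id <- with(toy, ave(seq_along(Treatments), Treatments, Replicates,
                        FUN = seq_along))

# Assumption: one way to number the groups themselves, in order of first
# appearance, by matching a combined key against its unique values.
key <- paste(toy$Treatments, toy$Replicates, sep = "_")
toy$GroupId <- match(key, unique(key))

print(toy)
```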
data
df1 <- structure(list(Treatments = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L), Replicates = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
1L, 2L, 2L, 2L), Value = c(4L, 5L, 7L, 9L, 25L, 39L, 43L, 24L,
12L, 9L, 4L, 2L), Id = c(NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_)), row.names = c(NA,
-12L), class = "data.frame")