Select equivalent rows [A-B & B-A]
Like this:
unique(t(apply(mat, 1, sort)))
Note that the output rows are sorted, so for example an "unmatched" row like c(5, 1) in the original data will appear as c(1, 5) in the output. If instead you want the output rows to be as they are in the input, then you can do:
mat[!duplicated(t(apply(mat, 1, sort))), ]
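To make the difference between the two forms concrete, here is a small sketch. The matrix `mat` below is made-up example data, not taken from the question, and the names `key`, `sorted_unique` and `first_seen` are only illustrative.

```r
# Hypothetical example matrix; each row is an unordered pair.
mat <- rbind(c(1, 2), c(2, 1), c(5, 1), c(3, 4), c(4, 3))

# Sort within each row so that A-B and B-A produce the same key.
key <- t(apply(mat, 1, sort))

# Form 1: unique sorted pairs (the original order inside a pair is lost).
sorted_unique <- unique(key)

# Form 2: first occurrence of each pair, kept exactly as in the input.
# drop = FALSE keeps the result a matrix even if only one row survives.
first_seen <- mat[!duplicated(key), , drop = FALSE]
```

Both results have three rows here; the only difference is that the pair entered as c(5, 1) is stored as c(1, 5) in `sorted_unique` and stays c(5, 1) in `first_seen`.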
SQL - How to select A&B results excluding B&A redundant results from same table
First, @TriV's answer is correct, as I read the question. I have no idea why it is downvoted or deleted.
Second, if you want to remove the rows and you know that every pair appears in both orders, then you can do:
delete from t
where function1 > function2;
That is not satisfying, though, because you want the fastest method. Deletion can be expensive, so it might be faster to copy the rows you want to keep into a new table, truncate the original, and re-insert them:
select *
into temp_t
from t
where function1 < function2;
truncate table t;
insert into t
select *
from temp_t;
If not every pair has a reversed duplicate, then you can apply the same idea with something like:
select *
into temp_t
from t
where function1 < function2
union all
select *
from t t
where function1 > function2 and
not exists (select 1 from t t2 where t2.function1 = t.function2 and t2.function2 = t.function1);
The latter expression is probably the fastest way to get the unique set, assuming you have an index on t(function1, function2).
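For readers coming from the R answers on this page, the keep-rule of that last query can be sketched in R to check it on a small case. This is only an illustration of the logic, not a replacement for the SQL: the data frame `t_df` and its values are made up, and the "|" separator assumes the values never contain that character.

```r
# Hypothetical stand-in for table t; column names follow the SQL above.
t_df <- data.frame(function1 = c("a", "b", "c", "e"),
                   function2 = c("b", "a", "d", "c"),
                   stringsAsFactors = FALSE)

# One key per row as stored, and one with the two columns swapped.
as_stored <- paste(t_df$function1, t_df$function2, sep = "|")
swapped   <- paste(t_df$function2, t_df$function1, sep = "|")

# Branch 1 of the union: function1 < function2.
# Branch 2: function1 > function2 and no reversed row exists (the NOT EXISTS).
keep <- t_df$function1 < t_df$function2 |
  (t_df$function1 > t_df$function2 & !(as_stored %in% swapped))

result <- t_df[keep, ]
```

With this data the row (b, a) is dropped because (a, b) exists, while (e, c) is kept because there is no (c, e) row.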
Removing mutual reference rows in R dataframe i.e. when (a, b) value in a row exists as (b, a) in another row of the same dataframe
I believe you are looking for something like this.
Sort each row of the data frame (horizontally), then remove the rows whose sorted values duplicate an earlier row.
df <- data.frame("A" = c(1, 10, 1, 1, 2, 2, 14, 4),
"B" = c(10, 1, 11, 12, 13, 14, 2, 15))
sorted <- t(apply(df, 1, function(x) sort(x)))
df[!duplicated(sorted), ]
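If the data frame carries more columns than just the pair, one way to adapt this is to sort only the pair columns. A minimal sketch, assuming a made-up `note` column that is not part of the pair:

```r
# Hypothetical data frame with a third column outside the pair.
df <- data.frame(A = c(1, 10, 1, 2),
                 B = c(10, 1, 11, 14),
                 note = c("first", "reversed", "other", "single"),
                 stringsAsFactors = FALSE)

# Sort only the pair columns; apply() on the full data frame would
# coerce every value to character and sort the note column in as well.
sorted <- t(apply(df[, c("A", "B")], 1, sort))

# Keep the first occurrence of each unordered pair, with all columns intact.
deduped <- df[!duplicated(sorted), ]
```

Here the row marked "reversed" is dropped and the other three rows are returned unchanged.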
Delete duplicated rows with same values but in different column in R
One option would be to use a least/greatest trick, and then remove duplicates:
library(SparkR)
df <- unique(cbind(least(df$A, df$B), greatest(df$A, df$B)))
Here is a base R version of the above:
df <- unique(cbind(ifelse(df$A < df$B, df$A, df$B),
ifelse(df$A >= df$B, df$A, df$B)))
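As an alternative sketch of the same least/greatest idea, base R's pmin() and pmax() compute the element-wise minimum and maximum directly. The column names `lo` and `hi` are just illustrative, and the data is the example data frame from the earlier answer.

```r
# Same example values as the data frame in the earlier answer.
df <- data.frame(A = c(1, 10, 1, 1, 2, 2, 14, 4),
                 B = c(10, 1, 11, 12, 13, 14, 2, 15))

# pmin()/pmax() put the smaller value first and the larger second,
# so (1, 10) and (10, 1) become the same row before unique() runs.
pairs <- unique(data.frame(lo = pmin(df$A, df$B),
                           hi = pmax(df$A, df$B)))
```

Building a data.frame rather than using cbind() keeps the result a data frame with named columns instead of an unnamed matrix.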
Unique case of finding duplicate values flexibly across columns in R
tidyverse
df <- data.frame(animal_1 = c("cat", "dog", "mouse", "squirrel"),
predation_type = c("eats", "eats", "eaten by", "eats"),
animal_2 = c("mouse", "squirrel", "cat", "nuts"))
library(tidyverse)
df %>%
rowwise() %>%
mutate(duplicates = str_c(sort(c_across(c(1, 3))), collapse = "")) %>%
group_by(duplicates) %>%
mutate(duplicates = n() > 1) %>%
ungroup()
#> # A tibble: 4 x 4
#>   animal_1 predation_type animal_2 duplicates
#>   <chr>    <chr>          <chr>    <lgl>
#> 1 cat      eats           mouse    TRUE
#> 2 dog      eats           squirrel FALSE
#> 3 mouse    eaten by       cat      TRUE
#> 4 squirrel eats           nuts     FALSE
Created on 2022-01-17 by the reprex package (v2.0.1)
removing duplicates
library(tidyverse)
df %>%
filter(!duplicated(map2(animal_1, animal_2, ~str_c(sort((c(.x, .y))), collapse = ""))))
#>   animal_1 predation_type animal_2
#> 1      cat           eats    mouse
#> 2      dog           eats squirrel
#> 3 squirrel           eats     nuts
Created on 2022-01-17 by the reprex package (v2.0.1)
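One caveat about the keys built with collapse = "" above: gluing the sorted names together without a separator can, in principle, make two different pairs look identical. The sketch below uses base R only, with made-up names chosen to trigger the collision; `key_for` is a hypothetical helper, and "|" assumes that character never occurs in the names.

```r
# Hypothetical names chosen so that plain concatenation collides.
x <- c("ab", "a")
y <- c("c",  "bc")

# Build an order-independent key for one pair, joined with `sep`.
key_for <- function(a, b, sep) paste(sort(c(a, b)), collapse = sep)

no_sep   <- unname(mapply(key_for, x, y, MoreArgs = list(sep = "")))
with_sep <- unname(mapply(key_for, x, y, MoreArgs = list(sep = "|")))

# Without a separator both pairs collapse to the same key;
# with a separator the two keys stay distinct.
collides_without_sep <- any(duplicated(no_sep))
collides_with_sep    <- any(duplicated(with_sep))
```

For the animal data shown above this makes no difference, but a separator is a cheap safeguard when the values are arbitrary strings.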