Select Equivalent Rows [A-B & B-A]

Like this:

unique(t(apply(mat, 1, sort)))

Note that output rows are sorted, so for example an "unmatched" row like c(5, 1) in the original data will appear as c(1, 5) in the output. If instead you want the output rows to be as they are in the input, then you can do:

mat[!duplicated(t(apply(mat, 1, sort))), ]
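
For example, with a small two-column matrix (the values below are made up purely for illustration), both forms behave as described:

# Hypothetical matrix of unordered pairs
mat <- rbind(c(1, 5),
             c(5, 1),
             c(2, 3),
             c(5, 1),
             c(4, 7))

# Canonical form: one row per pair, with each row sorted
unique(t(apply(mat, 1, sort)))

# One representative per pair, keeping the rows exactly as they appear in mat
mat[!duplicated(t(apply(mat, 1, sort))), ]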

SQL - How to select A&B results excluding B&A redundant results from same table

First, @TriV's answer is correct, as I read the question. I have no idea why it is downvoted or deleted.

Second, if you want to remove the rows and you know all are duplicated, then you can do:

delete from t
where function1 > function2;

That may not be satisfying, though, because you want the fastest method. Deleting a large number of rows can be expensive, so it might be faster to:

select *
into temp_t
from t
where function1 < function2;

truncate table t;

insert into t
select *
from temp_t;

If you don't have full duplicates, then you can apply the same idea with something like:

select *
into temp_t
from t
where function1 < function2
union all
select *
from t t
where function1 > function2 and
not exists (select 1 from t t2 where t2.function1 = t.function2 and t2.function2 = t.function1);

The latter query is probably the fastest way to get the unique set, assuming you have an index on t(function1, function2). Note that both branches exclude rows where function1 = function2; if such rows can occur and should be kept, change the first condition to function1 <= function2.

Removing mutual reference rows in an R data frame, i.e. when an (a, b) value in one row exists as (b, a) in another row of the same data frame

I believe you are looking for something like this.

Sort the values within each row of the data frame, then remove the rows whose sorted values are duplicated:

df <- data.frame("A" = c(1, 10, 1,  1,  2,  2, 14, 4),
"B" = c(10, 1, 11, 12, 13, 14, 2, 15))
sorted <- t(apply(df, 1, function(x) sort(x)))
df[!duplicated(sorted), ]
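
For reference, with the sample data above the result should look like this (rows 2 and 7, the reversed pairs (10, 1) and (14, 2), are dropped):

  A  B
1 1 10
3 1 11
4 1 12
5 2 13
6 2 14
8 4 15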

Delete duplicated rows with the same values but in different columns in R

One option would be to use a least/greatest trick, and then remove duplicates:

library(SparkR)

# Assumes df is a SparkDataFrame: put the smaller value of each pair first and the
# larger second, then keep only the distinct pairs
df <- distinct(select(df, alias(least(df$A, df$B), "A"),
                          alias(greatest(df$A, df$B), "B")))

Here is a base R version of the above:

# Put the smaller value in the first column and the larger in the second,
# then keep only the unique rows
df <- unique(cbind(ifelse(df$A < df$B, df$A, df$B),
                   ifelse(df$A >= df$B, df$A, df$B)))
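
Equivalently, pmin()/pmax() express the same trick more compactly (a sketch, assuming A and B are plain numeric columns of a regular data frame):

# pmin()/pmax() return the element-wise minimum/maximum of the two columns
df <- unique(cbind(pmin(df$A, df$B), pmax(df$A, df$B)))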

Unique case of finding duplicate values flexibly across columns in R

tidyverse

df <- data.frame(animal_1 = c("cat", "dog", "mouse", "squirrel"),
                 predation_type = c("eats", "eats", "eaten by", "eats"),
                 animal_2 = c("mouse", "squirrel", "cat", "nuts"))
library(tidyverse)

df %>%
  rowwise() %>%
  # Build a key from the sorted animal pair so "cat/mouse" and "mouse/cat" match
  mutate(duplicates = str_c(sort(c_across(c(1, 3))), collapse = "")) %>%
  group_by(duplicates) %>%
  # Flag every row whose pair occurs more than once
  mutate(duplicates = n() > 1) %>%
  ungroup()
#> # A tibble: 4 x 4
#>   animal_1 predation_type animal_2 duplicates
#>   <chr>    <chr>          <chr>    <lgl>
#> 1 cat      eats           mouse    TRUE
#> 2 dog      eats           squirrel FALSE
#> 3 mouse    eaten by       cat      TRUE
#> 4 squirrel eats           nuts     FALSE

Created on 2022-01-17 by the reprex package (v2.0.1)

Removing duplicates


library(tidyverse)

df %>%
  # Keep only the first occurrence of each unordered animal pair
  filter(!duplicated(map2(animal_1, animal_2,
                          ~ str_c(sort(c(.x, .y)), collapse = ""))))
#>   animal_1 predation_type animal_2
#> 1      cat           eats    mouse
#> 2      dog           eats squirrel
#> 3 squirrel           eats     nuts

Created on 2022-01-17 by the reprex package (v2.0.1)
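
If the rowwise/map2 step is slow on larger data, a vectorized sketch of the same idea builds the key with pmin()/pmax() on the two character columns (pair_key is just a temporary helper name, and the "|" separator is an arbitrary choice to keep the keys unambiguous):

library(tidyverse)

df %>%
  mutate(pair_key = str_c(pmin(animal_1, animal_2),
                          pmax(animal_1, animal_2), sep = "|")) %>%
  filter(!duplicated(pair_key)) %>%
  select(-pair_key)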


