Select Groups Which Have At Least One of a Certain Value

How to check if at least one of a group of rows has a specific value

Add one more condition to the HAVING clause that checks whether the group has at least one row with eligible = 'True':

SELECT *
FROM ExampleTable
WHERE Group IN
    (SELECT Group
     FROM ExampleTable
     GROUP BY Group
     HAVING (COUNT(DISTINCT LastName) > 1 OR COUNT(DISTINCT FirstName) > 1)
        AND COUNT(CASE WHEN eligible = 'True' THEN 1 END) >= 1
    )
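To try the query above on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The rows are hypothetical, and the column is renamed to grp as an assumption, because GROUP is a reserved word in most SQL dialects and would otherwise need quoting.

```python
import sqlite3

# Hypothetical toy data; "grp" stands in for the Group column of the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExampleTable (grp INTEGER, FirstName TEXT, LastName TEXT, eligible TEXT)"
)
conn.executemany(
    "INSERT INTO ExampleTable VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "True"),   # group 1: two first names, one eligible row
        (1, "Bob", "Lee", "False"),
        (2, "Cat", "Ray", "False"),  # group 2: two first names, nobody eligible
        (2, "Dan", "Ray", "False"),
        (3, "Eve", "Fox", "True"),   # group 3: eligible, but only one distinct name
        (3, "Eve", "Fox", "True"),
    ],
)

query = """
SELECT *
FROM ExampleTable
WHERE grp IN (
    SELECT grp
    FROM ExampleTable
    GROUP BY grp
    HAVING (COUNT(DISTINCT LastName) > 1 OR COUNT(DISTINCT FirstName) > 1)
       AND COUNT(CASE WHEN eligible = 'True' THEN 1 END) >= 1
)
ORDER BY grp, FirstName
"""
kept_rows = conn.execute(query).fetchall()
kept_groups = sorted({row[0] for row in kept_rows})
print(kept_groups)  # [1]
```

Only group 1 passes both conditions: group 2 fails the eligibility check and group 3 fails the distinct-name check.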

group_by filter to groups that share at least one matching value

We can look at the length of age for each group and compare it to the length of unique(age). If length(age) > length(unique(age)) at least two of the observations share an age.

library(dplyr)

df %>%
  group_by(household) %>%
  filter(length(age) > length(unique(age)))

# id household age
# <dbl> <dbl> <dbl>
# 1 1 1 19
# 2 2 1 19
# 3 3 1 45

@Dave2e pointed out another more dplyr-y way to do this:

df %>%
  group_by(household) %>%
  filter(n() > n_distinct(age))
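For comparison with the SQL answers on this page, the same "more rows than distinct values" test can be sketched as a HAVING clause. This is not part of the original answer: it is a rough equivalent run through Python's sqlite3 module, household 2 is a hypothetical addition to the data, and missing ages would be handled differently (SQL aggregate counts skip NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (id INTEGER, household INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?, ?)",
    [
        (1, 1, 19),
        (2, 1, 19),  # shares an age with id 1
        (3, 1, 45),
        (4, 2, 30),  # hypothetical household with no shared age
        (5, 2, 31),
    ],
)

# Keep every row of a household that has more ages than distinct ages.
shared_age_rows = conn.execute(
    """
    SELECT id, household, age
    FROM df
    WHERE household IN (
        SELECT household
        FROM df
        GROUP BY household
        HAVING COUNT(age) > COUNT(DISTINCT age)
    )
    ORDER BY id
    """
).fetchall()
print(shared_age_rows)  # [(1, 1, 19), (2, 1, 19), (3, 1, 45)]
```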

MySQL query to select groups containing at least a certain number of elements

You need GROUP BY and a HAVING clause:

select G_id 
from yourtable
where E_id in (1,3)
group by G_id
having count(distinct E_id) = 2

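Update: to require that a group contains both E_id 1 and E_id 3 and has exactly three distinct E_id values in total: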

select G_id 
from yourtable
group by G_id
having count(case when E_id = 1 then 1 end) > 0
and count(case when E_id = 3 then 1 end) > 0
and count(distinct E_id) = 3
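The difference between the two queries is easier to see on data. The following is a sketch only: it runs both statements through Python's sqlite3 module rather than MySQL, and the rows in yourtable are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (G_id INTEGER, E_id INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [
        (1, 1), (1, 3),          # group 1: exactly E_id 1 and 3
        (2, 1), (2, 2),          # group 2: has 1 but not 3
        (3, 1), (3, 3), (3, 4),  # group 3: 1, 3 and one more element
    ],
)

# First query: groups that contain both E_id 1 and E_id 3.
has_both = [
    row[0]
    for row in conn.execute(
        """
        SELECT G_id
        FROM yourtable
        WHERE E_id IN (1, 3)
        GROUP BY G_id
        HAVING COUNT(DISTINCT E_id) = 2
        ORDER BY G_id
        """
    )
]

# Updated query: both values present and three distinct elements overall.
has_both_and_three = [
    row[0]
    for row in conn.execute(
        """
        SELECT G_id
        FROM yourtable
        GROUP BY G_id
        HAVING COUNT(CASE WHEN E_id = 1 THEN 1 END) > 0
           AND COUNT(CASE WHEN E_id = 3 THEN 1 END) > 0
           AND COUNT(DISTINCT E_id) = 3
        ORDER BY G_id
        """
    )
]

print(has_both)            # [1, 3]
print(has_both_and_three)  # [3]
```

The first query filters rows before grouping, so it cannot see how many other elements a group has; the updated query drops the WHERE clause so that COUNT(DISTINCT E_id) covers the whole group.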

GROUP BY Create group if at least one value in group meets condition

With the OP's modified requirement:

-- count() skips NULLs, so the CASE expression counts only rows with color = 'G'
select type
from test
group by type
having count(*) > 1
   and count(case when color = 'G' then 0 end) > 0
;
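A point that is easy to miss in this query is that THEN 0 still counts: COUNT tallies non-NULL values, and 0 is not NULL. A small sketch with hypothetical rows, run through Python's sqlite3 module as an assumption about the environment:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (type TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("a", "G"), ("a", "R"),  # two rows, one of them 'G'  -> kept
        ("b", "G"),              # has 'G' but only one row   -> dropped
        ("c", "R"), ("c", "B"),  # two rows, none of them 'G' -> dropped
    ],
)

types_with_g = [
    row[0]
    for row in conn.execute(
        """
        SELECT type
        FROM test
        GROUP BY type
        HAVING COUNT(*) > 1
           AND COUNT(CASE WHEN color = 'G' THEN 0 END) > 0
        ORDER BY type
        """
    )
]
print(types_with_g)  # ['a']
```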

Select groups with row containing specific value (with dplyr and pipes)

After grouping by 'id', take the second element of 'string' and test whether "is" is %in% it. Because "is" is on the left-hand side of %in%, the test returns a single TRUE or FALSE per group, so filter() keeps or drops each group as a whole.

library(dplyr)
df %>%
  group_by(id) %>%
  filter('is' %in% string[2]) %>%
  ungroup()

Output:

# A tibble: 8 x 2
# id string
# <chr> <chr>
#1 id_1 here
#2 id_1 is
#3 id_1 some
#4 id_1 text
#5 id_2 here
#6 id_2 is
#7 id_2 other
#8 id_2 text

Group by having at least one of each item

One way of doing this is to count how many different pets each person has and to compare it (i.e. join it) with the total number of different pets:

SELECT person_id
FROM (SELECT person_id, COUNT(DISTINCT pet) AS dp
      FROM pets
      GROUP BY person_id) a
JOIN (SELECT COUNT(DISTINCT pet) AS dp FROM pets) b ON a.dp = b.dp

EDIT:

If just some pets are considered "ideal", and this list is known upfront, the query can be greatly simplified by introducing this information in a where clause:

SELECT person_id
FROM pets
WHERE pet IN ('dog', 'cat', 'tiger')
GROUP BY person_id
HAVING COUNT(DISTINCT pet) = 3
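The two queries answer slightly different questions, which a small data set makes visible. This is a sketch with hypothetical rows, executed through Python's sqlite3 module: the first query compares against every pet that appears in the table, the second only against the fixed list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (person_id INTEGER, pet TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?)",
    [
        (1, "dog"), (1, "cat"), (1, "tiger"),
        (2, "dog"), (2, "cat"),
        (3, "dog"), (3, "cat"), (3, "tiger"), (3, "fish"),
    ],
)

# Query 1: people who own every distinct pet found in the table (four kinds here).
owns_every_pet = [
    row[0]
    for row in conn.execute(
        """
        SELECT person_id
        FROM (SELECT person_id, COUNT(DISTINCT pet) AS dp
              FROM pets
              GROUP BY person_id) a
        JOIN (SELECT COUNT(DISTINCT pet) AS dp FROM pets) b ON a.dp = b.dp
        ORDER BY person_id
        """
    )
]

# Query 2: people who own each pet on a list that is known up front.
owns_listed_pets = [
    row[0]
    for row in conn.execute(
        """
        SELECT person_id
        FROM pets
        WHERE pet IN ('dog', 'cat', 'tiger')
        GROUP BY person_id
        HAVING COUNT(DISTINCT pet) = 3
        ORDER BY person_id
        """
    )
]

print(owns_every_pet)    # [3]
print(owns_listed_pets)  # [1, 3]
```

In the second query the number in the HAVING clause has to match the length of the IN list, so the two need to be changed together.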

Select grouped rows with at least one matching criterion

I would do it like this:

library(dplyr)

Data_File %>%
  group_by(Group_ID) %>%
  filter(any(Product_Name %in% "ABCD"))
# Source: local data frame [7 x 3]
# Groups: Group_ID [3]
#
# Group_ID Product_Name Qty
# <dbl> <chr> <dbl>
# 1 123 ABCD 2
# 2 123 EFGH 3
# 3 123 XYZ1 4
# 4 123 Z123 5
# 5 234 ABCD 6
# 6 444 ABCD 8
# 7 444 ABCD 9

Explanation: any() returns TRUE if any row within the group matches the condition. The length-1 result is then recycled to the full length of the group, so the entire group is kept. You could also use sum(Product_Name %in% "ABCD") > 0 as the condition, but any() reads very nicely. Use sum() instead if you want a more complicated condition, such as three or more matching product names.

I prefer %in% to == for things like this because it behaves better with NA (NA %in% "ABCD" is FALSE, whereas NA == "ABCD" is NA) and it is easy to extend if you want to check for any of several products per group.
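The any() versus sum() distinction carries over to the SQL answers on this page as a threshold on a conditional count. The sketch below is not from the original answer: it uses Python's sqlite3 module, a hypothetical helper named groups_with_at_least, and one hypothetical extra group (999) that contains no "ABCD" row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data_File (Group_ID INTEGER, Product_Name TEXT, Qty INTEGER)")
conn.executemany(
    "INSERT INTO Data_File VALUES (?, ?, ?)",
    [
        (123, "ABCD", 2), (123, "EFGH", 3), (123, "XYZ1", 4), (123, "Z123", 5),
        (234, "ABCD", 6),
        (444, "ABCD", 8), (444, "ABCD", 9),
        (999, "EFGH", 1),  # hypothetical group without a match
    ],
)


def groups_with_at_least(n):
    """Return all rows of groups that hold at least n rows with product 'ABCD'."""
    return conn.execute(
        """
        SELECT Group_ID, Product_Name, Qty
        FROM Data_File
        WHERE Group_ID IN (
            SELECT Group_ID
            FROM Data_File
            GROUP BY Group_ID
            HAVING SUM(CASE WHEN Product_Name = 'ABCD' THEN 1 ELSE 0 END) >= ?
        )
        ORDER BY Group_ID, Qty
        """,
        (n,),
    ).fetchall()


any_match = groups_with_at_least(1)  # same idea as any(...)
two_plus = groups_with_at_least(2)   # same idea as sum(...) >= 2

print(sorted({row[0] for row in any_match}))  # [123, 234, 444]
print(two_plus)  # [(444, 'ABCD', 8), (444, 'ABCD', 9)]
```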


If speed and efficiency are an issue, data.table will be faster. I would do it like this, which relies on a keyed join for the filtering and uses no non-data.table operations, so it should be very fast:

library(data.table)
df = as.data.table(df)
setkey(df)
groups = unique(subset(df, Product_Name %in% "ABCD", Group_ID))
df[groups, nomatch = 0]
# Group_ID Product_Name Qty
# 1: 123 ABCD 2
# 2: 123 EFGH 3
# 3: 123 XYZ1 4
# 4: 123 Z123 5
# 5: 234 ABCD 6
# 6: 444 ABCD 8
# 7: 444 ABCD 9

Pick groups that have at least one non-missing value in R

This would be better if we had a reproducible example, but let's create a toy version of your data:

DataX <- data.frame(orgcode = rep(LETTERS[1:5], each = 3),
                    budget = c(NA, 21000, 22000,
                               30000, NA, 40000,
                               NA, NA, NA,
                               12000, 15000, 14000,
                               NA, NA, NA))

DataX
#> orgcode budget
#> 1 A NA
#> 2 A 21000
#> 3 A 22000
#> 4 B 30000
#> 5 B NA
#> 6 B 40000
#> 7 C NA
#> 8 C NA
#> 9 C NA
#> 10 D 12000
#> 11 D 15000
#> 12 D 14000
#> 13 E NA
#> 14 E NA
#> 15 E NA

We can see that organizations with the orgcode C and E have all NA values and should be removed. We can do this by using a dummy variable to find out whether each group is all(is.na(budget)) and filter on that:

library(dplyr)

DataX %>%
  group_by(orgcode) %>%
  # TRUE for every row of a group that has at least one non-missing budget
  mutate(has_budget = !all(is.na(budget))) %>%
  filter(has_budget) %>%
  select(-has_budget)

#> # A tibble: 9 x 2
#> # Groups: orgcode [3]
#> orgcode budget
#> <fct> <dbl>
#> 1 A NA
#> 2 A 21000
#> 3 A 22000
#> 4 B 30000
#> 5 B NA
#> 6 B 40000
#> 7 D 12000
#> 8 D 15000
#> 9 D 14000

Created on 2020-07-29 by the reprex package (v0.3.0)
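The same "at least one non-missing value" filter can be sketched in SQL, where COUNT(budget) counts only non-NULL values. This is an illustration rather than part of the answer above: it loads the toy data into SQLite through Python's sqlite3 module, with None standing in for NA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DataX (orgcode TEXT, budget INTEGER)")
conn.executemany(
    "INSERT INTO DataX VALUES (?, ?)",
    [
        ("A", None), ("A", 21000), ("A", 22000),
        ("B", 30000), ("B", None), ("B", 40000),
        ("C", None), ("C", None), ("C", None),
        ("D", 12000), ("D", 15000), ("D", 14000),
        ("E", None), ("E", None), ("E", None),
    ],
)

# COUNT(budget) is 0 for groups whose budgets are all NULL, so they are dropped.
non_missing_rows = conn.execute(
    """
    SELECT orgcode, budget
    FROM DataX
    WHERE orgcode IN (
        SELECT orgcode
        FROM DataX
        GROUP BY orgcode
        HAVING COUNT(budget) > 0
    )
    """
).fetchall()
kept_orgcodes = sorted({row[0] for row in non_missing_rows})
print(kept_orgcodes)  # ['A', 'B', 'D']
```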


