"Adding Missing Grouping Variables" Message in Dplyr in R

Adding missing grouping variables message in dplyr in R

For consistency's sake, grouping variables are always kept once they have been defined, so dplyr adds them back (and prints the message) when select(value) is executed on the grouped data. Calling ungroup() before select() should resolve it:

qu25 <- mydata %>% 
  group_by(month, day, station_number) %>%
  arrange(desc(value)) %>% 
  slice(2) %>% 
  ungroup() %>%
  select(value)

This returns the requested result without the message:

> mydata %>% 
+   group_by(month, day, station_number) %>%
+   arrange(desc(value)) %>% 
+   slice(2) %>% 
+   ungroup() %>%
+   select(value)
# A tibble: 1 x 1
  value
  <dbl>
1   113
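
To see the behaviour in isolation, here is a minimal sketch on made-up data. The tibble `toy` and its values are hypothetical; only the column names follow the question. It compares select() on a grouped tibble with select() after ungroup():

```r
library(dplyr)

# Hypothetical toy data; column names mirror the question, values are invented.
toy <- tibble(
  month = c(1, 1, 1, 1),
  day = c(1, 1, 2, 2),
  station_number = c(10, 10, 10, 10),
  value = c(5, 7, 3, 9)
)

grouped <- toy %>% group_by(month, day, station_number)

# select() on a grouped tibble keeps the grouping columns and prints
# "Adding missing grouping variables: ..."
with_groups <- grouped %>% select(value)
names(with_groups)

# After ungroup(), only the requested column remains and no message appears.
without_groups <- grouped %>% ungroup() %>% select(value)
names(without_groups)
```

If only the values are needed as a plain vector, pull(value) after ungroup() is another option.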

Dplyr keeps automatically adding one of my columns

Try ungroup(). distinct() on a grouped data frame keeps the grouping columns, so drop the grouping first:

df <- data.frame(trial.number=1:2,indexer=3:4)

df %>% distinct(trial.number)
#  trial.number
#1            1
#2            2

df %>% group_by(trial.number,indexer) %>% distinct(trial.number)
## A tibble: 2 x 2
## Groups:   trial.number, indexer [2]
#  trial.number indexer
#         <int>   <int>
#1            1       3
#2            2       4

df %>% group_by(trial.number,indexer) %>% ungroup %>% distinct(trial.number)
## A tibble: 2 x 1
#  trial.number
#         <int>
#1            1
#2            2

How can I keep additional variables after grouping in some other variables in dplyr in R?

Starting from the previous solution, add .keep_all = TRUE in distinct() so the other columns are retained, and then fill the loc column with the previous non-NA value:

library(dplyr)
library(tidyr)
library(lubridate)
data %>%   
   mutate(month = lubridate::month(date)) %>%
   group_by(var, month) %>% 
   mutate(height = sum(height)) %>%
   ungroup %>% 
   complete(var, month, fill = list(height = 0)) %>% 
   mutate(Quarter = quarter, Condition = !is.na(date)) %>% 
   distinct(var, month, Quarter, Condition, .keep_all = TRUE) %>% 
   fill(loc) %>% 
   select(-date)

-output

# A tibble: 9 × 6
  var   month loc    height Quarter Condition
  <chr> <dbl> <chr>   <dbl>   <dbl> <lgl>    
1 A         1 london     13       1 TRUE     
2 A         2 london     14       1 TRUE     
3 A         3 london     15       1 TRUE     
4 B         1 berlin     13       1 TRUE     
5 B         2 berlin      0       1 FALSE    
6 B         3 berlin     15       1 TRUE     
7 C         1 cairo      28       1 TRUE     
8 C         2 cairo      27       1 TRUE     
9 C         3 cairo      15       1 TRUE
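
The two key steps, distinct(.keep_all = TRUE) and fill(), can be tried on their own. The following is a small sketch on an invented tibble (`toy` and its values are assumptions, not the data from the question):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: a duplicated (var, month) pair and one missing loc.
toy <- tibble(
  var = c("A", "A", "B", "B"),
  month = c(1, 1, 1, 2),
  loc = c("london", "london", "berlin", NA),
  height = c(3, 4, 5, 0)
)

# Without .keep_all, distinct() returns only the columns it was given.
keys_only <- toy %>% distinct(var, month)

# With .keep_all = TRUE, the first row of each combination is kept in full.
kept <- toy %>% distinct(var, month, .keep_all = TRUE)

# fill() carries the last non-NA loc downward (the default direction).
filled <- kept %>% fill(loc)
filled
```

Note that fill() here works purely on row order. If a group could start with a missing loc, grouping by var before fill() would be the safer choice.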

Grouping data by time intervals and adding missing rows

With the tidyverse:

left_join(mp, bills) %>% 
  group_by(name, surname, month = lubridate::floor_date(date, "month")) %>% 
  summarise(n = sum(!is.na(month))) %>% 
  replace_na(list(month = as.Date("2021-01-01"))) %>% 
  ungroup(month) %>% 
  complete(month = seq.Date(as.Date("2021-01-01"), as.Date("2021-12-01"), '1 month'), fill = list(n = 0))

# A tibble: 60 x 4
# Groups:   name, surname [5]
   name  surname month          n
   <chr> <chr>   <date>     <dbl>
 1 Diane Abbott  2021-01-01     1
 2 Diane Abbott  2021-02-01     0
 3 Diane Abbott  2021-03-01     0
 4 Diane Abbott  2021-04-01     0
 5 Diane Abbott  2021-05-01     1
 6 Diane Abbott  2021-06-01     0
 7 Diane Abbott  2021-07-01     0
 8 Diane Abbott  2021-08-01     0
 9 Diane Abbott  2021-09-01     0
10 Diane Abbott  2021-10-01     0
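
The core idea (round dates down to the month, count, then fill in the months that never occur) can be sketched on a smaller scale. This is a simplified variant, not the answer's exact pipeline: the `bills` tibble below is invented, and it uses count() instead of the left_join() with `mp`, so people with no bills at all would not appear.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical bills table; names and dates are made up.
bills <- tibble(
  name = c("Ann", "Ann", "Bob"),
  date = as.Date(c("2021-01-15", "2021-01-20", "2021-03-02"))
)

monthly <- bills %>%
  mutate(month = floor_date(date, "month")) %>%  # first day of each month
  count(name, month) %>%                         # bills per person and month
  complete(
    name,
    month = seq.Date(as.Date("2021-01-01"), as.Date("2021-03-01"), by = "1 month"),
    fill = list(n = 0L)                          # months without bills get 0
  )
monthly
```

Because count() returns an ungrouped tibble, complete() needs `name` listed explicitly so that every name is crossed with every month.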

Calculate internal consistency of items by grouping variables using dplyr/tidyverse

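Perhaps this helps. Inside summarise(), across(all_of(itemNames)) returns the item columns of the current group as a data frame, which can be passed straight to the reliability functions: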

library(dplyr)
library(psych)  # provides alpha()
library(MBESS)  # provides ci.reliability()

out1 <- mydata %>%
  group_by(age, raterType) %>%
  summarise(alpha = alpha(across(all_of(itemNames)))$total$raw_alpha,
            omega = ci.reliability(across(all_of(itemNames)), type = "omega",
                                   interval.type = "none")$est, .groups = 'drop')

-output

> out1
# A tibble: 15 × 4
     age raterType   alpha     omega
   <int> <fct>       <dbl>     <dbl>
 1     1 self      -0.135    2.76   
 2     1 friend     0.138    0.231  
 3     1 parent    -0.229  255.     
 4     2 self      -0.421   NA      
 5     2 friend     0.0650  58.7    
 6     2 parent     0.153   NA      
 7     3 self      -0.302    0.00836
 8     3 friend     0.147    0.334  
 9     3 parent     0.196    0.132  
10     4 self      -0.0699  NA      
11     4 friend     0.118    0.214  
12     4 parent    -0.0303  31.1    
13     5 self      -0.0166   0.246  
14     5 friend    -0.192    0.0151 
15     5 parent     0.0847  NA
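
To make the per-group mechanics visible without the psych and MBESS dependencies, here is a rough sketch that computes Cronbach's alpha by hand from the standard formula k/(k-1) * (1 - sum of item variances / variance of the total score). The helper `cronbach_alpha()`, the tibble `toy`, and `item_names` are all hypothetical. The sketch uses pick(), which assumes dplyr 1.1.0 or later; on complete data the result should correspond to the raw_alpha above, but psych::alpha() does considerably more (missing data handling, diagnostics).

```r
library(dplyr)

# Hypothetical helper: Cronbach's alpha from the textbook formula.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_var <- apply(items, 2, var)   # variance of each item
  total_var <- var(rowSums(items))   # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Invented data: in group "a" both items are identical, so alpha is 1 there.
item_names <- c("i1", "i2")
toy <- tibble(
  grp = rep(c("a", "b"), each = 4),
  i1 = c(1, 2, 3, 4, 2, 2, 3, 5),
  i2 = c(1, 2, 3, 4, 1, 3, 3, 4)
)

out <- toy %>%
  group_by(grp) %>%
  # pick() hands the item columns of the current group to the helper
  summarise(alpha = cronbach_alpha(pick(all_of(item_names))), .groups = "drop")
out
```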

Or maybe this, using nest_by() to get one row per group with the group's data in a list column:

out2 <- mydata %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data[, itemNames])$total$raw_alpha,
         omega = ci.reliability(data[, itemNames], type = "omega",
                                interval.type = "none")$est)

-output

out2
# A tibble: 15 × 5
# Rowwise:  age, raterType
     age raterType               data   alpha     omega
   <int> <fct>     <list<tibble[,7]>>   <dbl>     <dbl>
 1     1 self               [100 × 7] -0.135    2.76   
 2     1 friend             [100 × 7]  0.138    0.231  
 3     1 parent             [100 × 7] -0.229  255.     
 4     2 self               [100 × 7] -0.421   NA      
 5     2 friend             [100 × 7]  0.0650  58.7    
 6     2 parent             [100 × 7]  0.153   NA      
 7     3 self               [100 × 7] -0.302    0.00836
 8     3 friend             [100 × 7]  0.147    0.334  
 9     3 parent             [100 × 7]  0.196    0.132  
10     4 self               [100 × 7] -0.0699  NA      
11     4 friend             [100 × 7]  0.118    0.214  
12     4 parent             [100 × 7] -0.0303  31.1    
13     5 self               [100 × 7] -0.0166   0.246  
14     5 friend             [100 × 7] -0.192    0.0151 
15     5 parent             [100 × 7]  0.0847  NA
