Can't add rows to grouped data frames
I actually recently made a little helper function for exactly this. The idea is to use group_modify() to take each group's data and bind_rows() the summary statistics calculated with summarise() onto it.
This is what it looks like in code:
add_summary_rows <- function(.data, ...) {
  group_modify(.data, function(x, y) bind_rows(x, summarise(x, ...)))
}
And here’s how that would work with your data:
library(dplyr, warn.conflicts = FALSE)
df <- data.frame(
  test_id = c(1, 1, 1, 1, 1, 1, 1, 1),
  test_nr = c(1, 1, 1, 1, 2, 2, 2, 2),
  region = c("A", "B", "C", "D", "A", "B", "C", "D"),
  test_value = c(3, 1, 1, 2, 4, 2, 4, 1)
)
df %>%
  group_by(test_id, test_nr) %>%
  add_summary_rows(
    region = "MEAN",
    test_value = mean(test_value)
  )
#> # A tibble: 10 x 4
#> # Groups: test_id, test_nr [2]
#> test_id test_nr region test_value
#> <dbl> <dbl> <chr> <dbl>
#> 1 1 1 A 3
#> 2 1 1 B 1
#> 3 1 1 C 1
#> 4 1 1 D 2
#> 5 1 1 MEAN 1.75
#> 6 1 2 A 4
#> 7 1 2 B 2
#> 8 1 2 C 4
#> 9 1 2 D 1
#> 10 1 2 MEAN 2.75
Add rows to grouped data with dplyr?
Without dplyr it can be done like this:
as.data.frame(xtabs(Demand ~ Week + Article, data))
giving:
Week Article Freq
1 2013-W01 10004 1215
2 2013-W02 10004 900
3 2013-W03 10004 774
4 2013-W04 10004 1170
5 2013-W01 10006 0
6 2013-W02 10006 0
7 2013-W03 10006 0
8 2013-W04 10006 5
9 2013-W01 10007 2
10 2013-W02 10007 0
11 2013-W03 10007 0
12 2013-W04 10007 0
and this can be rewritten as a magrittr or dplyr pipeline like this:
data %>% xtabs(formula = Demand ~ Week + Article) %>% as.data.frame()
The as.data.frame() at the end could be omitted if a wide form solution were desired.
Add row in each group using dplyr and add_row()
If you want to use a grouped operation, you need do(), as JasonWang described in his comment, because other functions like mutate() or summarise() expect a result with the same number of rows as the grouped data frame (in your case, 50) or with one row (e.g. when summarising).
As you probably know, do() can be slow in general and should be a last resort if you cannot achieve your result in another way. Your task is quite simple because it only involves adding extra rows to your data frame, which can be done by simple indexing, e.g. look at the output of iris[NA, ].
What you want is essentially to create a vector
indices <- c(NA, 1:50, NA, 51:100, NA, 101:150)
(since the first group is in rows 1 to 50, the second one in 51 to 100 and the third one in 101 to 150).
The result is then iris[indices, ].
A more general way of building this vector uses group_indices():
indices <- seq(nrow(iris)) %>%
  split(group_indices(iris, Species)) %>%
  map(~c(NA, .x)) %>%
  unlist()
(map() comes from purrr, which I assume you have loaded as you have tagged this with tidyverse.)
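The same index vector can also be built in base R, without purrr or group_indices() (whose `...` argument is, as far as I know, marked as deprecated in current dplyr documentation). This is only a minimal sketch; the names groups and result are illustrative and not part of the original answer:

```r
# Row positions of iris, split into one integer vector per species
groups <- split(seq_len(nrow(iris)), iris$Species)

# Prepend an NA to each group's positions, then flatten to a single vector
indices <- unlist(lapply(groups, function(i) c(NA, i)), use.names = FALSE)

# Indexing with an NA position yields a row filled with NA
result <- iris[indices, ]

nrow(result)
head(result, 3)
```

Because indices is an integer vector, each NA adds exactly one empty row in front of its group.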
How to add a row to each group and assign values
According to the documentation of the function group_modify(), if you use a formula, you must use ". or .x to refer to the subset of rows of .tbl for the given group"; that's why you used .x inside the add_row() function. To be entirely consistent, you also have to do it within the first() function.
df %>%
  group_by(id) %>%
  group_modify(~ add_row(A=4, B=first(.x$B), .x))
# A tibble: 6 x 3
# Groups: id [3]
id A B
<chr> <dbl> <dbl>
1 one 1 4
2 one 4 4
3 three 3 6
4 three 4 6
5 two 2 5
6 two 4 5
Using first(.$B) or first(df$B) will provide the same results.
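For a self-contained version of the call above, here is a sketch with a small input table. The original df is not shown, so its values are an assumption reconstructed from the printed output. It uses the two-argument function form of group_modify() instead of a formula, so that the current group's rows have an explicit name (rows and key are illustrative names):

```r
library(dplyr)

# Assumed input, reconstructed from the output shown above
df <- tibble(
  id = c("one", "two", "three"),
  A  = c(1, 2, 3),
  B  = c(4, 5, 6)
)

res <- df %>%
  group_by(id) %>%
  # rows: the current group's data (without the id column)
  # key:  a one-row tibble holding the group's id
  group_modify(function(rows, key) {
    tibble::add_row(rows, A = 4, B = first(rows$B))
  })

res
```

Here first(rows$B) only ever sees the rows of the group being processed, which is what .x provides in the formula version.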
R add rows to grouped df using dplyr
This should do the trick:
library(plyr)
library(magrittr)  # provides %>%, which plyr does not export
df %>%
  join(
    subset(df, item_code %in% additional_rows$item_code, select = c(id, item_code)) %>%
      join(additional_rows) %>%
      subset(!duplicated(.)),
    type = "full"
  ) %>%
  arrange(id, item_code, -score)
Not sure if it's the best way, but it works.
Edit: to get the score in the same order, I added the other arrange() terms.
Edit 2: alright, there should now be no duplicated rows added from the additional rows, as per your comment.
Add rows by group and fill them with zero in R with dplyr
We can use complete():
library(dplyr)
library(tidyr)
df %>%
  complete(gene, time = 1:4, fill = list(frequency = 0)) %>%
  select(names(df))
-output
# A tibble: 8 x 3
gene frequency time
<chr> <dbl> <dbl>
1 A 0.590 1
2 A 0.762 2
3 A 0.336 3
4 A 0.437 4
5 B 0.904 1
6 B 1.97 2
7 B 0 3
8 B 0 4
R Add rows to each group so each group has same number, and specify other variable
tidyr::complete(df, week, session)
# A tibble: 16 x 3
week session work
<dbl> <dbl> <chr>
1 1 1 done
2 1 2 done
3 1 3 NA
4 1 4 NA
5 2 1 done
6 2 2 done
7 2 3 NA
8 2 4 NA
9 3 1 done
10 3 2 done
11 3 3 done
12 3 4 NA
13 4 1 done
14 4 2 done
15 4 3 done
16 4 4 done
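Since the input is not shown, here is a sketch with a hypothetical df whose values are reconstructed from the output above. complete() adds one row for every week/session combination that is missing from the data. The fill argument, shown here with a made-up "not done" label, sets the value used in place of NA:

```r
library(tidyr)

# Hypothetical input: weeks 1 and 2 have 2 sessions, week 3 has 3, week 4 has 4
df <- data.frame(
  week    = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  session = c(1, 2, 1, 2, 1, 2, 3, 1, 2, 3, 4),
  work    = "done"
)

# Default: the added week/session combinations get NA in work
filled_na <- complete(df, week, session)

# Alternative: give the added rows an explicit value instead of NA
filled_label <- complete(df, week, session, fill = list(work = "not done"))

filled_label
```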
Insert new row on group_by data in R dplyr based on condition
You have almost achieved what you want.
new_rows <- example %>%
  group_by(bucket) %>%
  summarise(rate = 1 - sum(rate))
new_rows
# bucket rate
# <dbl> <dbl>
# 1 0 0.015
# 2 1 0.02
bind_rows(example, new_rows)
# bucket bucket2 rate
# 1 0 0 0.950
# 2 0 1 0.020
# 3 0 2 0.010
# 4 0 3 0.005
# 5 0 4 0.000
# 6 1 0 0.900
# 7 1 1 0.050
# 8 1 2 0.020
# 9 1 3 0.010
# 10 1 4 0.000
# 11 0 NA 0.015
# 12 1 NA 0.020
adding rows by group to get same number of observations by group
We may group by 'anon_ID' and use complete() to expand the data:
library(dplyr)
library(tidyr)
df1 %>%
  group_by(anon_ID) %>%
  complete(nth_assistance_interaction = c(5, 10, 15, 20)) %>%
  ungroup()