Add Margin Row Totals in Dplyr Chain

Add margin row totals in dplyr chain

With adorn_totals() from the janitor package:

library(janitor)
mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row")

cyl 3 4 5
4 1 8 2
6 2 4 1
8 12 0 2
Total 15 12 5

To get from there to the "long" form in your post, add tidyr::gather() to the pipeline:

mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row") %>%
tidyr::gather(gear, n, 2:ncol(.), convert = TRUE)

cyl gear n
1 4 3 1
2 6 3 2
3 8 3 12
4 Total 3 15
5 4 4 8
6 6 4 4
7 8 4 0
8 Total 4 12
9 4 5 2
10 6 5 1
11 8 5 2
12 Total 5 5

Self-promotion alert, I authored this package - adding this answer b/c it's a genuinely efficient solution here.

dplyr pipe: how to add a margin row calculating a total (like addmargins function - base)

You can do:

data2 <- data %>%
filter (grepl("A|B|D", column1)) %>%
rbind(., data.frame(column1="Total", column2=sum(.$column2, na.rm=T)))

column1 column2
1 A 4
2 B NA
3 D 1
4 Total 5

Easier way to add rows with totals for groups in dplyr

Honestly, I would do what you have done to add rows for each group but for the purpose of demonstrating way to use add_row here's an answer :

library(dplyr)
library(purrr)

df %>%
group_split(CO_ANO, subsector, niv100) %>%
map_df(~add_row(.x, CO_ANO = first(.x$CO_ANO), subsector = first(.x$subsector),
niv100 = first(.x$niv100),VL_FOB_real = sum(.x$VL_FOB_real), CO_UF = 'Total'))

# CO_ANO CO_UF niv100 subsector VL_FOB_real
# <chr> <chr> <chr> <chr> <dbl>
# 1 1989 41 410 2 10
# 2 1989 Total 410 2 10
# 3 1990 41 3020 13 9
# 4 1990 Total 3020 13 9
# 5 1990 45 1530 8 2
# 6 1990 Total 1530 8 2
# 7 1992 41 2510 13 11
# 8 1992 Total 2510 13 11
# 9 1992 99 2610 13 5
#10 1992 Total 2610 13 5
# … with 20 more rows

The only benefit I see of this approach is you get "Total" row for each group immediately after the group unlike in bind_rows where you get all "Total" rows together.

Add Group Subheader and Subtotal Rows to data.frame or table in R

Instead of applying adorn_totals on the entire summary, use group_modify and then convert to gt

library(dplyr)
library(tidyr)
library(purrr)
library(janitor)
library(gt)
d %>%
group_by(year, sector, subsector) %>%
summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>%
pivot_wider(names_from = year, values_from = sales) %>%
group_by(sector) %>%
group_modify(~ .x %>% adorn_totals(where = "row")) %>%

gt()

-output

Sample Image


An option is also to split the column with expss

library(expss)
library(openxlsx)
out <- d %>%
group_by(year, sector, subsector) %>%
summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>%
pivot_wider(names_from = year, values_from = sales) %>%
group_by(sector) %>%
group_modify(~ .x %>% adorn_totals(where = "row")) %>%
ungroup %>%
split_columns(columns = 1)
wb <- createWorkbook()
sh <- addWorksheet(wb, "Tables")
xl_write(out, wb, sh)
saveWorkbook(wb, file.path(getwd(), "Documents/table1.xlsx"), overwrite = TRUE)

-output

Sample Image

Using dplyr to add summary rows

An alternative using reshape2 and dplyr packages (both of the same author, btw=P):

df <- structure(...) # structure of your question

Now, melt and cast:

df %>% 
select(-Gender.1,-Age.1) %>%
melt %>% # melting data.frame. See the results until here to better understanding.
dcast(Gender + Age ~ variable, sum, na.rm=TRUE, margins = "Age")

What's going on, line by line:

  1. your data.frame
  2. removing duplicated columns
  3. melting data.frame. See the results until here to better understanding.
  4. aggregate by Gender, Age and variable (Years, in this case). The aggregation function is sum with na.rm=TRUE. Then, margin = "Age" puts the totals by Age, as desired.

Results:

        Gender   Age Year.10 Year.11 Year.12 Year.13 Year.10.1 Year.11.1 Year.12.1 Year.13.1
1 FEMALE 3-9 46890 30150 18020 19060 41370 30150 18020 19060
2 FEMALE 10-19 58440 92870 50900 84110 52680 87280 50900 78440
3 FEMALE 40-59 484770 439860 454200 412650 460340 417800 447400 395700
4 FEMALE (all) 590100 562880 523120 515820 554390 535230 516320 493200
5 MALE 0-2 16900 22380 2260 5860 16900 22380 2260 5860
...
17 UNSPECIFIED UNSP 10920 0 5500 5700 10920 0 0 5700
18 UNSPECIFIED (all) 26240 2150 14670 5700 26240 0 9170 5700

Hope it helps.

How do you add a column with row totals in DPLYR when the first variable is not numeric?

You can also try:

require(tidyverse)

df %>%
mutate(Sum = select_if(., is.numeric) %>%
rowSums)

As @akrun mentioned, another option is to use reduce instead of rowSums:

df %>% 
mutate(Sum = select_if(., is.numeric) %>%
reduce("+"))

Result:

  branch Pass_test_equity Fail_test_equity no_dependent_equity  Sum
1 branc1 230 47 620 897
2 branc2 1460 201 1681 3342
3 branc3 875 140 1533 2548

Data:

df <- read.table(text ="branch Pass_test_equity Fail_test_equity no_dependent_equity
branc1 230 47 620
branc2 1460 201 1681
branc3 875 140 1533",
header = TRUE)

Row sum using mutate and select

You can summarise first and then bind them to your original df, i.e.

library(tidyverse)

bind_rows(d1,
d1 %>%
group_by(name) %>%
summarise_all(funs(sum)) %>%
mutate(name = paste0(name, '_total')))

which gives,

# A tibble: 6 x 3
name `2012` `2013`
<chr> <dbl> <dbl>
1 jim 57 14
2 john 58 3
3 jim 47 3
4 john 57 90
5 jim_total 104 17
6 john_total 115 93

R: Divide rows by row totals using dplyr

It seems like adorn_percentages does the same thing as your custom function.

d %>% 
adorn_percentages("col") %>%
mutate_at(vars(-level), ~round(.*100,2)) %>%
bind_rows(
d %>% adorn_totals("row") %>% slice(11)
)


Related Topics



Leave a reply



Submit