Add margin row totals in dplyr chain
With adorn_totals() from the janitor package:
library(janitor)
mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row")
cyl 3 4 5
4 1 8 2
6 2 4 1
8 12 0 2
Total 15 12 5
To get from there to the "long" form in your post, add tidyr::gather()
to the pipeline:
mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row") %>%
tidyr::gather(gear, n, 2:ncol(.), convert = TRUE)
cyl gear n
1 4 3 1
2 6 3 2
3 8 3 12
4 Total 3 15
5 4 4 8
6 6 4 4
7 8 4 0
8 Total 4 12
9 4 5 2
10 6 5 1
11 8 5 2
12 Total 5 5
Self-promotion alert, I authored this package - adding this answer b/c it's a genuinely efficient solution here.
dplyr pipe: how to add a margin row calculating a total (like addmargins function - base)
You can do:
data2 <- data %>%
filter (grepl("A|B|D", column1)) %>%
rbind(., data.frame(column1="Total", column2=sum(.$column2, na.rm=T)))
column1 column2
1 A 4
2 B NA
3 D 1
4 Total 5
Easier way to add rows with totals for groups in dplyr
Honestly, I would do what you have done to add rows for each group but for the purpose of demonstrating way to use add_row
here's an answer :
library(dplyr)
library(purrr)
df %>%
group_split(CO_ANO, subsector, niv100) %>%
map_df(~add_row(.x, CO_ANO = first(.x$CO_ANO), subsector = first(.x$subsector),
niv100 = first(.x$niv100),VL_FOB_real = sum(.x$VL_FOB_real), CO_UF = 'Total'))
# CO_ANO CO_UF niv100 subsector VL_FOB_real
# <chr> <chr> <chr> <chr> <dbl>
# 1 1989 41 410 2 10
# 2 1989 Total 410 2 10
# 3 1990 41 3020 13 9
# 4 1990 Total 3020 13 9
# 5 1990 45 1530 8 2
# 6 1990 Total 1530 8 2
# 7 1992 41 2510 13 11
# 8 1992 Total 2510 13 11
# 9 1992 99 2610 13 5
#10 1992 Total 2610 13 5
# … with 20 more rows
The only benefit I see of this approach is you get "Total"
row for each group immediately after the group unlike in bind_rows
where you get all "Total"
rows together.
Add Group Subheader and Subtotal Rows to data.frame or table in R
Instead of applying adorn_totals
on the entire summary, use group_modify
and then convert to gt
library(dplyr)
library(tidyr)
library(purrr)
library(janitor)
library(gt)
d %>%
group_by(year, sector, subsector) %>%
summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>%
pivot_wider(names_from = year, values_from = sales) %>%
group_by(sector) %>%
group_modify(~ .x %>% adorn_totals(where = "row")) %>%
gt()
-output
An option is also to split the column with expss
library(expss)
library(openxlsx)
out <- d %>%
group_by(year, sector, subsector) %>%
summarise(sales = sum(value, na.rm = TRUE), .groups = 'drop') %>%
pivot_wider(names_from = year, values_from = sales) %>%
group_by(sector) %>%
group_modify(~ .x %>% adorn_totals(where = "row")) %>%
ungroup %>%
split_columns(columns = 1)
wb <- createWorkbook()
sh <- addWorksheet(wb, "Tables")
xl_write(out, wb, sh)
saveWorkbook(wb, file.path(getwd(), "Documents/table1.xlsx"), overwrite = TRUE)
-output
Using dplyr to add summary rows
An alternative using reshape2
and dplyr
packages (both of the same author, btw=P):
df <- structure(...) # structure of your question
Now, melt and cast:
df %>%
select(-Gender.1,-Age.1) %>%
melt %>% # melting data.frame. See the results until here to better understanding.
dcast(Gender + Age ~ variable, sum, na.rm=TRUE, margins = "Age")
What's going on, line by line:
- your data.frame
- removing duplicated columns
- melting data.frame. See the results until here to better understanding.
- aggregate by
Gender
,Age
andvariable
(Years, in this case). The aggregation function issum
with na.rm=TRUE. Then,margin = "Age"
puts the totals byAge
, as desired.
Results:
Gender Age Year.10 Year.11 Year.12 Year.13 Year.10.1 Year.11.1 Year.12.1 Year.13.1
1 FEMALE 3-9 46890 30150 18020 19060 41370 30150 18020 19060
2 FEMALE 10-19 58440 92870 50900 84110 52680 87280 50900 78440
3 FEMALE 40-59 484770 439860 454200 412650 460340 417800 447400 395700
4 FEMALE (all) 590100 562880 523120 515820 554390 535230 516320 493200
5 MALE 0-2 16900 22380 2260 5860 16900 22380 2260 5860
...
17 UNSPECIFIED UNSP 10920 0 5500 5700 10920 0 0 5700
18 UNSPECIFIED (all) 26240 2150 14670 5700 26240 0 9170 5700
Hope it helps.
How do you add a column with row totals in DPLYR when the first variable is not numeric?
You can also try:
require(tidyverse)
df %>%
mutate(Sum = select_if(., is.numeric) %>%
rowSums)
As @akrun mentioned, another option is to use reduce
instead of rowSums
:
df %>%
mutate(Sum = select_if(., is.numeric) %>%
reduce("+"))
Result:
branch Pass_test_equity Fail_test_equity no_dependent_equity Sum
1 branc1 230 47 620 897
2 branc2 1460 201 1681 3342
3 branc3 875 140 1533 2548
Data:
df <- read.table(text ="branch Pass_test_equity Fail_test_equity no_dependent_equity
branc1 230 47 620
branc2 1460 201 1681
branc3 875 140 1533",
header = TRUE)
Row sum using mutate and select
You can summarise first and then bind them to your original df, i.e.
library(tidyverse)
bind_rows(d1,
d1 %>%
group_by(name) %>%
summarise_all(funs(sum)) %>%
mutate(name = paste0(name, '_total')))
which gives,
# A tibble: 6 x 3
name `2012` `2013`
<chr> <dbl> <dbl>
1 jim 57 14
2 john 58 3
3 jim 47 3
4 john 57 90
5 jim_total 104 17
6 john_total 115 93
R: Divide rows by row totals using dplyr
It seems like adorn_percentages
does the same thing as your custom function.
d %>%
adorn_percentages("col") %>%
mutate_at(vars(-level), ~round(.*100,2)) %>%
bind_rows(
d %>% adorn_totals("row") %>% slice(11)
)
Related Topics
How to Clear Only a Few Specific Objects from the Workspace
Stepwise Regression Using P-Values to Drop Variables with Nonsignificant P-Values
How to Copy and Paste Data into R from the Clipboard
Normalizing Y-Axis in Histograms in R Ggplot to Proportion
Calculate Correlation with Cor(), Only for Numerical Columns
How to Hold Figure Position with Figure Caption in PDF Output of Knitr
Convert a Dataframe to a Vector (By Rows)
How to Pass Command-Line Arguments When Calling Source() on an R File Within Another R File
Populating a Data Frame in R in a Loop
Creating a Density Histogram in Ggplot2
Filling Missing Dates in a Grouped Time Series - a Tidyverse-Way
Add Text to Horizontal Barplot in R, Y-Axis at Different Scale
Insert a Blank Row After Each Group of Data
How to Sort All Dataframes in a List of Dataframes on the Same Column