ddply multiple quantiles by group
With base R, you could use tapply and do.call (plyr is loaded here only for its baseball dataset):
library(plyr)
do.call("rbind", tapply(baseball$ab, baseball$team, quantile))
do.call("rbind", tapply(baseball$ab, baseball$team, quantile, c(0.05, 0.1, 0.2)))
Or, with ddply
ddply(baseball, .(team), function(x) quantile(x$ab))
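The base R approach above returns a matrix with the groups as row names. As a minimal sketch, assuming you want a data frame with an explicit group column, the result can be wrapped as follows. The built-in mtcars data stands in for baseball here (an assumption made only so the example runs without extra packages), and the names q_mat and q_df are illustrative.

```r
# Stand-in data: built-in mtcars instead of plyr's baseball (assumption).
probs <- c(0.25, 0.5, 0.75)

# One row of quantiles per group, bound into a matrix.
q_mat <- do.call(rbind, tapply(mtcars$mpg, mtcars$cyl, quantile, probs = probs))

# Move the group labels from row names into a regular column;
# check.names = FALSE keeps the "25%"-style column names intact.
q_df <- data.frame(cyl = rownames(q_mat), q_mat,
                   check.names = FALSE, row.names = NULL)
print(q_df)
```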
Inconsistent ddply multiple quantiles by group
That is because the value of Cmax is changed when you run Cmax = round(median(Cmax), 2). The next command that you run, Cmax_25 = round(quantile(Cmax, 0.25), 2), gets this changed Cmax value and not the original one.
You can move that line to the end so that it does not change the Cmax value before the quantiles are computed. Also, plyr is retired, so you may want to switch to dplyr.
library(dplyr)
NCAtrim %>%
  group_by(DoseWt) %>%
  summarise(AUC_inf = round(median(AUCINF_obs), 2),
            AUCinf25 = round(quantile(AUCINF_obs, 0.25), 2),
            AUCinf75 = round(quantile(AUCINF_obs, 0.75), 2),
            Cmax_25 = round(quantile(Cmax, 0.25), 2),
            Cmax_75 = round(quantile(Cmax, 0.75), 2),
            Cmax = round(median(Cmax), 2)) -> NCA.by.Dose.25_75tile
NCA.by.Dose.25_75tile
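The overwrite effect described in this answer can be reproduced on a small example. This is a sketch on made-up data: the data frame toy and its columns g and x are hypothetical and not from the original question. It relies on the fact that, within one summarise() call, later expressions see columns created by earlier ones.

```r
library(dplyr)

# Hypothetical toy data: one group, four values.
toy <- data.frame(g = c("a", "a", "a", "a"), x = c(1, 2, 3, 10))

# Overwriting x first: the quantile is then taken of the single median value.
wrong <- toy %>%
  group_by(g) %>%
  summarise(x = median(x),
            x_75 = quantile(x, 0.75))

# Overwriting x last: the quantile still sees the original four values.
right <- toy %>%
  group_by(g) %>%
  summarise(x_75 = quantile(x, 0.75),
            x = median(x))

wrong$x_75  # equals the median, 2.5
right$x_75  # 75th percentile of the raw values, 4.75
```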
Doing quantiles per group
We can group the data, compute the quantiles of each numeric column by looping across the columns, and return the result as a list column, which can then be converted to regular columns with unnest_wider.
library(dplyr)
df1 %>%
  select(-Ano) %>%
  group_by(paises) %>%
  # one list element per group and column, holding the three named quantiles
  summarise(across(where(is.numeric), ~
    list(as.list(quantile(.x, probs = c(.25, 0.5, 0.75))))))
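To show how the list column becomes regular columns, here is a minimal sketch on made-up data. The data frame toy and the column valor are hypothetical; only the column name paises comes from the answer. It assumes tidyr is installed and uses unnest_wider with names_sep so the quantile names are prefixed with the source column name.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with one grouping column and one numeric column.
toy <- data.frame(paises = rep(c("A", "B"), each = 4),
                  valor = c(1, 2, 3, 4, 10, 20, 30, 40))

out <- toy %>%
  group_by(paises) %>%
  summarise(across(where(is.numeric),
                   ~ list(as.list(quantile(.x, probs = c(0.25, 0.5, 0.75)))))) %>%
  # spread each named list into columns such as `valor_25%`
  unnest_wider(valor, names_sep = "_")

print(out)
```

With several numeric columns, unnest_wider would have to be applied to each list column in turn.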
ddply - dplyr: .fun = summarize with several rows
Check if this works (the output is different from yours because no set.seed was used):
dfx %>% group_by(group) %>% do(data.frame(p=p, stats=quantile(.$age, probs=p)))
Source: local data frame [12 x 3]
Groups: group
group p stats
1 A 0.2 27.68069
2 A 0.4 35.36915
3 A 0.6 39.15223
4 A 0.8 46.41073
5 B 0.2 34.68378
6 B 0.4 37.22358
7 B 0.6 40.76185
8 B 0.8 44.48645
9 C 0.2 33.86023
10 C 0.4 36.30515
11 C 0.6 46.80672
12 C 0.8 52.82140
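In current dplyr, do() is superseded. As a sketch of the same multi-row summary, assuming dplyr 1.1.0 or later (where reframe() is available), something like the following should give one row per group and probability. The data frame dfx is regenerated here with made-up values, since the original data is not shown.

```r
library(dplyr)

set.seed(1)  # fixed seed so the sketch is reproducible
p <- c(0.2, 0.4, 0.6, 0.8)

# Hypothetical stand-in for the dfx of the question.
dfx <- data.frame(group = rep(c("A", "B", "C"), each = 10),
                  age = runif(30, min = 20, max = 60))

res <- dfx %>%
  group_by(group) %>%
  # reframe() allows each group to return any number of rows
  reframe(p = p, stats = quantile(age, probs = p))

print(res)
```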
Grouped table of percentiles
You could get the quantile data in a list and then use unnest_wider to have separate columns.
library(dplyr)
set.seed(123)
data.frame(group = sample(LETTERS[1:5], 100, TRUE), values = rnorm(100)) %>%
  group_by(group) %>%
  summarise(perc_int = list(quantile(values, probs = c(0.05, 0.34, 0.5, 0.67, 0.95)))) %>%
  tidyr::unnest_wider(perc_int)
# A tibble: 5 x 6
# group `5%` `34%` `50%` `67%` `95%`
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A -2.40 -0.580 -0.0887 0.371 1.38
#2 B -1.83 -0.200 0.0848 0.546 1.78
#3 C -0.947 -0.148 0.184 0.789 1.81
#4 D -0.992 -0.275 -0.0193 0.274 1.82
#5 E -1.65 -0.457 -0.0422 0.540 1.66
Creating multiple subsets all in one data.frame (possibly with ddply)
You could try:
ddply(df, .(x), subset, rnorm.100. > quantile(rnorm.100., 0.8))
Off topic: you could use df <- data.frame(x, y = rnorm(100)) to name a column on the fly.
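Putting both suggestions together, a self-contained sketch might look like this. The data is made up (two groups of 50 random values), and the names df, x, y and top are illustrative. It keeps, within each group, only the rows above that group's 80th percentile.

```r
library(plyr)

set.seed(42)
# Hypothetical data: the column is named y directly in data.frame().
df <- data.frame(x = rep(c("a", "b"), each = 50), y = rnorm(100))

# For each level of x, keep the rows whose y exceeds the group's 0.8 quantile.
top <- ddply(df, .(x), subset, y > quantile(y, 0.8))

table(top$x)  # rows kept per group
```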
Using colwise, is.numeric in ddply in R for quantile calculation
We could try with data.table
library(data.table)
setDT(d)[,lapply(.SD, quantile, probs=0.75) , groups]
Or using dplyr
library(dplyr)
d %>%
  group_by(groups) %>%
  summarise_each(funs(quantile(., probs=0.75)))
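Note that summarise_each() and funs() are deprecated in recent dplyr releases. A sketch of an equivalent using across(), assuming dplyr 1.0.0 or later, is shown below. The data frame d is made up for illustration, with a grouping column named groups as in the answer and two hypothetical numeric columns a and b.

```r
library(dplyr)

# Hypothetical data with a grouping column and two numeric columns.
d <- data.frame(groups = rep(c("g1", "g2"), each = 4),
                a = c(1, 2, 3, 4, 5, 6, 7, 8),
                b = c(10, 20, 30, 40, 50, 60, 70, 80))

q75 <- d %>%
  group_by(groups) %>%
  # apply the 75th percentile to every numeric column within each group
  summarise(across(where(is.numeric), ~ quantile(.x, probs = 0.75)))

print(q75)
```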
ddply to split and add rows to each group
I think this will do what you want:
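AddRows <- function(df) {
  # sequence from the group's smallest number up to 12, skipping 0
  new_numbers <- seq(from = min(df$numbers), to = 12)
  new_numbers <- new_numbers[new_numbers != 0]
  # repeat the group label once per new row
  noms <- rep(unique(df$noms), length(new_numbers))
  # pad the original numbers with NA to match the new length
  numbers <- c(df$numbers, rep(NA, length(new_numbers) - length(df$numbers)))
  return(data.frame(noms, numbers, new_numbers))
}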
ddply(df, .(noms), AddRows)