Ddply Multiple Quantiles by Group

ddply multiple quantiles by group

With base R you could use tapply and do.call (plyr is loaded here only for its baseball data set):

library(plyr)
do.call("rbind", tapply(baseball$ab, baseball$team, quantile))

do.call("rbind", tapply(baseball$ab, baseball$team, quantile, c(0.05, 0.1, 0.2)))

Or, with ddply:

ddply(baseball, .(team), function(x) quantile(x$ab))
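
To see what the tapply/do.call combination returns, here is a minimal sketch on the built-in mtcars data (chosen only for illustration; it is not the baseball data used above). tapply yields one quantile vector per group, and rbind stacks those vectors into a matrix with one row per group. The names probs, q_list, q_mat and q_df are made up for this example.

```r
# mtcars ships with base R; cyl has three groups (4, 6 and 8)
probs <- c(0.25, 0.5, 0.75)

# One named quantile vector per cyl group
q_list <- tapply(mtcars$mpg, mtcars$cyl, quantile, probs = probs)

# Stack the per-group vectors into a matrix: rows are groups, columns are quantiles
q_mat <- do.call(rbind, q_list)

# Optional: turn the matrix into a data frame with the group as a column
q_df <- data.frame(cyl = rownames(q_mat), q_mat, check.names = FALSE)
print(q_df)
```

The row names of the matrix come from the group labels, and the column names come from the names that quantile attaches to its result.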

Inconsistent ddply multiple quantiles by group

This happens because summarise evaluates its arguments in order, so Cmax = round(median(Cmax), 2) replaces Cmax with the group median. The next expression, Cmax_25 = round(quantile(Cmax, 0.25), 2), then sees this overwritten value rather than the original column.

Move that line to the end so that Cmax is overwritten only after the quantiles have been computed. Also, plyr is retired, so you may want to switch to dplyr.

library(dplyr)

NCAtrim %>%
  group_by(DoseWt) %>%
  summarise(AUC_inf  = round(median(AUCINF_obs), 2),
            AUCinf25 = round(quantile(AUCINF_obs, 0.25), 2),
            AUCinf75 = round(quantile(AUCINF_obs, 0.75), 2),
            Cmax_25  = round(quantile(Cmax, 0.25), 2),
            Cmax_75  = round(quantile(Cmax, 0.75), 2),
            # Cmax is overwritten last, after the quantiles have used the original values
            Cmax     = round(median(Cmax), 2)) -> NCA.by.Dose.25_75tile

NCA.by.Dose.25_75tile
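
The ordering effect can be reproduced on a tiny data set. The sketch below uses a hypothetical data frame toy with a single group, and the names bad and good are made up for the illustration; it only assumes that dplyr is installed.

```r
library(dplyr)

# Hypothetical data: one group, four values
toy <- data.frame(g = "a", x = c(1, 2, 3, 10))

# Overwriting x first: the quantile is then taken of the median itself
bad <- toy %>%
  group_by(g) %>%
  summarise(x = median(x),
            x_75 = unname(quantile(x, 0.75)))

# Overwriting x last: the quantile still sees the original four values
good <- toy %>%
  group_by(g) %>%
  summarise(x_75 = unname(quantile(x, 0.75)),
            x = median(x))

bad$x_75   # 2.5, the median passed through quantile()
good$x_75  # 4.75, the 75th percentile of c(1, 2, 3, 10)
```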

Doing quantiles per group

We can group the data, compute the quantiles of each numeric column by looping across the columns, and return each result as a list, which can then be spread into separate columns with unnest_wider.

library(dplyr)
df1 %>%
  select(-Ano) %>%
  group_by(paises) %>%
  summarise(across(where(is.numeric),
                   # wrap in list() so each cell holds the three quantiles
                   ~ list(as.list(quantile(.x, probs = c(0.25, 0.5, 0.75))))))

ddply - dplyr: .fun = summarize with several rows

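Check whether this works. Here p is the vector of probabilities, c(0.2, 0.4, 0.6, 0.8) in the output below. The numbers will differ from yours because no set.seed was used.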

dfx %>% group_by(group) %>% do(data.frame(p = p, stats = quantile(.$age, probs = p)))
Source: local data frame [12 x 3]
Groups: group

   group   p    stats
1      A 0.2 27.68069
2      A 0.4 35.36915
3      A 0.6 39.15223
4      A 0.8 46.41073
5      B 0.2 34.68378
6      B 0.4 37.22358
7      B 0.6 40.76185
8      B 0.8 44.48645
9      C 0.2 33.86023
10     C 0.4 36.30515
11     C 0.6 46.80672
12     C 0.8 52.82140
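
In recent dplyr releases (1.1.0 and later, as far as I know) the same long-format result can be written with reframe, which allows each group to return several rows. This is only a sketch: the data frame dfx below is a made-up stand-in with the same column names as in the question, and p_vals is a hypothetical name for the probability vector.

```r
library(dplyr)

set.seed(42)
# Made-up stand-in for dfx: three groups with ten ages each
dfx <- data.frame(group = rep(c("A", "B", "C"), each = 10),
                  age   = runif(30, min = 20, max = 60))

p_vals <- c(0.2, 0.4, 0.6, 0.8)

# reframe() returns one row per probability within each group
res <- dfx %>%
  group_by(group) %>%
  reframe(p = p_vals,
          stats = unname(quantile(age, probs = p_vals)))

print(res)
```

The result has one row per group and probability, so three groups and four probabilities give twelve rows.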

Grouped table of percentiles

You could get the quantile data in a list and then use unnest_wider to have separate columns.

library(dplyr)
set.seed(123)

data.frame(group = sample(LETTERS[1:5], 100, TRUE), values = rnorm(100)) %>%
  group_by(group) %>%
  summarise(perc_int = list(quantile(values, probs = c(0.05, 0.34, 0.5, 0.67, 0.95)))) %>%
  tidyr::unnest_wider(perc_int)

# A tibble: 5 x 6
#  group   `5%`  `34%`   `50%` `67%` `95%`
#  <fct>  <dbl>  <dbl>   <dbl> <dbl> <dbl>
#1 A     -2.40  -0.580 -0.0887 0.371  1.38
#2 B     -1.83  -0.200  0.0848 0.546  1.78
#3 C     -0.947 -0.148  0.184  0.789  1.81
#4 D     -0.992 -0.275 -0.0193 0.274  1.82
#5 E     -1.65  -0.457 -0.0422 0.540  1.66

Creating multiple subsets all in one data.frame (possibly with ddply)

You could try:

ddply(df, .(x), subset, rnorm.100. > quantile(rnorm.100., 0.8))

As an aside, you could create the data with df <- data.frame(x, y = rnorm(100)) to name the column directly, instead of ending up with the auto-generated name rnorm.100.
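
Since plyr is retired, the same per-group subset can also be sketched with dplyr. The example assumes a hypothetical data frame with a grouping column x and a value column y, and the name top20 is made up. filter evaluates quantile separately within each group, so only the rows above each group's own 80th percentile are kept.

```r
library(dplyr)

set.seed(1)
# Hypothetical data: four groups of 25 values each
df <- data.frame(x = rep(letters[1:4], each = 25), y = rnorm(100))

# Keep the rows whose y exceeds the 80th percentile of their own group
top20 <- df %>%
  group_by(x) %>%
  filter(y > quantile(y, 0.8)) %>%
  ungroup()

table(top20$x)
```

With 25 distinct values per group, five rows per group lie above the 80th percentile, so top20 has 20 rows.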

Using colwise, is.numeric in ddply in R for quantile calculation

We could try with data.table:

library(data.table)
setDT(d)[, lapply(.SD, quantile, probs = 0.75), by = groups]

Or using dplyr (summarise_each and funs are deprecated, so across is used here):

library(dplyr)
d %>%
  group_by(groups) %>%
  summarise(across(everything(), ~ quantile(.x, probs = 0.75)))

ddply to split and add rows to each group

I think this will do what you want:

AddRows <- function(df) {
  # Sequence from the group's smallest number up to 12, skipping 0
  new_numbers <- seq(from = min(df$numbers), to = 12)
  new_numbers <- new_numbers[new_numbers != 0]
  # Repeat the group label and pad the original numbers with NA to the new length
  noms <- rep(unique(df$noms), length(new_numbers))
  numbers <- c(df$numbers, rep(NA, length(new_numbers) - length(df$numbers)))

  return(data.frame(noms, numbers, new_numbers))
}

ddply(df, .(noms), AddRows)

