How Calculate Growth Rate in Long Format Data Frame

How calculate growth rate in long format data frame?

For these sorts of questions ("how do I compute XXX by category YYY")? there are always solutions based on by(), the data.table() package, and plyr. I generally prefer plyr, which is often slower, but (to me) more transparent/elegant.

df <- data.frame(Category=c(rep("A",6),rep("B",6)),
Year=rep(2010:2015,2),Value=1:12)

library(plyr)
ddply(df,"Category",transform,
Growth=c(NA,exp(diff(log(Value)))-1))

The main difference between this answer and @krlmr's is that I am using a geometric-mean trick (taking differences of logs and then exponentiating) while @krlmr computes an explicit ratio.

Mathematically, diff(log(Value)) is taking the differences of the logs, i.e. log(x[t+1])-log(x[t]) for all t. When we exponentiate that we get the ratio x[t+1]/x[t] (because exp(log(x[t+1])-log(x[t])) = exp(log(x[t+1]))/exp(log(x[t])) = x[t+1]/x[t]). The OP wanted the fractional change rather than the multiplicative growth rate (i.e. x[t+1]==x[t] corresponds to a fractional change of zero rather than a multiplicative growth rate of 1.0), so we subtract 1.

I am also using transform() for a little bit of extra "syntactic sugar", to avoid creating a new anonymous function.

Calculating yearly growth-rates from quarterly, long form data in r

I think you need something like this:

library(dplyr)
df %>%
#group by sector and country
group_by(Sector, Country) %>%
#calculate growth as (quarter / 5-period-lagged quarter) - 1
mutate(growth = Income / lag(Income, 4) - 1)

Output

Source: local data frame [32 x 5]
Groups: Sector, Country [4]

Sector Country Quarter Income growth
(fctr) (fctr) (int) (int) (dbl)
1 A USA 1 20 NA
2 A USA 2 21 NA
3 A USA 3 22 NA
4 A USA 4 23 NA
5 A USA 5 24 0.2000000
6 A USA 6 25 0.1904762
7 A USA 7 26 0.1818182
8 A USA 8 27 0.1739130
9 B USA 1 28 NA
10 B USA 2 29 NA
.. ... ... ... ... ...

Calculating rates when data is in long form

You can use group_by to calculate this for each squirrel:

group_by(df, squirrel) %>% 
mutate(g.rate = (wt - nth(wt, which.min(is.na(wt)))) /
(age - nth(age, which.min(is.na(wt)))))

That leaves NaNs where the age term is zero, but you can change those to NAs if you want with df$g.rate[is.nan(df$g.rate)] <- NA.

Calculating growth rates across rows by groups in the long format (relative to a base group) in R

To compute based on the first in each group use, well, first. This is valid if Baseline is always the first in each group.

df<- structure(list(
country = c("CAN", "CAN", "CAN", "CAN", "CAN",
"CAN", "CAN", "CAN", "CAN", "CAN", "CAN", "CAN", "CAN", "CAN",
"CAN", "NCOA", "NCOA", "NCOA", "NCOA", "NCOA"),
year = c("2020", "2020", "2020", "2020", "2020", "2025", "2025",
"2025", "2025", "2025", "2030", "2030", "2030", "2030", "2030",
"2020", "2020", "2020", "2020", "2020"),
scenario = c("Baseline", "BCA", "Full BCA", "NDC", "Partial BCA",
"Baseline", "BCA", "Full BCA", "NDC", "Partial BCA",
"Baseline", "BCA", "Full BCA", "NDC", "Partial BCA", "Baseline",
"BCA", "Full BCA", "NDC", "Partial BCA"),
value = c(50527.8708215592, 50487.4619290311, 50485.0924261504,
50489.4453487844, 50486.1975947164, 55845.9708589775, 55070.2745559464,
55133.107605613, 55153.4525662034, 55065.0036253937, 61463.2383809614,
59893.8712077455, 59971.8726308887, 59936.72156767, 59875.7762254252,
338418.917408225, 338420.617142445, 338428.007621131, 338419.514027857,
338427.263672463)), row.names = c(NA, -20L),
class = c("tbl_df", "tbl", "data.frame"))

suppressPackageStartupMessages(library(dplyr))

df %>%
group_by(country, year) %>%
mutate(growth = (value/first(value) - 1)*100)
#> # A tibble: 20 × 5
#> # Groups: country, year [4]
#> country year scenario value growth
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 CAN 2020 Baseline 50528. 0
#> 2 CAN 2020 BCA 50487. -0.0800
#> 3 CAN 2020 Full BCA 50485. -0.0847
#> 4 CAN 2020 NDC 50489. -0.0760
#> 5 CAN 2020 Partial BCA 50486. -0.0825
#> 6 CAN 2025 Baseline 55846. 0
#> 7 CAN 2025 BCA 55070. -1.39
#> 8 CAN 2025 Full BCA 55133. -1.28
#> 9 CAN 2025 NDC 55153. -1.24
#> 10 CAN 2025 Partial BCA 55065. -1.40
#> 11 CAN 2030 Baseline 61463. 0
#> 12 CAN 2030 BCA 59894. -2.55
#> 13 CAN 2030 Full BCA 59972. -2.43
#> 14 CAN 2030 NDC 59937. -2.48
#> 15 CAN 2030 Partial BCA 59876. -2.58
#> 16 NCOA 2020 Baseline 338419. 0
#> 17 NCOA 2020 BCA 338421. 0.000502
#> 18 NCOA 2020 Full BCA 338428. 0.00269
#> 19 NCOA 2020 NDC 338420. 0.000176
#> 20 NCOA 2020 Partial BCA 338427. 0.00247

Created on 2022-05-08 by the reprex package (v2.0.1)

How to calculate growth in percentage between rows in a Pandas DataFrame?

Use Series.pct_change with multiple by 100 with Series.mul and rounding by Series.round:

df['C'] = df.B.pct_change().mul(100).round(2)
print (df)
A B C
0 1 100 NaN
1 2 150 50.00
2 3 200 33.33

For percentage add map with format and processing with missing values with NaN != NaN:

df['C'] = df.B.pct_change().mul(100).round(2).map(lambda x: '{0:g}%'.format(x) if x==x else x)
print (df)
A B C
0 1 100 NaN
1 2 150 50%
2 3 200 33.33%

How to compute growth rate (1- and 3-year horizon) from panel data in R

@Dan Do you use any packages? I recommend you using zoo and data.table packages and transform dates in the following way:

DT[, YearNumeric := as.numeric(YEAR)]
DT[, PreviousYearLoanNet := .SD[match(YearNumeric - 1, .SD$YearNumeric), LOANS_NET], by=CODE]

Here, you create a column with previous (-1 year) loan values. Then you create a new column with growth:

DT[,Growth1Y:= (YEARLOANNET- PreviousYearLoanNet)/PreviousYearLoanNet]

And then you do whatever you want:) Cheers!

Quarterly Year over Year Growth Rate

Here's a very simple solution:

YearOverYear<-function (x,periodsPerYear){
if(NROW(x)<=periodsPerYear){
stop("too few rows")
}
else{
indexes<-1:(NROW(x)-periodsPerYear)
return(c(rep(NA,periodsPerYear),(x[indexes+periodsPerYear]-x[indexes])/x[indexes]))
}
}

> cbind(df,YoY=YearOverYear(df$value,4))
date value YoY
1 2000-01-01 1592 NA
2 2000-04-01 1825 NA
3 2000-07-01 1769 NA
4 2000-10-01 1909 NA
5 2001-01-01 2022 0.27010050
6 2001-04-01 2287 0.25315068
7 2001-07-01 2169 0.22611645
8 2001-10-01 2366 0.23939235
9 2002-01-01 2001 -0.01038576
10 2002-04-01 2087 -0.08745081
11 2002-07-01 2099 -0.03227294
12 2002-10-01 2258 -0.04564666

Finding Growth in Dataframe in R

Assuming that you have more years, dplyr handles it beautifully.

library(dplyr)
growth <- function(x)x/lag(x)-1
df %>%
group_by(Website) %>%
mutate_each(funs(growth), V1, V2)
# Website Year V1 V2
#1 Website 1 2013 NA NA
#2 Website 2 2013 NA NA
#3 Website 3 2013 NA NA
#4 Website 1 2014 1.0 2.0
#5 Website 2 2014 0.5 1.0
#6 Website 3 2014 0.4 0.5


Related Topics



Leave a reply



Submit