R: How to get the percentage change from two different columns
You can also use the ddply
function from plyr
package:
library(plyr)
ddply(df, .(Carrier, Station), summarise,
PerentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats))
Carrier Station PerentChange
1 AAL BSB 0.01538462
2 AAL CNF -0.05319134
Applying percentage change between two columns, same row
The answer is below. It was a order issue with the syntax. Here is the working solution which applies a percentage change between two columns.
new.dataframe$close.prct.ema.10 <- apply(new.dataframe[,c('Close', 'ema.10')], 1, function(x) { (x[1]-x[2])/x[2] * 100 } )
(x[1]-x[2])/x[2] * 100
Note the brackets first around (x1-x2) / 2 *100
How can I calculate the percentage change within a group for multiple columns in R?
How about using pct <- function(x) x/lag(x)
? (or (x/lag(x)-1)*100
, or however you wish to specify pct change exactly)
e.g.,
pct(1:3)
[1] NA 2.0 1.5
Edit: Adding Frank's suggestion
pct <- function(x) {x/lag(x)}
dt %>% group_by(ID) %>% mutate_each(funs(pct), c(V1, V2, V3))
ID Date V1 V2 V3
1 Jan NA NA NA
1 Feb 1.500000 1.333333 1.2
1 Mar 2.333333 2.000000 1.5
2 Jan NA NA NA
2 Feb 2.000000 3.000000 4.0
2 Mar 3.500000 2.666667 2.0
How to calculate the percent change of a data frame column, then the next, and so on?
Here are a few different approaches. No packages are used.
1) Divide all but the first 2 columns by all but the first and last columns, subtract 1 and multiply by 100. Combine that with the original first column and NA times the original second column.
data.frame(DF[1], NA * DF[2], 100 * (DF[-(1:2)] / DF[-c(1, ncol(DF))] - 1))
giving:
Product Q1 Q2 Q3
1 X Product NA 1.959081 0.511280
2 Y Product NA -1.222760 2.452749
3 Z Product NA -5.249560 2.611586
1a) A variation of (1) that is even shorter is based on working in the log domain and then converting back:
data.frame(DF[1], NA * DF[2], 100 * t(exp(diff(t(log(DF[-1]))))-1))
giving:
Product Q1 Q2 Q3
1 X Product NA 1.959081 0.511280
2 Y Product NA -1.222760 2.452749
3 Z Product NA -5.249560 2.611586
2) Define a function percent
which calculates the percentages based on vector x
returning a vector the same length as x
filling in the first element with NA since there is no prior value for which to calculate its percent. Then apply that to each row noting that apply
will return the transpose of what we want so transpose it back.
percent <- function(x) 100 * c(NA * x[1], diff(x) / head(x, -1))
data.frame(DF[1], t(apply(DF[-1], 1, percent)))
giving:
Product Q1 Q2 Q3
1 X Product NA 1.959081 0.511280
2 Y Product NA -1.222760 2.452749
3 Z Product NA -5.249560 2.611586
Note: The input DF
in reproducible form was assumed to be:
DF <- structure(list(Product = structure(1:3, .Label = c("X Product",
"Y Product", "Z Product"), class = "factor"), Q1 = c(4.986184956,
2.86990877, 6.58413545), Q2 = c(5.083868356, 2.834816682, 6.238497279
), Q3 = c(5.109861156, 2.904347607, 6.40142101)), .Names = c("Product",
"Q1", "Q2", "Q3"), class = "data.frame", row.names = c(NA, -3L
))
Create variable for percentage change between two values based on group
If you use lag()
to define your rate, you will end up with "NA" for every b == 3, and the correct value for b == 7 when you group_by "b". Also, c()
is a Primitive function in R and it's best not to use "c" as the name of a variable.
Is this the outcome you're after?
library(tidyverse)
a <- c(3,7,3,7,3,7,3,3,7,3,7,3,7,7)
b <- c("a", "a", "b", "b", "c", "c", "d", "e", "e", "f","f", "g", "g", "h")
d <- c(80, 100, runif(12, min=80, max=100))
df <- data.frame(a,b,d)
df %>% group_by(b) %>%
mutate(rate = 100 * (d - lag(d, default = NA))/lag(d, default = NA))
#> # A tibble: 14 × 4
#> # Groups: b [8]
#> a b d rate
#> <dbl> <chr> <dbl> <dbl>
#> 1 3 a 80 NA
#> 2 7 a 100 25
#> 3 3 b 88.0 NA
#> 4 7 b 91.1 3.54
#> 5 3 c 95.1 NA
#> 6 7 c 82.7 -13.1
#> 7 3 d 92.6 NA
#> 8 3 e 84.1 NA
#> 9 7 e 91.8 9.20
#> 10 3 f 81.9 NA
#> 11 7 f 93.6 14.4
#> 12 3 g 88.7 NA
#> 13 7 g 80.6 -9.11
#> 14 7 h 99.2 NA
Created on 2021-12-20 by the reprex package (v2.0.1)
You can be more flexible with ifelse()
's too, e.g. if you want NA's for cases where you have a single group but zeros for cases where a == 3:
library(tidyverse)
a <- c(3,7,3,7,3,7,3,3,7,3,7,3,7,7)
b <- c("a", "a", "b", "b", "c", "c", "d", "e", "e", "f","f", "g", "g", "h")
d <- c(80, 100, runif(12, min=80, max=100))
df <- data.frame(a,b,d)
df %>% group_by(b) %>%
mutate(group_number = n()) %>%
mutate(rate = ifelse(group_number == 1, NA, ifelse(a == 7, 100 * (d - lag(d, default = NA))/lag(d, default = NA), 0))) %>%
select(-group_number) %>%
ungroup()
#> # A tibble: 14 × 4
#> a b d rate
#> <dbl> <chr> <dbl> <dbl>
#> 1 3 a 80 0
#> 2 7 a 100 25
#> 3 3 b 95.8 0
#> 4 7 b 83.9 -12.5
#> 5 3 c 87.0 0
#> 6 7 c 81.5 -6.26
#> 7 3 d 97.0 NA
#> 8 3 e 99.1 0
#> 9 7 e 82.6 -16.6
#> 10 3 f 82.3 0
#> 11 7 f 96.0 16.7
#> 12 3 g 99.5 0
#> 13 7 g 93.4 -6.09
#> 14 7 h 86.8 NA
Created on 2021-12-20 by the reprex package (v2.0.1)
Mutating across multiple columns to create percent score in R
You could do:
df %>%
mutate(across(starts_with('score'), ~ ./5 * 100, .names = 'percent_{col}'))
percentage difference between two dataframe columns (only numeric)
We can subset the numeric columns and then perform the operation
num_cols <- c("colB", "colC")
df3 <- (df2[num_cols] - df1[num_cols])/df2[num_cols] * 100
df3
# colB colC
#1 -100 20
#2 50 0
To get other non numeric columns we can use setdiff
and then cbind
cbind(df1[setdiff(names(df1), num_cols)], df3)
# colA colZ colB colC
#1 mean stringA -100 20
#2 count stringB 50 0
Related Topics
How to Remove Rows With Any Zero Value
Delete Rows That Exist in Another Data Frame
Count Occurrences of Value in a Set of Variables in R (Per Row)
Remove Specific Characters from Column Names in R
Selecting Multiple Odd or Even Columns/Rows for Dataframe
Use First Row Data as Column Names in R
How to Get Rowsums for Selected Columns in R
R Collapse Multiple Rows into 1 Row - Same Columns
How Split Column of List-Values into Multiple Columns
Adding a New Column Based Upon Values in Another Column Using Dplyr
Remove Quotes from a Character Vector in R
Duplicating Rows in R Merge Function
Extract Rows for the First Occurrence of a Variable in a Data Frame
Mean Per Group in a Data.Frame
Is R'S Apply Family More Than Syntactic Sugar
How to Trim Leading and Trailing White Space
How to Split Data into Training/Testing Sets Using Sample Function
Using R to Download Zipped Data File, Extract, and Import Data