Loops in R - Need to use index, anyway to avoid 'for'?
What you offered is the fractional variation; multiply it by 100 to get the "percent variation":
pv<- vector("numeric",length(x))
pv[1] <- 0
pv[-1] <- 100* ( x[-1] - x[-length(x)] )/ x[-length(x)]
This is a vectorized solution. (Note that for-loops are going to be just as slow as *apply solutions, just not as pretty. Always look for a vectorized approach.)
To explain a bit more: x[-length(x)] is the vector x[1:(length(x)-1)], and x[-1] is the vector x[2:length(x)]. The vector operations in R do the same work as your for-loop body, just without an explicit loop. R first constructs the differences of those shifted vectors, x[-1] - x[-length(x)], and then divides by x[-length(x)], i.e. by x[1:(length(x)-1)].
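As a quick check of the shifted-vector arithmetic, here is a minimal sketch on made-up numbers (the values of x are an assumption for illustration, not the asker's data). It also shows base R's diff(), which computes the same lag-1 differences.

```r
# Made-up input, for illustration only
x <- c(100, 110, 99)

# Shifted-vector version: the first element has no predecessor, so it stays 0
pv <- numeric(length(x))
pv[-1] <- 100 * (x[-1] - x[-length(x)]) / x[-length(x)]

# Same computation using diff() for the numerator
pv_diff <- c(0, 100 * diff(x) / head(x, -1))

print(pv)
print(isTRUE(all.equal(pv, pv_diff)))
```

Both versions avoid an explicit loop; diff() is simply a shorthand for the subtraction of the two shifted vectors.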
For Loops: How to avoid loop when index is used to call multiple different column values?
You could also subset for rows where AA_FIRST == 1
and save it as a lookup table (similar to a dictionary in Python), and then match based on ID
.
data<- data.frame(
ID = c(1,1,5,5,5,5,5,6,6,6,6),
DATA = c(0,0,0,0,1,0,0,0,0,1,0),
OFFSET = c(-20,0,-1500, 150, 155, 159, 300, -2000, 30, 100, 120),
AA_FIRST = c(NA, NA, NA, NA, 1, NA, NA, NA, NA, 1, NA),
LABRESULT = c(4.0, 5.0, 3.5, 4.1, NA, 3.0, 5.5, 2.1, 2.5, NA, 3.5) )
dict <- subset(data, data$AA_FIRST==1)[c("ID", "OFFSET")]
data$refOFFSET <- dict[match(data$ID, dict$ID), 2]
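To see what the lookup produces, here is a small sketch of the same idea on a hypothetical toy data frame (df and ref are illustrative names, and the values are made up). IDs with no AA_FIRST == 1 row have no entry in the lookup table, so match returns NA for them.

```r
# Hypothetical toy data, smaller than the data frame above
df <- data.frame(
  ID       = c(1, 5, 5, 6, 6),
  OFFSET   = c(-20, 150, 155, 30, 100),
  AA_FIRST = c(NA, NA, 1, NA, 1)
)

# Lookup table: one reference OFFSET per ID (which() drops the NA rows)
ref <- df[which(df$AA_FIRST == 1), c("ID", "OFFSET")]

# match() returns, for each df$ID, its row position in ref (NA if absent)
df$refOFFSET <- ref$OFFSET[match(df$ID, ref$ID)]
print(df$refOFFSET)
```

This assumes at most one AA_FIRST == 1 row per ID; with several, match picks the first.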
Is there a function in R to avoid using loop when we look for all matching index for all element of a vector?
We can use match, which is a very fast base R function. Here, we are just matching two columns of a dataset, without needing to join the two datasets together
with(data, match(id1, id2))
#[1] 4 1 2 3
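The data used above is not shown in the answer, so here is a minimal sketch with assumed values (the contents of id1 and id2 are hypothetical). For each element of id1, match returns the position of its first occurrence in id2.

```r
# Assumed example columns; the original data is not shown
data <- data.frame(
  id1 = c("d", "a", "b", "c"),
  id2 = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Position in id2 of each element of id1
idx <- with(data, match(id1, id2))
print(idx)
```

Elements of id1 that do not occur in id2 would come back as NA.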
To make this faster, use fmatch from the fastmatch package
library(fastmatch)
with(data, fmatch(id1, id2))
Benchmarks
set.seed(24)
data1 <- data.frame(id1 = sample(1e7), id2 = sample(1e7))
system.time(with(data1, match(id1, id2)))
# user system elapsed
# 1.635 0.079 1.691
system.time(with(data1, fmatch(id1, id2)))
# user system elapsed
# 1.155 0.062 1.195
library(data.table)
system.time({
data2 <- data.table(id = data1$id1)
data3 <- data.table(id = data1$id2)
data2[data3, idx := .I, on = .(id)]
})
# user system elapsed
# 2.306 0.051 2.353
Avoid for loops in R/ vectorize
We can use cumsum
on the vector
and index to remove the first two elements
b1 <- cumsum(a)[-(1:2)]
b1
#[1] 8 13 19 20 22 23 24 25
Or another option is Reduce
b1 <- Reduce(`+`, a, accumulate = TRUE)[-(1:2)]
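The vector a is not shown in the answer, so the following sketch uses an assumed input to show that both approaches give the same running sums once the first two elements are dropped.

```r
# Assumed input; the original `a` is not shown
a <- c(1, 2, 3, 4)

# Running sum, then drop the first two elements
b_cumsum <- cumsum(a)[-(1:2)]

# Same result by accumulating `+` step by step
b_reduce <- Reduce(`+`, a, accumulate = TRUE)[-(1:2)]

print(b_cumsum)
print(b_reduce)
```

cumsum is the more direct choice for sums; Reduce with accumulate = TRUE generalizes to any binary function.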
How to get index in a loop in R
You can do something like this, which is literally getting the i value.
names <- c("name1", "name2")
i<-0
for(name in names){
i<-i+1
print(i)
}
Or change the loop to use a numeric index
names <- c("name1", "name2")
for(i in seq_along(names)){
print(i)
}
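Or use the which function. Note that this assumes the values are unique; for a duplicated value, which returns every matching position.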
names <- c("name1", "name2")
for(name in names){
print(which(name == names))
}
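Two edge cases are worth checking when looping by index. This is a small sketch with made-up vectors (empty and dup are hypothetical names): an empty vector, where 1:length() counts backwards, and duplicated values, where which returns more than one position.

```r
names <- c("name1", "name2")

# Index loop that also has access to the element
for (i in seq_along(names)) {
  cat(i, names[i], "\n")
}

# Empty input: 1:length() yields c(1, 0), so the loop body would run twice
empty <- character(0)
idx_colon <- 1:length(empty)
idx_safe  <- seq_along(empty)   # integer(0): the loop body never runs

# Duplicates: which() returns every matching position
dup <- c("a", "b", "a")
pos <- which(dup == "a")
print(pos)
```

seq_along is therefore the safer default whenever the vector might be empty.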
Avoiding FOR LOOPs, how can I incrementally-index from a list?
If you are open to a tidyverse
approach, you could try
library(tidyverse)
mydf %>%
mutate(Letter = deframe(map_dfr(mylst, tibble, .id = "name")[2:1])[Location])
This returns
  Location Letter
1      A10    A's
2      A10    A's
3      A11    A's
4     A11a    A's
5      A12    A's
6      B10    B's
7      B11    B's
8      B12    B's
how to use a loop in R with a non-numeric index
Here i is the object name as a string, so we need get() to extract the value of the object. Assuming that we are updating the original object, use assign()
for(i in c('dfa', 'dfb', 'dfpt')) assign(i, get(i)[, "col1"])
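As an alternative to get/assign, the objects can be kept in a named list and processed with lapply. This is only a sketch under assumptions: the data frames dfa and dfb below are made up, and whether a list fits depends on how the rest of the code uses these objects.

```r
# Made-up data frames standing in for the originals
dfa <- data.frame(col1 = 1:2, col2 = 3:4)
dfb <- data.frame(col1 = 5:6, col2 = 7:8)

# Collect them in a named list instead of looking names up in the environment
dfs <- list(dfa = dfa, dfb = dfb)

# Extract col1 from each element; names are preserved
cols <- lapply(dfs, function(d) d[, "col1"])
print(cols$dfb)
```

Keeping related objects in a list avoids modifying the global environment by name and makes the loop index (the list name) available through names(dfs).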