Loops in R - Need to Use Index, Anyway to Avoid 'For'

What you offered computes the fractional variation; multiplying it by 100 gives the "percent variation":

# Percent change of each element relative to the previous one
pv <- numeric(length(x))
pv[1] <- 0  # the first element has no predecessor
pv[-1] <- 100 * (x[-1] - x[-length(x)]) / x[-length(x)]

This is a vectorized solution. (You should also note that for-loops are going to be just as slow as *apply solutions, just not as pretty. Always look for a vectorized approach.)

To explain a bit more: x[-length(x)] is the vector x[1:(length(x)-1)], and x[-1] is the vector x[2:length(x)]. The vector operations in R do the same arithmetic as your for-loop body, just without an explicit loop. R first computes the differences between those shifted vectors, x[-1] - x[-length(x)], and then divides by x[1:(length(x)-1)].
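
As a rough check of the explanation above, the sketch below runs the loop version and the shifted-vector version side by side. The input vector x is made up for illustration, and the diff()/head() variant is just another base R spelling of the same arithmetic, not something from the original answer.

```r
# Hypothetical input; any numeric vector of length >= 2 with non-zero values works
x <- c(100, 110, 99, 120)

# Explicit loop: compare each element with its predecessor
pv_loop <- numeric(length(x))
for (i in 2:length(x)) {
  pv_loop[i] <- 100 * (x[i] - x[i - 1]) / x[i - 1]
}

# Vectorized: x[-1] drops the first element, x[-length(x)] drops the last,
# so the two vectors line up as (current, previous) pairs
pv_vec <- numeric(length(x))
pv_vec[-1] <- 100 * (x[-1] - x[-length(x)]) / x[-length(x)]

# Same arithmetic using diff() and head()
pv_diff <- c(0, 100 * diff(x) / head(x, -1))

print(round(pv_vec, 2))
# 0 10 -10 21.21
```

All three vectors should agree up to floating-point tolerance, which all.equal() can confirm.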

For Loops: How to avoid loop when index is used to call multiple different column values?

You could also subset for rows where AA_FIRST == 1 and save it as a lookup table (similar to a dictionary in Python), and then match based on ID.

data <- data.frame(
  ID = c(1,1,5,5,5,5,5,6,6,6,6),
  DATA = c(0,0,0,0,1,0,0,0,0,1,0),
  OFFSET = c(-20,0,-1500, 150, 155, 159, 300, -2000, 30, 100, 120),
  AA_FIRST = c(NA, NA, NA, NA, 1, NA, NA, NA, NA, 1, NA),
  LABRESULT = c(4.0, 5.0, 3.5, 4.1, NA, 3.0, 5.5, 2.1, 2.5, NA, 3.5)
)

# Lookup table: ID and OFFSET of the rows where AA_FIRST == 1
dict <- subset(data, AA_FIRST == 1, select = c(ID, OFFSET))

# For each row, fetch the reference OFFSET of its ID (NA when the ID is not in dict)
data$refOFFSET <- dict$OFFSET[match(data$ID, dict$ID)]
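
To make the lookup step easier to follow, here is a minimal sketch on a smaller, made-up data frame (the name visits and its values are assumptions for illustration only). It shows the intermediate positions that match returns and how an ID with no AA_FIRST == 1 row ends up as NA.

```r
# Hypothetical, shortened data with the same column names as above
visits <- data.frame(
  ID       = c(1, 5, 5, 6, 6),
  OFFSET   = c(0, 150, 155, 30, 100),
  AA_FIRST = c(NA, NA, 1, NA, 1)
)

# which() drops the NA comparisons, leaving only the rows flagged with 1
dict <- visits[which(visits$AA_FIRST == 1), c("ID", "OFFSET")]

# Position of each row's ID inside the lookup table (NA if absent)
pos <- match(visits$ID, dict$ID)
print(pos)
# NA 1 1 2 2

# Index the lookup column by those positions
visits$refOFFSET <- dict$OFFSET[pos]
print(visits$refOFFSET)
# NA 155 155 100 100
```

Note that match returns only the first matching position, so this pattern assumes at most one AA_FIRST == 1 row per ID.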

Is there a function in R to avoid using loop when we look for all matching index for all element of a vector?

We can use match, a base R function that is very fast. Here, we are simply matching two columns of a dataset against each other, without joining or merging anything.

with(data, match(id1, id2))
#[1] 4 1 2 3
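
The data behind the output above is not shown, so the following is only a sketch with made-up columns id1 and id2 that illustrates what match returns: for each element of its first argument, the position of the first equal element in its second argument, or NA if there is none.

```r
# Hypothetical data frame; the values are chosen only for illustration
data <- data.frame(
  id1 = c(10, 20, 30, 40),
  id2 = c(20, 30, 40, 10)
)

# For each id1, where does it sit in id2?
idx <- with(data, match(id1, id2))
print(idx)
# 4 1 2 3

# Values that do not occur in the table give NA
missing_idx <- match(c(10, 99), data$id2)
print(missing_idx)
# 4 NA
```

Indexing id2 by idx reproduces id1, which is a quick way to sanity-check a lookup like this.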

To make this faster, use fmatch from the fastmatch package

library(fastmatch)
with(data, fmatch(id1, id2))

Benchmarks

set.seed(24)
data1 <- data.frame(id1 = sample(1e7), id2 = sample(1e7))

system.time(with(data1, match(id1, id2)))
# user system elapsed
# 1.635 0.079 1.691

system.time(with(data1, fmatch(id1, id2)))
# user system elapsed
# 1.155 0.062 1.195

library(data.table)  # needed for the data.table join below

system.time({
  data2 <- data.table(id = data1$id1)
  data3 <- data.table(id = data1$id2)
  data2[data3, idx := .I, on = .(id)]
})
# user system elapsed
# 2.306 0.051 2.353

Avoid for loops in R/ vectorize

We can use cumsum on the vector and index to remove the first two elements

b1 <- cumsum(a)[-(1:2)]
b1
#[1] 8 13 19 20 22 23 24 25

Or another option is Reduce

b1 <- Reduce(`+`, a, accumulate = TRUE)[-(1:2)]
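
The vector a is not shown in this answer, so the sketch below uses a made-up one to compare the two approaches with an explicit loop. It is meant only to illustrate that cumsum and Reduce with accumulate = TRUE produce the same running totals.

```r
# Hypothetical input vector (the original `a` is not shown)
a <- c(3, 1, 4, 1, 5)

# Running totals, then drop the first two
running  <- cumsum(a)         # 3 4 8 9 14
b_cumsum <- running[-(1:2)]   # 8 9 14

# Same result by folding `+` over the vector and keeping intermediate sums
b_reduce <- Reduce(`+`, a, accumulate = TRUE)[-(1:2)]

# Explicit loop, for comparison only
b_loop <- numeric(0)
total <- 0
for (i in seq_along(a)) {
  total <- total + a[i]
  if (i > 2) b_loop <- c(b_loop, total)
}

print(b_cumsum)
# 8 9 14
```

cumsum is usually the simpler choice for plain sums; Reduce is more general because it accepts any two-argument function.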

How to get index in a loop in R

You can do something like this, which is literally getting the i value.

names <- c("name1", "name2")
i <- 0
for (name in names) {
  i <- i + 1
  print(i)
}

Or change the loop to use a numeric index

names <- c("name1", "name2")
for (i in seq_along(names)) {  # seq_along() also behaves correctly when names is empty
  print(i)
}

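Or use the which function. Note that this rescans the whole vector on every iteration and returns more than one position if a value occurs several times.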

names <- c("name1", "name2")
for (name in names) {
  print(which(name == names))
}
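
As a small illustration of the numeric-index variant, the sketch below uses seq_along to get the index and the value together, and shows why it is generally preferred over 1:length(x) when the vector might be empty. The vector fruits and the other names are made up for this example.

```r
# Hypothetical vector to iterate over
fruits <- c("apple", "banana", "cherry")

# Loop with an index: i is the position, fruits[i] is the value
labels <- character(length(fruits))
for (i in seq_along(fruits)) {
  labels[i] <- paste0(i, ": ", fruits[i])
}

# The same result without a loop, since paste0() is vectorized
labels_vec <- paste0(seq_along(fruits), ": ", fruits)

# Edge case: with an empty vector, seq_along() yields no iterations,
# whereas 1:length(empty) would be c(1, 0) and run the body twice
empty <- character(0)
iterations <- 0
for (i in seq_along(empty)) {
  iterations <- iterations + 1
}

print(labels)
# "1: apple" "2: banana" "3: cherry"
```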

Avoiding FOR LOOPs, how can I incrementally-index from a list?

If you are open to a tidyverse approach, you could try

library(tidyverse)

mydf %>%
  mutate(Letter = deframe(map_dfr(mylst, tibble, .id = "name")[2:1])[Location])

This returns

  Location Letter
1      A10    A's
2      A10    A's
3      A11    A's
4     A11a    A's
5      A12    A's
6      B10    B's
7      B11    B's
8      B12    B's
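
The inputs mydf and mylst are not shown here. The base R sketch below assumes, based only on the code and output above, that mylst is a named list of location vectors and that mydf has a Location column; both objects are reconstructed for illustration and may differ from the originals. It builds the same kind of named lookup vector that deframe produces, without the tidyverse.

```r
# Assumed structure of the inputs (inferred, not taken from the question)
mylst <- list(
  "A's" = c("A10", "A11", "A11a", "A12"),
  "B's" = c("B10", "B11", "B12")
)
mydf <- data.frame(
  Location = c("A10", "A10", "A11", "A11a", "A12", "B10", "B11", "B12")
)

# Named lookup vector: names are locations, values are the list names
lookup <- setNames(
  rep(names(mylst), lengths(mylst)),
  unlist(mylst, use.names = FALSE)
)

# Index the lookup by location name; no loop needed
mydf$Letter <- unname(lookup[mydf$Location])

print(mydf$Letter)
# "A's" "A's" "A's" "A's" "A's" "B's" "B's" "B's"
```

A location that does not appear in any list element would get NA under this approach.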

How to use a loop in R with a non-numeric index

Here, i is the object name as a string, so we need get to extract the value of the object. Assuming that we are updating the original object, use assign:

for(i in c('dfa', 'dfb', 'dfpt')) assign(i, get(i)[, "col1"])
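
As an alternative to get and assign, one common pattern is to keep the data frames in a named list and loop over that instead. The sketch below assumes small made-up data frames called dfa and dfb with a col1 column; it is an illustration of the pattern, not the original poster's data.

```r
# Hypothetical data frames stored in a named list rather than as loose objects
dfs <- list(
  dfa = data.frame(col1 = 1:3, col2 = 4:6),
  dfb = data.frame(col1 = 7:9, col2 = 10:12)
)

# Loop over the names; each name is a string index into the list
cols_loop <- list()
for (nm in names(dfs)) {
  cols_loop[[nm]] <- dfs[[nm]][, "col1"]
}

# Equivalent without an explicit loop
cols_apply <- lapply(dfs, function(d) d[["col1"]])

print(cols_apply$dfb)
# 7 8 9
```

Keeping related objects in one list avoids looking names up in the environment and makes the result easy to pass on to lapply or similar functions.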

