Implementation of Standard Recycling Rules

I've used this in the past:

expand_args <- function(...) {
  dots <- list(...)
  # Recycle every argument to the length of the longest one
  max_length <- max(lengths(dots))
  lapply(dots, rep, length.out = max_length)
}
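As a quick check of the helper above, here is a small sketch; the input vectors are made up for illustration. Each argument should come back recycled to the length of the longest one. Unlike R's arithmetic recycling, rep() with length.out gives no warning when that length is not a whole multiple of a shorter argument.

```r
expand_args <- function(...) {
  dots <- list(...)
  max_length <- max(lengths(dots))
  lapply(dots, rep, length.out = max_length)
}

# Illustrative inputs of lengths 4, 2 and 1
res <- expand_args(1:4, c("a", "b"), TRUE)

# Every element of the result now has length 4
str(res)
```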

data.table avoid recycling

Indexing past the end of a vector returns NA, so one can use out-of-range indices to pad the shorter vector:

library("data.table")

x <- c(1,2,3,4)
y <- c(8,9)
n <- max(length(x), length(y))

dt <- data.table(x = x[1:n], y = y[1:n])
# > dt
#    x  y
# 1: 1  8
# 2: 2  9
# 3: 3 NA
# 4: 4 NA

Or you can extend y by doing (as @Roland recommended in the comment):

length(y) <- length(x) <- max(length(x), length(y))
dt <- data.table(x, y)
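The same padding can be seen without data.table. A minimal sketch in base R, reusing the x and y from above: assigning a larger length to a vector fills the new positions with NA rather than recycling the existing values.

```r
x <- c(1, 2, 3, 4)
y <- c(8, 9)

# Growing a vector with `length<-` pads it with NA
length(y) <- length(x)
y

# A base data.frame now gets two equal-length columns, so nothing is recycled
df <- data.frame(x, y)
df
```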

When can mapply's MoreArgs argument not be replaced by R's vector recycling rules?

I've yet to see a case where identical functionality cannot be gained by simply deleting the MoreArgs= argument and letting the relevant parts pass to ....

This is wrong, but only slightly. Compare:

options(max.print = 50)  # Before running this, make sure that you know how to undo it.

> mapply(sum,1:5,MoreArgs=list(runif(10),runif(10000)))
[1] 5019.831 5020.831 5021.831 5022.831 5023.831

> mapply(sum,1:5,list(runif(10)),list(runif(10000)))
[1] 5069.321 5070.321 5071.321 5072.321 5073.321

> mapply(sum,1:5,list(runif(10),runif(10000)))
[1] 6.658275 4984.177882 8.658275 4986.177882 10.658275
Warning message:
In mapply(sum, 1:5, list(runif(10), runif(10000))) :
longer argument not a multiple of length of shorter

> mapply(sum,1:5,runif(10),runif(10000))
[1] 1.750417 3.286090 3.186474 5.310268 5.962829 1.343564 2.325567 3.928796 4.955376
[10] 5.507385 1.992290 3.454536 3.399763 5.242883 5.589296 1.637056 2.964259 3.839006
[19] 5.647123 5.883139 1.863512 2.827110 3.633137 5.174900 5.365155 2.022725 3.139846
[28] 3.830624 5.064546 5.697612 1.242803 3.456888 3.726114 5.271773 5.881724 1.533730
[37] 2.489976 3.509690 5.657166 5.400823 1.972689 2.858276 3.571505 5.582752 5.482381
[46] 1.956237 2.497409 3.864434 5.389969 5.965341
[ reached getOption("max.print") -- omitted 9950 entries ]

In the first case, every element of the list in the MoreArgs argument is recycled for each call. Similarly, the second case is recycling both runif(10) and runif(10000) for each call, which I'm confident is identical behavior; the totals differ only because runif draws fresh random numbers on every run. The fourth case only exists to show what we get if we're silly enough to not use any lists at all.

My above quoted claim is that the first and third cases should be identical. This is clearly not the case. If we try to use one list (rather than two, as our second case did) without MoreArgs, R's normal vector recycling rules will have us reuse the value of runif(10) for the first, third, and fifth calls, and use runif(10000) for the second and fourth, and give us a warning due to this odd behavior.

In conclusion, it still appears that the MoreArgs argument can always be replaced by R's vector recycling rules (despite my previous answer), but not in the exact way that I said in the question. The truth appears to be that MoreArgs=list(foo,bar,etc) is equivalent to using list(foo), list(bar), and list(etc) as ... arguments to mapply. Notice that this is not the same as using list(foo,bar,etc) as a ... argument. So, ultimately: you will not always get identical functionality from simply deleting MoreArgs= and passing the same list as a ... argument.

As a final minor detail: Omitting the MoreArgs argument is harmless, but omitting the ... argument and using MoreArgs instead gives unexpected output, usually an empty list.
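The comparison above can be made reproducible by swapping runif for fixed vectors. The following is a minimal sketch under that assumption; the vectors a and b and the result names are chosen here purely for illustration. It uses 1:4 rather than 1:5 so that the single-list variant runs without the length warning.

```r
a <- c(10, 20)
b <- c(100, 200, 300)

# MoreArgs passes a and b whole to every call
with_more_args <- mapply(sum, 1:4, MoreArgs = list(a, b))

# Wrapping each vector in its own length-1 list recycles it whole as well
wrapped <- mapply(sum, 1:4, list(a), list(b))

# A single length-2 list alternates between a and b instead
one_list <- mapply(sum, 1:4, list(a, b))

identical(with_more_args, wrapped)   # the two forms agree
identical(with_more_args, one_list)  # this one does not

# No ... arguments at all: there is nothing to iterate over
no_dots <- mapply(sum, MoreArgs = list(1:5))
length(no_dots)
```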

Preserve mask length when subsetting in R

This is because of recycling. If vectors are of different lengths, the shorter vector is recycled in the order it is specified.

Compare the following:

> v <- c(1, 2, 3, 4, 5)
> mask=c(F)
> v[mask]
numeric(0)

> mask=c(T)
> v[mask]
[1] 1 2 3 4 5

> mask=c(T, F, T, F)
> v[mask]
[1] 1 3 5

In the first example, F is recycled to length 5, so no values are returned; in the second example, T is recycled the same way, so every value is returned.

In the third example, 2 and 4 are omitted because they are indexed with F, but the mask is recycled, so its first T is reused for element 5.

Edit
The desired result being 1, 2, 3, 4?
Try mask <- c(T, T, T, T, F)

Recycling is also what allows statements like

v[v != 5]

because the 5 in the comparison is recycled across the whole vector, producing a logical mask the same length as v.
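To make the recycling explicit, here is a small sketch assuming v holds the values 1 to 5 as in the output above. It first reproduces what the subsetting effectively does, then pads the mask with FALSE so that its original length is respected. The padding line assumes the mask is no longer than v.

```r
v <- c(1, 2, 3, 4, 5)
mask <- c(TRUE, FALSE, TRUE, FALSE)

# What R effectively does: recycle the mask to the length of v
recycled <- rep_len(mask, length(v))
identical(v[mask], v[recycled])

# To keep the mask's own length, pad with FALSE instead of recycling
padded <- c(mask, rep(FALSE, length(v) - length(mask)))
v[padded]
```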

Is there a way I can recycle elements of the shorter list in purrr::map2 or purrr::walk2?

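You can put both lists in a data frame and let data.frame() repeat the shorter vector for you. Note that this only works when the longer length is a whole multiple of the shorter one; otherwise data.frame() raises an error: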

input <- data.frame(a = 1:3, b = 4:9)
purrr::map2(input$a, input$b, sum)
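When the lengths are not whole multiples of each other, one option is to recycle both inputs by hand with rep_len() before mapping. This is only a sketch: the vectors and the names a_rec and b_rec are made up for illustration, and base Map() stands in for purrr::map2() to keep the example free of package dependencies. The recycled vectors have equal length, so they could be passed to purrr::map2() in the same way.

```r
a <- 1:3
b <- 4:8   # length 5, not a multiple of 3

# Recycle both inputs to the length of the longer one
n <- max(length(a), length(b))
a_rec <- rep_len(a, n)
b_rec <- rep_len(b, n)

# Both inputs now have length n, so the pairwise call is unambiguous
res <- Map(sum, a_rec, b_rec)
unlist(res)
```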

rbind vectors of different length: pad with zero (or NA) instead of recycling

Use the following:

rbind(v1, v2=v2[seq_along(v1)])

   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
v1    1    2    3    4    8    5    3   11
v2    9    5    2   NA   NA   NA   NA   NA

Why it works:
Indexing a vector by a value larger than its length returns a value of NA at that index point.

# e.g.:
(1:3)[c(3, 5, 1)]
# [1]  3 NA  1

Thus, if you index the shorter one by the indices of the longer one, you will get all of the values of the shorter one followed by a series of NAs.


A generalization:

v <- list(v1, v2)
n <- max(lengths(v))
# same:
# n <- max(sapply(v, length))
do.call(rbind, lapply(v, `[`, seq_len(n)))
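For padding with zero rather than NA, one possible variation is sketched below. The vectors are taken from the printed output above, and pad_to() is a hypothetical helper introduced here, not part of base R. It appends the fill value directly instead of replacing NAs afterwards, so any genuine NA already present in the data is left alone.

```r
v1 <- c(1, 2, 3, 4, 8, 5, 3, 11)
v2 <- c(9, 5, 2)

# pad_to() is a hypothetical helper: append `fill` until x has length n
pad_to <- function(x, n, fill = NA) c(x, rep(fill, n - length(x)))

v <- list(v1 = v1, v2 = v2)
n <- max(lengths(v))

# One row per vector, short rows padded with 0
m <- do.call(rbind, lapply(v, pad_to, n = n, fill = 0))
m
```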

