Implementation of standard recycling rules
I've used this in the past:
expand_args <- function(...){
  dots <- list(...)
  max_length <- max(sapply(dots, length))
  lapply(dots, rep, length.out = max_length)
}
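As a quick check of the helper above, the sketch below calls it with three inputs of different lengths (the example values are made up for illustration) and confirms that every element comes back at the longest length.

```r
# Helper from above, repeated so the snippet runs on its own.
expand_args <- function(...) {
  dots <- list(...)
  max_length <- max(sapply(dots, length))
  lapply(dots, rep, length.out = max_length)
}

# Illustrative inputs of lengths 4, 2 and 1.
res <- expand_args(1:4, c("a", "b"), TRUE)

lengths(res)  # every element now has length 4
res[[2]]      # "a" "b" "a" "b"
```

Note that rep() with length.out does not warn when the longest length is not a multiple of a shorter one, so this helper is quieter than R's built-in arithmetic recycling.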
data.table avoid recycling
One can use out-of-range indices, which return NA rather than recycled values:
library("data.table")
x <- c(1,2,3,4)
y <- c(8,9)
n <- max(length(x), length(y))
dt <- data.table(x = x[1:n], y = y[1:n])
# > dt
# x y
# 1: 1 8
# 2: 2 9
# 3: 3 NA
# 4: 4 NA
Or you can extend the shorter vector, padding it with NA (as @Roland recommended in a comment):
length(y) <- length(x) <- max(length(x), length(y))
dt <- data.table(x, y)
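The same padding can be seen without data.table. A minimal sketch in base R, reusing the x and y from above: assigning a longer length to a vector fills the new positions with NA rather than recycling, and out-of-range indexing gives the same result.

```r
x <- c(1, 2, 3, 4)
y <- c(8, 9)

# Growing a vector with `length<-` pads it with NA instead of recycling.
length(y) <- length(x)
y  # 8 9 NA NA

# Out-of-range indexing produces the same padding.
y2 <- c(8, 9)[seq_along(x)]
identical(y, y2)
```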
When can mapply's MoreArgs argument not be replaced by R's vector recycling rules?
I've yet to see a case where identical functionality cannot be gained by simply deleting the MoreArgs= argument and letting the relevant parts pass to "...".
This is wrong, but only slightly. Compare:
options(max.print = 50)  # Before running this, make sure that you know how to undo it.
> mapply(sum,1:5,MoreArgs=list(runif(10),runif(10000)))
[1] 5019.831 5020.831 5021.831 5022.831 5023.831
> mapply(sum,1:5,list(runif(10)),list(runif(10000)))
[1] 5069.321 5070.321 5071.321 5072.321 5073.321
> mapply(sum,1:5,list(runif(10),runif(10000)))
[1] 6.658275 4984.177882 8.658275 4986.177882 10.658275
> mapply(sum,1:5,runif(10),runif(10000))
[1] 1.750417 3.286090 3.186474 5.310268 5.962829 1.343564 2.325567 3.928796 4.955376
[10] 5.507385 1.992290 3.454536 3.399763 5.242883 5.589296 1.637056 2.964259 3.839006
[19] 5.647123 5.883139 1.863512 2.827110 3.633137 5.174900 5.365155 2.022725 3.139846
[28] 3.830624 5.064546 5.697612 1.242803 3.456888 3.726114 5.271773 5.881724 1.533730
[37] 2.489976 3.509690 5.657166 5.400823 1.972689 2.858276 3.571505 5.582752 5.482381
[46] 1.956237 2.497409 3.864434 5.389969 5.965341
[ reached getOption("max.print") -- omitted 9950 entries ]
Warning message:
In mapply(sum, 1:5, list(runif(10), runif(10000))) :
longer argument not a multiple of length of shorter
In the first case, every element of the list in the MoreArgs argument is recycled for each call. Similarly, the second case is recycling both runif(10) and runif(10000) for each call, giving behavior that I'm confident in calling identical. The fourth case exists only to show what we get if we're silly enough to not use any lists at all.
The claim quoted above is that the first and third cases should be identical. This is clearly not the case. If we try to use one list (rather than two, as our second case did) without MoreArgs, R's normal vector recycling rules will have us reuse the value of runif(10) for the first, third, and fifth calls, and use runif(10000) for the second and fourth, and give us a warning because 5 is not a multiple of 2.
In conclusion, it still appears that the MoreArgs argument can always be replaced by R's vector recycling rules (despite my previous answer), but not in the exact way that I said in the question. The truth appears to be that MoreArgs=list(foo,bar,etc) is equivalent to using list(foo), list(bar), and list(etc) as ... arguments to mapply. Notice that this is not the same as using list(foo,bar,etc) as a ... argument. So, ultimately: simply deleting MoreArgs= will not always give identical functionality.
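The equivalence described above is easier to check with fixed inputs than with runif(). A minimal sketch, where foo and bar are made-up vectors used only for illustration:

```r
foo <- c(10, 20)         # hypothetical fixed inputs
bar <- c(100, 200, 300)

# MoreArgs: foo and bar are passed whole to every call.
a <- mapply(sum, 1:5, MoreArgs = list(foo, bar))

# Same thing through recycling: each one-element list is recycled 5 times.
b <- mapply(sum, 1:5, list(foo), list(bar))

# Not the same: a two-element list alternates foo and bar across calls
# (and warns, because 5 is not a multiple of 2).
d <- suppressWarnings(mapply(sum, 1:5, list(foo, bar)))

identical(a, b)  # the first two forms agree
d                # the third form alternates between sum(i, foo) and sum(i, bar)
```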
As a final minor detail: omitting the MoreArgs argument is harmless, but omitting the ... argument and using MoreArgs instead gives unexpected output, usually an empty list.
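A small sketch of that last point, with a made-up input: when nothing is passed through ..., mapply has nothing to iterate over, so the function is never called.

```r
# No ... arguments: there is nothing to iterate over, so sum() is never called.
res <- mapply(sum, MoreArgs = list(1:5))
length(res)  # 0

# Passing the same values through ... calls sum() once per element.
per_element <- mapply(sum, 1:5)
```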
Preserve mask length when subsetting in R
This is because of recycling. When a logical mask is shorter than the vector it indexes, R repeats the mask's elements, in order, until it covers the whole vector.
Compare the following:
> v <- 1:5  # assumed; consistent with the output below
> mask=c(F)
> v[mask]
numeric(0)
> mask=c(T)
> v[mask]
[1] 1 2 3 4 5
> mask=c(T, F, T, F)
> v[mask]
[1] 1 3 5
In the first example, F is recycled 5 times, so no values are printed; the opposite happens in the second example.
In the third example, 2 and 4 are omitted because they are indexed with F, but the mask is recycled to give a T for element 5.
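The recycling in the third example can be made explicit. A minimal sketch, assuming v is 1:5 as the printed output suggests: rep_len() expands the mask the same way indexing does.

```r
v <- 1:5                              # assumed from the output above
mask <- c(TRUE, FALSE, TRUE, FALSE)

# What R effectively does: repeat the mask until it covers all of v.
full_mask <- rep_len(mask, length(v))
full_mask  # TRUE FALSE TRUE FALSE TRUE

# Indexing with the short mask and with the expanded mask agree.
identical(v[mask], v[full_mask])
```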
Edit
The desired result being 1, 2, 3, 4?
Try mask <- c(T, T, T, T, F)
Recycling is also what allows statements like v[v != 5], because the 5 in the comparison is recycled over the whole vector.
is there a way I can recycle elements of the shorter list in purrr:: map2 or purrr::walk2?
You can put both inputs in a data frame and let data.frame() recycle the shorter one:
input <- data.frame(a = 1:3, b = 4:9)
purrr::map2(input$a, input$b, sum)
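If the longer length is not a multiple of the shorter one, data.frame() will refuse to recycle. One alternative sketch is to recycle explicitly with rep_len() before mapping. Base mapply() is used here so the snippet has no dependencies; the equal-length vectors could be handed to purrr::map2() in the same way. The inputs are made up for illustration.

```r
a <- 1:3
b <- 4:8   # length 5, not a multiple of 3

# Recycle both inputs to the longer length by hand.
n <- max(length(a), length(b))
a_full <- rep_len(a, n)   # 1 2 3 1 2
b_full <- rep_len(b, n)   # unchanged, already length 5

# Equal lengths now, so element-wise mapping is unambiguous.
res <- mapply(sum, a_full, b_full)
```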
rbind vectors of different length: pad with zero (or NA) instead of recycling
Use the following:
rbind(v1, v2=v2[seq(v1)])
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
v1 1 2 3 4 8 5 3 11
v2 9 5 2 NA NA NA NA NA
Why it works:
Indexing a vector by a value larger than its length returns a value of NA at that index point.
#eg:
{1:3}[c(3,5,1)]
#[1] 3 NA 1
Thus, if you index the shorter one by the indices of the longer one, you will get all of the values of the shorter one plus a series of NAs.
A generalization:
v <- list(v1, v2)
n <- max(lengths(v))
# same:
# n <- max(sapply(v, length))
do.call(rbind, lapply(v, `[`, seq_len(n)))
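Wrapped as a function, the generalization might look like the sketch below. pad_rbind is a hypothetical name, and v1 and v2 are reconstructed from the printed output above.

```r
# Hypothetical helper: row-bind vectors, padding the short ones with NA.
pad_rbind <- function(...) {
  v <- list(...)
  n <- max(lengths(v))
  # Indexing past the end of a vector yields NA, which does the padding.
  do.call(rbind, lapply(v, `[`, seq_len(n)))
}

v1 <- c(1, 2, 3, 4, 8, 5, 3, 11)   # values taken from the output above
v2 <- c(9, 5, 2)

m <- pad_rbind(v1 = v1, v2 = v2)
dim(m)  # 2 8
```

Naming the arguments in the call carries through to the row names of the result.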