Under what circumstances does R recycle?
Recycling works in your example:
> x <- seq(5)
> y <- seq(11)
> x+y
[1] 2 4 6 8 10 7 9 11 13 15 12
Warning message:
In x + y : longer object length is not a multiple of shorter object length
> v <- 2*x +y +1
Warning message:
In 2 * x + y :
longer object length is not a multiple of shorter object length
> v
[1] 4 7 10 13 16 9 12 15 18 21 14
The "error" that you reported is in fact a "warning", which means that R is notifying you that it is recycling but recycles anyway. You may have options(warn=2) turned on, which converts warnings into error messages.
In general, avoid relying on recycling. If you get in the habit of ignoring the warnings, some day it will bite you and your code will fail in some very hard to diagnose way.
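The difference between the warning and the error can be checked directly. The following is a minimal sketch that reuses x and y from the session above; the names res, outcome and clean are introduced here purely for illustration.

```r
x <- seq(5)
y <- seq(11)

# Default behaviour: the sum is still computed, only a warning is signalled.
res <- suppressWarnings(x + y)

# With warn = 2 the same expression is stopped by an error instead.
old <- options(warn = 2)
outcome <- tryCatch({
  x + y
  "completed"
}, error = function(e) "stopped with an error")
options(old)  # restore the previous warning level

# When the longer length is an exact multiple of the shorter one,
# recycling is silent even at warn = 2.
old <- options(warn = 2)
clean <- tryCatch(x + seq(10), error = function(e) NULL)
options(old)
```

Restoring the previous options afterwards keeps the stricter setting from leaking into the rest of the session.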
Implementation of standard recycling rules
I've used this in the past,
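expand_args <- function(...){
  # collect all arguments into a list
  dots <- list(...)
  # length of the longest argument
  max_length <- max(sapply(dots, length))
  # repeat every argument until it reaches that length
  lapply(dots, rep, length.out = max_length)
}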
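A short usage sketch, with the function repeated so the example runs on its own. The argument names a, b and c are hypothetical. Note that rep with length.out truncates silently, so unlike arithmetic recycling this helper gives no warning when a length is not a multiple of another.

```r
expand_args <- function(...) {
  dots <- list(...)
  max_length <- max(sapply(dots, length))
  lapply(dots, rep, length.out = max_length)
}

# Every element of the result has the length of the longest input (6 here).
expanded <- expand_args(a = 1:2, b = 1:6, c = 10)
lengths(expanded)

# Partial recycling happens without a warning: 1:4 becomes 1 2 3 4 1 2.
partial <- expand_args(short = 1:4, long = 1:6)
partial$short
```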
Vector recycling concept in R
You can use:
rep(c(2,4,6), 2) * rep(c(1,2), each=3)
#[1] 2 4 6 4 8 12
or with auto recycling:
c(2,4,6) * rep(c(1,2), each=3)
#[1] 2 4 6 4 8 12
Alternatively, outer could be used:
c(outer(c(2,4,6), c(1,2)))
#[1] 2 4 6 4 8 12
Also crossprod could be used:
c(crossprod(t(c(2,4,6)), c(1,2)))
#[1] 2 4 6 4 8 12
Or %*%:
c(c(2,4,6) %*% t(c(1,2)))
#[1] 2 4 6 4 8 12
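As a quick check that these approaches agree, the sketch below compares several of them on the same inputs. The variable names are introduced here only for illustration.

```r
a <- c(2, 4, 6)
b <- c(1, 2)

by_rep     <- rep(a, 2) * rep(b, each = 3)  # both sides expanded explicitly
by_recycle <- a * rep(b, each = 3)          # a is recycled automatically
by_outer   <- c(outer(a, b))                # 3 x 2 matrix flattened column-wise
by_matmul  <- c(a %*% t(b))                 # column vector times row vector

all.equal(by_rep, by_recycle)
all.equal(by_rep, by_outer)
all.equal(by_rep, by_matmul)
```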
R list: force items to recycle
You want ifelse:
mylist <- list(a = 3, b = c(2,8), c = c(1,4))
sumit<-function(v){x<-v$a + v$b + v$c}
x<-sumit(mylist)
sumit <- function(v){
  x <- v$a + v$b + v$c
  x <- ifelse(v$c == 1, x*100, x)
}
x<-sumit(mylist)
x
# [1] 600 15
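if expects a length-one logical condition (older versions of R used only the first element, with a warning, and R 4.2.0 and later raise an error for longer conditions), whereas ifelse works element-wise on vectors.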
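The contrast can be seen in a small sketch. The variable names are hypothetical, and the tryCatch wrapper is there only so the example runs on both older and newer R versions.

```r
cond <- c(TRUE, FALSE, TRUE)

# ifelse evaluates the test element by element and recycles yes/no as needed.
vec_result <- ifelse(cond, "yes", "no")

# A length-3 condition is not valid for if: depending on the R version
# this signals a warning (first element used) or an error.
if_result <- tryCatch(
  if (cond) "first element was TRUE" else "first element was FALSE",
  warning = function(w) "warning",
  error   = function(e) "error"
)
```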
Recycling and assignment functions (`split<-`)
The split gives the values dat[c(1,2,4)] and dat[c(3,5,6)] from the vector. The assignment is equivalent to dat[c(1,2,4)] <- 100 ; dat[c(3,5,6)] <- 300, and this is where the recycling takes place.
As for what happens, and why a vector assignment results, see page 21 of the language definition manual (http://cran.r-project.org/doc/manuals/R-lang.pdf). The call:
split(def, f) <- Z
Is interpreted as:
`*tmp*` <- def
def <- "split<-"(`*tmp*`, f, value=Z)
rm(`*tmp*`)
Note that split<-.default returns the modified vector.
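A minimal sketch of this equivalence follows. The data and the grouping vector f are assumptions, chosen so that the groups fall on positions c(1,2,4) and c(3,5,6) as in the description above.

```r
dat <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "a", "b", "a", "b", "b")  # group "a": 1,2,4; group "b": 3,5,6

# Replacement-function syntax
via_split <- dat
split(via_split, f) <- c(100, 300)

# The same thing written as an explicit call that returns the modified vector
via_call <- `split<-`(dat, f, value = c(100, 300))

# Equivalent index assignments, where each single value is recycled
# across its group
by_hand <- dat
by_hand[c(1, 2, 4)] <- 100
by_hand[c(3, 5, 6)] <- 300
```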
R matrix values recycling?
I increased n_times to 10000 and can find no evidence of recycling. That doesn't mean it isn't happening, but without a clear setup we are unable to reproduce the problem, so my suggestions here are unproven.
Option 1
Given that you found one such scenario that ends with all agents$state == "e", I'll suggest a trick that will always find at least one "s" (actually, one of each value that you know about):
out[k,] <- table(c("e", "s", agents$state)) - 1
I'm assuming that the only possible values are "e" and "s"; if there are others, this technique relies completely on the premise that we ensure every possible value is seen at least once, and then decrement everything. Since we "add one observation" for each possible value, subtracting one from the table is safe. With this trick, your check should then be
table(agents$state)
# e
# 100
table(c("e", "s", agents$state))
# e s
# 101 1
table(c("e", "s", agents$state)) - 1
# e s
# 100 0
And therefore recycling should not be a factor.
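To make the suspected failure mode concrete, here is a small sketch under the assumption that every agent has ended up in state "e". The vector state and the matrix out are stand-ins for agents$state and the results matrix in the question.

```r
state <- rep("e", 100)  # assumed end state: everyone exposed

out <- matrix(NA_real_, nrow = 2, ncol = 2,
              dimnames = list(NULL, c("e", "s")))

# A length-1 table is silently recycled across both columns of the row.
out[1, ] <- table(state)

# Padding with one observation per known value, then subtracting 1,
# always yields a length-2 result.
out[2, ] <- table(c("e", "s", state)) - 1

out
```

The first row shows 100 in both columns, which is the silent recycling; the second row shows the intended 100 and 0.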
Option 2
Another technique which is more robust (i.e., does not need to include all possible values) is to force the length, assuming we know with certainty what it should be (which I think we do here):
z <- table(agents$state)
z
# s
# 100
length(z) <- 2
z
# s
# 100 NA
Since you "know" that the length should always be 2, you can hard-code the 2 in there.
Option 3
This method is even a little more robust in that you don't need to know the absolute length; all the results will be extended to the length of the longest return.
First, reproducible sample data:
set.seed(2021)
agents <- data.frame(agent_no = 1,
                     state = "e",
                     mixing = runif(1,0,1))
# specify agent population
pop_size <- 100
# fill agent data
for(i in 2:pop_size){
  agent <- data.frame(agent_no = i,
                      state = "s",
                      mixing = runif(1,0,1))
  agents <- rbind(agents, agent)
}
head(agents)
# agent_no state mixing
# 1 1 e 0.4512674
# 2 2 s 0.7837798
# 3 3 s 0.7096822
# 4 4 s 0.3817443
# 5 5 s 0.6363238
# 6 6 s 0.7013460
Replace your for loop:
for (k in 1:n_times) {
}
with
out <- lapply(seq_len(n_times), function(k) {
  for(i in 1:pop_size){
    # likelihood to meet others
    likelihood <- agents$mixing[i]
    # how many agents will they meet (integer). Add 1 to make sure everybody meets somebody
    connect_with <- round(likelihood * 3, 0) + 1
    # which agents will they probably meet (list of agents)
    which_others <- sample(1:pop_size,
                           connect_with,
                           replace = TRUE,
                           prob = agents$mixing)
    for(j in 1:length(which_others)){
      contacts <- agents[which_others[j],]
      # if exposed, change state
      if(contacts$state == "e"){
        urand <- runif(1,0,1)
        # control probability of state change
        if(urand < 0.5){
          agents$state[i] <- "e"
        }
      }
    }
  }
  table(agents$state)
})
At this point, you have a list, likely of length-2 vectors:
out[1:3]
# [[1]]
# e s
# 1 99
# [[2]]
# e s
# 2 98
# [[3]]
# e s
# 3 97
Note that we can determine the length of all of them with
lengths(out)
# [1] 2 2 2 2 2 2 2 2 2 2
Similar to option 2 where we force the length of a vector, we can do the same here:
maxlen <- max(lengths(out))
out <- lapply(out, `length<-`, maxlen)
## or more verbosely
out <- lapply(out, function(vec) { length(vec) <- maxlen; vec; })
You can confirm that they are all the same length with table(lengths(out)), which should show a single length, 2, with a count equal to n_times (10 here).
From here, we can combine all of these vectors into a matrix with
out <- do.call(rbind, out)
out
# e s
# [1,] 1 99
# [2,] 2 98
# [3,] 3 97
# [4,] 2 98
# [5,] 1 99
# [6,] 20 80
# [7,] 12 88
# [8,] 1 99
# [9,] 2 98
# [10,] 1 99
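The pad-and-bind step can be tried in isolation. The sketch below uses a tiny hypothetical list of tables instead of the simulation output, and replaces the padded NA with 0. One assumption worth stating: the padding is positional, so it only lines up correctly when the missing category is the last one (here "s" missing while "e" is present).

```r
# Hypothetical per-iteration results; the second one has lost the "s" category.
results <- list(
  table(c("e", "s", "s")),
  table(c("e", "e", "e"))
)

maxlen <- max(lengths(results))
padded <- lapply(results, `length<-`, maxlen)  # shorter tables are padded with NA

mat <- do.call(rbind, padded)
mat[is.na(mat)] <- 0  # a missing category means a count of zero
mat
```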
Why is outer recycling a vector that should go unused and not throwing a warning?
This is expected behaviour based on R's recycling rules. It has nothing to do with outer as such, though it might be a surprise if you think outer is somehow applying a function across margins.
Instead, outer takes two vectors X and Y as its first two arguments. It takes X and replicates it (with rep) length(Y) times. Similarly, it takes Y and replicates it length(X) times. Then it just runs your function FUN on these two long vectors, passing the long X as the first argument and the long Y as the second argument. Any other arguments to FUN have to be passed directly as arguments to outer via ... (as you have done with c = 1:3).
The result is a single long vector which is turned into a matrix by writing its dim attribute as the original values of length(X) by length(Y).
Now, in the specific example you gave, X has 5 elements (1:5) and Y has 6 (5:10). Therefore your anonymous function is called on two length-30 vectors and a single length-3 vector. R's recycling rules dictate that if the recycled vector fits neatly into the longer vector without partial recycling, no warning is emitted.
To see this, take your anonymous function and try it outside outer with two length-30 vectors and one length-3 vector:
f <- function(a, b, c) 10*a + 100*b + 1000*c
f(1:30, 1:30, 1:3)
#> [1] 1110 2220 3330 1440 2550 3660 1770 2880 3990 2100 3210 4320 2430
#> [14] 3540 4650 2760 3870 4980 3090 4200 5310 3420 4530 5640 3750 4860
#> [27] 5970 4080 5190 6300
3 recycles nicely into 30, so there is no warning.
Conversely, if the product of the length of the two vectors you pass to outer is not a multiple of 3, you will get a warning:
outer(1:5,6:10,c=1:3,function(a,b,c) 10*a + 100*b + 1000*c)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1610 3710 2810 1910 4010
#> [2,] 2620 1720 3820 2920 2020
#> [3,] 3630 2730 1830 3930 3030
#> [4,] 1640 3740 2840 1940 4040
#> [5,] 2650 1750 3850 2950 2050
#> Warning message:
#> In 10 * a + 100 * b + 1000 * c :
#> longer object length is not a multiple of shorter object length
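The mechanism described above can be reproduced by hand. This is only a sketch of the idea, not the actual source of outer; long_X, long_Y and manual are names introduced for illustration, and X and Y are the 1:5 and 5:10 vectors from the original question.

```r
f <- function(a, b, c) 10*a + 100*b + 1000*c

X <- 1:5
Y <- 5:10

# Expand both inputs to length(X) * length(Y) = 30 elements.
long_X <- rep(X, times = length(Y))  # 1..5 repeated six times
long_Y <- rep(Y, each = length(X))   # each Y value repeated five times

# One vectorised call; c = 1:3 is recycled ten times, so no warning.
manual <- f(long_X, long_Y, c = 1:3)
dim(manual) <- c(length(X), length(Y))

all.equal(manual, outer(X, Y, f, c = 1:3))
```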
Vector recycling is not working when assigning to data.frame
It is because 2000 is not divisible by 7. Partial recycling doesn't work for data frame columns:
d <- data.frame(x=1:10)
d$x <- 1
d$x <- 1:2
d$x <- 1:3
# Error in `$<-.data.frame`(`*tmp*`, "x", value = 1:3) :
# replacement has 3 rows, data has 10
From the relevant help text ?`[<-.data.frame`, in the Arguments section:
"value: A suitable replacement value: it will be repeated a whole number of times if necessary"
Partial recycling works for vectors though:
x <- d$x
x[] <- 1:3
# Warning message:
# In x[] <- 1:3 :
# number of items to replace is not a multiple of replacement length
x
# [1] 1 2 3 1 2 3 1 2 3 1
You can do the assignment to your data frame similarly (if you're sure it's what you want to do):
d$x[] <- 1:3
# Warning message:
# In d$x[] <- 1:3 :
# number of items to replace is not a multiple of replacement length
d
# x
# 1 1
# 2 2
# 3 3
# 4 1
# 5 2
# 6 3
# 7 1
# 8 2
# 9 3
# 10 1
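If partial recycling really is intended, one alternative (not part of the answer above, so treat it as a suggestion) is to build the full-length column explicitly with rep_len, which states the intent and produces no warning. The sketch also captures the error from the failing assignment so it can be run end to end.

```r
d <- data.frame(x = 1:10)

# Whole-number recycling is accepted: 10 is a multiple of 2.
d$x <- 1:2
whole <- d$x

# 10 is not a multiple of 3, so this assignment fails; capture the message.
msg <- tryCatch({
  d$x <- 1:3
  "no error"
}, error = function(e) conditionMessage(e))

# Explicitly extend the replacement to the number of rows instead.
d$x <- rep_len(1:3, nrow(d))
d$x
```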