What are examples of when seq_along works, but seq produces unintended results?
This should make the difference clear. Basically, seq() acts like seq_along() except when passed a single number (a numeric vector of length 1), in which case it acts like seq_len(). If this ever bites you once, you'll never use seq() again!
a <- c(8, 9, 10)
b <- c(9, 10)
c <- 10
seq_along(a)
# [1] 1 2 3
seq_along(b)
# [1] 1 2
seq_along(c)
# [1] 1
seq(a)
# [1] 1 2 3
seq(b)
# [1] 1 2
seq(c)
# [1] 1 2 3 4 5 6 7 8 9 10
It's probably worth noting that sample() exhibits similarly crummy behavior:
sample(a)
# [1] 10 8 9
sample(b)
# [1] 9 10
sample(c)
# [1] 8 7 9 3 4 1 6 10 2 5
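One way to guard against the length-one surprise is to sample positions rather than values. The helper below follows the resample() pattern shown in the examples of R's ?sample help page; note that `resample` is not a base R function and has to be defined by you.

```r
# Sample positions, then subset, so that a length-one numeric is never
# reinterpreted as "sample from 1:x".
resample <- function(x, ...) x[sample.int(length(x), ...)]

single <- 10
resample(single)     # always returns 10, never a permutation of 1:10
resample(c(9, 10))   # a permutation of 9 and 10
resample(numeric(0)) # numeric(0): the empty case is handled as well
```

The same idea applies to seq(): asking for positions explicitly with seq_along() avoids having the value itself treated as a length.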
Using seq_along() to handle the empty case
With df <- data.frame(), we have:
Case 1 falls victim to:
Error in .subset2(x, i, exact = exact) : subscript out of bounds
while Cases 2 and 3 do not raise an error.
In essence, the error in Case 1 is due to ncol(df) being 0. This turns the sequence 1:ncol(df) into 1:0, which creates the vector c(1, 0). The for loop therefore starts with the first element of that vector, 1, and tries to access column 1, which does not exist. Hence, the subscript is found to be out of bounds.
Meanwhile, in Cases 2 and 3 the body of the for loop is never executed, because the collections being iterated over are empty: they have a length of 0.
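The three loops are not reproduced in this answer, so the following is only a sketch of what they could look like, assuming Case 1 iterates over 1:ncol(df), Case 2 over seq_along(df), and Case 3 over the columns directly. The tryCatch() wrapper and the name `case1` are additions for illustration, so that the script keeps running past the error.

```r
df <- data.frame()  # empty: zero rows, zero columns

# Case 1 (assumed form): 1:ncol(df) is 1:0, so the body runs with i = 1
case1 <- tryCatch({
  for (i in 1:ncol(df)) print(median(df[[i]]))
  "no error"
}, error = function(e) conditionMessage(e))

# Case 2 (assumed form): seq_along(df) is integer(0), so the body never runs
for (i in seq_along(df)) print(median(df[[i]]))

# Case 3 (assumed form): iterate over the columns, of which there are none
for (col in df) print(median(col))

case1  # holds the error message raised by Case 1
```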
As this question is specifically about what seq_along() is doing, let's take a traditional seq_along() example by constructing a full vector a and looking at the result:
set.seed(111)
a <- runif(5)
seq_along(a)
#[1] 1 2 3 4 5
In essence, for each element of the vector a, seq_along() creates a corresponding index that can be used to access it.
If we now apply seq_along() to the empty df from the case above, we get:
seq_along(df)
# integer(0)
Thus, what was created is a zero-length vector. It's mighty hard to move along a zero-length vector.
Ergo, Case 1 protects poorly against the empty case.
Now, under the traditional assumption that there is some data within the data.frame (which is a very bad assumption for any kind of developer to make):
set.seed(1234)
# 10 rows, 4 columns: one median per column, matching the four values below
df <- data.frame(matrix(rnorm(40), ncol = 4))
All three cases then operate as expected. That is, you receive one median per column of the data.frame:
[1] -0.5555419
[1] -0.4941011
[1] -0.4656169
[1] -0.605349
summarytools::freq gives unintended results when variables are factors without NA
A fix was issued for this. You can install the latest version from GitHub with:
devtools::install_github("dcomtois/summarytools")
or, to get the latest development version:
devtools::install_github("dcomtois/summarytools", ref = "dev-current")
Seq() producing numbers off by minute amounts (R)
For a workaround you could try seq(-3,3,1)/10
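The workaround relies on whole numbers being exactly representable as doubles, so the only rounding happens in the single final division. A small sketch, assuming the sequence in question was something like seq(-0.3, 0.3, by = 0.1) (the original call is not shown here; `direct` and `scaled` are illustrative names):

```r
# Assumed original call: each element is from + k * by, and 0.1 has no exact
# binary representation, so tiny rounding errors can show up in the result.
direct <- seq(-0.3, 0.3, by = 0.1)

# Workaround: build the sequence on whole numbers, then divide once.
scaled <- seq(-3, 3, 1) / 10

scaled[4] == 0                     # TRUE: 0 / 10 is exactly zero
isTRUE(all.equal(direct, scaled))  # TRUE: the two agree within tolerance
```

When comparing against values from such a sequence, all.equal() or an explicit tolerance is generally safer than ==.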
Zero trip R for loop
You should use seq_len():
n <- 0
for (i in seq_len(n)) {
  cat("\nhello")
}
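To see why seq_len() matters here, compare it with 1:n for n equal to 0. This is a small sketch; the counter names are made up for illustration.

```r
n <- 0

# 1:0 counts down and yields c(1, 0), so the body runs twice
runs_colon <- 0
for (i in 1:n) runs_colon <- runs_colon + 1

# seq_len(0) is integer(0), so the body never runs
runs_seq_len <- 0
for (i in seq_len(n)) runs_seq_len <- runs_seq_len + 1

c(colon = runs_colon, seq_len = runs_seq_len)
```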