How to extend `==` behavior to vectors that include NAs?
Another option (though is it better than mapply('%in%', a, b)?):
(!is.na(a) & !is.na(b) & a==b) | (is.na(a) & is.na(b))
Following @AnthonyDamico's suggestion, here is the "mutt" operator:
"%==%" <- function(a, b) (!is.na(a) & !is.na(b) & a==b) | (is.na(a) & is.na(b))
Edit: or a slightly different, shorter version by @Frank (which is also more efficient):
"%==%" <- function(a, b) (is.na(a) & is.na(b)) | (!is.na(eq <- a==b) & eq)
With the different examples:
a <- c( 1 , 2 , 3 )
b <- c( 1 , 2 , 4 )
a %==% b
# [1] TRUE TRUE FALSE
a <- c( 1 , NA , 3 )
b <- c( 1 , NA , 4 )
a %==% b
# [1] TRUE TRUE FALSE
a <- c( 1 , NA , 3 )
b <- c( 1 , 2 , 4 )
a %==% b
#[1] TRUE FALSE FALSE
a <- c( 1 , NA , 3 )
b <- c( 3 , NA , 1 )
a %==% b
#[1] FALSE TRUE FALSE
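For contrast, here is a minimal sketch of what plain == returns on the second pair of vectors above, next to the shorter %==% definition. The definition is repeated only so the snippet runs on its own; the variable names plain and strict are just for illustration.

```r
# Shorter definition from above, repeated so the snippet is self-contained
`%==%` <- function(a, b) (is.na(a) & is.na(b)) | (!is.na(eq <- a == b) & eq)

a <- c(1, NA, 3)
b <- c(1, NA, 4)

plain  <- a == b     # NA wherever either side is NA
strict <- a %==% b   # NA matched against NA counts as TRUE

print(plain)   # TRUE NA FALSE
print(strict)  # TRUE TRUE FALSE
```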
aggregate across multiple vectors, retain entries that only have NAs for particular vectors
aggregate(.~site + horizon,data=data,FUN=mean, na.action=na.pass)
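A small sketch of what na.action = na.pass changes, using a made-up data frame: the clay and sand columns and all values are assumptions for illustration, only site and horizon come from the call above. With the formula interface, the default na.action drops every row that contains an NA before grouping, while na.pass keeps those rows so that FUN sees the NAs.

```r
# Hypothetical data: one NA in 'sand' for site A
data <- data.frame(
  site    = c("A", "A", "B", "B"),
  horizon = c("O", "O", "O", "O"),
  clay    = c(10, 20, 30, 40),
  sand    = c(NA, 50, 60, 70)
)

# Default: the whole first row is dropped, so clay for site A is based on one row
res_default <- aggregate(. ~ site + horizon, data = data, FUN = mean)

# na.pass: the row is kept; clay uses both rows, sand becomes NA for site A
res_pass <- aggregate(. ~ site + horizon, data = data, FUN = mean,
                      na.action = na.pass)

print(res_default)
print(res_pass)
```

If the NA should be skipped within each column instead, passing na.rm = TRUE through to mean alongside na.action = na.pass is one option.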
Replacing NAs in R with nearest value
Here is a very fast one. It uses findInterval to find which two positions should be considered for each NA in your original data:
f1 <- function(dat) {
  N <- length(dat)
  na.pos <- which(is.na(dat))
  if (length(na.pos) %in% c(0, N)) {
    return(dat)
  }
  non.na.pos <- which(!is.na(dat))
  intervals  <- findInterval(na.pos, non.na.pos,
                             all.inside = TRUE)
  left.pos   <- non.na.pos[pmax(1, intervals)]
  right.pos  <- non.na.pos[pmin(N, intervals+1)]
  left.dist  <- na.pos - left.pos
  right.dist <- right.pos - na.pos
  dat[na.pos] <- ifelse(left.dist <= right.dist,
                        dat[left.pos], dat[right.pos])
  return(dat)
}
And here I test it:
# sample data, suggested by @JeffAllen
dat <- as.integer(runif(50000, min=0, max=10))
dat[dat==0] <- NA
# computation times
system.time(r0 <- f0(dat)) # your function
# user system elapsed
# 5.52 0.00 5.52
system.time(r1 <- f1(dat)) # this function
# user system elapsed
# 0.01 0.00 0.03
identical(r0, r1)
# [1] TRUE
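To see what findInterval contributes inside f1, here is a small sketch on a toy vector (the vector is an assumption for illustration). It shows how each NA position is mapped to the pair of non-NA positions around it, and why all.inside = TRUE is useful for leading and trailing NAs.

```r
dat <- c(NA, 1, NA, NA, 4, NA)

na.pos     <- which(is.na(dat))    # positions 1 3 4 6
non.na.pos <- which(!is.na(dat))   # positions 2 5

# Without all.inside, a leading NA maps to interval 0 and a trailing NA
# maps to the last index, so "interval + 1" would run off the end
raw <- findInterval(na.pos, non.na.pos)

# With all.inside = TRUE every result is clamped to 1..(length(non.na.pos) - 1),
# so both non.na.pos[i] and non.na.pos[i + 1] are always valid
clamped <- findInterval(na.pos, non.na.pos, all.inside = TRUE)

print(raw)      # 0 1 1 2
print(clamped)  # 1 1 1 1
```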
Detect change from previous rows with missing values - speed up for loop - R
Here's an alternative approach, which removes any rows with NAs, performs some calculations and joins back the NA rows in the right place.
library(tidyverse)
library(zoo)
# example data
test <- data.frame(resp = c(9, NA, NA, 11, NA, NA, 6, 16, NA, 12, 0, 0, 0, 0, 0, NA, 0, 11, NA, NA, NA, NA, NA, NA, 14))
# add an id for each row
test = test %>% mutate(id = row_number())
test %>%
  na.omit() %>%                                                         # exclude rows with NAs
  mutate(flag = case_when(resp == lag(resp, default = first(resp)) ~ 0,
                          resp >  lag(resp, default = first(resp)) ~ 1,
                          resp <  lag(resp, default = first(resp)) ~ -1)) %>% # check relationship between current and previous value
  mutate(g = cumsum(flag != lag(flag, default = first(flag)))) %>%      # create a grouping based on change in flag column
  group_by(g) %>%                                                       # for each group
  mutate(change = ifelse(flag != 0, flag * row_number(), flag)) %>%     # calculate the change column
  ungroup() %>%                                                         # forget the grouping
  select(id, change) %>%                                                # keep useful columns
  right_join(test, by="id") %>%                                         # join back to recover the NA rows
  arrange(id) %>%                                                       # restore the original row order (newer dplyr versions do not order the join result by test)
  select(resp, change)                                                  # keep useful columns
As a result you'll get:
#    resp change
# 1     9      0
# 2    NA     NA
# 3    NA     NA
# 4    11      1
# 5    NA     NA
# 6    NA     NA
# 7     6     -1
# 8    16      1
# 9    NA     NA
# 10   12     -1
# 11    0     -2
# 12    0      0
# 13    0      0
# 14    0      0
# 15    0      0
# 16   NA     NA
# 17    0      0
# 18   11      1
# 19   NA     NA
# 20   NA     NA
# 21   NA     NA
# 22   NA     NA
# 23   NA     NA
# 24   NA     NA
# 25   14      2
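The flag and grouping steps can also be sketched in base R. This is an alternative formulation, not the pipeline above: sign(diff()) stands in for the case_when() comparison, and ave() stands in for group_by() plus row_number(). The short resp vector is an assumed NA-free sample taken from the start of the example data.

```r
resp <- c(9, 11, 6, 16, 12, 0, 0)   # NA rows already removed

# -1 / 0 / 1 relative to the previous value; the first element has no predecessor
flag <- c(0, sign(diff(resp)))

# Start a new group every time the flag differs from the previous flag
g <- cumsum(c(FALSE, flag[-1] != flag[-length(flag)]))

# Position within each run, then signed run length (0 stays 0)
pos_in_run <- ave(flag, g, FUN = seq_along)
change <- ifelse(flag != 0, flag * pos_in_run, flag)

print(change)  # 0 1 -1 1 -1 -2 0
```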
Convert Vector to Matrix without Recycling
You can't turn recycling off, but you can manipulate the vector before you form the matrix: extend its length to match the dimensions the matrix will have. The length<- replacement function will pad the vector with NA up to the desired length.
x <- 1:11
length(x) <- prod(dim(matrix(x, ncol = 2)))
## the inner matrix(x, ncol = 2) call above emits a data-length warning unless wrapped in suppressWarnings()
matrix(x, ncol = 2, byrow = TRUE)
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4
# [3,]    5    6
# [4,]    7    8
# [5,]    9   10
# [6,]   11   NA
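A variant that avoids the warning altogether is to compute the padded length arithmetically rather than from a throwaway matrix. This is a sketch; n_col is a name introduced here for illustration.

```r
x <- 1:11
n_col <- 2

# Round the length up to the next multiple of n_col; length<- pads with NA
length(x) <- n_col * ceiling(length(x) / n_col)

m <- matrix(x, ncol = n_col, byrow = TRUE)
print(m)
```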
Why pmax(dataFrame, int) would introduce NAs?
pmax is not designed to be used with data.frame input.
The NAs are introduced in line 35 of pmax:
mmm[change] <- each[change]
because each is defined to be as long as the length of the input, which for a data.frame is the number of columns. Therefore, when it tries to address the 5th element, it gets NA.
each
[1] 6 6 6 6
each[change]
[1] 6 6 6 6 NA
The obvious workaround is to convert to data.frame after using pmax:
data.frame(pmax(matrix(1:16, nrow=4), c(6)))
  X1 X2 X3 X4
1  6  6  9 13
2  6  6 10 14
3  6  7 11 15
4  6  8 12 16
Or convert back and forth as required.
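If the data already lives in a data.frame, another possibility is to apply pmax column by column, so that it only ever sees plain vectors. This is a sketch of an alternative to the matrix round trip, not something from the answer above; the df name is illustrative.

```r
df <- data.frame(matrix(1:16, nrow = 4))   # columns X1..X4

# df[] keeps the data.frame structure while replacing every column
df[] <- lapply(df, pmax, 6)

print(df)
```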
`x^(1/3)` behaves differently for negative scalar `x` and vector `x` with negative values
I'm not looking for a workaround, e.g. function(x) {sign(x) * (abs(x)) ^ (1/3)}.
I'm interested in an answer that explains what is happening differently to the vector than to the negative value when provided as a numeric scalar.
How does the ^ operator think differently about vectors and scalars?
You seem to believe that c(-0.2, 1)^(1/3) translates to c(-0.2^(1/3), 1^(1/3)). This is incorrect. The operator ^ is actually a function; that is, (a) ^ (b) is the same as "^"(a, b). Therefore, the correct interpretation goes as follows:
c(-0.2, 1)^(1/3)
=> "^"(c(-0.2, 1), 1/3)
=> c( "^"(-0.2, 1/3), "^"(1, 1/3) )
=> c( (-0.2)^(1/3), (1)^(1/3) )
=> c( NaN, 1 )
Now, why doesn't -0.2^(1/3) give NaN? Because ^ has higher operator precedence than +, -, * and /. So, as written, it really means -(0.2^(1/3)) instead of (-0.2)^(1/3).
The lesson is that, to avoid buggy code, write your code as (a) ^ (b) instead of just a ^ b.
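One way to check how R parses each expression is to inspect the unevaluated call with quote(): the first element of a call is the function applied last. A minimal sketch (the variable names are illustrative):

```r
# Outermost function of each expression, as R parses it
outer_scalar <- quote(-0.2^(1/3))[[1]]         # `-`: the negation is applied last
outer_vector <- quote(c(-0.2, 1)^(1/3))[[1]]   # `^`: the power is applied to the whole vector

print(outer_scalar)
print(outer_vector)

# Same parse, same value: the minus sign never reaches ^
same_value <- identical(-0.2^(1/3), -(0.2^(1/3)))

# A negative base with a fractional exponent has no real result
neg_base <- (-0.2)^(1/3)
print(is.nan(neg_base))
```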
Additional Remark:
I often compare ^ and : when teaching R to my students, because they behave differently: ^ binds more tightly than unary minus, whereas : binds less tightly than unary minus but more tightly than *. Both show the importance of protecting operands with brackets.
(-1):2
#[1] -1 0 1 2
-1:2
#[1] -1 0 1 2
-(1:2)
#[1] -1 -2
2*3:10
#[1] 6 8 10 12 14 16 18 20
(2*3):10
#[1] 6 7 8 9 10
2*(3:10)
#[1] 6 8 10 12 14 16 18 20
See ?Syntax for details of operator precedence.
R Convert NA's only after the first non-zero value
Easy to do using match() and numeric indices:
- use match() to find the first occurrence of a non-NA value
- use which() to convert the logical vector from is.na() to a numeric index
- use that information to find the correct positions in x
Hence:
x <- c(NA,NA,NA,1,2,3,NA,NA,4,5,NA)
isna <- is.na(x)
nonna <- match(FALSE,isna)
id <- which(isna)
x[id[id>nonna]] <- 0
gives:
> x
[1] NA NA NA 1 2 3 0 0 4 5 0
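The same steps can be wrapped in a small helper. This is a sketch: the name fill_after_first, the value argument, and the early return for an all-NA input are additions for illustration, not part of the answer above.

```r
fill_after_first <- function(x, value = 0) {
  isna  <- is.na(x)
  first <- match(FALSE, isna)      # index of the first non-NA value, NA if none
  if (is.na(first)) return(x)      # all NA: nothing to replace
  id <- which(isna)                # numeric positions of the NAs
  x[id[id > first]] <- value       # only NAs after the first non-NA value
  x
}

res <- fill_after_first(c(NA, NA, 1, NA, 2, NA))
print(res)  # NA NA 1 0 2 0
```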
Force a std::vector to free its memory?
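Use the swap trick: swap the vector with an empty temporary, which takes over the allocated buffer and releases it when it goes out of scope: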
#include <vector>

template <typename T>
void FreeAll( T & t ) {
    T tmp;
    t.swap( tmp );
}

int main() {
    std::vector<int> v;
    v.push_back( 1 );
    FreeAll( v );
}