How to Create a Binary Vector with 1 If Elements Are Part of the Same Vector

How to create a binary vector with 1 if elements are part of the same vector?

Use the ifelse function with the %in% operator:

matching_a <- ifelse(dataset %in% var1, 1, 0)

matching_a
# [1] 1 1 0 0 0 1 1
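
The answer above does not show its input, so here is a minimal self-contained sketch. The contents of dataset and var1 are hypothetical, chosen only so that the result matches the output shown; the as.integer variant is an equivalent shortcut, not part of the original answer.

```r
# Hypothetical example data, chosen to reproduce the output shown above
dataset <- c("a", "b", "c", "d", "e", "a", "b")
var1 <- c("a", "b")

# Original approach: 1 where the element of dataset occurs in var1, else 0
matching_a <- ifelse(dataset %in% var1, 1, 0)

# Equivalent without ifelse: coerce the logical vector directly
matching_b <- as.integer(dataset %in% var1)
matching_b
# [1] 1 1 0 0 0 1 1
```

Note that ifelse with 1 and 0 returns a numeric (double) vector, while as.integer returns an integer vector; the values are the same.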

Alternative ways to create a repetitive vector in R

We can use bitwAnd

> bitwAnd(0:9, 1)
[1] 0 1 0 1 0 1 0 1 0 1

or kronecker

> kronecker(as.vector(matrix(1, 5)), 0:1)
[1] 0 1 0 1 0 1 0 1 0 1

> kronecker((1:5)^0, 0:1)
[1] 0 1 0 1 0 1 0 1 0 1

or outer

> as.vector(outer(0:1, (1:5)^0))
[1] 0 1 0 1 0 1 0 1 0 1
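
For comparison, the same alternating vector can probably be built more directly with rep or with the modulo operator. This is only a sketch of the usual baseline these alternatives are measured against; the names by_rep and by_mod are made up for illustration.

```r
# Repeat the pattern 0, 1 five times
by_rep <- rep(0:1, times = 5)

# Remainder of 0..9 after division by 2 (2L keeps the result integer)
by_mod <- (0:9) %% 2L

by_rep
# [1] 0 1 0 1 0 1 0 1 0 1
```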

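Binary vector of max

With NumPy and TensorFlow, one-hot encode the position of the maximum:

import numpy as np
import tensorflow as tf
X = np.array([2,5,8,1])
one_hot = tf.one_hot(indices=tf.argmax(X), depth=tf.shape(X)[0])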
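
Since the rest of this page is in R, here is a rough base R counterpart of the same idea. It is a sketch, not a translation of the TensorFlow call: it compares each position with which.max, which returns the first maximum when there are ties.

```r
x <- c(2, 5, 8, 1)

# 1 at the position of the (first) maximum, 0 elsewhere
one_hot <- as.integer(seq_along(x) == which.max(x))
one_hot
# [1] 0 0 1 0
```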

Sample of a subsample

Here is a complete solution to your problem, more along the lines of your original idea. The code can be shortened, but for now I have tried to make it as transparent as I could.

# Data
data <- data.frame(var1 = 1:40, var2 = 40:1)

# Add SampleNo column
data$sampleNo <- 0L

# Randomly select 10 rows as sample 1
pool_idx1 <- seq_len(nrow(data))
idx1 <- sample(pool_idx1, size = 10)
data$sampleNo[idx1] <- 1L

# Draw a second sample from cases where sampleNo != 1 & var1 is even
pool_idx2 <- pool_idx1[data$var1 %% 2 == 0 & data$sampleNo != 1]
idx2 <- sample(pool_idx2, size = 10)
data$sampleNo[idx2] <- 2L
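
One caveat with this pattern: when the pool passed to sample happens to have length 1, sample draws from 1:x instead of from the pool itself. That cannot occur with the data above (at least 10 even rows always remain), but it can with other filters. The sketch below repeats the two-stage draw with a small wrapper; safe_sample is a hypothetical helper name, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
data <- data.frame(var1 = 1:40, var2 = 40:1)
data$sampleNo <- 0L

# Hypothetical helper: always samples from the elements of x,
# even when x has length 1
safe_sample <- function(x, size) x[sample.int(length(x), size)]

# Stage 1: 10 rows from the full data
idx1 <- safe_sample(seq_len(nrow(data)), 10)
data$sampleNo[idx1] <- 1L

# Stage 2: 10 rows from the even-var1 rows not already in sample 1
pool_idx2 <- which(data$var1 %% 2 == 0 & data$sampleNo != 1L)
idx2 <- safe_sample(pool_idx2, 10)
data$sampleNo[idx2] <- 2L

# Counts per sample number
table(data$sampleNo)
```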

Cleaner way of constructing binary matrix from vector

set.seed(1)
playv <- sample(0:5,20,replace=TRUE)
playv <- as.character(playv)
results <- model.matrix(~playv-1)

You may rename the columns of results.
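
One possible way to do the renaming, as a sketch: model.matrix prefixes each level with the variable name (here playv), so stripping that prefix with sub leaves the bare level as the column name. The regular expression assumes the variable is called playv, as in the snippet above.

```r
set.seed(1)
playv <- as.character(sample(0:5, 20, replace = TRUE))
results <- model.matrix(~ playv - 1)

# Drop the "playv" prefix so the columns are named after the levels only
colnames(results) <- sub("^playv", "", colnames(results))
colnames(results)
```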

I like the solution provided by Ananda Mahto and compared it with model.matrix. Here is the benchmark code:

library(microbenchmark)

set.seed(1)
v <- sample(1:10,1e6,replace=TRUE)

f1 <- function(vec) {
  vec <- as.character(vec)
  model.matrix(~vec-1)
}

f2 <- function(vec) {
  table(sequence(length(vec)), vec)
}

microbenchmark(f1(v), f2(v), times=10)

model.matrix was somewhat faster than table:

Unit: seconds
  expr      min       lq   median       uq      max neval
 f1(v) 2.890084 3.147535 3.296186 3.377536 3.667843    10
 f2(v) 4.824832 5.625541 5.757534 5.918329 5.966332    10
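
Before comparing timings it is worth checking that both approaches produce the same 0/1 pattern. The sketch below does that on a small, hypothetical input (v_small). It uses single-digit values on purpose: with values up to 10, the character-based version sorts its columns as "1", "10", "2", ... while table sorts them numerically, so the column order would differ.

```r
set.seed(1)
v_small <- sample(1:5, 12, replace = TRUE)  # hypothetical small input

# Same two constructions as f1 and f2 above
m1 <- model.matrix(~ as.character(v_small) - 1)
m2 <- table(seq_along(v_small), v_small)

# Compare the raw values only, ignoring dimnames and other attributes
same_pattern <- all(matrix(m1, nrow = 12) == matrix(m2, nrow = 12))
same_pattern
```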

Group the near same numbers of a vector

Another option uses rleid from the data.table package:

> split(v,rleid(v))
$`1`
[1] "a" "a"

$`2`
[1] "b"

$`3`
[1] "a" "a"

$`4`
[1] "c" "c" "c"

or another base R option

> split(v,cumsum(c(TRUE,head(v,-1)!=v[-1])))
$`1`
[1] "a" "a"

$`2`
[1] "b"

$`3`
[1] "a" "a"

$`4`
[1] "c" "c" "c"
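
The input vector is not shown in these answers; judging from the output it was presumably the one below, which is an assumption. With that input, a third base R variant builds the run ids from rle, as sketched here.

```r
# Assumed input, reconstructed from the output shown above
v <- c("a", "a", "b", "a", "a", "c", "c", "c")

# rle gives the length of each run; repeat a run id that many times
r <- rle(v)
groups <- split(v, rep(seq_along(r$lengths), times = r$lengths))

# Size of each group
lengths(groups)
```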

Drawing conditional combinations of a binary vector one by one

What the OP is after is an iterator. If we were to do this properly, we would write a class in C++ with a get_next method, and expose this to R. As it stands, with base R, since everything is passed by value, we must call a function on our object-to-be-updated and reassign the object-to-be-updated every time.

Here is a very crude implementation:

get_next <- function(comb, v, m) {
  s <- seq(1L, length(comb), length(v))
  e <- seq(length(v), length(comb), length(v))

  last_comb <- rev(v)
  can_be_incr <- sapply(seq_len(m), function(x) {
    !identical(comb[s[x]:e[x]], last_comb)
  })

  if (all(!can_be_incr)) {
    return(FALSE)
  } else {
    idx <- which(can_be_incr)[1L]
    span <- s[idx]:e[idx]
    j <- which(comb[span] == 1L)
    comb[span[j]] <- 0L
    comb[span[j + 1L]] <- 1L

    if (idx > 1L) {
      ## Reset previous maxed out sections
      for (i in 1:(idx - 1L)) {
        comb[s[i]:e[i]] <- v
      }
    }
  }

  return(comb)
}

And here is a simple usage:

m <- 3L
v <- as.integer(c(1,0,0))
comb <- rep(v, m)
count <- 1L

while (!is.logical(comb)) {
  cat(count, ": ", comb, "\n")
  comb <- get_next(comb, v, m)
  count <- count + 1L
}

1 : 1 0 0 1 0 0 1 0 0
2 : 0 1 0 1 0 0 1 0 0
3 : 0 0 1 1 0 0 1 0 0
4 : 1 0 0 0 1 0 1 0 0
5 : 0 1 0 0 1 0 1 0 0
6 : 0 0 1 0 1 0 1 0 0
7 : 1 0 0 0 0 1 1 0 0
8 : 0 1 0 0 0 1 1 0 0
9 : 0 0 1 0 0 1 1 0 0
10 : 1 0 0 1 0 0 0 1 0
11 : 0 1 0 1 0 0 0 1 0
12 : 0 0 1 1 0 0 0 1 0
13 : 1 0 0 0 1 0 0 1 0
14 : 0 1 0 0 1 0 0 1 0
15 : 0 0 1 0 1 0 0 1 0
16 : 1 0 0 0 0 1 0 1 0
17 : 0 1 0 0 0 1 0 1 0
18 : 0 0 1 0 0 1 0 1 0
19 : 1 0 0 1 0 0 0 0 1
20 : 0 1 0 1 0 0 0 0 1
21 : 0 0 1 1 0 0 0 0 1
22 : 1 0 0 0 1 0 0 0 1
23 : 0 1 0 0 1 0 0 0 1
24 : 0 0 1 0 1 0 0 0 1
25 : 1 0 0 0 0 1 0 0 1
26 : 0 1 0 0 0 1 0 0 1
27 : 0 0 1 0 0 1 0 0 1

Note that this implementation is memory efficient but very slow.
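
The reassignment on every call can be avoided in base R by keeping the state inside a closure. The following is only a sketch of that idea, not the answer's method: make_comb_iterator is a hypothetical name, and it assumes every block has length n with exactly one 1 (as with v = c(1, 0, 0) above). It tracks the position of the 1 in each block and advances like an odometer, with the first block changing fastest.

```r
make_comb_iterator <- function(n, m) {
  pos <- rep(1L, m)   # position of the single 1 within each of the m blocks
  done <- FALSE

  function() {
    if (done) return(NULL)   # NULL signals that all combinations were emitted

    # Build the current combination from the stored positions
    comb <- integer(n * m)
    comb[(seq_len(m) - 1L) * n + pos] <- 1L

    # Advance the state: reset maxed-out blocks, then bump the next one
    i <- 1L
    while (i <= m && pos[i] == n) {
      pos[i] <<- 1L
      i <- i + 1L
    }
    if (i > m) done <<- TRUE else pos[i] <<- pos[i] + 1L

    comb
  }
}

# Usage: the iterator updates its own state, so nothing is reassigned
next_comb <- make_comb_iterator(n = 3L, m = 3L)
first <- next_comb()
count <- 1L
while (!is.null(next_comb())) count <- count + 1L
count
# [1] 27
```

The order of the combinations matches the listing above; only the bookkeeping differs, since the closure stores m positions instead of rescanning the whole vector.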
