How to create a binary vector with 1 if elements are part of the same vector?
Using the ifelse function and the %in% operator:
matching_a <- ifelse(dataset %in% var1, 1, 0)
matching_a
# [1] 1 1 0 0 0 1 1
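The question's dataset and var1 are not shown, so here is a minimal self-contained sketch with made-up values (both vectors are assumptions, chosen only to illustrate the call). It also shows that, because %in% already returns a logical vector, as.integer gives the same 0/1 result without ifelse.

```r
# Hypothetical example data (not from the original question)
dataset <- c("a", "b", "x", "y", "z", "a", "b")
var1 <- c("a", "b")

# Approach from the answer above: 1 where the element occurs in var1
matching_a <- ifelse(dataset %in% var1, 1, 0)

# Equivalent shortcut: convert the logical result directly
matching_b <- as.integer(dataset %in% var1)

matching_a
matching_b
```

The only difference is the storage type: ifelse with 1 and 0 returns doubles, while as.integer returns integers.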
Alternative ways to create a repetitive vector in R
We can use bitwAnd
> bitwAnd(0:9, 1)
[1] 0 1 0 1 0 1 0 1 0 1
or kronecker
> kronecker(as.vector(matrix(1, 5)), 0:1)
[1] 0 1 0 1 0 1 0 1 0 1
> kronecker((1:5)^0, 0:1)
[1] 0 1 0 1 0 1 0 1 0 1
or outer
> as.vector(outer(0:1, (1:5)^0))
[1] 0 1 0 1 0 1 0 1 0 1
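As a quick check, the sketch below compares these alternatives against rep(0:1, times = 5), which is presumably the baseline the question had in mind, and against a modulo version added here for comparison. The list name candidates is only illustrative.

```r
# Each entry should produce 0 1 0 1 0 1 0 1 0 1
candidates <- list(
  rep       = rep(0:1, times = 5),
  modulo    = (0:9) %% 2,
  bitwAnd   = bitwAnd(0:9, 1L),
  kronecker = as.vector(kronecker((1:5)^0, 0:1)),
  outer     = as.vector(outer(0:1, (1:5)^0))
)

# Compare every candidate element-wise with the rep() baseline
all_same <- all(vapply(
  candidates,
  function(x) length(x) == 10L && all(x == candidates$rep),
  logical(1)
))
all_same
```

The values agree, but the storage types may differ (some results are integer, others double), so identical() would be too strict a comparison here.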
binary vector of max
Here is a TensorFlow approach:
import numpy as np
import tensorflow as tf
X = np.array([2, 5, 8, 1])
one_hot = tf.one_hot(indices=tf.argmax(X), depth=tf.shape(X)[0])
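For readers working in R, like most of the answers collected here, the same idea can be sketched in base R. This is an illustrative equivalent, not part of the original answer; the variable names are assumptions.

```r
x <- c(2, 5, 8, 1)

# 1 at the position of the (first) maximum, 0 elsewhere
one_hot <- as.integer(seq_along(x) == which.max(x))
one_hot

# Variant that marks every position tied for the maximum
one_hot_ties <- as.integer(x == max(x))
```

which.max returns only the first maximum, so the two variants differ when the maximum occurs more than once.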
sample of a subsample
Here is a complete solution to your problem, along the lines of your original idea. The code could be shortened, but I tried to keep it as transparent as possible.
# Data
data <- data.frame(var1 = 1:40, var2 = 40:1)
# Add SampleNo column
data$sampleNo <- 0L
# Randomly select 10 rows as sample 1
pool_idx1 <- 1:nrow(data)
idx1 <- sample(pool_idx1, size = 10)
data[idx1, ]$sampleNo <- 1L
# Draw a second sample from cases where sampleNo != 1 & var1 is even
pool_idx2 <- pool_idx1[data$var1 %% 2 == 0 & data$sampleNo != 1]
idx2 <- sample(pool_idx2, size = 10)
data[idx2, ]$sampleNo <- 2L
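To confirm that the two-stage draw behaves as intended, a sanity check along these lines may help. It repeats the steps above in a self-contained form; the seed and the use of which() and data$sampleNo[idx] are choices made for this sketch, not part of the original answer.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

data <- data.frame(var1 = 1:40, var2 = 40:1)
data$sampleNo <- 0L

# First sample: any 10 rows
idx1 <- sample(seq_len(nrow(data)), size = 10)
data$sampleNo[idx1] <- 1L

# Second sample: 10 rows with even var1 that are not in sample 1
pool_idx2 <- which(data$var1 %% 2 == 0 & data$sampleNo != 1L)
idx2 <- sample(pool_idx2, size = 10)
data$sampleNo[idx2] <- 2L

# Checks: group sizes, no overlap, and sample 2 contains only even var1
table(data$sampleNo)
no_overlap <- length(intersect(idx1, idx2)) == 0
all_even <- all(data$var1[data$sampleNo == 2L] %% 2 == 0)
```

With 20 even values and at most 10 of them taken by the first sample, the second pool always has at least 10 candidates.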
Cleaner way of constructing binary matrix from vector
set.seed(1)
playv <- sample(0:5,20,replace=TRUE)
playv <- as.character(playv)
results <- model.matrix(~playv-1)
You may want to rename the columns of results.
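model.matrix prefixes each column name with the variable name (playv0, playv1, and so on). One way to strip that prefix, shown here as a minimal sketch, is a sub call on colnames:

```r
set.seed(1)
playv <- as.character(sample(0:5, 20, replace = TRUE))
results <- model.matrix(~ playv - 1)

# Drop the leading "playv" so the columns are named after the values
colnames(results) <- sub("^playv", "", colnames(results))
colnames(results)
```

Only values that actually occur in playv get a column, so the number of columns can be smaller than six.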
I like the solution provided by Ananda Mahto and compared it to model.matrix. Here is the code:
library(microbenchmark)
set.seed(1)
v <- sample(1:10,1e6,replace=TRUE)
f1 <- function(vec) {
vec <- as.character(vec)
model.matrix(~vec-1)
}
f2 <- function(vec) {
table(sequence(length(vec)), vec)
}
microbenchmark(f1(v), f2(v), times=10)
model.matrix was somewhat faster than table:
Unit: seconds
  expr      min       lq   median       uq      max neval
 f1(v) 2.890084 3.147535 3.296186 3.377536 3.667843    10
 f2(v) 4.824832 5.625541 5.757534 5.918329 5.966332    10
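A timing comparison is only meaningful if both functions encode the same thing. The sketch below checks that on a small made-up vector (v_small is an assumption, chosen so the result is easy to read); it compares values and dimensions only, since the two results carry different classes and dimnames.

```r
# Small hypothetical input so the result is easy to inspect
v_small <- c(2L, 1L, 3L, 2L)

vec <- as.character(v_small)
m1 <- model.matrix(~ vec - 1)                      # same call as in f1
m2 <- table(sequence(length(v_small)), v_small)    # same call as in f2

same_dims <- all(dim(m1) == dim(m2))
same_values <- all(as.vector(m1) == as.vector(m2))
c(same_dims, same_values)
```

Note that the column order can differ for larger value ranges: model.matrix works on a character vector here, so "10" sorts before "2", whereas table orders the numeric values numerically.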
Group the near same numbers of a vector
Another option, using rleid from the data.table package:
> library(data.table)
> v <- c("a", "a", "b", "a", "a", "c", "c", "c")
> split(v,rleid(v))
$`1`
[1] "a" "a"
$`2`
[1] "b"
$`3`
[1] "a" "a"
$`4`
[1] "c" "c" "c"
or a base R option
> split(v,cumsum(c(TRUE,head(v,-1)!=v[-1])))
$`1`
[1] "a" "a"
$`2`
[1] "b"
$`3`
[1] "a" "a"
$`4`
[1] "c" "c" "c"
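The base R one-liner is easier to follow when broken into steps. The sketch below rebuilds the run identifiers piece by piece and cross-checks them against rle; the intermediate names (starts, run_id, run_id_rle) are only illustrative.

```r
v <- c("a", "a", "b", "a", "a", "c", "c", "c")

# TRUE wherever an element differs from its predecessor (first is always TRUE)
starts <- c(TRUE, head(v, -1) != v[-1])

# Cumulative sum of the run starts gives one id per run
run_id <- cumsum(starts)

# Cross-check: the same ids derived from run-length encoding
r <- rle(v)
run_id_rle <- rep(seq_along(r$lengths), r$lengths)

groups <- split(v, run_id)
```

Both constructions label consecutive equal values with the same id, so the second run of "a" gets its own group instead of being merged with the first.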
Drawing conditional combinations of a binary vector one by one
What the OP is after is an iterator. If we were to do this properly, we would write a class in C++ with a get_next method, and expose this to R. As it stands, with base R, since everything is passed by value, we must call a function on our object-to-be-updated and reassign the object-to-be-updated every time.
Here is a very crude implementation:
get_next <- function(comb, v, m) {
    ## Start and end index of each of the m blocks of length(v)
    s <- seq(1L, length(comb), length(v))
    e <- seq(length(v), length(comb), length(v))
    ## A block is maxed out when it equals rev(v)
    last_comb <- rev(v)
    can_be_incr <- sapply(seq_len(m), function(x) {
        !identical(comb[s[x]:e[x]], last_comb)
    })
    if (!any(can_be_incr)) {
        ## Every block is maxed out: signal the end
        return(FALSE)
    } else {
        ## Move the 1 in the first incrementable block one place right
        idx <- which(can_be_incr)[1L]
        span <- s[idx]:e[idx]
        j <- which(comb[span] == 1L)
        comb[span[j]] <- 0L
        comb[span[j + 1L]] <- 1L
        if (idx > 1L) {
            ## Reset previous maxed out sections
            for (i in 1:(idx - 1L)) {
                comb[s[i]:e[i]] <- v
            }
        }
    }
    return(comb)
}
And here is a simple usage:
m <- 3L
v <- as.integer(c(1,0,0))
comb <- rep(v, m)
count <- 1L
while (!is.logical(comb)) {
    cat(count, ": ", comb, "\n")
    comb <- get_next(comb, v, m)
    count <- count + 1L
}
1 : 1 0 0 1 0 0 1 0 0
2 : 0 1 0 1 0 0 1 0 0
3 : 0 0 1 1 0 0 1 0 0
4 : 1 0 0 0 1 0 1 0 0
5 : 0 1 0 0 1 0 1 0 0
6 : 0 0 1 0 1 0 1 0 0
7 : 1 0 0 0 0 1 1 0 0
8 : 0 1 0 0 0 1 1 0 0
9 : 0 0 1 0 0 1 1 0 0
10 : 1 0 0 1 0 0 0 1 0
11 : 0 1 0 1 0 0 0 1 0
12 : 0 0 1 1 0 0 0 1 0
13 : 1 0 0 0 1 0 0 1 0
14 : 0 1 0 0 1 0 0 1 0
15 : 0 0 1 0 1 0 0 1 0
16 : 1 0 0 0 0 1 0 1 0
17 : 0 1 0 0 0 1 0 1 0
18 : 0 0 1 0 0 1 0 1 0
19 : 1 0 0 1 0 0 0 0 1
20 : 0 1 0 1 0 0 0 0 1
21 : 0 0 1 1 0 0 0 0 1
22 : 1 0 0 0 1 0 0 0 1
23 : 0 1 0 0 1 0 0 0 1
24 : 0 0 1 0 1 0 0 0 1
25 : 1 0 0 0 0 1 0 0 1
26 : 0 1 0 0 0 1 0 0 1
27 : 0 0 1 0 0 1 0 0 1
Note that this implementation is memory efficient but very slow.
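The reassignment on every call can be avoided in base R with a closure that keeps its own state. The following is only a sketch of that idea, not the C++ iterator mentioned above, and it assumes v holds a single 1 (as in the usage example), so each block is fully described by the position of its 1. The function name make_comb_iterator and its arguments n (block length) and m (number of blocks) are hypothetical.

```r
make_comb_iterator <- function(n, m) {
    pos <- rep(1L, m)   # position of the 1 inside each of the m blocks
    done <- FALSE
    function() {
        if (done) return(NULL)
        ## Build the current combination from the stored positions
        comb <- integer(n * m)
        comb[(seq_len(m) - 1L) * n + pos] <- 1L
        ## Advance like an odometer: the first block moves fastest
        i <- 1L
        while (i <= m && pos[i] == n) {
            pos[i] <<- 1L
            i <- i + 1L
        }
        if (i > m) done <<- TRUE else pos[i] <<- pos[i] + 1L
        comb
    }
}

next_comb <- make_comb_iterator(3L, 3L)
first <- next_comb()
second <- next_comb()

## Count the remaining combinations until the iterator is exhausted
remaining <- 0L
while (!is.null(next_comb())) remaining <- remaining + 1L
```

Each call returns the next combination in the same order as the output above and NULL once all blocks are maxed out; the state lives in the enclosing environment and is updated with <<-.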