Create all subvectors of a certain length (moving window)
Try this:
k <- 3
embed(vec, k)[, k:1]
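To see what that call returns, here is a minimal sketch on a small made-up input. The vector `vec <- 1:6` is an assumption for illustration, since the question's data is not shown. `embed()` puts the latest value first in each row, which is why the columns are reversed with `k:1`.

```r
vec <- 1:6   # hypothetical input, not from the question
k <- 3

# embed() returns one row per window, newest element first,
# so reversing the columns gives the windows in reading order
windows <- embed(vec, k)[, k:1]
windows

# row i holds vec[i:(i + k - 1)]
windows[2, ]
```

If `vec` could have exactly `k` elements, `embed(vec, k)[, k:1, drop = FALSE]` keeps the result a one-row matrix instead of dropping it to a plain vector.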
Create matrix from vector's rolling window
We can use embed
apply(embed(my_vector, 4), 1, rev)
# [,1] [,2] [,3]
#[1,] 1 2 3
#[2,] 2 3 4
#[3,] 3 4 5
#[4,] 4 5 6
Or it can be modified to
t(embed(rev(my_vector), 4))[, 3:1]
Or as @lmo suggested
embed(my_vector, 3)[, 3:1]
Or with matrix
matrix(my_vector, 7, 3)[1:4,]
Combinations of vector with sub-vector length n
I can answer the whole question, but it will take a bit longer. This should give you the flavour of the answer.
The package combinat has a function called permn which gives you all the permutations of a vector. You want this, but not quite. What you need is the permutations of all the blocks. So in your first example you have two blocks of length two, and in your second example you have three blocks of length three. If we look at the first, and think about ordering the blocks:
> library(combinat)
> numBlocks = 2
> permn(1:numBlocks)
[[1]]
[1] 1 2
[[2]]
[1] 2 1
So I hope you can see that the first permutation would take the blocks b1 = c(1,2) and b2 = c(3,4) and order them c(b1,b2), and the second would order them c(b2,b1).
Equally, if you had three blocks, b1 = 1:3; b2 = 4:6; b3 = 7:9, then
permn(1:3)
[[1]]
[1] 1 2 3
[[2]]
[1] 1 3 2
[[3]]
[1] 3 1 2
[[4]]
[1] 3 2 1
[[5]]
[1] 2 3 1
[[6]]
[1] 2 1 3
gives you the ordering of these blocks. The more general solution is figuring out how to move the blocks around, but that isn't too hard.
Update: Using my multicool package. Note that cool-lex order (coolex), the order in which multicool generates permutations, isn't the order you'd come up with by yourself.
library(multicool)
combs = function(v, blockLength){
  if(length(v) %% blockLength != 0){
    stop("vector length must be divisible by blockLength")
  }
  numBlocks = length(v) / blockLength
  # one block per row
  blockWise = matrix(v, ncol = blockLength, byrow = TRUE)
  m = initMC(1:numBlocks)
  # each row of Perms is one ordering of the block indices
  Perms = allPerm(m)
  t(apply(Perms, 1, function(p) as.vector(t(blockWise[p,]))))
}
> combs(1:4, 2)
[,1] [,2] [,3] [,4]
[1,] 3 4 1 2
[2,] 1 2 3 4
> combs(1:9, 3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 7 8 9 4 5 6 1 2 3
[2,] 1 2 3 7 8 9 4 5 6
[3,] 7 8 9 1 2 3 4 5 6
[4,] 4 5 6 7 8 9 1 2 3
[5,] 1 2 3 4 5 6 7 8 9
[6,] 4 5 6 1 2 3 7 8 9
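For comparison, the same block reordering can be sketched in base R without either package. This is not the multicool method above: `perm_rows` and `block_perms` are hypothetical helper names, the permutations are generated by plain recursion, and the rows come out in lexicographic order rather than cool-lex order.

```r
# Hypothetical helper: all permutations of v, one per row (base R only).
perm_rows <- function(v) {
  if (length(v) <= 1L) return(matrix(v, nrow = 1L))
  # put each element first in turn, then permute the remaining elements
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], perm_rows(v[-i]))
  }))
}

# Hypothetical helper: every ordering of the consecutive blocks of v.
block_perms <- function(v, block_length) {
  stopifnot(length(v) %% block_length == 0)
  blocks <- matrix(v, ncol = block_length, byrow = TRUE)  # one block per row
  orders <- perm_rows(seq_len(nrow(blocks)))              # block index orderings
  # reorder the block rows, then flatten row by row
  t(apply(orders, 1, function(p) as.vector(t(blocks[p, , drop = FALSE]))))
}

res <- block_perms(1:4, 2)
res
```

The recursion builds all n! orderings in memory, so this is only meant for a small number of blocks.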
Find length of subvectors in R
You can use this approach to find the average signal length:
signal <- 0011100110
mean(nchar(strsplit(as.character(signal), "0+")[[1]]))
# [1] 2.5
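Note that the numeric literal `0011100110` is stored as the number 11100110, so the leading zeros are gone before `as.character()` runs. If the signal is available as a vector of 0/1 values instead, run-length encoding avoids the string conversion altogether. The sketch below assumes such a vector; `bits` is a made-up name.

```r
# Hypothetical input: the same signal as a vector of 0/1 values
bits <- c(0, 0, 1, 1, 1, 0, 0, 1, 1, 0)

runs <- rle(bits)                            # lengths and values of consecutive runs
signal_lengths <- runs$lengths[runs$values == 1]
signal_lengths                               # lengths of the runs of 1s
mean(signal_lengths)                         # average signal length
```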
Sum of subvectors of a vector in R
Here's another approach, which seems to be significantly faster than the OP's for loop (by a factor of ~30) and faster than the other answers currently posted (by a factor of >= 18):
n <- 5
x <- 1:5
z <- lapply(1:n, function(i) cumsum(x[i:n]))
m <- mapply(function(y, l) c(rep(NA, n-l), y), z, lengths(z))
m[upper.tri(m)] <- t(m)[upper.tri(m)]
m
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 3 6 10 15
#[2,] 3 2 5 9 14
#[3,] 6 5 3 7 12
#[4,] 10 9 7 4 9
#[5,] 15 14 12 9 5
Benchmarks (scroll down for results)
library(microbenchmark)
n <- 100
x <- 1:n
f1 <- function() {
  X <- matrix(0,n,n)
  for(i in 1:n) {
    for(j in 1:n) {
      X[i,j] <- sum(x[i:j])
    }
  }
  X
}
f2 <- function() {
  mySum <- function(i,j) sum(x[i:j])
  outer(1:n, 1:n, Vectorize(mySum))
}
f3 <- function() {
  matrix(apply(expand.grid(1:n, 1:n), 1, function(y) sum(x[y[2]:y[1]])), n, n)
}
f4 <- function() {
  z <- lapply(1:n, function(i) cumsum(x[i:n]))
  m <- mapply(function(y, l) c(rep(NA, n-l), y), z, lengths(z))
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  m
}
f5 <- function() {
  X <- diag(x)
  for(i in 1:(n-1)) {
    for(j in 1:(n-i)){
      X[j+i,j] <- X[j,j+i] <- X[j+i,j+i] + X[j+i-1,j]
    }
  }
  X
}
microbenchmark(f1(), f2(), f3(), f4(), f5(), times = 25L, unit = "relative")
#Unit: relative
# expr min lq mean median uq max neval
# f1() 29.90113 29.01193 30.82411 31.15412 32.51668 35.93552 25
# f2() 29.46394 30.93101 31.79682 31.88397 34.52489 28.74846 25
# f3() 56.05807 53.82641 53.63785 55.36704 55.62439 45.94875 25
# f4() 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 25
# f5() 16.30136 17.46371 18.86259 17.87850 21.19914 23.68106 25
all.equal(f1(), f2())
#[1] TRUE
all.equal(f1(), f3())
#[1] TRUE
all.equal(f1(), f4())
#[1] TRUE
all.equal(f1(), f5())
#[1] TRUE
Updated with the edited function by Neal Fultz.
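The same matrix can also be written with prefix sums, since the sum of x[i..j] is the difference of two cumulative sums. The sketch below is an alternative formulation, not one of the benchmarked functions, and it has not been timed against them; `S` and `m2` are names chosen for illustration.

```r
x <- 1:5
n <- length(x)

# S[i + 1] holds x[1] + ... + x[i], with S[1] = 0
S <- c(0, cumsum(x))

# entry (i, j) is the sum of x between positions min(i, j) and max(i, j)
m2 <- outer(seq_len(n), seq_len(n), function(i, j) {
  S[pmax(i, j) + 1] - S[pmin(i, j)]
})
m2
```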
Best way to extract a subvector from a vector?
vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
vector<T> newVec(first, last);
It's an O(N) operation to construct the new vector, but there isn't really a better way.
How do I split a vector to subvectors with a specified subvector lengths in R
We create a grouping index with rep using the 't' vector and split the 'ts' vector:
split(ts, rep(seq_along(t), t))
#$`1`
#[1] 1 2 3 4
#$`2`
#[1] 5 6 7 8
#$`3`
#[1] 9 10 11 12
#$`4`
#[1] 13 14 15 16
#$`5`
#[1] 17 18
#$`6`
#[1] 19 20
#$`7`
#[1] 21 22
#$`8`
#[1] 23 24
#$`9`
#[1] 25 26 27 28
#$`10`
#[1] 29 30
data
ts <- 1:30
t <- c(4, 4, 4, 4, 2, 2, 2, 2, 4, 2)
NOTE: Both ts and t are names of existing R functions. It is better to give your own objects different names.
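Following that note, the same split can be written with object names that do not shadow functions. `values` and `sizes` are hypothetical replacements for 'ts' and 't', and the length check is an added safeguard rather than part of the original answer.

```r
values <- 1:30                               # was 'ts'
sizes  <- c(4, 4, 4, 4, 2, 2, 2, 2, 4, 2)    # was 't'

# the group sizes must account for every element exactly once
stopifnot(sum(sizes) == length(values))

groups <- split(values, rep(seq_along(sizes), sizes))
lengths(groups)   # size of each resulting sub-vector
```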
Problem k-subvector using dynamic programming
I don't know why you would insist on O(n*k); this can be solved in O(n) with a 'sliding window' approach.
1. Maintain the current 'window' [left..right].
2. At each step, if we can increase right by 1 (without violating the 'at most k distinct elements' requirement), do it.
3. Otherwise, increase left by 1.
4. Check whether the current window is the longest so far and go back to #2.
Checking whether we can increase right in #2 is a little tricky. We can use a hashtable that stores, for each element inside the window, how many times it occurs there.
So, the condition to allow right to increase would look like
hash.size < k || hash.contains(V[right + 1])
And each time left or right is increased, we'll need to update the hash (decrease or increase the number of occurrences of the given element).
I'm pretty sure any DP solution here would be longer and more complicated.
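The sliding window idea can be sketched in R as follows. This is a common variant rather than a literal transcription of the steps: it always advances right by one and then moves left forward until at most k distinct values remain, which gives the same longest window. `longest_k_distinct` is a hypothetical name, and an environment stands in for the hashtable, which assumes every value converts to a non-empty string.

```r
# Longest contiguous stretch of V with at most k distinct values.
# Returns the start index and length of the first longest window found.
longest_k_distinct <- function(V, k) {
  counts <- new.env(hash = TRUE)   # value (as string) -> occurrences in window
  distinct <- 0L                   # number of values with a non-zero count
  best <- c(start = 0L, length = 0L)
  left <- 1L
  for (right in seq_along(V)) {
    key <- as.character(V[right])
    old <- if (exists(key, envir = counts, inherits = FALSE)) {
      get(key, envir = counts)
    } else {
      0L
    }
    if (old == 0L) distinct <- distinct + 1L
    assign(key, old + 1L, envir = counts)
    # shrink from the left until the window is valid again
    while (distinct > k) {
      lkey <- as.character(V[left])
      cnt <- get(lkey, envir = counts) - 1L
      assign(lkey, cnt, envir = counts)
      if (cnt == 0L) distinct <- distinct - 1L
      left <- left + 1L
    }
    if (right - left + 1L > best[["length"]]) {
      best <- c(start = left, length = right - left + 1L)
    }
  }
  best
}

longest_k_distinct(c(1, 2, 1, 3, 4, 3, 3), 2)
```

Each index enters and leaves the window at most once, so the loop does O(n) hash updates overall.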