Create All Subvectors of a Certain Length (Moving Window)

Create all subvectors of a certain length (moving window)

Try this:

k <- 3
embed(vec, k)[, k:1]
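
As a minimal sketch, assuming `vec` is the integer vector `1:6` (an example value; the answer above does not define it), the result can be checked against a plain index-based construction. The names `windows`, `starts` and `by_index` are illustrative only.

```r
# Assumed example input; the original answer does not define `vec`.
vec <- 1:6
k <- 3

# embed() puts the most recent value first in each row,
# so reversing the columns restores the original order.
windows <- embed(vec, k)[, k:1]

# Index-based equivalent: row i holds vec[i], ..., vec[i + k - 1].
starts <- seq_len(length(vec) - k + 1)
by_index <- t(sapply(starts, function(i) vec[i:(i + k - 1)]))

# Both constructions should agree element by element.
stopifnot(isTRUE(all.equal(windows, by_index)))
windows
```

With these inputs there are `length(vec) - k + 1 = 4` windows, one per row.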

Create matrix from vector's rolling window

We can use embed

apply(embed(my_vector, 4), 1, rev)
#     [,1] [,2] [,3]
#[1,]    1    2    3
#[2,]    2    3    4
#[3,]    3    4    5
#[4,]    4    5    6

Or it can be modified to

t(embed(rev(my_vector), 4))[, 3:1]

Or as @lmo suggested

embed(my_vector, 3)[, 3:1]

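Or with matrix, which relies on recycling (R warns that the data length does not fit the dimensions, and the sizes are hard-coded for a vector of length 6)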

matrix(my_vector, 7, 3)[1:4,]

Combinations of vector with sub-vector length n

I can answer the whole question, but it will take a bit longer. This should give you the flavour of the answer.

The package combinat has a function called permn which gives you all the permutations of a vector. This is close to what you want, but not quite: what you need is the permutations of the blocks. So in your first example you have two blocks of length two, and in your second example you have three blocks of length three. If we look at the first, and think about ordering the blocks:

> library(combinat)
> numBlocks = 2
> permn(1:numBlocks)
[[1]]
[1] 1 2

[[2]]
[1] 2 1

So I hope you can see that the first permutation would take the blocks b1 = c(1,2), and b2 = c(3,4) and order them c(b1,b2), and the second would order them c(b2,b1).

Equally if you had three blocks, b1 = 1:3; b2 = 4:6; b3 = 7:9 then

permn(1:3)
[[1]]
[1] 1 2 3

[[2]]
[1] 1 3 2

[[3]]
[1] 3 1 2

[[4]]
[1] 3 2 1

[[5]]
[1] 2 3 1

[[6]]
[1] 2 1 3

gives you the ordering of these blocks. The more general solution is figuring out how to move the blocks around, but that isn't too hard.
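
One way to move the blocks around, sketched in base R only. Both `all_perms` and `permute_blocks` are hypothetical helpers written for this illustration (they are not part of combinat), and the sketch assumes the vector length is a multiple of the block length.

```r
# Hypothetical helper: all permutations of a vector, as a list,
# built recursively in base R.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    # Fix v[i] in front and permute the remaining elements.
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Reorder whole blocks of `v`, keeping each block's contents intact.
permute_blocks <- function(v, block_length) {
  stopifnot(length(v) %% block_length == 0)
  num_blocks <- length(v) / block_length
  # One block per row.
  blocks <- matrix(v, ncol = block_length, byrow = TRUE)
  orders <- all_perms(seq_len(num_blocks))
  # One block ordering per row of the result.
  t(sapply(orders, function(p) as.vector(t(blocks[p, , drop = FALSE]))))
}

permute_blocks(1:4, 2)
```

The row order depends on how the permutations are generated, so it need not match the order produced by permn or by other packages.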

Update: Using my multicool package. Note co-lexical ordering (coolex) isn't the order you'd come up with by yourself.

library(multicool)

combs = function(v, blockLength){
  if(length(v) %% blockLength != 0){
    stop("vector length must be divisible by blockLength")
  }

  numBlocks = length(v) / blockLength
  blockWise = matrix(v, ncol = blockLength, byrow = TRUE)

  m = initMC(1:numBlocks)
  Perms = allPerm(m)

  t(apply(Perms, 1, function(p) as.vector(t(blockWise[p,]))))
}

> combs(1:4, 2)
     [,1] [,2] [,3] [,4]
[1,]    3    4    1    2
[2,]    1    2    3    4

> combs(1:9, 3)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    7    8    9    4    5    6    1    2    3
[2,]    1    2    3    7    8    9    4    5    6
[3,]    7    8    9    1    2    3    4    5    6
[4,]    4    5    6    7    8    9    1    2    3
[5,]    1    2    3    4    5    6    7    8    9
[6,]    4    5    6    1    2    3    7    8    9

Find length of subvectors in R

You can use this approach to find the average signal length:

signal <- 0011100110

mean(nchar(strsplit(as.character(signal), "0+")[[1]]))
# [1] 2.5
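
The numeric literal `0011100110` is stored as `11100110`, so the split above never sees the leading zeros. A sketch of an alternative that keeps the signal as a character string and uses run-length encoding instead, assuming the signal contains only the characters 0 and 1 (the names `runs` and `one_lengths` are illustrative):

```r
# Keeping the signal as a string preserves its leading zeros.
signal <- "0011100110"

# Split into single characters and run-length encode them.
runs <- rle(strsplit(signal, "")[[1]])

# Keep the lengths of the runs of "1" only.
one_lengths <- runs$lengths[runs$values == "1"]
mean(one_lengths)
```

Here the runs of ones have lengths 3 and 2, so the mean is again 2.5.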

Sum of subvectors of a vector in R

Here's another approach which seems to be significantly faster than OP's for loop (by factor ~30) and faster than the other answers currently present (by factor >=18):

n <- 5
x <- 1:5
z <- lapply(1:n, function(i) cumsum(x[i:n]))
m <- mapply(function(y, l) c(rep(NA, n-l), y), z, lengths(z))
m[upper.tri(m)] <- t(m)[upper.tri(m)]
m

#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    3    6   10   15
#[2,]    3    2    5    9   14
#[3,]    6    5    3    7   12
#[4,]   10    9    7    4    9
#[5,]   15   14   12    9    5

Benchmarks (scroll down for results)

library(microbenchmark)
n <- 100
x <- 1:n

f1 <- function() {
  X <- matrix(0,n,n)
  for(i in 1:n) {
    for(j in 1:n) {
      X[i,j] <- sum(x[i:j])
    }
  }
  X
}

f2 <- function() {
  mySum <- function(i,j) sum(x[i:j])
  outer(1:n, 1:n, Vectorize(mySum))
}

f3 <- function() {
  matrix(apply(expand.grid(1:n, 1:n), 1, function(y) sum(x[y[2]:y[1]])), n, n)
}

f4 <- function() {
  z <- lapply(1:n, function(i) cumsum(x[i:n]))
  m <- mapply(function(y, l) c(rep(NA, n-l), y), z, lengths(z))
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  m
}

f5 <- function() {
  X <- diag(x)
  for(i in 1:(n-1)) {
    for(j in 1:(n-i)){
      X[j+i,j] <- X[j,j+i] <- X[j+i,j+i] + X[j+i-1,j]
    }  
  }
  X
}

microbenchmark(f1(), f2(), f3(), f4(), f5(), times = 25L, unit = "relative")
#Unit: relative
# expr      min       lq     mean   median       uq      max neval
# f1() 29.90113 29.01193 30.82411 31.15412 32.51668 35.93552    25
# f2() 29.46394 30.93101 31.79682 31.88397 34.52489 28.74846    25
# f3() 56.05807 53.82641 53.63785 55.36704 55.62439 45.94875    25
# f4()  1.00000  1.00000  1.00000  1.00000  1.00000  1.00000    25
# f5() 16.30136 17.46371 18.86259 17.87850 21.19914 23.68106    25

all.equal(f1(), f2())
#[1] TRUE
all.equal(f1(), f3())
#[1] TRUE
all.equal(f1(), f4())
#[1] TRUE
all.equal(f1(), f5())
#[1] TRUE

Updated with the edited function by Neal Fultz.
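
For comparison, the same matrix can also be built from prefix sums, since the sum of `x[i:j]` is the difference of two cumulative sums. This is a different technique from the functions benchmarked above, offered only as a sketch; it was not part of the benchmark and no timing claim is made. The names `cs` and `range_sum` are illustrative.

```r
x <- 1:5
n <- length(x)

# Prefix sums with a leading zero: cs[k + 1] is sum(x[1:k]).
cs <- c(0, cumsum(x))

# For each pair (i, j), sum the elements between the smaller
# and the larger index, inclusive.
range_sum <- function(i, j) cs[pmax(i, j) + 1] - cs[pmin(i, j)]
m <- outer(seq_len(n), seq_len(n), range_sum)
m
```

Because `pmax` and `pmin` are vectorised, `outer` can call `range_sum` directly without `Vectorize`.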

Best way to extract a subvector from a vector?

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
vector<T> newVec(first, last);

It's an O(N) operation to construct the new vector, where N is the number of elements copied, but there isn't really a better way.

How do I split a vector to subvectors with a specified subvector lengths in R

We create a grouping index with rep using the 't' vector and split the 'ts' vector

split(ts, rep(seq_along(t), t))
#$`1`
#[1] 1 2 3 4

#$`2`
#[1] 5 6 7 8

#$`3`
#[1]  9 10 11 12

#$`4`
#[1] 13 14 15 16

#$`5`
#[1] 17 18

#$`6`
#[1] 19 20

#$`7`
#[1] 21 22

#$`8`
#[1] 23 24

#$`9`
#[1] 25 26 27 28

#$`10`
#[1] 29 30

data

ts <- 1:30
t <- c(4, 4, 4, 4, 2, 2, 2, 2, 4, 2)

NOTE: Both ts and t are function names. It is better to give objects names that do not clash with existing functions.
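
Following that note, a sketch of the same split with non-clashing names. `values`, `sizes`, `groups` and `chunks` are hypothetical names chosen for this example, and the length check is an added safeguard rather than part of the original answer.

```r
# Hypothetical names chosen to avoid masking stats::ts() and base::t().
values <- 1:30
sizes <- c(4, 4, 4, 4, 2, 2, 2, 2, 4, 2)

# The sizes must account for every element; otherwise the
# grouping index would not line up with the data.
stopifnot(sum(sizes) == length(values))

# Group i is repeated sizes[i] times, giving one label per element.
groups <- rep(seq_along(sizes), times = sizes)
chunks <- split(values, groups)

# Each chunk should have the requested length.
lengths(chunks)
```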

Problem k-subvector using dynamic programming

I don't know why you would insist on O(n*k); this can be solved in O(n) with a 'sliding window' approach.

1. Maintain the current 'window' [left..right].
2. At each step, if we can increase right by 1 (without violating the 'at most k distinct elements' requirement), do it.
3. Otherwise, increase left by 1.
4. Check whether the current window is the longest so far and go back to #2.

Checking whether we can increase right in #2 is a little tricky. We can use a hash table that stores, for each element inside the window, how many times it occurs there.

So, the condition to allow right increase would look like

hash.size < k || hash.contains(V[right + 1])

And each time left or right is increased, we'll need to update the hash table (decrease or increase the number of occurrences of the given element).

I'm pretty sure any DP solution here would be longer and more complicated.
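
The steps above can be sketched in R roughly as follows. This is a common variant of the idea rather than a literal transcription: it always advances right and then shrinks the window from the left until at most k distinct values remain. The function name `longest_k_distinct` is hypothetical, an environment stands in for the hash table, and the sketch assumes the elements map to distinct non-empty strings under `as.character()`.

```r
# Longest contiguous stretch of `v` with at most `k` distinct values.
# Returns the 1-based start and end positions of that stretch.
longest_k_distinct <- function(v, k) {
  stopifnot(k >= 1, length(v) >= 1)
  counts <- new.env(hash = TRUE)  # value -> occurrences inside the window
  distinct <- 0
  best <- c(start = 1, end = 0)   # empty window to begin with
  left <- 1
  for (right in seq_along(v)) {
    # Grow the window to the right and record the new element.
    key <- as.character(v[right])
    if (is.null(counts[[key]])) counts[[key]] <- 0
    if (counts[[key]] == 0) distinct <- distinct + 1
    counts[[key]] <- counts[[key]] + 1

    # Shrink from the left until the window is valid again.
    while (distinct > k) {
      old <- as.character(v[left])
      counts[[old]] <- counts[[old]] - 1
      if (counts[[old]] == 0) distinct <- distinct - 1
      left <- left + 1
    }

    # Keep the first window that is strictly longer than the best so far.
    if (right - left > best[["end"]] - best[["start"]]) {
      best <- c(start = left, end = right)
    }
  }
  best
}

longest_k_distinct(c(1, 2, 1, 3, 4, 3, 3, 5), 2)
```

Each element enters and leaves the window at most once, so the loop does a linear number of hash-table updates.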
