Split a Vector into Three Vectors of Unequal Length in R

You could use rep to create the group index for each element and then split on that index:

split(1:12, rep(1:3, c(2, 3, 7)))
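To make the role of rep explicit, here is a minimal sketch that builds the index separately before splitting. The names `x`, `sizes`, `idx` and `parts` are illustrative and not part of the original answer.

```r
# Group sizes are taken from the answer above; all names are illustrative.
x <- 1:12
sizes <- c(2, 3, 7)
stopifnot(sum(sizes) == length(x))  # the sizes must account for every element

# One group label per element: label k is repeated sizes[k] times.
idx <- rep(seq_along(sizes), times = sizes)
idx
# [1] 1 1 2 2 2 3 3 3 3 3 3 3

parts <- split(x, idx)
lengths(parts)
# 1 2 3
# 2 3 7
```

Checking that the sizes sum to the vector length up front avoids split silently recycling a grouping vector that is too short.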

If you want the items assigned at random, so that it's not just the first 2 items in the first vector, the next 3 in the second, and so on, add a call to sample:

split(1:12, sample(rep(1:3, c(2, 3, 7))))
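As a rough check of what the shuffle does, the sketch below fixes a seed (an arbitrary choice, only so the run is repeatable) and confirms that the group sizes stay at 2, 3 and 7 while membership changes. The variable names are illustrative.

```r
set.seed(42)  # arbitrary seed, assumed here only for reproducibility
x <- 1:12

# Same labels as before, but in shuffled order.
idx <- sample(rep(1:3, c(2, 3, 7)))
parts <- split(x, idx)

lengths(parts)                           # sizes are still 2, 3 and 7
sort(unlist(parts, use.names = FALSE))   # every element appears exactly once
```

Because sample only permutes the labels, the number of elements per group cannot change; only which elements land in each group does.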

If you don't have specific lengths (2, 3, 7) in mind and just don't want equal-length vectors every time, SimonO101's answer is the way to go.

How to split a vector into n vectors and print them in one table in R?

You cannot have a data frame or matrix whose columns have unequal lengths in R, but you can pad the shorter vectors with NA:

sapply(y, `[`, seq_len(max(lengths(y))))

#       1  2
# [1,]  1  5
# [2,]  2  6
# [3,]  3  7
# [4,]  4  8
# [5,] NA  9
# [6,] NA 10
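The answer does not show how `y` was built. The sketch below assumes a list of two vectors with lengths 4 and 6, which is consistent with the printed matrix, and spells out why the padding works.

```r
# Hypothetical input: the original answer does not show how `y` was created.
y <- split(1:10, rep(1:2, c(4, 6)))

# Indexing past the end of a vector yields NA, so every element is
# stretched to the length of the longest one before sapply() binds them.
n_max <- max(lengths(y))
padded <- sapply(y, `[`, seq_len(n_max))

dim(padded)
# [1] 6 2
```

The result is a matrix with one column per list element; wrap it in as.data.frame() if a data frame is needed.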

Split a vector into unequal chunks in R

split(example.data, rep(1:4, c(4,2,1,3)))
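Since `example.data` is not defined in the answer, here is a small self-contained version that assumes a 10-element character vector as placeholder data. The names `sizes` and `chunks` are illustrative.

```r
example.data <- letters[1:10]  # placeholder values, not from the original question
sizes <- c(4, 2, 1, 3)         # chunk lengths from the answer above

chunks <- split(example.data, rep(seq_along(sizes), sizes))
chunks[["3"]]
# [1] "g"
```

The chunk lengths must add up to the length of the vector; otherwise split recycles the grouping vector and the chunks no longer match the intended sizes.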

Split a vector into subvectors of the same length and assign each of them to a new vector

We can split the vector 's' with a grouping index created with gl. The output is a list, and it is better to keep the pieces in that list than to create multiple objects in the global environment.

lst <- split(s, as.integer(gl(length(s), 10, length(s))))

gl creates the grouping vector:

as.integer(gl(length(s), 10, length(s)))
#[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3

When 's' is split by the output of gl, the first 10 values of 's' are grouped together, then the next 10, and so on. The groups are stored as a list of vectors.
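An equivalent grouping index can be computed arithmetically, which some readers may find easier to follow than gl. The sketch below assumes a hypothetical 25-element `s`, so the last chunk comes out shorter, and checks that both approaches agree.

```r
s <- 101:125  # hypothetical data; 25 values so the final chunk is short

grp_gl <- as.integer(gl(length(s), 10, length(s)))
grp_ceiling <- as.integer(ceiling(seq_along(s) / 10))  # positions 1-10 -> 1, 11-20 -> 2, ...

identical(grp_gl, grp_ceiling)
# [1] TRUE

lst <- split(s, grp_ceiling)
lengths(lst)
#  1  2  3
# 10 10  5
```

Either index works with split; the choice is mostly a matter of readability.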

Split dataframe into a list with vectors of unequal lengths

Map(function(x, a, b) x[a:b], df, seq_along(df), c(3, 5, 4, 8, 10))
# $X1
# [1] 1 2 3
# $X2
# [1] 2 3 4 5
# $X3
# [1] 3 4
# $X4
# [1] 4 5 6 7 8
# $X5
# [1] 5 6 7 8 9 10
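The data frame `df` is not shown in the answer. The sketch below assumes five columns named X1 to X5, each holding 1:10, which matches the printed output but is not confirmed by the source.

```r
# Hypothetical data frame: five columns, each holding 1:10.
df <- as.data.frame(replicate(5, 1:10))
names(df) <- paste0("X", 1:5)

# For column k, keep rows k through the matching end position.
ends <- c(3, 5, 4, 8, 10)
res <- Map(function(x, a, b) x[a:b], df, seq_along(df), ends)

lengths(res)
# X1 X2 X3 X4 X5
#  3  4  2  5  6
```

Map walks the columns, the start positions and the end positions in parallel, so each column can be cut to a different range.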

How to split one vector in to multiple vectors with pattern in R

markus's solution is correct. Many thanks.

n <- 2; split(1:14, rep(1:2, n*3:4))
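The one-liner relies on operator precedence: `:` binds more tightly than `*`, so `n*3:4` is evaluated as `n * (3:4)`. A short sketch with illustrative names makes the intermediate group sizes visible.

```r
n <- 2
sizes <- n * 3:4   # parsed as n * (3:4), giving c(6, 8)
sizes
# [1] 6 8

parts <- split(1:14, rep(1:2, sizes))
lengths(parts)
# 1 2
# 6 8
```

So the 14 elements are split into a first group of 6 and a second group of 8.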

Combining vectors of unequal length and non-unique values

I maintain that your problem might be solved in terms of the shortest common supersequence. It assumes that your two vectors each represent one sequence. Please give the code below a try.

If it still does not solve your problem, you'll have to explain exactly what you mean by "my vector contains not one but many sequences": define what you mean by a sequence and tell us how sequences can be identified by scanning through your two vectors.

Part I: given two sequences, find the longest common subsequence

LongestCommonSubsequence <- function(X, Y) {
  m <- length(X)
  n <- length(Y)
  # C[i + 1, j + 1] holds the LCS length of X[1:i] and Y[1:j]
  C <- matrix(0, 1 + m, 1 + n)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      if (X[i] == Y[j]) {
        C[i + 1, j + 1] <- C[i, j] + 1
      } else {
        C[i + 1, j + 1] <- max(C[i + 1, j], C[i, j + 1])
      }
    }
  }

  # Walk back from the bottom-right corner, recording the matched positions
  backtrack <- function(C, X, Y, i, j) {
    if (i == 1 || j == 1) {
      return(data.frame(I = c(), J = c(), LCS = c()))
    } else if (X[i - 1] == Y[j - 1]) {
      return(rbind(backtrack(C, X, Y, i - 1, j - 1),
                   data.frame(LCS = X[i - 1], I = i - 1, J = j - 1)))
    } else if (C[i, j - 1] > C[i - 1, j]) {
      return(backtrack(C, X, Y, i, j - 1))
    } else {
      return(backtrack(C, X, Y, i - 1, j))
    }
  }

  return(backtrack(C, X, Y, m + 1, n + 1))
}

Part II: given two sequences, find the shortest common supersequence

ShortestCommonSupersequence <- function(X, Y) {
  LCS <- LongestCommonSubsequence(X, Y)[c("I", "J")]
  X.df <- data.frame(X = X, I = seq_along(X), stringsAsFactors = FALSE)
  Y.df <- data.frame(Y = Y, J = seq_along(Y), stringsAsFactors = FALSE)
  ALL <- merge(LCS, X.df, by = "I", all = TRUE)
  ALL <- merge(ALL, Y.df, by = "J", all = TRUE)
  ALL <- ALL[order(pmax(ifelse(is.na(ALL$I), 0, ALL$I),
                        ifelse(is.na(ALL$J), 0, ALL$J))), ]
  ALL$SCS <- ifelse(is.na(ALL$X), ALL$Y, ALL$X)
  ALL
}

Your example:

ShortestCommonSupersequence(X = c("a","g","b","h","a","g","c"),
                            Y = c("a","g","b","a","g","b","h","c"))
#    J  I    X    Y SCS
# 1  1  1    a    a   a
# 2  2  2    g    g   g
# 3  3  3    b    b   b
# 9 NA  4    h <NA>   h
# 4  4  5    a    a   a
# 5  5  6    g    g   g
# 6  6 NA <NA>    b   b
# 7  7 NA <NA>    h   h
# 8  8  7    c    c   c

(where the two updated vectors are in columns X and Y.)
