Merge 2 vectors with different lengths into a data frame
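One option is to index both vectors with a sequence as long as the longer one; indices past the end of the shorter vector come back as NA: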
sq <- seq(max(length(n), length(s)))
data.frame(n[sq], s[sq])
# n.sq. s.sq.
#1 2 aa
#2 3 bb
#3 5 cc
#4 6 <NA>
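The same idea can be wrapped in a small helper. This is only a minimal sketch: the name pad_to_data_frame and the sample inputs are assumptions reconstructed from the output above, not part of the original answer.

```r
# Hypothetical helper: pad any number of named vectors to a common length
pad_to_data_frame <- function(...) {
  vecs <- list(...)
  # Index sequence as long as the longest input
  sq <- seq_len(max(lengths(vecs)))
  # Indexing past the end of a shorter vector yields NA
  data.frame(lapply(vecs, function(v) v[sq]), stringsAsFactors = FALSE)
}

# Sample inputs (assumed)
out <- pad_to_data_frame(n = c(2, 3, 5, 6), s = c("aa", "bb", "cc"))
print(out)
```

Passing the vectors as named arguments gives clean column names instead of n.sq. and s.sq.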
Combining vectors of unequal length into a data frame
I think that you may be approaching this the wrong way:
If you have time series of unequal length then the absolute best thing to do is to keep them as time series and merge them. Most time series packages allow this. So you will end up with a multivariate time series and each value will be properly associated with the same date.
So put your time series into zoo objects, merge them, then use my qplot.zoo function to plot them. That will deal with switching from zoo into a long data frame.
Here's an example:
> z1 <- zoo(1:8, 1:8)
> z2 <- zoo(2:8, 2:8)
> z3 <- zoo(4:8, 4:8)
> nm <- list("z1", "z2", "z3")
> z <- zoo()
> for(i in 1:length(nm)) z <- merge(z, get(nm[[i]]))
> names(z) <- unlist(nm)
> z
z1 z2 z3
1 1 NA NA
2 2 2 NA
3 3 3 NA
4 4 4 4
5 5 5 5
6 6 6 6
7 7 7 7
8 8 8 8
> library(reshape2)
> library(ggplot2)
> z.df <- data.frame(dates = index(z), coredata(z))
> z.df <- melt(z.df, id.vars = "dates", variable.name = "val")
> ggplot(na.omit(z.df), aes(x = dates, y = value, group = val, colour = val)) + geom_line() + theme(legend.position = "none")
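As a shorter variant of the loop above, merge() on zoo objects also accepts several series in one call, and zoo's fortify.zoo() can build the long data frame directly. This is a sketch that assumes the zoo package is installed; the variable name z.long is chosen here for illustration.

```r
library(zoo)

z1 <- zoo(1:8, 1:8)
z2 <- zoo(2:8, 2:8)
z3 <- zoo(4:8, 4:8)

# Merge all three series at once; missing dates are filled with NA
z <- merge(z1, z2, z3)

# Long format: one row per (index, series) pair, then drop the NA padding
z.long <- na.omit(fortify.zoo(z, melt = TRUE))
print(head(z.long))
```

The long data frame can then be passed to ggplot() in the same way as the melted data frame above.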
How to write two vectors of different length into one data frame by writing same values into same row?
One option is match(): build the union of the values, then look each value up in both vectors.
(tmp <- unique(c(ef1, ef2)))
# [1] "A1" "A2" "B0" "B1" "C1" "C2" "D1" "D2"
out <- data.frame(ef1 = ef1[match(tmp, ef1)],
ef2 = ef2[match(tmp, ef2)])
Result
out
# ef1 ef2
#1 A1 A1
#2 A2 A2
#3 B0 <NA>
#4 B1 <NA>
#5 C1 C1
#6 C2 C2
#7 <NA> D1
#8 <NA> D2
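For a version that runs on its own, here is the same approach with sample inputs. The vectors ef1 and ef2 are assumptions reconstructed from the result shown above, since the original question's data is not included here.

```r
# Sample inputs (assumed)
ef1 <- c("A1", "A2", "B0", "B1", "C1", "C2")
ef2 <- c("A1", "A2", "C1", "C2", "D1", "D2")

# Union of all values, sorted so rows come out in a stable order
tmp <- sort(unique(c(ef1, ef2)))

# match() gives the position of each value in a vector, or NA if absent,
# so indexing with it places equal values in the same row
out <- data.frame(ef1 = ef1[match(tmp, ef1)],
                  ef2 = ef2[match(tmp, ef2)],
                  stringsAsFactors = FALSE)
print(out)
```

Note that match() only finds the first occurrence, so this assumes values are unique within each vector.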
combine vectors of different length into data frame in R
There's a non-exported function charMat in my "splitstackshape" package that might be useful for something like this.
Here, I've used it in conjunction with mget:
## library(splitstackshape) # not required since you'll be using ::: anyway...
data.frame(t(splitstackshape:::charMat(mget(ls(pattern = "x\\d")), mode = "value")))
# X1 X2 X3 X4
# a a a a a
# b b b b b
# c c c c <NA>
# d d <NA> d <NA>
# e e <NA> <NA> e
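Because charMat is not exported and its interface could change, a similar value-aligned layout can be sketched in base R with match(). The input vectors x1 to x4 are assumptions reconstructed from the output above, and they are collected in an explicit list rather than with mget.

```r
# Sample inputs (assumed from the output shown above)
vecs <- list(x1 = c("a", "b", "c", "d", "e"),
             x2 = c("a", "b", "c"),
             x3 = c("a", "b", "d"),
             x4 = c("a", "b", "e"))

# One row per distinct value across all vectors
lv <- sort(unique(unlist(vecs)))

# For each vector, place each value in the row that carries its name
aligned <- data.frame(lapply(vecs, function(v) v[match(lv, v)]),
                      row.names = lv, stringsAsFactors = FALSE)
print(aligned)
```

This only reproduces the layout for vectors without repeated values; it is not a drop-in replacement for charMat.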
Combining vectors of unequal length and non-unique values
I maintain that your problem might be solved in terms of the shortest common supersequence. It assumes that your two vectors each represent one sequence. Please give the code below a try.
If it still does not solve your problem, you'll have to explain exactly what you mean by "my vector contains not one but many sequences": define what you mean by a sequence and tell us how sequences can be identified by scanning through your two vectors.
Part I: given two sequences, find the longest common subsequence
LongestCommonSubsequence <- function(X, Y) {
  m <- length(X)
  n <- length(Y)
  # C[i + 1, j + 1] holds the LCS length of X[1:i] and Y[1:j]
  C <- matrix(0, 1 + m, 1 + n)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      if (X[i] == Y[j]) {
        C[i + 1, j + 1] = C[i, j] + 1
      } else {
        C[i + 1, j + 1] = max(C[i + 1, j], C[i, j + 1])
      }
    }
  }
  # Walk back from the bottom-right corner, recording matched positions
  backtrack <- function(C, X, Y, i, j) {
    if (i == 1 || j == 1) {
      return(data.frame(I = c(), J = c(), LCS = c()))
    } else if (X[i - 1] == Y[j - 1]) {
      return(rbind(backtrack(C, X, Y, i - 1, j - 1),
                   data.frame(LCS = X[i - 1], I = i - 1, J = j - 1)))
    } else if (C[i, j - 1] > C[i - 1, j]) {
      return(backtrack(C, X, Y, i, j - 1))
    } else {
      return(backtrack(C, X, Y, i - 1, j))
    }
  }
  return(backtrack(C, X, Y, m + 1, n + 1))
}
Part II: given two sequences, find the shortest common supersequence
ShortestCommonSupersequence <- function(X, Y) {
  # Matched index pairs: I indexes X, J indexes Y
  LCS <- LongestCommonSubsequence(X, Y)[c("I", "J")]
  X.df <- data.frame(X = X, I = seq_along(X), stringsAsFactors = FALSE)
  Y.df <- data.frame(Y = Y, J = seq_along(Y), stringsAsFactors = FALSE)
  # Outer joins keep the unmatched elements of both vectors
  ALL <- merge(LCS, X.df, by = "I", all = TRUE)
  ALL <- merge(ALL, Y.df, by = "J", all = TRUE)
  # Order rows by the larger of the two positions (NA counts as 0)
  ALL <- ALL[order(pmax(ifelse(is.na(ALL$I), 0, ALL$I),
                        ifelse(is.na(ALL$J), 0, ALL$J))), ]
  ALL$SCS <- ifelse(is.na(ALL$X), ALL$Y, ALL$X)
  ALL
}
Your Example:
ShortestCommonSupersequence(X = c("a","g","b","h","a","g","c"),
Y = c("a","g","b","a","g","b","h","c"))
# J I X Y SCS
# 1 1 1 a a a
# 2 2 2 g g g
# 3 3 3 b b b
# 9 NA 4 h <NA> h
# 4 4 5 a a a
# 5 5 6 g g g
# 6 6 NA <NA> b b
# 7 7 NA <NA> h h
# 8 8 7 c c c
(where the two updated vectors are in columns X and Y.)
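One way to sanity-check a result like the one above: a shortest common supersequence has length(X) + length(Y) minus the length of the longest common subsequence. The sketch below recomputes only that length with the same dynamic-programming table; the helper name lcs_length is an assumption made for this illustration.

```r
# Hypothetical helper: length of the longest common subsequence
lcs_length <- function(X, Y) {
  m <- length(X)
  n <- length(Y)
  # C[i + 1, j + 1] holds the LCS length of X[1:i] and Y[1:j]
  C <- matrix(0, m + 1, n + 1)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      C[i + 1, j + 1] <- if (X[i] == Y[j]) {
        C[i, j] + 1
      } else {
        max(C[i + 1, j], C[i, j + 1])
      }
    }
  }
  C[m + 1, n + 1]
}

X <- c("a", "g", "b", "h", "a", "g", "c")
Y <- c("a", "g", "b", "a", "g", "b", "h", "c")

lcs <- lcs_length(X, Y)
scs <- length(X) + length(Y) - lcs
print(c(lcs = lcs, scs = scs))
```

For these inputs the expected supersequence length is 9, which agrees with the nine rows in the table above.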
How to convert a list consisting of vector of different lengths to a usable data frame in R?
Try this:
word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])
n.obs <- sapply(word.list, length)
seq.max <- seq_len(max(n.obs))
mat <- t(sapply(word.list, "[", i = seq.max))
The trick is that indexing a vector past its end pads the result with NA. For example,
c(1:2)[1:4]
returns the original vector followed by two NAs.
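The steps above stop at a character matrix. A minimal sketch of the last step, converting that matrix to a data frame, could look like this; the name word.df is an assumption, and lengths() is used as a shorthand for sapply(word.list, length).

```r
word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])

# Index every element with 1..max length; shorter vectors are padded with NA
seq.max <- seq_len(max(lengths(word.list)))
mat <- t(sapply(word.list, "[", i = seq.max))

# One row per list element, one column per position
word.df <- as.data.frame(mat, stringsAsFactors = FALSE)
print(word.df)
```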
The simplest way to convert a list with various length vectors to a data.frame in R
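We can pad every element to the length of the longest one with "length<-":
data.frame(lapply(aa, "length<-", max(lengths(aa))))
Or, using the tidyverse, build a long-format tibble with one row per value (no NA padding):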
library(dplyr)
library(tibble)
library(tidyr)
enframe(aa) %>%
unnest(value)
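Since aa is not defined here, the following sketch uses an assumed named list to show both shapes in base R: the padded wide data frame from "length<-", and a long one from stack(), which is roughly what the enframe/unnest pipeline returns.

```r
# Sample input (assumed): a named list of vectors with different lengths
aa <- list(A = c(1, 3, 4), B = c(3, 5, 7, 7, 8))

# Wide: extend every vector to the longest length; new slots are NA
n.max <- max(lengths(aa))
wide <- data.frame(lapply(aa, `length<-`, n.max))
print(wide)

# Long: one row per value, with the list name in column "ind"
long <- stack(aa)
print(long)
```

The wide form keeps one column per list element; the long form avoids NA padding altogether.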