R - store a matrix into a single dataframe cell
I think the trick may be to insert it in as a list:
set.seed(123)
dat <- data.frame(women, m=I(replicate(nrow(women), matrix(rnorm(4), 2, 2),
simplify=FALSE)))
str(dat)
'data.frame': 15 obs. of 3 variables:
$ height: num 58 59 60 61 62 63 64 65 66 67 ...
$ weight: num 115 117 120 123 126 129 132 135 139 142 ...
$ m :List of 15
..$ : num [1:2, 1:2] -0.5605 -0.2302 1.5587 0.0705
..$ : num [1:2, 1:2] 0.129 1.715 0.461 -1.265
...
..$ : num [1:2, 1:2] -1.549 0.585 0.124 0.216
..- attr(*, "class")= chr "AsIs"
dat[[1, "m"]]
[,1] [,2]
[1,] -0.5604756 1.55870831
[2,] -0.2301775 0.07050839
dat[[2, "m"]]
[,1] [,2]
[1,] 0.1292877 0.4609162
[2,] 1.7150650 -1.2650612
EDIT: So the question really is about initialising and then assigning. Given that, you should be able to define a data.frame like the one in your question like so:
dat <- data.frame(i=1:5, m=I(vector(mode="list", length=5)))
You can then assign to it like so:
dat[[2, "m"]] <- matrix(rnorm(9), 3, 3)
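As a minimal sketch of the initialise-then-assign pattern (the matrix contents and the choice of rows 2 and 4 are purely illustrative), the following shows that each cell can hold a matrix of a different shape and that untouched cells remain NULL:

```r
# Column 'm' starts as a list of NULLs; I() keeps data.frame() from expanding it.
dat <- data.frame(i = 1:5, m = I(vector(mode = "list", length = 5)))

# Cells can hold matrices of different shapes.
dat[[2, "m"]] <- matrix(1:9, 3, 3)
dat[[4, "m"]] <- matrix(1:4, 2, 2)

dim(dat[[2, "m"]])      # dimensions of the matrix stored in row 2
is.null(dat[[1, "m"]])  # cells that were never assigned are still NULL
```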
How to place a matrix as an element of a data.frame in R?
We can wrap the matrix in a list and then assign it to the cell.
dataframe$out[1] <- list(matrixObj)
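A self-contained sketch of that one-liner, assuming a hypothetical three-row data frame whose `out` column is first created as an empty list (the `id` column and the matrix values are made up for illustration):

```r
# Hypothetical data frame with an empty list column named 'out'.
dataframe <- data.frame(id = 1:3)
dataframe$out <- vector(mode = "list", length = nrow(dataframe))

matrixObj <- matrix(1:6, nrow = 2)

# Single-bracket assignment into a list needs a list on the right-hand side.
dataframe$out[1] <- list(matrixObj)

dataframe$out[[1]]  # the stored matrix, retrieved with double brackets
```

The `list()` wrapper matters because single-bracket assignment replaces list elements one for one; without it R would try to spread the six matrix values over six cells.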
Creating data.frames where one column contains matrices
You have two issues:
- To store a matrix in a data.frame (tibble), you simply have to put it in a list.
- To create 2 x 2 matrices (instead of repeating the same matrix in each cell), you need to work row by row. Currently, when you do
matrix(c(disp, hp, gear, carb))
you create one 128 x 1 matrix built from all 32 rows at once! You want only the 4 values of each row, reshaped to 2 x 2.
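The difference between the whole-column call and the row-by-row call can be checked in base R. This is only a sketch: `Map()` is used here as a stand-in for the tidyverse row-wise tools discussed in this answer, and the names `whole` and `mats` are introduced for illustration.

```r
# Whole columns at once: c() concatenates four length-32 vectors,
# and matrix() without dimensions returns a single column.
whole <- with(mtcars, matrix(c(disp, hp, gear, carb)))
dim(whole)  # 128 rows, 1 column

# Row by row: one 2 x 2 matrix per row of mtcars.
mats <- Map(
  function(disp, hp, gear, carb) matrix(c(disp, hp, gear, carb), 2, 2),
  mtcars$disp, mtcars$hp, mtcars$gear, mtcars$carb
)
length(mats)    # one list element per row
dim(mats[[1]])  # each element is 2 x 2
```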
Working with pmap allows you to process the rows one by one, but alternatively you can use rowwise, which groups by row:
library(tidyverse)
df <-
mtcars %>%
as_tibble() %>%
rowwise() %>%
mutate(mat = list(matrix(c(disp, hp, gear, carb), 2, 2)))
Edit: Now how do you actually use those? Let's take the example of a fisher.test. Note that a test is a complex object, with components (like p.value) and attributes, so we'll have to store them in a list-column.
You can either keep working rowwise, in which case the list is automagically "unlist-ed":
df %>%
# keep in mind df is still grouped by row so 'mat' is only one matrix.
# A test is a complex object so we need to store it in a list-column
mutate(test = list(fisher.test(mat)),
# test is just one test so we can extract p-value directly
pval = test$p.value)
Or if you stop working row by row (for which you simply need to ungroup), then mat is a list of matrices onto which you can map functions. We use the map functions from purrr.
library("purrr")
df %>%
ungroup() %>%
# Apply the test to each mat using `map` from `purrr`
# `map` returns a list so `test` is a list-column
mutate(test = map(mat, fisher.test),
# Now `test` is a list of tests... so you need to map operations onto it
# Extract the p-values from each test, into a numeric column rather than a list-column
pval = map_dbl(test, pluck, "p.value"))
Which one you prefer is a matter of taste :)
How to store mean vectors and covariance matrices in cells of a data table?
We can use mget instead of get, because get returns the value of a single object while mget returns one or more:
data[, lapply(mget(bd), function(x) mean(x)), by = a]
If we need a list column:
data[, .(mu = .(as.list(lapply(mget(bd), function(x) mean(x))))), by = a]
If we want both columns, i.e. cov as well:
data[, .(mu = .(sapply(mget(bd), function(x) mean(x))),
sigma = .(cov(do.call(cbind, mget(bd)))[2])), by = a]
a mu sigma
1: 1 0.2353046,2.2000000 -2.131663
2: 2 0.1876238,3.3333333 2.062627
3: 3 0.9299794,1.5000000 0.1445644
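In the output above, sigma holds a single number per group because of the `[2]` subscript. If the full covariance matrix is wanted in each cell, one possible sketch uses `.SD` with `.SDcols` instead of `mget`. The data below are hypothetical, and this is an alternative formulation rather than the code from this answer:

```r
library(data.table)

# Hypothetical data: grouping column 'a' and two numeric columns named in 'bd'.
data <- data.table(a = c(1, 1, 1, 2, 2, 2),
                   x = c(1, 2, 3, 4, 6, 8),
                   y = c(2, 4, 9, 1, 0, 5))
bd <- c("x", "y")

# Wrapping each result in list() makes it occupy one cell per group.
res <- data[, .(mu = list(colMeans(.SD)),
                sigma = list(cov(.SD))),
            by = a, .SDcols = bd]

res$mu[[1]]     # named mean vector for group a == 1
res$sigma[[1]]  # 2 x 2 covariance matrix for group a == 1
```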
Is it possible to store a vector inside a dataframe cell?
I agree with Stephen Henderson's comment that you shouldn't use list-columns unless you are absolutely sure that they are the best way to solve your specific problem. That being said, if you do decide to use list columns, you might want to consider using tibbles instead of data frames. Tibbles are an 'upgrade' to regular data frames. They are part of the tidyverse and come in the tibble package.
Tibbles make it easy to create list columns:
tibble(x = 1:3, y = list(1:5, 1:10, 1:20))
#> # A tibble: 3 x 2
#> x y
#> <int> <list>
#> 1 1 <int [5]>
#> 2 2 <int [10]>
#> 3 3 <int [20]>
Moreover, you can "pack" and "unpack" list-columns using the commands nest and unnest from the tidyr package. For example:
df <- tibble(
x = 1:3,
y = c("a", "d,e,f", "g,h")
)
df %>%
transform(y = strsplit(y, ",")) %>%
unnest(y)
For more information about tibbles you can consult this vignette.
How to save multiple numbers in one cell in a matrix/dataframe?
This is what I ended up doing
stn[1,1] <- toString(temp_warnings$row)
stn[2,1] <- toString((subset(temp_warnings, row <= 31))$day)
stn[3,1] <- toString((subset(temp_warnings, 31 < row & row <= 61))$day)
stn[4,1] <- toString((subset(temp_warnings, 61 < row & row <= 92))$day)
stn[5,1] <- toString((subset(temp_warnings, 92 < row & row <= 123))$day)
stn[6,1] <- toString((subset(temp_warnings, 123 < row))$day)
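Note that toString stores the numbers as a single comma-separated character string, so they have to be split and converted before they can be used as numbers again. A minimal sketch, with made-up stand-ins for `temp_warnings` and `stn` (their real structure is not shown in the question):

```r
# Hypothetical stand-ins for the objects used above.
temp_warnings <- data.frame(row = c(5, 40, 70), day = c(5, 9, 9))
stn <- data.frame(value = character(6), stringsAsFactors = FALSE)

# toString() collapses the vector into one comma-separated string.
stn[1, 1] <- toString(temp_warnings$row)
stn[1, 1]

# Getting the numbers back requires splitting and converting.
as.numeric(strsplit(stn[1, 1], ", ", fixed = TRUE)[[1]])
```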
Is there a way to turn a DataFrame/correlation matrix into a DataFrame with one column per cell combination?
Column names containing [ and ] are problematic, so I've used a slightly different naming convention to yours, but I believe this gives you the structure you want.
First generate some test data
library(tidyverse)
d <- tibble(
a=c(1, 0.2, 0.4, 0.6),
b=c(0.2, 1, 0.2, 0.4),
c=c(0.4, 0.2, 1, 0.2),
d=c(0.6, 0.4, 0.2, 1)
)
d
# A tibble: 4 × 4
a b c d
<dbl> <dbl> <dbl> <dbl>
1 1 0.2 0.4 0.6
2 0.2 1 0.2 0.4
3 0.4 0.2 1 0.2
4 0.6 0.4 0.2 1
Then do what you want
d %>%
mutate(row=letters[1:nrow(.)]) %>%
pivot_longer(-row) %>%
pivot_wider(
names_from=c(row, name),
values_from=value
)
a_a a_b a_c a_d b_a b_b b_c b_d c_a c_b c_c c_d d_a d_b d_c d_d
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0.2 0.4 0.6 0.2 1 0.2 0.4 0.4 0.2 1 0.2 0.6 0.4 0.2 1
Edit
d %>%
mutate(row=letters[1:nrow(.)]) %>%
pivot_wider(
names_from=row,
values_from=-row
)
Gives the same result and is slightly shorter.
Storing vectors in a dataframe element
Not sure if I understood you correctly, anyway, here's an example very similar to the one suggested here :
# your initial data.frame
data <- data.frame(job_id = c('abc','abc1','jsdf'), usetime = c(2345,4353,34985))
# initialize runtime_excluded with an empty list
data$runtime_excluded <- vector(mode = "list",length=nrow(data))
# > data
# job_id usetime runtime_excluded
# 1 abc 2345 NULL
# 2 abc1 4353 NULL
# 3 jsdf 34985 NULL
# example of initialization in a for-loop
for(i in 1:3){
data$runtime_excluded[[i]] <- 1:i
# or, similarly :
# data[['runtime_excluded']][[i]] <- 1:i
}
# > data
# job_id usetime runtime_excluded
# 1 abc 2345 1
# 2 abc1 4353 1, 2
# 3 jsdf 34985 1, 2, 3
EDIT :
Here's a working version of your code :
data <- data.frame(job_id = c('abc','abc1','jsdf'),
starttime = c(1,2,3),
endtime = c(24,24,23),
endtime_modified = c(22,20,23),
usetime = c(22,22,9)
)
# > data
# job_id starttime endtime endtime_modified usetime
# 1 abc 1 24 22 22
# 2 abc1 2 24 20 22
# 3 jsdf 3 23 23 9
# initialize runtime_excluded with an empty list
data$runtime_excluded <- vector(mode = "list",length=nrow(data))
k=nrow(data)
for(i in 1:k)
{
indices_peak<-which((data[i,"endtime"] >= data$starttime) & (data[i,"endtime"] <= data$endtime))
indices_peak95<-which((data[i,"endtime_modified"] >= data$starttime) & (data[i,"endtime_modified"] <= data$endtime_modified))
indices_excluded<-indices_peak[!indices_peak %in% indices_peak95]
data[i,"peak"]<-length(indices_peak)
data[i,"peak_95"]<-length(indices_peak95)
vect <- data[indices_excluded, "usetime"] # may be a zero-length vector when nothing is excluded; the check below guards against assigning NULL, which would delete the list element
if(!is.null(vect)){
data$runtime_excluded[[i]] <- vect
}
}
# > data
# job_id starttime endtime endtime_modified usetime runtime_excluded peak peak_95
# 1 abc 1 24 22 22 22 2 2
# 2 abc1 2 24 20 22 2 3
# 3 jsdf 3 23 23 9 22, 22 3 1
data.frame with a column containing a matrix in R
I find data.frames containing matrices mind-bendingly weird, but: the only way I know to achieve this is hidden in stats:::simulate.lm.
Try this, poke through and see what's happening:
d <- data.frame(y=1:5,n=5)
g0 <- glm(cbind(y,n-y)~1,data=d,family=binomial)
debug(stats:::simulate.lm)
s <- simulate(g0,n=5)
This is the weird, back-door solution. Create a list, change its class to data.frame, and then (this is required) set the names and row.names manually (if you don't do those final steps the data will still be in the object, but it will print out as though it had zero rows ...)
m1 <- matrix(1:10,ncol=2)
m2 <- matrix(5:14,ncol=2)
dd <- list(m1,m2)
class(dd) <- "data.frame"
names(dd) <- LETTERS[1:2]
row.names(dd) <- 1:5
dd
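For comparison, here is a sketch of two more direct routes that, as far as I know, also yield a matrix column in base R: assigning the matrix with `$<-`, or wrapping it in `I()` inside `data.frame()`. Neither comes from the answer above, and the names `d1` and `d2` are made up for illustration.

```r
m1 <- matrix(1:10, ncol = 2)

# Direct assignment keeps the matrix as a single column.
d1 <- data.frame(id = 1:5)
d1$m <- m1

# Inside data.frame(), I() prevents the matrix from being split into columns.
d2 <- data.frame(id = 1:5, m = I(m1))

ncol(d1)   # the matrix counts as one column
dim(d1$m)  # the matrix keeps its own dimensions
```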