Sparse matrix to a data frame in R
Using summary(), here is an example:
library(Matrix)
mat <- Matrix(data = c(1, 0, 2, 0, 0, 3, 4, 0, 0), nrow = 3, ncol = 3,
              dimnames = list(Origin = c("A", "B", "C"),
                              Destination = c("X", "Y", "Z")),
              sparse = TRUE)
mat
# 3 x 3 sparse Matrix of class "dgCMatrix"
# Destination
# X Y Z
# A 1 . 4
# B . . .
# C 2 3 .
summ <- summary(mat)
summ
# 3 x 3 sparse Matrix of class "dgCMatrix", with 4 entries
# i j x
# 1 1 1 1
# 2 3 1 2
# 3 3 2 3
# 4 1 3 4
data.frame(Origin = rownames(mat)[summ$i],
           Destination = colnames(mat)[summ$j],
           Weight = summ$x)
# Origin Destination Weight
# 1 A X 1
# 2 C X 2
# 3 C Y 3
# 4 A Z 4
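The steps above can be wrapped in a small helper. This is only a minimal sketch: the name sparse_to_long and its fallback to integer indices when the matrix has no dimnames are assumptions made for illustration, not part of the Matrix package.

```r
library(Matrix)

# Hypothetical helper: convert a sparse matrix to a long data frame with one
# row per stored entry. Falls back to integer indices if dimnames are missing.
sparse_to_long <- function(m) {
  s <- summary(m)                 # triplet form with columns i, j, x
  rn <- rownames(m)
  cn <- colnames(m)
  data.frame(
    row   = if (is.null(rn)) s$i else rn[s$i],
    col   = if (is.null(cn)) s$j else cn[s$j],
    value = s$x,
    stringsAsFactors = FALSE
  )
}

# Same values as the example matrix above.
m <- Matrix(c(1, 0, 2, 0, 0, 3, 4, 0, 0), nrow = 3, ncol = 3,
            dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")),
            sparse = TRUE)
long <- sparse_to_long(m)
```

Keeping the index lookup in one place makes it harder to mix up rownames with colnames when the same conversion is needed for several matrices.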
Converting a dgCMatrix to a data frame
You can use
b = as.data.frame(summary(a))
# i j x
# 1 2 1 2
# 2 2 2 1
# 3 3 3 3
# 4 1 4 1
# 5 3 5 1
If you need the same order as in your example, you can use
b = b[order(b$i),]
# i j x
# 4 1 4 1
# 1 2 1 2
# 2 2 2 1
# 3 3 3 3
# 5 3 5 1
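Another alternative, though not quite as neat, is to coerce to the triplet representation, whose i and j slots are 0-based (recent Matrix versions prefer the virtual class "TsparseMatrix" over naming "dgTMatrix" directly):
b = as(a, "TsparseMatrix")
cbind.data.frame(r = b@i + 1, c = b@j + 1, x = b@x)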
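To check that the two routes agree, here is a rough sketch. The matrix a is not defined in the text above, so the one below is a reconstruction (an assumption) built to match the printed triplets. The coercion targets the virtual class "TsparseMatrix", which as far as I know is the form recent Matrix releases recommend.

```r
library(Matrix)

# Reconstruction (assumption): a 3 x 5 matrix matching the printed triplets.
a <- sparseMatrix(i = c(2, 2, 3, 1, 3),
                  j = c(1, 2, 3, 4, 5),
                  x = c(2, 1, 3, 1, 1))

# Route 1: summary() yields 1-based i, j and the values x.
b1 <- as.data.frame(summary(a))

# Route 2: triplet representation; its @i and @j slots are 0-based.
tm <- as(a, "TsparseMatrix")
b2 <- data.frame(i = tm@i + 1L, j = tm@j + 1L, x = tm@x)

# Sort both the same way before comparing, so storage order does not matter.
b1 <- b1[order(b1$j, b1$i), ]
b2 <- b2[order(b2$j, b2$i), ]
same <- all(b1$i == b2$i) && all(b1$j == b2$j) && all(b1$x == b2$x)
```

Forgetting the + 1 on the slots is the usual mistake with the second route, which is one reason the summary() version tends to be the safer default.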
how to coerce a data.frame into a sparse matrix in R
Following user20650's comment, first coerce the CUI* columns to factors with the same levels, then use xtabs to create a sparse matrix, then make it symmetric; the code below uses forceSymmetric() for that last step.
txt <- '
CUI1 CUI2 Count
1 C0000699 C3894683 2
2 C0000699 C0101725 1
3 C0000699 C1882413 3
4 C0000699 C0245531 3
5 C0000699 C0068475 2
6 C0000699 C0538927 3
7 C0000699 C0724693 1
8 C0000699 C0216784 2
9 C0000699 C2248020 1
10 C0000699 C0069449 3
'
test <- read.table(textConnection(txt), header = TRUE)
library(Matrix)
levls <- Reduce(union, test[1:2])
test[1:2] <- lapply(test[1:2], factor, levels = levls)
res <- xtabs(Count ~ CUI1 + CUI2, data = test, sparse = TRUE)
res <- forceSymmetric(res)
class(res)
#> [1] "dsCMatrix"
#> attr(,"package")
#> [1] "Matrix"
Created on 2022-02-13 by the reprex package (v2.0.1)
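For comparison, a small sketch of the two ways to symmetrize. The toy data frame edges and its column names are hypothetical. By default forceSymmetric() mirrors the upper triangle, while adding the transpose sums both directions. The two agree only when the diagonal is empty and each pair is stored in one direction, which is assumed here.

```r
library(Matrix)

# Hypothetical toy data: two pairs, each stored in one direction only.
edges <- data.frame(from = c("a", "a"), to = c("b", "c"), n = c(2, 5))
lv <- union(edges$from, edges$to)
edges$from <- factor(edges$from, levels = lv)
edges$to   <- factor(edges$to, levels = lv)

m <- xtabs(n ~ from + to, data = edges, sparse = TRUE)

# Option 1: mirror the upper triangle (the default) onto the lower one.
s1 <- forceSymmetric(m)

# Option 2: add the transpose. Safe here because the diagonal is empty
# and no pair appears in both directions; otherwise counts would double.
s2 <- m + t(m)
```

If some pairs could sit below the diagonal, forceSymmetric() with its default would ignore them, so adding the transpose (or reordering the pairs first) would be the more cautious choice in that case.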
Build identity matrix from dataframe (sparsematrix) in R
A possible solution:
tidyr::pivot_wider(dis_matrix, id_cols = i, names_from = j,
values_from = distance, values_fill = 0)
#> # A tibble: 2 × 4
#> i Rwanda France `South Korea`
#> <chr> <dbl> <dbl> <dbl>
#> 1 South Korea 10845. 9384 0
#> 2 France 6003 0 9384
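A base R alternative, sketched under assumptions: dis_matrix is not shown above, so the data frame d below is a hypothetical stand-in with the same column names (i, j, distance) and placeholder numbers that are not real distances. xtabs() fills missing pairs with 0, and sparse = TRUE returns a sparse Matrix instead of a dense table.

```r
library(Matrix)

# Hypothetical stand-in for dis_matrix; the distances are placeholders.
d <- data.frame(i = c("South Korea", "South Korea", "France", "France"),
                j = c("Rwanda", "France", "Rwanda", "South Korea"),
                distance = c(100, 200, 300, 200))

# Dense cross-tabulation: one row per i, one column per j, 0 where absent.
wide <- xtabs(distance ~ i + j, data = d)

# Same layout stored as a sparse matrix.
wide_sparse <- xtabs(distance ~ i + j, data = d, sparse = TRUE)
```

Unlike pivot_wider(), this orders rows and columns by factor level (alphabetically for character columns), so convert i and j to factors first if a specific order matters.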
R convert matrix or data frame to sparseMatrix
Here are two options:
library(Matrix)
A <- as(regMat, "sparseMatrix") # see also `vignette("Intro2Matrix")`
B <- Matrix(regMat, sparse = TRUE) # Thanks to Aaron for pointing this out
identical(A, B)
# [1] TRUE
A
# 10 x 10 sparse Matrix of class "dgCMatrix"
#
# [1,] . . . . . 45 . . . .
# [2,] . . . . . . . 59 . .
# [3,] . . . . 95 . . . . .
# [4,] . . . . . . . . . .
# [5,] . . . . . . . . . .
# [6,] . . . . . . . . . .
# [7,] . . . 23 . . . . . .
# [8,] . . . 63 . . . . . .
# [9,] . . . . . . . . . .
# [10,] . . . . . . . . . .
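Both options above start from a base matrix. For a data frame, one possible route is to go through as.matrix() first. The data frame df below is hypothetical, and the sketch assumes that all its columns are numeric.

```r
library(Matrix)

# Hypothetical numeric data frame that is mostly zeros.
df <- data.frame(a = c(0, 0, 3, 0),
                 b = c(0, 5, 0, 0),
                 c = c(0, 0, 0, 0))

# Convert to a base matrix first, then let Matrix() choose a sparse class.
sm <- Matrix(as.matrix(df), sparse = TRUE)

# Number of non-zero entries actually stored.
n_stored <- nnzero(sm)
```

If the data frame mixes in character or factor columns, as.matrix() returns a character matrix, so those columns would need to be dropped or encoded beforehand.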
Convert a dgCMatrix to data frame
I think you have your matrix inside a list.
library(Matrix)
m <- matrix(c(1,4,rep(0,100),10), ncol = 1)
group_x <- as(m, "Matrix")
as.data.frame(as.matrix(...)) works fine on the matrix itself:
str(as.data.frame(as.matrix(group_x)))
## 'data.frame': 103 obs. of 1 variable:
## $ V1: num 1 4 0 0 0 0 0 0 0 0 ...
but not once the matrix is wrapped in a list:
gx <- list(group_x)
as.data.frame(as.matrix(gx))
## V1
## 1 <S4 class ‘dgCMatrix’ [package “Matrix”] with 6 slots>
So if what you have is gx, then
data.frame(Group = rownames(gx[[1]]), Value = gx[[1]][,1])
should work.
Create Sparse Matrix from a data frame
The Matrix package has a constructor made especially for your type of data:
library(Matrix)
UIMatrix <- sparseMatrix(i = trainingData$UserID,
                         j = trainingData$MovieID,
                         x = trainingData$Rating)
Otherwise, you might like to know about a handy feature of the [ function known as matrix indexing. You could have tried:
buildUserMovieMatrix <- function(trainingData) {
  UIMatrix <- Matrix(0, nrow = max(trainingData$UserID),
                     ncol = max(trainingData$MovieID), sparse = TRUE)
  UIMatrix[cbind(trainingData$UserID,
                 trainingData$MovieID)] <- trainingData$Rating
  UIMatrix
}
(but I would definitely recommend the sparseMatrix approach over this.)
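One caveat with using raw IDs as indices is that the matrix gets as many rows and columns as the largest ID. The sketch below, with a hypothetical trainingData, maps IDs to compact 1-based positions via match() and keeps the original IDs as dimnames. Whether this is needed depends on how dense your ID ranges are.

```r
library(Matrix)

# Hypothetical ratings with IDs that are neither contiguous nor start at 1.
trainingData <- data.frame(UserID  = c(101, 101, 205, 999),
                           MovieID = c(7, 42, 7, 1300),
                           Rating  = c(5, 3, 4, 1))

# Map raw IDs to compact 1-based positions so unused ID values do not
# create empty rows or columns.
users  <- sort(unique(trainingData$UserID))
movies <- sort(unique(trainingData$MovieID))

UIMatrix <- sparseMatrix(i = match(trainingData$UserID, users),
                         j = match(trainingData$MovieID, movies),
                         x = trainingData$Rating,
                         dims = c(length(users), length(movies)),
                         dimnames = list(as.character(users),
                                         as.character(movies)))
```

Note that sparseMatrix() adds up the x values of repeated (i, j) pairs by default, so duplicate ratings for the same user and movie should be resolved before this step if summing is not what you want.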