How to save a data.frame in R?
There are several ways. One way is to use save() to save the exact object, e.g. for a data frame foo:
save(foo,file="data.Rda")
Then load it with:
load("data.Rda")
You could also use write.table() or a similar function to save the table as plain text, or dput() to obtain R code that reproduces the table.
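The three options above can be compared in one short round trip. This is only a minimal sketch: the contents of foo and the temporary file paths are made up for illustration.

```r
# Hypothetical example data; any data frame works the same way.
foo <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

# 1. save()/load(): stores the object under its original name.
rda_path <- tempfile(fileext = ".Rda")
save(foo, file = rda_path)
rm(foo)
load(rda_path)  # recreates an object called `foo`

# 2. write.table()/read.table(): plain text, readable outside R.
txt_path <- tempfile(fileext = ".txt")
write.table(foo, file = txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
foo_txt <- read.table(txt_path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# 3. dput()/dget(): R code that rebuilds the object when evaluated.
code_path <- tempfile(fileext = ".R")
dput(foo, file = code_path)
foo_code <- dget(code_path)
```

The plain-text route is the most portable but can lose type information (factors, dates), whereas save() and dput() keep the R object as it was.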
How to save a data frame in R
You might want to take a look at this related question: "R data formats: RData, Rda, Rds etc."
When you load an .rda file, every object in it is restored under its original name in the current environment (the global environment when called at top level). You can't assign objects to new names using load as you tried to do.
If you want to save objects that can be loaded with different names later, then you should use the .rds format (saveRDS and readRDS). If you want to save more than one object in a .rds file, the simplest solution is to put all of them in a list and save only the list. If, after reading the .rds file, you want to put the objects of the list in the global environment, you can use list2env.
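The list approach described above might look like the following sketch. The objects df_a and df_b and the temporary file are hypothetical, and a fresh environment is used instead of the global one so the example has no side effects.

```r
# Hypothetical objects used only for illustration.
df_a <- data.frame(x = 1:3)
df_b <- data.frame(y = c("u", "v"))

rds_path <- tempfile(fileext = ".rds")

# Bundle several objects into one named list and save the list.
saveRDS(list(df_a = df_a, df_b = df_b), file = rds_path)

# readRDS() returns the object, so it can be bound to any name.
bundle <- readRDS(rds_path)
renamed <- bundle$df_a

# Optionally unpack the list elements into an environment.
# Pass envir = .GlobalEnv to target the global environment instead.
target <- new.env()
list2env(bundle, envir = target)
ls(target)
```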
How to save a large dataframe and quickly load it in R?
You can serialize it easily with:
readr::write_rds(pageInfo_df, "pageInfo_df.Rds")
and then deserialize it like so:
readr::read_rds("pageInfo_df.Rds")
This should handle any valid R object, however complex.
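If readr is not available, base R's saveRDS() and readRDS() can be used in the same way. The sketch below uses a simulated stand-in for pageInfo_df and a temporary file; compress = FALSE usually trades a larger file for faster writing and reading, which may matter for large data frames.

```r
# Simulated stand-in for a large data frame (not the original pageInfo_df).
big_df <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

rds_path <- tempfile(fileext = ".Rds")

# Skip compression to favour speed over file size.
saveRDS(big_df, file = rds_path, compress = FALSE)

# Read the object back; it can be assigned to any name.
restored <- readRDS(rds_path)
identical(restored, big_df)
```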
Saving a DataFrame to .txt-file in R (every value in new line)
I'm not certain I've fully grasped what you're trying to do, but it looks like you want to print the data row-wise to a text file, one value per line. Here's a possible solution using tidyverse. I'm not sure what your data looks like, so here's a slightly longer tibble to show that it does what I understand your question to be asking.
To create some data for the example:
## if you need to install tidyverse
# install.packages("tidyverse")
library(tidyverse)
dat <-
  tibble(
    w = c("First", "Fourth", "Seventh"),
    x = c("Second", "Fifth", "Eighth"),
    y = c("Third", "Sixth", "Ninth"),
    z = c("do", "not", "want")
  )
The data looks like this:
w x y z
First Second Third do
Fourth Fifth Sixth not
Seventh Eighth Ninth want
Here we reshape the data into the format you want printed.
dat_to_print <-
  dat %>%
  ## whatever columns you do not want printed would go here
  ## you could also select(w, x, y) instead of dropping the unwanted columns
  select(-z) %>%
  rowwise() %>%
  ## whatever columns you want printed would go here... you can also provide it as c(w, x, y)
  pivot_longer(w:y) %>%
  ## pivot_longer will come up with two columns:
  ## the first is 'name', which holds the former name of the variable (i.e. w, x, or y)
  ## the second is 'value', which is what you want to print as I've understood the problem
  ## it doesn't look like you care about the old column names, so we remove them here
  select(-name)
Then create the text file:
write.table(dat_to_print,
            file = "C:\\your\\folder\\location\\dat.txt",
            col.names = FALSE,
            row.names = FALSE,
            quote = FALSE)
dat.txt will look like this:
First
Second
Third
Fourth
Fifth
Sixth
Seventh
Eighth
Ninth
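For comparison, the same one-value-per-line output can be produced without tidyverse. This is a base R sketch under the same reading of the question; it rebuilds the example data as a plain data frame and writes to a temporary file rather than the folder used above.

```r
# Same example data as above, as a plain data frame.
dat <- data.frame(
  w = c("First", "Fourth", "Seventh"),
  x = c("Second", "Fifth", "Eighth"),
  y = c("Third", "Sixth", "Ninth"),
  z = c("do", "not", "want"),
  stringsAsFactors = FALSE
)

# Columns to print; z is left out.
keep <- c("w", "x", "y")

# Transposing before flattening yields the values row by row,
# because as.vector() walks a matrix column by column.
values <- as.vector(t(as.matrix(dat[, keep])))

out_path <- tempfile(fileext = ".txt")
writeLines(values, con = out_path)
```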
How do I save a dataframe with a list column of same-columned dataframes to parquet with arrow?
I've found that even if you define a schema (my_schema) that includes the structure of the list column, write_parquet(df,schema=my_schema) will still fail if some rows of the list column do not hold the same structure as the rows that do (i.e. if some of the rows are NA).
For example, if dat is a data.table with five columns, one of which is a list column holding data.tables:
grp data a b c
<num> <list> <num> <num> <num>
1: 1 <data.table[100x3]> 0.6142948 -1.0359482 -0.3782694
2: 2 NA 0.1192991 0.1889432 0.2735809
3: 3 <data.table[100x3]> 0.4198558 0.6189989 -0.8201980
Then, write_parquet(dat, schema=my_schema) will fail (i.e. Error: Invalid: Can only convert data frames to Struct type).
I think the approach of placing a 0-row table of the same structure as the other tables in that list column is a good idea:
library(arrow)
library(data.table)
# get a null table of same structure
null_table = dat[!is.na(data)]$data[[1]][0,]
# replace the NA with the null_table
dat[is.na(data),data:=list(null_table)]
# write the parquet file
write_parquet(dat, "dat.pqt")
This is easily retrieved:
# Read the file
dat = read_parquet("dat.pqt")
# Convert the arrow list to data.table
dat$data= lapply(dat$data, data.table)
# Convert the data.tables with 0 rows back to NA
dat[sapply(dat$data,nrow)==0,data:=NA][]
grp data a b c
<num> <list> <num> <num> <num>
1: 1 <data.table[100x3]> 0.6142948 -1.0359482 -0.3782694
2: 2 NA 0.1192991 0.1889432 0.2735809
3: 3 <data.table[100x3]> 0.4198558 0.6189989 -0.8201980
How to save t-test output with names of columns into a dataframe in R?
If you don't mind using an external package then:
library(matrixTests)
col_t_welch(df[df$group=="cluster2",-1], df[df$group=="cluster1",-1])
obs.x obs.y obs.tot mean.x mean.y mean.diff var.x var.y stderr df statistic pvalue conf.low conf.high alternative mean.null conf.level
One 7 3 10 -0.03035821 0.16533806 -0.1956963 0.4347748 0.01569194 0.2595021 6.906193 -0.7541221 0.4756968552 -0.8110149 0.41962235 two.sided 0 0.95
two 7 3 10 -0.06497898 0.03928572 -0.1042647 0.7347812 2.39802096 0.9509517 2.545136 -0.1096425 0.9207496910 -3.4608429 3.25231355 two.sided 0 0.95
three 7 3 10 -0.48970882 0.39769370 -0.8874025 0.3385390 0.63615343 0.5103076 2.964909 -1.7389561 0.1815091371 -2.5223572 0.74755220 two.sided 0 0.95
four 7 3 10 -0.52964750 0.86171745 -1.3913649 0.3785659 0.06283704 0.2739097 7.963842 -5.0796483 0.0009668828 -2.0235014 -0.75922852 two.sided 0 0.95
five 7 3 10 -0.29465530 0.48897708 -0.7836324 0.4417576 0.15588465 0.3392194 6.575237 -2.3101050 0.0565252579 -1.5963836 0.02911888 two.sided 0 0.95
six 7 3 10 -0.61128484 0.76991659 -1.3812014 0.6884335 0.09377073 0.3600063 7.996676 -3.8366033 0.0049749882 -2.2114376 -0.55096530 two.sided 0 0.95
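If an external package is not an option, a similar table can be assembled in base R. This sketch assumes, as in the call above, that df has a group column with the values cluster1 and cluster2 followed by numeric columns. The data are simulated purely for illustration and do not reproduce the numbers shown, and only a few of the fields are kept.

```r
# Simulated data with the assumed layout: a group column plus numeric columns.
set.seed(1)
df <- data.frame(
  group = rep(c("cluster1", "cluster2"), times = c(3, 7)),
  One = rnorm(10),
  two = rnorm(10),
  three = rnorm(10)
)

value_cols <- setdiff(names(df), "group")

# Run one Welch t-test per column (t.test() uses var.equal = FALSE by default)
# and keep a few fields of each result as a one-row data frame.
rows <- lapply(value_cols, function(col) {
  tt <- t.test(df[df$group == "cluster2", col], df[df$group == "cluster1", col])
  data.frame(
    statistic = unname(tt$statistic),
    df = unname(tt$parameter),
    pvalue = tt$p.value,
    conf.low = tt$conf.int[1],
    conf.high = tt$conf.int[2]
  )
})

# Stack the rows and label them with the column names.
results <- do.call(rbind, rows)
rownames(results) <- value_cols
results
```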
Saving dataframe with separated column in R
If we need to update the original object in place, use the magrittr compound assignment operator (%<>%):
library(magrittr)
library(tidyr)
four_rows %<>%
separate(Datetime, c('Date', 'Time'), sep=" ")
Now, check the updated object:
four_rows