Put multiple data frames into list (smart way)
You can use ls() with mget as follows:
l.df <- Filter(is.data.frame, mget(ls()))
This collects every data frame in your current environment into a named list. Filter(is.data.frame, ...) keeps only the data frames, so other objects do not show up as NULL entries.
Alternatively, as @agstudy suggests, you can use pattern to load just the data frames you require.
l.df <- lapply(ls(pattern = "^df[0-9]+"), get)
This loads all data frames in the current environment whose names begin with df followed by one or more digits.
How do I make a list of data frames?
This isn't related to your question, but you want to use = and not <- within the function call. If you use <-, you'll end up creating variables y1 and y2 in whatever environment you're working in:
d1 <- data.frame(y1 <- c(1, 2, 3), y2 <- c(4, 5, 6))
y1
# [1] 1 2 3
y2
# [1] 4 5 6
This won't have the seemingly desired effect of creating column names in the data frame:
d1
#   y1....c.1..2..3. y2....c.4..5..6.
# 1                1                4
# 2                2                5
# 3                3                6
The = operator, on the other hand, will associate your vectors with arguments to data.frame.
As for your question, making a list of data frames is easy:
d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6))
d2 <- data.frame(y1 = c(3, 2, 1), y2 = c(6, 5, 4))
my.list <- list(d1, d2)
You access the data frames just like you would access any other list element:
my.list[[1]]
#   y1 y2
# 1  1  4
# 2  2  5
# 3  3  6
Python: Store multiple dataframe in list
If you use the parameter sheet_name=None:
dfs = pd.read_excel(..., sheet_name=None)
it will return a dictionary of DataFrames:
sheet_name : string, int, mixed list of strings/ints, or None, default 0
Strings are used for sheet names, Integers are used in zero-indexed
sheet positions.
Lists of strings/integers are used to request multiple sheets.
Specify None to get all sheets.
str|int -> DataFrame is returned.
list|None -> Dict of DataFrames is returned, with keys representing
sheets.
Available Cases
* Defaults to 0 -> 1st sheet as a DataFrame
* 1 -> 2nd sheet as a DataFrame
* "Sheet1" -> 1st sheet as a DataFrame
* [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
* None -> All sheets as a dictionary of DataFrames
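Once you have that dictionary, turning it into a list is a one-liner. The sketch below does not read a real workbook: the dfs dictionary is a hypothetical stand-in with made-up sheet names and values, shaped like what sheet_name=None is documented to return.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(..., sheet_name=None):
# a dict that maps each sheet name to its DataFrame.
dfs = {
    "Sheet1": pd.DataFrame({"a": [1, 2]}),
    "Sheet2": pd.DataFrame({"a": [3, 4, 5]}),
}

# Keep only the DataFrames, in sheet order, as a plain list.
df_list = list(dfs.values())

# Or iterate over names and frames together, e.g. to count rows per sheet.
row_counts = {name: len(frame) for name, frame in dfs.items()}
```

Keeping the dictionary is often more convenient than the list, because each frame stays attached to the sheet it came from.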
How to create multiple data frames with separate names in r based on an input variable
Code
To make the answer generic, since that seems to be what you want, I would make a list, then populate that list with dataframes.
my_list <- list()
for (i in seq(10)) {
  my_list[[i]] <- data.frame(x = runif(100), y = rnorm(100))
}
Explanation
Upon execution of this code, you will have a list with 10 items, indexed 1 to 10. Each of those items is its own dataframe with 2 columns: one containing 100 uniform random numbers, and another containing 100 Gaussian random numbers (drawn from a standard normal distribution).
If you want to access, say, the third dataframe in the list, you'd simply type my_list[[3]] to get the contents of that dataframe.
(Lists use the double bracket notation in R, and you just have to "get used to it". It's fairly easy to figure out how to use them properly, though. E.g., my_list[3] will return a list with only 1 item in it, which is that third dataframe. But my_list[[3]] - notice the extra bracket - will return a dataframe, the third dataframe.)
Converting a list to multiple data frames, to be able to convert only a column of each data frame to numeric data
I think you want
lapply(file, function(x) {
  x[[2]] <- as.numeric(x[[2]])
  x[order(x[[2]]), ]
})
Explained:
* lapply iterates a function over a list
* file is your list, the one we are operating on
* function(x) is an "anonymous" function, where x will be each individual element of file
* x[[2]] <- as.numeric(x[[2]]) converts the second column to numeric
* x[order(x[[2]]), ] orders the rows of the data frame by the second column.
for loop for creating multiple data frames and assigning values
The assign() function is made for this. See ?assign for syntax.
a <- c(1,2,3,4)
b <- c("kk","km","ll","k3")
time <- c(2001,2001,2002,2003)
df <- data.frame(a,b,time)
myvalues <- c(2001,2002,2003)
for (i in 1:3) {
  assign(paste0("y", i), df[df$time == myvalues[i], ])
}
See here for more ways to achieve this.
R: Combine list of data frames into single data frame, add column with list index
Try data.table::rbindlist
library(data.table) # v1.9.5+
rbindlist(dfList, idcol = "index")
# index a b c
# 1: 1 g 1.27242932 -0.005767173
# 2: 1 j 0.41464143 2.404653389
# 3: 1 o -1.53995004 0.763593461
# 4: 1 x -0.92856703 -0.799009249
# 5: 1 f -0.29472045 -1.147657009
# 6: 2 k -0.04493361 0.918977372
# 7: 2 a -0.01619026 0.782136301
# 8: 2 j 0.94383621 0.074564983
# 9: 2 w 0.82122120 -1.989351696
# 10: 2 i 0.59390132 0.619825748
# 11: 3 m -1.28459935 -0.649471647
# 12: 3 w 0.04672617 0.726750747
# 13: 3 l -0.23570656 1.151911754
# 14: 3 g -0.54288826 0.992160365
# 15: 3 b -0.43331032 -0.429513109
Splitting dataframe into multiple dataframes
Firstly, your approach is inefficient: appending to the list row by row is slow, because the list has to be grown periodically when there is insufficient space for a new entry. List comprehensions are better in this respect, as the size is determined up front and allocated once.
However, I think your approach is fundamentally a little wasteful: you already have a dataframe, so why create a new one for each of these users?
I would sort the dataframe by column 'name', set the index to be this column and, if required, not drop the column.
Then generate a list of all the unique entries. You can then perform a lookup using these entries and, crucially, if you are only querying the data, use the selection criteria to return a view on the dataframe without incurring a costly data copy.
Use pandas.DataFrame.sort_values and pandas.DataFrame.set_index:
# sort the dataframe
df.sort_values(by='name', axis=0, inplace=True)
# set the index to be this and don't drop
df.set_index(keys=['name'], drop=False,inplace=True)
# get a list of names
names=df['name'].unique().tolist()
# now we can perform a lookup on a 'view' of the dataframe
joe = df.loc[df.name=='joe']
# now you can query all 'joes'
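If you do end up needing one separate frame per name, a different technique from the lookup above is to let groupby do the splitting. This is only a sketch under assumptions: the sample data, the score column and the frames_by_name variable are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with a 'name' column to split on.
df = pd.DataFrame({
    "name": ["joe", "amy", "joe", "bob", "amy"],
    "score": [1, 2, 3, 4, 5],
})

# groupby yields (key, sub-frame) pairs; collect them into a dict
# keyed by the unique names.
frames_by_name = {name: group for name, group in df.groupby("name")}

# Look up all rows for one name.
joe = frames_by_name["joe"]
```

Unlike the boolean lookup, this materialises every sub-frame at once, so it trades memory for convenience.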
smart way to display n columns with pandas
Use:
# if years are int
cols = ['Country Code', 'Country Name'] \
+ list(range(1995, 2016)) \
+ list(range(2025, 2051))
# OR
# if years are str
cols = ['Country Code', 'Country Name'] \
+ [str(y) for y in range(1995, 2016)] \
+ [str(y) for y in range(2025, 2051)]
# Select subset of columns
print(df[cols])
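To see the selection work end to end, here is a small self-contained sketch for the integer-year case. The one-row frame, its 1990 to 2050 year range and the placeholder values are assumptions made up for illustration, not data from the question.

```python
import pandas as pd

# Hypothetical frame: two label columns plus one int column per year.
years = list(range(1990, 2051))
df = pd.DataFrame(
    [["ABC", "Examplia"] + [0] * len(years)],
    columns=["Country Code", "Country Name"] + years,
)

# Label columns, then 1995-2015 and 2025-2050 (range excludes its end).
cols = (
    ["Country Code", "Country Name"]
    + list(range(1995, 2016))
    + list(range(2025, 2051))
)

# Select the subset of columns in the order given by cols.
subset = df[cols]
```

Note that df[cols] raises a KeyError if any label in cols is missing from the frame, which is why the int and str variants must match the actual column dtype.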
R: how to add borders to multiple data frames?
So, if I understand you correctly, you want borders around each table.
Below is an example that draws a border around each table (table names excluded). I've also added an additional table3 to test the implementation. This procedure works for an arbitrary number of rows and columns.
library(openxlsx)
# Data
table1 <- data.frame("Num" = c(5,6,8,10), "Call" = c(1,2,3,4), "Name" = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
table2 <- data.frame("Num" = c(8,1,11,54,3,5), "Call" = c(1,2,3,4,5,6), "Name" = c("f", "g", "h", "i", "j", "k"), "Age" = c(55,21,30,74,16,41), stringsAsFactors = FALSE)
table3 <- data.frame("Num" = c(8,1,11,54,3,5, 10, 10), "Call" = c(0, 0, 1,2,3,4,5,6), "Name" = c("a", "b", "f", "g", "h", "i", "j", "k"), "Age" = c(0, 0, 55,21,30,74,16,41),
"Test" = c(0, 0, 55,21,30,74,16,41), stringsAsFactors = FALSE)
df_list <- list(table1=table1, table2=table2, table3 = table3)
wb <- createWorkbook()
addWorksheet(wb, sheetName ="first")
s1 <- createStyle(border = "TopBottomLeftRight")
curr_row <- 1
curr_col <- 1
for(i in seq_along(df_list)) {
  writeData(wb, "first", names(df_list)[i], startCol = 1, startRow = curr_row)
  writeData(wb, "first", df_list[[i]], startCol = 1, startRow = curr_row+1, rowNames = TRUE)
  addStyle(wb, sheet = "first", style = s1, rows = (curr_row+1):(nrow(df_list[[i]]) + (curr_row+1)), cols = 1:(1 + ncol(df_list[[i]])), gridExpand = TRUE)
  curr_row <- curr_row + nrow(df_list[[i]]) + 3
}
saveWorkbook(wb, paste0(Sys.Date()," Test_file (openxlsx)",".xlsx"))
Created on 2022-05-09 by the reprex package (v2.0.1)