Put Multiple Data Frames into List (Smart Way)

Put multiple data frames into list (smart way)

You can use ls() with get as follows:

l.df <- lapply(ls(), function(x) if (class(get(x)) == "data.frame") get(x))

This'll load all data.frames from your current environment workspace.

Alternatively, as @agstudy suggests, you can use pattern to load just the data.frames you require.

l.df <- lapply(ls(pattern="df[0-9]+"), function(x) get(x))

Loads all data.frames in current environment that begins with df followed by 1 to any amount of numbers.

How do I make a list of data frames?

This isn't related to your question, but you want to use = and not <- within the function call. If you use <-, you'll end up creating variables y1 and y2 in whatever environment you're working in:

d1 <- data.frame(y1 <- c(1, 2, 3), y2 <- c(4, 5, 6))
y1
# [1] 1 2 3
y2
# [1] 4 5 6

This won't have the seemingly desired effect of creating column names in the data frame:

d1
#   y1....c.1..2..3. y2....c.4..5..6.
# 1                1                4
# 2                2                5
# 3                3                6

The = operator, on the other hand, will associate your vectors with arguments to data.frame.

As for your question, making a list of data frames is easy:

d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6))
d2 <- data.frame(y1 = c(3, 2, 1), y2 = c(6, 5, 4))
my.list <- list(d1, d2)

You access the data frames just like you would access any other list element:

my.list[[1]]
#   y1 y2
# 1  1  4
# 2  2  5
# 3  3  6

Python: Store multiple dataframe in list

If you will use parameter sheet_name=None:

dfs = pd.read_excel(..., sheet_name=None)

it will return a dictionary of Dataframes:

sheet_name : string, int, mixed list of strings/ints, or None, default 0

    Strings are used for sheet names, Integers are used in zero-indexed
    sheet positions.

    Lists of strings/integers are used to request multiple sheets.

    Specify None to get all sheets.

    str|int -> DataFrame is returned.
    list|None -> Dict of DataFrames is returned, with keys representing
    sheets.

    Available Cases

    * Defaults to 0 -> 1st sheet as a DataFrame
    * 1 -> 2nd sheet as a DataFrame
    * "Sheet1" -> 1st sheet as a DataFrame
    * [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
    * None -> All sheets as a dictionary of DataFrames

How to create multiple data frames with separate names in r based on an input variable

Code

To make the answer generic, since that seems to be what you want, I would make a list, then populate that list with dataframes.

my_list <- list()
for (i in seq(10)) {
    my_list[[i]] = data.frame(x=runif(100), y=rnorm(100))
}

Explanation

Upon execution of this code, you will have a list with 10 items, labelled 1 - 10. Each of those items is its own dataframe, with 2 columns: one containing 100 uniform random numbers, and another containing 100 Gaussian random numbers (chosen from a standard normal distribution).

If you want to access, say, the third dataframe in the list, you'd simply type

my_list[[3]]

to get the contents of that dataframe.

(Lists use the double bracket notation in R, and you just have to "get used to it". It's fairly easy to figure out how to use them properly, though. E.g., my_list[3] will return a list with only 1 item in it, which is that third dataframe. But my_list[[3]] - notice the extra bracket - will return a dataframe, the third dataframe.)

Converting a list to multiple data frames, to be able to convert only a column of each data frame to numeric data

I think you want

lapply(file, function(x) {
    x[[2]] <- as.numeric(x[[2]])        
    x[order(x[[2]]), ]
})

Explained:

lapply iterates a function over a list
file is your list, the one we are operating on
function(x) is an "anonymous" function, where x will be each individual element of file
x[[2]] <- as.numeric(x[[2]]) converts the second column to numeric
x[order(x[[2]]), ] orders the rows of the data frame by the second column.

for loop for creating multiple data frames and assigning values

The assign() function is made for this. See ?assign() for syntax.

a <- c(1,2,3,4)
b <- c("kk","km","ll","k3")
time <- c(2001,2001,2002,2003)
df <- data.frame(a,b,time)
myvalues <- c(2001,2002,2003)

for (i in 1:3) {
  assign(paste0("y",i), df[df$time==myvalues[i],])
  }

See here for more ways to achieve this.

R: Combine list of data frames into single data frame, add column with list index

Try data.table::rbindlist

library(data.table) # v1.9.5+
rbindlist(dfList, idcol = "index")
#     index a           b            c
#  1:     1 g  1.27242932 -0.005767173
#  2:     1 j  0.41464143  2.404653389
#  3:     1 o -1.53995004  0.763593461
#  4:     1 x -0.92856703 -0.799009249
#  5:     1 f -0.29472045 -1.147657009
#  6:     2 k -0.04493361  0.918977372
#  7:     2 a -0.01619026  0.782136301
#  8:     2 j  0.94383621  0.074564983
#  9:     2 w  0.82122120 -1.989351696
# 10:     2 i  0.59390132  0.619825748
# 11:     3 m -1.28459935 -0.649471647
# 12:     3 w  0.04672617  0.726750747
# 13:     3 l -0.23570656  1.151911754
# 14:     3 g -0.54288826  0.992160365
# 15:     3 b -0.43331032 -0.429513109

Splitting dataframe into multiple dataframes

Firstly your approach is inefficient because the appending to the list on a row by basis will be slow as it has to periodically grow the list when there is insufficient space for the new entry, list comprehensions are better in this respect as the size is determined up front and allocated once.

However, I think fundamentally your approach is a little wasteful as you have a dataframe already so why create a new one for each of these users?

I would sort the dataframe by column 'name', set the index to be this and if required not drop the column.

Then generate a list of all the unique entries and then you can perform a lookup using these entries and crucially if you only querying the data, use the selection criteria to return a view on the dataframe without incurring a costly data copy.

Use pandas.DataFrame.sort_values and pandas.DataFrame.set_index:

# sort the dataframe
df.sort_values(by='name', axis=1, inplace=True)

# set the index to be this and don't drop
df.set_index(keys=['name'], drop=False,inplace=True)

# get a list of names
names=df['name'].unique().tolist()

# now we can perform a lookup on a 'view' of the dataframe
joe = df.loc[df.name=='joe']

# now you can query all 'joes'

smart way to display n columns with pandas

Use:

# if years are int
cols = ['Country Code', 'Country Name'] \
       + list(range(1995, 2016)) \
       + list(range(2025, 2051))

# OR

# if years are str
cols = ['Country Code', 'Country Name'] \
       + [str(y) for y in range(1995, 2016)] \
       + [str(y) for y in range(2025, 2051)]

# Select subset of columns
print(df[cols])

R: how to add borders to multiple data frames?

So, if I get you right you want to have the borders around each table.

Attached you can find an example where there is a border around each table (table names exluded). I've also added an additional table3 to test the implementation. This procedure works for an arbitrary number of rows and columns.

library(openxlsx)
# Data
table1 <- data.frame("Num" = c(5,6,8,10), "Call" = c(1,2,3,4), "Name" = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
table2 <- data.frame("Num" = c(8,1,11,54,3,5), "Call" = c(1,2,3,4,5,6), "Name" = c("f", "g", "h", "i", "j", "k"), "Age" = c(55,21,30,74,16,41), stringsAsFactors = FALSE)
table3 <- data.frame("Num" = c(8,1,11,54,3,5, 10, 10), "Call" = c(0, 0, 1,2,3,4,5,6), "Name" = c("a", "b", "f", "g", "h", "i", "j", "k"), "Age" = c(0, 0, 55,21,30,74,16,41),
                     "Test" = c(0, 0, 55,21,30,74,16,41), stringsAsFactors = FALSE)
df_list <- list(table1=table1, table2=table2, table3 = table3)  

wb <- createWorkbook()
addWorksheet(wb, sheetName ="first")
s1 <- createStyle(border = "TopBottomLeftRight")
curr_row <- 1
curr_col <- 1
for(i in seq_along(df_list)) {
  writeData(wb, "first", names(df_list)[i], startCol = 1, startRow = curr_row)
  writeData(wb, "first", df_list[[i]], startCol = 1, startRow = curr_row+1, rowNames = TRUE)
  addStyle(wb, sheet = "first", style = s1, rows = (curr_row+1):(nrow(df_list[[i]]) + (curr_row+1)), cols = 1:(1 + ncol(df_list[[i]])), gridExpand = TRUE)
  
  curr_row <- curr_row + nrow(df_list[[i]]) + 3
}

saveWorkbook(wb, paste0(Sys.Date()," Test_file (openxlsx)",".xlsx"))

^{Created on 2022-05-09 by the reprex package (v2.0.1)}

Attached a screen.

Sample Image

Put Multiple Data Frames into List (Smart Way)