How do I remove empty data frames from a list?
I'm not sure if this is exactly what you're asking for, but if you want to trim mlist down to contain only non-empty data frames before running the function on it, try:
mlist[sapply(mlist, function(x) dim(x)[1]) > 0]
E.g.:
R> M1 <- data.frame(matrix(1:4, nrow = 2, ncol = 2))
R> M2 <- data.frame(matrix(nrow = 0, ncol = 0))
R> M3 <- data.frame(matrix(9:12, nrow = 2, ncol = 2))
R> mlist <- list(M1, M2, M3)
R> mlist[sapply(mlist, function(x) dim(x)[1]) > 0]
[[1]]
  X1 X2
1  1  3
2  2  4

[[2]]
  X1 X2
1  9 11
2 10 12
Delete empty dataframes from a list with dataframes
We can use filter
data = list(filter(lambda df: not df.empty, data))
or list comprehension
data = [df for df in data if not df.empty]
print(data)
[   a
0  1
1  2
2  3,    a
0  3
1  4
2  5
3  6
4  7]
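One detail worth knowing about the check used above: as far as the pandas documentation describes it, DataFrame.empty is True when either axis has length 0, so a frame with column names but no rows also counts as empty. A minimal sketch on made-up frames (the names frames and kept are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical test data: one frame with rows, one with a column but no rows,
# and one with neither rows nor columns.
frames = [
    pd.DataFrame({"a": [1, 2, 3]}),
    pd.DataFrame({"a": []}),
    pd.DataFrame(),
]

# Keep only the frames for which .empty is False.
kept = [df for df in frames if not df.empty]

print([df.empty for df in frames])  # [False, True, True]
print(len(kept))                    # 1
```

If a frame with columns but zero rows should be kept, the condition would have to test the shape explicitly instead of relying on .empty.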
Delete empty dataframes from List
Simpler test data:
x <- list(data.frame(x=1:3), data.frame(x=numeric(0)), data.frame(x=1:4))
I would recommend (without a for loop):
is_empty <- function(x) (nrow(x)==0 || ncol(x)==0)
x <- x[!sapply(x, is_empty)]
This creates a logical vector which is TRUE if the data frame is empty, negates it, and subsets the original list so that only the non-empty data frames are kept.
Setting an element of a list to NULL (x[[i]] <- NULL) removes it from the list, but I would worry that your indexing is going to get screwed up. It's very tricky to get this right, because the list changes under you while the loop runs. For example, consider:
x <- list(data.frame(x=1:3), data.frame(x=numeric(0)),
data.frame(x=numeric(0)))
for (i in 1:length(x)) {
if (nrow(x[[i]])==0) x[[i]] <- NULL
}
This gets "error in x[[i]]: subscript out of bounds", because:
i==1: check x[[1]]: OK, move on to the next element.
i==2: remove the second element. Now we have a list with only two elements (the previous first and third elements).
i==3: error, because x[[3]] doesn't exist!
You could do this to avoid messing up the indexing:
j <- 0 ## cumulative number removed
for (i in seq(length(x))) {
if (is_empty(x[[i-j]])) {
x[[i-j]] <- NULL
j <- j + 1
}
}
but this seems like a "code smell", i.e. you're having to work extra hard because you're doing it in an awkward way.
How to remove empty dataframes in a list before using bind_rows()?
Using @akrun data:
lst1[unlist(lapply(lst1, function(x) !(is.null(x) | is_tibble(x))))]
Regarding your question about NA:
lst1 <- list(data.frame(col1 = 1:3), NULL, tibble(col1 = 1:5,
col2 = 2:6), data.frame(A = 1:5, B = 2:6), NA)
lst <- lst1[unlist(lapply(lst1, function(x) !(is.null(x) | is_tibble(x))))]
lst <- lst[!is.na(lst)]
remove empty dataframe from list and drop corresponding name in second list
You can use the built-in any() function:
k = [i for i, x in enumerate(dfs) if not any(x)]
The reason your
k = [i for i, x in enumerate(dfs) if not x]
doesn't work is that, regardless of what is in a list, as long as the list is not empty, the truth value of the list will be True.
The any() function takes an iterable and returns True if at least one of its elements is truthy. If no element is truthy (or the iterable is empty), it returns False. The truth value of an empty string, '', is False.
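The difference between the two tests can be checked directly. This is a small sketch on made-up lists (samples is a hypothetical name), not the data from the question:

```python
# Hypothetical sample lists: non-empty strings, only an empty string, no elements.
samples = [["car", "fast"], [""], []]

# Plain truthiness: a list is falsy only when it has no elements at all.
by_truthiness = [i for i, x in enumerate(samples) if not x]

# any(): a list whose elements are all falsy (such as '') also fails the test.
by_any = [i for i, x in enumerate(samples) if not any(x)]

print(by_truthiness)  # [2]
print(by_any)         # [1, 2]
```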
EDIT: The question got edited, here is my updated answer:
You can try creating new lists:
names = ['ID1','ID2','ID3']
dfs = [['car','fast','blue'],[],['red','bike','slow']]
new_names = list()
new_dfs = list()
for i, x in enumerate(dfs):
    if x:
        new_names.append(names[i])
        new_dfs.append(x)
print(new_names)
print(new_dfs)
Output:
['ID1', 'ID3']
[['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
If it doesn't work, try adding a print(x) to the loop to see what is going on:
names = ['ID1','ID2','ID3']
dfs = [['car','fast','blue'],[],['red','bike','slow']]
new_names = list()
new_dfs = list()
for i, x in enumerate(dfs):
    print(x)
    if x:
        new_names.append(names[i])
        new_dfs.append(x)
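An equivalent way to keep the two lists aligned is to filter the pairs produced by zip(). This is only a sketch under the same assumption as above (dfs holds plain lists); pairs is a name introduced here for illustration:

```python
names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

# Pair each name with its entry and keep only pairs whose entry is non-empty.
pairs = [(name, x) for name, x in zip(names, dfs) if x]

# Split the surviving pairs back into two parallel lists.
new_names = [name for name, _ in pairs]
new_dfs = [x for _, x in pairs]

print(new_names)  # ['ID1', 'ID3']
print(new_dfs)    # [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
```

If dfs held actual pandas DataFrames rather than lists, "if x" would raise a ValueError (the truth value of a DataFrame is ambiguous), and the condition would need to be "if not x.empty" instead.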
How to delete empty data.frame in a list after subsetting in R
Simply Filter by number of rows:
new_list_of_dfs <- Filter(NROW, list_of_dfs)
How can I remove empty dataframes from a series of dataframes in pandas?
A series of DataFrames?
You can check if a dataframe is empty with df.empty, so you could do
serie[[not df.empty for df in serie]]
Remove rows with empty lists from pandas data frame
You could try slicing as though the data frame were strings instead of lists:
import pandas as pd
df = pd.DataFrame({
'donation_orgs' : [[], ['the research of Dr.']],
'donation_context': [[], ['In lieu of flowers , memorial donations']]})
df[df.astype(str)['donation_orgs'] != '[]']
Out[9]:
donation_context donation_orgs
1 [In lieu of flowers , memorial donations] [the research of Dr.]
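An alternative that avoids converting to strings is to test the length of each list. A minimal sketch on the same data, assuming every cell in the column really is a list (mask and result are names introduced here):

```python
import pandas as pd

df = pd.DataFrame({
    'donation_orgs': [[], ['the research of Dr.']],
    'donation_context': [[], ['In lieu of flowers , memorial donations']]})

# True for rows whose 'donation_orgs' list has at least one element.
mask = df['donation_orgs'].map(len) > 0

# Boolean indexing keeps only the rows where the mask is True.
result = df[mask]
print(result)
```

This would fail on cells that have no length (for example NaN), so the string comparison may still be the more forgiving option for messy data.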