Removing All Empty Columns and Rows in Data.Frame When Rows Don't Go Away

Removing all empty columns and rows in data.frame when rows don't go away

You have NA and also empty rows. You can do

B1[rowSums(is.na(B1) | B1 == "") != ncol(B1), ]

#   study.name  group.name outcome ESL prof scope type
#1  Shin.Ellis    ME.short       1   1    2     1    1
#2  Shin.Ellis     ME.long       1   1    2     1    1
#3  Shin.Ellis   DCF.short       1   1    2     1    2
#4  Shin.Ellis    DCF.long       1   1    2     1    2
#5  Shin.Ellis  Cont.short       1   1    2    NA   NA
#6  Shin.Ellis   Cont.long       1   1    2    NA   NA
#8    Trus.Hsu       Exper       1   2    2     2    1
#.....

We can also use filter_all from dplyr

library(dplyr)
B1 %>% filter_all(any_vars(!is.na(.) & . != ""))

How can i ignore all empty cells in a df, without dropping entire rows / columns?

You could try something like this

library(tidyverse)
set.seed(1234)


df <- tibble(
  v1 = sample(c(letters[1:4], rep(NA, 20)), 20, replace = TRUE),
  v2 = sample(c(letters[1:4], rep(NA, 20)), 20, replace = TRUE),
  v3 = sample(c(letters[1:4], rep(NA, 20)), 20, replace = TRUE),
  v4 = sample(c(letters[1:4], rep(NA, 20)), 20, replace = TRUE)
)


df %>% 
  fill(names(df)) %>% 
  distinct()

Sample Image

If you don't care about the combos and only want unique values you can do this:

df %>% 
  gather() %>% 
  distinct(key, value) %>% 
  filter(!is.na(value)) %>% 
  group_by(key) %>% 
  arrange(value) %>% 
  mutate(ord = row_number()) %>% 
  ungroup() %>% 
  spread(key, value)

#   ord v1    v2    v3    v4   
#     1 b     b     b     c    
#     2 d     c     NA    NA   
#     3 NA    d     NA    NA

Python pandas big DataFrame deleting empty columns/rows

try selecting the index of rows and apply drop method.

temp_df = df[(df.Delivery > -2) & (df.Delivery < 3) & (df.Bill == 0)]
index_lst = temp_df.index.values.tolist()
df = df.drop(index_lst)

Removing empty rows of a data file in R

I assume you want to remove rows that are all NAs. Then, you can do the following :

data <- rbind(c(1,2,3), c(1, NA, 4), c(4,6,7), c(NA, NA, NA), c(4, 8, NA)) # sample data
data
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]   NA   NA   NA
[5,]    4    8   NA

data[rowSums(is.na(data)) != ncol(data),]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]    4    8   NA

If you want to remove rows that have at least one NA, just change the condition :

data[rowSums(is.na(data)) == 0,]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    6    7

How to remove Empty Cell from data frame row wise

Thanks to @Perennial for his suggestions above. Finally, I did it as follows.

new_lines = []
with open('data.csv', 'r') as csv:
    # skip the first line
    csv.readline()
    for line in csv.readlines():
        words = line.strip().split(',')
        new_words = [w for w in words if w and w.strip()]
        #skip the empty lines
        if len(new_words) != 0:
            new_lines.append(','.join(new_words))
df = pd.DataFrame(new_lines)
df.to_csv('results.csv', sep=',')

The @Scott's solution is elegant but I don't know, it always throws a memoryError exception.

One more thing, I do not want the row numbers in the resultant file. If anyone helps me out. Although, I delete that column using Excel :)

Drop rows containing empty cells from a pandas DataFrame

Pandas will recognise a value as null if it is a np.nan object, which will print as NaN in the DataFrame. Your missing values are probably empty strings, which Pandas doesn't recognise as null. To fix this, you can convert the empty stings (or whatever is in your empty cells) to np.nan objects using replace(), and then call dropna()on your DataFrame to delete rows with null tenants.

To demonstrate, we create a DataFrame with some random values and some empty strings in a Tenants column:

>>> import pandas as pd
>>> import numpy as np
>>> 
>>> df = pd.DataFrame(np.random.randn(10, 2), columns=list('AB'))
>>> df['Tenant'] = np.random.choice(['Babar', 'Rataxes', ''], 10)
>>> print df

          A         B   Tenant
0 -0.588412 -1.179306    Babar
1 -0.008562  0.725239         
2  0.282146  0.421721  Rataxes
3  0.627611 -0.661126    Babar
4  0.805304 -0.834214         
5 -0.514568  1.890647    Babar
6 -1.188436  0.294792  Rataxes
7  1.471766 -0.267807    Babar
8 -1.730745  1.358165  Rataxes
9  0.066946  0.375640

Now we replace any empty strings in the Tenants column with np.nan objects, like so:

>>> df['Tenant'].replace('', np.nan, inplace=True)
>>> print df

          A         B   Tenant
0 -0.588412 -1.179306    Babar
1 -0.008562  0.725239      NaN
2  0.282146  0.421721  Rataxes
3  0.627611 -0.661126    Babar
4  0.805304 -0.834214      NaN
5 -0.514568  1.890647    Babar
6 -1.188436  0.294792  Rataxes
7  1.471766 -0.267807    Babar
8 -1.730745  1.358165  Rataxes
9  0.066946  0.375640      NaN

Now we can drop the null values:

>>> df.dropna(subset=['Tenant'], inplace=True)
>>> print df

          A         B   Tenant
0 -0.588412 -1.179306    Babar
2  0.282146  0.421721  Rataxes
3  0.627611 -0.661126    Babar
5 -0.514568  1.890647    Babar
6 -1.188436  0.294792  Rataxes
7  1.471766 -0.267807    Babar
8 -1.730745  1.358165  Rataxes

R) how to remove rows with empty values?

Try something like this

test[rowSums(is.na(test))!=ncol(test), ] # first set blank to NA

test[rowSums(test=="")!=ncol(test), ]

Remove rows with empty lists from pandas data frame

You could try slicing as though the data frame were strings instead of lists:

import pandas as pd
df = pd.DataFrame({
'donation_orgs' : [[], ['the research of Dr.']],
'donation_context': [[], ['In lieu of flowers , memorial donations']]})

df[df.astype(str)['donation_orgs'] != '[]']

Out[9]: 
                            donation_context          donation_orgs
1  [In lieu of flowers , memorial donations]  [the research of Dr.]

Delete rows with blank values in one particular column

 df[!(is.na(df$start_pc) | df$start_pc==""), ]

Removing All Empty Columns and Rows in Data.Frame When Rows Don't Go Away