Drop rows containing empty cells from a pandas DataFrame
Pandas recognises a value as null if it is an np.nan object, which prints as NaN in the DataFrame. Your missing values are probably empty strings, which pandas doesn't recognise as null. To fix this, convert the empty strings (or whatever is in your empty cells) to np.nan using replace(), and then call dropna() on your DataFrame to delete the rows with a null Tenant.
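As a quick check of the claim above, here is a minimal sketch (the sample Series is made up for illustration) showing that isna() flags np.nan but not an empty string:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values: a real string, an empty string, and a true NaN.
s = pd.Series(["Babar", "", np.nan])

# isna() marks only the np.nan entry as missing; the empty string is not null.
null_flags = s.isna()
print(null_flags.tolist())
```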
To demonstrate, we create a DataFrame with some random values and some empty strings in a Tenant column:
>>> import pandas as pd
>>> import numpy as np
>>>
>>> df = pd.DataFrame(np.random.randn(10, 2), columns=list('AB'))
>>> df['Tenant'] = np.random.choice(['Babar', 'Rataxes', ''], 10)
>>> print(df)
A B Tenant
0 -0.588412 -1.179306 Babar
1 -0.008562 0.725239
2 0.282146 0.421721 Rataxes
3 0.627611 -0.661126 Babar
4 0.805304 -0.834214
5 -0.514568 1.890647 Babar
6 -1.188436 0.294792 Rataxes
7 1.471766 -0.267807 Babar
8 -1.730745 1.358165 Rataxes
9 0.066946 0.375640
Now we replace any empty strings in the Tenant column with np.nan, like so:
>>> df['Tenant'] = df['Tenant'].replace('', np.nan)
>>> print(df)
A B Tenant
0 -0.588412 -1.179306 Babar
1 -0.008562 0.725239 NaN
2 0.282146 0.421721 Rataxes
3 0.627611 -0.661126 Babar
4 0.805304 -0.834214 NaN
5 -0.514568 1.890647 Babar
6 -1.188436 0.294792 Rataxes
7 1.471766 -0.267807 Babar
8 -1.730745 1.358165 Rataxes
9 0.066946 0.375640 NaN
Now we can drop the null values:
>>> df.dropna(subset=['Tenant'], inplace=True)
>>> print(df)
A B Tenant
0 -0.588412 -1.179306 Babar
2 0.282146 0.421721 Rataxes
3 0.627611 -0.661126 Babar
5 -0.514568 1.890647 Babar
6 -1.188436 0.294792 Rataxes
7 1.471766 -0.267807 Babar
8 -1.730745 1.358165 Rataxes
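If the "empty" cells might also hold only whitespace, one possible variation is to replace with a regular expression instead of an exact empty string. The sketch below uses a small made-up DataFrame and the pattern `^\s*$`; both are assumptions for illustration and not part of the original answer. It also returns a new DataFrame rather than modifying in place.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one empty string and one whitespace-only string.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "Tenant": ["Babar", "", "Rataxes", "   "],
})

# Turn empty or whitespace-only strings into NaN, then drop those rows.
cleaned = (
    df.assign(Tenant=df["Tenant"].replace(r"^\s*$", np.nan, regex=True))
      .dropna(subset=["Tenant"])
)
print(cleaned)
```

The original df is left untouched here; assign the result back to df if you want to overwrite it.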
Delete rows from pandas dataframe if all its columns have empty string
You can do:
# df.eq('') compares every cell of `df` to `''`
# .all(axis=1) checks whether all cells in a row are True
# ~ is the negation operator
mask = ~df.eq('').all(axis=1)
# equivalently, using `ne` for "not equal":
# mask = df.ne('').any(axis=1)
# mask is a boolean Series of the same length as `df`
# indexing with it is called boolean indexing, similar to numpy's:
# it keeps only the rows where the mask is `True`
df = df[mask]
Or in one line:
df = df[~df.eq('').all(axis=1)]
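To see the mask in action, here is a small self-contained sketch. The column names and values are made up for illustration; only row 1 is empty in every column, so only that row is removed.

```python
import pandas as pd

# Hypothetical sample data: row 1 is empty in every column,
# rows 2 and 3 are only partly empty.
df = pd.DataFrame({
    "name": ["Babar", "", "Rataxes", ""],
    "note": ["king", "", "", "unknown"],
})

# True for rows that have at least one non-empty cell.
mask = ~df.eq("").all(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[mask]
print(filtered)
```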
Only remove entirely empty rows in pandas
Check the docs page for DataFrame.dropna. Passing how='all' drops only the rows in which every value is missing:
df.dropna(how='all')
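A minimal sketch of how this behaves, using made-up sample data: a row is dropped only when every column is NaN, and the call returns a new DataFrame rather than changing df. Note that empty strings do not count as missing here, so they would need converting to np.nan first.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 is entirely NaN, row 2 only partly.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": ["x", np.nan, np.nan],
})

# how="all" drops a row only if all of its values are missing.
result = df.dropna(how="all")
print(result)
```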