Finding non-numeric rows in dataframe in pandas?
You could use np.isreal to check the type of each element (applymap applies a function to each element in the DataFrame; since pandas 2.1 it is deprecated in favor of DataFrame.map):
In [11]: df.applymap(np.isreal)
Out[11]:
a b
item
a True True
b True True
c True True
d False True
e True True
If all in the row are True then they are all numeric:
In [12]: df.applymap(np.isreal).all(1)
Out[12]:
item
a True
b True
c True
d False
e True
dtype: bool
So to get the sub-DataFrame of rogues (note: the negation, ~, of the above finds the rows which have at least one rogue non-numeric value):
In [13]: df[~df.applymap(np.isreal).all(1)]
Out[13]:
a b
item
d bad 0.4
You could also find the label of the first offender with idxmin (in recent pandas versions np.argmin returns an integer position rather than the index label):
In [14]: df.applymap(np.isreal).all(1).idxmin()
Out[14]: 'd'
As @CTZhu points out, it may be slightly faster to check whether it's an instance of either int or float (there is some additional overhead with np.isreal):
df.applymap(lambda x: isinstance(x, (int, float)))
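The steps above can be put together in one self-contained sketch. The data below is hypothetical and only mirrors the shape of the example (row 'd' holds the string 'bad'); the helper name is_number is an assumption made for illustration. It applies the check column by column with Series.map, which avoids depending on applymap, and excludes bool because bool is a subclass of int.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the example: row 'd' holds the string 'bad'.
df = pd.DataFrame(
    {"a": [1, 2, 3, "bad", 5], "b": [0.1, 0.2, 0.3, 0.4, 0.5]},
    index=pd.Index(list("abcde"), name="item"),
)

def is_number(x):
    # Hypothetical helper: bool is a subclass of int, so exclude it explicitly.
    return isinstance(x, (int, float, np.number)) and not isinstance(x, bool)

# Element-wise check, applied one column at a time.
mask = df.apply(lambda col: col.map(is_number))

# Rows with at least one non-numeric value.
rogue_rows = df[~mask.all(axis=1)]

# Index label of the first row that fails the check.
first_offender = mask.all(axis=1).idxmin()

print(rogue_rows)
print(first_offender)
```

Note that idxmin returns the first label even when every row passes, so check mask.all(axis=1).all() first if the frame might be clean.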
get non numerical rows in a column pandas python
Use boolean indexing with a mask created by to_numeric + isnull.
Note: This solution does not find or filter numbers saved as strings, like '1' or '22'.
print (pd.to_numeric(df['num'], errors='coerce'))
0 -1.48
1 1.70
2 -6.18
3 0.25
4 NaN
5 0.25
Name: num, dtype: float64
print (pd.to_numeric(df['num'], errors='coerce').isnull())
0 False
1 False
2 False
3 False
4 True
5 False
Name: num, dtype: bool
print (df[pd.to_numeric(df['num'], errors='coerce').isnull()])
N-D num unit
4 Q5 sum(d) UD
Another solution with isinstance and apply:
print (df[df['num'].apply(lambda x: isinstance(x, str))])
N-D num unit
4 Q5 sum(d) UD
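The two masks behave differently when a number is stored as a string. A minimal sketch on hypothetical data (the column values below are made up for illustration) shows the difference:

```python
import pandas as pd

# Hypothetical column mixing floats, a numeric string and a non-numeric string.
df = pd.DataFrame({"num": [-1.48, 1.70, "0.25", "sum(d)"]})

# to_numeric parses '0.25', so only 'sum(d)' is coerced to NaN.
coerce_mask = pd.to_numeric(df["num"], errors="coerce").isnull()

# isinstance flags every string, including the parseable '0.25'.
str_mask = df["num"].apply(lambda x: isinstance(x, str))

print(df[coerce_mask])
print(df[str_mask])
```

The to_numeric mask also flags values that were already NaN before the conversion, so it cannot tell a missing value apart from an unparseable one.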
Find non-numeric values in pandas dataframe column
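You can try changing the dtype; the cast raises a ValueError if the column contains a value that cannot be converted to int:
df['column'] = df['column'].astype(int)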
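A minimal sketch of using the cast as a check, assuming a hypothetical column named column: astype(int) raises ValueError when a value cannot be parsed, which signals that non-numeric entries are present. It does not say which rows are at fault.

```python
import pandas as pd

# Hypothetical data: the last value cannot be parsed as an integer.
df = pd.DataFrame({"column": ["1", "2", "x"]})

try:
    df["column"] = df["column"].astype(int)
    failed = False
except ValueError as exc:
    # The assignment never happens, so the column is left unchanged.
    failed = True
    print(f"cast failed: {exc}")
```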
Filter out non-numeric values from column
Try with pd.to_numeric():
pd.to_numeric(df.col1,errors='coerce').min()
#1.2
#or df.col1.apply(lambda x: pd.to_numeric(x,errors='coerce')).min() <- slow
Need to delete non-numeric rows from a dataframe
I ended up doing it this way.
cols = df_append.columns[:-1]
df_append[cols] = df_append[cols].apply(pd.to_numeric, errors='coerce')
df_append = df_append.fillna(0)
That's good enough for my purpose!
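The snippet above replaces non-numeric values with 0 rather than deleting the rows. If the rows should actually be removed, one possible variant is to drop them after the conversion. This is a sketch on hypothetical data, not the original poster's method; the column names are made up.

```python
import pandas as pd

# Hypothetical frame: every column except the last should be numeric.
df_append = pd.DataFrame(
    {"x": ["1", "oops", "3"], "y": [4.0, 5.0, "n/a"], "label": ["p", "q", "r"]}
)

cols = df_append.columns[:-1]

# Unparseable values become NaN.
df_append[cols] = df_append[cols].apply(pd.to_numeric, errors="coerce")

# Drop every row that has NaN in any of the checked columns.
clean = df_append.dropna(subset=list(cols))

print(clean)
```

Rows that already contained NaN before the conversion are dropped as well.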
How to display non numeric values from data frame
Use str.isdigit with ~ to invert the boolean mask:
In[6]: df.loc[~df['Value'].astype(str).str.isdigit()]
Out[6]:
Measure Value
1 B 1000/
2 C 1000*
4 E 1000 0
6 G 5..
8 I w
10 L NaN
If the dtype of the column is already str then you don't need the astype(str) call.
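One caveat: str.isdigit only accepts strings made up entirely of digits, so negative numbers and decimals are reported as non-numeric too. A short sketch on hypothetical values compares it with a pd.to_numeric mask, which may be a better fit when such values are expected:

```python
import pandas as pd

# Hypothetical values: everything except 'w' is a valid number.
s = pd.Series(["1000", "-5", "1.5", "w"])

# isdigit rejects the sign and the decimal point.
not_digits = ~s.str.isdigit()

# to_numeric parses signed and decimal strings; only 'w' becomes NaN.
not_numeric = pd.to_numeric(s, errors="coerce").isna()

print(s[not_digits])
print(s[not_numeric])
```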