Finding Non-Numeric Rows in Dataframe in Pandas

Finding non-numeric rows in dataframe in pandas?

You could use np.isreal to check the type of each element. applymap applies a function to each element of the DataFrame (in pandas 2.1 and later it is deprecated in favor of DataFrame.map):

In [11]: df.applymap(np.isreal)
Out[11]:
          a     b
item
a      True  True
b      True  True
c      True  True
d     False  True
e      True  True

If every value in a row is True, then the whole row is numeric:

In [12]: df.applymap(np.isreal).all(1)
Out[12]:
item
a     True
b     True
c     True
d    False
e     True
dtype: bool

So to get the sub-DataFrame of rogue rows, negate the mask with ~, which selects the rows that contain at least one non-numeric value:

In [13]: df[~df.applymap(np.isreal).all(1)]
Out[13]:
        a    b
item
d     bad  0.4

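To find the label of the first offender, use idxmin (False sorts before True, so the minimum is the first non-numeric row). In older pandas versions np.argmin on a Series returned the label; it now returns the integer position instead:

In [14]: df.applymap(np.isreal).all(1).idxmin()
Out[14]: 'd'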

As @CTZhu points out, it may be slightly faster to check whether it's an instance of either int or float (there is some additional overhead with np.isreal):

df.applymap(lambda x: isinstance(x, (int, float)))
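Putting the steps above together, a minimal self-contained sketch might look like the following. The sample values are made up to match the output shown earlier (only row d, with 'bad' and 0.4, comes from the original), and the helper name is_number is hypothetical. It uses apply with Series.map in place of applymap so that it also runs on recent pandas versions, and it checks against numbers.Number rather than np.isreal.

```python
import numbers

import pandas as pd


def is_number(x):
    # Hypothetical helper: bool is excluded because it is a subclass of int.
    return isinstance(x, numbers.Number) and not isinstance(x, bool)


# Made-up sample data; only row "d" mirrors the original example.
df = pd.DataFrame(
    {"a": [1, 2, 3, "bad", 5], "b": [0.1, 0.2, 0.3, 0.4, 0.5]},
    index=pd.Index(list("abcde"), name="item"),
)

# Element-wise check, one column at a time.
numeric_mask = df.apply(lambda col: col.map(is_number))

# True where every value in the row is numeric.
row_ok = numeric_mask.all(axis=1)

# Rows with at least one non-numeric value.
bad_rows = df[~row_ok]

# Label of the first row that fails the check.
first_bad = row_ok.idxmin()

print(bad_rows)
print(first_bad)
```

Note that idxmin returns the first label even when every row is numeric, so it is worth checking row_ok.all() before trusting first_bad.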

Get non-numeric rows in a column in pandas

Use boolean indexing with a mask created by to_numeric + isnull.

Note: This solution does not flag numbers stored as strings, such as '1' or '22', because to_numeric converts them successfully.

print (pd.to_numeric(df['num'], errors='coerce'))
0 -1.48
1 1.70
2 -6.18
3 0.25
4 NaN
5 0.25
Name: num, dtype: float64

print (pd.to_numeric(df['num'], errors='coerce').isnull())
0 False
1 False
2 False
3 False
4 True
5 False
Name: num, dtype: bool

print (df[pd.to_numeric(df['num'], errors='coerce').isnull()])
  N-D     num unit
4  Q5  sum(d)   UD
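As a runnable sketch of the mask above: the num values follow the printed output, while the N-D and unit values for rows other than row 4 are assumptions made up for illustration. The extra notna() term is an optional addition, so that cells that were already missing are not reported as non-numeric.

```python
import pandas as pd

# "num" follows the printed output; other "N-D"/"unit" values are assumed.
df = pd.DataFrame(
    {
        "N-D": ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"],
        "num": [-1.48, 1.70, -6.18, 0.25, "sum(d)", 0.25],
        "unit": ["UD"] * 6,
    }
)

# Values that cannot be parsed become NaN.
converted = pd.to_numeric(df["num"], errors="coerce")

# Flag cells that failed conversion but were not missing to begin with.
not_numeric = converted.isna() & df["num"].notna()

bad = df[not_numeric]
print(bad)
```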

Another solution with isinstance and apply:

print (df[df['num'].apply(lambda x: isinstance(x, str))])
  N-D     num unit
4  Q5  sum(d)   UD
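The two masks differ on numbers stored as strings. A small sketch with made-up values shows where they disagree:

```python
import pandas as pd

# Made-up values: "1" is a number stored as a string.
s = pd.Series(["1", 22, "sum(d)", 0.25], dtype=object)

# to_numeric parses "1", so only "sum(d)" is flagged.
coerce_mask = pd.to_numeric(s, errors="coerce").isna()

# isinstance flags every string, including "1".
str_mask = s.apply(lambda x: isinstance(x, str))

print(s[coerce_mask].tolist())
print(s[str_mask].tolist())
```

Which one is appropriate depends on whether a string such as '1' should count as numeric for your purposes.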

Find non-numeric values in pandas dataframe column

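You can try converting the column's dtype. astype raises a ValueError if any value cannot be converted, so this tells you whether non-numeric values exist, but not which rows hold them:

    df['column'] = df['column'].astype(int)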
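A hedged sketch of how the conversion attempt could be used as a check, with made-up data and a to_numeric fallback (not part of the original answer) to locate the offending rows:

```python
import pandas as pd

# Made-up data: one value cannot be converted to int.
df = pd.DataFrame({"column": [1, 2, "x", 4]})

try:
    df["column"].astype(int)
    has_non_numeric = False
except ValueError:
    # astype stops at the first value it cannot convert.
    has_non_numeric = True

# Assumed follow-up step: find the rows that caused the failure.
offenders = df[pd.to_numeric(df["column"], errors="coerce").isna()]

print(has_non_numeric)
print(offenders)
```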

Filter out non-numeric values from column

Try with pd.to_numeric():

pd.to_numeric(df.col1,errors='coerce').min()
#1.2
#or df.col1.apply(lambda x: pd.to_numeric(x,errors='coerce')).min() <- slow
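For illustration, here is a self-contained sketch with made-up col1 values, chosen so that the minimum is 1.2 as in the comment above. The same coerced Series can also serve as a filter mask.

```python
import pandas as pd

# Made-up values; the original data is not shown.
df = pd.DataFrame({"col1": [3, "abc", 1.2, "7.5", None]})

# Non-numeric entries become NaN, which min() skips by default.
numeric = pd.to_numeric(df["col1"], errors="coerce")
smallest = numeric.min()

# Keep only the rows whose value could be converted.
filtered = df[numeric.notna()]

print(smallest)
print(filtered)
```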

Need to delete non-numeric rows from a dataframe

I ended up doing it this way.

cols = df_append.columns[:-1]
df_append[cols] = df_append[cols].apply(pd.to_numeric, errors='coerce')
df_append = df_append.fillna(0)

That's good enough for my purpose!
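Note that fillna(0) keeps the affected rows and replaces the bad values with 0. If the rows really need to be deleted, dropna on the converted columns is one option. The sketch below uses made-up data and assumes, as in the snippet above, that the last column is non-numeric by design.

```python
import pandas as pd

# Made-up data; the last column is a label and is left untouched.
df_append = pd.DataFrame(
    {
        "x": [1, "oops", 3],
        "y": ["4.5", 5, 6],
        "label": ["a", "b", "c"],
    }
)

cols = list(df_append.columns[:-1])

converted = df_append.copy()
converted[cols] = converted[cols].apply(pd.to_numeric, errors="coerce")

# Original approach: keep every row, replace failures with 0.
filled = converted.fillna(0)

# Alternative: drop rows where any converted column failed.
dropped = converted.dropna(subset=cols)

print(filled)
print(dropped)
```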

How to display non numeric values from data frame

Use str.isdigit with ~ to invert the boolean mask:

In[6]: df.loc[~df['Value'].astype(str).str.isdigit()]

Out[6]:
   Measure   Value
1        B   1000/
2        C   1000*
4        E  1000 0
6        G     5..
8        I       w
10       L     NaN

If the dtype of the column is already str, then you don't need the astype(str) call.
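One caveat: str.isdigit is True only for strings made up entirely of digits, so negative numbers and decimals are also reported as non-numeric. A short sketch with made-up values compares it with a to_numeric mask:

```python
import pandas as pd

# Made-up values, including a negative number and a decimal.
values = pd.Series(["1000", "1000/", "-5", "3.14", "w"])

# isdigit rejects anything containing a sign or a decimal point.
isdigit_mask = ~values.astype(str).str.isdigit()

# to_numeric accepts signed and decimal strings.
coerce_mask = pd.to_numeric(values, errors="coerce").isna()

print(values[isdigit_mask].tolist())
print(values[coerce_mask].tolist())
```

If the column can contain signed or fractional values, the to_numeric mask is probably the safer choice.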


