Removing Specific Rows from a Dataframe

Drop a specific row in Pandas

import pandas as pd

df = pd.DataFrame([['Jhon',15,'A'],['Anna',19,'B'],['Paul',25,'D']])
df.columns = ['Name','Age','Grade']

df
Out[472]:
Name Age Grade
0 Jhon 15 A
1 Anna 19 B
2 Paul 25 D

You can get the index of your row:

i = df[((df.Name == 'Jhon') & (df.Age == 15) & (df.Grade == 'A'))].index

and then drop it:

df.drop(i)
Out[474]:
Name Age Grade
1 Anna 19 B
2 Paul 25 D

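As @jezrael pointed out, you can also filter with negated conditions. Note that joining the three negations with & drops every row that matches any one of the conditions, so it is stricter than dropping the single row that matches all three (the exact negation joins them with |). On this data both give the same result:

df[((df.Name != 'Jhon') & (df.Age != 15) & (df.Grade != 'A'))]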
Out[477]:
Name Age Grade
1 Anna 19 B
2 Paul 25 D
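
To see how the exact negation and the per-condition negation can differ, here is a small sketch. The extra row for "Mary" is hypothetical and was added only to make the difference visible; it is not part of the original data.

```python
import pandas as pd

# Original three rows plus one hypothetical row (Mary) that shares only the Age.
df = pd.DataFrame(
    [["Jhon", 15, "A"], ["Anna", 19, "B"], ["Paul", 25, "D"], ["Mary", 15, "C"]],
    columns=["Name", "Age", "Grade"],
)

# Mask for the single row to remove: all three conditions must hold.
target = (df.Name == "Jhon") & (df.Age == 15) & (df.Grade == "A")

# Exact negation: keeps every row that is not the target row.
exact = df[~target]

# Negating each condition and joining with & is stricter: it also drops
# Mary, because her Age is 15 even though her Name and Grade differ.
strict = df[(df.Name != "Jhon") & (df.Age != 15) & (df.Grade != "A")]

print(exact)
print(strict)
```

If the goal is to remove only the row that matches every condition, the ~target form is probably the safer choice.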

How to delete the selected rows from the dataframe

Why even delete them? Just filter them out. You want to keep the rows that don't satisfy the conditions you specified here, so you want the negation of A&B, which is ~A|~B:

sv = sv[(sv['Visit_Duration'] != 'Never')  |  (sv['Visit_Plan'] != 'never')]

or equivalently (~(A&B)),

sv = sv[~((sv['Visit_Duration'] == 'Never')  &  (sv['Visit_Plan'] == 'never'))]
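
A quick way to convince yourself that the two forms are the same mask is to build both and compare them. The data below is hypothetical; only the two column names and the compared values come from the answer.

```python
import pandas as pd

# Hypothetical survey data covering all four combinations of the two conditions.
sv = pd.DataFrame({
    "Visit_Duration": ["Never", "Never", "1 hour", "2 hours"],
    "Visit_Plan": ["never", "soon", "never", "soon"],
})

a = sv["Visit_Duration"] == "Never"
b = sv["Visit_Plan"] == "never"

# De Morgan: ~(A & B) is the same mask as ~A | ~B.
keep_1 = ~(a & b)
keep_2 = ~a | ~b
assert keep_1.equals(keep_2)

# Only the row where both conditions hold is filtered out.
filtered = sv[keep_1]
print(filtered)
```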

Deleting DataFrame row in Pandas based on column value

If I'm understanding correctly, it should be as simple as:

df = df[df.line_race != 0]
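
As a self-contained sketch of the same idea (the frame and the horse column are hypothetical; only the column name line_race comes from the answer), with an optional reset_index so the remaining rows are renumbered:

```python
import pandas as pd

# Hypothetical data; two rows have line_race equal to 0.
df = pd.DataFrame({"horse": ["a", "b", "c", "d"], "line_race": [11, 0, 9, 0]})

# Keep rows whose line_race is not 0, then renumber the index from 0.
df = df[df["line_race"] != 0].reset_index(drop=True)
print(df)
```

Without reset_index(drop=True) the surviving rows keep their original labels (0 and 2 here), which is often what you want when the index is meaningful.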

Remove all rows having a specific value in dataframe

You are checking df1 != 999.9. With that check only the first row is deleted, because every other row has at least one column that is != 999.9.

Try this:

>>> mask = (df1 == 999.9).any(axis=1)
>>> df1[~mask]

For more explanation:

>>> df1 == 999.9
a b
0 True True
1 True False
2 False True
3 False False
4 False False

In your solution:

>>> (df1 != 999.9)
a b
0 False False
1 False True
2 True False
3 True True
4 True True

>>> (df1 != 999.9).any(axis = 1) # checks across each row
0 False
1 True
2 True
3 True
4 True
dtype: bool

>>> (df1 != 999.9).any(axis = 0) # checks down each column
a True
b True
dtype: bool
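
The whole walk-through can be reproduced with a small frame. The numeric values below are hypothetical and were picked only so that 999.9 appears in the same positions as in the tables above.

```python
import pandas as pd

# Hypothetical values; 999.9 sits in the same cells as in the output above.
df1 = pd.DataFrame({
    "a": [999.9, 999.9, 1.0, 2.0, 3.0],
    "b": [999.9, 4.0, 999.9, 5.0, 6.0],
})

# True for every row that has 999.9 in at least one column.
mask = (df1 == 999.9).any(axis=1)
cleaned = df1[~mask]

# Equivalent: keep rows where all columns differ from 999.9.
cleaned_alt = df1[(df1 != 999.9).all(axis=1)]
print(cleaned)
```

The second form shows what the original attempt was probably after: all(axis=1) rather than any(axis=1) on the != comparison.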

removing specific rows from pandas dataframe

Use value_counts() on studentID

import pandas as pd

df = pd.DataFrame({'studentID':['a','a','a','b','b','b', 'c'],
'problemID':[1,2,3,1,2,3,1]})
print(df)
tmp = df['studentID'].value_counts()
tmp = tmp[tmp >= 3]
new_df = df[df['studentID'].isin(tmp.index)]
print(new_df)

Output:

  studentID  problemID
0 a 1
1 a 2
2 a 3
3 b 1
4 b 2
5 b 3
6 c 1

studentID problemID
0 a 1
1 a 2
2 a 3
3 b 1
4 b 2
5 b 3
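
An alternative sketch, not from the original answer, uses groupby with transform("size") so the per-student count lines up with the rows and can be used directly as a mask:

```python
import pandas as pd

df = pd.DataFrame({"studentID": ["a", "a", "a", "b", "b", "b", "c"],
                   "problemID": [1, 2, 3, 1, 2, 3, 1]})

# Size of each student's group, broadcast back to every row of that group.
counts = df.groupby("studentID")["studentID"].transform("size")

# Keep only rows belonging to students with at least 3 rows.
new_df = df[counts >= 3]
print(new_df)
```

Both approaches should give the same rows; which one reads better is mostly a matter of taste.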

Remove dataframe rows containing a specific value from a list

You can approach this in the following steps:

  1. Use pd.Series.explode() on each column to expand the lists of strings into multiple rows, so that each cell contains a single string (all lists are exploded into rows).

  2. Check the dataframe for strings in the to_delete list by using .isin().

  3. Group by index level 0 (which holds the original row index from before the explode) to aggregate the match results of the multiple rows back into one row (using .sum() under groupby()).

  4. Use .sum(axis=1) to count, row-wise, the matching strings to delete.

  5. Check for rows with 0 matches (the rows to retain) to form a boolean index.

  6. Finally, use .loc to keep only the rows without a match.



df.loc[df.apply(pd.Series.explode).isin(to_delete).groupby(level=0).sum().sum(axis=1).eq(0)]

Result:

         A        B          C           D           E
1 string2 string5 [string8] [string13] [string16]

The original dataframe can be built for testing with the following code:

data = {'A': ['string1', 'string2', 'string3'],
'B': ['string4', 'string5', 'string6'],
'C': [['string7', 'string10'], ['string8'], ['string9']],
'D': [['string11', 'string 12'], ['string13'], ['string14']],
'E': [['string15'], ['string16'], ['string17']]}

df = pd.DataFrame(data)
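
If the chained one-liner is hard to follow, a plainer row-wise variant may help. This is a different technique from the explode approach above: it flattens each row into a set of strings and keeps the row only if that set shares nothing with to_delete. The to_delete list and the helper row_values are hypothetical, chosen so that only row 1 survives.

```python
import pandas as pd

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)

# Hypothetical list; rows containing any of these strings are removed.
to_delete = ['string10', 'string14']

def row_values(row):
    """Flatten one row into a set of strings, unpacking list cells."""
    values = set()
    for cell in row:
        if isinstance(cell, list):
            values.update(cell)
        else:
            values.add(cell)
    return values

# True for rows that share no string with to_delete.
keep = df.apply(lambda row: row_values(row).isdisjoint(to_delete), axis=1)
result = df.loc[keep]
print(result)
```

The row-wise apply is likely slower than a vectorized approach on large frames, but it does not depend on the list columns having matching lengths.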

pandas dataframe drop problem, want to delete specific rows?

You can use the .isin() method of pandas Series

df2["Account Number"].isin(df1["Account Number"])

This gives you a Series of boolean values that is True for every row of df2 whose Account Number is also present in df1. Since you want to discard those rows, you can index with the negated Series (using ~, the negation operator) like this:

df3 = df2[~df2["Account Number"].isin(df1["Account Number"])]
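
A minimal runnable sketch of this filter follows. The account numbers and the Balance column are hypothetical; only the column name "Account Number" and the frame names come from the answer.

```python
import pandas as pd

# Hypothetical accounts: 101 and 102 appear in both frames.
df1 = pd.DataFrame({"Account Number": [101, 102]})
df2 = pd.DataFrame({"Account Number": [101, 102, 103, 104],
                    "Balance": [10.0, 20.0, 30.0, 40.0]})

# True where the account in df2 also exists in df1.
in_df1 = df2["Account Number"].isin(df1["Account Number"])

# Keep only the df2 rows whose account is absent from df1.
df3 = df2[~in_df1]
print(df3)
```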

Pandas delete a row in a dataframe based on a value

Find the row you want to delete, and use drop.

delete_row = df[df["Int"]==0].index
df = df.drop(delete_row)
print(df)
Code Int
1 A 1
2 B 1

Furthermore, you can use iloc to select the column by position, if you know where the column is:

delete_row = df[df.iloc[:,1]==0].index
df = df.drop(delete_row)
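
For comparison, here is a sketch showing that dropping by index and filtering with a boolean mask give the same rows. The data is hypothetical, shaped to match the output shown above.

```python
import pandas as pd

# Hypothetical data: the first row has Int equal to 0.
df = pd.DataFrame({"Code": ["C", "A", "B"], "Int": [0, 1, 1]})

# Approach from the answer: collect index labels, then drop them.
dropped = df.drop(df[df["Int"] == 0].index)

# Same result with a boolean mask and no intermediate index.
filtered = df[df["Int"] != 0]

print(dropped.equals(filtered))
```

One caveat worth keeping in mind: drop removes rows by label, so if the index contains duplicate labels, every row carrying a dropped label is removed, while the mask works row by row.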

Pandas + Delete specific rows not by index

I suggest:

df = df[df.fullname!='xxx']

and for a list:

names=['xxx','bbb']
df = df[~df.fullname.isin(names)]

The ~ operator means "not": it inverts the boolean mask returned by isin.


