Deleting Specific Rows from a Data Frame

Drop a specific row in Pandas

df = pd.DataFrame([['Jhon', 15, 'A'], ['Anna', 19, 'B'], ['Paul', 25, 'D']])
df.columns = ['Name', 'Age', 'Grade']

df
Out[472]:
   Name  Age Grade
0  Jhon   15     A
1  Anna   19     B
2  Paul   25     D

You can get the index of your row:

i = df[(df.Name == 'Jhon') & (df.Age == 15) & (df.Grade == 'A')].index

and then drop it:

df.drop(i)
Out[474]:
   Name  Age Grade
1  Anna   19     B
2  Paul   25     D

As @jezrael pointed out, you can also just negate all three conditions (note that this keeps only rows where every one of the three values differs, which happens to work for this data; the strict negation of A & B & C is ~A | ~B | ~C):

df[(df.Name != 'Jhon') & (df.Age != 15) & (df.Grade != 'A')]
Out[477]:
   Name  Age Grade
1  Anna   19     B
2  Paul   25     D
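For reference, here is the index-and-drop approach from above as one self-contained, runnable script:

```python
import pandas as pd

# Rebuild the example frame from above
df = pd.DataFrame([['Jhon', 15, 'A'], ['Anna', 19, 'B'], ['Paul', 25, 'D']],
                  columns=['Name', 'Age', 'Grade'])

# Locate the index of the matching row, then drop it
i = df[(df.Name == 'Jhon') & (df.Age == 15) & (df.Grade == 'A')].index
result = df.drop(i)

print(result)
#    Name  Age Grade
# 1  Anna   19     B
# 2  Paul   25     D
```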

How to delete the selected rows from the dataframe

Why even delete them? Just filter them out. You want to keep the rows that don't satisfy the conditions you specified here, so you want the negation of A&B, which is ~A|~B:

sv = sv[(sv['Visit_Duration'] != 'Never')  |  (sv['Visit_Plan'] != 'never')]

or equivalently (~(A&B)),

sv = sv[~((sv['Visit_Duration'] == 'Never')  &  (sv['Visit_Plan'] == 'never'))]
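A quick sketch showing the two forms are equivalent, using made-up sample data for sv (the question's real frame isn't shown):

```python
import pandas as pd

# Hypothetical sample data for sv
sv = pd.DataFrame({'Visit_Duration': ['Never', '10min', 'Never'],
                   'Visit_Plan':     ['never', 'never', 'weekly']})

# ~A | ~B ...
keep_or = sv[(sv['Visit_Duration'] != 'Never') | (sv['Visit_Plan'] != 'never')]

# ... selects the same rows as ~(A & B), by De Morgan's law
keep_not = sv[~((sv['Visit_Duration'] == 'Never') & (sv['Visit_Plan'] == 'never'))]

print(keep_or.equals(keep_not))  # both drop only row 0
```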

Remove all rows having a specific value in dataframe

Your check df1 != 999.9 deletes only the first row, because every other row has at least one column that is not equal to 999.9.

Try this:

>>> mask = (df1 == 999.9).any(axis=1)
>>> df1[~mask]

# for more explanation

>>> df1 == 999.9
       a      b
0   True   True
1   True  False
2  False   True
3  False  False
4  False  False

In your solution:

>>> df1 != 999.9
       a      b
0  False  False
1  False   True
2   True  False
3   True   True
4   True   True

>>> (df1 != 999.9).any(axis=1)  # check rows
0    False
1     True
2     True
3     True
4     True
dtype: bool

>>> (df1 != 999.9).any(axis=0)  # check columns
a    True
b    True
dtype: bool
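A runnable version of the mask approach, reconstructing df1 from the boolean tables above (the non-999.9 values are made up):

```python
import pandas as pd

# Frame matching the boolean tables above; non-999.9 values are assumptions
df1 = pd.DataFrame({'a': [999.9, 999.9, 1.0, 2.0, 3.0],
                    'b': [999.9, 4.0, 999.9, 5.0, 6.0]})

mask = (df1 == 999.9).any(axis=1)   # True for any row containing 999.9
result = df1[~mask]                 # keep only rows with no 999.9 at all

print(result)
#      a    b
# 3  2.0  5.0
# 4  3.0  6.0
```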

Deleting DataFrame row in Pandas based on column value

If I'm understanding correctly, it should be as simple as:

df = df[df.line_race != 0]
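For example, with hypothetical data (the question only tells us the frame has a line_race column):

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'line_race': [10, 0, 7, 0], 'rating': [80, 65, 90, 70]})

df = df[df.line_race != 0]   # keep only rows whose line_race is non-zero
print(df)
```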

Pandas delete a row in a dataframe based on a value

Find the row you want to delete, and use drop.

delete_row = df[df["Int"] == 0].index
df = df.drop(delete_row)
print(df)

  Code  Int
1    A    1
2    B    1

Furthermore, you can use iloc to select the column by position instead of by name:

delete_row = df[df.iloc[:,1]==0].index
df = df.drop(delete_row)
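Put together with a reconstructed input (row 0's Code value is an assumption; only its Int == 0 is implied by the output above):

```python
import pandas as pd

# Reconstructed input: row 0 is the one with Int == 0 that gets dropped
df = pd.DataFrame({'Code': ['C', 'A', 'B'], 'Int': [0, 1, 1]})

# Column 1 ("Int") selected by position instead of by name
delete_row = df[df.iloc[:, 1] == 0].index
df = df.drop(delete_row)

print(df)
#   Code  Int
# 1    A    1
# 2    B    1
```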

Remove dataframe rows containing a specific value from a list

You can approach in the following steps:

  1. Use pd.Series.explode() on each column to expand the lists of strings into multiple rows, so that every cell holds a single string.

  2. Check the exploded dataframe against the strings in the to_delete list using .isin().

  3. Group by index level 0 (the original row index before the explode) and aggregate with .sum() to collapse the per-element match results back into one row per original row.

  4. Then .sum(axis=1) to count, row-wise, how many strings matched.

  5. Rows with 0 matches are the rows to retain; .eq(0) turns this into a boolean index.

  6. Finally, use .loc with that boolean index to keep the non-matching rows.



df.loc[df.apply(pd.Series.explode).isin(to_delete).groupby(level=0).sum().sum(axis=1).eq(0)]

Result:

         A        B          C           D           E
1  string2  string5  [string8]  [string13]  [string16]

The original dataframe can be built for testing from the following codes:

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}

df = pd.DataFrame(data)

Removing specific rows from a dataframe

DF[ ! ((DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4)), ]   # note the ! (negation)

Or if sub is a factor as suggested by your use of quotes:

DF[ ! paste(sub,day,sep="_") %in% c("1_2", "3_4"), ]

Could also use subset:

subset(DF,  ! paste(sub,day,sep="_") %in% c("1_2", "3_4") )

(And I endorse the use of which in Dirk's answer when using "[" even though some claim it is not needed.)

Remove rows in pandas dataframe after certain value (while for looping?)

You can check the condition and then use cummax to set the condition to True after the first time it occurs within group. Then we slice the DataFrame:

mask = ~(a['event_type'].eq('purchase').groupby(a['user_session']).cummax())

a[mask]
#   user_session event_type product_id
# 0            1       view          a
# 1            1       cart          b
# 2            2       cart          b
# 3            2       cart          c
# 4            2       view          d

Or if you need to also keep the purchase row use two groupbys, with a shift for the second:

mask = ~(a['event_type'].eq('purchase')
          .groupby(a['user_session']).cummax()
          .groupby(a['user_session']).shift()
          .fillna(False))

a[mask]
#   user_session event_type product_id
# 0            1       view          a
# 1            1       cart          b
# 2            2       cart          b
# 3            2       cart          c
# 4            2       view          d
# 5            2   purchase          d
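A runnable reconstruction of the question's frame (the exact rows aren't shown, so a frame consistent with both outputs above is assumed; note the explicit astype(bool) after fillna, since shift introduces NaN and makes the dtype object):

```python
import pandas as pd

# Assumed reconstruction: session 2 has rows after its purchase
a = pd.DataFrame({'user_session': [1, 1, 2, 2, 2, 2, 2],
                  'event_type':   ['view', 'cart', 'cart', 'cart', 'view',
                                   'purchase', 'view'],
                  'product_id':   ['a', 'b', 'b', 'c', 'd', 'd', 'e']})

# cummax turns every row at or after a session's first purchase into True
hit = a['event_type'].eq('purchase').groupby(a['user_session']).cummax()

drop_from_purchase = a[~hit]  # drops the purchase row and everything after it

# Shift within each session to keep the purchase row itself
later = hit.groupby(a['user_session']).shift().fillna(False).astype(bool)
keep_purchase = a[~later]     # drops only the rows after the purchase
```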

Remove rows in pandas dataframe if any of specific columns contains a specific value

I think this is what you're asking:

df[~(df.le(95) & df.columns.str.contains("test"))].dropna()

Example (df):

   pump.test  Speed  feed.test  water
0        100   1000         70    0.2
1        100   2000        100    0.3
2        100   3000        100    0.4
3         95   4000        100    0.5

Output of the operation above:

   pump.test  Speed  feed.test  water
1        100   2000        100    0.3
2        100   3000        100    0.4
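The whole example as one script, rebuilding the frame above. The trick is that df[mask] with a boolean DataFrame replaces non-matching cells with NaN, so .dropna() then removes any row that had a value <= 95 in a "test" column:

```python
import pandas as pd

# The example frame from above
df = pd.DataFrame({'pump.test': [100, 100, 100, 95],
                   'Speed':     [1000, 2000, 3000, 4000],
                   'feed.test': [70, 100, 100, 100],
                   'water':     [0.2, 0.3, 0.4, 0.5]})

# Mask cells that are <= 95 AND sit in a "test" column, then drop those rows
is_test_col = df.columns.str.contains('test')
result = df[~(df.le(95) & is_test_col)].dropna()
print(result)
```

Note that the NaN-masking step promotes integer columns to float in the output.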

