Drop a specific row in Pandas
df = pd.DataFrame([['Jhon',15,'A'],['Anna',19,'B'],['Paul',25,'D']])
df.columns = ['Name','Age','Grade']
df
Out[472]:
   Name  Age Grade
0  Jhon   15     A
1  Anna   19     B
2  Paul   25     D
You can get the index of your row:
i = df[(df.Name == 'Jhon') & (df.Age == 15) & (df.Grade == 'A')].index
and then drop it:
df.drop(i)
Out[474]:
   Name  Age Grade
1  Anna   19     B
2  Paul   25     D
As @jezrael pointed out, you can also filter in a single step by negating the whole condition (note that negating each comparison individually and joining with & is not a correct negation in general):
df[~((df.Name == 'Jhon') & (df.Age == 15) & (df.Grade == 'A'))]
Out[477]:
   Name  Age Grade
1  Anna   19     B
2  Paul   25     D
How to delete the selected rows from the dataframe
Why even delete them? Just filter them out. You want to keep the rows that don't satisfy the conditions you specified, so you want the negation of A & B, which is ~A | ~B:
sv = sv[(sv['Visit_Duration'] != 'Never') | (sv['Visit_Plan'] != 'never')]
or equivalently, ~(A & B):
sv = sv[~((sv['Visit_Duration'] == 'Never') & (sv['Visit_Plan'] == 'never'))]
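The equivalence of the two forms (De Morgan's law) can be checked on a small sketch; the sv frame below is a made-up stand-in, since the original data isn't shown:

```python
import pandas as pd

# Hypothetical stand-in for sv; only the two relevant columns are sketched.
sv = pd.DataFrame({'Visit_Duration': ['Never', '10 min', 'Never'],
                   'Visit_Plan': ['never', 'never', 'weekly']})

# ~A | ~B
kept_or = sv[(sv['Visit_Duration'] != 'Never') | (sv['Visit_Plan'] != 'never')]
# ~(A & B)
kept_and = sv[~((sv['Visit_Duration'] == 'Never') & (sv['Visit_Plan'] == 'never'))]

print(kept_or.equals(kept_and))  # True: both drop only the row matching A & B
```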
Remove all rows having a specific value in dataframe
Your check df1 != 999.9 deletes only the first row, because every other row has at least one column that is != 999.9.
Try this:
>>> mask = (df1 == 999.9).any(axis=1)
>>> df1[~mask]
# for more explanation
>>> df1 == 999.9
       a      b
0   True   True
1   True  False
2  False   True
3  False  False
4  False  False
In your solution:
>>> df1 != 999.9
       a      b
0  False  False
1  False   True
2   True  False
3   True   True
4   True   True
>>> (df1 != 999.9).any(axis=1)  # check rows
0    False
1     True
2     True
3     True
4     True
dtype: bool
>>> (df1 != 999.9).any(axis=0)  # check columns
a    True
b    True
dtype: bool
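The masks above can be reproduced with a self-contained sketch; the actual values of df1 are assumptions chosen to match the True/False patterns shown:

```python
import pandas as pd

# Assumed values for df1, chosen to match the boolean masks above.
df1 = pd.DataFrame({'a': [999.9, 999.9, 1.0, 2.0, 3.0],
                    'b': [999.9, 4.0, 999.9, 5.0, 6.0]})

mask = (df1 == 999.9).any(axis=1)  # True for any row containing 999.9
print(df1[~mask])                  # keeps only rows 3 and 4
```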
Deleting DataFrame row in Pandas based on column value
If I'm understanding correctly, it should be as simple as:
df = df[df.line_race != 0]
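A minimal runnable sketch of this one-liner; the 'horse' column and the line_race values are made-up illustration:

```python
import pandas as pd

# Hypothetical frame: drop every row whose line_race is 0.
df = pd.DataFrame({'horse': ['A', 'B', 'C'], 'line_race': [2, 0, 5]})
df = df[df.line_race != 0]  # boolean filtering returns a new frame
print(df)
```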
Pandas delete a row in a dataframe based on a value
Find the row you want to delete, and use drop.
delete_row = df[df["Int"]==0].index
df = df.drop(delete_row)
print(df)
  Code  Int
1    A    1
2    B    1
Furthermore, you can use iloc to find the row if you know the position of the column:
delete_row = df[df.iloc[:,1]==0].index
df = df.drop(delete_row)
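A self-contained sketch of the iloc variant; row 0 is assumed to be the one holding Int == 0, matching the output above:

```python
import pandas as pd

# Reconstruction of the example frame (row 0 assumed to hold Int == 0).
df = pd.DataFrame({'Code': ['X', 'A', 'B'], 'Int': [0, 1, 1]})

delete_row = df[df.iloc[:, 1] == 0].index  # column position 1 is 'Int'
df = df.drop(delete_row)
print(df)
```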
Remove dataframe rows containing a specific value from a list
You can approach it in the following steps:
1. Use pd.Series.explode() on each column to expand the lists of strings into multiple rows, so that each resulting row contains only single strings.
2. Check the exploded dataframe for strings in the to_delete list using .isin().
3. Group by index level 0 (which holds the original row index from before the explode) and aggregate with .sum() under groupby() to collapse the exploded rows back into one row of match counts per original index.
4. Use .sum(axis=1) to count matches row-wise across columns.
5. Rows with 0 matches are the ones to retain; comparing against 0 forms a boolean index.
6. Finally, use .loc with that boolean index to keep only the rows with no match.
df.loc[df.apply(pd.Series.explode).isin(to_delete).groupby(level=0).sum().sum(axis=1).eq(0)]
Result:
         A        B          C           D           E
1  string2  string5  [string8]  [string13]  [string16]
The original dataframe can be built for testing from the following code:
data = {'A': ['string1', 'string2', 'string3'],
'B': ['string4', 'string5', 'string6'],
'C': [['string7', 'string10'], ['string8'], ['string9']],
'D': [['string11', 'string 12'], ['string13'], ['string14']],
'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)
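An equivalent, self-contained sketch that avoids the explode step by checking each cell directly; the to_delete list is an assumption chosen to reproduce the result shown above:

```python
import pandas as pd

to_delete = ['string10', 'string17']  # assumed example list

data = {'A': ['string1', 'string2', 'string3'],
        'B': ['string4', 'string5', 'string6'],
        'C': [['string7', 'string10'], ['string8'], ['string9']],
        'D': [['string11', 'string 12'], ['string13'], ['string14']],
        'E': [['string15'], ['string16'], ['string17']]}
df = pd.DataFrame(data)

def has_deleted(cell):
    # Treat scalar strings and lists of strings uniformly.
    items = cell if isinstance(cell, list) else [cell]
    return any(s in to_delete for s in items)

# True for any row containing a string from to_delete, in any column.
mask = df.apply(lambda col: col.map(has_deleted)).any(axis=1)
result = df[~mask]
print(result)  # only row 1 survives
```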
Removing specific rows from a dataframe
DF[ ! ( ( DF$sub ==1 & DF$day==2) | ( DF$sub ==3 & DF$day==4) ) , ] # note the ! (negation)
Or if sub is a factor as suggested by your use of quotes:
DF[ ! paste(sub,day,sep="_") %in% c("1_2", "3_4"), ]
Could also use subset:
subset(DF, ! paste(sub,day,sep="_") %in% c("1_2", "3_4") )
(And I endorse the use of which in Dirk's answer when using "[", even though some claim it is not needed.)
Remove rows in pandas dataframe after certain value (while for looping?)
You can check the condition and then use cummax to set the condition to True after the first time it occurs within each group. Then we slice the DataFrame:
mask = ~(a['event_type'].eq('purchase').groupby(a['user_session']).cummax())
a[mask]
# user_session event_type product_id
#0 1 view a
#1 1 cart b
#2 2 cart b
#3 2 cart c
#4 2 view d
Or, if you need to also keep the purchase row, use two groupbys, with a shift for the second:
mask = ~(a['event_type'].eq('purchase')
.groupby(a['user_session']).cummax()
.groupby(a['user_session']).shift()
.fillna(False))
a[mask]
# user_session event_type product_id
#0 1 view a
#1 1 cart b
#2 2 cart b
#3 2 cart c
#4 2 view d
#5 2 purchase d
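The first variant can be run end to end on a reconstruction of the example frame; the rows are inferred from the outputs above, with session 2 assumed to have one trailing event after its purchase:

```python
import pandas as pd

# Reconstruction of the example frame (inferred from the printed outputs;
# the trailing 'view e' event after session 2's purchase is an assumption).
a = pd.DataFrame({
    'user_session': [1, 1, 2, 2, 2, 2, 2],
    'event_type':   ['view', 'cart', 'cart', 'cart', 'view', 'purchase', 'view'],
    'product_id':   ['a', 'b', 'b', 'c', 'd', 'd', 'e'],
})

# True from the first 'purchase' onward within each session; ~ keeps the rest.
mask = ~(a['event_type'].eq('purchase').groupby(a['user_session']).cummax())
print(a[mask])  # the purchase row and everything after it (per session) is gone
```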
Remove rows in pandas dataframe if any of specific columns contains a specific value
I think this is what you're asking:
df[~(df.le(95) & df.columns.str.contains("test"))].dropna()
Example (df):
   pump.test  Speed  feed.test  water
0        100   1000         70    0.2
1        100   2000        100    0.3
2        100   3000        100    0.4
3         95   4000        100    0.5
Output of the operation above:
   pump.test  Speed  feed.test  water
1        100   2000        100    0.3
2        100   3000        100    0.4
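A self-contained sketch reconstructing the example above (values copied from the printed frame). Cells in "test" columns with values <= 95 become NaN under the mask, and dropna() then removes those rows:

```python
import pandas as pd

# Reconstruction of the example df above.
df = pd.DataFrame({'pump.test': [100, 100, 100, 95],
                   'Speed':     [1000, 2000, 3000, 4000],
                   'feed.test': [70, 100, 100, 100],
                   'water':     [0.2, 0.3, 0.4, 0.5]})

# df.le(95) flags small values; the column-name mask restricts it to *test* columns.
# Masked cells become NaN, and dropna() removes any row containing one.
out = df[~(df.le(95) & df.columns.str.contains("test"))].dropna()
print(out)  # rows 0 and 3 are dropped
```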