Deleting DataFrame row in Pandas based on column value
If I'm understanding correctly, it should be as simple as:
df = df[df.line_race != 0]
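For illustration, here is a minimal sketch of that filter on made-up data. Only the column name line_race comes from the answer; the horse column and the values are assumptions.

```python
import pandas as pd

# Hypothetical sample data; only the column name line_race comes from the answer.
df = pd.DataFrame({"horse": ["a", "b", "c", "d"], "line_race": [3, 0, 7, 0]})

# The comparison builds a boolean mask; indexing with it keeps the True rows.
df = df[df.line_race != 0]

# Filtering keeps the original index labels (0 and 2 here), so renumber if needed.
df = df.reset_index(drop=True)
print(df)
```

The reset_index call is optional; it only matters if later code relies on a contiguous 0..n-1 index.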
How to delete rows from a pandas DataFrame based on a conditional expression
When you do len(df['column name']) you are just getting one number, namely the number of rows in the DataFrame (i.e., the length of the column itself). If you want to apply len to each element in the column, use df['column name'].map(len). So try
df[df['column name'].map(len) < 2]
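A small sketch of the difference, using invented string values (the data is an assumption, not taken from the question):

```python
import pandas as pd

# Hypothetical data: a column of strings of different lengths.
df = pd.DataFrame({"column name": ["a", "abc", "", "xy"]})

# len() on the column is a single number: the row count.
total_rows = len(df["column name"])

# map(len) applies len to every element and returns a Series of lengths.
lengths = df["column name"].map(len)

# Keep only the rows whose value is shorter than 2 characters.
short = df[lengths < 2]
print(total_rows, lengths.tolist(), short.index.tolist())
```

If the column can contain missing values, map(len) raises on them; for string data, df['column name'].str.len() returns NaN for those rows instead.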
Delete some rows in dataframe based on condition in another column
Not the most beautiful of ways to do it but this should work.
df = df.loc[df['value'].groupby(df['name']).cumsum().groupby(df['name']).cumsum() <=1]
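The original question is not reproduced here, so the following is only a sketch under an assumption: value is a 0/1 flag and the goal is to keep, for each name, the rows up to and including the first 1. The data is made up.

```python
import pandas as pd

# Hypothetical data; "value" is assumed to be a 0/1 flag.
df = pd.DataFrame({
    "name": ["a", "a", "a", "a", "b", "b", "b"],
    "value": [0, 1, 0, 1, 1, 0, 0],
})

# First cumsum: running count of 1s within each name.
first = df["value"].groupby(df["name"]).cumsum()

# Second cumsum: stays <= 1 only up to and including the first 1,
# because every later row adds at least 1 more.
second = first.groupby(df["name"]).cumsum()

out = df.loc[second <= 1]
print(out)
```

With non-binary values the double cumulative sum would mean something else, so check the assumption against your data before relying on it.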
Pandas: Remove rows where all values equal a certain value
I think you need .all:
df = df[~df.iloc[:, 1:5].eq('NF').all(axis=1)]
That will remove all rows where every value in the selected columns (positions 1 to 4) is equal to NF.
For multiple values:
df = df[~df.iloc[:, 1:5].isin(['NF', 'ABC', 'DEF']).all(axis=1)]
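"Every value matches" and "any value matches" are easy to mix up, so here is a sketch that puts the variants side by side. The frame and its column names are invented for the example.

```python
import pandas as pd

# Hypothetical frame: an id column followed by four value columns.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "c1": ["NF", "x", "NF", "p"],
    "c2": ["NF", "NF", "ABC", "q"],
    "c3": ["NF", "y", "DEF", "r"],
    "c4": ["NF", "z", "NF", "s"],
})
block = df.iloc[:, 1:5]  # columns at positions 1..4

# Drop rows where EVERY selected value equals 'NF'.
drop_if_all = df[~block.eq("NF").all(axis=1)]

# Drop rows where ANY selected value equals 'NF'
# (equivalently: keep rows where all values differ from 'NF').
drop_if_any = df[block.ne("NF").all(axis=1)]

# Drop rows where every selected value is one of several markers.
drop_if_all_in = df[~block.isin(["NF", "ABC", "DEF"]).all(axis=1)]

print(drop_if_all["id"].tolist())
print(drop_if_any["id"].tolist())
print(drop_if_all_in["id"].tolist())
```

Which variant is right depends on whether a single NF should be enough to discard the row.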
How to remove rows based on next value in a sequence? (pandas)
Let us do
m1 = (df['outcome'] != df['outcome'].shift()).cumsum()
out = df.groupby([df['id'],m1]).head(1)
    id        date outcome
0    3  03/05/2019      no
3    3  30/10/2019     yes
4    3  03/05/2020      no
5    5  03/12/2019      no
7    5  27/01/2020     yes
9    6  04/05/2019      no
11   6  26/11/2019     yes
26   6  05/05/2020      no
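To see why both id and m1 go into the groupby, here is a reduced sketch on made-up data (the column names follow the answer, the values do not come from the question):

```python
import pandas as pd

# Hypothetical data: two ids, each with runs of repeated outcomes.
df = pd.DataFrame({
    "id": [3, 3, 3, 3, 5, 5, 5],
    "outcome": ["no", "no", "yes", "no", "no", "yes", "yes"],
})

# A new run starts wherever outcome differs from the previous row;
# the cumulative sum turns those change points into run numbers.
m1 = (df["outcome"] != df["outcome"].shift()).cumsum()

# Rows 3 and 4 share a run number ("no" followed by "no") but belong to
# different ids, so grouping on id as well keeps them apart.
out = df.groupby([df["id"], m1]).head(1)
print(out)
```

head(1) keeps the first row of each (id, run) group and preserves the original row order.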
How to delete row in pandas dataframe based on condition if string is found in cell value of type list?
Since you are filtering on a column of lists, apply with a lambda is probably the easiest:
df.loc[df.jpgs.apply(lambda x: "123.jpg" not in x)]
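A self-contained sketch of that filter; the name column and the file names other than "123.jpg" are invented for the example:

```python
import pandas as pd

# Hypothetical data: each cell of jpgs holds a Python list of file names.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "jpgs": [["123.jpg", "456.jpg"], ["789.jpg"], []],
})

# apply runs the membership test once per cell, i.e. once per list.
mask = df["jpgs"].apply(lambda files: "123.jpg" not in files)

kept = df.loc[mask]
print(kept)
```

Note that df.loc[...] returns the filtered frame without modifying df, so assign the result if you want to keep it.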
Quick comments on your attempts:
In
df = df.drop(df["123.jpg" in df.jpgs].index)
the expression "123.jpg" in df.jpgs evaluates to a single boolean (and in on a Series tests the index labels, not the values), so it never looks inside any of the lists, which is not what you want.
df = df[df['jpgs'].str.contains('123.jpg') == False]
goes in the right direction, but you are missing the regex=False keyword, as shown in Ibrahim's answer.
df[df.jpgs.count("123.jpg") == 0]
is also not applicable here, since count returns the total number of non-NaN values in the Series.
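The behaviours behind these comments can be checked quickly. This is a sketch on invented values; the second Series holds plain strings purely to show what regex=False changes.

```python
import pandas as pd

# Hypothetical list column, as in the question.
jpgs = pd.Series([["123.jpg", "456.jpg"], ["789.jpg"]], name="jpgs")

# `in` on a Series looks at the index labels, not at the values.
label_hit = 0 in jpgs            # 0 is an index label
value_hit = "123.jpg" in jpgs    # not an index label

# count() takes no search term; it counts the non-NaN entries.
non_null = jpgs.count()

# On a string column, '.' is a regex wildcard unless regex=False is passed.
names = pd.Series(["123.jpg", "123xjpg"])
as_regex = names.str.contains("123.jpg")
as_literal = names.str.contains("123.jpg", regex=False)

print(label_hit, value_hit, non_null)
print(as_regex.tolist(), as_literal.tolist())
```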
Pandas : How to drop a row where column values match with a specific value (all value are list of value)
Impractical solution that may trigger some new learning:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    columns="index drug prescript".split(),
    data=[
        [0, 1, ['a', 's', 'd', 'f']],
        [1, 2, ['e', 'a', 'e', 'f']],
        [2, 3, ['e', 'a']],
        [3, 4, ['a', 'complementary']],
    ],
).set_index("index", drop=True)

# Explode the lists, turn 'complementary' into NaN, then keep the rows with no NaN element.
df.loc[
    df['prescript'].explode().replace({'complementary': np.nan}).groupby(level=0).agg(lambda x: not pd.isnull(x).any())
]
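To make the explode/groupby idea easier to follow, here is a shorter variant that compares with eq and reduces with any instead of going through replace and isnull. It is a sketch of the same idea, not the code above, and the two-row frame is made up.

```python
import pandas as pd

# Hypothetical reduced frame with a column of lists.
df = pd.DataFrame({
    "drug": [1, 2],
    "prescript": [["a", "s"], ["a", "complementary"]],
})

# explode gives one row per list element and repeats the original index label.
exploded = df["prescript"].explode()

# Grouping on the index (level=0) folds the elements back into one flag per row.
has_target = exploded.eq("complementary").groupby(level=0).any()

# Keep the rows whose list does not contain the target value.
out = df.loc[~has_target]
print(out)
```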
Delete repeating rows in a DataFrame based on a condition pandas
Group by SessionId and pagePath and take the cumulative count of each pair's occurrences; then take the difference of consecutive elements using np.ediff1d and assign it to df['cumcount']. Since we want to drop consecutive duplicates, we keep only the rows where df['cumcount'] != 1:
cols = df.columns
df['cumcount'] = np.concatenate(([0], np.ediff1d(df.groupby(['SessionId','pagePath']).cumcount())))
out = df.loc[df['cumcount']!=1, cols]
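A runnable sketch of those steps on invented data (only the column names SessionId and pagePath come from the answer):

```python
import numpy as np
import pandas as pd

# Hypothetical click log: session 1 visits /a twice in a row.
df = pd.DataFrame({
    "SessionId": [1, 1, 1, 1, 2],
    "pagePath": ["/a", "/a", "/b", "/a", "/a"],
})
cols = df.columns  # remember the original columns before adding a helper

# Occurrence number of each (SessionId, pagePath) pair, in row order.
counts = df.groupby(["SessionId", "pagePath"]).cumcount()

# Difference to the previous row; the first row has no predecessor, so use 0.
df["cumcount"] = np.concatenate(([0], np.ediff1d(counts)))

# A step of exactly 1 marks a row that repeats the row before it.
out = df.loc[df["cumcount"] != 1, cols]
print(out)
```

One caveat, as far as I can tell: a step of 1 can also appear when the previous row belongs to a different pair whose count happens to be one lower (for example /a, /a, /b, /b, /a), so it is worth checking the result on your own data.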