Deleting DataFrame Row in Pandas Based on Column Value

Deleting DataFrame row in Pandas based on column value

If I'm understanding correctly, it should be as simple as:

df = df[df.line_race != 0]
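As a minimal sketch of the filter above: the sample data here is made up for illustration, and only the column name line_race comes from the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column name line_race is from the answer.
df = pd.DataFrame({"horse": ["a", "b", "c", "d"], "line_race": [3, 0, 7, 0]})

# Build a boolean mask and keep only the rows where line_race is non-zero.
df = df[df.line_race != 0]

# The original index labels of the surviving rows are preserved.
print(df)
```

If a fresh 0..n-1 index is wanted afterwards, df.reset_index(drop=True) is one way to get it.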

How to delete rows from a pandas DataFrame based on a conditional expression

When you do len(df['column name']) you are just getting one number, namely the number of rows in the DataFrame (i.e., the length of the column itself). If you want to apply len to each element in the column, use df['column name'].map(len). So try

df[df['column name'].map(len) < 2]
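The difference between the two calls can be seen on a small, made-up column of strings (the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data: a single column of strings of varying length.
df = pd.DataFrame({"column name": ["a", "bcd", "", "xy"]})

# len() on the column gives one number: the row count.
total_rows = len(df["column name"])

# map(len) applies len to every element and returns a Series of lengths.
lengths = df["column name"].map(len)

# Keep only the rows whose value is shorter than 2 characters.
short = df[lengths < 2]
print(short)
```

If the column may contain missing values, map(len) raises a TypeError on them; df['column name'].str.len() is one alternative, since it yields NaN for those entries instead.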

Delete some rows in dataframe based on condition in another column

Not the most beautiful of ways to do it but this should work.

df = df.loc[df['value'].groupby(df['name']).cumsum().groupby(df['name']).cumsum() <=1]
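To see what the double cumsum does, here is a small sketch that assumes value is a 0/1 flag (the question's actual data is not shown, so the frame below is hypothetical). Under that assumption the expression keeps each name's rows up to and including its first 1:

```python
import pandas as pd

# Hypothetical data; assumes 'value' only holds 0 or 1.
df = pd.DataFrame({
    "name": ["a", "a", "a", "a", "b", "b", "b"],
    "value": [0, 1, 0, 1, 1, 0, 0],
})

# First cumsum: running count of 1s within each name.
first = df["value"].groupby(df["name"]).cumsum()

# Second cumsum: stays <= 1 only until the row after the first 1.
second = first.groupby(df["name"]).cumsum()

kept = df.loc[second <= 1]
print(kept)
```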

Pandas: Remove rows where all values equal a certain value

I think you need .all:

df = df[~df.iloc[:, 1:5].eq('NF').all(axis=1)]

That will remove all rows where every value in the selected columns (positions 1 to 4) is equal to NF.

For multiple values:

df = df[~df.iloc[:, 1:5].isin(['NF', 'ABC', 'DEF']).all(axis=1)]
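The choice between all and any along axis=1 is easy to get wrong, so here is a small sketch on made-up data (column names and values are assumptions) showing both:

```python
import pandas as pd

# Hypothetical data: an id column followed by two value columns.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "a": ["NF", "NF", "x"],
    "b": ["NF", "y", "z"],
})

# True where a cell in the selected columns equals 'NF'.
is_nf = df.iloc[:, 1:3].eq("NF")

# Drop a row only when every selected cell is 'NF'.
drop_if_all = df[~is_nf.all(axis=1)]

# Drop a row as soon as any selected cell is 'NF'.
drop_if_any = df[~is_nf.any(axis=1)]

print(drop_if_all)
print(drop_if_any)
```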

How to remove rows based on next value in a sequence? (pandas)

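Let us label each run of consecutive equal outcome values with cumsum, then keep the first row of each run within each id: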

m1 = (df['outcome'] != df['outcome'].shift()).cumsum()
out = df.groupby([df['id'], m1]).head(1)

Output:

    id        date outcome
0    3  03/05/2019      no
3    3  30/10/2019     yes
4    3  03/05/2020      no
5    5  03/12/2019      no
7    5  27/01/2020     yes
9    6  04/05/2019      no
11   6  26/11/2019     yes
26   6  05/05/2020      no
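The run labels are the key step, so here is a small sketch on made-up data (not the question's frame) that shows what m1 contains before the groupby:

```python
import pandas as pd

# Hypothetical data; only the column names id and outcome come from the answer.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "outcome": ["no", "no", "yes", "no", "no", "no"],
})

# A new label starts every time outcome differs from the previous row.
m1 = (df["outcome"] != df["outcome"].shift()).cumsum()

# Grouping by id as well keeps a run from spanning two ids,
# then head(1) takes the first row of every (id, run) group.
out = df.groupby([df["id"], m1]).head(1)
print(out)
```

Rows 3 and 4 share the same run label here, which is why id is part of the grouping key.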

How to delete row in pandas dataframe based on condition if string is found in cell value of type list?

Since you filter on a list column, apply lambda would probably be the easiest:

df.loc[df.jpgs.apply(lambda x: "123.jpg" not in x)]

Quick comments on your attempts:

  • In df = df.drop(df["123.jpg" in df.jpgs].index), the expression "123.jpg" in df.jpgs tests whether "123.jpg" is one of the Series' index labels, not whether it appears in any of the lists, which is not what you want.

  • df = df[df['jpgs'].str.contains('123.jpg') == False] goes in the right direction, but it is missing the regex=False keyword (without it, the '.' is treated as a regex wildcard), as shown in Ibrahim's answer.

  • df[df.jpgs.count("123.jpg") == 0] is also not applicable here, since count returns the total number of non-NaN values in the Series.
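The points above can be checked on a small, made-up frame with a list column (the data is an assumption; only the column name jpgs comes from the question):

```python
import pandas as pd

# Hypothetical data: each cell of jpgs holds a list of file names.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "jpgs": [["123.jpg", "456.jpg"], ["789.jpg"], []],
})

# Element-wise membership test: True where the list does NOT contain the file.
mask = df["jpgs"].apply(lambda names: "123.jpg" not in names)
kept = df.loc[mask]

# `in` on a Series looks at the index labels, not at the values.
label_check = "123.jpg" in df["jpgs"]
position_check = 0 in df["jpgs"]

print(kept)
print(label_check, position_check)
```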

Pandas : How to drop a row where column values match with a specific value (all value are list of value)

Impractical solution that may trigger some new learning:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    columns="index drug prescript".split(),
    data=[
        [0, 1, ['a', 's', 'd', 'f']],
        [1, 2, ['e', 'a', 'e', 'f']],
        [2, 3, ['e', 'a']],
        [3, 4, ['a', 'complementary']],
    ],
).set_index("index", drop=True)

# Explode the lists, turn 'complementary' into NaN,
# then keep the rows whose elements contain no NaN.
df.loc[
    df['prescript'].explode().replace({'complementary': np.nan})
    .groupby(level=0).agg(lambda x: ~pd.isnull(x).any())
]
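A slightly shorter variation on the same explode idea, sketched here as an assumption rather than as the answer's method, compares the exploded elements directly instead of replacing them with NaN:

```python
import pandas as pd

# Hypothetical data modelled on the answer's sample frame.
df = pd.DataFrame({
    "drug": [1, 2, 3, 4],
    "prescript": [["a", "s"], ["e", "a"], ["e"], ["a", "complementary"]],
})

# One row per list element; the original index label is repeated.
exploded = df["prescript"].explode()

# Per original row: does any element equal 'complementary'?
has_match = exploded.eq("complementary").groupby(level=0).any()

kept = df.loc[~has_match]
print(kept)
```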

Delete repeating rows in a DataFrame based on a condition pandas

Group by SessionId and pagePath and compute the cumulative count of each pair's occurrences. Then take the difference of consecutive elements with np.ediff1d and assign it to df['cumcount']. Since we want to remove consecutive duplicates, keep only the rows where df['cumcount'] != 1:

cols = df.columns
df['cumcount'] = np.concatenate(([0], np.ediff1d(df.groupby(['SessionId','pagePath']).cumcount())))
out = df.loc[df['cumcount']!=1, cols]
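Note that a difference of 1 can also appear for a row that is not a consecutive duplicate: for the sequence /a, /b, /a in one session the cumulative counts are 0, 0, 1, so the third row would be dropped too. A different technique that avoids this, sketched here on made-up data as an alternative rather than as the answer's method, compares each row with the one directly before it using shift:

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer.
df = pd.DataFrame({
    "SessionId": [1, 1, 1, 1, 2, 2],
    "pagePath": ["/a", "/a", "/b", "/a", "/a", "/a"],
})

keys = df[["SessionId", "pagePath"]]

# True where both key columns equal the previous row's values.
is_repeat = keys.eq(keys.shift()).all(axis=1)

# Keep the first row of every consecutive run.
out = df.loc[~is_repeat]
print(out)
```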

