Get the Row(s) Which Have the Max Value in Groups Using Groupby

Groupby and filter by max value in pandas

You can do this:

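# Latest Date per ID among rows with Value == 1 (the column is "Date", not "year")
latest = df.query('Value==1').groupby("ID", as_index=False)["Date"].max().assign(Latest="Latest")
# Left merge on ID and Date keeps every row of df and flags only the latest one per ID
pd.merge(df, latest, on=["ID", "Date"], how="left")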

   Value  ID  Date  Latest
0      1   5  2012     NaN
1      1   5  2013  Latest
2      0  12  2017     NaN
3      0  12  2022     NaN
4      1  27  2005     NaN
5      1  27  2011  Latest
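
If you would rather skip the merge, the same flag can probably be set directly from idxmax. This is only a sketch: the frame below is hypothetical data typed in from the output table above, and the Latest column name is taken from it.

```python
import pandas as pd

# Hypothetical data, reconstructed from the output table above.
df = pd.DataFrame({
    "Value": [1, 1, 0, 0, 1, 1],
    "ID": [5, 5, 12, 12, 27, 27],
    "Date": [2012, 2013, 2017, 2022, 2005, 2011],
})

# Index label of the newest Value == 1 row in each ID group.
latest_idx = df[df["Value"] == 1].groupby("ID")["Date"].idxmax()

# Flag only those rows; all other rows are left as NaN.
df.loc[df.index.isin(latest_idx), "Latest"] = "Latest"
print(df)
```

IDs that have no Value == 1 rows (12 here) never appear in latest_idx, so they stay unflagged.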

Python Pandas Dataframe select row by max value in group

A standard approach is to use groupby(keys)[column].idxmax().
However, idxmax returns index labels, so selecting the desired rows this way requires those labels to be unique. One way to obtain a unique index is to call reset_index.

Once you obtain the index values from groupby(keys)[column].idxmax() you can then select the entire row using df.loc:

In [20]: df.loc[df.reset_index().groupby(['F_Type'])['to_date'].idxmax()]
Out[20]: 
                       start    end
F_Type to_date                     
A      20150908143000    345    316
B      20150908143000  10743   8803
C      20150908143000  19522  16659
D      20150908143000    433     65
E      20150908143000   7290   7375
F      20150908143000      0      0
G      20150908143000   1796    340

Note: idxmax returns index labels, not necessarily ordinals. After using reset_index the index labels happen to also be ordinals, but since idxmax is returning labels (not ordinals) it is better to always use idxmax in conjunction with df.loc, not df.iloc (as I originally did in this post.)
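
To see why labels versus ordinals matters, here is a small sketch with a made-up frame whose index is strings rather than integers. The column names follow the example above; the data and the index labels are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data with a non-default, string-labelled (and unique) index.
df = pd.DataFrame(
    {"F_Type": ["A", "A", "B", "B"], "to_date": [1, 3, 7, 2]},
    index=["w", "x", "y", "z"],
)

# idxmax gives back the index *labels* of the per-group maxima.
labels = df.groupby("F_Type")["to_date"].idxmax()
print(labels.tolist())

# Label-based lookup with df.loc selects the matching rows;
# positional lookup with df.iloc would not accept these string labels.
rows = df.loc[labels]
print(rows)
```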

How to select row with max value in column from pandas groupby() groups?

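You can do this by combining the sort_values / drop_duplicates approach (which keeps the row with the largest value for each name) with a groupby that collects the list of stores each person has worked at.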

# Get stores that each person works at
stores_for_each_name = df.groupby('name')['store'].apply(','.join)

# Get row with largest order value for each name
df = df.sort_values('orders', ascending=False).drop_duplicates('name').rename({'orders': 'max_orders'}, axis=1)

# Replace store column with comma-separated list of stores they have worked at
df = df.drop('store', axis=1)
df = df.join(stores_for_each_name, on='name')

Output:

   name   stuff  max_orders  store
3   bob  xcxfcd           5      A
1   ann  dsdfds           3    A,C
4  john  uityuu           3  A,B,C
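
For comparison, a variant that picks the top row with idxmax instead of sort_values / drop_duplicates might look like the sketch below. The input frame is invented for illustration (the original input is not shown), so the numbers will not match the output above.

```python
import pandas as pd

# Hypothetical input; the column names follow the answer above.
df = pd.DataFrame({
    "name": ["ann", "ann", "bob", "john", "john"],
    "store": ["A", "C", "A", "B", "C"],
    "orders": [3, 1, 5, 2, 4],
})

# Comma-separated list of stores per person.
stores = df.groupby("name")["store"].agg(",".join)

# Row with the largest order count per person, without its own store column.
top = df.loc[df.groupby("name")["orders"].idxmax()].drop(columns="store")

# Attach the store list by matching the name column against the index of stores.
result = top.join(stores, on="name")
print(result)
```

Unlike the sort-based version, this keeps the rows in group order rather than sorted by order count.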

get rows with largest value in grouping

Use idxmax on the grouped column if you need only one row per group (the first occurrence of the max value):

df = df.loc[df.groupby('id')['value'].idxmax()]
print (df)
    id other_value  value
2    1           b      5
5    2           d      6
7    3           f      4
10   4           e      7

If a group has multiple max values and you want to select all rows that hold the max value:

df = pd.DataFrame({'id' : [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4],
                   'other_value' : ['a', 'e', 'b', 'b', 'a', 'd', 'b', 'f' ,'a' ,'c', 'e', 'f'],
                   'value' : [1, 3, 5, 2, 5, 6, 2, 4, 6, 1, 7, 7]
                   })

print (df)
    id other_value  value
0    1           a      1
1    1           e      3
2    1           b      5
3    2           b      2
4    2           a      5
5    2           d      6
6    3           b      2
7    3           f      4
8    4           a      6
9    4           c      1
10   4           e      7
11   4           f      7

df = df[df.groupby('id')['value'].transform('max') == df['value']]
print (df)
    id other_value  value
2    1           b      5
5    2           d      6
7    3           f      4
10   4           e      7
11   4           f      7
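
To make the difference between the two approaches easier to see, here is a minimal sketch on a made-up frame with a tie in one group. The variable names group_max, all_max and one_max are just for illustration.

```python
import pandas as pd

# Hypothetical data: id 2 has two rows tied for the max value.
df = pd.DataFrame({"id": [1, 1, 2, 2, 2], "value": [3, 5, 7, 1, 7]})

# transform('max') broadcasts each group's max back onto every row of the group.
group_max = df.groupby("id")["value"].transform("max")
print(group_max.tolist())

# Comparing against the broadcast max keeps every tied row.
all_max = df[df["value"] == group_max]

# idxmax keeps only the first max row per group.
one_max = df.loc[df.groupby("id")["value"].idxmax()]

print(all_max)
print(one_max)
```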

how do you fill row values of a column groupby with the max value of the grouped data

You can use a groupby in combination with transform('max'). I'm not sure whether you want to overwrite the 'fail' column or create a new one, but this should give you the expected result.

# Pass the string 'max' rather than the builtin max, which newer pandas versions warn about
df['fail'] = df.groupby(['Cow', 'Lact'])['fail'].transform('max')
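
If you would rather keep the original column, a sketch that writes the group max into a new column could look like this. The data and the fail_max column name are assumptions for illustration; only Cow, Lact and fail come from the question.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "Cow": [1, 1, 1, 2, 2],
    "Lact": [1, 1, 2, 1, 1],
    "fail": [0, 1, 0, 0, 0],
})

# Every row receives the max of 'fail' within its (Cow, Lact) group;
# the original 'fail' column is left untouched.
df["fail_max"] = df.groupby(["Cow", "Lact"])["fail"].transform("max")
print(df)
```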