Select Row with Most Recent Date by Group

Select row with most recent date by group

You can try grouping by ID and keeping the row with the latest date in each group:

library(dplyr)

# For each ID, keep the single row whose parsed date is the latest
df %>%
  group_by(ID) %>%
  slice(which.max(as.Date(date, '%m/%d/%Y')))

Sample data:

df <- data.frame(
  ID = rep(1:3, each = 3),
  date = c('02/20/1989', '03/14/2001', '02/25/1990',
           '04/20/2002', '02/04/2005', '02/01/2008',
           '08/22/2011', '08/20/2009', '08/25/2010'),
  stringsAsFactors = FALSE
)
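
For readers working in Python, the same per-group selection can be sketched with pandas. This is an assumed translation of the dplyr call, not part of the original answer: it reuses the sample data, parses the date strings with the same '%m/%d/%Y' format, and uses idxmax to find the row label of the latest date within each ID.

```python
import pandas as pd

# Same sample data as the R data.frame
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "date": ["02/20/1989", "03/14/2001", "02/25/1990",
             "04/20/2002", "02/04/2005", "02/01/2008",
             "08/22/2011", "08/20/2009", "08/25/2010"],
})

# Parse the strings so the comparison is chronological, not alphabetical
parsed = pd.to_datetime(df["date"], format="%m/%d/%Y")

# For each ID, find the index label of the row holding the latest date
latest_idx = parsed.groupby(df["ID"]).idxmax()

# Select those rows from the original frame (date column stays as text)
latest = df.loc[latest_idx].reset_index(drop=True)
print(latest)
```

As with which.max in R, idxmax returns a single row per group, so ties on the latest date are resolved by taking the first occurrence.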

SQL Group By most recent date and sales value


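-- Order is a reserved word in SQL, so it is quoted here
-- (ANSI double quotes; MySQL/MariaDB use backticks, SQL Server uses brackets).
SELECT ID, Name, "Order", Date FROM (
    -- Number the rows within each Name, newest Date first
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS sn
    FROM your_table_name
) A WHERE sn = 1;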

Select row with most recent date per location and increment recent date by 1 for each row by location using MariaDB

You could use analytic window functions and update the original table by joining to a sub-query (works for MariaDB):

update t
join (
    select Id,
        Date_Add(
            First_Value(date) over(partition by locationId order by date desc),
            interval (13 + row_number() over(partition by locationId order by date desc)) day
        ) NewDate
    from t
) nd on t.id = nd.id
set t.Newdate = nd.NewDate;


Keeping only rows with most recent date in dataframe

This can be done with sort_values and drop_duplicates:

df = df.sort_values(by=['Modified Date'], ascending=False)
df = df.drop_duplicates(subset='School ID', keep='first')

The sort ensures that, for each school, the newest date appears first. drop_duplicates then keeps the first occurrence of each School ID, which is the newest row.
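
A minimal end-to-end sketch of this approach follows. The sample rows are hypothetical and only the column names 'School ID' and 'Modified Date' come from the snippet above. It also assumes the dates may arrive as strings, so it converts them with pd.to_datetime first; sorting unparsed strings is only chronological when the text format happens to sort that way.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the original snippet
df = pd.DataFrame({
    "School ID": [101, 102, 101, 103, 102],
    "Modified Date": ["2021-01-15", "2021-03-01", "2021-06-30",
                      "2020-12-31", "2020-11-11"],
})

# Convert to real datetimes so the sort is chronological
df["Modified Date"] = pd.to_datetime(df["Modified Date"])

latest = (
    df.sort_values(by="Modified Date", ascending=False)   # newest rows first
      .drop_duplicates(subset="School ID", keep="first")  # keep newest row per school
      .sort_values(by="School ID")                        # tidy ordering for display
      .reset_index(drop=True)
)
print(latest)
```

The final sort_values and reset_index calls are optional and only make the result easier to read.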


