How to get rows, by group, of data frame with earliest timestamp?
You could use dplyr
library(dplyr)
group_by(df, group) %>% summarise(min = min(ts), letter = letter[which.min(ts)])
# group min letter
# 1 1 2013-02-01 e
# 2 2 2014-02-11 d
# 3 3 2014-02-11 i
# 4 4 2014-02-02 f
You could also slice the ranked rows:
group_by(df, group) %>%
  mutate(rank = row_number(ts)) %>%
  arrange(rank) %>%
  slice(1)
group by pandas dataframe and select latest in each group
Use idxmax in groupby and slice df with loc:
df.loc[df.groupby('id').date.idxmax()]
id product date
2 220 6647 2014-10-16
5 826 3380 2015-05-19
8 901 4555 2014-11-01
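A minimal, self-contained sketch of the same idxmax/loc pattern. The sample rows are hypothetical and only chosen so that the result resembles the output above; the column names follow the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({
    "id": [220, 220, 220, 826, 826, 826, 901, 901, 901],
    "product": [6647, 6647, 6647, 3380, 3380, 3380, 4555, 4555, 4555],
    "date": pd.to_datetime([
        "2014-09-01", "2014-09-03", "2014-10-16",
        "2014-11-11", "2014-12-29", "2015-05-19",
        "2014-09-01", "2014-10-05", "2014-11-01",
    ]),
})

# idxmax returns, for each id, the index label of the row with the latest date.
latest_labels = df.groupby("id")["date"].idxmax()

# loc then selects the full rows that carry those labels.
latest = df.loc[latest_labels]
print(latest)
```

Swapping idxmax for idxmin in this sketch would select the earliest row per group instead.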
Slice the rows with the earliest datetime
df %>%
group_by(ID) %>%
filter(DATETIME_OF_PROCEDURE == min(DATETIME_OF_PROCEDURE))
Pandas dataframe get first row of each group
>>> df.groupby('id').first()
value
id
1 first
2 first
3 first
4 second
5 first
6 first
7 fourth
If you need id as column:
>>> df.groupby('id').first().reset_index()
id value
0 1 first
1 2 first
2 3 first
3 4 second
4 5 first
5 6 first
6 7 fourth
To get n first records, you can use head():
>>> df.groupby('id').head(2).reset_index(drop=True)
id value
0 1 first
1 1 second
2 2 first
3 2 second
4 3 first
5 3 third
6 4 second
7 4 fifth
8 5 first
9 6 first
10 6 second
11 7 fourth
12 7 fifth
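One caveat that may matter here: first() returns the first non-null value of each column within a group, so it is not always the literal first row, whereas head(1) keeps the first row as it is. The sketch below illustrates the difference on hypothetical data with a missing value.

```python
import pandas as pd

# Hypothetical data: the first row of group 1 has a missing value.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "value": [float("nan"), 2.0, 3.0, 4.0],
})

# first() skips the missing value and reports 2.0 for id 1.
by_first = df.groupby("id").first()

# head(1) keeps the literal first row of each group, NaN included.
by_head = df.groupby("id").head(1)

print(by_first)
print(by_head)
```

If the data has no missing values, both approaches should pick the same rows.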
how to select oldest record of each group in a dataframe? using python
In pandas, you can use groupby to group values, and combining head with groupby selects the first rows of each group. So, in your case, after sorting by date, the following returns the two oldest records for each ticker (use head(1) to keep only the oldest):
df.sort_values('date').groupby('ticker').head(2)
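As a rough illustration of keeping only the single oldest record per group, here is a sketch on hypothetical data. The ticker and date column names follow the answer; the price column and all values are made up.

```python
import pandas as pd

# Hypothetical data; "price" and all values are invented for illustration.
df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "AAA", "BBB", "AAA"],
    "date": pd.to_datetime([
        "2020-03-01", "2020-01-15", "2020-01-10", "2020-02-20", "2020-02-05",
    ]),
    "price": [10.0, 20.0, 11.0, 21.0, 12.0],
})

# Sort ascending by date so the oldest record of each ticker comes first,
# then keep one row per ticker.
oldest = df.sort_values("date").groupby("ticker").head(1)
print(oldest)
```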
pandas groupby date select earliest per day
If you have a unique index, you can use idxmin on timestamp to find the indices of the minimum timestamp and extract them with loc:
import pandas as pd
df['timestamp'] = pd.to_datetime(df['timestamp'])
df.loc[df.groupby(df['timestamp'].dt.date)['timestamp'].idxmin()]
# value timestamp
#7 Fire 2017-10-03 14:31:55
#6 Water 2017-10-04 14:32:01
#5 Water 2017-10-05 14:32:13
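The unique-index condition matters because loc selects by label, so a repeated label would pull in extra rows. One possible workaround, sketched here on hypothetical data, is to reset the index before calling idxmin.

```python
import pandas as pd

# Hypothetical data with a deliberately non-unique index.
df = pd.DataFrame(
    {
        "value": ["Water", "Fire", "Water", "Fire"],
        "timestamp": pd.to_datetime([
            "2017-10-03 15:00:00", "2017-10-03 14:31:55",
            "2017-10-04 14:32:01", "2017-10-04 16:10:00",
        ]),
    },
    index=[0, 0, 1, 1],
)

# Replace the duplicated labels with a fresh 0..n-1 index.
unique = df.reset_index(drop=True)

# Group by calendar day and find the label of the earliest timestamp per day.
labels = unique.groupby(unique["timestamp"].dt.date)["timestamp"].idxmin()

earliest = unique.loc[labels]
print(earliest)
```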
Pandas dataframe: How to sort groups by the earliest time of a group
We can group by eventid and take the minimum time as each group's value, which gives data like this:
time
eventid
1 9:10
2 9:00
3 9:40
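Then merge these minima back into the dataframe and sort by the merged column (time_right):
groups = df.groupby('eventid')[['time']].min()  # select the column first; min('time') would pass 'time' as numeric_only
df = df.merge(groups, on='eventid', suffixes=('', '_right'))
df = df.sort_values('time_right')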
eventid time time_right
2 2 9:20 9:00
3 2 9:00 9:00
0 1 9:10 9:10
1 1 9:30 9:10
4 3 9:40 9:40
5 3 9:50 9:40
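The merge step can arguably be avoided with transform, which broadcasts each group's minimum back onto its rows. This is a sketch of that alternative, not the answer's method. It assumes the times are stored as timedeltas rather than strings (strings such as 10:00 and 9:00 would compare lexicographically), and the earliest column name is made up.

```python
import pandas as pd

# Same event layout as above, with times stored as timedeltas (an assumption).
df = pd.DataFrame({
    "eventid": [1, 1, 2, 2, 3, 3],
    "time": pd.to_timedelta([
        "09:10:00", "09:30:00", "09:20:00", "09:00:00", "09:40:00", "09:50:00",
    ]),
})

# transform("min") returns one value per row: the earliest time in that row's group.
df["earliest"] = df.groupby("eventid")["time"].transform("min")

# A stable sort keeps the original row order within each group.
ordered = df.sort_values("earliest", kind="stable")
print(ordered)
```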
Select row with most recent date by group
You can try
library(dplyr)
df %>%
group_by(ID) %>%
slice(which.max(as.Date(date, '%m/%d/%Y')))
data
df <- data.frame(ID= rep(1:3, each=3), date=c('02/20/1989',
'03/14/2001', '02/25/1990', '04/20/2002', '02/04/2005', '02/01/2008',
'08/22/2011','08/20/2009', '08/25/2010' ), stringsAsFactors=FALSE)