Select DataFrame rows between two dates
There are two possible solutions:
- Use a boolean mask, then use df.loc[mask]
- Set the date column as a DatetimeIndex, then use df[start_date : end_date]
Using a boolean mask:
Ensure df['date'] is a Series with dtype datetime64[ns]:
df['date'] = pd.to_datetime(df['date'])
Make a boolean mask. start_date and end_date can be datetime.datetimes, np.datetime64s, pd.Timestamps, or even datetime strings:
# greater than the start date and less than or equal to the end date
mask = (df['date'] > start_date) & (df['date'] <= end_date)
Select the sub-DataFrame:
df.loc[mask]
or re-assign to df
df = df.loc[mask]
For example,
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
mask = (df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')
print(df.loc[mask])
yields
0 1 2 date
153 0.208875 0.727656 0.037787 2000-06-02
154 0.750800 0.776498 0.237716 2000-06-03
155 0.812008 0.127338 0.397240 2000-06-04
156 0.639937 0.207359 0.533527 2000-06-05
157 0.416998 0.845658 0.872826 2000-06-06
158 0.440069 0.338690 0.847545 2000-06-07
159 0.202354 0.624833 0.740254 2000-06-08
160 0.465746 0.080888 0.155452 2000-06-09
161 0.858232 0.190321 0.432574 2000-06-10
Using a DatetimeIndex:
If you are going to do a lot of selections by date, it may be quicker to set the date column as the index first. Then you can select rows by date using df.loc[start_date:end_date].
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
df = df.set_index(['date'])
print(df.loc['2000-6-1':'2000-6-10'])
yields
0 1 2
date
2000-06-01 0.040457 0.326594 0.492136 # <- includes start_date
2000-06-02 0.279323 0.877446 0.464523
2000-06-03 0.328068 0.837669 0.608559
2000-06-04 0.107959 0.678297 0.517435
2000-06-05 0.131555 0.418380 0.025725
2000-06-06 0.999961 0.619517 0.206108
2000-06-07 0.129270 0.024533 0.154769
2000-06-08 0.441010 0.741781 0.470402
2000-06-09 0.682101 0.375660 0.009916
2000-06-10 0.754488 0.352293 0.339337
Python list indexing, e.g. seq[start:end], includes start but not end. In contrast, Pandas df.loc[start_date : end_date] includes both end-points in the result if they are in the index. Neither start_date nor end_date has to be in the index, however.
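To make the end-point behaviour concrete, here is a minimal sketch on a small made-up frame. The column name value and the dates are illustrative assumptions, and the index is assumed to be sorted, which label slicing with missing end-points relies on.

```python
import pandas as pd

# Hypothetical data: a sorted DatetimeIndex with gaps between the dates.
idx = pd.to_datetime(['2000-06-01', '2000-06-03', '2000-06-05', '2000-06-07'])
df = pd.DataFrame({'value': [1, 2, 3, 4]}, index=idx)

# Both end-points are in the index, and both are included in the result.
both_present = df.loc['2000-06-03':'2000-06-05']

# Neither end-point is in the index; the slice still returns the rows
# that fall between the two dates.
neither_present = df.loc['2000-06-02':'2000-06-06']

print(both_present)
print(neither_present)
```

Both slices select the rows for 2000-06-03 and 2000-06-05.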
Also note that pd.read_csv has a parse_dates parameter which you could use to parse the date column as datetime64s. Thus, if you use parse_dates, you would not need to use df['date'] = pd.to_datetime(df['date']).
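As a small sketch of the parse_dates route, the snippet below reads an in-memory CSV (the CSV text and the value column are made up for illustration) and checks that the date column already has a datetime dtype.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "date,value\n2000-06-01,1\n2000-06-02,2\n2000-06-03,3\n"

# parse_dates asks read_csv to convert the listed columns to datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])

print(df.dtypes)

# The boolean mask from above works without a separate pd.to_datetime call.
mask = (df['date'] > '2000-06-01') & (df['date'] <= '2000-06-03')
print(df.loc[mask])
```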
SELECT row by DATEPART()
There is no DATEPART function in MySQL. Use MONTH(date_column) or EXTRACT(MONTH FROM date_column) instead.
SQL - Selecting rows within a date range?
You should apply date() to the timestamp column and put proper quotes around the date values:
SELECT *
FROM tbl_recordings
WHERE date(timestamp)
between str_to_date('2019-03-01', '%Y-%m-%d')
and str_to_date('2019-03-08', '%Y-%m-%d');
or
SELECT *
FROM tbl_recordings
WHERE date(timestamp) between '2019-03-01' and '2019-03-08';
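To see why wrapping the column in date() matters, here is a rough sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained; the sample rows are invented). SQLite also has a date() function, and in both systems a bare '2019-03-08' upper bound excludes timestamps later on that day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_recordings (id INTEGER, timestamp TEXT)")
conn.executemany(
    "INSERT INTO tbl_recordings VALUES (?, ?)",
    [
        (1, "2019-02-28 23:59:59"),  # before the range
        (2, "2019-03-01 00:00:00"),  # first day of the range
        (3, "2019-03-08 15:30:00"),  # last day, but after midnight
        (4, "2019-03-09 00:00:00"),  # after the range
    ],
)

# Comparing the date part only: both boundary days are fully included.
with_date = [row[0] for row in conn.execute(
    "SELECT id FROM tbl_recordings "
    "WHERE date(timestamp) BETWEEN ? AND ? ORDER BY id",
    ("2019-03-01", "2019-03-08"),
)]

# Comparing the raw timestamp: the afternoon of the last day is lost.
without_date = [row[0] for row in conn.execute(
    "SELECT id FROM tbl_recordings "
    "WHERE timestamp BETWEEN ? AND ? ORDER BY id",
    ("2019-03-01", "2019-03-08"),
)]

print(with_date, without_date)
conn.close()
```

Passing the dates as bound parameters, as done here, also takes care of the quoting.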
How to select rows between two dates with the next and previous row
Try the following query
create table TestData(ID int,PersonID int,[Date] date)
insert TestData(ID,PersonID,[Date])values
(1 ,1,'20170401'),(2 ,1,'20170415'),(3 ,1,'20170513'),
(4 ,1,'20170615'),(5 ,1,'20170813'),(6 ,1,'20171002'),
(7 ,2,'20170504'),(8 ,2,'20170916'),(9 ,3,'20170423'),
(10,3,'20170706'),(11,4,'20170601')
----------------
DECLARE
@FromDate date='20170501',
@ToDate date='20170826'
SELECT *
FROM
(
SELECT
*,
LAG(IIF([Date] BETWEEN @FromDate AND @ToDate,1,0))OVER(PARTITION BY PersonID ORDER BY [Date],ID) LagOK,
LEAD(IIF([Date] BETWEEN @FromDate AND @ToDate,1,0))OVER(PARTITION BY PersonID ORDER BY [Date],ID) LeadOK
FROM TestData
) q
WHERE ([Date] BETWEEN @FromDate AND @ToDate OR LagOK=1 OR LeadOK=1)
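For readers who work in pandas, the same LAG/LEAD idea can be sketched with a per-group shift. This is only a rough equivalent under the assumption that the TestData rows above are loaded into a DataFrame; the names in_range, lag_ok and lead_ok are illustrative.

```python
import pandas as pd

# Same rows as the TestData table above.
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'PersonID': [1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4],
    'Date': pd.to_datetime([
        '2017-04-01', '2017-04-15', '2017-05-13', '2017-06-15',
        '2017-08-13', '2017-10-02', '2017-05-04', '2017-09-16',
        '2017-04-23', '2017-07-06', '2017-06-01',
    ]),
})
from_date = pd.Timestamp('2017-05-01')
to_date = pd.Timestamp('2017-08-26')

# Same ordering as the OVER(... ORDER BY [Date],ID) clause.
df = df.sort_values(['PersonID', 'Date', 'ID'])

# 1 when the row is inside the range, 0 otherwise (the IIF expression).
in_range = df['Date'].between(from_date, to_date)
flag = in_range.astype(int)

# LAG / LEAD within each person; missing neighbours become NaN, so eq(1) is False.
lag_ok = flag.groupby(df['PersonID']).shift(1).eq(1)
lead_ok = flag.groupby(df['PersonID']).shift(-1).eq(1)

result = df[in_range | lag_ok | lead_ok]
print(result)
```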
A variant with CTE and ROW_NUMBER
;WITH numCTE AS(
SELECT
*,
ROW_NUMBER()OVER(PARTITION BY PersonID ORDER BY [Date],ID) N
FROM TestData
)
SELECT n.*
FROM
(
SELECT PersonID,MIN(N)-1 MinN,MAX(N)+1 MaxN
FROM numCTE
WHERE [Date] BETWEEN @FromDate AND @ToDate
GROUP BY PersonID
) q
JOIN numCTE n on n.PersonID=q.PersonID AND n.N BETWEEN q.MinN AND q.MaxN
I've added new test data (a person with no rows inside the date range) and modified the queries for this case, too
create table TestData(ID int,PersonID int,[Date] date)
insert TestData(ID,PersonID,[Date])values
(1 ,1,'20170401'),(2 ,1,'20170415'),(3 ,1,'20170513'),
(4 ,1,'20170615'),(5 ,1,'20170813'),(6 ,1,'20171002'),
(7 ,2,'20170504'),(8 ,2,'20170916'),(9 ,3,'20170423'),
(10,3,'20170706'),(11,4,'20170601'),
(14,6,'20170415'),(15,6,'20170913'),(16,6,'20171015') -- new test data
DECLARE
@FromDate date='20170501',
@ToDate date='20170826'
SELECT *
FROM
(
SELECT
*,
LAG(IIF([Date] BETWEEN @FromDate AND @ToDate,1,0))OVER(PARTITION BY PersonID ORDER BY [Date],ID) LagOK,
LEAD(IIF([Date] BETWEEN @FromDate AND @ToDate,1,0))OVER(PARTITION BY PersonID ORDER BY [Date],ID) LeadOK
FROM
(
SELECT ID,PersonID,[Date]
FROM TestData
UNION ALL
SELECT DISTINCT NULL,PersonID,@FromDate -- add phantom rows for some people
FROM TestData p
WHERE NOT EXISTS(SELECT * FROM TestData d WHERE d.[Date] BETWEEN @FromDate AND @ToDate AND d.PersonID=p.PersonID)
) q
) q
WHERE ([Date] BETWEEN @FromDate AND @ToDate OR LagOK=1 OR LeadOK=1)
AND ID IS NOT NULL -- exclude phantom rows from result
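In pandas, the case of a person with no rows inside the range can be handled without phantom rows. The sketch below is a different formulation, not a translation of the query: per person it takes the rows inside the range, the last row before the range, and the first row after it. For this test data it picks the same rows, and it assumes the extended TestData above is loaded into a DataFrame.

```python
import pandas as pd

# Extended TestData, including PersonID 6 with no rows inside the range.
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16],
    'PersonID': [1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 6, 6],
    'Date': pd.to_datetime([
        '2017-04-01', '2017-04-15', '2017-05-13', '2017-06-15',
        '2017-08-13', '2017-10-02', '2017-05-04', '2017-09-16',
        '2017-04-23', '2017-07-06', '2017-06-01',
        '2017-04-15', '2017-09-13', '2017-10-15',
    ]),
})
from_date = pd.Timestamp('2017-05-01')
to_date = pd.Timestamp('2017-08-26')

df = df.sort_values(['PersonID', 'Date', 'ID'])

# Rows inside the range.
inside = df[df['Date'].between(from_date, to_date)]
# Last row before the range, per person.
before = df[df['Date'] < from_date].groupby('PersonID').tail(1)
# First row after the range, per person.
after = df[df['Date'] > to_date].groupby('PersonID').head(1)

result = pd.concat([before, inside, after]).sort_values(['PersonID', 'Date', 'ID'])
print(result)
```

PersonID 6 contributes IDs 14 and 15, the rows on either side of the range.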
And a new variant with CTE and ROW_NUMBER
;WITH numCTE AS(
SELECT
*,
ROW_NUMBER()OVER(PARTITION BY PersonID ORDER BY [Date],ID) N
FROM
(
SELECT ID,PersonID,[Date]
FROM TestData
UNION ALL
SELECT DISTINCT NULL,PersonID,@FromDate -- add phantom rows for some people
FROM TestData p
WHERE NOT EXISTS(SELECT * FROM TestData d WHERE d.[Date] BETWEEN @FromDate AND @ToDate AND d.PersonID=p.PersonID)
) q
)
SELECT n.*
FROM
(
SELECT PersonID,MIN(N)-1 MinN,MAX(N)+1 MaxN
FROM numCTE
WHERE [Date] BETWEEN @FromDate AND @ToDate
GROUP BY PersonID
) q
JOIN numCTE n on n.PersonID=q.PersonID AND n.N BETWEEN q.MinN AND q.MaxN
WHERE ID IS NOT NULL -- exclude phantom rows from result
select rows in sql with latest date from 3 tables in each group
This is for SQL Server (you didn't specify exactly what RDBMS you're using):
If you want to get the "latest row for each QuizId", this sounds like you need a CTE (Common Table Expression) with a ROW_NUMBER() value, something like this (updated: you obviously want to "partition" not just by QuizId, but also by UserName):
WITH BaseData AS
(
SELECT
mAttempt.Id AS Id,
mAttempt.QuizModelId AS QuizId,
mAttempt.StartedAt AS StartsOn,
mUser.UserName,
mDetail.Score AS Score,
RowNum = ROW_NUMBER() OVER (PARTITION BY mAttempt.QuizModelId, mUser.UserName
ORDER BY mAttempt.TakenOn DESC)
FROM
UserQuizAttemptModels mAttempt
INNER JOIN
AspNetUsers mUser ON mAttempt.UserId = mUser.Id
INNER JOIN
QuizAttemptDetailModels mDetail ON mDetail.UserQuizAttemptModelId = mAttempt.Id
)
SELECT *
FROM BaseData
WHERE QuizId = 10053
AND RowNum = 1
The BaseData CTE basically selects the data (as you did), but it also adds a ROW_NUMBER() column. This will "partition" your data into groups, based on the QuizModelId and the UserName, and it will number all the rows inside each group, starting at 1, in the order given by the ORDER BY clause. You said you want to order by "Taken On" date, but there's no such date visible in your query, so I just guessed it might be on the UserQuizAttemptModels table; change and adapt as needed.
Now you can select from that CTE with your original WHERE condition, and you specify that you want only the first row for each data group (for each "QuizId" and user): the one with the most recent "Taken On" date value.
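The same "number the rows per group, keep row 1" pattern can be sketched in pandas. The frame below is invented for illustration (the column names mirror the aliases in the query, and TakenOn is the same guess as above); cumcount plays the role of ROW_NUMBER().

```python
import pandas as pd

# Hypothetical, already-joined data standing in for the three tables.
df = pd.DataFrame({
    'QuizId': [10053, 10053, 10053, 10054],
    'UserName': ['alice', 'alice', 'bob', 'alice'],
    'TakenOn': pd.to_datetime(['2021-01-01', '2021-02-01', '2021-01-15', '2021-03-01']),
    'Score': [5, 7, 6, 9],
})

# Number the rows in each (QuizId, UserName) group, newest first, starting at 1.
df['RowNum'] = (
    df.sort_values('TakenOn', ascending=False)
      .groupby(['QuizId', 'UserName'])
      .cumcount() + 1
)

# Keep only the most recent attempt per user for the quiz of interest.
latest = df[(df['QuizId'] == 10053) & (df['RowNum'] == 1)]
print(latest)
```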