How to Get Start and End Range from List of Timestamps

How to get start and end range from list of timestamps?

OffsetDateTime

Parse those ISO 8601 strings into java.time.OffsetDateTime objects.

OffsetDateTime.parse( "2016-01-15T00:44:59+08:30" )

Add those date-time objects to a Collection and sort. You probably want a List such as ArrayList or a SortedSet.

The java.time classes implement the compareTo method, to fulfill their contract as a Comparable. So these objects know how to sort.

Like this:

List<OffsetDateTime> odts = new ArrayList<>();

OffsetDateTime odt = OffsetDateTime.parse( "2016-01-15T00:44:59+08:30" ) ;
odts.add( odt );
… // Parse remaining ISO 8601 strings, adding each new OffsetDateTime object to collection.

Collections.sort( odts );
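
For comparison with the Python answers further down, here is a minimal sketch of the same parse, sort, and take-both-ends idea. It assumes Python 3.7+ (for `datetime.fromisoformat`), and the second and third input strings are made-up sample data; only the first appears in the answer above.

```python
from datetime import datetime

# Hypothetical sample input; only the first string comes from the answer above.
raw = [
    "2016-01-15T00:44:59+08:30",
    "2016-01-14T23:10:00+00:00",
    "2016-01-15T02:00:00-05:00",
]

# Offset-aware datetimes compare by the instant they represent,
# so sorting puts them in chronological order regardless of offset.
stamps = sorted(datetime.fromisoformat(s) for s in raw)

# After sorting, the first element is the start of the range and the last is the end.
start, end = stamps[0], stamps[-1]
print(start.isoformat(), end.isoformat())
```

If only the two ends are needed, `min()` and `max()` over the parsed values would avoid the full sort.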

Creating a range of dates in Python

Marginally better...

import datetime

numdays = 7  # example value (assumption): how many days to go back
base = datetime.datetime.today()
date_list = [base - datetime.timedelta(days=x) for x in range(numdays)]

How to get all times within a time range

The datetime module might be useful here. Here's my idea for your problem, assuming you use 24-hour (military) time:

import datetime

start = datetime.time(10, 0)  # 10:00
end = datetime.time(10, 5)  # 10:05
TIME_FORMAT = "%H:%M"  # Format for hours and minutes
times = []  # List of times
while start <= end:
    times.append(start)
    if start.minute == 59:  # At the top of the hour, advance the hour and reset the minutes to 0
        start = start.replace(minute=0)  # time objects are immutable, so replace() returns a new object
        start = start.replace(hour=start.hour + 1)
    else:
        start = start.replace(minute=start.minute + 1)
times = [x.strftime(TIME_FORMAT) for x in times]  # Uses a list comprehension to format the objects
print(times)
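
The loop above rolls the hour over by hand, and `replace(hour=start.hour + 1)` raises a ValueError if the range runs up to 23:59. A possible alternative, sketched here under the assumption that the times can be carried as full `datetime` objects, lets `timedelta` do the arithmetic. The helper name `minutes_between` and the placeholder date are illustrative only.

```python
from datetime import datetime, timedelta

def minutes_between(start, end, step=timedelta(minutes=1)):
    """Yield "HH:MM" strings from start to end, inclusive (both datetime objects)."""
    current = start
    while current <= end:
        yield current.strftime("%H:%M")
        current += step  # timedelta handles the hour rollover

# The date part is arbitrary (an assumption); it is only there so that
# timedelta arithmetic is available, which datetime.time does not support.
start = datetime(2000, 1, 1, 10, 58)
end = datetime(2000, 1, 1, 11, 1)
times = list(minutes_between(start, end))
print(times)  # ['10:58', '10:59', '11:00', '11:01']
```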

Iterating through a range of dates in Python

Why are there two nested iterations? For me it produces the same list of data with only one iteration:

for single_date in (start_date + timedelta(n) for n in range(day_count)):
    print ...

And no list gets stored, only one generator is iterated over. Also the "if" in the generator seems to be unnecessary.

After all, a linear sequence should only require one iterator, not two.

Update after discussion with John Machin:

Maybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:

from datetime import date, timedelta

def daterange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)

start_date = date(2013, 1, 1)
end_date = date(2015, 6, 2)
for single_date in daterange(start_date, end_date):
    print(single_date.strftime("%Y-%m-%d"))

NB: For consistency with the built-in range() function this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().
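
If an inclusive end is wanted more often than not, one option is a variant that adds one day to the span itself. This is only a sketch; the name `daterange_inclusive` and the sample dates are not from the original answer.

```python
from datetime import date, timedelta

def daterange_inclusive(start_date, end_date):
    """Yield every date from start_date through end_date, including end_date."""
    for n in range((end_date - start_date).days + 1):
        yield start_date + timedelta(days=n)

# Sample dates chosen to cross a month boundary.
days = list(daterange_inclusive(date(2013, 1, 30), date(2013, 2, 2)))
print([d.isoformat() for d in days])
# ['2013-01-30', '2013-01-31', '2013-02-01', '2013-02-02']
```

Keeping the half-open `daterange` as the default and offering the inclusive form under a separate name avoids surprising callers who expect `range()` semantics.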

Get start date & end date from the range of timestamp

Use the lead window function for this case: partition by key, order by the timestamp, and take the next row's timestamp as the end date.

Example:

val df=Seq((1,"aa","2019-01-01 08:02:05.1"),(1,"aa","2019-09-02 08:02:05.2"),(1,"cc","2019-12-24 08:02:05.3"),(2,"dd","2013-01-22 08:02:05.4")).toDF("key","col1","timestamp")
import org.apache.spark.sql.expressions._
import org.apache.spark.sql.functions._
import org.apache.spark.sql._
val df1=df.withColumn("start_date",col("timestamp"))
val windowSpec = Window.partitionBy("key").orderBy("start_date")

df1.withColumn("end_date",lead(col("start_date"),1).over(windowSpec)).show(10,false)
//+---+----+---------------------+---------------------+---------------------+
//|key|col1|timestamp            |start_date           |end_date             |
//+---+----+---------------------+---------------------+---------------------+
//|1  |aa  |2019-01-01 08:02:05.1|2019-01-01 08:02:05.1|2019-09-02 08:02:05.2|
//|1  |aa  |2019-09-02 08:02:05.2|2019-09-02 08:02:05.2|2019-12-24 08:02:05.3|
//|1  |cc  |2019-12-24 08:02:05.3|2019-12-24 08:02:05.3|null                 |
//|2  |dd  |2013-01-22 08:02:05.4|2013-01-22 08:02:05.4|null                 |
//+---+----+---------------------+---------------------+---------------------+
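
To make the behavior of lead concrete outside Spark, the following plain-Python sketch reproduces the same pairing on the sample rows: within each key, every row's end date is the next row's timestamp, and the last row in a group gets `None`. It is an illustration of the idea, not a Spark API; the function name `with_end_dates` is hypothetical, and sorting the timestamps as strings is only safe because they all share one fixed format.

```python
from itertools import groupby
from operator import itemgetter

# Same sample rows as the Spark example: (key, col1, timestamp)
rows = [
    (1, "aa", "2019-01-01 08:02:05.1"),
    (1, "aa", "2019-09-02 08:02:05.2"),
    (1, "cc", "2019-12-24 08:02:05.3"),
    (2, "dd", "2013-01-22 08:02:05.4"),
]

def with_end_dates(rows):
    """Return (key, col1, start_date, end_date) tuples, end_date being the next start in the same key."""
    result = []
    # Equivalent of partitionBy("key").orderBy("start_date")
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for _, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        # Shift the timestamps up by one row; the last row has no successor.
        next_stamps = [r[2] for r in group[1:]] + [None]
        for (key, col1, ts), nxt in zip(group, next_stamps):
            result.append((key, col1, ts, nxt))
    return result

result = with_end_dates(rows)
for row in result:
    print(row)
```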

Python generating a list of dates between two dates

You can use pandas.date_range() for this:

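import pandas
from datetime import date, timedelta

sdate, edate = date(2019, 3, 22), date(2019, 4, 9)  # example bounds inferred from the output below
pandas.date_range(sdate, edate - timedelta(days=1), freq='D')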


DatetimeIndex(['2019-03-22', '2019-03-23', '2019-03-24', '2019-03-25',
               '2019-03-26', '2019-03-27', '2019-03-28', '2019-03-29',
               '2019-03-30', '2019-03-31', '2019-04-01', '2019-04-02',
               '2019-04-03', '2019-04-04', '2019-04-05', '2019-04-06',
               '2019-04-07', '2019-04-08'],
              dtype='datetime64[ns]', freq='D')

Fill list of start/end dates with dates in between

You can use timedelta from the datetime module to iterate from each start date to its end date. The list is read as consecutive (start, end) pairs, as shown below:

from datetime import datetime as dt, timedelta as td
strp,strf,fmt=dt.strptime,dt.strftime,"%m-%d-%Y"

a=['10-23-2019', '10-26-2019' , '11-02-2019', '11-06-2019']

print([[strf(k,fmt) for k in (strp(i,fmt)+td(days=n) for n in range((strp(j,fmt)-strp(i,fmt)).days+1))] for i,j in zip(a[::2],a[1::2])])

Output

[['10-23-2019', '10-24-2019', '10-25-2019', '10-26-2019'], ['11-02-2019', '11-03-2019', '11-04-2019', '11-05-2019', '11-06-2019']]
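
The one-liner above is compact but hard to follow. The same logic can be unrolled into a small function, sketched here on the assumption that the list always holds an even number of dates in start, end order. The names `fill_ranges` and `FMT` are illustrative.

```python
from datetime import datetime, timedelta

FMT = "%m-%d-%Y"

def fill_ranges(flat_dates):
    """Turn [start1, end1, start2, end2, ...] into a list of inclusive day-by-day ranges."""
    ranges = []
    # Even positions are starts, odd positions are the matching ends.
    for start_s, end_s in zip(flat_dates[::2], flat_dates[1::2]):
        start = datetime.strptime(start_s, FMT)
        end = datetime.strptime(end_s, FMT)
        span = (end - start).days
        # span + 1 so that the end date itself is included.
        ranges.append([(start + timedelta(days=n)).strftime(FMT) for n in range(span + 1)])
    return ranges

a = ['10-23-2019', '10-26-2019', '11-02-2019', '11-06-2019']
filled = fill_ranges(a)
print(filled)
```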

