Creating a range of dates in Python
Marginally better...
import datetime
numdays = 100  # example value; set this to the number of days you need
base = datetime.datetime.today()
date_list = [base - datetime.timedelta(days=x) for x in range(numdays)]
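The comprehension above counts backwards from today, so its result changes every day. Here is a minimal sketch of the same idea with a fixed base date, which makes the result easy to check. The helper name `last_n_days` and the sample date are made up for illustration:

```python
import datetime

def last_n_days(base, numdays):
    """Return `numdays` dates, starting at `base` and stepping back one day at a time."""
    return [base - datetime.timedelta(days=x) for x in range(numdays)]

# Hypothetical base date, chosen only so the result is reproducible.
days = last_n_days(datetime.date(2024, 3, 2), 3)
print(days)  # newest first: 2024-03-02, 2024-03-01, 2024-02-29
```

Reverse the list (or use `base + timedelta`) if you want the dates in ascending order.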
Generate list of months between interval in python
>>> from datetime import datetime, timedelta
>>> from collections import OrderedDict
>>> dates = ["2014-10-10", "2016-01-07"]
>>> start, end = [datetime.strptime(_, "%Y-%m-%d") for _ in dates]
>>> list(OrderedDict(((start + timedelta(_)).strftime(r"%b-%y"), None) for _ in range((end - start).days)).keys())
['Oct-14', 'Nov-14', 'Dec-14', 'Jan-15', 'Feb-15', 'Mar-15', 'Apr-15', 'May-15', 'Jun-15', 'Jul-15', 'Aug-15', 'Sep-15', 'Oct-15', 'Nov-15', 'Dec-15', 'Jan-16']
Update: a bit of explanation, as requested in one comment. There are three problems here: parsing the dates into appropriate data structures (strptime); getting the date range given the two extremes and the step (one month); formatting the output dates (strftime). The datetime type overloads the subtraction operator, so that end - start makes sense. The result is a timedelta object that represents the difference between the two dates, and the .days attribute gets this difference expressed in days. There is no .months attribute, so we iterate one day at a time and convert the dates to the desired output format. This yields a lot of duplicates, which the OrderedDict removes while keeping the items in the right order.
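The individual steps can be checked in isolation. The sketch below is an illustration rather than part of the original answer. It uses `dict.fromkeys` in place of `OrderedDict`, assuming Python 3.7+ where plain dicts keep insertion order:

```python
from datetime import datetime, timedelta

start = datetime.strptime("2014-10-10", "%Y-%m-%d")
end = datetime.strptime("2016-01-07", "%Y-%m-%d")

# Subtracting two datetimes yields a timedelta; .days is the whole-day difference.
delta = end - start
print(type(delta).__name__, delta.days)

# One label per day, with many repeats ...
labels = [(start + timedelta(days=n)).strftime("%b-%y") for n in range(delta.days)]

# ... which dict.fromkeys collapses while keeping first-seen order.
months = list(dict.fromkeys(labels))
print(months[:3], len(months))
```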
Now this is simple and concise because it lets the datetime module do all the work, but it's also horribly inefficient. We're calling a lot of methods for each day while we only need to output months. If performance is not an issue, the above code will be just fine. Otherwise, we'll have to work a bit more. Let's compare the above implementation with a more efficient one:
from datetime import datetime, timedelta
from collections import OrderedDict

dates = ["2014-10-10", "2016-01-07"]

def monthlist_short(dates):
    start, end = [datetime.strptime(_, "%Y-%m-%d") for _ in dates]
    return list(OrderedDict(((start + timedelta(_)).strftime(r"%b-%y"), None) for _ in range((end - start).days)).keys())

def monthlist_fast(dates):
    start, end = [datetime.strptime(_, "%Y-%m-%d") for _ in dates]
    total_months = lambda dt: dt.month + 12 * dt.year
    mlist = []
    # Count months since year 0 so that divmod recovers (year, zero-based month).
    for tot_m in range(total_months(start)-1, total_months(end)):
        y, m = divmod(tot_m, 12)
        mlist.append(datetime(y, m+1, 1).strftime("%b-%y"))
    return mlist

assert monthlist_fast(dates) == monthlist_short(dates)

if __name__ == "__main__":
    from timeit import Timer
    for func in "monthlist_short", "monthlist_fast":
        print(func, Timer("%s(dates)" % func, "from __main__ import dates, %s" % func).timeit(1000))
On my laptop, I get the following output:
monthlist_short 2.3209939003
monthlist_fast 0.0774540901184
The concise implementation is about 30 times slower, so I would not recommend it in time-critical applications :)
Python generating a list of dates between two dates
You can use pandas.date_range() for this:
import pandas
from datetime import timedelta
# sdate and edate are the start and end dates from the question
pandas.date_range(sdate, edate - timedelta(days=1), freq='D')
DatetimeIndex(['2019-03-22', '2019-03-23', '2019-03-24', '2019-03-25',
'2019-03-26', '2019-03-27', '2019-03-28', '2019-03-29',
'2019-03-30', '2019-03-31', '2019-04-01', '2019-04-02',
'2019-04-03', '2019-04-04', '2019-04-05', '2019-04-06',
'2019-04-07', '2019-04-08'],
dtype='datetime64[ns]', freq='D')
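If pandas is not available, a comparable half-open range can be sketched with the standard library alone. The `sdate` and `edate` values below are inferred from the output shown above and are assumptions, not values taken from the original question:

```python
from datetime import date, timedelta

# Assumed bounds, inferred from the DatetimeIndex printed above.
sdate = date(2019, 3, 22)
edate = date(2019, 4, 9)

# range() stops before (edate - sdate).days, so edate itself is excluded,
# matching the effect of passing edate - timedelta(days=1) to date_range.
between = [sdate + timedelta(days=n) for n in range((edate - sdate).days)]
print(between[0], between[-1], len(between))
```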
Iterating through a range of dates in Python
Why are there two nested iterations? For me it produces the same list of data with only one iteration:
for single_date in (start_date + timedelta(n) for n in range(day_count)):
    print ...
And no list gets stored, only one generator is iterated over. Also the "if" in the generator seems to be unnecessary.
After all, a linear sequence should only require one iterator, not two.
Update after discussion with John Machin:
Maybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:
from datetime import date, timedelta

def daterange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)

start_date = date(2013, 1, 1)
end_date = date(2015, 6, 2)
for single_date in daterange(start_date, end_date):
    print(single_date.strftime("%Y-%m-%d"))
NB: For consistency with the built-in range() function, this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().
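To make the half-open behaviour concrete, here is a small sketch that calls the same generator both ways. The short sample dates and the names `first`, `last`, `exclusive` and `inclusive` are chosen only for illustration:

```python
from datetime import date, timedelta

def daterange(start_date, end_date):
    """Yield each date from start_date up to, but not including, end_date."""
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)

first = date(2013, 1, 1)
last = date(2013, 1, 3)

# Half-open, like range(): the end date is left out.
exclusive = list(daterange(first, last))
# Pass the day after the end date to include it.
inclusive = list(daterange(first, last + timedelta(days=1)))
print(len(exclusive), len(inclusive))  # 2 3
```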
How to create a date range on specific dates for each month in Python?
If you also need the first and last values, add Index.isin with the first and last values. This way every value stays unique, with no duplicates when the first or last day is already the 10th:
dates = pd.date_range(start=start_date, end=end_date)
dates = dates[dates.isin(dates[[0,-1]]) | (dates.day == querydate)]
print (dates)
DatetimeIndex(['2020-01-03', '2020-01-10', '2020-02-10', '2020-03-10',
'2020-04-10', '2020-05-10', '2020-06-10', '2020-07-10',
'2020-08-10', '2020-09-10', '2020-10-10', '2020-10-19'],
dtype='datetime64[ns]', freq=None)
If you need a list:
print (list(dates.strftime('%Y-%m-%d')))
['2020-01-03', '2020-01-10', '2020-02-10', '2020-03-10',
'2020-04-10', '2020-05-10', '2020-06-10', '2020-07-10',
'2020-08-10', '2020-09-10', '2020-10-10', '2020-10-19']
Changed sample data:
start_date = '2020-01-10'
end_date = '2020-10-10'
querydate = 10
dates = pd.date_range(start=start_date, end=end_date)
dates = dates[dates.isin(dates[[0,-1]]) | (dates.day == querydate)]
print (dates)
DatetimeIndex(['2020-01-10', '2020-02-10', '2020-03-10', '2020-04-10',
'2020-05-10', '2020-06-10', '2020-07-10', '2020-08-10',
'2020-09-10', '2020-10-10'],
dtype='datetime64[ns]', freq=None)
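The same selection rule (keep the first date, the last date, and every date on a given day of the month) can also be sketched without pandas. The function name `dates_on_day` is hypothetical, and the sample bounds are inferred from the first output above:

```python
from datetime import date, timedelta

def dates_on_day(start, end, day):
    """Return start, end, and every date between them that falls on `day` of the month."""
    span = (end - start).days
    every_day = (start + timedelta(days=n) for n in range(span + 1))
    # A single pass over the range, so a boundary date that also falls
    # on `day` appears only once.
    return [d for d in every_day if d in (start, end) or d.day == day]

picked = dates_on_day(date(2020, 1, 3), date(2020, 10, 19), 10)
print([d.isoformat() for d in picked])
```

Because each date is visited once, the result needs no separate de-duplication step.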
Create a dataframe from a date range in python
I hope I coded exactly what you need.
import pandas as pd
def create_interval(ts1, ts2, interval_name):
    ts_list_dt = pd.date_range(start=ts1, end=ts2).to_pydatetime().tolist()
    ts_list = [str(x) for x in ts_list_dt]
    d = {'date': ts_list, 'interval_name': [interval_name]*len(ts_list)}
    df = pd.DataFrame(data=d)
    return df
df = create_interval('2022-01-12', '2022-01-17', 'Holidays')
print(df)
output:
date interval_name
0 2022-01-12 00:00:00 Holidays
1 2022-01-13 00:00:00 Holidays
2 2022-01-14 00:00:00 Holidays
3 2022-01-15 00:00:00 Holidays
4 2022-01-16 00:00:00 Holidays
5 2022-01-17 00:00:00 Holidays
If you want the DataFrame without the default integer index column, call df = df.set_index('date') after creating it with df = pd.DataFrame(data=d). This makes the date column the index, and you will get:
date interval_name
2022-01-12 00:00:00 Holidays
2022-01-13 00:00:00 Holidays
2022-01-14 00:00:00 Holidays
2022-01-15 00:00:00 Holidays
2022-01-16 00:00:00 Holidays
2022-01-17 00:00:00 Holidays
Creating datetime range from unique dates and list of time range
You can use nested list comprehension to achieve this:
import datetime
date = [datetime.date(2020, 12, 28), datetime.date(2020, 12, 29), datetime.date(2020, 12, 30), datetime.date(2020, 12, 31)]
time = [datetime.time(9, 15), datetime.time(10, 30), datetime.time(11, 45), datetime.time(13, 0), datetime.time(14, 15)]
output_list = ["{} {}".format(d, t) for d in date for t in time]
where output_list contains:
[
'2020-12-28 09:15:00',
'2020-12-28 10:30:00',
'2020-12-28 11:45:00',
'2020-12-28 13:00:00',
'2020-12-28 14:15:00',
'2020-12-29 09:15:00',
'2020-12-29 10:30:00',
'2020-12-29 11:45:00',
'2020-12-29 13:00:00',
'2020-12-29 14:15:00',
'2020-12-30 09:15:00',
'2020-12-30 10:30:00',
'2020-12-30 11:45:00',
'2020-12-30 13:00:00',
'2020-12-30 14:15:00',
'2020-12-31 09:15:00',
'2020-12-31 10:30:00',
'2020-12-31 11:45:00',
'2020-12-31 13:00:00',
'2020-12-31 14:15:00'
]
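If you need real datetime objects rather than formatted strings, the same nested comprehension can call `datetime.datetime.combine`. This is a sketch of a variation, not part of the original answer; the shortened input lists and the names `dates`, `times` and `stamps` are chosen for illustration:

```python
import datetime

dates = [datetime.date(2020, 12, 28), datetime.date(2020, 12, 29)]
times = [datetime.time(9, 15), datetime.time(10, 30)]

# datetime.combine merges a date and a time into one datetime object.
stamps = [datetime.datetime.combine(d, t) for d in dates for t in times]
print(stamps[0].isoformat(sep=" "))  # 2020-12-28 09:15:00
```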
Creating date range pairs in pandas
There was a typo (start=end) that caused dates to have only 1 value.
But fixing the typo only gives you a flat range of dates. If you want those nested pairs, you could shift dates by 4 hours and zip():
dates = pandas.date_range(start=start, end=end, freq='4H')
shift = dates + pandas.Timedelta(hours=4)
pairs = list(zip(dates, shift))
# [(Timestamp('2021-04-02 20:40:00', freq='4H'),
# Timestamp('2021-04-03 00:40:00', freq='4H')),
# (Timestamp('2021-04-03 00:40:00', freq='4H'),
# Timestamp('2021-04-03 04:40:00', freq='4H')),
# (Timestamp('2021-04-03 04:40:00', freq='4H'),
# Timestamp('2021-04-03 08:40:00', freq='4H')),
# ...
Or for a list of lists instead of list of tuples:
pairs = list(map(list, zip(dates, shift)))
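The shift-and-pair idea does not depend on pandas. Here is a rough standard-library sketch. The start matches the first Timestamp shown above, while the `end` value and the variable names are assumptions made for illustration:

```python
from datetime import datetime, timedelta

# The start matches the first Timestamp shown above; the end is a made-up value.
start = datetime(2021, 4, 2, 20, 40)
end = datetime(2021, 4, 3, 8, 40)
step = timedelta(hours=4)

# Left edges from start to end inclusive, like date_range(start, end, freq='4H').
n_steps = (end - start) // step
lefts = [start + i * step for i in range(n_steps + 1)]

# Pair each left edge with the moment one step later.
pairs = [(left, left + step) for left in lefts]
print(len(pairs), pairs[0])
```

As in the pandas version, the last pair ends one step past `end`.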