Iterating through a range of dates in Python
I have the following code to do this, but how can I do it better? Right now I think it's better than nested loops, but it starts to get Perl-one-linerish when you have a generator in a list comprehension.
    day_count = (end_date - start_date).days + 1
    for single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:
        print(single_date.strftime("%Y-%m-%d"))
- I'm not actually using this to print. That's just for demo purposes.
- The start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report.)
For a start date of 2009-05-30 and an end date of 2009-06-09, the output is:

    2009-05-30
    2009-05-31
    2009-06-01
    2009-06-02
    2009-06-03
    2009-06-04
    2009-06-05
    2009-06-06
    2009-06-07
    2009-06-08
    2009-06-09
Why are there two nested iterations? For me it produces the same list of data with only one iteration:
    for single_date in (start_date + timedelta(n) for n in range(day_count)):
        print(...)
And no list gets stored, only one generator is iterated over. Also the "if" in the generator seems to be unnecessary.
After all, a linear sequence should only require one iterator, not two.
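To check that claim, here is a small self-contained sketch comparing the nested version with the single generator expression. The dates are the ones from the question's sample output; the names nested and flat are only for illustration.

```python
from datetime import date, timedelta

start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)
day_count = (end_date - start_date).days + 1

# Original form: a generator wrapped in a filtering list comprehension.
nested = [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]

# Simplified form: one generator expression, no filter.
flat = list(start_date + timedelta(n) for n in range(day_count))

# Both lists hold the same dates, so the extra loop and the "if" add nothing.
print(nested == flat)
```

Because day_count is already computed from end_date, no generated date can exceed end_date, which is why the filter never removes anything.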
Update after discussion with John Machin:
Maybe the most elegant solution is using a generator function to completely hide/abstract the iteration over the range of dates:
    from datetime import timedelta, date

    def daterange(start_date, end_date):
        for n in range(int((end_date - start_date).days)):
            yield start_date + timedelta(n)

    start_date = date(2013, 1, 1)
    end_date = date(2015, 6, 2)
    for single_date in daterange(start_date, end_date):
        print(single_date.strftime("%Y-%m-%d"))
NB: For consistency with the built-in range() function, this iteration stops before reaching the end_date. So for inclusive iteration use the next day, as you would with range().
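As a rough illustration of that note, the sketch below repeats the generator function and calls it twice, once with end_date itself and once with the following day. The short date span and the names exclusive and inclusive are assumptions chosen to keep the example small.

```python
from datetime import date, timedelta

def daterange(start_date, end_date):
    # Half-open like range(): end_date itself is not yielded.
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)

start_date = date(2013, 1, 1)
end_date = date(2013, 1, 4)

# Stops at Jan 3, one day before end_date.
exclusive = list(daterange(start_date, end_date))

# Passing the next day as the end makes Jan 4 the last date yielded.
inclusive = list(daterange(start_date, end_date + timedelta(days=1)))

print(exclusive[-1], inclusive[-1])
```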
This might be more clear:
    from datetime import date, timedelta

    start_date = date(2019, 1, 1)
    end_date = date(2020, 1, 1)
    delta = timedelta(days=1)
    while start_date <= end_date:
        print(start_date.strftime("%Y-%m-%d"))
        start_date += delta
    from datetime import date
    from dateutil.rrule import rrule, DAILY

    a = date(2009, 5, 30)
    b = date(2009, 6, 9)

    # rrule yields datetime objects; until=b makes the end date inclusive
    for dt in rrule(DAILY, dtstart=a, until=b):
        print(dt.strftime("%Y-%m-%d"))
This Python library (dateutil, installed as python-dateutil) has many more advanced features, some very useful, like relative deltas, and it is easy to add to a project.
Pandas is great for time series in general, and has direct support for date ranges.
    import pandas as pd

    daterange = pd.date_range(start_date, end_date)
You can then loop over the daterange to print the date:
    for single_date in daterange:
        print(single_date.strftime("%Y-%m-%d"))
It also has lots of options to make life easier. For example if you only wanted weekdays, you would just swap in bdate_range. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#generating-ranges-of-timestamps
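If pandas is not available, a weekday-only range can be approximated with the standard library. This is a minimal sketch of a different technique, not pandas itself: weekdays_between is a hypothetical helper that treats Monday to Friday as business days, includes the end date, and ignores holidays.

```python
from datetime import date, timedelta

def weekdays_between(start_date, end_date):
    # Hypothetical helper: yields Monday-Friday dates, end_date included.
    # Holidays are not taken into account.
    current = start_date
    while current <= end_date:
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            yield current
        current += timedelta(days=1)

# Same span as the question's sample output.
days = list(weekdays_between(date(2009, 5, 30), date(2009, 6, 9)))
for single_date in days:
    print(single_date.strftime("%Y-%m-%d"))
```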
The power of Pandas is really its dataframes, which support vectorized operations (much like numpy) that make operations across large quantities of data very fast and easy.
EDIT: You could also completely skip the for loop and just print the range directly with print(daterange), which is easier and more efficient.
    import datetime

    def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
        # inclusive=False to behave like range by default
        if step.days > 0:
            while start < stop:
                yield start
                start = start + step
                # not +=! don't modify object passed in if it's mutable
                # since this function is not restricted to
                # only types from datetime module
        elif step.days < 0:
            while start > stop:
                yield start
                start = start + step
        if inclusive and start == stop:
            yield start

    # ...
    for date in daterange(start_date, end_date, inclusive=True):
        print(date.strftime("%Y-%m-%d"))
This function does more than you strictly require, by supporting a negative step, etc. As long as you factor out your range logic, you don't need the separate day_count and, most importantly, the code becomes easier to read as you call the function from multiple places.
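To show the negative step in action, the sketch below repeats the function so it runs on its own and counts backwards from a later date to an earlier one. The dates and the names first, last, one_day_back and countdown are assumptions made for this example.

```python
import datetime

def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
    # Copy of the function above so the example is self-contained.
    if step.days > 0:
        while start < stop:
            yield start
            start = start + step
    elif step.days < 0:
        while start > stop:
            yield start
            start = start + step
    if inclusive and start == stop:
        yield start

first = datetime.date(2009, 6, 9)
last = datetime.date(2009, 6, 5)
one_day_back = datetime.timedelta(days=-1)

# Walks from June 9 down to June 5; inclusive=True keeps June 5 in the result.
countdown = list(daterange(first, last, step=one_day_back, inclusive=True))
for d in countdown:
    print(d.strftime("%Y-%m-%d"))
```

Note that the direction is decided by step.days, so a step shorter than one day has step.days == 0 and yields nothing unless inclusive=True and start equals stop.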
This is the most human-readable solution I can think of.
    import datetime

    def daterange(start, end, step=datetime.timedelta(1)):
        curr = start
        while curr < end:
            yield curr
            curr += step
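A possible usage of this version, assuming you want a step other than one day: the sketch below repeats the function and walks through January 2019 one week at a time. The dates and the name weekly are chosen only for illustration.

```python
import datetime

def daterange(start, end, step=datetime.timedelta(1)):
    # Same generator as above; end is excluded, like range().
    curr = start
    while curr < end:
        yield curr
        curr += step

# One date per week, starting Jan 1 and stopping before Feb 1.
weekly = list(daterange(datetime.date(2019, 1, 1),
                        datetime.date(2019, 2, 1),
                        datetime.timedelta(weeks=1)))
for d in weekly:
    print(d.isoformat())
```

Using += on curr is safe here because date and timedelta objects are immutable, so the caller's start value is never changed.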
Why not try:
    import datetime as dt

    start_date = dt.datetime(2012, 12, 1)
    end_date = dt.datetime(2012, 12, 5)

    total_days = (end_date - start_date).days + 1  # inclusive: 5 days
    for day_number in range(total_days):
        current_date = (start_date + dt.timedelta(days=day_number)).date()
        print(current_date)