How to preserve timezone when parsing date/time strings with strptime()?
The datetime module documentation says:
Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).
See that [0:6]? That gets you (year, month, day, hour, minute, second). Nothing else. No mention of timezones.
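The effect of the [0:6] slice can be checked directly. This is a minimal sketch, assuming an input string that ends in "UTC" (one of the few names %Z accepts regardless of the local zone); the sample string and variable names are illustrative only:

```python
import time
from datetime import datetime

date_string = "2012-11-01 04:16:13 UTC"   # illustrative input
fmt = "%Y-%m-%d %H:%M:%S %Z"

parsed = time.strptime(date_string, fmt)   # a time.struct_time
fields = parsed[0:6]                        # (year, month, day, hour, minute, second)
dt = datetime(*fields)

print(fields)      # only the six date/time fields survive the slice
print(dt.tzinfo)   # None: the resulting datetime is naive
```

The zone name is consumed by the parser but never reaches the datetime constructor, which is why the result carries no tzinfo.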
Interestingly, [Win XP SP2, Python 2.6, 2.7] passing your example to time.strptime doesn't work, but if you strip off the " %Z" and the " EST" it does work. Using "UTC" or "GMT" instead of "EST" also works. "PST" and "MEZ" don't work. Puzzling.
It's worth noting this has been updated as of version 3.2 and the same documentation now also states the following:
When the %z directive is provided to the strptime() method, an aware datetime object will be produced. The tzinfo of the result will be set to a timezone instance.
Note that this doesn't work with %Z, so the case is important. See the following example:
In [1]: from datetime import datetime
In [2]: start_time = datetime.strptime('2018-04-18-17-04-30-AEST','%Y-%m-%d-%H-%M-%S-%Z')
In [3]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: None
In [4]: start_time = datetime.strptime('2018-04-18-17-04-30-+1000','%Y-%m-%d-%H-%M-%S-%z')
In [5]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: UTC+10:00
strptime example for datetime with tz offset
You're looking for %z:
>>> datetime.strptime('2020-10-23T11:50:19+00:00', '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2020, 10, 23, 11, 50, 19, tzinfo=datetime.timezone.utc)
Beware of some Python version compatibility notes:
Changed in version 3.7: When the %z directive is provided to the strptime() method, the UTC offsets can have a colon as a separator between hours, minutes and seconds. For example, '+01:00:00' will be parsed as an offset of one hour. In addition, providing 'Z' is identical to '+00:00'.
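To see what that note means in practice, here is a small sketch assuming Python 3.7 or later; the three input strings are made up for illustration:

```python
from datetime import datetime, timedelta

fmt = "%Y-%m-%dT%H:%M:%S%z"

no_colon = datetime.strptime("2020-10-23T11:50:19+0100", fmt)     # accepted since 3.2
with_colon = datetime.strptime("2020-10-23T11:50:19+01:00", fmt)  # needs 3.7+
zulu = datetime.strptime("2020-10-23T11:50:19Z", fmt)             # needs 3.7+

print(no_colon == with_colon)   # both spellings describe the same instant
print(zulu.utcoffset())         # a zero offset, i.e. UTC
```

On Python 3.2 through 3.6 the second and third calls would be expected to raise ValueError, so code that must run there needs to strip the colon first.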
A more robust approach: it's not strptime, but it has been in the stdlib since Python 3.7:
>>> datetime.fromisoformat('2020-10-23T11:50:19+00:00')
datetime.datetime(2020, 10, 23, 11, 50, 19, tzinfo=datetime.timezone.utc)
As documented this function supports strings in the format:
YYYY-MM-DD[*HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]]
where * can match any single character (not just a T).
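A short sketch of that grammar, assuming Python 3.7 or later; the inputs are illustrative and only exercise the separator, fractional seconds and offset parts:

```python
from datetime import datetime, timedelta

with_t = datetime.fromisoformat("2020-10-23T11:50:19+00:00")
with_space = datetime.fromisoformat("2020-10-23 11:50:19+00:00")          # '*' matched a space
with_micro = datetime.fromisoformat("2020-10-23 11:50:19.250000-04:00")   # fraction plus offset

print(with_t == with_space)     # the separator character does not matter
print(with_micro.utcoffset())   # the offset ends up in tzinfo
```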
Python datetime strptime() and strftime(): how to preserve the timezone information
Part of the problem here is that the strings usually used to represent timezones are not actually unique. "EST" only means "America/New_York" to people in North America. This is a limitation in the C time API, and the Python solution is… to add full tz features in some future version any day now, if anyone is willing to write the PEP.
You can format and parse a timezone as an offset, but that loses daylight savings/summer time information (e.g., you can't distinguish "America/Phoenix" from "America/Los_Angeles" in the summer). You can format a timezone as a 3-letter abbreviation, but you can't parse it back from that.
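The Phoenix/Los Angeles ambiguity can be reproduced with a sketch like the following. It assumes Python 3.9 or later with the zoneinfo module and tz data available on the system; the dates are arbitrary summer and winter samples:

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+, needs tz data on the system

fmt = "%Y-%m-%d %H:%M:%S %z"
phoenix = ZoneInfo("America/Phoenix")
los_angeles = ZoneInfo("America/Los_Angeles")

# Same wall-clock time in both zones, formatted with a numeric offset only
summer_phx = datetime(2021, 7, 1, 12, 0, tzinfo=phoenix).strftime(fmt)
summer_la = datetime(2021, 7, 1, 12, 0, tzinfo=los_angeles).strftime(fmt)
winter_phx = datetime(2021, 1, 1, 12, 0, tzinfo=phoenix).strftime(fmt)
winter_la = datetime(2021, 1, 1, 12, 0, tzinfo=los_angeles).strftime(fmt)

print(summer_phx == summer_la)   # identical strings in summer
print(winter_phx == winter_la)   # different strings in winter
```

Parsing either summer string back with %z yields a fixed-offset tzinfo, so the information about which zone produced it is gone.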
If you want something that's fuzzy and ambiguous but usually what you want, you need a third-party library like dateutil.
If you want something that's actually unambiguous, just append the actual tz name to the local datetime string yourself, and split it back off on the other end:
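import datetime
import pytz
fmt = '%Y-%m-%d %H:%M:%S'  # assumed format; the original leaves fmt undefined
# Formatting side: append the full tz name to the local time string
d = datetime.datetime.now(pytz.timezone("America/New_York"))
dtz_string = d.strftime(fmt) + ' ' + "America/New_York"
# Parsing side: split the tz name back off and look it up again
d_string, tz_string = dtz_string.rsplit(' ', 1)
d2 = datetime.datetime.strptime(d_string, fmt)
tz2 = pytz.timezone(tz_string)
print(dtz_string)
print(d2.strftime(fmt) + ' ' + tz_string)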
Or… halfway between those two, you're already using the pytz library, which can parse (according to some arbitrary but well-defined disambiguation rules) formats like "EST". So, if you really want to, you can leave the %Z in on the formatting side, then pull it off and parse it with pytz.timezone() before passing the rest to strptime.
Python timezone '%z' directive for datetime.strptime() not available
strptime() is implemented in pure Python. Unlike strftime(), it [which directives are supported] doesn't depend on platform. %z is supported since Python 3.2:
>>> from datetime import datetime
>>> datetime.strptime('24/Aug/2014:17:57:26 +0200', '%d/%b/%Y:%H:%M:%S %z')
datetime.datetime(2014, 8, 24, 17, 57, 26, tzinfo=datetime.timezone(datetime.timedelta(0, 7200)))
how to parse Email time zone indicator using strptime() without being aware of locale time?
There is no concrete timezone implementation in Python 2.7. You could easily implement the UTC offset parsing, see How to parse dates with -0400 timezone string in python?
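As a rough sketch of what such manual offset parsing could look like: the helper name parse_with_offset and the sample strings are hypothetical, and the function assumes the offset is the last space-separated token in ±HHMM form. It returns a naive datetime normalised to UTC, since no tzinfo class is assumed to exist:

```python
from datetime import datetime, timedelta

def parse_with_offset(text, fmt):
    """Parse '<date/time> ±HHMM' and return a naive datetime in UTC.

    Hypothetical helper: the trailing offset is handled by hand because
    %z is not available to strptime in Python 2.7.
    """
    naive_part, offset = text.rsplit(" ", 1)
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    # local time = UTC + offset, so UTC = local time - offset
    return datetime.strptime(naive_part, fmt) - sign * delta

email_fmt = "%a, %d %b %Y %H:%M:%S"   # day and month names assume an English/C locale
west = parse_with_offset("Mon, 16 Aug 2010 15:00:00 -0400", email_fmt)
east = parse_with_offset("Mon, 16 Aug 2010 15:00:00 +0200", email_fmt)
print(west)
print(east)
```

Normalising to UTC keeps the instant correct but discards the original offset; keeping the offset as well would need a small tzinfo subclass.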
Convert string with timezone included into datetime object
It is a common misconception that %Z can parse arbitrary abbreviated time zone names. It cannot. See especially the "Notes" section #6 under technical detail in the docs.
You'll have to do that "by hand", since many of those abbreviations are ambiguous. Here's one way to deal with it using only the standard lib:
from datetime import datetime
from zoneinfo import ZoneInfo
# we need to define which abbreviation corresponds to which time zone
zoneMapping = {'PDT' : ZoneInfo('America/Los_Angeles'),
               'PST' : ZoneInfo('America/Los_Angeles'),
               'CET' : ZoneInfo('Europe/Berlin'),
               'CEST': ZoneInfo('Europe/Berlin')}
# some example inputs; last should fail
timestrings = ('Jun 8, 2021 PDT', 'Feb 8, 2021 PST', 'Feb 8, 2021 CET',
               'Aug 9, 2020 WTF')
for t in timestrings:
    # we can split off the time zone abbreviation
    s, z = t.rsplit(' ', 1)
    # parse the first part to datetime object
    # and set the time zone; use dict.get if it should be None if not found
    dt = datetime.strptime(s, "%b %d, %Y").replace(tzinfo=zoneMapping[z])
    print(t, "->", dt)
gives
Jun 8, 2021 PDT -> 2021-06-08 00:00:00-07:00
Feb 8, 2021 PST -> 2021-02-08 00:00:00-08:00
Feb 8, 2021 CET -> 2021-02-08 00:00:00+01:00
Traceback (most recent call last):
dt = datetime.strptime(s, "%b %d, %Y").replace(tzinfo=zoneMapping[z])
KeyError: 'WTF'
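If an unknown abbreviation should not raise, the dict.get variant mentioned in the code comment could look roughly like this. The function name parse_with_abbreviation and the reduced mapping are illustrative assumptions, and the sketch needs Python 3.9+ with tz data available:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Reduced, illustrative mapping; extend it with whatever abbreviations you expect
zone_mapping = {"PST": ZoneInfo("America/Los_Angeles"),
                "CET": ZoneInfo("Europe/Berlin")}

def parse_with_abbreviation(text, fmt="%b %d, %Y"):
    # Split the abbreviation off the end, then parse the rest as usual
    date_part, abbreviation = text.rsplit(" ", 1)
    # dict.get returns None for unknown abbreviations, which leaves the result naive
    tz = zone_mapping.get(abbreviation)
    return datetime.strptime(date_part, fmt).replace(tzinfo=tz)

known = parse_with_abbreviation("Feb 8, 2021 PST")
unknown = parse_with_abbreviation("Aug 9, 2020 WTF")
print(known)
print(unknown.tzinfo)   # None: the abbreviation was not in the mapping
```

Whether a silent naive result is preferable to a KeyError depends on the data; a naive datetime that later gets compared with aware ones will raise anyway.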
How to convert a timezone aware string to datetime in Python without dateutil?
As of Python 3.7, datetime.datetime.fromisoformat() can handle your format:
>>> import datetime
>>> datetime.datetime.fromisoformat('2012-11-01T04:16:13-04:00')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))
In older Python versions you can't, not without a whole lot of painstaking manual timezone defining.
Python does not bundle a timezone database, because it would be outdated too quickly. Instead, it relies on external sources with a far faster release cycle to provide properly configured timezones for you: third-party libraries or, since Python 3.9, the zoneinfo module backed by the system's tz data or the tzdata package.
As a side-effect, this means that timezone parsing also needs to be an external library. If dateutil is too heavy-weight for you, use iso8601 instead; it'll parse your specific format just fine:
>>> import iso8601
>>> iso8601.parse_date('2012-11-01T04:16:13-04:00')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=<FixedOffset '-04:00'>)
iso8601 is a whopping 4KB small. Compare that to python-dateutil's 148KB.
As of Python 3.2, Python can handle simple offset-based timezones, and %z will parse -hhmm and +hhmm timezone offsets in a timestamp. That means that for an ISO 8601 timestamp you'd have to remove the : in the timezone:
>>> from datetime import datetime
>>> iso_ts = '2012-11-01T04:16:13-04:00'
>>> datetime.strptime(''.join(iso_ts.rsplit(':', 1)), '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=datetime.timezone(datetime.timedelta(-1, 72000)))
The lack of proper ISO 8601 parsing is being tracked in Python issue 15873.
String to DateTime in Python Correct Format error
Using a couple other questions, I found my solution:
Parser must be a string or character stream, not Series
how to convert a string datetime with unknown timezone to timestamp in python
From string to Posix/Unix int:
from dateutil import parser
def timeCorrect(stringDate):
    # Map the "EDT" abbreviation to a fixed UTC-4 offset (in seconds)
    stamp = parser.parse(stringDate, tzinfos={"EDT": -4 * 3600})
    # timestamp() honours the parsed offset; mktime() would assume local time
    return stamp.timestamp()
# tripOrig is the asker's pandas DataFrame
tripOrig['Correct Time'] = tripOrig[' Time'].apply(timeCorrect)
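The conversion from an aware datetime to a POSIX timestamp can also be illustrated with the standard library alone. This is a sketch assuming Python 3.3 or later, with a hand-built fixed offset standing in for "EDT"; the date is arbitrary:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-4 offset labelled "EDT" (an assumption for this example)
edt = timezone(timedelta(hours=-4), "EDT")
stamp = datetime(2012, 11, 1, 4, 16, 13, tzinfo=edt)

# timestamp() takes the offset into account, independent of the machine's local zone
posix = int(stamp.timestamp())

# The same instant expressed in UTC gives the same POSIX value
same_instant_utc = datetime(2012, 11, 1, 8, 16, 13, tzinfo=timezone.utc)
print(posix == int(same_instant_utc.timestamp()))
```

By contrast, time.mktime(stamp.timetuple()) interprets the wall-clock fields in the machine's local zone, so its result depends on where the code runs.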