How to Preserve Timezone When Parsing Date/Time Strings with strptime()

How to preserve timezone when parsing date/time strings with strptime()?

The datetime module documentation says:

Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).

See that [0:6]? That gets you (year, month, day, hour, minute, second). Nothing else. No mention of timezones.

Interestingly, on Windows XP SP2 with Python 2.6 and 2.7, passing your example to time.strptime fails, but it works once you strip the " %Z" from the format and the " EST" from the string. Using "UTC" or "GMT" instead of "EST" also works, while "PST" and "MEZ" do not. Puzzling.

It's worth noting this has been updated as of version 3.2 and the same documentation now also states the following:

When the %z directive is provided to the strptime() method, an aware datetime object will be produced. The tzinfo of the result will be set to a timezone instance.

Note that this doesn't work with %Z, so the case is important. See the following example:

In [1]: from datetime import datetime

In [2]: start_time = datetime.strptime('2018-04-18-17-04-30-AEST','%Y-%m-%d-%H-%M-%S-%Z')

In [3]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: None

In [4]: start_time = datetime.strptime('2018-04-18-17-04-30-+1000','%Y-%m-%d-%H-%M-%S-%z')

In [5]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: UTC+10:00

strptime example for datetime with tz offset

You're looking for %z:

>>> datetime.strptime('2020-10-23T11:50:19+00:00', '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2020, 10, 23, 11, 50, 19, tzinfo=datetime.timezone.utc)

Beware of some Python version compatibility notes:

Changed in version 3.7: When the %z directive is provided to the strptime() method, the UTC offsets can have a colon as a separator between hours, minutes and seconds. For example, '+01:00:00' will be parsed as an offset of one hour. In addition, providing 'Z' is identical to '+00:00'.
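
As a quick check of the behaviour quoted above, here is a minimal sketch, assuming Python 3.7 or later, that parses an offset written with a colon and a literal 'Z' using the same format string. The variable names are only illustrative.

```python
from datetime import datetime, timedelta

# Assumes Python 3.7+; earlier versions reject both inputs with ValueError.
fmt = '%Y-%m-%dT%H:%M:%S%z'

with_colon = datetime.strptime('2020-10-23T11:50:19+01:00', fmt)
with_z = datetime.strptime('2020-10-23T11:50:19Z', fmt)

# Both results are aware datetimes carrying a fixed UTC offset.
print(with_colon.utcoffset() == timedelta(hours=1))
print(with_z.utcoffset() == timedelta(0))
```

If the same code has to run on 3.2 through 3.6, the colon needs to be stripped from the offset before calling strptime().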

A more robust approach, which is not strptime but has been in the standard library since Python 3.7, is datetime.fromisoformat():

>>> datetime.fromisoformat('2020-10-23T11:50:19+00:00')
datetime.datetime(2020, 10, 23, 11, 50, 19, tzinfo=datetime.timezone.utc)

As documented this function supports strings in the format:

YYYY-MM-DD[*HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]]

where * can match any single character (not just a T).
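
If you control both ends, a round trip through isoformat() and fromisoformat() keeps the UTC offset without any format string. The following is a small sketch under that assumption (Python 3.7+, fixed offset only, illustrative variable names):

```python
from datetime import datetime, timedelta, timezone

# An aware datetime with a fixed +10:00 offset.
original = datetime(2018, 4, 18, 17, 4, 30, tzinfo=timezone(timedelta(hours=10)))

text = original.isoformat()              # '2018-04-18T17:04:30+10:00'
restored = datetime.fromisoformat(text)

# A space (or any other single character) also works as the separator.
restored_space = datetime.fromisoformat(text.replace('T', ' '))

print(restored == original, restored.utcoffset())
```

Note that only the offset survives the round trip, not a named time zone.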

Python datetime strptime() and strftime(): how to preserve the timezone information

Part of the problem here is that the strings usually used to represent timezones are not actually unique. "EST" only means "America/New_York" to people in North America. This is a limitation in the C time API, and for a long time the Python answer was… to add full tz features in some future version, if anyone was willing to write the PEP. Python 3.9 did eventually add the zoneinfo module for IANA time zones, but strptime() still does not turn arbitrary abbreviations such as "EST" into time zones.

You can format and parse a timezone as an offset, but that loses daylight savings/summer time information (e.g., you can't distinguish "America/Phoenix" from "America/Los_Angeles" in the summer). You can format a timezone as a 3-letter abbreviation, but you can't parse it back from that.

If you want something that's fuzzy and ambiguous but usually what you want, you need a third-party library like dateutil.

If you want something that's actually unambiguous, just append the actual tz name to the local datetime string yourself, and split it back off on the other end:

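import datetime
import pytz

fmt = '%Y-%m-%d %H:%M:%S'  # assumed format; use whatever format you already have

# Format side: append the full tz name to the formatted local time
d = datetime.datetime.now(pytz.timezone("America/New_York"))
dtz_string = d.strftime(fmt) + ' ' + "America/New_York"

# Parse side: split the tz name back off and parse the two parts separately
d_string, tz_string = dtz_string.rsplit(' ', 1)
d2 = datetime.datetime.strptime(d_string, fmt)
tz2 = pytz.timezone(tz_string)

print(dtz_string)
print(d2.strftime(fmt) + ' ' + tz_string)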

Or… halfway between those two, you're already using the pytz library, which can parse (according to some arbitrary but well-defined disambiguation rules) formats like "EST". So, if you really want to, you can leave the %Z in on the formatting side, then pull it off and parse it with pytz.timezone() before passing the rest to strptime.

Python timezone '%z' directive for datetime.strptime() not available

strptime() is implemented in pure Python. Unlike strftime(), the set of directives it supports doesn't depend on the platform. %z is supported since Python 3.2:

>>> from datetime import datetime
>>> datetime.strptime('24/Aug/2014:17:57:26 +0200', '%d/%b/%Y:%H:%M:%S %z')
datetime.datetime(2014, 8, 24, 17, 57, 26, tzinfo=datetime.timezone(datetime.timedelta(0, 7200)))

how to parse Email time zone indicator using strptime() without being aware of locale time?

There is no concrete timezone implementation in Python 2.7. You could easily implement the UTC offset parsing, see How to parse dates with -0400 timezone string in python?
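
To illustrate what "implementing the UTC offset parsing" could look like, here is a minimal sketch. FixedOffset and parse_with_offset are hypothetical helper names, not part of any library, and the sketch assumes the offset is the last space-separated token in +HHMM or -HHMM form. It avoids datetime.timezone so that it should also work on Python 2.7.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical helper: a tzinfo with a constant UTC offset."""

    def __init__(self, minutes):
        self._offset = timedelta(minutes=minutes)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return None


def parse_with_offset(text, fmt):
    """Parse 'DATE +HHMM' by splitting off the trailing offset by hand."""
    stamp, offset = text.rsplit(' ', 1)
    sign = -1 if offset[0] == '-' else 1
    minutes = sign * (int(offset[1:3]) * 60 + int(offset[3:5]))
    # strptime() only sees the part without the offset
    return datetime.strptime(stamp, fmt).replace(tzinfo=FixedOffset(minutes))


dt = parse_with_offset('Tue, 22 Nov 2011 06:45:57 -0400', '%a, %d %b %Y %H:%M:%S')
print(dt.utcoffset())
```

The %a and %b directives depend on the current locale, so the example input assumes the default C/English locale.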

Convert string with timezone included into datetime object

It is a common misconception that %Z can parse arbitrary abbreviated time zone names. It cannot. See especially the "Notes" section #6 under technical detail in the docs.

You'll have to do that "by hand", since many of those abbreviations are ambiguous. Here's one way to deal with it using only the standard library (zoneinfo requires Python 3.9+):

from datetime import datetime
from zoneinfo import ZoneInfo

# we need to define which abbreviation corresponds to which time zone
zoneMapping = {'PDT' : ZoneInfo('America/Los_Angeles'),
               'PST' : ZoneInfo('America/Los_Angeles'),
               'CET' : ZoneInfo('Europe/Berlin'),
               'CEST': ZoneInfo('Europe/Berlin')}

# some example inputs; last should fail
timestrings = ('Jun 8, 2021 PDT', 'Feb 8, 2021 PST', 'Feb 8, 2021 CET',
               'Aug 9, 2020 WTF')

for t in timestrings:
    # we can split off the time zone abbreviation
    s, z = t.rsplit(' ', 1)
    # parse the first part to datetime object
    # and set the time zone; use dict.get if it should be None if not found
    dt = datetime.strptime(s, "%b %d, %Y").replace(tzinfo=zoneMapping[z])
    print(t, "->", dt)

gives

Jun 8, 2021 PDT -> 2021-06-08 00:00:00-07:00
Feb 8, 2021 PST -> 2021-02-08 00:00:00-08:00
Feb 8, 2021 CET -> 2021-02-08 00:00:00+01:00

Traceback (most recent call last):

dt = datetime.strptime(s, "%b %d, %Y").replace(tzinfo=zoneMapping[z])

KeyError: 'WTF'
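
If you would rather get a naive datetime than a KeyError for unknown abbreviations, the dict.get variant mentioned in the code comment might look like the sketch below. It makes one further assumption that is not in the code above: each abbreviation is mapped to a fixed UTC offset via datetime.timezone, which avoids the need for a time zone database. OFFSETS and parse_abbrev are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: each abbreviation is treated as a fixed UTC offset.
OFFSETS = {
    'PDT': timezone(timedelta(hours=-7), 'PDT'),
    'PST': timezone(timedelta(hours=-8), 'PST'),
    'CET': timezone(timedelta(hours=1), 'CET'),
    'CEST': timezone(timedelta(hours=2), 'CEST'),
}


def parse_abbrev(text, fmt="%b %d, %Y"):
    """Hypothetical helper: returns a naive datetime for unknown abbreviations."""
    stamp, abbrev = text.rsplit(' ', 1)
    # dict.get returns None for an unknown key; tzinfo=None leaves dt naive.
    return datetime.strptime(stamp, fmt).replace(tzinfo=OFFSETS.get(abbrev))


for t in ('Jun 8, 2021 PDT', 'Aug 9, 2020 WTF'):
    print(t, '->', parse_abbrev(t))
```

The trade-off is that a fixed offset does not follow daylight saving transitions, so the result is only as good as the abbreviation in the input.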

How to convert a timezone aware string to datetime in Python without dateutil?

As of Python 3.7, datetime.datetime.fromisoformat() can handle your format:

>>> import datetime
>>> datetime.datetime.fromisoformat('2012-11-01T04:16:13-04:00')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))

In older Python versions you can't, not without a whole lot of painstaking manual timezone defining.

Python does not include a timezone database, because it would be outdated too quickly. Instead, Python relies on external libraries, which can have a far faster release cycle, to provide properly configured timezones for you.

As a side effect, this means that timezone parsing also needs to be an external library. If dateutil is too heavyweight for you, use iso8601 instead; it'll parse your specific format just fine:

>>> import iso8601
>>> iso8601.parse_date('2012-11-01T04:16:13-04:00')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=<FixedOffset '-04:00'>)

iso8601 is a whopping 4KB small. Compare that to python-dateutil's 148KB.

As of Python 3.2, Python can handle simple offset-based timezones, and %z will parse -hhmm and +hhmm timezone offsets in a timestamp. That means that for an ISO 8601 timestamp you'd have to remove the : in the timezone (on versions before 3.7, where %z does not yet accept the colon):

>>> from datetime import datetime
>>> iso_ts = '2012-11-01T04:16:13-04:00'
>>> datetime.strptime(''.join(iso_ts.rsplit(':', 1)), '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2012, 11, 1, 4, 16, 13, tzinfo=datetime.timezone(datetime.timedelta(-1, 72000)))

The lack of proper ISO 8601 parsing is being tracked in Python issue 15873.

String to DateTime in Python Correct Format error

Using a couple of other questions, I found my solution:

Parser must be a string or character stream, not Series

how to convert a string datetime with unknown timezone to timestamp in python

From string to Posix/Unix int:

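import calendar
from dateutil import parser

def timeCorrect(stringDate):
    # tell dateutil that "EDT" means UTC-4 (offset in seconds)
    stamp = parser.parse(stringDate, tzinfos={"EDT": -4 * 3600})
    # convert via UTC so the result does not depend on the machine's local time zone
    # (mktime() would interpret the fields as local time and ignore the parsed offset)
    return calendar.timegm(stamp.utctimetuple())

# tripOrig is the pandas DataFrame from the question
tripOrig['Correct Time'] = tripOrig[' Time'].apply(timeCorrect)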

