Parsing Datetime in Python

Convert string Jun 1 2005 1:33PM into datetime

datetime.strptime parses an input string in the user-specified format into a timezone-naive datetime object:

>>> from datetime import datetime
>>> datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
datetime.datetime(2005, 6, 1, 13, 33)

To obtain a date object from an existing datetime object, call .date() on it:

>>> datetime.strptime('Jun 1 2005', '%b %d %Y').date()
datetime.date(2005, 6, 1)

Links:

  • strptime docs: Python 2, Python 3

  • strptime/strftime format string docs: Python 2, Python 3

  • strftime.org format string cheatsheet

Notes:

  • strptime = "string parse time"
  • strftime = "string format time"

Parsing datetime in Python..?

As @TimPietzcker suggested, the dateutil package is the way to go: it handles the first 3 formats correctly and automatically:

>>> from dateutil.parser import parse
>>> parse("Fri Sep 25 18:09:49 -0500 2009")
datetime.datetime(2009, 9, 25, 18, 9, 49, tzinfo=tzoffset(None, -18000))
>>> parse("2008-06-29T00:42:18.000Z")
datetime.datetime(2008, 6, 29, 0, 42, 18, tzinfo=tzutc())
>>> parse("2011-07-16T21:46:39Z")
datetime.datetime(2011, 7, 16, 21, 46, 39, tzinfo=tzutc())

dateutil seems to stumble over the Unix timestamp format, but luckily the standard datetime.datetime is up to the task:

>>> from datetime import datetime
>>> datetime.utcfromtimestamp(float("1294989360"))
datetime.datetime(2011, 1, 14, 7, 16)

It is rather easy to make a function out of this that handles all 4 formats:

from dateutil.parser import parse
from datetime import datetime

def parse_time(s):
    try:
        ret = parse(s)
    except (ValueError, OverflowError):
        # not a date string dateutil understands; treat it as a Unix timestamp
        ret = datetime.utcfromtimestamp(float(s))
    return ret
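
One caveat with this approach: dateutil returns timezone-aware datetimes for the strings shown above, while utcfromtimestamp returns a naive one (and is deprecated as of Python 3.12). A minimal sketch of an aware alternative for the timestamp branch, using only the standard library; the helper name from_unix is hypothetical:

```python
from datetime import datetime, timezone

def from_unix(ts):
    """Hypothetical helper: Unix timestamp (str, int or float) -> aware UTC datetime."""
    return datetime.fromtimestamp(float(ts), tz=timezone.utc)

aware = from_unix("1294989360")
print(repr(aware))
# datetime.datetime(2011, 1, 14, 7, 16, tzinfo=datetime.timezone.utc)
```

With this, every result carries a tzinfo, so the values can be compared with each other without a TypeError about mixing naive and aware datetimes.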

Parse date string and change format

The datetime module can help you with that:

datetime.datetime.strptime(date_string, format1).strftime(format2)

For the specific example, you could do:

>>> import datetime
>>> datetime.datetime.strptime('Mon Feb 15 2010', '%a %b %d %Y').strftime('%d/%m/%Y')
'15/02/2010'
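
strptime raises ValueError when the input does not match the format, so it can be worth wrapping the one-liner. A minimal sketch, assuming that returning None is an acceptable way to signal bad input; reformat_date is a hypothetical helper name:

```python
from datetime import datetime

def reformat_date(date_string, src_format, dst_format):
    """Hypothetical helper: parse with src_format, then render with dst_format."""
    try:
        parsed = datetime.strptime(date_string, src_format)
    except ValueError:
        return None  # input did not match src_format
    return parsed.strftime(dst_format)

print(reformat_date("Mon Feb 15 2010", "%a %b %d %Y", "%d/%m/%Y"))  # 15/02/2010
print(reformat_date("15-02-2010", "%a %b %d %Y", "%d/%m/%Y"))       # None
```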

Parse Datetime with +0 timezone

Any ideas what I have overseen?

strftime.org describes %z as:

UTC offset in the form ±HHMM[SS[.ffffff]] (empty string if the object is naive).

This means the offset must contain at least four digits after the + or - (the HHMM part is compulsory). Taking this into account, Dec 03 2020 01: +0 does not comply with the format string used, whereas Dec 03 2020 01: +0000 does:

import datetime
dtObj = datetime.datetime.strptime("Dec 03 2020 01: +0000", '%b %d %Y %I: %z')
print(dtObj)

This gives the output:

2020-12-03 01:00:00+00:00
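
If the input cannot be changed at the source, one option is to pad a short offset before parsing. A minimal sketch, assuming the offset is the last whitespace-separated token and consists of a sign plus one or two hour digits; pad_short_offset is a hypothetical helper, not part of the datetime module:

```python
import re
from datetime import datetime

def pad_short_offset(s):
    """Hypothetical helper: rewrite a trailing '+H' or '-HH' as '+HHMM'."""
    def expand(match):
        sign, hours = match.group(1), int(match.group(2))
        return f"{sign}{hours:02d}00"
    # Matches only a sign (preceded by whitespace) followed by exactly one or
    # two digits at the end, so an offset that is already '+0000' is untouched.
    return re.sub(r"(?<=\s)([+-])(\d{1,2})$", expand, s)

fixed = pad_short_offset("Dec 03 2020 01: +0")
dt = datetime.strptime(fixed, "%b %d %Y %I: %z")
print(fixed)  # Dec 03 2020 01: +0000
print(dt)     # 2020-12-03 01:00:00+00:00
```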

Parse datetime-range or duration given a partial datetime

Addressing specifically this section of the question:

EDIT: It seems that the internal function pandas._libs.tslibs.parsing.parse_datetime_string_with_reso returns what I want. Does anyone know how can I access it (not accessible using from pandas._libs.tslibs.parsing import parse_datetime_string_with_reso)?

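You can import parse_time_string (from pandas._libs.tslibs.parsing import parse_time_string), which internally calls parse_datetime_string_with_reso and also returns the resolution. Note that pandas._libs is a private module, so this import may change between pandas versions.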

How to use parser on multiple time objects

You only get one value back from your to_date function because you exit the function in the first loop iteration. You need to introduce a list that temporarily stores your parsed dates:

from dateutil import parser

def to_date(date_list):
    parsed_date_list = []
    for date in date_list:
        parsed_date_list.append(parser.parse(date))
    return parsed_date_list

date_list = ['2022-06-01', '2022-02-02']
res = to_date(date_list)

Or using a list comprehension to keep your code more concise:

from dateutil import parser

def to_date(date_list):
    return [parser.parse(date) for date in date_list]

date_list = ['2022-06-01', '2022-02-02']
res = to_date(date_list)

And to format your string, simply use the strftime function as pointed out by kpie in his comment:

# res = to_date(date_list)

date_format = "%b %d, %Y"
print(f"From: {res[0].strftime(date_format)} | To: {res[1].strftime(date_format)}")

Do not use list as a variable name. list is the name of a built-in class, and assigning to it shadows that class for the rest of the scope.
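
To see why the name matters, here is a small illustrative sketch (the function shadowing_demo is hypothetical and keeps the shadowing local to itself):

```python
def shadowing_demo():
    # Rebinding the name "list" hides the built-in class inside this function
    list = ["2022-06-01", "2022-02-02"]
    try:
        return list("abc")  # would normally build ['a', 'b', 'c']
    except TypeError as exc:
        return f"TypeError: {exc}"

print(shadowing_demo())  # TypeError: 'list' object is not callable
```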

Parsing OFX datetime in Python

Some things to note here, first (as commented):

  • Python's built-in strptime will have a hard time here: %z won't parse a single-digit offset hour, and %Z won't parse some (potentially) ambiguous time zone abbreviations.

Then, the OFX Banking Version 2.3 docs (sect. 3.2.8.2 Date and Datetime) leave some questions open for me:

  • Is the UTC offset optional?
  • Why is EST called a time zone when it's just an abbreviation?
  • Why is the UTC offset in the example -5 hours, while on 1996-10-05 US/Eastern was at UTC-4?
  • What about offsets that have minutes specified, e.g. +5:30 for Asia/Calcutta?
  • (opinionated) Why re-invent the wheel in the first place instead of using a commonly used standard like ISO 8601?

Anyway, here's an attempt at a custom parser:

from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def parseOFXdatetime(s, tzinfos=None, _tz=None):
    """
    parse OFX datetime string to an aware Python datetime object.
    """
    # first, treat formats that have no UTC offset specified.
    if '[' not in s:
        # just make sure default format is satisfied by filling with zeros if needed
        s = s.ljust(14, '0') + '.000' if '.' not in s else s
        return datetime.strptime(s, "%Y%m%d%H%M%S.%f").replace(tzinfo=timezone.utc)

    # offset and tz are specified, so first get the date/time, offset and tzname components
    s, off = s.strip(']').split('[')
    off, name = off.split(':')
    s = s.ljust(14, '0') + '.000' if '.' not in s else s
    # if tzinfos are specified, map the tz name:
    if tzinfos:
        _tz = tzinfos.get(name) # this might still leave _tz as None...
    if not _tz: # ...so we derive a tz from a timedelta
        _tz = timezone(timedelta(hours=int(off)), name=name)
    return datetime.strptime(s, "%Y%m%d%H%M%S.%f").replace(tzinfo=_tz)

# some test strings
t = ["19961005132200.124[-5:EST]", "19961005132200.124", "199610051322", "19961005",
     "199610051322[-5:EST]", "19961005[-5:EST]"]

for s in t:
    print(# normal parsing
          f'{s}\n {repr(parseOFXdatetime(s))}\n'
          # parsing with tzinfo mapping supplied; abbreviation -> timezone object
          f' {repr(parseOFXdatetime(s, tzinfos={"EST": ZoneInfo("US/Eastern")}))}\n\n')
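
The two results printed for each string differ because a fixed offset and a real time zone are not the same thing. A small sketch using the date from the first test string; it assumes Python 3.9+ and that IANA time zone data is available on the system:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

naive = datetime(1996, 10, 5, 13, 22)

# Fixed offset taken literally from "[-5:EST]"
fixed = naive.replace(tzinfo=timezone(timedelta(hours=-5), name="EST"))
# Real zone rules: daylight saving time was still in effect on this date
zoned = naive.replace(tzinfo=ZoneInfo("US/Eastern"))

print(fixed.utcoffset() / timedelta(hours=1))  # -5.0
print(zoned.utcoffset() / timedelta(hours=1))  # -4.0
print(zoned.tzname())                          # EDT
```

Which of the two is "right" depends on whether the offset or the zone name in the OFX string is meant to be authoritative, which the questions above leave open.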

Parse date in python to datetime

You can try:

from datetime import datetime

test = '2020-10-06 03:39:51.000000'
print("\n{}\n".format(test))
print(datetime.fromisoformat(test))

Output

2020-10-06 03:39:51.000000

2020-10-06 03:39:51

In this case it is better to use the datetime class instead of the date class, since you are parsing a datetime value and not just a date.

If the test value were test = '2020-10-06', the date class's fromisoformat() method would be able to parse it.
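
A short sketch of both classes side by side (fromisoformat is available from Python 3.7 on):

```python
from datetime import date, datetime

d = date.fromisoformat("2020-10-06")
dt = datetime.fromisoformat("2020-10-06 03:39:51.000000")

print(repr(d))         # datetime.date(2020, 10, 6)
print(repr(dt))        # datetime.datetime(2020, 10, 6, 3, 39, 51)
print(dt.date() == d)  # True
```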


