Keep Only Date Part When Using pandas.to_datetime

Extract only the date from a datetime column in pandas

If your Date_Time column holds strings, start by converting it
to datetime type:

df.Date_Time = pd.to_datetime(df.Date_Time)

Then run:

df['Date'] = df.Date_Time.dt.date

Another solution is almost like yours, but with a format that
matches the actual layout of the source data (year-month-day):

pd.to_datetime(df['Date_Time'], format='%Y-%m-%d').dt.floor('D')

or even without format:

pd.to_datetime(df['Date_Time']).dt.floor('D')

Caution: although dt.date and dt.floor('D') give the same printout, the
actual results differ, which you can check by running e.g. df.iloc[0, 2].

  • In the first case the result is datetime.date(2019, 2, 27) (just date).
  • But in the second case the result is Timestamp('2019-02-27 00:00:00')
    (timestamp with "zeroed" time part).
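The difference between the two results can be checked directly. The following is a minimal sketch, assuming a small made-up Date_Time column (the sample values are not from the original question):

```python
import datetime

import pandas as pd

# Hypothetical sample data; only the column name follows the answer above.
df = pd.DataFrame({"Date_Time": ["2019-02-27 10:15:00", "2019-02-28 23:59:59"]})
df["Date_Time"] = pd.to_datetime(df["Date_Time"])

# Variant 1: plain Python date objects, stored in an object-dtype column.
as_date = df["Date_Time"].dt.date

# Variant 2: Timestamps with the time set to midnight; the dtype stays datetime64.
as_floor = df["Date_Time"].dt.floor("D")

print(type(as_date.iloc[0]), as_date.dtype)
print(type(as_floor.iloc[0]), as_floor.dtype)
```

If you still need the .dt accessor, resampling, or date arithmetic afterwards, the floored variant is usually the more convenient one, because the column keeps its datetime dtype.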

Pandas pd.to_datetime: keep only the time, not the date

Try this:

df = pd.DataFrame(data={'date':['2019-06-29 09:25:04','2019-06-29 09:30:02'],
'col2':[2,3]})

df['time'] = pd.to_datetime(df['date']).dt.time

If you want to use this column as the index, just do:

df.set_index('time',inplace=True)
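Note that dt.time produces datetime.time objects, so the resulting index is a plain object index rather than a DatetimeIndex. A short sketch of looking up a row by its time label, reusing the sample frame from above (the lookup itself is an illustration, not part of the original answer):

```python
import datetime

import pandas as pd

df = pd.DataFrame(data={"date": ["2019-06-29 09:25:04", "2019-06-29 09:30:02"],
                        "col2": [2, 3]})
df["time"] = pd.to_datetime(df["date"]).dt.time

# set_index returns a new frame here instead of modifying df in place.
indexed = df.set_index("time")

# Labels are datetime.time objects, so look rows up with one.
value = indexed.loc[datetime.time(9, 30, 2), "col2"]
print(value)
```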

How do I remove hours and seconds from my DataFrame column in python?

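You can use pd.to_datetime to convert the Date column to datetime and then keep only the date part. The first option yields datetime.date objects; the other two yield strings: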

df['Date'] = pd.to_datetime(df['Date']).dt.date
# or
df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%Y-%m-%d')
# or
df['Date'] = df['Date'].str.split(' ').str[0]
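The three options print the same but do not return the same type. A small sketch that makes this visible, assuming a made-up series of timestamp strings:

```python
import datetime

import pandas as pd

# Hypothetical input strings, used only for illustration.
raw = pd.Series(["2021-03-04 14:47:57", "2021-03-05 08:00:00"])

as_date = pd.to_datetime(raw).dt.date                   # datetime.date objects
as_text = pd.to_datetime(raw).dt.strftime("%Y-%m-%d")   # strings, validated by parsing
as_split = raw.str.split(" ").str[0]                    # strings, no parsing at all

for name, result in [("dt.date", as_date), ("strftime", as_text), ("split", as_split)]:
    print(name, type(result.iloc[0]).__name__)
```

The split variant never checks that the text is a valid date, so it is the fastest of the three but will silently pass malformed values through.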

Converting datetime only to time in pandas

How about this?

>>> df['TimeOnly']=df['Date Created'].dt.strftime('%H:%M:%S')

>>> df
         Date Created  TimeOnly
0 2016-02-20 09:26:45  09:26:45
1 2016-02-19 19:30:25  19:30:25
2 2016-02-19 18:13:39  18:13:39
3 2016-03-01 14:15:36  14:15:36
4 2016-03-04 14:47:57  14:47:57

Pandas datetime - keep time only as dtype datetime

You could do it like Vishnudev suggested but then you would have dtype: object (or even strings, after using dt.strftime), which you said you didn't want.

What you are looking for doesn't exist, but the closest thing I can offer is converting to timedeltas. That won't seem like a solution at first, but it is actually very useful.

Convert it like this:

# sample df
df
>>
                 time
0 2021-02-07 09:22:00
1 2021-05-10 19:45:00
2 2021-01-14 06:53:00
3 2021-05-27 13:42:00
4 2021-01-18 17:28:00

df["timed"] = df.time - df.time.dt.normalize()
df
>>

                 time           timed
0 2021-02-07 09:22:00 0 days 09:22:00  # this is just the time difference
1 2021-05-10 19:45:00 0 days 19:45:00  # since midnight, which is essentially the
2 2021-01-14 06:53:00 0 days 06:53:00  # same thing as regular time, except
3 2021-05-27 13:42:00 0 days 13:42:00  # that you can go over 24 hours
4 2021-01-18 17:28:00 0 days 17:28:00

This allows you to calculate periods between times like this:

# subtract the previous row's time from the current one
df["difference"] = df.timed - df.timed.shift()
df
Out[48]:
                 time           timed        difference
0 2021-02-07 09:22:00 0 days 09:22:00               NaT
1 2021-05-10 19:45:00 0 days 19:45:00   0 days 10:23:00
2 2021-01-14 06:53:00 0 days 06:53:00 -1 days +11:08:00  # <-- this is because the last
3 2021-05-27 13:42:00 0 days 13:42:00   0 days 06:49:00  # time was later than the current
4 2021-01-18 17:28:00 0 days 17:28:00   0 days 03:46:00  # (see below)

To get rid of the odd negative differences, take the absolute value:

df["abs_difference"] = df.difference.abs()
df
>>
                 time           timed        difference  abs_difference
0 2021-02-07 09:22:00 0 days 09:22:00               NaT             NaT
1 2021-05-10 19:45:00 0 days 19:45:00   0 days 10:23:00 0 days 10:23:00
2 2021-01-14 06:53:00 0 days 06:53:00 -1 days +11:08:00 0 days 12:52:00  ### <<--
3 2021-05-27 13:42:00 0 days 13:42:00   0 days 06:49:00 0 days 06:49:00
4 2021-01-18 17:28:00 0 days 17:28:00   0 days 03:46:00 0 days 03:46:00
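For reference, the whole timedelta approach can be run end to end as follows. This is a sketch that rebuilds the first three sample rows; the extra "minutes" column is an added illustration (not part of the original answer) showing how to turn the timedelta into a plain number:

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime([
        "2021-02-07 09:22:00",
        "2021-05-10 19:45:00",
        "2021-01-14 06:53:00",
    ])
})

# Time of day as a timedelta since midnight.
df["timed"] = df["time"] - df["time"].dt.normalize()

# Difference to the previous row; diff() is equivalent to x - x.shift().
df["difference"] = df["timed"].diff()
df["abs_difference"] = df["difference"].abs()

# Minutes since midnight as a float, handy for plotting or binning.
df["minutes"] = df["timed"].dt.total_seconds() / 60

print(df)
```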

Conduct the calculation only when the date value is valid

I would convert the whole Date column to datetime using pd.to_datetime() with errors set to 'coerce', which replaces the 'N/A' string with NaT (Not a Time):

dft['Date'] = pd.to_datetime(dft['Date'], errors='coerce')

So the column will now look like this:

0   2022-02-01
1   2022-03-01
2          NaT
3   2022-03-11
4   2022-03-15
5   2022-05-01
Name: Date, dtype: datetime64[ns]

You can then subtract that column from the current date in one go and assign the result as a new column. No special handling is needed for the missing value: the NaT row simply stays NaT in the result.

from datetime import datetime
dft['Days'] = datetime.now() - dft['Date']

This will make dft look like below:

        Date  Total Value                     Days
0 2022-02-01            2 148 days 15:49:03.406935
1 2022-03-01            6 120 days 15:49:03.406935
2        NaT            4                      NaT
3 2022-03-11            4 110 days 15:49:03.406935
4 2022-03-15            4 106 days 15:49:03.406935
5 2022-05-01            4  59 days 15:49:03.406935

If you just want the number instead of 59 days 15:49:03.406935, you can do the below instead:

dft['Days'] = (datetime.now() - dft['Date']).dt.days

Which will give you:

        Date  Total Value   Days
0 2022-02-01            2  148.0
1 2022-03-01            6  120.0
2        NaT            4    NaN
3 2022-03-11            4  110.0
4 2022-03-15            4  106.0
5 2022-05-01            4   59.0
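A reproducible version of these steps might look like the sketch below. It assumes a fixed reference date instead of datetime.now() so the numbers do not change from run to run; the reference date and the shortened sample frame are assumptions made for illustration:

```python
import pandas as pd

# Shortened, hypothetical version of the frame discussed above.
dft = pd.DataFrame({
    "Date": ["2022-02-01", "N/A", "2022-05-01"],
    "Total Value": [2, 4, 4],
})

# Unparseable entries such as 'N/A' become NaT instead of raising.
dft["Date"] = pd.to_datetime(dft["Date"], errors="coerce")

# Assumed reference date; swap in pd.Timestamp.now() for the live value.
today = pd.Timestamp("2022-06-29")
dft["Days"] = (today - dft["Date"]).dt.days

# Days is float because of the NaN; a nullable integer dtype keeps whole numbers.
days_int = dft["Days"].astype("Int64")

# Keep only the rows whose date was valid.
valid = dft.dropna(subset=["Date"])

print(dft)
```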

Extracting just Month and Year separately from Pandas Datetime column

If you want new columns showing year and month separately you can do this:

df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month

or...

df['year'] = df['ArrivalDate'].dt.year
df['month'] = df['ArrivalDate'].dt.month

Then you can combine them or work with them just as they are.
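One way to combine them is sketched below, assuming ArrivalDate is already a datetime column. The sample dates and the year_month column names are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; only the column name follows the answer above.
df = pd.DataFrame({"ArrivalDate": pd.to_datetime(["2012-12-31", "2013-01-15"])})

df["year"] = df["ArrivalDate"].dt.year
df["month"] = df["ArrivalDate"].dt.month

# Combined year and month as a monthly Period (sorts and groups correctly).
df["year_month"] = df["ArrivalDate"].dt.to_period("M")

# Combined year and month as plain text.
df["year_month_text"] = df["ArrivalDate"].dt.strftime("%Y-%m")

print(df)
```

The Period column is usually the better choice for grouping, while the text column is convenient for labels and export.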


