Convert Pandas Timezone-Aware DatetimeIndex to Naive Timestamp, But in Certain Timezone

To answer my own question: this functionality has since been added to pandas. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone, which leaves you with naive local time.

See the whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements

So with my example from above:

In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H',
tz= "Europe/Brussels")

In [5]: t
Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'],
dtype='datetime64[ns, Europe/Brussels]', freq='H')

Using tz_localize(None) removes the timezone information, resulting in naive local time:

In [6]: t.tz_localize(None)
Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'],
dtype='datetime64[ns]', freq='H')

You can also use tz_convert(None), which converts to UTC before removing the timezone information, yielding naive UTC time:

In [7]: t.tz_convert(None)
Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'],
dtype='datetime64[ns]', freq='H')
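
To make the difference between the two calls concrete, here is a minimal sketch that rebuilds the same two timestamps and compares the results. The variable names local_naive and utc_naive are mine, chosen for illustration.

```python
import pandas as pd

# Same two tz-aware timestamps as in the example above
t = pd.DatetimeIndex(["2013-05-18 12:00:00", "2013-05-18 13:00:00"],
                     tz="Europe/Brussels")

local_naive = t.tz_localize(None)  # drop tz, keep Brussels wall-clock time
utc_naive = t.tz_convert(None)     # convert to UTC, then drop tz

print(local_naive[0])                 # 2013-05-18 12:00:00
print(utc_naive[0])                   # 2013-05-18 10:00:00
print(local_naive[0] - utc_naive[0])  # 0 days 02:00:00
```

The two-hour gap is the Brussels summer-time offset on that date, so which call you want depends on whether the naive values should read as local time or as UTC.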

This is much faster than the datetime.replace solution (roughly 400 times in the timings below):

In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H',
tz="Europe/Brussels")

In [32]: %timeit t.tz_localize(None)
1000 loops, best of 3: 233 µs per loop

In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
10 loops, best of 3: 99.7 ms per loop

Trying to convert aware local datetime to naive local datetime in pandas DataFrame

If you parse datetime strings whose UTC offsets differ from row to row, such as "2020-07-20 20:30:00-07:00" and "2020-07-21 16:00:00-04:00", pandas can give you a Series of datetime.datetime objects (object dtype) rather than the pandas datetime64[ns] dtype. If I understand correctly, what you want is to remove the tzinfo, which you can do like this:

import pandas as pd

df = pd.DataFrame({'startDate':pd.to_datetime(['2020-07-20 20:30:00-07:00',
'2020-07-21 16:00:00-04:00',
'2020-07-20 20:30:00-07:00'])})
# df['startDate'].iloc[0]
# datetime.datetime(2020, 7, 20, 20, 30, tzinfo=tzoffset(None, -25200))

df['startDate_naive'] = df['startDate'].apply(lambda t: t.replace(tzinfo=None))

# df['startDate_naive']
# 0 2020-07-20 20:30:00
# 1 2020-07-21 16:00:00
# 2 2020-07-20 20:30:00
# Name: startDate_naive, dtype: datetime64[ns]

If you work with a timezone-aware pandas datetime column, see my answer here on how to remove the timezone awareness.
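
As far as I know, recent pandas releases may warn or raise on a plain pd.to_datetime call with mixed offsets unless utc=True is passed. A minimal sketch of one way around that, assuming you want to keep each row's own wall-clock time: parse every string individually with pd.Timestamp and drop the offset with tz_localize(None). The names raw and naive are illustrative only.

```python
import pandas as pd

raw = ['2020-07-20 20:30:00-07:00',
       '2020-07-21 16:00:00-04:00',
       '2020-07-20 20:30:00-07:00']

# Parse each string on its own, then strip the offset
# while keeping the local wall-clock time
naive = pd.Series([pd.Timestamp(s).tz_localize(None) for s in raw],
                  name='startDate_naive')

print(naive)
```

This loops in Python, so for large columns it will be slower than a vectorized parse with utc=True followed by a conversion.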

Handling CSV with timezone-aware and timezone-naive datetime column

For the given example, with the column date time stored as strings,

df['date time']
0 2019-10-08T01:00:00+01:00
1 2019-10-08T02:00:00+01:00
2 2019-10-08T03:00:00+01:00
3 2019-12-08T01:00:00Z
4 2019-12-08T01:00:00Z
5 2019-12-08T01:00:00Z
Name: date time, dtype: object

convert to the datetime dtype using pd.to_datetime with the keyword utc=True, then convert to the appropriate time zone:

df['date time'] = pd.to_datetime(df['date time'], utc=True).dt.tz_convert('Europe/London')

This gives:

df['date time']
0 2019-10-08 01:00:00+01:00
1 2019-10-08 02:00:00+01:00
2 2019-10-08 03:00:00+01:00
3 2019-12-08 01:00:00+00:00
4 2019-12-08 01:00:00+00:00
5 2019-12-08 01:00:00+00:00
Name: date time, dtype: datetime64[ns, Europe/London]

Now the groupby works as intended:

df.groupby([df['date time'].dt.date]).agg(['mean', 'count'])
               id              value
             mean  count       mean  count
date time
2019-10-08      1      3  33.333333      3
2019-12-08      1      3  21.666667      3
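
For anyone who wants to reproduce the steps end to end, here is a small self-contained sketch. The value column holds made-up numbers for illustration; they are not the data from the question.

```python
import datetime
import pandas as pd

df = pd.DataFrame({
    'date time': ['2019-10-08T01:00:00+01:00', '2019-10-08T02:00:00+01:00',
                  '2019-10-08T03:00:00+01:00', '2019-12-08T01:00:00Z',
                  '2019-12-08T01:00:00Z', '2019-12-08T01:00:00Z'],
    'value': [10, 20, 30, 40, 50, 60],  # hypothetical values
})

# Parse everything as UTC first, then express it in the target zone
df['date time'] = (pd.to_datetime(df['date time'], utc=True)
                     .dt.tz_convert('Europe/London'))

# Group by the local (London) calendar date
result = df.groupby(df['date time'].dt.date)['value'].agg(['mean', 'count'])
print(result)
```

Grouping on .dt.date after tz_convert uses the London calendar date, which is what keeps timestamps close to midnight in the intended group.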

How to remove timezone from a Timestamp column in a pandas dataframe

The column must have a datetime dtype, for example after using pd.to_datetime.
Then you can use tz_localize to change the time zone. A naive timestamp corresponds to time zone None:

testdata['time'].dt.tz_localize(None)

Unless the column is an index (DatetimeIndex), the .dt accessor must be used to access pandas datetime functions.
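
A minimal sketch showing both cases side by side. The testdata frame below is a stand-in I made up, since the original data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the testdata frame
times = pd.to_datetime(['2020-07-20 20:30:00', '2020-07-21 16:00:00'])
testdata = pd.DataFrame({'time': times.tz_localize('Europe/Brussels')})

# Column (Series): go through the .dt accessor
testdata['time_naive'] = testdata['time'].dt.tz_localize(None)

# Index (DatetimeIndex): call the method directly
idx_naive = testdata.set_index('time').index.tz_localize(None)

print(testdata.dtypes)
print(idx_naive)
```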

How can I convert my datetime column in pandas all to the same timezone

I don't think it is necessary to apply lambdas; pd.to_datetime with utc=True converts the whole column to UTC in one call:

df_res['DateTime'] = pd.to_datetime(df_res['DateTime'], utc=True)

documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html
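
A short sketch of what utc=True does with differing offsets. The df_res frame here is a made-up stand-in with two sample strings.

```python
import pandas as pd

# Hypothetical stand-in for df_res
df_res = pd.DataFrame({'DateTime': ['2020-07-20 20:30:00-07:00',
                                    '2020-07-21 16:00:00-04:00']})

# utc=True converts every parsed value to UTC, giving one tz-aware dtype
df_res['DateTime'] = pd.to_datetime(df_res['DateTime'], utc=True)
print(df_res['DateTime'])

# If another zone is wanted afterwards, convert from UTC
print(df_res['DateTime'].dt.tz_convert('Europe/Brussels'))
```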

How to convert datetime.time object coming from psycopg2 to specific time zone?

A long-winded way to do this:

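import datetime
import pytz

zurich = pytz.timezone('Europe/Zurich')
# Current UTC time as a naive datetime (utcnow() is deprecated since Python 3.12)
dt = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
# Offset in hours; total_seconds() also handles negative offsets, unlike .seconds
utc_offset = zurich.utcoffset(dt).total_seconds() / 3600

t = datetime.time(13, 0, tzinfo=pytz.utc)
# t
# datetime.time(13, 0, tzinfo=<UTC>)

# Drop the tzinfo so the hour can be shifted by hand
t = t.replace(tzinfo=None)
# t
# datetime.time(13, 0)

# Shift by the whole-hour offset; the modulo wraps hours past midnight
zurich_t = t.replace(hour=(t.hour + int(utc_offset)) % 24)

# zurich_t
# datetime.time(15, 0)   (during summer time, when Zurich is at UTC+2)

# zurich_t.hour
# 15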

Though this would probably be easier to do in the database:

select '13:00+0'::timetz at time zone 'europe/zurich';
timezone
-------------
15:00:00+02

This assumes the field is timetz and the TimeZone on the server is UTC.
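
If it has to happen in Python, note that a bare datetime.time cannot account for daylight saving, because the offset depends on the date. A minimal sketch of an alternative to the manual hour shift: attach a reference date, convert, and take the time back out. It uses the standard-library zoneinfo module (Python 3.9+) rather than pytz, and the helper name convert_time and its on parameter are hypothetical.

```python
import datetime
from zoneinfo import ZoneInfo


def convert_time(t, on, target):
    """Convert an aware time `t` to zone `target`, as observed on date `on`."""
    aware = datetime.datetime.combine(on, t)  # keeps t.tzinfo
    return aware.astimezone(ZoneInfo(target)).time()


t = datetime.time(13, 0, tzinfo=datetime.timezone.utc)

# Summer date: Zurich is at UTC+2
print(convert_time(t, datetime.date(2020, 7, 1), 'Europe/Zurich'))   # 15:00:00
# Winter date: Zurich is at UTC+1
print(convert_time(t, datetime.date(2020, 1, 15), 'Europe/Zurich'))  # 14:00:00
```

Which reference date to use is a design decision: today's date matches the database query above, while a stored event date gives the offset that applied when the event happened.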


