Strip Timezone Info in Pandas

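One option is to cast the column to strings and strip the last 6 characters, which hold the UTC offset. Note that the result is a column of strings, not datetimes:

print(df)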
datetime
0 2015-12-01 00:00:00-06:00
1 2015-12-01 00:00:00-06:00
2 2015-12-01 00:00:00-06:00

df['datetime'] = df['datetime'].astype(str).str[:-6]
print(df)
datetime
0 2015-12-01 00:00:00
1 2015-12-01 00:00:00
2 2015-12-01 00:00:00
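
If datetimes are needed afterwards, the sliced strings can be parsed again. A minimal sketch, assuming a hypothetical three-row frame like the one printed above (the sample values and the names stripped and reparsed are illustrative only):

```python
import pandas as pd

# Hypothetical sample frame with a timezone-aware column (fixed -06:00 offset).
df = pd.DataFrame({"datetime": pd.to_datetime(["2015-12-01 00:00:00-06:00"] * 3)})

# Cast to strings and drop the trailing "-06:00" (always 6 characters).
stripped = df["datetime"].astype(str).str[:-6]

# The slice yields strings; parse them again to get naive datetimes.
reparsed = pd.to_datetime(stripped)

print(stripped.iloc[0])
print(reparsed.dt.tz)  # None
```

This keeps the local wall-clock time and simply discards the offset, so it only makes sense when every row shares the same timezone.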

How to remove timezone from a Timestamp column in a pandas dataframe

The column must be a datetime dtype, for example after using pd.to_datetime.
Then you can use tz_localize to change the time zone; a naive timestamp corresponds to time zone None:

testdata['time'].dt.tz_localize(None)

Unless the column is an index (DatetimeIndex), the .dt accessor must be used to access pandas datetime functions.
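
For reference, here is a small end-to-end sketch of the steps above. The sample values are made up for illustration; only the column name 'time' comes from the snippet. Note that tz_localize returns a new Series, so the result has to be assigned back:

```python
import pandas as pd

# Hypothetical sample data; both rows share the same -06:00 offset.
testdata = pd.DataFrame(
    {"time": ["2015-12-01 00:00:00-06:00", "2015-12-01 06:30:00-06:00"]}
)

# Parse the strings into a timezone-aware datetime column.
testdata["time"] = pd.to_datetime(testdata["time"])

# Drop the timezone but keep the local wall-clock time.
testdata["time"] = testdata["time"].dt.tz_localize(None)

print(testdata["time"].dt.tz)  # None
```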

Convert pandas timezone-aware DatetimeIndex to naive timestamp, but in a certain timezone

To answer my own question, this functionality has been added to pandas in the meantime. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone, resulting in naive local time.

See the whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements

So with my example from above:

In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H',
   ...:                   tz="Europe/Brussels")

In [5]: t
Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'],
                      dtype='datetime64[ns, Europe/Brussels]', freq='H')

Using tz_localize(None) removes the timezone information, resulting in naive local time:

In [6]: t.tz_localize(None)
Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'],
                      dtype='datetime64[ns]', freq='H')

You can also use tz_convert(None), which converts to UTC before removing the timezone information, yielding naive UTC time:

In [7]: t.tz_convert(None)
Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'],
                      dtype='datetime64[ns]', freq='H')

This is much more performant than the datetime.replace solution:

In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H',
    ...:                   tz="Europe/Brussels")

In [32]: %timeit t.tz_localize(None)
1000 loops, best of 3: 233 µs per loop

In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
10 loops, best of 3: 99.7 ms per loop

How can I remove a pytz timezone from a datetime object?

To remove a timezone (tzinfo) from a datetime object:

# dt_tz is a datetime.datetime object
dt = dt_tz.replace(tzinfo=None)
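
Keep in mind that replace(tzinfo=None) keeps the wall-clock time and just drops the offset. If you want naive UTC instead, convert first. A minimal standard-library sketch, assuming a made-up aware datetime with a fixed UTC-07:00 offset (the names naive_local and naive_utc are only for illustration):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical aware datetime with a fixed UTC-07:00 offset.
dt_tz = datetime(2014, 10, 9, 10, 56, 9, tzinfo=timezone(timedelta(hours=-7)))

# Keep the local wall-clock time and drop the offset.
naive_local = dt_tz.replace(tzinfo=None)

# Convert to UTC first, then drop the offset.
naive_utc = dt_tz.astimezone(timezone.utc).replace(tzinfo=None)

print(naive_local)  # 2014-10-09 10:56:09
print(naive_utc)    # 2014-10-09 17:56:09
```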

If you are using a library like arrow, you can remove the timezone by converting the arrow object to a datetime object and then doing the same thing as in the example above.

# <Arrow [2014-10-09T10:56:09.347444-07:00]>
arrowObj = arrow.get('2014-10-09T10:56:09.347444-07:00')

# datetime.datetime(2014, 10, 9, 10, 56, 9, 347444, tzinfo=tzoffset(None, -25200))
tmpDatetime = arrowObj.datetime

# datetime.datetime(2014, 10, 9, 10, 56, 9, 347444)
tmpDatetime = tmpDatetime.replace(tzinfo=None)

Why would you do this? One example is that MySQL does not support timezones with its DATETIME type, so ORMs like SQLAlchemy will simply remove the timezone when you give them a datetime.datetime object to insert into the database. The solution is to convert your datetime.datetime object to UTC (so that everything in your database is UTC, since it cannot store a timezone), then either insert it into the database (where the timezone is removed anyway) or remove the timezone yourself. Also note that you cannot compare datetime.datetime objects where one is timezone-aware and the other is timezone-naive.

##############################################################################
# MySQL example! where MySQL doesn't support timezones with its DATETIME type!
##############################################################################

arrowObj = arrow.get('2014-10-09T10:56:09.347444-07:00')

arrowDt = arrowObj.to("utc").datetime

# inserts datetime.datetime(2014, 10, 9, 17, 56, 9, 347444, tzinfo=tzutc())
insertIntoMysqlDatabase(arrowDt)

# returns datetime.datetime(2014, 10, 9, 17, 56, 9, 347444)
dbDatetimeNoTz = getFromMysqlDatabase()

# cannot compare timezone-aware and timezone-naive datetimes
dbDatetimeNoTz == arrowDt # False, or TypeError on python versions before 3.3

# comparing datetimes that are both aware or both naive works, however
dbDatetimeNoTz == arrowDt.replace(tzinfo=None) # True
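
The comparison behaviour can be checked without arrow or a database. A small sketch using only the standard library, with made-up values standing in for the database round trip (on Python 3.3+ equality returns False, while ordering comparisons still raise TypeError):

```python
from datetime import datetime, timezone

# Hypothetical stand-ins for the aware value and the naive value read back.
aware = datetime(2014, 10, 9, 17, 56, 9, tzinfo=timezone.utc)
naive = datetime(2014, 10, 9, 17, 56, 9)

# Equality between aware and naive is simply False on Python 3.3+.
equal_mixed = aware == naive

# Ordering comparisons between aware and naive raise TypeError.
try:
    aware < naive
    ordering_raised = False
except TypeError:
    ordering_raised = True

# Once both sides are naive, the comparison works as expected.
equal_after_strip = aware.replace(tzinfo=None) == naive

print(equal_mixed, ordering_raised, equal_after_strip)  # False True True
```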

Remove timezone (+01:00) from DateTime

In your first line, the parameter utc=True is not needed: it converts the input to UTC (subtracting one hour in your case).

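In the second line, I get an AttributeError: 'Timestamp' object has no attribute 'dt'. Be aware that to_datetime can return different objects depending on the input: a scalar string gives a single Timestamp, while a Series gives a Series, and only the latter has the .dt accessor.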

So the following works for me (using a Timestamp object):

mood['response_time'] = '2019-02-21 15:31:37+01:00'
# Convert to a Timestamp (scalar input)
mood['response_time'] = pd.to_datetime(mood['response_time'])
# Remove +01:00
mood['response_time'] = mood['response_time'].strftime('%Y-%m-%d %H:%M:%S')
# -> '2019-02-21 15:31:37'
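
To illustrate the scalar versus Series difference, here is a rough sketch. The variable names are made up for this example, and the string is the one from the question. If you want to keep a datetime rather than a formatted string, tz_localize(None) works in both cases:

```python
import pandas as pd

raw = "2019-02-21 15:31:37+01:00"

# Scalar input: to_datetime returns a single Timestamp (no .dt accessor).
ts = pd.to_datetime(raw)
naive_ts = ts.tz_localize(None)            # still a Timestamp, now naive
text = ts.strftime("%Y-%m-%d %H:%M:%S")    # plain string

# Series input: to_datetime returns a Series, so .dt is required.
series = pd.to_datetime(pd.Series([raw]))
naive_series = series.dt.tz_localize(None)

print(text)  # 2019-02-21 15:31:37
print(naive_ts)
```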

