Pandas Datetime to Unix Timestamp Seconds

pandas datetime to unix timestamp seconds

I think you misunderstood what the argument is for. The purpose of origin='unix' is to convert an integer timestamp to a datetime, not the other way around.

pd.to_datetime(1.547559e+09, unit='s', origin='unix') 
# Timestamp('2019-01-15 13:30:00')
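
For completeness, origin can also anchor integer offsets to a custom epoch, which is all it is meant for; a small illustration (the epoch date here is arbitrary):

# interpret 10 as "10 days after 2000-01-01" rather than after the Unix epoch
pd.to_datetime(10, unit='D', origin=pd.Timestamp('2000-01-01'))
# Timestamp('2000-01-11 00:00:00')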

Here are some options:

Option 1: integer division

Conversely, you can get the Unix timestamp by converting to integer (which yields nanoseconds) and dividing by 10**9.

pd.to_datetime(['2019-01-15 13:30:00']).astype(int) / 10**9
# Float64Index([1547559000.0], dtype='float64')

Pros:

  • super fast

Cons:

  • makes assumptions about how pandas internally stores dates
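
The same idea carries over to a DataFrame column; a minimal sketch for a hypothetical tz-naive datetime64 column df['date'], spelling out 'int64' to avoid platform-dependent integer widths:

# nanoseconds-as-int64, floor-divided down to whole seconds
df['unix_s'] = df['date'].astype('int64') // 10**9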


Option 2: recommended by pandas

Pandas docs recommend using the following method:

# create test data
dates = pd.to_datetime(['2019-01-15 13:30:00'])

# calculate unix datetime
(dates - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')

[out]:
Int64Index([1547559000], dtype='int64')

Pros:

  • "idiomatic", recommended by the library

Cons:

  • unwieldy
  • not as performant as integer division
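
The expression works unchanged on a Series, so applying it to a hypothetical datetime64 column looks like this:

df['unix_s'] = (df['date'] - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')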


Option 3: pd.Timestamp

If you have a single date string, you can use pd.Timestamp as shown in the other answer:

pd.Timestamp('2019-01-15 13:30:00').timestamp()
# 1547559000.0

If you have to coerce multiple datetimes (where pd.to_datetime is your only option), you can initialize and map:

pd.to_datetime(['2019-01-15 13:30:00']).map(pd.Timestamp.timestamp)
# Float64Index([1547559000.0], dtype='float64')

Pros:

  • best method for a single datetime string
  • easy to remember

Cons:

  • not as performant as integer division
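
Since .timestamp() returns float seconds, wrap the result in int() when you need whole seconds; for example:

int(pd.Timestamp('2019-01-15 13:30:00').timestamp())
# 1547559000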

Convert pandas DateTimeIndex to Unix Time?

As a DatetimeIndex is backed by an ndarray under the hood, you can do the conversion without a comprehension, which is much faster.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: from datetime import datetime

In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
...: index = pd.DatetimeIndex(dates)
...:
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000],
dtype=int64)

In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)

%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop

%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop
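
On newer pandas versions the same conversion is often written via an explicit int64 view, which makes the "nanoseconds as int64" assumption visible; a hedged sketch that should give the same values as the astype call:

index.asi8 // 10**9            # pandas' own int64 view of the index
index.view(np.int64) // 10**9  # equivalent numpy-style spelling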

Convert unix time to readable date in pandas dataframe

These appear to be seconds since epoch.

In [20]: df = DataFrame(data['values'])

In [21]: df.columns = ["date","price"]

In [22]: df
Out[22]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 358 entries, 0 to 357
Data columns (total 2 columns):
date 358 non-null values
price 358 non-null values
dtypes: float64(1), int64(1)

In [23]: df.head()
Out[23]:
         date  price
0  1349720105  12.08
1  1349806505  12.35
2  1349892905  12.15
3  1349979305  12.19
4  1350065705  12.15

In [25]: df['date'] = pd.to_datetime(df['date'],unit='s')

In [26]: df.head()
Out[26]:
                 date  price
0 2012-10-08 18:15:05  12.08
1 2012-10-09 18:15:05  12.35
2 2012-10-10 18:15:05  12.15
3 2012-10-11 18:15:05  12.19
4 2012-10-12 18:15:05  12.15

In [27]: df.dtypes
Out[27]:
date datetime64[ns]
price float64
dtype: object
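
If "readable" also means plain strings rather than datetime64 values, you can format the parsed column afterwards; a small follow-on sketch using the df from above:

df['date_str'] = df['date'].dt.strftime('%Y-%m-%d %H:%M:%S')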

pandas datetime to unixtime

I think you can subtract the date 1970-01-01 to create a timedelta and then call .dt.total_seconds():

In [130]:
import datetime as dt
s = pd.Series(dt.datetime(2012, 1, 1))
s

Out[130]:
0 2012-01-01
dtype: datetime64[ns]

In [158]:
(s - dt.datetime(1970,1,1)).dt.total_seconds()

Out[158]:
0 1325376000
dtype: float64
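
As a sanity check, this agrees with the integer-division approach from earlier (int vs float display aside):

s.astype('int64') // 10**9
# 0    1325376000
# dtype: int64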

How to properly convert a UNIX timestamp to pd.Timestamp object via pandas?

The problem is quite simple but not obvious. utcnow() gives you a naive datetime object, meaning that it is not aware of the fact that it represents UTC. Therefore, once you call .timestamp(), Python assumes local time because the datetime object is naive, and so it converts to UTC first before calculating Unix time, adding whatever UTC offset your local tz might have.

Solution: construct a datetime object that is aware of UTC. The same goes for fromtimestamp: set UTC as the tz!

from datetime import datetime, timezone
import pandas as pd

d = datetime.now(timezone.utc)
timestamp = d.timestamp()

assert datetime.fromtimestamp(timestamp, tz=timezone.utc) == d
assert pd.to_datetime(timestamp, unit="s", utc=True).to_pydatetime() == d

pandas is kind of a different story; a naive datetime is treated internally as UTC, so pd.to_datetime(timestamp, unit="s") gives you the UTC timestamp. But the conversion to a Python datetime does not take into account that Python will treat it as local time again... Here, keeping it consistent and setting utc=True (i.e. using an aware Timestamp) makes it work nicely.
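
To make the pitfall concrete, here is a minimal sketch of the naive round trip described above, reusing the timestamp variable from the snippet (it only misbehaves on machines whose local timezone is not UTC):

naive = pd.to_datetime(timestamp, unit="s").to_pydatetime()  # naive, but the value is UTC
# naive.timestamp() re-interprets it as local time, so on a non-UTC machine
# it differs from the original timestamp by the local UTC offset.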

  • Further reading: Stop using utcnow and utcfromtimestamp

python dataframe convert epoch to readable datetime hour minutes seconds as zero

If you check the values on https://www.epochconverter.com/, note that a timezone is added to the displayed result.

If you need to add timezones to the column, use Series.dt.tz_localize and then Series.dt.tz_convert:

df['period'] = (pd.to_datetime(df['period'], unit='ms')
                  .dt.tz_localize('GMT')
                  .dt.tz_convert('Asia/Kathmandu'))
print(df)
period
0 2022-05-04 05:45:00+05:45
1 2022-05-03 05:45:00+05:45
2 2022-05-02 05:45:00+05:45
3 2022-05-01 05:45:00+05:45
4 2022-04-30 05:45:00+05:45
5 2022-04-29 05:45:00+05:45
6 2022-04-28 05:45:00+05:45
7 2022-04-27 05:45:00+05:45
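
Equivalently, you can parse straight to UTC-aware values with utc=True and skip the separate localize step:

df['period'] = (pd.to_datetime(df['period'], unit='ms', utc=True)
                  .dt.tz_convert('Asia/Kathmandu'))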

How to convert string date column to timestamp in a new column in Python Pandas

Maybe try this?

import pandas as pd
import numpy as np

d = {'col1': ["2022-05-16T12:31:00Z", "2021-01-11T11:32:00Z"]}
df = pd.DataFrame(data=d)

df['col2'] = pd.to_datetime(df['col1'])
df['col2'] = df.col2.values.astype(np.int64) // 10 ** 9

df
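
One caveat worth noting: because of the trailing 'Z', recent pandas versions parse col1 as timezone-aware (UTC), so the epoch-subtraction spelling needs a tz-aware epoch; a hedged equivalent (col3 is a hypothetical extra column):

parsed = pd.to_datetime(df['col1'])  # tz-aware UTC due to the 'Z' suffix
df['col3'] = (parsed - pd.Timestamp("1970-01-01", tz="UTC")) // pd.Timedelta('1s')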

