pandas datetime to unix timestamp seconds
I think you misunderstood what the argument is for. The purpose of origin='unix' is to convert an integer timestamp to a datetime, not the other way around.
pd.to_datetime(1.547559e+09, unit='s', origin='unix')
# Timestamp('2019-01-15 13:30:00')
Here are some options:
Option 1: integer division
Conversely, you can get the timestamp by converting to an integer (which gives nanoseconds) and dividing by 10**9.
pd.to_datetime(['2019-01-15 13:30:00']).astype(int) / 10**9
# Float64Index([1547559000.0], dtype='float64')
Pros:
- super fast
Cons:
- makes assumptions about how pandas internally stores dates
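The drawback above can be made concrete: dividing by 10**9 is only correct when the underlying integers count nanoseconds. Here is a minimal sketch, assuming a timezone-naive index, that pins the unit before casting. The helper name `to_unix_seconds_fast` is hypothetical and not part of pandas.

```python
import pandas as pd


def to_unix_seconds_fast(index: pd.DatetimeIndex) -> pd.Index:
    """Integer-division approach with the storage unit made explicit."""
    # Cast to nanosecond resolution first so that dividing by 10**9 is valid
    # whatever resolution pandas picked when the values were parsed.
    as_ns = index.astype("datetime64[ns]")
    # "int64" is spelled out because a plain int can be 32-bit on some platforms.
    return as_ns.astype("int64") // 10**9


dates = pd.to_datetime(["2019-01-15 13:30:00"])
seconds = to_unix_seconds_fast(dates)
print(seconds.tolist())
```

Using `//` instead of `/` keeps the result as integers, which is usually what you want for whole seconds.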
Option 2: recommended by pandas
Pandas docs recommend using the following method:
# create test data
dates = pd.to_datetime(['2019-01-15 13:30:00'])
# calculate unix datetime
(dates - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
[out]:
Int64Index([1547559000], dtype='int64')
Pros:
- "idiomatic", recommended by the library
Cons:
- unwieldy
- not as performant as integer division
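If the expression feels unwieldy, it can be wrapped once and reused. This is only a sketch, assuming timezone-naive input that represents UTC; `EPOCH` and `to_unix` are illustrative names, not pandas APIs.

```python
import pandas as pd

# Hypothetical module-level constant: the Unix epoch as a naive Timestamp.
EPOCH = pd.Timestamp("1970-01-01")


def to_unix(values, unit="1s"):
    """Whole number of `unit` steps elapsed since the Unix epoch."""
    return (values - EPOCH) // pd.Timedelta(unit)


dates = pd.to_datetime(["2019-01-15 13:30:00", "1970-01-01 00:00:01"])
print(to_unix(dates).tolist())         # whole seconds
print(to_unix(dates, "1ms").tolist())  # whole milliseconds
```

Because the result comes from timedelta arithmetic rather than a raw integer cast, it does not depend on how the datetimes are stored internally.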
Option 3: pd.Timestamp
If you have a single date string, you can use pd.Timestamp as shown in the other answer:
pd.Timestamp('2019-01-15 13:30:00').timestamp()
# 1547559000.0
If you have to coerce multiple datetimes (where pd.to_datetime is your only option), you can initialize and map:
pd.to_datetime(['2019-01-15 13:30:00']).map(pd.Timestamp.timestamp)
# Float64Index([1547559000.0], dtype='float64')
Pros:
- best method for a single datetime string
- easy to remember
Cons:
- not as performant as integer division
Convert pandas DateTimeIndex to Unix Time?
Since DatetimeIndex is an ndarray under the hood, you can do the conversion without a list comprehension (much faster).
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: from datetime import datetime
In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
...: index = pd.DatetimeIndex(dates)
...:
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000],
dtype=int64)
In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)
%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop
%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop
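A quick way to check the vectorized conversion is to round-trip it. The sketch below reuses the same three dates and additionally pins nanosecond resolution before the integer cast, which is an assumption added here so that the division by 10**9 holds regardless of how the index is stored.

```python
from datetime import datetime

import numpy as np
import pandas as pd

dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
index = pd.DatetimeIndex(dates)

# Pin nanosecond resolution, cast to integers, then drop to whole seconds.
seconds = index.astype("datetime64[ns]").astype(np.int64) // 10**9

# Converting back with unit="s" should give the original datetimes.
restored = pd.to_datetime(seconds, unit="s")
print(seconds.tolist())
print((restored == index).all())
```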
Convert unix time to readable date in pandas dataframe
These appear to be seconds since epoch.
In [20]: df = DataFrame(data['values'])
In [21]: df.columns = ["date","price"]
In [22]: df
Out[22]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 358 entries, 0 to 357
Data columns (total 2 columns):
date 358 non-null values
price 358 non-null values
dtypes: float64(1), int64(1)
In [23]: df.head()
Out[23]:
date price
0 1349720105 12.08
1 1349806505 12.35
2 1349892905 12.15
3 1349979305 12.19
4 1350065705 12.15
In [25]: df['date'] = pd.to_datetime(df['date'],unit='s')
In [26]: df.head()
Out[26]:
date price
0 2012-10-08 18:15:05 12.08
1 2012-10-09 18:15:05 12.35
2 2012-10-10 18:15:05 12.15
3 2012-10-11 18:15:05 12.19
4 2012-10-12 18:15:05 12.15
In [27]: df.dtypes
Out[27]:
date datetime64[ns]
price float64
dtype: object
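Since `data['values']` is not shown in the session above, here is a self-contained sketch that rebuilds the first three rows by hand and applies the same conversion.

```python
import pandas as pd

# First rows of the frame shown above, rebuilt by hand for a runnable example.
df = pd.DataFrame(
    {
        "date": [1349720105, 1349806505, 1349892905],
        "price": [12.08, 12.35, 12.15],
    }
)

# unit="s" tells pandas the integers count seconds since 1970-01-01 UTC.
df["date"] = pd.to_datetime(df["date"], unit="s")
print(df)
print(df.dtypes)
```

If the integers were milliseconds instead, `unit="ms"` would be the matching argument.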
pandas datetime to unixtime
I think you can subtract the date 1970-1-1 to create a timedelta and then call the method total_seconds:
In [130]:
s = pd.Series(pd.Timestamp(2012,1,1))
s
Out[130]:
0 2012-01-01
dtype: datetime64[ns]
In [158]:
(s - pd.Timestamp(1970,1,1)).dt.total_seconds()
Out[158]:
0 1325376000
dtype: float64
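The same idea as a standalone script, written as a sketch that uses `pd.Timestamp` for both the data and the epoch so that no extra imports are needed:

```python
import pandas as pd

s = pd.Series([pd.Timestamp(2012, 1, 1)])

# Subtracting the epoch yields a timedelta64 Series.
elapsed = s - pd.Timestamp("1970-01-01")

# .dt.total_seconds() returns floats; cast if whole seconds are wanted.
seconds = elapsed.dt.total_seconds()
print(seconds.iloc[0])  # 1325376000.0
print(seconds.astype("int64").tolist())
```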
How to properly convert a UNIX timestamp to pd.Timestamp object via pandas?
The problem is quite simple but not obvious. utcnow() gives you a naive datetime object, meaning that it is not aware of the fact that it represents UTC. Therefore, once you call .timestamp(), Python assumes local time because the datetime object is naive. It thus converts to UTC first before calculating Unix time, applying whatever UTC offset your local timezone has.
Solution: construct a datetime object that is aware of UTC. The same goes for fromtimestamp: set UTC as the tz.
from datetime import datetime, timezone
import pandas as pd
d = datetime.now(timezone.utc)
timestamp = d.timestamp()
assert datetime.fromtimestamp(timestamp, tz=timezone.utc) == d
assert pd.to_datetime(timestamp, unit="s", utc=True).to_pydatetime() == d
pandas is kind of a different story: a naive datetime is treated internally as UTC, so pd.to_datetime(timestamp, unit="s") gives you the UTC timestamp. But the conversion to a Python datetime does not take into account that Python will treat it as local time again. Here, keeping it consistent and setting utc=True (i.e. using an aware Timestamp) makes it work nicely.
- Further reading: Stop using utcnow and utcfromtimestamp
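To see the difference side by side, here is a small sketch with a fixed date instead of the current time. The particular date is an arbitrary choice for illustration.

```python
from datetime import datetime, timezone

import pandas as pd

naive = datetime(2019, 1, 15, 13, 30)
aware = naive.replace(tzinfo=timezone.utc)

# Aware datetime: the result is the same on every machine.
print(aware.timestamp())  # 1547559000.0

# Naive datetime: interpreted as local time, so the printed value depends
# on the timezone of the machine running this code.
print(naive.timestamp())

# pandas treats a naive Timestamp as UTC, so this matches the aware result.
print(pd.Timestamp(naive).timestamp())  # 1547559000.0
```

On a machine whose local timezone is UTC, all three values coincide, which is why the bug often goes unnoticed on servers.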
python dataframe convert epoch to readable datetime hour minutes seconds as zero
If you use https://www.epochconverter.com/, a timezone is applied to the result.
If you need to add a timezone to the column, use Series.dt.tz_localize and then Series.dt.tz_convert:
df['period'] = (pd.to_datetime(df['period'], unit='ms')
.dt.tz_localize('GMT')
.dt.tz_convert('Asia/Kathmandu'))
print (df)
period
0 2022-05-04 05:45:00+05:45
1 2022-05-03 05:45:00+05:45
2 2022-05-02 05:45:00+05:45
3 2022-05-01 05:45:00+05:45
4 2022-04-30 05:45:00+05:45
5 2022-04-29 05:45:00+05:45
6 2022-04-28 05:45:00+05:45
7 2022-04-27 05:45:00+05:45
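The original input frame is not shown, so the sketch below assumes two millisecond values chosen to correspond to the first two rows of the output. It keeps the UTC column next to the converted one, which makes it visible why the naive result had hour, minute and second all zero: the values are midnight UTC.

```python
import pandas as pd

# Hypothetical input: epoch milliseconds for midnight UTC on two days.
df = pd.DataFrame({"period": [1651622400000, 1651536000000]})

# Parse as naive datetimes, then declare them to be UTC.
utc = pd.to_datetime(df["period"], unit="ms").dt.tz_localize("UTC")

df["period_utc"] = utc
# Shift the same instants into the target timezone.
df["period_local"] = utc.dt.tz_convert("Asia/Kathmandu")
print(df[["period_utc", "period_local"]])
```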
How to convert string date column to timestamp in a new column in Python Pandas
Maybe try this?
import pandas as pd
import numpy as np
d = {'col1': ["2022-05-16T12:31:00Z", "2021-01-11T11:32:00Z"]}
df = pd.DataFrame(data=d)
df['col2'] = pd.to_datetime(df['col1'])
df['col2'] = df.col2.values.astype(np.int64) // 10 ** 9
df
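An alternative sketch that avoids the raw integer cast: parse the strings as timezone-aware UTC and subtract an aware epoch. This swaps in the epoch-subtraction technique in place of the `astype(np.int64)` approach above, so the result does not depend on the internal storage unit.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["2022-05-16T12:31:00Z", "2021-01-11T11:32:00Z"]})

# utc=True yields a timezone-aware column, matching the trailing "Z".
parsed = pd.to_datetime(df["col1"], utc=True)

# The epoch must be timezone-aware too, or the subtraction raises.
epoch = pd.Timestamp("1970-01-01", tz="UTC")
df["col2"] = (parsed - epoch) // pd.Timedelta("1s")
print(df)
```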