How to convert a python datetime.datetime to excel serial date number
It appears that the Excel "serial date" format is actually the number of days since 1900-01-00, with a fractional component that's a fraction of a day, based on http://www.cpearson.com/excel/datetime.htm. (I guess that date should actually be considered 1899-12-31, since there's no such thing as a 0th day of a month)
So, it seems like it should be:
import datetime as dt

def excel_date(date1):
    temp = dt.datetime(1899, 12, 30)  # Note, not 31st Dec but 30th!
    delta = date1 - temp
    return float(delta.days) + (float(delta.seconds) / 86400)
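As a quick check of why the epoch is 30 December rather than 31 December, here is a minimal sketch. It relies on the known quirk that Excel's 1900 date system counts a nonexistent 29 February 1900 as serial 60, so every real date from 1 March 1900 onward is shifted by one day. The names `EXCEL_EPOCH` and `to_excel_serial` are illustrative helpers, not part of any library.

```python
import datetime as dt

# Effective "day zero" for dates from 1900-03-01 onward
EXCEL_EPOCH = dt.datetime(1899, 12, 30)

def to_excel_serial(value):
    """Return the Excel serial number (days, with a fractional part) for a naive datetime."""
    delta = value - EXCEL_EPOCH
    # total_seconds() also accounts for microseconds, unlike delta.seconds alone
    return delta.total_seconds() / 86400

# Excel shows 1900-03-01 as serial 61 (60 is the phantom 1900-02-29)
print(to_excel_serial(dt.datetime(1900, 3, 1)))       # 61.0
print(to_excel_serial(dt.datetime(2015, 5, 15)))      # 42139.0
print(to_excel_serial(dt.datetime(2015, 5, 15, 12)))  # 42139.5
```

For dates before 1 March 1900 the result is one higher than what Excel displays, so this sketch should only be trusted from that date onward.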
Python datetime to Excel serial date conversion
Calculate the timedelta between your datetime object and Excel's "day zero", then divide the timedelta's total_seconds by the number of seconds in a day to get the Excel serial date:
import datetime
date_time_str = '2022-03-09 08:15:27'
UTC = datetime.timezone.utc
dt_obj = datetime.datetime.fromisoformat(date_time_str).replace(tzinfo=UTC)
day_zero = datetime.datetime(1899,12,30, tzinfo=UTC)
excel_serial_date = (dt_obj-day_zero).total_seconds()/86400
print(excel_serial_date)
# 44629.3440625
Note: I'm setting time zone to UTC here to avoid any ambiguities - adjust as needed.
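Excel serial numbers carry no time zone, so "adjust as needed" usually comes down to deciding which wall-clock time should appear in the sheet. A minimal sketch, assuming you start from an aware datetime and want the wall-clock time in some target zone; the helper name `aware_to_excel_serial` and the fixed UTC+1 offset are illustrative only:

```python
from datetime import datetime, timedelta, timezone

EXCEL_EPOCH = datetime(1899, 12, 30)

def aware_to_excel_serial(value, target_tz=timezone.utc):
    """Convert an aware datetime to an Excel serial showing wall-clock time in target_tz."""
    # Shift to the target zone, then drop tzinfo so the subtraction is naive
    wall_clock = value.astimezone(target_tz).replace(tzinfo=None)
    # Dividing two timedeltas yields a float number of days
    return (wall_clock - EXCEL_EPOCH) / timedelta(days=1)

utc_value = datetime(2022, 3, 9, 8, 15, 27, tzinfo=timezone.utc)
plus_one = timezone(timedelta(hours=1))  # fixed offset, used here only for illustration

print(aware_to_excel_serial(utc_value))            # wall-clock time in UTC
print(aware_to_excel_serial(utc_value, plus_one))  # one hour (1/24 day) later
```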
Since the question is tagged pandas, you'd do the same thing here, except that you don't need to set UTC, as pandas assumes UTC by default for naive datetimes:
import pandas as pd
ts = pd.Timestamp('2022-03-09 08:15:27')
excel_serial_date = (ts-pd.Timestamp('1899-12-30')).total_seconds()/86400
print(excel_serial_date)
# 44629.3440625
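The same arithmetic can be applied to a whole column at once. This is a minimal sketch, assuming a Series of naive timestamps; the sample values are made up for illustration:

```python
import pandas as pd

# Sample data (illustrative values only)
s = pd.Series(pd.to_datetime(["2022-03-09 08:15:27", "2015-05-15 00:00:00"]))

# Subtracting a Timestamp gives a timedelta Series;
# dividing by a one-day Timedelta turns it into float days
excel_serial = (s - pd.Timestamp("1899-12-30")) / pd.Timedelta(days=1)
print(excel_serial)
```

Dividing by `pd.Timedelta(days=1)` plays the same role as `total_seconds()/86400` in the scalar version.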
See also:
- background: What is story behind December 30, 1899 as base date?
- inverse operation: Convert Excel style date with pandas
Convert date from excel in number format to date format python
from datetime import datetime
excel_date = 42139
dt = datetime.fromordinal(datetime(1900, 1, 1).toordinal() + excel_date - 2)
tt = dt.timetuple()
print(dt)
print(tt)
As mentioned by J.F. Sebastian, this answer only works for any date after 1900/03/01
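The limitation comes from Excel's 1900 date system treating 1900 as a leap year: serial 60 is displayed as 29 February 1900, a day that never existed. A minimal sketch of a conversion that also covers the earlier serials, assuming integer input; the function name `excel_serial_to_date` and the choice to raise on serial 60 are assumptions made for illustration:

```python
from datetime import date, timedelta

def excel_serial_to_date(serial):
    """Convert an integer Excel serial (1900 date system) to a date."""
    if serial < 1:
        raise ValueError("serial must be >= 1")
    if serial == 60:
        # Excel displays this as 1900-02-29, which is not a real date
        raise ValueError("serial 60 is Excel's nonexistent 1900-02-29")
    # Before the phantom leap day the offset is one day smaller
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)

print(excel_serial_to_date(1))      # 1900-01-01
print(excel_serial_to_date(59))     # 1900-02-28
print(excel_serial_to_date(61))     # 1900-03-01
print(excel_serial_to_date(42139))  # 2015-05-15
```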
EDIT: (in answer to @R.K)
If your excel_date is a float number, use this code:
from datetime import datetime

def floatHourToTime(fh):
    hours, hourSeconds = divmod(fh, 1)
    minutes, seconds = divmod(hourSeconds * 60, 1)
    return (
        int(hours),
        int(minutes),
        int(seconds * 60),
    )

excel_date = 42139.23213
dt = datetime.fromordinal(datetime(1900, 1, 1).toordinal() + int(excel_date) - 2)
# the fractional part is a fraction of a day, so scale it to hours first
hour, minute, second = floatHourToTime(excel_date % 1 * 24)
dt = dt.replace(hour=hour, minute=minute, second=second)
print(dt)
assert str(dt) == "2015-05-15 05:34:16"
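An alternative sketch that avoids splitting hours, minutes, and seconds by hand: add the serial number as a timedelta to the 1899-12-30 epoch. Because the fractional part of a serial number is a fraction of a day, 0.23213 corresponds to roughly 5 hours 34 minutes. The function name `excel_serial_to_datetime` is illustrative, and the sketch assumes serials above 60.

```python
from datetime import datetime, timedelta

def excel_serial_to_datetime(serial):
    """Convert an Excel serial (days since 1899-12-30) to a naive datetime."""
    # timedelta accepts fractional days and keeps microsecond resolution
    return datetime(1899, 12, 30) + timedelta(days=serial)

value = excel_serial_to_datetime(42139.23213)
# Drop the sub-second remainder for display
print(value.replace(microsecond=0))  # 2015-05-15 05:34:16
```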
Excel Datetime SN Conversion in Python
Assuming your input looks like
import pandas as pd
df = pd.DataFrame({'date': ["2020-01-01", 43862, "2020-03-01"]})
you can process it as follows:
# convert everything first, ignore invalid results for now:
df['datetime'] = pd.to_datetime(df['date'])
# where you have numeric values, i.e. "excel datetime format":
nums = pd.to_numeric(df['date'], errors='coerce') # timestamp strings will give NaN here
# now replace the invalid dates:
df.loc[nums.notna(), 'datetime'] = pd.to_datetime(nums[nums.notna()], unit='d', origin='1899-12-30')
...giving you
df
         date   datetime
0  2020-01-01 2020-01-01
1       43862 2020-02-01
2  2020-03-01 2020-03-01
related:
- Python pandas: how to obtain the datatypes of objects in a mixed-datatype column?
- Convert Excel style date with pandas.
Convert Excel style date with pandas
OK, I think the easiest thing is to construct a TimedeltaIndex from the floats and add this to the scalar datetime for 1900,1,1:
In [85]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date':[42580.3333333333, 10023]})
df
Out[85]:
           date
0  42580.333333
1  10023.000000
In [86]:
df['real_date'] = pd.TimedeltaIndex(df['date'], unit='d') + dt.datetime(1900,1,1)
df
Out[86]:
           date                  real_date
0  42580.333333 2016-07-31 07:59:59.971200
1  10023.000000 1927-06-12 00:00:00.000000
OK, it seems that Excel is a bit weird with its dates (thanks @ayhan):
In [89]:
df['real_date'] = pd.TimedeltaIndex(df['date'], unit='d') + dt.datetime(1899, 12, 30)
df
Out[89]:
           date                  real_date
0  42580.333333 2016-07-29 07:59:59.971200
1  10023.000000 1927-06-10 00:00:00.000000
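The same conversion can also be sketched with `pd.to_datetime` and its `unit` and `origin` parameters, which avoids building the TimedeltaIndex by hand. Rounding to the nearest second is an optional extra step added here, on the assumption that the sub-second remainder is just float noise from the fractional serial:

```python
import pandas as pd

df = pd.DataFrame({"date": [42580.3333333333, 10023]})

# unit="D" treats the numbers as days; origin sets the Excel epoch
converted = pd.to_datetime(df["date"], unit="D", origin="1899-12-30")

# Float serials rarely land on an exact second, so round to the nearest one
df["real_date"] = converted.dt.round("s")
print(df)
```

After rounding, the first row reads 2016-07-29 08:00:00 instead of a value a few milliseconds short of it.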
See related: How to convert a python datetime.datetime to excel serial date number
How to convert a column with Excel Serial Dates and regular dates to a pandas datetime?
- All the dates can't be parsed in the same manner
- Load the dataframe
- Cast the dates column as a str if it's not already.
- Use Boolean Indexing to select different date types
  - Assuming regular dates contain a /
  - Assuming Excel serial dates do not contain a /
- Fix each dataframe separately based on its datetime type
- Concat the dataframes back together.
import pandas as pd
from datetime import datetime
# load data
df = pd.DataFrame({'dates': ['09/01/2020', '05/15/1985', '06/07/2013', '33233', '26299', '29428']})
# display(df)
dates
0 09/01/2020
1 05/15/1985
2 06/07/2013
3 33233
4 26299
5 29428
# set the column type as a str if it isn't already
df.dates = df.dates.astype('str')
# create a date mask based on the string containing a /
date_mask = df.dates.str.contains('/')
# split the dates out for excel
df_excel = df[~date_mask].copy()
# split the regular dates out
df_reg = df[date_mask].copy()
# convert reg dates to datetime
df_reg.dates = pd.to_datetime(df_reg.dates)
# convert excel dates to datetime; the column needs to be cast as ints
# Excel serial dates count days from 1899-12-30, not 1900-01-01
df_excel.dates = pd.TimedeltaIndex(df_excel.dates.astype(int), unit='d') + datetime(1899, 12, 30)
# combine the dataframes
df = pd.concat([df_reg, df_excel])
display(df)
dates
0 2020-09-01
1 1985-05-15
2 2013-06-07
3 1990-12-26
4 1972-01-01
5 1980-07-26
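The split-and-recombine steps can also be sketched as a single function that fills one result Series through boolean masks, which keeps the original row order without a concat. The function name `parse_mixed_dates`, the `%m/%d/%Y` format, and the 1899-12-30 epoch are assumptions for this sketch:

```python
import pandas as pd

def parse_mixed_dates(col):
    """Parse a column mixing 'MM/DD/YYYY' strings and Excel serial numbers."""
    col = col.astype(str)
    # Assumption: only regular dates contain a "/"
    is_regular = col.str.contains("/")

    out = pd.Series(pd.NaT, index=col.index, dtype="datetime64[ns]")
    # Assumption: regular dates are month/day/year, as in the sample data
    out[is_regular] = pd.to_datetime(col[is_regular], format="%m/%d/%Y")
    # Excel serials are interpreted as days counted from 1899-12-30
    out[~is_regular] = pd.to_datetime(
        col[~is_regular].astype(int), unit="D", origin="1899-12-30"
    )
    return out

dates = pd.Series(["09/01/2020", "05/15/1985", "33233", "26299"])
print(parse_mixed_dates(dates))
```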
How do I convert date format to integer in python?
Assuming (wild guess), that you want the number of days since 1899-12-30, you could use:
df['diff'] = (pd.to_datetime(df['date'], dayfirst=True)
              - pd.Timestamp('1899-12-30')).dt.days
Output:
         date valor   diff
0  20/08/2008     a  39680
1  21/08/2008     b  39681
2  22/08/2008     c  39682
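For a version that runs on its own, here is a minimal sketch. The frame is a hypothetical reconstruction from the columns shown in the output, and it uses an explicit `format="%d/%m/%Y"` instead of `dayfirst=True`, on the assumption that every value follows that layout:

```python
import pandas as pd

# Hypothetical frame rebuilt from the columns shown in the output
df = pd.DataFrame({
    "date": ["20/08/2008", "21/08/2008", "22/08/2008"],
    "valor": ["a", "b", "c"],
})

# Whole days between each date and the Excel epoch
df["diff"] = (pd.to_datetime(df["date"], format="%d/%m/%Y")
              - pd.Timestamp("1899-12-30")).dt.days
print(df["diff"].tolist())  # [39680, 39681, 39682]
```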