Combine Date and Time Columns Using Python Pandas

Combine Date and Time columns using python pandas

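It's worth mentioning that you may have been able to read this in directly, e.g. if you were using read_csv, by passing parse_dates=[['Date', 'Time']]. Note that recent pandas releases (2.2 and later) deprecate this nested-list form of parse_dates.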

Assuming these are just strings, you could simply add them together (with a space), allowing you to use to_datetime, which works here without specifying the format= parameter:

In [11]: df['Date'] + ' ' + df['Time']
Out[11]:
0 01-06-2013 23:00:00
1 02-06-2013 01:00:00
2 02-06-2013 21:00:00
3 02-06-2013 22:00:00
4 02-06-2013 23:00:00
5 03-06-2013 01:00:00
6 03-06-2013 21:00:00
7 03-06-2013 22:00:00
8 03-06-2013 23:00:00
9 04-06-2013 01:00:00
dtype: object

In [12]: pd.to_datetime(df['Date'] + ' ' + df['Time'])
Out[12]:
0 2013-01-06 23:00:00
1 2013-02-06 01:00:00
2 2013-02-06 21:00:00
3 2013-02-06 22:00:00
4 2013-02-06 23:00:00
5 2013-03-06 01:00:00
6 2013-03-06 21:00:00
7 2013-03-06 22:00:00
8 2013-03-06 23:00:00
9 2013-04-06 01:00:00
dtype: datetime64[ns]
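The output above reads 01-06-2013 as 6 January, i.e. month first. If the strings are actually day-first, the same text maps to a different date, so it can help to state the order explicitly. The following is a minimal sketch, assuming a two-row stand-in frame built from the values shown above; which reading is correct depends on your data.

```python
import pandas as pd

# Stand-in frame (an assumption): two rows copied from the example output.
df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013"],
    "Time": ["23:00:00", "01:00:00"],
})
combined = df["Date"] + " " + df["Time"]

# Month-first reading, matching the output shown above.
month_first = pd.to_datetime(combined, format="%m-%d-%Y %H:%M:%S")

# Day-first reading of the very same strings.
day_first = pd.to_datetime(combined, format="%d-%m-%Y %H:%M:%S")

print(month_first.iloc[0])  # 2013-01-06 23:00:00
print(day_first.iloc[0])    # 2013-06-01 23:00:00
```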

Alternatively, you can concatenate without the + ' ', but then the format= parameter must be used. Additionally, although pandas is good at inferring the format to convert to a datetime, specifying the exact format is faster:

pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')

Note: surprisingly (for me), this works fine with NaNs, which are converted to NaT, but it is worth checking that the conversion did what you expected (perhaps using the errors= argument, e.g. errors='raise').
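To illustrate the note about missing values, here is a small sketch of how the errors= argument of to_datetime behaves. The frame is made up for the example: one valid row, one row with a missing date, and one deliberately malformed date.

```python
import pandas as pd

# Hypothetical data: a valid row, a missing date, and a malformed date.
df = pd.DataFrame({
    "Date": ["01-06-2013", None, "13-45-2013"],
    "Time": ["23:00:00", "01:00:00", "21:00:00"],
})
combined = df["Date"] + " " + df["Time"]  # the missing date stays missing
fmt = "%m-%d-%Y %H:%M:%S"

# errors="coerce": anything that cannot be parsed becomes NaT.
coerced = pd.to_datetime(combined, format=fmt, errors="coerce")

# errors="raise" (the default): the malformed row raises ValueError,
# while a missing value on its own would simply become NaT.
try:
    pd.to_datetime(combined, format=fmt, errors="raise")
    raised = False
except ValueError:
    raised = True

print(coerced.isna().tolist())  # [False, True, True]
print(raised)                   # True
```

Comparing the number of NaT values before and after the conversion is one way to see whether errors="coerce" silently dropped rows you expected to parse.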

# sample dataframe with 10000000 rows using df from the OP
# (setup only: this cell is not timed, so it carries no %%timeit)
df = pd.concat([df for _ in range(1000000)]).reset_index(drop=True)

%%timeit
pd.to_datetime(df['Date'] + ' ' + df['Time'])
[result]:
1.73 s ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')
[result]:
1.33 s ± 9.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Combine date and time column in pandas

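import pandas as pd

df = pd.DataFrame({
    "Date": ["2021-10-01"],
    "Time": ["12:42 PM"],
})
# Note: infer_datetime_format is deprecated since pandas 2.0,
# where format inference is already the default behaviour.
df["dateMerge"] = pd.to_datetime(df["Date"] + " " + df["Time"], infer_datetime_format=True)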

Date Time dateMerge
2021-10-01 12:42 PM 2021-10-01 12:42:00
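As a variation on the snippet above, the 12-hour format can be spelled out instead of inferred. This is only a sketch: the second row is added for illustration, and the format string assumes every value looks like "HH:MM AM/PM".

```python
import pandas as pd

# The second row is a made-up addition to show an AM value.
df = pd.DataFrame({
    "Date": ["2021-10-01", "2021-10-02"],
    "Time": ["12:42 PM", "01:05 AM"],
})

# %I is the 12-hour clock hour and %p the AM/PM marker.
df["dateMerge"] = pd.to_datetime(
    df["Date"] + " " + df["Time"],
    format="%Y-%m-%d %I:%M %p",
)

print(df["dateMerge"].tolist())
```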

How to combine date column and time column

If your 'Date' column holds timestamps rather than strings, convert it to strings, concatenate it with the 'Time' column, and then convert the result back into timestamps, i.e.

df['Datetime'] = pd.to_datetime(df['Date'].apply(str)+' '+df['Time'])

Output :


Date Time Open High Low Close Volume \
0 2013-07-09 7:00 101.056 101.151 101.016 101.130 1822
1 2013-07-09 8:00 101.130 101.257 101.128 101.226 2286
2 2013-07-09 9:00 101.226 101.299 101.175 101.180 2685
3 2013-07-09 10:00 101.178 101.188 101.019 101.154 2980
4 2013-07-09 11:00 101.153 101.239 101.146 101.188 2623

Datetime
0 2013-07-09 07:00:00
1 2013-07-09 08:00:00
2 2013-07-09 09:00:00
3 2013-07-09 10:00:00
4 2013-07-09 11:00:00
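When 'Date' is already a datetime64 column, another option is to skip the string round trip and add the time of day as a timedelta. The sketch below assumes the 'Time' values look like "7:00" (hours and minutes only) and appends ":00" so each value has an hh:mm:ss shape before it is passed to pd.to_timedelta; the two-row frame is a stand-in for the data shown above.

```python
import pandas as pd

# Stand-in frame (an assumption): 'Date' is datetime64, 'Time' is text.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-07-09", "2013-07-09"]),
    "Time": ["7:00", "8:00"],
})

# Pad the time with seconds, turn it into a timedelta, and add it
# to the midnight timestamps in 'Date'.
df["Datetime"] = df["Date"] + pd.to_timedelta(df["Time"] + ":00")

print(df["Datetime"].tolist())
```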

Pandas: Join Date and Time into one datetime column

.astype(str) works on the elements of a Series, not the Series itself, so of course type(df["Time"].astype(str)) == pd.Series. This seems to be the source of much of your confusion: you're acting on the Series, not its elements.

A solution (there may be an easier way) is to just loop over the series:

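import datetime

# parse each concatenated date+time string into a datetime object
dts = [datetime.datetime.strptime(elem, '%Y-%m-%d%H:%M:%S')
       for elem in df['Date'] + df['Time']]

# format the datetimes as day-first strings
fmted = [elem.strftime('%d-%m-%Y %H:%M:%S') for elem in dts]

df.insert(0, 'DateTime', fmted)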
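The loop above can also be written with vectorized pandas calls. This is a sketch under the same assumption as the loop, namely that 'Date' and 'Time' are strings shaped like "2013-07-09" and "07:00:00"; the sample frame is made up.

```python
import pandas as pd

# Hypothetical sample data in the shape the loop above expects.
df = pd.DataFrame({
    "Date": ["2013-07-09", "2013-07-10"],
    "Time": ["07:00:00", "18:30:00"],
})

# Parse the concatenated strings in one call, then format them day-first.
parsed = pd.to_datetime(df["Date"] + df["Time"], format="%Y-%m-%d%H:%M:%S")
df.insert(0, "DateTime", parsed.dt.strftime("%d-%m-%Y %H:%M:%S"))

print(df["DateTime"].tolist())
# ['09-07-2013 07:00:00', '10-07-2013 18:30:00']
```

Keeping `parsed` instead of the formatted strings leaves you with a real datetime64 column, which is usually more useful for later filtering or resampling.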

Pandas Combine date and time columns

Construct a dataframe from the (year, month, day) tuples in the 'date' column and pass it to pd.to_datetime

Convert the 'time' column to time deltas with pd.to_timedelta

date = pd.to_datetime(
    pd.DataFrame(
        df.date.tolist(),
        columns=['year', 'month', 'day']
    )
)

time = pd.to_timedelta(df.time)

date + time

0 2016-05-07 01:01:01.125
1 2016-05-08 02:03:05.691
dtype: datetime64[ns]
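The input frame is not shown above, so here is a self-contained sketch of the same two steps. The sample frame is an assumption, chosen so that the result matches the output shown: 'date' holds (year, month, day) tuples and 'time' holds strings.

```python
import pandas as pd

# Assumed input shape: tuples in 'date', time-of-day strings in 'time'.
df = pd.DataFrame({
    "date": [(2016, 5, 7), (2016, 5, 8)],
    "time": ["01:01:01.125", "02:03:05.691"],
})

# Step 1: expand the tuples into year/month/day columns, which
# pd.to_datetime can assemble into timestamps.
date = pd.to_datetime(
    pd.DataFrame(df["date"].tolist(), columns=["year", "month", "day"])
)

# Step 2: convert the time strings to timedeltas and add them.
time = pd.to_timedelta(df["time"])
combined = date + time

print(combined)
```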

Python pandas - join date & time columns into datetime column with timezone

Upon calling read_csv, set dayfirst=True so that the date is parsed correctly. Floor to minutes using dt.floor:

data = pd.read_csv(f'{data_path}/{symbol}.csv', parse_dates=[['Date','Time']], dayfirst=True)

data = data.set_index(data['Date_Time'].dt.floor('min')).tz_localize('Asia/Kolkata')

# need to drop col used as index separately here:
data = data.drop(['Date_Time'], axis=1)
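For pandas versions where the nested-list form of parse_dates is deprecated, a similar result can be reached by combining the columns after loading. This is a sketch only: the inline CSV text and the 'Close' column are made up, and the dates are assumed to be day-first as in the answer above.

```python
import io
import pandas as pd

# Made-up CSV content standing in for the real file.
csv_text = """Date,Time,Close
09-07-2013,07:00:31,101.130
09-07-2013,08:00:05,101.226
"""

# Read both columns as plain strings, then combine and parse day-first.
data = pd.read_csv(io.StringIO(csv_text), dtype={"Date": str, "Time": str})
stamps = pd.to_datetime(data["Date"] + " " + data["Time"], dayfirst=True)

# Floor to minutes, use the result as the index, then attach the timezone.
data = data.set_index(stamps.dt.floor("min").rename("Date_Time"))
data = data.tz_localize("Asia/Kolkata").drop(columns=["Date", "Time"])

print(data)
```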

