Combining Date and Time into a Date Column for Plotting

Reconstruct your data:

dat <- read.table(text="
date time numbers
01-02-2010 14:57 5
01-02-2010 23:23 7
02-02-2010 05:05 3
02-02-2010 10:23 11", header=TRUE)

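Now use as.POSIXct() and paste() to combine your date and time into a single POSIXct date-time. You need to specify the format, using the symbols defined in ?strptime. The format below reads the dates as month-day-year; swap %m and %d if your dates are day-first. Also see ?DateTimeClasses for more information.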

dat$newdate <- with(dat, as.POSIXct(paste(date, time), format="%m-%d-%Y %H:%M"))
plot(numbers ~ newdate, data=dat, type="b", col="blue")
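
For comparison, the same paste-then-parse idea can be sketched in Python's standard library, where datetime.strptime uses the same format symbols. This is an illustration rather than part of the original answer: the rows are copied from the sample data above, the helper name combine is made up for this sketch, and since dates such as 01-02-2010 are ambiguous, both readings are shown.

```python
from datetime import datetime

# Rows copied from the example data above: (date, time, numbers)
rows = [
    ("01-02-2010", "14:57", 5),
    ("01-02-2010", "23:23", 7),
    ("02-02-2010", "05:05", 3),
    ("02-02-2010", "10:23", 11),
]

def combine(date_str, time_str, fmt="%m-%d-%Y %H:%M"):
    """Join date and time with a space and parse with an explicit format.

    The default format matches the R call above (month first).
    """
    return datetime.strptime(f"{date_str} {time_str}", fmt)

# Month-first reading (default) versus day-first reading of the same strings.
month_first = [combine(d, t) for d, t, _ in rows]
day_first = [combine(d, t, "%d-%m-%Y %H:%M") for d, t, _ in rows]

print(month_first[0])  # 2010-01-02 14:57:00
print(day_first[0])    # 2010-02-01 14:57:00
```

Both calls succeed on this data, which is why it pays to state the format explicitly instead of relying on guessing.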

How to merge date and time into one variable

We could use the ymd_hm function from the lubridate package:

library(lubridate)

df$Date_time <- ymd_hm(paste0(df$Date, df$Time))
   Date       Time  Date_time          
<date> <chr> <dttm>
1 2017-06-24 08:40 2017-06-24 08:40:00
2 2019-10-29 10:00 2019-10-29 10:00:00
3 2017-02-10 22:10 2017-02-10 22:10:00
4 2016-08-10 18:00 2016-08-10 18:00:00
5 2017-12-08 08:00 2017-12-08 08:00:00
6 2017-08-28 04:30 2017-08-28 04:30:00
7 2019-09-18 20:00 2019-09-18 20:00:00
8 2019-02-04 15:40 2019-02-04 15:40:00
9 2019-02-09 11:00 2019-02-09 11:00:00
10 2020-03-23 07:00 2020-03-23 07:00:00

In R how do you combine date and time character columns into a single column?

Try this:

#I think you might have put a comma instead of a dash in your example data. Corrected by gsub:
df$Date <- gsub(",", "-",df$Date)

#Create a new combined date and time column:
df$DateTime <- paste(df$Date,df$Time)

#Convert the new column into a POSIXlt class object:
df$DateTime <- as.POSIXlt(df$DateTime, format = "%Y-%m-%d %H:%M", tz = "HST")

Another approach using piping actions from the dplyr package:

library(dplyr)

df %>%
  mutate(Date = gsub(",", "-", Date)) %>%
  mutate(DateTime = paste(Date, Time)) %>%
  mutate(DateTime = as.POSIXlt(DateTime, format = "%Y-%m-%d %H:%M", tz = "HST")) -> df
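
The same three steps (clean the separator, paste, parse with a time zone) can be sketched in Python's standard library. This is only an illustration: the sample rows are hypothetical, the helper name parse_row is made up, and "HST" is assumed to mean Hawaii Standard Time, modelled here as a fixed UTC-10 offset.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rows: dates use commas instead of dashes, as assumed above.
rows = [("2015,01,01", "09:30"), ("2015,01,02", "14:05")]

# Assumption: "HST" is Hawaii Standard Time, a fixed UTC-10 offset (no DST).
HST = timezone(timedelta(hours=-10), name="HST")

def parse_row(date_str, time_str):
    """Replace commas with dashes, join date and time, and attach the zone."""
    cleaned = date_str.replace(",", "-")
    naive = datetime.strptime(f"{cleaned} {time_str}", "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=HST)

parsed = [parse_row(d, t) for d, t in rows]
print(parsed[0].isoformat())  # 2015-01-01T09:30:00-10:00
```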

How to convert from Character to Date and merge Date and Time fields in a dataframe in R

library(lubridate)
as_datetime(paste(df$date, df$time, sep = " "))

so, with the dplyr library loaded, we can build the column row by row (note: no [1] indexing, which would repeat the first row's value in every row):

df |> mutate(newDate = as_datetime(paste(date, time, sep = " ")))
# A tibble: 2 × 3
date time newDate
<chr> <chr> <dttm>
1 2021-11-21 10:05:17 2021-11-21 10:05:17
2 2021-11-2 10:04:48 2021-11-02 10:04:48

and then you can filter with dplyr::filter(newDate >= as_datetime("2022-01-22 09:00:00"))

Grzegorz

Combine Date and Time columns using pandas

It's worth mentioning that you may be able to parse these columns directly on read, e.g. with read_csv and parse_dates=[['Date', 'Time']] (note that this nested-list form is deprecated in recent pandas versions).

Assuming these are just strings, you could simply add them together (with a space), allowing you to use to_datetime, which works without specifying the format= parameter:

In [11]: df['Date'] + ' ' + df['Time']
Out[11]:
0 01-06-2013 23:00:00
1 02-06-2013 01:00:00
2 02-06-2013 21:00:00
3 02-06-2013 22:00:00
4 02-06-2013 23:00:00
5 03-06-2013 01:00:00
6 03-06-2013 21:00:00
7 03-06-2013 22:00:00
8 03-06-2013 23:00:00
9 04-06-2013 01:00:00
dtype: object

In [12]: pd.to_datetime(df['Date'] + ' ' + df['Time'])
Out[12]:
0 2013-01-06 23:00:00
1 2013-02-06 01:00:00
2 2013-02-06 21:00:00
3 2013-02-06 22:00:00
4 2013-02-06 23:00:00
5 2013-03-06 01:00:00
6 2013-03-06 21:00:00
7 2013-03-06 22:00:00
8 2013-03-06 23:00:00
9 2013-04-06 01:00:00
dtype: datetime64[ns]

Alternatively, skip the + ' ' separator, in which case the format= parameter must be given. pandas is good at inferring the datetime format, but specifying the exact format is faster.

pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')

Note: surprisingly (for me), this works fine, with NaNs being converted to NaT, but it is worth checking that the conversion succeeded (perhaps using the errors='raise' argument).
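
A minimal sketch of that check, assuming a small hypothetical frame in which one row has a missing time and one has a malformed date. The errors= parameter of to_datetime is real; the sample values and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: row 1 has a missing time, row 2 a malformed date.
df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013", "not-a-date"],
    "Time": ["23:00:00", None, "21:00:00"],
})

# String concatenation propagates the missing value as NaN.
combined = df["Date"] + " " + df["Time"]

# errors="coerce" turns anything unparseable into NaT instead of raising;
# the default, errors="raise", would stop at the malformed row.
parsed = pd.to_datetime(combined, format="%m-%d-%Y %H:%M:%S", errors="coerce")

# Rows that did not convert can then be inspected explicitly.
bad_rows = df[parsed.isna()]
print(parsed)
print(bad_rows)
```

Coercing and then inspecting the NaT rows keeps the bad input visible, whereas the default errors="raise" fails fast on the first unparseable value.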

Timing comparison using %%timeit:

# sample dataframe with 10000000 rows using df from the OP
df = pd.concat([df for _ in range(1000000)]).reset_index(drop=True)

%%timeit
pd.to_datetime(df['Date'] + ' ' + df['Time'])
[result]:
1.73 s ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')
[result]:
1.33 s ± 9.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

How to combine Date from one column and Time from another?

Using base R, we can extract date from DoS and time from ToS and combine them together.

transform(sample, Datetime = as.POSIXct(paste(as.Date(DoS), format(ToS, "%T"))))

# DoS ToS Datetime
#1 2018-01-27 1899-12-31 19:00:00 2018-01-27 19:00:00
#2 2018-02-07 1899-12-31 15:45:00 2018-02-07 15:45:00
#3 2018-02-13 1899-12-31 23:00:00 2018-02-13 23:00:00
#4 2018-02-15 1899-12-31 13:45:00 2018-02-15 13:45:00
#5 2018-02-16 1899-12-31 10:00:00 2018-02-16 10:00:00
#6 2018-02-19 1899-12-31 15:00:00 2018-02-19 15:00:00
#7 2018-02-20 1899-12-31 15:05:00 2018-02-20 15:05:00
#8 2018-02-21 1899-12-31 15:00:00 2018-02-21 15:00:00
#9 2018-02-22 1899-12-31 11:30:00 2018-02-22 11:30:00
#10 2018-03-01 1899-12-31 19:30:00 2018-03-01 19:30:00
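
For comparison, the same "date part from one column, time part from another" idea can be sketched in Python's standard library with datetime.combine. The values are the first two rows of the output above; treating 1899-12-31 as a placeholder date is an assumption based on that output.

```python
from datetime import datetime

# First two rows from the output above: DoS holds the real date,
# ToS holds the real time attached to a placeholder date (1899-12-31).
dos = [datetime(2018, 1, 27), datetime(2018, 2, 7)]
tos = [datetime(1899, 12, 31, 19, 0), datetime(1899, 12, 31, 15, 45)]

# Keep the date part of DoS and the time part of ToS.
combined = [datetime.combine(d.date(), t.time()) for d, t in zip(dos, tos)]

for value in combined:
    print(value)
```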

Problem combining Date and Time Column using python pandas

If you want a string, use:

df['Date_Time'] = df.pop('Date')+' '+df.pop('Time')

output:

   Value            Date_Time
0 0.00 30.07.2020 11:08:00
1 0.01 30.07.2020 24:00:00
2 0.02 31.07.2020 00:01:00

To correctly handle the 24:00 as datetime:

# drop date/time and concatenate as single string
s = df.pop('Date')+' '+df.pop('Time')

# identify dates with 24:00 format
m = s.str.contains(' 24:')

# convert to datetime and add 1 day
df['Date_Time'] = (pd.to_datetime(s.str.replace(' 24:', ' 00:'), dayfirst=True)
                   + pd.to_timedelta(m.astype(int), unit='D'))

output:

   Value           Date_Time
0 0.00 2020-07-30 11:08:00
1 0.01 2020-07-31 00:00:00
2 0.02 2020-07-31 00:01:00
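
An alternative sketch, not from the original answer: parse the date and the time separately and add them. pd.to_timedelta reads "24:00:00" as a duration of exactly one day, so no string replacement is needed. The rows are copied from the output above, and the day-first format is an assumption based on values such as 30.07.2020.

```python
import pandas as pd

# Sample rows copied from the output above.
df = pd.DataFrame({
    "Value": [0.00, 0.01, 0.02],
    "Date": ["30.07.2020", "30.07.2020", "31.07.2020"],
    "Time": ["11:08:00", "24:00:00", "00:01:00"],
})

# Parse the date with an explicit day-first format, and treat the time
# as a duration since midnight, so "24:00:00" becomes exactly one day.
dates = pd.to_datetime(df["Date"], format="%d.%m.%Y")
offsets = pd.to_timedelta(df["Time"])

df["Date_Time"] = dates + offsets
print(df[["Value", "Date_Time"]])
```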

Python Pandas Combined Date and Hour Into One Column and plot using lineplot

Try using the following to combine your MultiIndex into a single DatetimeIndex:

df.set_index(pd.to_datetime(df.index.get_level_values(0)) +
             pd.to_timedelta(df.index.get_level_values(1), unit='h'),
             inplace=True)

From the data you provided, there appear to be gaps; for example, there is no 'msg_count' value at 2015-01-01 09:00.

To fix this you can use DataFrame.reindex with pandas.date_range and fill missing entries with 0:

new_idx = pd.date_range(df.index.min(), df.index.max(), freq='H')

df = df.reindex(new_idx, fill_value=0)

To plot 2015 data only use:

df[df.index.year == 2015].plot()
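
Put together, the steps might look like the sketch below. The three-row frame is hypothetical (only the 'msg_count' name and the missing 09:00 slot come from the text above), and a Timedelta is used as the step so the example does not depend on the 'H' versus 'h' frequency alias, which differs between pandas versions.

```python
import pandas as pd

# Hypothetical two-level index (date string, hour of day); 09:00 is missing.
idx = pd.MultiIndex.from_tuples(
    [("2015-01-01", 8), ("2015-01-01", 10), ("2015-01-01", 11)],
    names=["date", "hour"],
)
df = pd.DataFrame({"msg_count": [3, 5, 2]}, index=idx)

# Collapse the two levels into one DatetimeIndex.
stamps = pd.to_datetime(df.index.get_level_values(0)) + pd.to_timedelta(
    df.index.get_level_values(1), unit="h"
)
df = df.set_index(stamps)

# reindex returns a new frame, so the result is assigned back.
full_range = pd.date_range(
    df.index.min(), df.index.max(), freq=pd.Timedelta(hours=1)
)
df = df.reindex(full_range, fill_value=0)
print(df)
```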

