Combining date and time into a Date column for plotting
Reconstruct your data:
dat <- read.table(text="
date time numbers
01-02-2010 14:57 5
01-02-2010 23:23 7
02-02-2010 05:05 3
02-02-2010 10:23 11", header=TRUE)
Now use as.POSIXct() and paste() to combine your date and time into a POSIX date. You need to specify the format, using the symbols defined in ?strptime. Also see ?DateTimeClasses for more information.
dat$newdate <- with(dat, as.POSIXct(paste(date, time), format="%m-%d-%Y %H:%M"))
plot(numbers ~ newdate, data=dat, type="b", col="blue")
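The same paste-then-parse idea carries over to Python, which the pandas answers further down rely on. Here is a minimal sketch using only the standard library, assuming the month-first format string used in the R call above; the names rows, fmt and newdate are illustrative only:

```python
from datetime import datetime

# Same rows as the R example: (date, time, numbers)
rows = [
    ("01-02-2010", "14:57", 5),
    ("01-02-2010", "23:23", 7),
    ("02-02-2010", "05:05", 3),
    ("02-02-2010", "10:23", 11),
]

# Join date and time with a space, then parse with an explicit format.
# Assumption: the month comes first, matching the R format string above.
fmt = "%m-%d-%Y %H:%M"
newdate = [datetime.strptime(f"{d} {t}", fmt) for d, t, _ in rows]

print(newdate[0])  # 2010-01-02 14:57:00
```

The directives (%m, %d, %Y, %H, %M) have the same meaning as the ones documented in ?strptime.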
How to merge date and time into one variable
We could use the ymd_hm function from the lubridate package:
library(lubridate)
df$Date_time <- ymd_hm(paste0(df$Date, df$Time))
Date Time Date_time
<date> <chr> <dttm>
1 2017-06-24 08:40 2017-06-24 08:40:00
2 2019-10-29 10:00 2019-10-29 10:00:00
3 2017-02-10 22:10 2017-02-10 22:10:00
4 2016-08-10 18:00 2016-08-10 18:00:00
5 2017-12-08 08:00 2017-12-08 08:00:00
6 2017-08-28 04:30 2017-08-28 04:30:00
7 2019-09-18 20:00 2019-09-18 20:00:00
8 2019-02-04 15:40 2019-02-04 15:40:00
9 2019-02-09 11:00 2019-02-09 11:00:00
10 2020-03-23 07:00 2020-03-23 07:00:00
In R how do you combine date and time character columns into a single column?
Try this:
#I think you might have put a comma instead of a dash in your example data. Corrected by gsub:
df$Date <- gsub(",", "-",df$Date)
#Create a new combined date and time column:
df$DateTime <- paste(df$Date,df$Time)
#Convert the new column into a POSIXlt class object:
df$DateTime <- as.POSIXlt(df$DateTime, format = "%Y-%m-%d %H:%M", tz = "HST")
Another approach using pipes from the dplyr package:
library(dplyr)
df %>%
mutate(Date = gsub(",","-",Date)) %>%
mutate(DateTime = paste(Date,Time)) %>%
mutate(DateTime = as.POSIXlt(DateTime, c("%Y-%m-%d %H:%M"), tz = "HST")) -> df
How to convert from Character to Date and merge Date and Time fields in a dataframe in R
library(lubridate)
as_datetime(paste(df$date, df$time, sep = " "))
So, adding the dplyr library, we can build the column row by row (referring to the columns directly rather than to their first element):
df |> mutate(newDate = as_datetime(paste(date, time, sep = " ")))
#
# A tibble: 2 × 3
date time newDate
<chr> <chr> <dttm>
1 2021-11-21 10:05:17 2021-11-21 10:05:17
2 2021-11-2 10:04:48 2021-11-02 10:04:48
and then you can dplyr::filter(newDate >= as_datetime("2022-01-22 09:00:00"))
Grzegorz
Combine Date and Time columns using pandas
It's worth mentioning that you may have been able to read this in directly, e.g. if you were using read_csv with parse_dates=[['Date', 'Time']] (note that this nested-list form is deprecated in recent pandas releases).
Assuming these are just strings, you could simply add them together (with a space), allowing you to use to_datetime, which works without specifying the format= parameter:
In [11]: df['Date'] + ' ' + df['Time']
Out[11]:
0 01-06-2013 23:00:00
1 02-06-2013 01:00:00
2 02-06-2013 21:00:00
3 02-06-2013 22:00:00
4 02-06-2013 23:00:00
5 03-06-2013 01:00:00
6 03-06-2013 21:00:00
7 03-06-2013 22:00:00
8 03-06-2013 23:00:00
9 04-06-2013 01:00:00
dtype: object
In [12]: pd.to_datetime(df['Date'] + ' ' + df['Time'])
Out[12]:
0 2013-01-06 23:00:00
1 2013-02-06 01:00:00
2 2013-02-06 21:00:00
3 2013-02-06 22:00:00
4 2013-02-06 23:00:00
5 2013-03-06 01:00:00
6 2013-03-06 21:00:00
7 2013-03-06 22:00:00
8 2013-03-06 23:00:00
9 2013-04-06 01:00:00
dtype: datetime64[ns]
Alternatively, you can omit the + ' ', but then the format= parameter must be used. Additionally, pandas is good at inferring the format to be converted to a datetime; however, specifying the exact format is faster.
pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')
Note: surprisingly (for me), this works fine with NaNs being converted to NaT, but it is worth checking that the conversion succeeded (perhaps using the errors='raise' argument).
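To make the NaN behaviour concrete, here is a small sketch. It assumes a three-row frame in the same layout as the question (the values and the deliberately malformed string are made up for illustration) and uses the errors= parameter of pd.to_datetime:

```python
import numpy as np
import pandas as pd

# Illustrative data: one missing date, same column layout as above
df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013", np.nan],
    "Time": ["23:00:00", "01:00:00", "21:00:00"],
})

combined = df["Date"] + " " + df["Time"]  # a missing Date stays missing
fmt = "%m-%d-%Y %H:%M:%S"

# errors="raise" is the default: malformed strings raise, missing values become NaT
parsed = pd.to_datetime(combined, format=fmt, errors="raise")

# errors="coerce" turns unparseable strings into NaT instead of raising
bad = pd.Series(["13-45-2013 23:00:00"])  # month 13 does not exist
coerced = pd.to_datetime(bad, format=fmt, errors="coerce")

print(parsed.isna().sum(), coerced.isna().sum())
```

Comparing the number of missing values before and after the conversion is a cheap way to confirm that nothing was silently lost when errors="coerce" is used.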
# sample dataframe with 10000000 rows using df from the OP
df = pd.concat([df for _ in range(1000000)]).reset_index(drop=True)
%%timeit
pd.to_datetime(df['Date'] + ' ' + df['Time'])
[result]:
1.73 s ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
pd.to_datetime(df['Date'] + df['Time'], format='%m-%d-%Y%H:%M:%S')
[result]:
1.33 s ± 9.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
How to combine Date from one column and Time from another?
Using base R, we can extract the date from DoS and the time from ToS and combine them together.
transform(sample, Datetime = as.POSIXct(paste(as.Date(DoS), format(ToS, "%T"))))
# DoS ToS Datetime
#1 2018-01-27 1899-12-31 19:00:00 2018-01-27 19:00:00
#2 2018-02-07 1899-12-31 15:45:00 2018-02-07 15:45:00
#3 2018-02-13 1899-12-31 23:00:00 2018-02-13 23:00:00
#4 2018-02-15 1899-12-31 13:45:00 2018-02-15 13:45:00
#5 2018-02-16 1899-12-31 10:00:00 2018-02-16 10:00:00
#6 2018-02-19 1899-12-31 15:00:00 2018-02-19 15:00:00
#7 2018-02-20 1899-12-31 15:05:00 2018-02-20 15:05:00
#8 2018-02-21 1899-12-31 15:00:00 2018-02-21 15:00:00
#9 2018-02-22 1899-12-31 11:30:00 2018-02-22 11:30:00
#10 2018-03-01 1899-12-31 19:30:00 2018-03-01 19:30:00
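For readers working in pandas instead, the same idea (keep the calendar date from one column and the time of day from the other) can be sketched as follows. The two-row frame only mimics the layout shown above, and the time_of_day name is illustrative:

```python
import pandas as pd

# Two rows mimicking the layout above: DoS carries the date, ToS the time
sample = pd.DataFrame({
    "DoS": pd.to_datetime(["2018-01-27", "2018-02-07"]),
    "ToS": pd.to_datetime(["1899-12-31 19:00:00", "1899-12-31 15:45:00"]),
})

# Time of day as a timedelta: the timestamp minus its own midnight
time_of_day = sample["ToS"] - sample["ToS"].dt.normalize()

# Midnight of the real date plus the time of day
sample["Datetime"] = sample["DoS"].dt.normalize() + time_of_day

print(sample["Datetime"].iloc[0])  # 2018-01-27 19:00:00
```

Working with timedeltas avoids a round trip through strings, although the paste-and-parse approach shown in R works just as well.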
Problem combining Date and Time Column using python pandas
If you want a string, use:
df['Date_Time'] = df.pop('Date')+' '+df.pop('Time')
output:
Value Date_Time
0 0.00 30.07.2020 11:08:00
1 0.01 30.07.2020 24:00:00
2 0.02 31.07.2020 00:01:00
To correctly handle 24:00 as a datetime:
# drop date/time and concatenate as single string
s = df.pop('Date')+' '+df.pop('Time')
# identify dates with 24:00 format
m = s.str.contains(' 24:')
# convert to datetime and add 1 day
df['Date_Time'] = (pd.to_datetime(s.str.replace(' 24:', ' 00:'))
+ pd.DateOffset(days=1)*m.astype(int)
)
output:
Value Date_Time
0 0.00 2020-07-30 11:08:00
1 0.01 2020-07-31 00:00:00
2 0.02 2020-07-31 00:01:00
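A self-contained variant of the same technique is sketched below. It differs from the answer's code in two hedged ways: it passes an explicit day-first format (an assumption based on the 30.07.2020 values shown above) and it adds the extra day with pd.to_timedelta rather than pd.DateOffset:

```python
import pandas as pd

# The three combined strings from the output above
s = pd.Series([
    "30.07.2020 11:08:00",
    "30.07.2020 24:00:00",
    "31.07.2020 00:01:00",
])

# Flag rows that use the 24:00 convention
is_24 = s.str.contains(" 24:", regex=False)

# Parse with hour 24 rewritten to 00, using an explicit day-first format
parsed = pd.to_datetime(
    s.str.replace(" 24:", " 00:", regex=False),
    format="%d.%m.%Y %H:%M:%S",
)

# Add one day only where the original hour was 24
result = parsed + pd.to_timedelta(is_24.astype(int), unit="D")

print(result.tolist())
```

Spelling out the format removes any ambiguity between day-first and month-first dates for values such as 01.07.2020.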
Python Pandas Combined Date and Hour Into One Column and plot using lineplot
Try using the following to combine your MultiIndex into a single DatetimeIndex:
df.set_index(pd.to_datetime(df.index.get_level_values(0) ) +
pd.to_timedelta(df.index.get_level_values(1), unit='h'),
inplace=True)
From the data you provided, there appear to be gaps; for example, there is no 'msg_count' value at 2015-01-01 09:00.
To fix this, you can use DataFrame.reindex with pandas.date_range and fill the missing entries with 0:
new_idx = pd.date_range(df.index.min(), df.index.max(), freq='H')
df = df.reindex(new_idx, fill_value=0)
To plot 2015 data only use:
df[df.index.year == 2015].plot()
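Putting the steps together on a toy frame may help. The data below is invented (three hourly rows with 09:00 missing), and the level names date and hour are assumptions about how the MultiIndex is labelled:

```python
import pandas as pd

# Toy data: (date, hour) MultiIndex with a gap at 09:00
idx = pd.MultiIndex.from_tuples(
    [("2015-01-01", 8), ("2015-01-01", 10), ("2015-01-01", 11)],
    names=["date", "hour"],
)
df = pd.DataFrame({"msg_count": [3, 5, 2]}, index=idx)

# Combine both levels into a single DatetimeIndex
dates = pd.to_datetime(df.index.get_level_values("date"))
hours = pd.to_timedelta(df.index.get_level_values("hour"), unit="h")
df.index = dates + hours

# Build a gap-free hourly index and fill the missing hours with 0
new_idx = pd.date_range(
    df.index.min(), df.index.max(), freq=pd.Timedelta(hours=1)
)
df = df.reindex(new_idx, fill_value=0)

print(df)
```

Note that reindex returns a new DataFrame, so the result has to be assigned back. Passing a Timedelta as freq is used here only to sidestep differences between pandas versions in the spelling of the hourly alias.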