Split Date Data (M/D/Y) into 3 Separate Columns

Given a text variable x, like this:

> x
[1] "10/3/2001"

then:

> as.Date(x,"%m/%d/%Y")
[1] "2001-10-03"

converts it to a date object. Then, if you need it:

> julian(as.Date(x,"%m/%d/%Y"))
[1] 11598
attr(,"origin")
[1] "1970-01-01"

gives you the number of days since the origin (1970-01-01 by default).

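Don't split the string by position with substring(): the month and day can be one or two digits (as in "10/3/2001"), so fixed offsets break. To get separate month, day, and year values, apply format() to the Date object with "%m", "%d", and "%Y".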

See help(as.Date) for more.

How can I separate this date time column into different columns?

You can convert it to POSIXct and then use format() to extract the day, month, year, and time.

x <- c("17/09/2019 9:15:27 a.m.", "17/09/2019 9:15:27 p.m.")
x <- gsub("\\.", "", x) #Remove the . in a.m.
x <- as.POSIXct(x, format="%d/%m/%Y %I:%M:%S %p") #convert to POSIX
data.frame(day = format(x, "%d"),
           month = format(x, "%m"),
           year = format(x, "%Y"),
           time = format(x, "%T"))
#  day month year     time
#1  17    09 2019 09:15:27
#2  17    09 2019 21:15:27

If splitting into columns is all you need, I would use strsplit() and split on / or a space.

x <- c("17/09/2019 9:15:27 a.m.", "17/09/2019 9:15:27 p.m.")
do.call(rbind, strsplit(x, "[/ ]"))
#     [,1] [,2] [,3]   [,4]      [,5]
#[1,] "17" "09" "2019" "9:15:27" "a.m."
#[2,] "17" "09" "2019" "9:15:27" "p.m."

Splitting dates in R

The lubridate package provides the year() and month() functions, which extract the year and the month, respectively, from a date object.

library(data.table)
library(lubridate)

DT <- as.data.table(yourDataFrame)

DT[, Date := ymd(time)]
DT[, year := year(Date)]
DT[, month := month(Date)]

Split Date-Time column (containing a character) into two separate columns in R

You can use tidyr::separate():

tidyr::separate(df, Date_Time, c('Year', 'Month', 'Day', 'Time'), sep = '[T-]')

#  Year Month Day     Time
#1 2020    01  01 00:48:00
#2 2020    01  01 00:46:00
#3 2020    01  02 15:07:00
#4 2020    01  02 15:07:00

Or extract date and time after converting Date_Time to POSIXct type.

library(dplyr)
library(lubridate)

df %>%
  mutate(Date_Time = ymd_hms(Date_Time),
         Year = year(Date_Time),
         Month = month(Date_Time),
         Day = day(Date_Time),
         Time = format(Date_Time, '%T'))

Splitting timestamp column into separate date and time columns

I'm not sure why you would want to do this in the first place, but if you really must...

import pandas as pd
df = pd.DataFrame({'my_timestamp': pd.date_range('2016-1-1 15:00', periods=5)})

>>> df
         my_timestamp
0 2016-01-01 15:00:00
1 2016-01-02 15:00:00
2 2016-01-03 15:00:00
3 2016-01-04 15:00:00
4 2016-01-05 15:00:00

df['new_date'] = [d.date() for d in df['my_timestamp']]
df['new_time'] = [d.time() for d in df['my_timestamp']]

>>> df
         my_timestamp    new_date  new_time
0 2016-01-01 15:00:00  2016-01-01  15:00:00
1 2016-01-02 15:00:00  2016-01-02  15:00:00
2 2016-01-03 15:00:00  2016-01-03  15:00:00
3 2016-01-04 15:00:00  2016-01-04  15:00:00
4 2016-01-05 15:00:00  2016-01-05  15:00:00

The conversion to CST is trickier. I assume that the current timestamps are 'unaware' (naive), i.e. they do not have a timezone attached? If so, which timezone should they be interpreted in before converting?

For more details:

https://docs.python.org/2/library/datetime.html

How to make an unaware datetime timezone aware in python
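
To make the aware/unaware point concrete, here is a minimal sketch using only the standard library. It rests on two assumptions that are not in the question: that the naive timestamps are really UTC, and that "CST" means a fixed UTC-6 offset with no daylight saving. The helper name naive_utc_to_cst is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" is a fixed UTC-6 offset (no daylight saving handling).
CST = timezone(timedelta(hours=-6), name="CST")


def naive_utc_to_cst(ts: datetime) -> datetime:
    """Treat a naive timestamp as UTC, then convert it to CST.

    Hypothetical helper: the choice of UTC as the source zone is an assumption.
    """
    if ts.tzinfo is not None:
        raise ValueError("expected a naive (timezone-unaware) datetime")
    # replace() only attaches the zone; astimezone() shifts the clock time.
    return ts.replace(tzinfo=timezone.utc).astimezone(CST)


converted = naive_utc_to_cst(datetime(2016, 1, 1, 15, 0))
print(converted.isoformat())  # 2016-01-01T09:00:00-06:00
```

If the timestamps were actually recorded in some other zone, attach that zone in the replace() step instead; the conversion step stays the same.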

EDIT

An alternative method that only loops once across the timestamps instead of twice:

new_dates, new_times = zip(*[(d.date(), d.time()) for d in df['my_timestamp']])
df = df.assign(new_date=new_dates, new_time=new_times)
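
If you would rather avoid the Python-level loop altogether, a sketch along the following lines uses the pandas .dt accessor on the same example frame. The extra year, month and day columns are only there for illustration and were not part of the question.

```python
import pandas as pd

df = pd.DataFrame({'my_timestamp': pd.date_range('2016-1-1 15:00', periods=5)})

# .dt exposes datetime components for the whole column at once.
df['new_date'] = df['my_timestamp'].dt.date  # column of datetime.date objects
df['new_time'] = df['my_timestamp'].dt.time  # column of datetime.time objects

# Numeric components come out as integer columns (illustrative extras).
df['year'] = df['my_timestamp'].dt.year
df['month'] = df['my_timestamp'].dt.month
df['day'] = df['my_timestamp'].dt.day

print(df)
```

Note that new_date and new_time end up as object columns, so they lose the vectorised datetime methods; keeping the original timestamp column alongside them is usually worthwhile.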

