How to Convert Variable With Mixed Date Formats to One Format

How to convert variable with mixed date formats to one format?

You may try parse_date_time in package lubridate which "allows the user to specify several format-orders to handle heterogeneous date-time character representations" using the orders argument. Something like...

library(lubridate)
parse_date_time(x = df$date,
                orders = c("d m y", "d B Y", "m/d/y"),
                locale = "eng")

...should be able to handle most of your formats. Please note that b/B formats are locale sensitive.

Other date-time formats which can be used in orders are listed in the Details section in ?strptime.

Fixing mixed date formats in data frame?

The parse_date_time function in lubridate allows you to parse vectors with heterogeneous formats using the "orders" argument:

library(lubridate)
x <- c("November 2, 2014", "13 August, 2014")

parse_date_time(x, orders = c("mdy", "dmy"))
# [1] "2014-11-02 UTC" "2014-08-13 UTC"

Convert mixed date formats in a data frame

If you know that you only have those two formats, you can first identify them (the first one has alphabetic characters in it, the other does not) and convert accordingly.

dateDF$date <- as.POSIXlt(
  dateDF$date,
  # Pick the format per element: values containing letters use the month-name format
  format = ifelse(
    grepl("[a-z]", dateDF$date),
    "%H:%M %d-%b-%y",
    "%d/%m/%Y %H:%M"
  )
)

How do I change all dates into the same format in R (if a column has a bunch of different character formats)?

Try one format and then the other for the values that returned NA.

y <- as.Date(input)
na <- is.na(y)
y[na] <- as.Date(input[na], "%m/%d/%Y")
y
#[1] "2019-01-22" "2019-04-17" "2009-05-17" "2010-05-17" "2015-05-17"
#[6] "2019-07-30"

How to format multiple date formats into single date in python

In an ideal world, you know the format of your inputs.

Where this is not possible, I recommend using a third-party library for mixed-format dates.

Two libraries that come to mind are dateutil (via dateutil.parser.parse) and pandas (via pandas.to_datetime). Below is an example implementation with the former.

Note that the one input parser.parse could not handle had to be covered with a manual conversion via datetime.strptime. datetime is part of the Python standard library.

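from datetime import datetime
from dateutil import parser

list1 = ["30-4-1994", "1994-30-04", "30/04/1994",
         "30-apr-1994", "30/apr/1994", "1994-30-apr"]

def converter(lst):
    for i in lst:
        try:
            # First attempt: dateutil's default interpretation
            yield parser.parse(i)
        except ValueError:
            try:
                # Second attempt: treat the ambiguous field as the day
                yield parser.parse(i, dayfirst=True)
            except ValueError:
                try:
                    # Last resort: explicit year-day-month-name format
                    yield datetime.strptime(i, '%Y-%d-%b')
                except ValueError:
                    # Nothing matched: pass the original string through
                    yield i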

res = list(converter(list1))

# [datetime.datetime(1994, 4, 30, 0, 0),
# datetime.datetime(1994, 4, 30, 0, 0),
# datetime.datetime(1994, 4, 30, 0, 0),
# datetime.datetime(1994, 4, 30, 0, 0),
# datetime.datetime(1994, 4, 30, 0, 0),
# datetime.datetime(1994, 4, 30, 0, 0)]

You can then format the results into strings any way you like using datetime.strftime:

res_str = [i.strftime('%d-%m-%Y') for i in res]

# ['30-04-1994',
# '30-04-1994',
# '30-04-1994',
# '30-04-1994',
# '30-04-1994',
# '30-04-1994']
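
If the set of possible formats is known in advance, the same result can be had with the standard library alone by trying each format in turn. The following is only a sketch: the names CANDIDATE_FORMATS and parse_known are hypothetical, the format list is an assumption tailored to the sample inputs above, and %b matches month names in the current locale (English in the default C locale).

```python
from datetime import datetime

# Hypothetical list of candidate formats; order matters when an input
# could match more than one of them.
CANDIDATE_FORMATS = (
    "%d-%m-%Y",
    "%Y-%d-%m",
    "%d/%m/%Y",
    "%d-%b-%Y",
    "%d/%b/%Y",
    "%Y-%d-%b",
)

def parse_known(text, formats=CANDIDATE_FORMATS):
    """Return the first successful strptime parse, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

samples = ["30-4-1994", "1994-30-04", "30/04/1994",
           "30-apr-1994", "30/apr/1994", "1994-30-apr"]

# Normalize every sample to one output format.
normalized = [parse_known(s).strftime("%d-%m-%Y") for s in samples]
```

Unlike a guessing parser, this approach never silently swaps day and month: an input either matches one of the listed formats or comes back as None, which you can then inspect by hand.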

Date Field Mixed (Date and Text) Formatting

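If A2 contains "1190729 " (note the trailing space), then the formula

=DATE(2000+MID(A2,2,2),MID(A2,4,2),RIGHT(TRIM(A2),2))

will produce the date 29 July 2019, displayed in your current date format. The formula skips the first character and reads the next six as year, month, and day.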
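
For readers handling the same kind of value outside a spreadsheet, the decoding can be sketched in Python. This mirrors the formula rather than any documented standard: the function name parse_cyymmdd is hypothetical, and skipping the first character and adding 2000 to the two-digit year are assumptions carried over from the formula.

```python
from datetime import date

def parse_cyymmdd(raw):
    """Decode a 7-digit code such as "1190729" into a date.

    Mirrors the spreadsheet formula: the first character is skipped and
    2000 is added to the two-digit year (an assumption about the data).
    """
    text = raw.strip()  # drop stray whitespace such as a trailing space
    if len(text) != 7 or not text.isdigit():
        raise ValueError(f"expected 7 digits, got {raw!r}")
    year = 2000 + int(text[1:3])
    month = int(text[3:5])
    day = int(text[5:7])
    return date(year, month, day)

print(parse_cyymmdd("1190729 "))  # 2019-07-29
```

One difference from the spreadsheet version: DATE rolls out-of-range months and days over into the next period, whereas date() raises ValueError, so malformed codes surface as errors instead of shifted dates.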


