Add Correct Century to Dates With Year Provided as "Year Without Century", %y

Add correct century to dates with year provided as Year without century, %y

1) chron. chron uses a cutoff of 30 by default, so the code below converts the strings in three steps: first to Date (since chron can't read dates in this format directly), then to character with two-digit years in a format that chron understands, and finally back to Date.

library(chron)
xx <- c("01AUG11", "01AUG12", "01AUG13") # sample data
as.Date(chron(format(as.Date(xx, "%d%b%y"), "%m/%d/%y")))

That gives a cutoff of 30 but we can get a cutoff of 13 using chron's chron.year.expand option:

library(chron)
options(chron.year.expand =
  function (y, cut.off = 12, century = c(1900, 2000), ...) {
    chron:::year.expand(y, cut.off = cut.off, century = century, ...)
  }
)

and then repeating the original conversion. For example, assuming this options statement has already been run, we get the following with our xx:

> as.Date(chron(format(as.Date(xx, "%d%b%y"), "%m/%d/%y")))
[1] "2011-08-01" "2012-08-01" "1913-08-01"

2) Date only. Here is an alternative that does not use chron. You might want to replace "2012-12-31" with Sys.Date() if the intent is that any date that would otherwise land in the future should be moved back 100 years:

d <- as.Date(xx, "%d%b%y")
as.Date(ifelse(d > "2012-12-31", format(d, "19%y-%m-%d"), format(d)))

EDIT: added Date only solution.

Adding the Century to 2-Digit Year

Try

df$date <- as.Date(with(df, paste(1900+YR, MO, DA,sep="-")), "%Y-%m-%d")

Define year for two digits year date format

We can create a function to do this

library(lubridate)
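f1 <- function(x, year=1970){
  # parse day-month-year strings into Date objects
  x <- dmy(x)
  # extract the two-digit year
  m <- year(x) %% 100
  # as written: two-digit years above the cutoff's last two digits get 20xx, the rest 19xx
  year(x) <- ifelse(m > year %% 100, 2000+m, 1900+m)
  x
}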

f1(character_date)
#[1] "1944-01-19"

If the year always has 19 as its prefix:

dmy(sub("-(\\d+)", "-19\\1", character_date))
#[1] "1944-01-19"

posixct time not understanding the '60s

Two-digit years are ambiguous. You can add a "19" with a regex and then parse with %Y instead of %y:

library(tidyverse)

discharge %>%
  rownames_to_column(var="date") %>%
  as_tibble() %>%
  mutate(date = strptime(sub("^(\\d+/\\d+/)(\\d+)$", "\\119\\2", date),
                         format = "%m/%d/%Y"))
#> # A tibble: 261 x 2
#>    date                Original
#>    <dttm>                 <int>
#>  1 1963-04-01 00:00:00     1100
#>  2 1963-05-01 00:00:00     1030
#>  3 1963-06-01 00:00:00      982
#>  4 1963-07-01 00:00:00      703
#>  5 1963-08-01 00:00:00      587
#>  6 1963-09-01 00:00:00      512
#>  7 1963-10-01 00:00:00      606
#>  8 1963-11-01 00:00:00      667
#>  9 1963-12-01 00:00:00     1010
#> 10 1964-01-01 00:00:00     1400
#> # ... with 251 more rows

Created on 2022-04-28 by the reprex package (v2.0.1)

R Lubridate Returns Unwanted Century When Given Two Digit Year

Lubridate v1.7.1 does not have this issue.

How to display the correct date century in Pandas?

In this specific case, I would use this:

pd.to_datetime(df['DOB'].str[:-2] + '19' + df['DOB'].str[-2:])

Note that this will break if you have DOBs after 1999!

Output:

0   1984-01-01
1   1985-07-31
2   1985-08-24
3   1993-12-30
4   1977-09-12
5   1990-08-09
6   1988-01-06
7   1989-04-10
8   1991-11-15
9   1968-01-06
dtype: datetime64[ns]
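
The one-liner above hard-codes "19", which is why it breaks for later birth years. A minimal sketch of a cutoff-based variant follows, using only plain Python string handling. The function name add_century and the pivot parameter are illustrative assumptions, not part of pandas, and the sketch assumes (as the one-liner does) that the last two characters of each string are the year.

```python
def add_century(date_str, pivot=30):
    """Insert a century before the trailing two-digit year of date_str.

    Hypothetical rule: years below `pivot` become 20xx, all others 19xx.
    """
    yy = int(date_str[-2:])                    # trailing two-digit year
    century = "20" if yy < pivot else "19"     # choose the century from the pivot
    return date_str[:-2] + century + date_str[-2:]


print(add_century("01/01/84"))  # 01/01/1984
print(add_century("06/01/05"))  # 06/01/2005
```

A helper like this could be applied element by element to the DOB column before calling pd.to_datetime. The right pivot depends on the data: for dates of birth, the two-digit form of the current year is a natural choice.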

convert character format to date format in r lubridate, leading year 20 not 19

I think you have to add "19" yourself, unless you want to use hydrostats::four.digit.year:

hydrostats::four.digit.year(dmy("4/11/64"), year=1900)

(the function is only a few lines long, so you could just copy it if you didn't want to depend on the package)

function (x, year = 1968) {
  n <- as.numeric(strftime(x, format = "%y"))%%100
  Y <- ifelse(n > year%%100, 1900 + n, 2000 + n)
  return(Y)
}

Otherwise, you're stuck with the POSIX standard. "%y" is the standard tag for converting two- (or one-) digit years, and from ?strptime:

‘%y’ Year without century (00-99). On input, values 00 to 68 are
prefixed by 20 and 69 to 99 by 19 - that is the behaviour
specified by the 2018 POSIX standard, but it does also say
‘it is expected that in a future version the default century
inferred from a 2-digit year will change’.

The standard itself states:

If century is not specified, then values in the range [69,99] shall refer to years 1969 to 1999 inclusive, and values in the range [00,68] shall refer to years 2000 to 2068 inclusive.

See also: why strptime for two digit year for 69 returns 1969 in python?
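
Python's datetime.strptime follows the same POSIX rule for %y, so the pivot is easy to see there. The sketch below first shows the default behaviour and then moves any parsed date that falls after a chosen reference date back by 100 years. The helper name parse_two_digit_year and its latest parameter are illustrative assumptions, not a standard-library API.

```python
from datetime import date, datetime

# Default POSIX pivot: 00-68 -> 20xx, 69-99 -> 19xx
print(datetime.strptime("01/01/68", "%m/%d/%y").year)  # 2068
print(datetime.strptime("01/01/69", "%m/%d/%y").year)  # 1969


def parse_two_digit_year(text, fmt, latest):
    """Parse `text` with `fmt` (which uses %y) and return a date.

    Any result later than `latest` is moved back 100 years.
    """
    d = datetime.strptime(text, fmt).date()
    if d > latest:
        # Caveat: 29 Feb 2000 has no counterpart in 1900, so replace()
        # raises ValueError for that single date; this sketch does not handle it.
        d = d.replace(year=d.year - 100)
    return d


print(parse_two_digit_year("01/01/64", "%m/%d/%y", date(2022, 4, 28)))  # 1964-01-01
print(parse_two_digit_year("02/24/05", "%m/%d/%y", date(2022, 4, 28)))  # 2005-02-24
```

Passing date.today() as latest gives the "no dates in the future" rule; a fixed reference date keeps results reproducible.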

date format in R

It doesn't look (from the documentation for %y in ?strptime) like there's any obvious option for changing the default century inferred from 2-digit years.

Since the objects returned by strptime() have class POSIXlt, though, it's a pretty simple matter to subtract 100 years from any dates after today (or after any other cutoff date you'd like to use).

# Use strptime() to create object of class POSIXlt
dd <- c("20-Sep-90", "24-Feb-05", "16-Aug-65",
        "19-Nov-56", "28-Nov-59", "19-Apr-86")
DD <- strptime(dd, '%d-%b-%y')

# Subtract 100 years from any date after today
DD$year <- ifelse(DD > Sys.time(), DD$year-100, DD$year)
DD
[1] "1990-09-20" "2005-02-24" "1965-08-16" "1956-11-19" "1959-11-28" "1986-04-19"

date import, incorrect century

Dates are stored internally as integer days, so formatting only comes into play at the time of input or output. As for input without century information, I think you are out of luck. Here's what ?strptime says about the %y format spec: "On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’."

  as.Date( "01/01/64", "%m/%d/%y", origin="1970-01-01") -100*365.25
#[1] "1964-01-01"

It might be possible to start a bar fight about programmers who allow removal of century information given that Y2K is so recent in the past.

Since the default is to assume year 00-68 is 2000-2068, it is certainly possible to create an as.Dateshift

as.Date with two-digit years

x = format(as.Date("10.10.61", "%d.%m.%y"), "19%y-%m-%d")
x = as.Date(x)
x
class(x)

