Change a Column from Birth Date to Age in R


From the comments of this blog entry, I found the age_calc function in the eeptools package. It takes care of edge cases (leap years, etc.), checks inputs and looks quite robust.

library(eeptools)
x <- as.Date(c("2011-01-01", "1996-02-29"))
# enddate defaults to Sys.Date(), so the numbers below depend on the day the code is run
age_calc(x) # default is age in months

[1] 46.73333 224.83118

age_calc(x, units = "years") # but you can set it to years

[1] 3.893151 18.731507

floor(age_calc(x, units = "years"))

[1] 3 18

For your data

yourdata$age <- floor(age_calc(yourdata$birthdate, units = "years"))

assuming you want age in integer years.

Convert date of birth to age

"11-10-1969" (month day year or day month year) is not an unambiguous date format. To get it properly converted you will need to specify the format argument to as.Date()

Note also that a 4-digit year needs a capital Y in the format string: "%d-%m-%Y" (or "%d/%m/%Y" if the separator is /). Sys.Date() already returns a Date object, so it needs no conversion and no format argument.
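To see why the format string matters, here is a minimal base R sketch that parses the same string under different assumptions. The variable names (s, dmy, mdy, wrong) are only illustrative.

```r
# The same string parsed under two different assumptions about field order
s <- "11-10-1969"
dmy <- as.Date(s, format = "%d-%m-%Y")  # day-month-year: 11 October 1969
mdy <- as.Date(s, format = "%m-%d-%Y")  # month-day-year: 10 November 1969
format(dmy)  # "1969-10-11"
format(mdy)  # "1969-11-10"

# Lowercase %y reads only two digits of the year, so "19" is taken as 2019
wrong <- as.Date(s, format = "%d-%m-%y")
format(wrong)  # "2019-10-11"
```

Neither of the first two results is an error as far as R is concerned, which is why the format has to come from knowledge of the data rather than from the parser.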

as.numeric(Sys.Date() - as.Date("11-10-1969", format="%d-%m-%Y")) / 365.25
#> [1] 52.56674

EDIT: divide by 365.25 rather than 365 to approximate leap years, per Henry's suggestion in the comments.
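Dividing by 365.25 is still an approximation, and it can be off by one on or near a birthday. The sketch below uses dates chosen purely for illustration and shows the floored result coming out one year short on an exact 37th birthday.

```r
birth <- as.Date("1978-12-31")
given <- as.Date("2015-12-31")

# given is exactly 37 calendar years after birth
thirty_seventh <- seq(birth, by = "37 years", length.out = 2)[2]
thirty_seventh == given              # TRUE

# The day-count approximation lands just under 37
days <- as.numeric(given - birth)    # 13514 days
approx_age <- floor(days / 365.25)   # 36, one year short
```

If whole-year ages must be exact, comparing calendar fields (year, month, day) is safer than dividing a day count.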

Efficient and accurate age calculation (in years, months, or weeks) in R given birth date and an arbitrary date

Ok, so I found this function in another post:

age <- function(from, to) {
  from_lt = as.POSIXlt(from)
  to_lt = as.POSIXlt(to)

  age = to_lt$year - from_lt$year

  ifelse(to_lt$mon < from_lt$mon |
           (to_lt$mon == from_lt$mon & to_lt$mday < from_lt$mday),
         age - 1, age)
}

It was posted by @Jim saying "The following function takes a vectors of Date objects and calculates the ages, correctly accounting for leap years. Seems to be a simpler solution than any of the other answers".

It is indeed simpler and it does the trick I was looking for. On average, it is actually faster than the arithmetic method (about 75% faster).

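library(microbenchmark)
library(lubridate)  # interval(), duration()
library(eeptools)   # age_calc()
library(ggplot2)    # autoplot()

# birthdate and givendate are Date vectors of equal length
mbm <- microbenchmark(
  arithmetic = (givendate - birthdate) / 365.25,
  lubridate = interval(start = birthdate, end = givendate) /
    duration(num = 1, units = "years"),
  eeptools = age_calc(dob = birthdate, enddate = givendate,
                      units = "years"),
  age = age(from = birthdate, to = givendate),
  times = 1000
)
mbm
autoplot(mbm)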


At least in my examples it makes no mistakes (and it should not in any example, since it is a straightforward function built on ifelse comparisons of year, month and day).

toy_df <- data.frame(
  birthdate = birthdate,
  givendate = givendate,
  arithmetic = as.numeric((givendate - birthdate) / 365.25),
  lubridate = interval(start = birthdate, end = givendate) /
    duration(num = 1, units = "years"),
  eeptools = age_calc(dob = birthdate, enddate = givendate,
                      units = "years"),
  age = age(from = birthdate, to = givendate)
)
toy_df[, 3:6] <- floor(toy_df[, 3:6])
toy_df

    birthdate  givendate arithmetic lubridate eeptools age
1  1978-12-30 2015-12-31         37        37       37  37
2  1978-12-31 2015-12-31         36        37       37  37
3  1979-01-01 2015-12-31         36        37       36  36
4  1962-12-30 2015-12-31         53        53       53  53
5  1962-12-31 2015-12-31         52        53       53  53
6  1963-01-01 2015-12-31         52        53       52  52
7  2000-06-16 2050-06-17         50        50       50  50
8  2000-06-17 2050-06-17         49        50       50  50
9  2000-06-18 2050-06-17         49        50       49  49
10 2007-03-18 2008-03-19          1         1        1   1
11 2007-03-19 2008-03-19          1         1        1   1
12 2007-03-20 2008-03-19          0         1        0   0
13 1968-02-29 2015-02-28         46        47       46  46
14 1968-02-29 2015-03-01         47        47       47  47
15 1968-02-29 2015-03-02         47        47       47  47

I do not consider it a complete solution, because I also wanted age in months and weeks and this function only handles years. I am posting it here anyway because it solves the problem for age in years. I will not accept it because:

  1. I will wait for @Jim to post it as an answer.
  2. I will wait to see if someone else comes up with a complete solution (efficient, accurate, and producing age in years, months or weeks as desired).
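One possible way to extend the same calendar-field comparison to months and weeks is sketched below. This is not from the original posts: the function name age_in and its units argument are hypothetical, and it assumes that a month counts as completed only once the day of the month is reached, and that weeks are whole blocks of 7 days.

```r
# Hypothetical helper: completed years, months or weeks between two Date vectors
age_in <- function(from, to, units = c("years", "months", "weeks")) {
  units <- match.arg(units)

  if (units == "weeks") {
    # Whole 7-day blocks in the day difference
    return(as.numeric(as.Date(to) - as.Date(from)) %/% 7)
  }

  from_lt <- as.POSIXlt(from)
  to_lt <- as.POSIXlt(to)

  # Calendar months elapsed, minus one if the day of month is not yet reached
  months <- 12 * (to_lt$year - from_lt$year) + (to_lt$mon - from_lt$mon)
  months <- ifelse(to_lt$mday < from_lt$mday, months - 1, months)

  if (units == "months") months else months %/% 12
}

age_in(as.Date("1968-02-29"), as.Date("2015-02-28"), "years")   # 46
age_in(as.Date("2007-03-18"), as.Date("2008-03-19"), "months")  # 12
age_in(as.Date("2007-03-18"), as.Date("2008-03-19"), "weeks")   # 52
```

Under these assumptions, someone born on the 31st does not complete a month in a shorter month until the 1st of the following month. Whether that convention is acceptable depends on the application.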

Calculating age in R from dob

First, copy and paste the function age_calc from the blog post to which you linked into your R console (or RStudio console) and hit 'Enter' to store it.

The function takes 3 arguments: dob, enddate and units. The dob argument needs to be of class Date. Units can be days, months or years. Assuming that you want years, this should add a column age to your data frame:

P4PA$age <- age_calc(as.Date(P4PA$DDN, "%m/%d/%Y"), units = "years")

P4PA
         DDN age
1  4/22/1956  60
2 12/26/1964  52
3  4/16/1963  53
4  1/28/1970  47
5  7/15/1972  44
6  1/18/1956  61

In R, how can I calculate age based on birth date using eeptools?

It looks like eeptools has an age_calc() function.

your_data <- data.frame(stringsAsFactors=FALSE,
  Born = c("1946-05-27", "1979-06-19", "1980-04-18", "1958-06-12",
           "1948-03-23", "1973-07-24", "1949-09-15", "1950-03-12",
           "1952-04-20", "1950-06-20"),
  bioguide = c("A000370", "A000371", "A000367", "A000369", "B001291",
               "B000213", "B001281", "B001271", "B001292", "B001293")
)

library(eeptools)
#> Loading required package: ggplot2

your_data$age <- eeptools::age_calc(dob = as.Date(your_data$Born),
                                    enddate = Sys.Date(),
                                    units = 'years')

your_data
#>          Born bioguide      age
#> 1  1946-05-27  A000370 73.62459
#> 2  1979-06-19  A000371 40.56158
#> 3  1980-04-18  A000367 39.73224
#> 4  1958-06-12  A000369 61.58075
#> 5  1948-03-23  B001291 71.80328
#> 6  1973-07-24  B000213 46.46569
#> 7  1949-09-15  B001281 70.32048
#> 8  1950-03-12  B001271 69.83281
#> 9  1952-04-20  B001292 67.72678
#> 10 1950-06-20  B001293 69.55884

Created on 2020-01-10 by the reprex package (v0.3.0)

More on eeptools here: https://github.com/jknowles/eeptools

Calculate age at first record for each ID

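We can use difftime to get the difference in days between the earliest service_date and dob within each ID, then divide by 365 for an approximate age in years (this ignores leap days).

library(dplyr)
d %>%
  group_by(ID) %>%
  mutate(age_first_record = as.numeric(difftime(min(service_date),
                                                dob, units = "days") / 365)) %>%
  ungroup()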

-output

# A tibble: 4 x 4
  ID    dob        service_date age_first_record
  <chr> <date>     <date>                  <dbl>
1 a     2004-04-17 2018-01-01              13.7
2 a     2004-04-17 2019-07-12              13.7
3 b     2009-04-24 2014-12-23               5.67
4 b     2009-04-24 2016-04-27               5.67
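For readers without dplyr, the same per-ID calculation can be sketched in base R with ave(). The data frame d is rebuilt here from the output shown above, and the intermediate name first_service is only illustrative.

```r
# Sample data reconstructed from the printed output
d <- data.frame(
  ID = c("a", "a", "b", "b"),
  dob = as.Date(c("2004-04-17", "2004-04-17", "2009-04-24", "2009-04-24")),
  service_date = as.Date(c("2018-01-01", "2019-07-12", "2014-12-23", "2016-04-27"))
)

# Earliest service date per ID, as a day count, repeated on every row of that ID
first_service <- ave(as.numeric(d$service_date), d$ID, FUN = min)

# Approximate age in years at the first record (365-day years, as in the answer)
d$age_first_record <- (first_service - as.numeric(d$dob)) / 365
round(d$age_first_record, 2)  # 13.72 13.72 5.67 5.67
```

Working on the numeric day counts sidesteps any question about how ave() handles the Date class, at the cost of a little readability.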

