Given a birthday column, how to calculate a person's age for the current day?
- This answer assumes the desired output is the age, in years, from the birthday until today.
- (today - bd) returns a value in total number of days.
  - Divide by pd.Timedelta(days=365.25) to get the number of years. 365.25 is used because the length of a year changes.
- See Calculate Pandas DataFrame Time Difference Between Two Columns in Hours and Minutes for additional details about timedeltas and converting to different time units.
- The original function returns a value, but the returned values aren't assigned back to the dataframe, with something like df['age (years)'] = calc_age(df.birthdate), which is why the values show up in the console.
  - Also, there is no need to use .apply. This applies the function iteratively, and is why there are multiple outputs to the console.
import pandas as pd
from datetime import date

# read the file in and convert the birthdate column to a datetime
df = pd.read_csv('vanco.csv', parse_dates=['birthdate'])

# function
def calc_age(bd: pd.Series) -> pd.Series:
    today = pd.to_datetime(date.today())  # convert today to a pandas datetime
    return (today - bd) / pd.Timedelta(days=365.25)  # divide by days to get years

# call function and assign the values to a new column in the dataframe
df['age (years)'] = calc_age(df.birthdate)

# display(df)
  firstname secondname  birthdate  age (years)
0     vanco     grizov 1983-03-16     37.48118
1     vlado   stojanov 1982-06-24     38.20671
2      goce     grizov 1985-07-18     35.14031
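To reproduce the numbers above without the CSV file, here is a minimal sketch that builds the three rows in memory and takes the reference date as an argument. The as_of parameter and the date 2020-09-07 are assumptions made for illustration (that date is consistent with the outputs shown); the function in the answer always uses today's date.

```python
import pandas as pd

def calc_age(bd: pd.Series, as_of: str) -> pd.Series:
    """Age in fractional years as of `as_of` (hypothetical parameter)."""
    ref = pd.Timestamp(as_of)
    # Timestamp - datetime Series gives a timedelta Series;
    # dividing by a Timedelta turns it into plain floats
    return (ref - bd) / pd.Timedelta(days=365.25)

# the three rows from the example, typed in by hand instead of read from vanco.csv
df = pd.DataFrame({
    "firstname": ["vanco", "vlado", "goce"],
    "secondname": ["grizov", "stojanov", "grizov"],
    "birthdate": pd.to_datetime(["1983-03-16", "1982-06-24", "1985-07-18"]),
})

df["age (years)"] = calc_age(df["birthdate"], as_of="2020-09-07")
print(df)
```

Passing the reference date in, rather than calling date.today() inside the function, also makes the function easier to test.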
Alternatively
- Use the dateutil module, which has a more feature rich selection of methods.
  - relativedelta will compensate for leap years.
- Use .years to extract only the year component from the relativedelta object.
# updated function with relativedelta
from dateutil.relativedelta import relativedelta

def calc_age(bd: pd.Series):
    today = pd.to_datetime(date.today())
    return bd.apply(lambda x: relativedelta(today, x).years)

# function call
df['age (years)'] = calc_age(df.birthdate)

# display(df)
  firstname secondname  birthdate  age (years)
0     vanco     grizov 1983-03-16           37
1     vlado   stojanov 1982-06-24           38
2      goce     grizov 1985-07-18           35

# the output if .years is removed from the calc_age function
  firstname secondname  birthdate                                    age (years)
0     vanco     grizov 1983-03-16  relativedelta(years=+37, months=+5, days=+22)
1     vlado   stojanov 1982-06-24  relativedelta(years=+38, months=+2, days=+14)
2      goce     grizov 1985-07-18  relativedelta(years=+35, months=+1, days=+20)
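As a quick check of what relativedelta returns for a single pair of dates, the sketch below uses the first row of the table. The reference date 2020-09-07 is an assumption, chosen because it is consistent with the outputs shown above.

```python
from datetime import date
from dateutil.relativedelta import relativedelta

born = date(1983, 3, 16)
as_of = date(2020, 9, 7)  # assumed reference date, not "today"

# calendar-aware difference: whole years, then leftover months, then leftover days
delta = relativedelta(as_of, born)

print(delta)        # the full relativedelta object
print(delta.years)  # only the whole-year component
```

Because the difference is counted in calendar units rather than in days, the year component does not depend on how many leap days fall in the interval.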
Calculating Age from Birthdate TypeError: strptime() argument 1 must be str, not float
You are converting in and out of dates / datetimes a few too many times.
final_df['D_O_B__c'] = pd.to_datetime(final_df['D_O_B__c'], format = "%Y-%m-%d", errors = 'coerce')
Once you've run this line, the column has a pandas datetime dtype, so the following line is unnecessary:
final_df['D_O_B__c'] = pd.to_datetime(final_df['D_O_B__c']).dt.date
In the calculate_age function, you don't need to use datetime.strptime since the supplied object will already be a datetime, so your function can be simplified to
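from datetime import date

def calculate_age(born):
    today = date.today()
    # subtract 1 if this year's birthday has not happened yet (True counts as 1)
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))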
And now this line will run just fine, returning NaN for rows whose original timestamp strings were bad or null:
final_df['Age'] = final_df['D_O_B__c'].apply(calculate_age)
To fill those values in with the median age, you can just do
final_df.loc[final_df['Age'].isnull(), 'Age'] = final_df['Age'].median()
which calculates the median 'Age' value for non-null rows, and then sets all null rows to that value.
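The whole sequence (coerce, compute the age, fill with the median) can be sketched end to end on a few made-up rows. The sample values and the today parameter with its fixed default are assumptions added so the result is repeatable; the answer's function calls date.today() instead.

```python
from datetime import date
import pandas as pd

def calculate_age(born, today=date(2021, 1, 1)):
    # `today` is a hypothetical parameter; a fixed default keeps the output stable
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

# made-up sample: two valid dates, one malformed string, one missing value
final_df = pd.DataFrame({"D_O_B__c": ["1990-06-15", "not a date", None, "2000-12-31"]})

# bad or missing strings become NaT instead of raising
final_df["D_O_B__c"] = pd.to_datetime(final_df["D_O_B__c"], format="%Y-%m-%d", errors="coerce")

# NaT rows produce NaN ages because NaT.year is NaN
final_df["Age"] = final_df["D_O_B__c"].apply(calculate_age)

# median() skips NaN, so it is computed from the valid rows only
final_df.loc[final_df["Age"].isnull(), "Age"] = final_df["Age"].median()
print(final_df)
```

With these rows the valid ages are 30 and 20, so both unparseable rows are filled with 25.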
Calculating age in python
So close!
You need to convert the string into a datetime object before you can do calculations on it - see datetime.datetime.strptime().
For your date input, you need to do:
datetime.strptime(input_text, "%d %m %Y")
#!/usr/bin/env python3
from datetime import datetime, date

print("Your date of birth (dd mm yyyy)")
date_of_birth = datetime.strptime(input("--->"), "%d %m %Y")

def calculate_age(born):
    today = date.today()
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

age = calculate_age(date_of_birth)
print(age)
PS: I urge you to use a sensible order of input - dd mm yyyy or the ISO standard yyyy mm dd.
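If both input orders mentioned above should be accepted, one possible approach is to try each format string in turn. The parse_birthdate helper below is hypothetical, not part of the answer, and assumes the fields are separated by single spaces.

```python
from datetime import datetime, date

def parse_birthdate(text: str) -> date:
    """Try 'dd mm yyyy' first, then 'yyyy mm dd' (hypothetical helper)."""
    for fmt in ("%d %m %Y", "%Y %m %d"):
        try:
            # strptime raises ValueError when the text does not match the format
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")

print(parse_birthdate("16 03 1983"))
print(parse_birthdate("1983 03 16"))
```

Both calls yield the same date object, which can then be passed to calculate_age.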
How to calculate age from list of dates in python
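You will need to parse the dates appropriately, so research the datetime date functions for your input format. The version below assumes each birthdate is an ISO 'YYYY-MM-DD' string, and that name and birthdate are the lists from the question.
from datetime import date

today = date.today()
for n, bday in zip(name, birthdate):
    born = date.fromisoformat(bday)  # assumption: ISO strings; adjust to your format
    # subtracting two dates gives a timedelta, which has no .years, so compare calendar fields
    age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    print("Student", n, "age is", age, "years old.")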
How to get a person's age from his date of birth in Odoo?
Dates are complicated. If you subtract two dates from each other (assuming datetime.date) you get a timedelta consisting of a number of days. Because of leap years you can't reliably calculate number of years from that. By dividing by 365.25, you get a number of years that's correct in most cases. But most cases is usually not acceptable.
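One concrete case where the 365.25 division gives the wrong whole-year age is the birthday itself. The dates in this sketch are chosen for illustration only.

```python
from datetime import date

born = date(2000, 5, 11)
on_birthday = date(2018, 5, 11)  # the person's 18th birthday

days = (on_birthday - born).days   # only 4 leap days fall in these 18 years
approx = int(days / 365.25)        # truncating the quotient lands one year short

# calendar-based age: year difference, minus 1 if the birthday has not occurred yet
exact = on_birthday.year - born.year - (
    (on_birthday.month, on_birthday.day) < (born.month, born.day)
)

print(days, approx, exact)  # 6574 17 18
```

The interval contains 4 leap days rather than the 4.5 that 365.25 assumes, so the quotient falls just under 18.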
Instead, you should calculate year difference, offset by -1 if the person has not had their birthday in the current year.
from datetime import date

def get_age(date_of_birth: date) -> int:
    today = date.today()
    offset = int(date_of_birth.replace(year=today.year) > today)  # int(True) == 1, int(False) == 0
    return today.year - date_of_birth.year - offset
Tests
# On 2020-05-11
get_age(date(2020, 5, 11))
>>> 0
get_age(date(2020, 5, 12))
>>> -1
get_age(date(2000, 5, 11))
>>> 20
get_age(date(2000, 5, 12))
>>> 19
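One edge case worth knowing about: date.replace(year=...) raises ValueError for a 29 February birthday when the target year is not a leap year. A possible variant, sketched here under the assumption that such a birthday counts as reached on 1 March in non-leap years, compares (month, day) tuples instead. The name get_age_on and its today parameter are hypothetical; passing the date in also makes the tests independent of the day they are run.

```python
from datetime import date

def get_age_on(date_of_birth: date, today: date) -> int:
    """Whole years from date_of_birth to today (hypothetical variant of get_age)."""
    # True when this year's birthday is still ahead of `today`
    birthday_pending = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - int(birthday_pending)

leap_baby = date(2000, 2, 29)

# replace() cannot construct 29 February in a non-leap year
try:
    leap_baby.replace(year=2021)
except ValueError as exc:
    print("replace failed:", exc)

print(get_age_on(leap_baby, date(2021, 2, 28)))  # 20
print(get_age_on(leap_baby, date(2021, 3, 1)))   # 21
```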