Python Pandas Extract Year from Datetime: Df['Year'] = Df['Date'].Year Is Not Working

python pandas extract year from datetime: df['year'] = df['date'].year is not working

If you're running a recent-ish version of pandas then you can use the datetime accessor dt to access the datetime components:

In [6]:

df['date'] = pd.to_datetime(df['date'])
df['year'], df['month'] = df['date'].dt.year, df['date'].dt.month
df
Out[6]:
date Count year month
0 2010-06-30 525 2010 6
1 2010-07-30 136 2010 7
2 2010-08-31 125 2010 8
3 2010-09-30 84 2010 9
4 2010-10-29 4469 2010 10

EDIT

It looks like you're running an older version of pandas in which case the following would work:

In [18]:

df['date'] = pd.to_datetime(df['date'])
df['year'], df['month'] = df['date'].apply(lambda x: x.year), df['date'].apply(lambda x: x.month)
df
Out[18]:
date Count year month
0 2010-06-30 525 2010 6
1 2010-07-30 136 2010 7
2 2010-08-31 125 2010 8
3 2010-09-30 84 2010 9
4 2010-10-29 4469 2010 10

Regarding why it didn't parse this into a datetime in read_csv you need to pass the ordinal position of your column ([0]) because when True it tries to parse columns [1,2,3] see the docs

In [20]:

t="""date Count
6/30/2010 525
7/30/2010 136
8/31/2010 125
9/30/2010 84
10/29/2010 4469"""
df = pd.read_csv(io.StringIO(t), sep='\s+', parse_dates=[0])
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 2 columns):
date 5 non-null datetime64[ns]
Count 5 non-null int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 120.0 bytes

So if you pass param parse_dates=[0] to read_csv there shouldn't be any need to call to_datetime on the 'date' column after loading.

Extracting just Month and Year separately from Pandas Datetime column

If you want new columns showing year and month separately you can do this:

df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month

or...

df['year'] = df['ArrivalDate'].dt.year
df['month'] = df['ArrivalDate'].dt.month

Then you can combine them or work with them just as they are.

not able to extract year from a dataframe column containing dates

You are passing the whole column to the datetime.strptime() function.

What you want is to.apply() a function to each value of the Date column to get its year, e.g.:

def get_year(x):
return datetime.strptime(x, '%y-%m-%d').year

df['year'] = df.Date.apply(get_year)

Pandas extract week of year and year from date

Try:

df['date'] = pd.to_datetime(df['date'])
df['week_of_year'] = df['date'].dt.weekofyear
df['year']=(df['date']+pd.to_timedelta(6-df['date'].dt.weekday, unit='d')).dt.year

Outputs:

        date  week_of_year  year
0 2018-12-31 1 2019
1 2019-01-01 1 2019
2 2019-12-31 1 2020
3 2020-01-01 1 2020

Few things - generally avoid .apply(..).

For datetime columns you can just interact with the date through df[col].dt variable.

Then to get the last day of the week just add to date 6-weekday where weekday is between 0 (Monday) and 6 to the date

Can you extract both year AND month from date in Pandas

You can use to_period

df['month_year'] = df['date'].dt.to_period('M')


Related Topics



Leave a reply



Submit