python pandas extract year from datetime: df['year'] = df['date'].year is not working
If you're running pandas 0.15.0 or later, you can use the datetime accessor dt
to access the datetime components of a datetime column:
In [6]:
df['date'] = pd.to_datetime(df['date'])
df['year'], df['month'] = df['date'].dt.year, df['date'].dt.month
df
Out[6]:
date Count year month
0 2010-06-30 525 2010 6
1 2010-07-30 136 2010 7
2 2010-08-31 125 2010 8
3 2010-09-30 84 2010 9
4 2010-10-29 4469 2010 10
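As a minimal sketch of why the original df['date'].year fails while the dt accessor works: the sample frame below is hypothetical (two rows shaped like the output above), and direct_access_failed is just an illustrative flag name.

```python
import pandas as pd

# Hypothetical sample data resembling the frame above
df = pd.DataFrame({"date": ["2010-06-30", "2010-07-30"], "Count": [525, 136]})
df["date"] = pd.to_datetime(df["date"])

# A Series has no .year attribute of its own, so this raises AttributeError
try:
    df["date"].year
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# The .dt accessor exposes the datetime components element-wise
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
```

The year attribute exists on each individual Timestamp, not on the Series holding them, which is why the accessor (or apply) is needed.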
EDIT
It looks like you're running an older version of pandas in which case the following would work:
In [18]:
df['date'] = pd.to_datetime(df['date'])
df['year'], df['month'] = df['date'].apply(lambda x: x.year), df['date'].apply(lambda x: x.month)
df
Out[18]:
date Count year month
0 2010-06-30 525 2010 6
1 2010-07-30 136 2010 7
2 2010-08-31 125 2010 8
3 2010-09-30 84 2010 9
4 2010-10-29 4469 2010 10
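Regarding why read_csv didn't parse this column into a datetime:
you need to pass the ordinal position of your column (parse_dates=[0]),
because parse_dates=True only tries to parse the index, whereas a list such as [1,2,3]
tries to parse each of those columns as a separate date column; see the docs.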
In [20]:
import io
t="""date Count
6/30/2010 525
7/30/2010 136
8/31/2010 125
9/30/2010 84
10/29/2010 4469"""
# raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(io.StringIO(t), sep=r'\s+', parse_dates=[0])
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 2 columns):
date 5 non-null datetime64[ns]
Count 5 non-null int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 120.0 bytes
So if you pass parse_dates=[0] to read_csv, there shouldn't be any need to call to_datetime on the 'date' column after loading.
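To illustrate that point, here is a small sketch that goes straight from read_csv to the dt accessor with no to_datetime call in between. The inline text is a shortened, hypothetical copy of the sample data above.

```python
import io
import pandas as pd

# Shortened copy of the sample data, kept inline so the sketch is self-contained
raw = """date Count
6/30/2010 525
7/30/2010 136
10/29/2010 4469"""

# parse_dates=[0] asks read_csv to parse the first column as dates
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", parse_dates=[0])

# The column is already datetime64, so .dt works immediately
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
```

If the strings in a real file cannot be parsed as dates, read_csv may leave the column as object dtype, so checking df.info() as above is still worthwhile.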
Extracting just Month and Year separately from Pandas Datetime column
If you want new columns showing year and month separately you can do this:
df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month
or...
df['year'] = df['ArrivalDate'].dt.year
df['month'] = df['ArrivalDate'].dt.month
Then you can combine them or work with them just as they are.
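One possible way to combine the two columns, sketched under assumptions: the data is made up, only the column name ArrivalDate comes from the answer, and year_month and counts are illustrative names.

```python
import pandas as pd

# Hypothetical data; only the column name ArrivalDate comes from the answer above
df = pd.DataFrame(
    {"ArrivalDate": pd.to_datetime(["2012-12-31", "2013-01-15", "2013-01-20"])}
)
df["year"] = df["ArrivalDate"].dt.year
df["month"] = df["ArrivalDate"].dt.month

# Combine into a sortable "YYYY-MM" label (month zero-padded to two digits)
df["year_month"] = df["year"].astype(str) + "-" + df["month"].astype(str).str.zfill(2)

# Or use the two columns directly as grouping keys
counts = df.groupby(["year", "month"]).size()
```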
not able to extract year from a dataframe column containing dates
You are passing the whole column to the datetime.strptime() function.
What you want is to .apply() a function to each value of the Date column to get its year, e.g.:
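from datetime import datetime

def get_year(x):
    # %y matches a two-digit year; use %Y if the strings carry four-digit years
    return datetime.strptime(x, '%y-%m-%d').year

df['year'] = df.Date.apply(get_year)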
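A vectorized alternative, offered as a sketch rather than a replacement for the answer above: pd.to_datetime accepts the same format string and parses the whole column at once. The sample values are hypothetical and assume two-digit years, matching the '%y-%m-%d' format used above.

```python
import pandas as pd

# Hypothetical string dates with two-digit years, matching '%y-%m-%d'
df = pd.DataFrame({"Date": ["19-03-05", "20-11-30"]})

# Parse the whole column in one call, then read the year via the .dt accessor
df["year"] = pd.to_datetime(df["Date"], format="%y-%m-%d").dt.year
```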
Pandas extract week of year and year from date
Try:
df['date'] = pd.to_datetime(df['date'])
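# dt.weekofyear was removed in pandas 2.0; isocalendar() is available from pandas 1.1
df['week_of_year'] = df['date'].dt.isocalendar().week
df['year'] = (df['date'] + pd.to_timedelta(6 - df['date'].dt.weekday, unit='D')).dt.year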
Outputs:
date week_of_year year
0 2018-12-31 1 2019
1 2019-01-01 1 2019
2 2019-12-31 1 2020
3 2020-01-01 1 2020
A few things: generally avoid .apply(..).
For datetime columns you can interact with the date components directly through the df[col].dt accessor.
Then, to get the last day of the week, add 6 - weekday days to the date, where weekday runs from 0 (Monday) to 6 (Sunday).
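One caveat, sketched below with hypothetical dates: the year of the week's closing Sunday is not always the ISO year that belongs to the ISO week number. In pandas 1.1 or later, dt.isocalendar() returns the matching year and week together; the column names iso_year, iso_week and week_end_year are illustrative.

```python
import pandas as pd

# Hypothetical dates around year boundaries
df = pd.DataFrame(
    {"date": pd.to_datetime(["2018-12-31", "2019-12-31", "2021-01-01"])}
)

# isocalendar() returns a DataFrame with columns year, week and day
iso = df["date"].dt.isocalendar()
df["iso_year"] = iso["year"]
df["iso_week"] = iso["week"]

# Year of the Sunday that closes each week, as in the answer above
df["week_end_year"] = (
    df["date"] + pd.to_timedelta(6 - df["date"].dt.weekday, unit="D")
).dt.year
```

For 2021-01-01 (a Friday) the ISO calendar gives week 53 of 2020, while the week-ending Sunday falls in 2021, so which year to store depends on how the week numbers will be used.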
Can you extract both year AND month from date in Pandas
You can use to_period:
df['month_year'] = df['date'].dt.to_period('M')
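A short sketch of what the resulting period column looks like and how it can be used afterwards. The sample dates are hypothetical and month_year_str is an illustrative name.

```python
import pandas as pd

# Hypothetical sample dates
df = pd.DataFrame(
    {"date": pd.to_datetime(["2010-06-30", "2010-07-30", "2010-10-29"])}
)

# Monthly periods keep year and month together in a single column
df["month_year"] = df["date"].dt.to_period("M")

# Periods convert to "YYYY-MM" strings when a plain label is needed
df["month_year_str"] = df["month_year"].astype(str)

# The period column still exposes year and month through .dt
years = df["month_year"].dt.year
months = df["month_year"].dt.month
```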