Combine year, month and day in Python to create a date
Solution
You could use datetime.datetime along with .apply().
import datetime
d = datetime.datetime(2020, 5, 17)
date = d.date()
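The snippet above builds a single date, so here is a minimal sketch of how the .apply() part might look on a DataFrame. The sample data, the column names year, month and day, and the helper name row_to_date are assumptions made for illustration, not part of the original question.

```python
import datetime

import pandas as pd

# Hypothetical sample data; column names are assumed for illustration
df = pd.DataFrame({"year": [2020, 2021], "month": [5, 12], "day": [17, 31]})


def row_to_date(row):
    # Build one datetime.date from the three integer fields of a row
    return datetime.date(int(row["year"]), int(row["month"]), int(row["day"]))


# axis=1 passes each row to the function
df["date"] = df.apply(row_to_date, axis=1)
print(df)
```

Row-wise .apply() runs a Python function per row, so for large frames the vectorized pd.to_datetime approach is usually the faster choice.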
For pandas.to_datetime(df)
It looks like your code is fine. See the pandas.to_datetime documentation and "How to convert columns into one datetime column in pandas?".
import pandas as pd
df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'day': [4, 5]})
pd.to_datetime(df[["year", "month", "day"]])
Output:
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
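If some rows might hold an impossible combination (for example February 30), pd.to_datetime raises an error by default. A small sketch of one way to handle that, assuming you would rather get NaT for bad rows, is the errors="coerce" argument. The sample data here is made up for illustration.

```python
import pandas as pd

# Hypothetical data: the second row (2016-02-30) is not a valid date
df = pd.DataFrame({"year": [2015, 2016], "month": [2, 2], "day": [4, 30]})

# errors="coerce" turns rows that cannot be assembled into NaT
dates = pd.to_datetime(df[["year", "month", "day"]], errors="coerce")
print(dates)
```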
What if your YEAR, MONTH and DAY columns have different headers?
Let's say your YEAR, MONTH and DAY columns are labeled yy, mm and dd respectively, and you prefer to keep your column names unchanged. In that case you could do it as follows.
import pandas as pd
df = pd.DataFrame({'yy': [2015, 2016],
'mm': [2, 3],
'dd': [4, 5]})
df2 = df[["yy", "mm", "dd"]].copy()
df2.columns = ["year", "month", "day"]
pd.to_datetime(df2)
Output:
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
How to combine year, month, and day columns to single datetime column?
There is an easier way:
In [250]: df['Date']=pd.to_datetime(df[['year','month','day']])
In [251]: df
Out[251]:
id lat lon year month day Date
0 381 53.3066 -0.54649 2004 1 2 2004-01-02
1 381 53.3066 -0.54649 2004 1 3 2004-01-03
2 381 53.3066 -0.54649 2004 1 4 2004-01-04
From the docs:
Assembling a datetime from multiple columns of a DataFrame. The keys can be common abbreviations like [year, month, day, minute, second, ms, us, ns]) or plurals of the same
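To illustrate the "plurals of the same" part of that quote, here is a short sketch with made-up data whose columns use the plural key names, including time-of-day columns.

```python
import pandas as pd

# Hypothetical frame using plural key names
df = pd.DataFrame({
    "years": [2015],
    "months": [2],
    "days": [4],
    "hours": [6],
    "minutes": [30],
})

# Every column is recognised as a datetime component
stamps = pd.to_datetime(df)
print(stamps)
```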
Combine month and year columns to create date column
You get the "Invalid argument, not a string or column" error because the argument 1 in your concat_ws('/', df.month, 1, df.year) is neither a column nor a string (a string would be read as the name of a column). You can correct it by using the built-in lit function, as follows:
from pyspark.sql import functions as F
df = df.select(F.concat_ws('/', df.month, F.lit(1), df.year).alias('Month'), df["*"])
Python - Combining Month, Day, Year into Date Column
pd.to_datetime can automatically parse dates from multiple columns if they are named properly ('year', 'month', 'day', 'hour', 'minute'):
pd.to_datetime(df[['YY', 'MM', 'DD']].rename(columns={'YY': 'year', 'MM': 'month', 'DD': 'day'}))
Output:
1 2017-01-02
2 2017-01-02
3 2017-01-02
4 2017-01-02
5 2017-01-02
...
2427 2017-03-05
2428 2017-03-05
2429 2017-03-05
2430 2017-03-05
You can also add hours and minutes:
pd.to_datetime(df[['YY', 'MM', 'DD', 'hh', 'mm']].rename(
columns={'YY': 'year', 'MM': 'month', 'DD': 'day',
'hh': 'hour', 'mm': 'minute'}))
#1 2017-01-02 06:00:00
#2 2017-01-02 06:20:00
#...
#2429 2017-03-05 01:40:00
#2430 2017-03-05 02:00:00
Cleanly combine year and month columns to single date column with pandas
Option 1
Pass a dataframe slice with 3 columns (YEAR, MONTH, and DAY) to pd.to_datetime. Since the data has no DAY column, assign(DAY=1) adds one that points at the first of each month.
df['DATE'] = pd.to_datetime(df[['YEAR', 'MONTH']].assign(DAY=1))
df
ID MONTH YEAR DATE
0 A 1 2017 2017-01-01
1 B 2 2017 2017-02-01
2 C 3 2017 2017-03-01
3 D 4 2017 2017-04-01
4 E 5 2017 2017-05-01
5 F 6 2017 2017-06-01
Option 2
String concatenation, with pd.to_datetime.
pd.to_datetime(df.YEAR.astype(str) + '/' + df.MONTH.astype(str) + '/01')
0 2017-01-01
1 2017-02-01
2 2017-03-01
3 2017-04-01
4 2017-05-01
5 2017-06-01
dtype: datetime64[ns]
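A variation on Option 2, sketched under the assumption that you want to state the expected layout explicitly rather than rely on format inference: zero-pad the month and pass format to pd.to_datetime. The sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical data with integer YEAR and MONTH columns
df = pd.DataFrame({"YEAR": [2017, 2017], "MONTH": [1, 12]})

# Build strings such as "2017-01", then parse them with an explicit format
text = df["YEAR"].astype(str) + "-" + df["MONTH"].astype(str).str.zfill(2)
dates = pd.to_datetime(text, format="%Y-%m")
print(dates)
```

With no day in the string, each parsed value falls on the first of the month.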
Combining year, month, week number and day to date
The datetime module lets you do this. This discussion explains how to get the date from the week number.
Then you can define a function that computes the date and apply it to your dataframe.
Here is the code:
# Import modules
import datetime
import pandas as pd
# Your data
df = pd.DataFrame([
    [2019, 8, 29, "Fri"],
    [2019, 8, 31, "Sun"],
    [2019, 8, 29, "Tues"]],
    columns=["year", "month", "week_num", "day"])
# A value per day (offset in days from Monday)
val_day = {"Mon": 0, "Tues": 1, "Weds": 2, "Thurs": 3,
           "Fri": 4, "Sat": 5, "Sun": 6}
# Get the date from the year, the week number and the day
def getDate(row):
    # Create the string for the Monday of that week (%w value 1 is Monday)
    str_date = "{0}-W{1}-1".format(row.year,
                                   row.week_num - 1)
    # Parse the Monday, then add the offset of the requested day
    date = datetime.datetime.strptime(
        str_date, "%Y-W%W-%w") + datetime.timedelta(days=val_day[row.day])
    # Update date field
    row["date"] = date.strftime("%Y-%m-%d")
    return row
# Apply the function to each row
df = df.apply(getDate, axis=1)
print(df)
#    year  month  week_num   day        date
# 0  2019      8        29   Fri  2019-07-19
# 1  2019      8        31   Sun  2019-08-04
# 2  2019      8        29  Tues  2019-07-16
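If the week numbers in your data follow the ISO 8601 convention (an assumption, since the question does not say which numbering is used), a possible alternative on Python 3.8+ is datetime.date.fromisocalendar, which avoids the string round trip. The helper name iso_week_to_date and the iso_day mapping are illustrative.

```python
import datetime

import pandas as pd

df = pd.DataFrame(
    [[2019, 29, "Fri"], [2019, 31, "Sun"], [2019, 29, "Tues"]],
    columns=["year", "week_num", "day"],
)

# ISO weekdays run from 1 (Monday) to 7 (Sunday)
iso_day = {"Mon": 1, "Tues": 2, "Weds": 3, "Thurs": 4,
           "Fri": 5, "Sat": 6, "Sun": 7}


def iso_week_to_date(row):
    # Assumes week_num is an ISO 8601 week number
    return datetime.date.fromisocalendar(
        int(row["year"]), int(row["week_num"]), iso_day[row["day"]]
    )


df["date"] = df.apply(iso_week_to_date, axis=1)
print(df)
```

For these 2019 sample rows the ISO interpretation gives the same dates as the function above, but the two numbering schemes can disagree in other years, so check which one your data uses.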
Combine month name and year in a column pandas python
One solution is to convert to datetimes and then change the format with Series.dt.to_period or Series.dt.strftime:
df['Month Name-Year']=pd.to_datetime(df['Month Name']+df['Year'].astype(str),format='%b%Y')
# for monthly periods
df['Month Name-Year1'] = df['Month Name-Year'].dt.to_period('M')
#for 2010-02 format
df['Month Name-Year2'] = df['Month Name-Year'].dt.strftime('%Y-%m')
The simplest solution skips the datetime conversion: convert the years to strings and join the two columns with -:
#format 2010-Feb
df['Month Name-Year3'] = df['Year'].astype(str) + '-' + df['Month Name']
...which gives the same result as converting to datetimes and then formatting them as custom strings:
#format 2010-Feb
df['Month Name-Year31'] = df['Month Name-Year'].dt.strftime('%Y-%b')
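For reference, here is a self-contained sketch of the steps above on made-up data. It assumes the Month Name column holds English three-letter abbreviations and that the process runs under an English or C locale, since %b is locale-dependent. The output column names are illustrative.

```python
import pandas as pd

# Hypothetical data with abbreviated English month names
df = pd.DataFrame({"Month Name": ["Feb", "Mar"], "Year": [2010, 2011]})

# Parse strings such as "Feb2010" into datetimes
parsed = pd.to_datetime(df["Month Name"] + df["Year"].astype(str), format="%b%Y")

df["period"] = parsed.dt.to_period("M")          # monthly period, e.g. 2010-02
df["year_month"] = parsed.dt.strftime("%Y-%m")   # string, e.g. "2010-02"
df["year_abbr"] = parsed.dt.strftime("%Y-%b")    # string, e.g. "2010-Feb"
print(df)
```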