Combine Year, Month and Day in Python to Create a Date

Combine year, month and day in Python to create a date

Solution

You could use datetime.datetime along with .apply().

import datetime

d = datetime.datetime(2020, 5, 17)
date = d.date()
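
The snippet above builds a single date and does not show the .apply() part. As a minimal sketch, assuming a DataFrame with integer columns named year, month and day (the sample frame and the helper row_to_date are illustrative, not from the original answer), the row-wise version could look like this:

```python
import datetime

import pandas as pd

# Hypothetical sample data; the column names are an assumption.
df = pd.DataFrame({"year": [2015, 2016], "month": [2, 3], "day": [4, 5]})


def row_to_date(row):
    # int() converts NumPy integers to plain Python ints for datetime.date
    return datetime.date(int(row["year"]), int(row["month"]), int(row["day"]))


# axis=1 calls the helper once per row
df["date"] = df.apply(row_to_date, axis=1)
print(df["date"].tolist())
```

Row-wise apply is slower than pd.to_datetime on large frames, so it fits best when each row needs custom logic.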

For pandas.to_datetime(df)

It looks like your code is fine. See the pandas.to_datetime documentation and the related question "How to convert columns into one datetime column in pandas?".

import pandas as pd

df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'day': [4, 5]})
pd.to_datetime(df[["year", "month", "day"]])

Output:

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
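
When dates are assembled from columns, an impossible combination such as 30 February raises an error by default. A small sketch, assuming you would rather get NaT for such rows, passes errors="coerce" (the sample frame is made up for illustration):

```python
import pandas as pd

# Hypothetical data: the second row (2016-02-30) is not a real date.
df = pd.DataFrame({"year": [2015, 2016], "month": [2, 2], "day": [4, 30]})

# errors="coerce" turns rows that cannot be assembled into NaT
dates = pd.to_datetime(df[["year", "month", "day"]], errors="coerce")
print(dates)
```

The NaT rows can then be inspected or dropped with dates.isna().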

What if your YEAR, MONTH and DAY columns have different headers?

Suppose your year, month and day columns are labeled yy, mm and dd, and you want to keep those names unchanged. In that case, copy the three columns, rename the copy and pass it to pd.to_datetime:

import pandas as pd

df = pd.DataFrame({'yy': [2015, 2016],
                   'mm': [2, 3],
                   'dd': [4, 5]})
df2 = df[["yy", "mm", "dd"]].copy()
df2.columns = ["year", "month", "day"]
pd.to_datetime(df2)

Output:

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]

How to combine year, month, and day columns to single datetime column?

There is an easier way:

In [250]: df['Date']=pd.to_datetime(df[['year','month','day']])

In [251]: df
Out[251]:
    id      lat      lon  year  month  day       Date
0  381  53.3066 -0.54649  2004      1    2 2004-01-02
1  381  53.3066 -0.54649  2004      1    3 2004-01-03
2  381  53.3066 -0.54649  2004      1    4 2004-01-04

From the docs:

Assembling a datetime from multiple columns of a DataFrame. The keys
can be common abbreviations like ['year', 'month', 'day', 'minute',
'second', 'ms', 'us', 'ns'] or plurals of the same.
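
To illustrate the "plurals of the same" part of that note, here is a short sketch with a made-up one-row frame whose columns use the plural names; the column choice (including hours) is an assumption for the example:

```python
import pandas as pd

# Hypothetical frame using plural unit names as column labels
df = pd.DataFrame({"years": [2015], "months": [2], "days": [4], "hours": [10]})

# The whole frame is passed because every column is a recognized unit
stamps = pd.to_datetime(df)
print(stamps)
```

Columns that are not recognized units must be excluded before the call, which is why the other answers select only the date columns.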

Combine month and year columns to create date column

You get the error "Invalid argument, not a string or column" because the literal 1 in your concat_ws('/', df.month, 1, df.year) is neither a column nor a string (a string would be interpreted as the name of a column). You can correct it by wrapping the literal in the built-in lit function, as follows:

from pyspark.sql import functions as F

df = df.select(F.concat_ws('/', df.month, F.lit(1), df.year).alias('Month'), df["*"])

Python - Combining Month, Day, Year into Date Column

pd.to_datetime can automatically assemble dates from multiple columns if they are named properly ('year', 'month', 'day', 'hour', 'minute'), so rename the columns first:

pd.to_datetime(df[['YY', 'MM', 'DD']].rename(columns={'YY': 'year', 'MM': 'month', 'DD': 'day'}))

Output:

1      2017-01-02
2      2017-01-02
3      2017-01-02
4      2017-01-02
5      2017-01-02
...
2427   2017-03-05
2428   2017-03-05
2429   2017-03-05
2430   2017-03-05

You can also add hours and minutes:

pd.to_datetime(df[['YY', 'MM', 'DD', 'hh', 'mm']].rename(
    columns={'YY': 'year', 'MM': 'month', 'DD': 'day',
             'hh': 'hour', 'mm': 'minute'}))
#1 2017-01-02 06:00:00
#2 2017-01-02 06:20:00
#...
#2429 2017-03-05 01:40:00
#2430 2017-03-05 02:00:00

Cleanly combine year and month columns to single date column with pandas

Option 1

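Pass a DataFrame with three columns (YEAR, MONTH and DAY) to pd.to_datetime. Here the DAY column is added on the fly with assign(DAY=1), so every date falls on the first of the month.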

df['DATE'] = pd.to_datetime(df[['YEAR', 'MONTH']].assign(DAY=1))
df

  ID  MONTH  YEAR       DATE
0  A      1  2017 2017-01-01
1  B      2  2017 2017-02-01
2  C      3  2017 2017-03-01
3  D      4  2017 2017-04-01
4  E      5  2017 2017-05-01
5  F      6  2017 2017-06-01

Option 2

String concatenation, with pd.to_datetime.

pd.to_datetime(df.YEAR.astype(str) + '/' + df.MONTH.astype(str) + '/01')

0   2017-01-01
1   2017-02-01
2   2017-03-01
3   2017-04-01
4   2017-05-01
5   2017-06-01
dtype: datetime64[ns]
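
With string concatenation, the parser has to infer the layout of the strings. A hedged variant, assuming the same YEAR and MONTH column names and a made-up two-row frame, zero-pads the month and states the format explicitly so nothing is left to inference:

```python
import pandas as pd

# Hypothetical sample with the YEAR and MONTH columns used above
df = pd.DataFrame({"YEAR": [2017, 2017], "MONTH": [1, 2]})

# Zero-pad the month so every string has the same YYYY-MM-DD shape
month_str = df["MONTH"].astype(str).str.zfill(2)
date_strings = df["YEAR"].astype(str) + "-" + month_str + "-01"

# An explicit format avoids ambiguous day/month guessing
df["DATE"] = pd.to_datetime(date_strings, format="%Y-%m-%d")
print(df)
```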

Combining year, month, week number and day to date

The datetime module makes this possible. A related discussion explains how to get a date from a week number.

Then you can define a function that computes the date and apply it to your DataFrame.

Here is the code:

# Import modules
import datetime

import pandas as pd

# Your data
df = pd.DataFrame([
    [2019, 8, 29, "Fri"],
    [2019, 8, 31, "Sun"],
    [2019, 8, 29, "Tues"]],
    columns=["year", "month", "week_num", "day"])

# Offset in days from Monday for each day label
val_day = {"Mon": 0, "Tues": 1, "Weds": 2, "Thurs": 3,
           "Fri": 4, "Sat": 5, "Sun": 6}


# Get the date from the year, the week number and the day
def getDate(row):
    # Build a string for the Monday (%w = 1) of week week_num - 1,
    # read as %W (Monday-based week of the year)
    str_date = "{0}-W{1}-1".format(row.year, row.week_num - 1)
    print(str_date)
    # Parse the Monday, then shift to the requested weekday
    date = datetime.datetime.strptime(
        str_date, "%Y-W%W-%w") + datetime.timedelta(days=val_day[row.day])
    # Update date field
    row["date"] = date.strftime("%Y-%m-%d")
    return row


# Apply the function to each row
df = df.apply(getDate, axis=1)
print(df)
#    year  month  week_num   day        date
# 0  2019      8        29   Fri  2019-07-19
# 1  2019      8        31   Sun  2019-08-04
# 2  2019      8        29  Tues  2019-07-16
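
As an alternative sketch (not the original answer's method), Python 3.8+ provides datetime.date.fromisocalendar, which builds a date directly from an ISO year, ISO week number and ISO weekday. This assumes your week numbers follow ISO 8601, which can differ from the %W convention in some years; the iso_day mapping and the helper iso_week_to_date are hypothetical names introduced for the example:

```python
import datetime

# ISO weekday numbers: Monday is 1, Sunday is 7
iso_day = {"Mon": 1, "Tues": 2, "Weds": 3, "Thurs": 4,
           "Fri": 5, "Sat": 6, "Sun": 7}


def iso_week_to_date(year, week_num, day_label):
    # fromisocalendar raises ValueError for weeks or days out of range
    return datetime.date.fromisocalendar(year, week_num, iso_day[day_label])


friday = iso_week_to_date(2019, 29, "Fri")
print(friday)  # 2019-07-19
```

For 2019 this gives the same Friday as the function above; whether the two conventions agree for your data depends on how the week numbers were generated.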

Combine month name and year in a column pandas python

One solution is to convert to datetimes and then change the format with Series.dt.to_period or Series.dt.strftime:

df['Month Name-Year'] = pd.to_datetime(df['Month Name'] + df['Year'].astype(str), format='%b%Y')

# monthly periods
df['Month Name-Year1'] = df['Month Name-Year'].dt.to_period('M')
# 2010-02 format
df['Month Name-Year2'] = df['Month Name-Year'].dt.strftime('%Y-%m')

The simplest solution skips the datetime conversion: convert the years to strings and join them to the month names with -:

#format 2010-Feb
df['Month Name-Year3'] = df['Year'].astype(str) + '-' + df['Month Name']

...which gives the same result as converting to datetimes and then formatting them as custom strings:

#format 2010-Feb
df['Month Name-Year31'] = df['Month Name-Year'].dt.strftime('%Y-%b')
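
To try these formats end to end, here is a small self-contained sketch. The two-row frame and the output column names are made up, and it assumes an English locale so that %b matches abbreviations such as Feb:

```python
import pandas as pd

# Hypothetical sample with the 'Month Name' and 'Year' columns used above
df = pd.DataFrame({"Month Name": ["Feb", "Mar"], "Year": [2010, 2011]})

# Parse e.g. "Feb2010"; the day defaults to the first of the month
parsed = pd.to_datetime(df["Month Name"] + df["Year"].astype(str), format="%b%Y")

df["as_period"] = parsed.dt.to_period("M")      # monthly period
df["year_month"] = parsed.dt.strftime("%Y-%m")  # numeric month
df["year_abbr"] = parsed.dt.strftime("%Y-%b")   # abbreviated month name
print(df)
```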

