Problems formatting date into format %Y-%m
A year and a month do not make a date. You need a day also.
d <- data.frame(V1=c("1950-12","1951-01"))
as.Date(paste(d$V1,1,sep="-"),"%Y-%m-%d")
# [1] "1950-12-01" "1951-01-01"
You could also use the yearmon class in the zoo package.
library(zoo)
as.yearmon(d$V1)
# [1] "Dec 1950" "Jan 1951"
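For comparison, here is a minimal sketch of the same conversion in Python (an assumption for illustration; the answer above is R). Python's datetime.strptime fills in day 1 when the format has no day field, so no explicit paste step is needed there.

```python
from datetime import datetime

# Same year-month strings as in the R example above
values = ["1950-12", "1951-01"]

# strptime defaults the missing day to 1
dates = [datetime.strptime(v, "%Y-%m").date() for v in values]

print([d.isoformat() for d in dates])
# ['1950-12-01', '1951-01-01']
```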
Convert dates to %y-%m-%d format in Python
Step 0: Load your dataframe:
import pandas as pd
df = pd.read_csv('your file name.csv')
Step 1: First, convert your 'date' column to datetime using the to_datetime() method:
df['date'] = pd.to_datetime(df['date'])
Step 2: If you then want the dates as strings, use:
df['date'] = df['date'].astype(str)
Now if you print df (or just evaluate df in a Jupyter notebook), the date column looks like this:
0 2020-01-01
1 2020-12-31
2 2020-06-20
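Note that the steps above produce four-digit years (%Y-%m-%d). If the two-digit %y-%m-%d form is what you actually want, strftime accepts that directive directly. A minimal sketch using only the standard library; with pandas, the same format string should work through df['date'].dt.strftime on a datetime column. The sample dates mirror the output above.

```python
from datetime import date

# Sample dates matching the output shown above
dates = [date(2020, 1, 1), date(2020, 12, 31), date(2020, 6, 20)]

# %y is the two-digit year, %Y the four-digit year
short = [d.strftime("%y-%m-%d") for d in dates]
full = [d.strftime("%Y-%m-%d") for d in dates]

print(short)  # ['20-01-01', '20-12-31', '20-06-20']
print(full)   # ['2020-01-01', '2020-12-31', '2020-06-20']
```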
Format JavaScript date as yyyy-mm-dd
You can do:
function formatDate(date) {
    var d = new Date(date),
        month = '' + (d.getMonth() + 1),
        day = '' + d.getDate(),
        year = d.getFullYear();
    if (month.length < 2)
        month = '0' + month;
    if (day.length < 2)
        day = '0' + day;
    return [year, month, day].join('-');
}
console.log(formatDate('Sun May 11,2014'));
How to format a UTC date as a `YYYY-MM-DD hh:mm:ss` string using NodeJS?
If you're using Node.js, you're sure to have ECMAScript 5, and so Date has a toISOString method. You're asking for a slight modification of ISO 8601:
new Date().toISOString()
> '2012-11-04T14:51:06.157Z'
So just cut a few things out, and you're set:
new Date().toISOString().
replace(/T/, ' '). // replace T with a space
replace(/\..+/, '') // delete the dot and everything after
> '2012-11-04 14:55:45'
Or, in one line: new Date().toISOString().replace(/T/, ' ').replace(/\..+/, '')
toISOString always returns UTC (indicated by the trailing Z on the first result), so you get UTC by default (always a good thing).
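For comparison, the same YYYY-MM-DD hh:mm:ss UTC string can be sketched in Python (an assumption for illustration; the answer above is for Node.js). The helper name format_utc is hypothetical. It converts an aware datetime to UTC first, so a local offset does not leak into the result.

```python
from datetime import datetime, timedelta, timezone

def format_utc(dt):
    """Return an aware datetime as 'YYYY-MM-DD hh:mm:ss' in UTC."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# A timestamp already in UTC (milliseconds are simply not printed)
stamp = datetime(2012, 11, 4, 14, 55, 45, 157000, tzinfo=timezone.utc)
print(format_utc(stamp))  # 2012-11-04 14:55:45

# The same instant expressed with a +01:00 offset gives the same string
local = datetime(2012, 11, 4, 15, 55, 45, tzinfo=timezone(timedelta(hours=1)))
print(format_utc(local))  # 2012-11-04 14:55:45
```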
Converting year and month ( yyyy-mm format) to a date?
Try this. (Here we use text=Lines to keep the example self-contained, but in reality we would replace it with the file name.)
Lines <- "2009-01 12
2009-02 310
2009-03 2379
2009-04 234
2009-05 14
2009-08 1
2009-09 34
2009-10 2386"
library(zoo)
z <- read.zoo(text = Lines, FUN = as.yearmon)
plot(z)
The X axis is not so pretty with this data, but if you have more data in reality it might be OK, or you can use the code for a fancy X axis shown in the examples section of ?plot.zoo.
The zoo series, z, that is created above has a "yearmon" time index and looks like this:
> z
Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Aug 2009 Sep 2009 Oct 2009
      12      310     2379      234       14        1       34     2386
"yearmon" can be used alone as well:
> as.yearmon("2000-03")
[1] "Mar 2000"
Note: "yearmon" class objects sort in calendar order. This will plot the monthly points at equally spaced intervals, which is likely what is wanted; however, if it were desired to plot the points at unequally spaced intervals, spaced in proportion to the number of days in each month, then convert the index of z to "Date" class: time(z) <- as.Date(time(z)).
as.Date with dates in format m/d/y in R
Use capital Y in the as.Date call instead. This should do the trick:
> as.Date("3/15/2012", "%m/%d/%Y")
[1] "2012-03-15"
From the help file's examples you can see that when the year is fully specified (four digits) you should use %Y, otherwise %y. For example:
> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> as.Date(dates, "%m/%d/%y")
[1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
In your example the year is written as 2012, so you should use %Y; in the other example (taken from the as.Date help file) the year is written as 92, so %y is the correct way to go. See ?as.Date for further details.
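The same %Y versus %y distinction exists in Python's datetime.strptime. A minimal sketch (Python is an assumption here; the answer above is R) that parses both example formats and shows that Python rejects a four-digit year when %y is given:

```python
from datetime import datetime

# Four-digit year: use %Y
full = datetime.strptime("3/15/2012", "%m/%d/%Y").date()

# Two-digit year: use %y
short = datetime.strptime("02/27/92", "%m/%d/%y").date()

print(full, short)  # 2012-03-15 1992-02-27

# Using %y on a four-digit year leaves "12" unparsed, which raises ValueError
try:
    datetime.strptime("3/15/2012", "%m/%d/%y")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)  # True
```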
How to convert Python date format '%B - %Y' back to '%Y-%m-%d'?
Edit: I didn't do any benchmark, but just so you know, casting your column as a date with df['your_column'] = pd.to_datetime(df['your_column']) will convert the dates pandas can parse to ISO format. See the second example below; it should be way faster.
import pandas as pd
from datetime import datetime
data = {
'A' : ['July - 2019', 'June - 2020'],
'B' : [1, 2]
}
df = pd.DataFrame(data)
print(df, end='\n\n')
# A B
# 0 July - 2019 1
# 1 June - 2020 2
day_to_put = 15
df['A'] = df['A'].apply( lambda x: datetime.strptime(x, '%B - %Y')\
.replace(day=day_to_put)\
.strftime('%Y-%m-%d') )
print(df)
# A B
#0 2019-07-15 1
#1 2020-06-15 2
Second example
import pandas as pd
from datetime import datetime
data = {
'A' : ['July - 2019', 'June - 2020'],
'B' : [1, 2]
}
df = pd.DataFrame(data)
print(df, end='\n\n')
# A B
# 0 July - 2019 1
# 1 June - 2020 2
df['A'] = pd.to_datetime(df['A'])
print(df)
# A B
# 0 2019-07-01 1
# 1 2020-06-01 2
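If you would rather not rely on format inference, you can pass the format explicitly and then turn the result back into '%Y-%m-%d' strings. A minimal sketch under the assumption that month names are in English (the %B directive is locale-dependent); the day defaults to 1, as in the second example.

```python
import pandas as pd

df = pd.DataFrame({"A": ["July - 2019", "June - 2020"], "B": [1, 2]})

# Parse with an explicit format, then format back to ISO-style strings
df["A"] = pd.to_datetime(df["A"], format="%B - %Y").dt.strftime("%Y-%m-%d")

print(df["A"].tolist())  # ['2019-07-01', '2020-06-01']
```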
Pandas - Datetime format change to '%m/%d/%Y'
The reason you have to use errors="ignore" is that not all the dates you are parsing are in the correct format. If you use errors="coerce", as @phi mentioned, any dates that cannot be converted are set to NaT. The column's dtype will still be converted to datetime64, and you can then format it as you like and deal with the NaT values as you want.
Example
A dataframe with one item in Date that is not a valid Year/Month/Day (there is no 25th month):
>>> df = pd.DataFrame({'ID': [91060, 91061, 91062, 91063], 'Date': ['2017/11/10', '2022/05/01', '2022/04/01', '2055/25/25']})
>>> df
ID Date
0 91060 2017/11/10
1 91061 2022/05/01
2 91062 2022/04/01
3 91063 2055/25/25
>>> df.dtypes
ID int64
Date object
dtype: object
Using errors="ignore":
>>> df['Date'] = pd.to_datetime(df['Date'], errors='ignore')
>>> df
ID Date
0 91060 2017/11/10
1 91061 2022/05/01
2 91062 2022/04/01
3 91063 2055/25/25
>>> df.dtypes
ID int64
Date object
dtype: object
Column Date is still an object because not all the values could be converted. Running df['Date'] = df['Date'].dt.strftime("%m/%d/%Y") will result in an AttributeError.
Using errors="coerce":
>>> df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
>>> df
ID Date
0 91060 2017-11-10
1 91061 2022-05-01
2 91062 2022-04-01
3 91063 NaT
>>> df.dtypes
ID int64
Date datetime64[ns]
dtype: object
Invalid dates are set to NaT, the column is now of type datetime64, and you can format it:
>>> df['Date'] = df['Date'].dt.strftime("%m/%d/%Y")
>>> df
ID Date
0 91060 11/10/2017
1 91061 05/01/2022
2 91062 04/01/2022
3 91063 NaN
Note: When datetime64 values are formatted with strftime, the column is converted back to type object, so NaT values become NaN. The issue you are having comes down to some dirty data that is not in the correct format.
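To track down the dirty rows themselves, one option is to keep the coerced result in a separate variable and use its NaT mask on the original dataframe. A minimal sketch reusing the example data above; the explicit format="%Y/%m/%d" and the names parsed and bad_rows are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [91060, 91061, 91062, 91063],
    "Date": ["2017/11/10", "2022/05/01", "2022/04/01", "2055/25/25"],
})

# Coerce into a separate Series so the original strings stay available
parsed = pd.to_datetime(df["Date"], format="%Y/%m/%d", errors="coerce")

# Rows where parsing failed are NaT in `parsed`
bad_rows = df[parsed.isna()]

print(bad_rows["ID"].tolist())    # [91063]
print(bad_rows["Date"].tolist())  # ['2055/25/25']
```

Once the bad rows are known, you can fix or drop them before formatting the column.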
Dealing with +00:00 in datetime format
The easiest thing to do is let pd.to_datetime auto-infer the format. That works very well for standard formats like this (ISO 8601):
import pandas as pd
dti = pd.to_datetime(["2020-06-30 15:20:13.078196+00:00"])
print(dti)
# DatetimeIndex(['2020-06-30 15:20:13.078196+00:00'], dtype='datetime64[ns, UTC]', freq=None)
+00:00 is a UTC offset of zero hours, and thus can be interpreted as UTC.
By the way, pd.to_datetime also works well for mixed formats (in recent pandas versions you may need to pass format="mixed" for that).
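Outside pandas, the standard library can read the same string. A minimal sketch, assuming Python 3.7 or later, where datetime.fromisoformat accepts this layout including the +00:00 offset:

```python
from datetime import datetime, timedelta, timezone

# Parse the ISO 8601 string, keeping the +00:00 offset as tzinfo
dt = datetime.fromisoformat("2020-06-30 15:20:13.078196+00:00")

# An offset of zero means the value is already UTC
is_utc = dt.utcoffset() == timedelta(0)
print(is_utc)  # True

# Format without the fractional seconds and offset
print(dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"))
# 2020-06-30 15:20:13
```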