Converting year and month (yyyy-mm format) to a date?
Try this. (Here we use text=Lines to keep the example self-contained, but in reality we would replace it with the file name.)
Lines <- "2009-01 12
2009-02 310
2009-03 2379
2009-04 234
2009-05 14
2009-08 1
2009-09 34
2009-10 2386"
library(zoo)
z <- read.zoo(text = Lines, FUN = as.yearmon)
plot(z)
The X axis is not very pretty with this data, but with more data it might be fine, or you can use the code for a fancy X axis shown in the examples section of ?plot.zoo.
The zoo series, z, that is created above has a "yearmon" time index and looks like this:
> z
Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Aug 2009 Sep 2009 Oct 2009
      12      310     2379      234       14        1       34     2386
"yearmon" can be used alone as well:
> as.yearmon("2000-03")
[1] "Mar 2000"
Note: "yearmon" class objects sort in calendar order. This will plot the monthly points at equally spaced intervals, which is likely what is wanted; however, if you want to plot the points at unequally spaced intervals, in proportion to the number of days in each month, then convert the index of z to "Date" class: time(z) <- as.Date(time(z)).
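For the bare conversion asked about in the title, zoo is not strictly required. The following is a minimal base R sketch, assuming the input strings are always in yyyy-mm form (the vector name ym is made up for illustration). It appends a day before parsing, and the differences between the resulting dates show the unequal spacing mentioned in the note above:

```r
# Hypothetical yyyy-mm strings, shaped like the example data above
ym <- c("2009-01", "2009-02", "2009-03")

# A Date always needs a day, so append "-01" before parsing
d <- as.Date(paste0(ym, "-01"))
format(d)
#> [1] "2009-01-01" "2009-02-01" "2009-03-01"

# Gaps in days between consecutive month starts are not equal
as.numeric(diff(d))
#> [1] 31 28
```

This is a swapped-in base R technique, not the zoo approach from the answer; it loses the "yearmon" class and its equally spaced index.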
Format Date to Year-Month in R
lubridate only handles dates, and dates have days. However, as alistaire mentions, you can floor them by month if you want to work monthly:
library(tidyverse)
library(lubridate)  # provides floor_date() and as_date(); not attached by older tidyverse versions
df_month <-
  df %>%
  mutate(Date = floor_date(as_date(Date), "month"))
If you want to aggregate by month, for example, just use group_by() and summarize().
df_month %>%
  group_by(Date) %>%
  summarize(N = sum(N)) %>%
  ungroup()
#> # A tibble: 4 x 2
#>   Date           N
#>   <date>     <dbl>
#> 1 2017-01-01    59
#> 2 2018-01-01    20
#> 3 2018-02-01    33
#> 4 2018-03-01    45
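The same "floor to the month, then aggregate" idea can be sketched without any packages. This is a base R alternative, not the lubridate method above, and the data frame below is invented for illustration since the original df is not shown:

```r
# Hypothetical data standing in for the question's `df`
df <- data.frame(
  Date = c("2018-01-03", "2018-01-20", "2018-02-11"),
  N    = c(5, 15, 33)
)

# Floor each date to the first day of its month by rebuilding it
# from its year and month components
df$Month <- as.Date(format(as.Date(df$Date), "%Y-%m-01"))

# Sum N within each month
monthly <- aggregate(N ~ Month, data = df, FUN = sum)
monthly
```

Because Month is still a Date, the result sorts and plots chronologically, which a plain "yyyy-mm" string column would only do by accident of its format.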
How to change year.month format into Year-Month format in R
You can use sub, with capturing groups in the regular expression:
# nsmall = 2 keeps two decimal places, so 1993.1 is formatted as "1993.10"
df$Month <- sub("^(\\d{4})\\.(\\d{2})$", "\\1-\\2", format(df$Month, nsmall = 2))
df
#>      Month        GSI
#> 1  1993-01 -0.5756706
#> 2  1993-02 -1.1554924
#> 3  1993-03 -1.0035307
#> 4  1993-04 -0.1069888
#> 5  1993-05 -0.3190359
#> 6  1993-06  0.3036164
#> 7  1993-07  1.2452892
#> 8  1993-08  0.8510437
#> 9  1993-09  1.2468009
#> 10 1993-10  1.4252141
Input Data
df <- structure(list(Month = c(1993.01, 1993.02, 1993.03, 1993.04,
1993.05, 1993.06, 1993.07, 1993.08, 1993.09, 1993.1), GSI = c(-0.57567056,
-1.15549239, -1.00353071, -0.1069888, -0.31903591, 0.30361638,
1.24528915, 0.8510437, 1.24680092, 1.42521406)), class = "data.frame", row.names = c(NA,
-10L))
df
#>      Month        GSI
#> 1  1993.01 -0.5756706
#> 2  1993.02 -1.1554924
#> 3  1993.03 -1.0035307
#> 4  1993.04 -0.1069888
#> 5  1993.05 -0.3190359
#> 6  1993.06  0.3036164
#> 7  1993.07  1.2452892
#> 8  1993.08  0.8510437
#> 9  1993.09  1.2468009
#> 10 1993.10  1.4252141
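The formatting step before sub() matters because the numeric value 1993.1 (October) drops its trailing zero when converted naively. A small sketch, using a made-up three-element vector rather than the full data, with sprintf() as one possible way to force two decimals:

```r
# Hypothetical subset of the Month column
month_num <- c(1993.01, 1993.09, 1993.1)

# Naive conversion loses the trailing zero for October
as.character(month_num)
#> [1] "1993.01" "1993.09" "1993.1"

# Forcing two decimal places restores it
month_chr <- sprintf("%.2f", month_num)

# The two capture groups (year, month) are rejoined with a hyphen
result <- sub("^(\\d{4})\\.(\\d{2})$", "\\1-\\2", month_chr)
result
#> [1] "1993-01" "1993-09" "1993-10"
```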
Converting a date 'year - month - date' to only 'year and month' in r with SQL data
Up front, your attempt of as.Date(df$Posting_Date, format="%Y %m") seems backwards: the function as.Date converts from a string to a Date-class object, and its format= argument identifies how to find the year/month/day components of the string, not how you want to display it later. (Note that in R, a Date is shown as YYYY-MM-DD. Always. Telling R you want a date to be just year/month is saying that you want to convert it to a string, no longer date-like or number-like. lubridate and perhaps other packages allow you to have Date-like objects.)
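To make the two directions concrete, here is a minimal sketch with an invented input string (not taken from the question's data): format= in as.Date() describes the input, while format() on a Date produces the output string.

```r
# Hypothetical non-ISO input string
raw <- "28/05/2020"

# Parsing: format= tells as.Date() how the input string is laid out
d <- as.Date(raw, format = "%d/%m/%Y")
class(d)
#> [1] "Date"

# Printing: format() turns the Date into a string, here year-month only
ym <- format(d, "%Y-%m")
ym
#> [1] "2020-05"
class(ym)
#> [1] "character"
```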
For df, one can just subset the strings without parsing to Date-class:
substring(df$Posting_Date, 1, 7)
# [1] "2020-05" "2020-10" "2021-10"
If you want to do anything number-like to them, you can convert to Date-class first, and then use format(.) to convert to a string with a specific format.
as.Date(df$Posting_Date)
# [1] "2020-05-28" "2020-10-09" "2021-10-19"
format(as.Date(df$Posting_Date), format = "%Y-%m")
# [1] "2020-05" "2020-10" "2021-10"
For df2, though, since it is numeric you need to specify an origin= instead of a format=. I'm inferring that these are based off of epoch, so
as.Date(df2$Posting_Date, origin = "1970-01-01")
# [1] "2020-05-28" "2020-10-09" "2021-10-19"
format(as.Date(df2$Posting_Date, origin = "1970-01-01"), format = "%Y-%m")
# [1] "2020-05" "2020-10" "2021-10"
Note that R stores Date (and POSIXct, incidentally) as numbers internally:
dput(as.Date(df2$Posting_Date, origin = "1970-01-01"))
# structure(c(18410, 18544, 18919), class = "Date")
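A short sketch of what that numeric storage means in practice, using the first value from the dput() output above: the number is a count of days since 1970-01-01, which is why date arithmetic works but is lost once the value is formatted to a string.

```r
d <- as.Date("2020-05-28")

# Strip the class to see the underlying day count since 1970-01-01
unclass(d)
#> [1] 18410

# Going back from the number requires the origin
as.Date(18410, origin = "1970-01-01")
#> [1] "2020-05-28"

# Arithmetic works on the Date, in units of days
later <- d + 30
format(later)
#> [1] "2020-06-27"
```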
R: date format with just year and month
I don't think you can represent a Date in R without showing the day. If you want a character column, like in your example, you can do:
> x <- data.frame(Year = c(2020,2020,2020,2020), Month = c(1,2,3,4), Data = c(54,58,78,59))
> x$Month <- ifelse(nchar(x$Month) == 1, paste0(0, x$Month), x$Month) # pad single-digit months with a leading 0
> x$Date <- paste(x$Year, x$Month, sep = '-')
> x
  Year Month Data    Date
1 2020    01   54 2020-01
2 2020    02   58 2020-02
3 2020    03   78 2020-03
4 2020    04   59 2020-04
> class(x$Date)
[1] "character"
If you want a Date type column you will have to add:
x$Date <- paste0(x$Date, '-01')
x$Date <- as.Date(x$Date, format = '%Y-%m-%d')
x
class(x$Date)
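The zero-padding and pasting steps can also be collapsed into one sprintf() call. This is a sketch of an alternative, not the answer's own code, and the column name YearMonth is chosen here only for illustration:

```r
# Same shape as the example data, with a two-digit month added
x <- data.frame(Year = c(2020, 2020, 2020), Month = c(1, 4, 10))

# %02d pads the month to two digits with a leading zero
x$YearMonth <- sprintf("%d-%02d", as.integer(x$Year), as.integer(x$Month))
x$YearMonth
#> [1] "2020-01" "2020-04" "2020-10"

# Appending a day gives something as.Date() can parse
x$Date <- as.Date(paste0(x$YearMonth, "-01"))
class(x$Date)
#> [1] "Date"
```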
Extract Month and Year From Date in R
This will add a new column to your data.frame with the specified format.
df$Month_Yr <- format(as.Date(df$Date), "%Y-%m")
df
#>   ID       Date Month_Yr
#> 1  1 2004-02-06  2004-02
#> 2  2 2006-03-14  2006-03
#> 3  3 2007-07-16  2007-07
# your data sample
df <- data.frame( ID=1:3,Date = c("2004-02-06" , "2006-03-14" , "2007-07-16") )
A simple example:
dates <- "2004-02-06"
format(as.Date(dates), "%Y-%m")
#> [1] "2004-02"
Side note: the data.table approach can be considerably faster if you're working with a big dataset.
library(data.table)
setDT(df)[, Month_Yr := format(as.Date(Date), "%Y-%m") ]
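If the month and year are needed as separate numeric values rather than one string, the same format() call can be used per component. A minimal base R sketch using the sample dates above (the variable names yr and mo are made up here):

```r
d <- as.Date(c("2004-02-06", "2006-03-14", "2007-07-16"))

# format() returns strings, so convert to integer for numeric use
yr <- as.integer(format(d, "%Y"))
mo <- as.integer(format(d, "%m"))
yr
#> [1] 2004 2006 2007
mo
#> [1] 2 3 7

# month.abb is a built-in constant with English month abbreviations
month.abb[mo]
#> [1] "Feb" "Mar" "Jul"
```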
How to get date in month-year format?
Using as.yearmon() from the zoo package (and the magrittr pipe):
library(zoo)
library(magrittr)
as.yearmon(df$MonthYear, "%b-%y") %>%
format(., "%Y-%m")
[1] "2020-01" "2019-02" "2018-05"
This can also be done without the '.' placeholder for the left-hand side of the pipe. It was left in because these functions aren't typical tidyverse piping functions.
as.yearmon(df$MonthYear, "%b-%y") %>%
format("%Y-%m")
Or without piping at all, and using nested functions (as pointed out by @Sotos). I find them harder to read, and usually have the tidyverse (and therefore %>% pipes) loaded anyway.
format(as.yearmon(df$MonthYear, "%b-%y"), "%Y-%m")
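The question's input is not shown, so the following sketch assumes month-year strings such as "Jan-20", inferred from the "%b-%y" format and the output above. It is a base R alternative for readers without zoo, not the answer's method: a day is prepended so that as.Date() can parse the string.

```r
# %b month abbreviations are locale-dependent; use the C locale for English names
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical input, assumed from the "%b-%y" format used above
month_year <- c("Jan-20", "Feb-19", "May-18")

# Prepend a day so the string contains a complete date
d <- as.Date(paste0("01-", month_year), format = "%d-%b-%y")

ym <- format(d, "%Y-%m")
ym
#> [1] "2020-01" "2019-02" "2018-05"
```

Note that %y maps two-digit years 00 to 68 onto 2000 to 2068, so older data with two-digit years needs care.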