Order of Dates Is Not Chronological in ggplot2

Order of dates is not chronological in ggplot2

If your dates already appear in the correct order in the data frame, use

sumc$Date <- factor(sumc$Date, levels = unique(sumc$Date), ordered = TRUE)

prior to plotting. This turns them into an ordered factor whose levels follow the order in which the values appear (without the levels argument, factor() sorts the levels alphabetically), and ggplot will keep them in that order.

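Edit: if the dates are not in order, sort the unique values into a vector and use it as the factor levels:

dates <- unique(sort(sumc$Date))
sumc$Date <- factor(sumc$Date, levels = dates, ordered = TRUE)

Note that sort() is only chronological if Date is a Date object or a zero-padded year-month-day string; other character formats sort alphabetically.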
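
As a rough illustration of how factor levels decide the axis order, here is a minimal sketch using made-up day/month/year strings. The vectors x and shuffled are hypothetical and are not taken from the question's sumc data:

```r
# Hypothetical date strings (day/month/year), listed in chronological order
x <- c("28/12/2019", "05/01/2020", "14/02/2020")

# By default, factor() sorts the unique values alphabetically
default_levels <- levels(factor(x))

# Passing levels = unique(x) keeps the order of first appearance instead
appearance_levels <- levels(factor(x, levels = unique(x)))

# If the rows are not in order, sort by the parsed dates to build the levels
shuffled <- x[c(3, 1, 2)]
parsed <- as.Date(shuffled, format = "%d/%m/%Y")
chron_levels <- unique(shuffled[order(parsed)])
f <- factor(shuffled, levels = chron_levels, ordered = TRUE)

print(default_levels)     # alphabetical, so December 2019 comes last
print(appearance_levels)  # same order as x
print(levels(f))          # chronological, even though the rows were shuffled
```

Parsing to Date just to compute the ordering keeps the original strings as axis labels while still giving a chronological axis.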

out of order date in ggplot2

The issue is that your dates are currently stored as character data, so R treats them as discrete categories rather than as points in time. What you really want is for them to be treated as genuine Date objects, and then let ggplot's higher-level functions handle the ordering and labeling accordingly.

Convert the date data to Date type:

tmp3$newdate <- as.Date(strptime(tmp3$simdte, '%Y%m%d'))

Specify the new dates as the x-values (no need to select only the unique values), and use scale_x_date to create pretty labels. Note that this also correctly spaces the data points across time, instead of using even spacing for each "level" of the date data.

plot.new <- ggplot(tmp3) +
  geom_point(aes(x = newdate, y = r2)) +
  scale_x_date(date_labels = '%b-%d') +
  facet_wrap(~simyr, scales = 'free_x') +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(plot.new)

In the future, it's useful to be aware of the str function, which quickly shows the type of each column in your data (the same information appears in the Environment panel in RStudio):

str(tmp3)

'data.frame': 28 obs. of 7 variables:
$ mdldte : chr "20150305" "20140531" "20160620" "20150305" ...
$ simdte : chr "20130403" "20130429" "20130503" "20130525" ...
$ r2 : num 0.542 0.485 0.54 0.4 0.594 ...
$ simyr : chr "2013" "2013" "2013" "2013" ...
$ mdlyr : chr "2015" "2014" "2016" "2015" ...
$ mdlpreds: Factor w/ 4 levels "phv","phvfsca",..: 1 1 1 1 4 1 4 2 3 4 ...
$ newdate : Date, format: "2013-04-03" "2013-04-29" "2013-05-03" "2013-05-25" ...

As you can see, your original "simdte" column is stored as character data. R (and ggplot) will treat each distinct value as its own level or category. Conversely, Date data are fundamentally numerical (stored as a count of days). R will treat them as continuous, which makes it easier to plot them accurately on a timeline or axis. It also makes it easier to separate the underlying data from the format of any plotting labels.
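
The difference between the two types can be sketched in base R. The values below are hypothetical and only mimic the YYYYMMDD format of the simdte column:

```r
# Hypothetical values in the same YYYYMMDD format as the simdte column
simdte <- c("20130403", "20130429", "20130503")
newdate <- as.Date(simdte, format = "%Y%m%d")

print(class(simdte))   # character: each value is just a label
print(class(newdate))  # Date: each value is a position in time

# Dates are stored as a number of days, so the gaps between them are meaningful
gaps <- as.numeric(diff(newdate))
print(gaps)

# The label format is chosen separately and does not change the underlying data
labels <- format(newdate, "%m-%d")
print(labels)
```

This is why a Date axis spaces the points by elapsed time, while a character axis gives every value the same width.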

Update: Using dates as categories and plotting boxplots, in date order

If instead we wanted each date to act as a category (instead of having the date data act as a numerical distance), the solution is actually simpler. Strange things happen when you try to change the number of values being fed into a ggplot aesthetic, which I suspect is the root cause of your misordering problem.

The key is to rely on ggplot's built-in labeling functions. Once again, the main call to ggplot is fed the raw data, and scale_x_discrete handles the creation of pretty labels:

plot.new <- ggplot(tmp3) +
  geom_boxplot(aes(x = simdte, y = r2)) +
  facet_wrap(~simyr, scales = 'free_x') +
  scale_x_discrete(labels = function(x) strftime(strptime(x, '%Y%m%d'), '%b-%d')) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(plot.new)
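
To see what the labels function does on its own, here is a small sketch that applies the same relabelling by hand. The simdte values are hypothetical, and "%m-%d" is used instead of "%b-%d" only so that the output does not depend on the locale:

```r
# Same relabelling idea as in scale_x_discrete(labels = ...), applied by hand
relabel <- function(x) strftime(strptime(x, "%Y%m%d"), "%m-%d")

# Hypothetical simdte values, deliberately out of order and with a repeat
simdte <- c("20130503", "20130403", "20130429", "20130403")

# For a character column, a discrete axis uses the sorted unique values as categories
categories <- sort(unique(simdte))

# The labels function only changes the text shown for each category
print(data.frame(category = categories, label = relabel(categories)))
```

Because YYYYMMDD strings sort alphabetically in the same order as they do chronologically, the categories come out in date order without any extra work.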

How to Reorder Dates in Chronological Order in ggplot2

Change the D1 column to date class and that should fix the plot.

gym$D1 <- lubridate::dmy(gym$D1)

Converting character to date turning all dates to NA & x-axis of ggplot not in chronological order

You had two issues.

1- inflation was stored as character, not as a number, so it couldn't be plotted.

2- date was stored as character, not as a date, so it was plotted in alphabetical order. It has to be a date so that it can be sorted properly; then just format the scale so that it prints the date in the format that you want.
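
One common reason that as.Date() returns NA for strings such as "2020 JAN" is that they contain no day of the month, which base R requires whenever a month is given. A minimal base-R sketch of both issues follows. The values are hypothetical but mimic the shape of the ONS columns, and an English locale is assumed for the month abbreviations:

```r
# Hypothetical values in the same shape as the ONS date and inflation columns
date_chr <- c("2020 MAR", "2020 JAN", "2020 FEB")
inflation_chr <- c("1.5", "1.8", "1.7")

# Without a day of the month, base R cannot build a date and returns NA
na_dates <- as.Date(date_chr, format = "%Y %b")
print(na_dates)

# Prepending a day makes the string parseable (month names depend on the locale)
real_date <- as.Date(paste("01", date_chr), format = "%d %Y %b")

# Character sorting is alphabetical; Date sorting is chronological
print(sort(date_chr))
print(sort(real_date))

# Numbers stored as text need converting before they go on a continuous axis
inflation <- as.numeric(inflation_chr)
print(inflation)
```

lubridate::parse_date_time(), used in the corrected script, handles the missing day for you; the paste() workaround is just the base-R counterpart.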

library("tidyverse")
library("lubridate")

#webscraping the ONS inflation csv file
cpi<-read.csv(url("https://www.ons.gov.uk/generator?format=csv&uri=/economy/inflationandpriceindices/timeseries/d7g7/mm23"))

#removing rows 1 to 7 which contain descriptors, keeping this as a dataframe
cpi <- cpi[-(1:7), , drop = FALSE]

#renaming columns as date and inflation
cpi<- cpi %>% rename(date=Title)
cpi<- cpi %>% rename(inflation=CPI.ANNUAL.RATE.00..ALL.ITEMS.2015.100)
#proper title characters for date

#THIS FAILS. cut_cpi data.frame hasn't been created yet so this doesn't work. Unnecessary so just remove it.
#cut_cpi$date<- str_to_title(cut_cpi$date)

#subsetting cpi dataset in order to have only the data from the row of 2020 JAN to the last row
cut_cpi<- cpi[(which(cpi$date=="2020 JAN")):nrow(cpi),]

#NEW: parse the "YYYY MON" strings into date-times and sort by them
cut_cpi <- cut_cpi %>%
  mutate(real_date_format = parse_date_time(date, orders = "%Y %b")) %>%
  arrange(desc(real_date_format))

#plotting inflation in a line chart

#NEW
# removed the extra comma in aes()
# converted inflation to numeric (was character)
# converted real_date_format to date (was datetime); scale_x_date fails on datetime input
ggplot(cut_cpi, aes(x = as_date(real_date_format), y = as.numeric(inflation), group = 1)) +
  geom_line(colour = "black") +
  labs(title = "CPI inflation from January 2020") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  #NEW
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")

Can I chronologically order dates as characters in R?

Convert the column to Date, sort by it, and then build a factor from the sorted row order, labelled with the dates:

openingData %>%
  mutate(dateOpened = as.Date(dateOpened, format = "%m/%d/%y")) %>%
  arrange(dateOpened) %>%
  mutate(id = factor(row_number(), labels = dateOpened)) %>%
  ggplot() +
  geom_col(mapping = aes(x = id, y = daysPrior)) +
  labs(x = "Date Opened", y = "Days prior to opening at or above 11.0")

date not ordering correctly on axis in R ggplot

Your data is somewhat irregularly spaced, so ggplot is inferring that the resolution of x axis should be daily, making each bar only one day wide.

Since you generally have only one observation per year, it might be simplest to use the year as your x axis. This keeps the years evenly spaced, but it unfortunately obscures the lower values when there are several observations in a year. I've made the bars slightly transparent to show this.

ggplot(data = t4, aes(x = lubridate::year(survey_date), y = avg_count)) +
  geom_col(position = "identity", alpha = 0.8) +
  scale_x_continuous(breaks = scales::breaks_width(1), name = NULL)

Or you might show each date on a discrete axis. Note that I'm converting the text created by format() into a sorted factor using forcats::fct_reorder, otherwise it would show up alphabetically, which in this date format is not chronological.

ggplot() +
  geom_col(data = t4,
           aes(x = format(survey_date, "%m/%d/%y") %>% forcats::fct_reorder(survey_date),
               y = avg_count)) +
  scale_x_discrete(name = NULL)
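
The reordering step can be sketched without ggplot. The example below uses hypothetical survey dates and a base-R counterpart of fct_reorder (ordering the labels by the underlying dates), so it is an illustration of the idea rather than the exact call used above:

```r
# Hypothetical survey dates, deliberately out of order
survey_date <- as.Date(c("2019-11-02", "2018-03-15", "2020-01-20"))
lab <- format(survey_date, "%m/%d/%y")

# Default factor levels are alphabetical, which is not chronological in this format
alphabetical <- levels(factor(lab))

# Base-R counterpart of fct_reorder(lab, survey_date): order the labels by the dates
chronological <- levels(factor(lab, levels = unique(lab[order(survey_date)])))

print(alphabetical)
print(chronological)
```

With month-first labels, the alphabetical order puts January 2020 before March 2018, which is exactly the misordering the factor levels are there to prevent.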

Or you might aggregate the data annually first:

library(dplyr)
t4 %>%
  group_by(year = lubridate::year(survey_date)) %>%
  summarize(avg_count = mean(avg_count)) %>%
  ggplot(aes(year, avg_count)) +
  geom_col() +
  scale_x_continuous(breaks = scales::breaks_width(1), name = NULL)

Or another variation, putting year into facets:

t4 %>%
  mutate(year = lubridate::year(survey_date),
         survey_date2 = format(survey_date, "%m/%d") %>%
           forcats::fct_reorder(survey_date)) %>%
  ggplot(aes(survey_date2, avg_count)) +
  geom_col() +
  facet_wrap(~year, nrow = 1, scales = "free_x")

X axis dates not in chronological order

Use your second solution, but use

+scale_x_date(breaks=unique(dates))

to specify where you want the breakpoints.
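
A minimal sketch of that suggestion, assuming ggplot2 is installed. The data frame df and its columns are hypothetical stand-ins for the data in the question:

```r
library(ggplot2)

# Hypothetical, irregularly spaced observations
df <- data.frame(
  date  = as.Date(c("2021-01-05", "2021-01-19", "2021-03-02")),
  value = c(3, 1, 2)
)

# Put one axis break at each observed date instead of ggplot's automatic breaks
p <- ggplot(df, aes(x = date, y = value)) +
  geom_point() +
  scale_x_date(breaks = unique(df$date), date_labels = "%b-%d")

print(p)
```

The points keep their true spacing in time, and only the tick positions change, so closely spaced dates may produce overlapping labels.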


