Cumulative Plot Using ggplot2

cumulative plot using ggplot2

Try this:

ggplot(df, aes(x=1:5, y=cumsum(val))) + geom_line() + geom_point()

Just remove geom_point() if you don't want it.

Edit: Since you need the dates as x-axis labels, you can still plot against x = 1:5 and use scale_x_continuous to place a break at each position and label it with the corresponding date. Taking df:

ggplot(data = df, aes(x = 1:5, y = cumsum(val))) +
  geom_line() +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # x is numeric, so the breaks and date labels go on a continuous scale
  scale_x_continuous(breaks = 1:5, labels = as.character(df$date)) +
  xlab("Date")

Since you say you'll have more than 1 val for "date", you can aggregate them first using plyr, for example.

require(plyr)
dd <- ddply(df, .(date), summarise, val = sum(val))

Then you can proceed with the same command, replacing df with dd and every 1:5 with seq_len(nrow(dd)).
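
The aggregate-then-accumulate step can also be checked without plyr. The following is only a minimal base R sketch: the data frame is made up for illustration, and aggregate() stands in for the ddply call above.

```r
# Hypothetical data: two rows share the date "2013-01-02"
df <- data.frame(
  date = c("2013-01-01", "2013-01-02", "2013-01-02", "2013-01-03"),
  val  = c(10, 5, 7, 3)
)

# Sum val within each date (base R counterpart of the ddply call)
dd <- aggregate(val ~ date, data = df, FUN = sum)

# Add a position index for the x axis and the running total for the y axis
dd$idx     <- seq_len(nrow(dd))
dd$cum_val <- cumsum(dd$val)

print(dd)
```

The resulting dd has one row per date, so idx and cum_val can be mapped to x and y in the ggplot call.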

Time-series/cumulative data plots using ggplot2

Does this do what you want?

library(ggplot2)
library(dplyr)
library(tidyr)
library(lubridate)

holocron %>%
  mutate(Date = dmy(Date)) %>%
  arrange(Date) %>% # Just in case not ordered already
  mutate(Valid_Cumulative = cumsum(Valid),
         Invalid_Cumulative = cumsum(Invalid)) %>%
  pivot_longer(cols = c(Registers, Valid_Cumulative, Invalid_Cumulative)) %>%
  ggplot(aes(Date, value, color = name)) +
  geom_line()

Cumulative plot in ggplot2

The problem is that when you use cumsum() in the aesthetic, it applies over all values, not just the values within a particular year.

Rather than doing the transformation inside ggplot, it is safer to do it with dplyr first and then plot the result. For example:

library(dplyr)
library(ggplot2)

# Compute the cumulative sum within each year, then plot it
ggplot(dat %>% group_by(year) %>% mutate(cv = cumsum(value)),
       aes(x = id.2wk, y = cv, colour = factor(year))) +
  geom_line(size = 1) +
  geom_point()
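
The difference between the two behaviours can be seen without any plotting. This is a minimal base R sketch with made-up data; ave() is used here as a stand-in for the group_by()/mutate() pair and is not part of the original answer.

```r
# Hypothetical data: two years, three observations each
dat <- data.frame(
  year  = rep(c(2012, 2013), each = 3),
  value = c(1, 2, 3, 10, 20, 30)
)

# cumsum() on the whole column keeps accumulating across the year boundary
dat$cv_all <- cumsum(dat$value)

# ave() applies cumsum() separately within each year
dat$cv_by_year <- ave(dat$value, dat$year, FUN = cumsum)

print(dat)
```

In cv_all the 2013 rows start from the 2012 total, which is what happens when cumsum() sits inside aes(); cv_by_year restarts at the first 2013 value.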

Cumulative plot using ggplot and dplyr

How about using data.table to calculate cumulative sum within group?

library(data.table)
library(ggplot2)

ggplot(setDT(dat)[, cv := cumsum(value), by = year],
       aes(x = id.2wk, y = cv, colour = factor(year))) +
  geom_line(size = 1) +
  geom_point()

Sample data:

dat <- data.frame(year = as.factor(rep(2012:2015, each = 6)),
                  id.2wk = rep(c(18, 19, 20, 21, 22, 23), times = 4),
                  value = c(1.8, 15.6, 32.9, 27.5, 19.6, 2.6,
                            1, 8, 42, 35, 11, 3,
                            2, 7, 12, 47, 26, 7,
                            2, 13, 24, 46, 12, 4))

Plotting cumulative counts in ggplot2

This does not directly solve the problem with the grouping of lines, but it works as a workaround.

You can add three calls to stat_bin(), each on a subset of the data for one level of A.

ggplot(x,aes(x=X,color=A)) +
stat_bin(data=subset(x,A=="a"),aes(y=cumsum(..count..)),geom="step")+
stat_bin(data=subset(x,A=="b"),aes(y=cumsum(..count..)),geom="step")+
stat_bin(data=subset(x,A=="c"),aes(y=cumsum(..count..)),geom="step")

UPDATE - solution using geom_step()

Another possibility is to multiply the values of ..y.. by the number of observations in each level. The only way I have found so far to get these counts is to precompute them before plotting and add them to the original data frame; I named this column len. Then, in geom_step(), map len = len inside aes() and define the y values as y = ..y.. * len.

set.seed(123)
x <- data.frame(A=replicate(200,sample(c("a","b","c"),1)),X=rnorm(200))
library(plyr)
df <- ddply(x,.(A),transform,len=length(X))
ggplot(df,aes(x=X,color=A)) + geom_step(aes(len=len,y=..y.. * len),stat="ecdf")
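
Why multiplying the ECDF by the group size yields cumulative counts can be checked in base R. This is a minimal sketch for a single hypothetical group; the variable names are illustrative and it does not call ggplot2.

```r
set.seed(123)
X <- rnorm(50)   # hypothetical sample for one group
n <- length(X)

# ECDF evaluated at the sorted observations:
# the proportion of values less than or equal to each point
Fn <- ecdf(X)
p  <- Fn(sort(X))

# Multiplying the proportion by the group size gives the running count
cum_count <- p * n

print(head(cum_count))
```

Because the proportions are scaled by each group's own size, groups with different numbers of observations end at different heights, which is the point of the len column.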

How to create a grouped cumulative frequency graph with ggplot2

I think you'd like to use stat_ecdf from ggplot2:

ggplot(df, aes(Con, color = Zone)) + stat_ecdf(geom = "point")

Creating a Cumulative Sum Plot using ggplot with duplicate x values

In these situations, it's usually best to pre-compute your desired summary and send that to ggplot, rather than messing around with ggplot's summary functions. I've also added a geom_line() for clarity.

data %>%
  group_by(Flavor, Month) %>%
  summarize(Orders = sum(Orders)) %>%
  group_by(Flavor) %>%
  arrange(Month) %>%
  mutate(Orders = cumsum(Orders)) %>%
  ggplot(data = ., aes(x = Month, y = Orders, color = Flavor)) +
  geom_point() +
  geom_line()

How to define xaxis for a cumulative distribution function using ggplot and geom_ribbon in R?

I found a "manual" solution. First, I created a variable equal to the cumulative distribution of my variable of interest:

df <- 
df %>%
dplyr::mutate(cumula_var = cume_dist(var_x))

Then, I made the graph:

Graph <- 
ggplot(df, aes(x=var_x, y=cumula_var)) +
geom_line() +
geom_ribbon(aes(ymin = 0, ymax = ..y..,
xmin = 0, xmax = 20))+
coord_cartesian(xlim = c(0, 20))
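
For readers who want to see what the cumulative-distribution column from the first step contains, here is a minimal base R sketch. It assumes, following dplyr's documentation of cume_dist() as the proportion of values less than or equal to the current value, that the same numbers can be obtained from rank(); the data are made up.

```r
var_x <- c(3, 1, 4, 1, 5, 9, 2, 6)   # hypothetical values, with one tie

# Proportion of observations less than or equal to each value
cumula_var <- rank(var_x, ties.method = "max") / length(var_x)

# The empirical CDF from base R gives the same numbers
same_as_ecdf <- isTRUE(all.equal(cumula_var, ecdf(var_x)(var_x)))

print(data.frame(var_x, cumula_var))
print(same_as_ecdf)
```

The column always ends at 1 for the largest value, so the x-axis limits set with coord_cartesian() only change which part of the curve is shown, not its height.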

ggplot cumulative line plot with custom month order

You could define the month order through the factor levels and rotate the x-axis labels:

library(tidyverse)
total <- c(5, 2, 3, 4, 7, 4, 7, 8, 5, 6, 2, 25, 7 ,8, 0, 6, 4, 4)
fiscal_year <- c(19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20)
month_num <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 10, 11, 12)
month_str <- c("January", "February", "March", "April", "May", "June", "July",
"August", "September", "October", "November", "December", "January",
"February", "March", "October", "November", "December")
fy1920 <- data.frame(total, fiscal_year, month_num, month_str)
fy1920$month <- factor(fy1920$month_str,
                       levels = c("October", "November", "December", "January",
                                  "February", "March", "April", "May", "June", "July",
                                  "August", "September"),
                       ordered = TRUE)

# Sort by fiscal-month order, then accumulate within each fiscal year
fy1920.2 <- fy1920 %>%
  arrange(month) %>%
  group_by(fiscal_year) %>%
  mutate(Total = cumsum(total))

ggplot() +
  geom_line(aes(x = month, y = Total, colour = 'FY19', group = fiscal_year),
            fy1920.2 %>% filter(fiscal_year == '19')) +
  geom_line(aes(x = month, y = Total, colour = 'FY20', group = fiscal_year),
            fy1920.2 %>% filter(fiscal_year == '20')) +
  scale_colour_discrete(name = 'fiscal_year', labels = c("FY19", "FY20")) +
  theme(axis.text.x = element_text(angle = 45))
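
The key step is that the order of the factor levels, not the alphabet or the calendar, decides how the months sort. A minimal base R sketch of that idea, using a shortened, made-up fiscal year:

```r
# Hypothetical fiscal year that starts in October (shortened to six months)
fy_levels <- c("October", "November", "December", "January", "February", "March")

d <- data.frame(
  month_str = c("January", "October", "March", "December", "November", "February"),
  total     = c(4, 1, 6, 3, 2, 5)
)

# The order of `levels` decides how the factor sorts
d$month <- factor(d$month_str, levels = fy_levels)

# Sort by the factor, then accumulate
d <- d[order(d$month), ]
d$Total <- cumsum(d$total)

print(d[, c("month", "total", "Total")])
```

A discrete x axis in ggplot2 lays out a factor in level order as well, so the same levels drive both the running total and the axis.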



