ggplot: Remove NA Factor Level in Legend

ggplot: remove NA factor level in legend

You have one data point where delay_class is NA, but tot_delay isn't. This point is not being caught by your filter. Changing your code to:

filter(flights, !is.na(delay_class)) %>% 
ggplot() +
geom_bar(mapping = aes(x = carrier, fill = delay_class), position = "fill")

does the trick.

Alternatively, if you absolutely must have that extra point, you can override the fill legend as follows:

filter(flights, !is.na(tot_delay)) %>% 
ggplot() +
geom_bar(mapping = aes(x = carrier, fill = delay_class), position = "fill") +
scale_fill_manual( breaks = c("none","short","medium","long"),
values = scales::hue_pal()(4) )

UPDATE: As pointed out in @gatsky's answer, all discrete scales also accept the na.translate argument. The feature has existed since ggplot2 2.2.0; I just wasn't aware of it when I posted my answer. For completeness, its usage in the original question would look like this:

filter(flights, !is.na(tot_delay)) %>% 
ggplot() +
geom_bar(mapping = aes(x = carrier, fill = delay_class), position = "fill") +
scale_fill_discrete(na.translate=FALSE)
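
The two approaches above can be compared side by side on a small example. This is only a sketch: the toy data frame below is hypothetical and stands in for the flights table from the question, which is not shown here.

```r
library(ggplot2)

# Hypothetical toy data standing in for the flights table from the question.
toy <- data.frame(
  carrier = c("AA", "AA", "UA", "UA", "DL", "DL"),
  delay_class = factor(
    c("none", "short", "none", NA, "long", "short"),
    levels = c("none", "short", "medium", "long")
  )
)

# Option 1: drop the rows with a missing fill value before plotting.
toy_clean <- toy[!is.na(toy$delay_class), ]
p_filtered <- ggplot(toy_clean) +
  geom_bar(aes(x = carrier, fill = delay_class), position = "fill")

# Option 2: keep every row, but tell the discrete fill scale not to map NA.
p_translate <- ggplot(toy) +
  geom_bar(aes(x = carrier, fill = delay_class), position = "fill") +
  scale_fill_discrete(na.translate = FALSE)
```

Option 1 changes the data that reaches the plot, while option 2 leaves the data alone and only changes how the scale treats missing values, so which one fits depends on whether the NA rows matter elsewhere in the plot.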

How to remove NA from a factor variable (and from a ggplot chart)?

Assuming your data is in a data frame called dat and the factor column is named Factor, drop the rows where it is NA before plotting:

newdat <- dat[!is.na(dat$Factor), ]

I'm not sure how to solve the problem inside the ggplot code itself.

Hide unused levels in ggplot legend

OK; the related issue 4511 gives the answer. Setting limits = force in scale_fill_manual did it.
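
As a rough sketch of what that looks like in practice (the palette and data below are hypothetical, since the original question's code is not shown): passing the base R function force as limits hands the limits computed from the data back unchanged, so levels that are named in the palette but absent from the data should not get a legend entry.

```r
library(ggplot2)

# Hypothetical named palette that covers more levels than the data contains.
pal <- c(low = "steelblue", mid = "grey50", high = "firebrick")

# Hypothetical data: the "mid" level never occurs.
dat <- data.frame(
  x     = c("a", "b", "c"),
  level = c("low", "high", "low")
)

# limits = force returns the data-derived limits as they are,
# so only the levels that actually occur are kept on the scale.
p <- ggplot(dat, aes(x = x, fill = level)) +
  geom_bar() +
  scale_fill_manual(values = pal, limits = force)
```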

Plot graph in R and don't want NA to show

Without a reproducible example or knowledge of how your data is structured, a simple option is to use drop_na() (from tidyr) to remove every row that contains an NA value.

lusl %>% 
drop_na() %>%
ggplot(aes(wave, generalhealth)) + geom_point(alpha = .005) # etc.

If you want to remove NA values more precisely, only in harassment_fct (used for color), filter those rows out:

lusl %>% 
filter(!is.na(harassment_fct)) %>%
ggplot(aes(wave, generalhealth)) + geom_point(alpha = .005) # etc.
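
The difference between the two removals is easiest to see on a few rows. The data frame below is a hypothetical stand-in for lusl; it also shows that drop_na() accepts column names, which has the same effect as the filter() call above.

```r
library(tidyr)

# Hypothetical stand-in for lusl with one NA in each of two columns.
toy <- data.frame(
  wave           = c(1, 2, 3, 4),
  generalhealth  = c(3, NA, 4, 5),
  harassment_fct = factor(c("yes", "no", NA, "no"))
)

# Removes every row that has an NA in any column.
all_complete <- drop_na(toy)

# Removes only the rows where harassment_fct is NA.
colour_only <- drop_na(toy, harassment_fct)
```

Here all_complete loses two rows, while colour_only loses only the one row whose harassment_fct is missing.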

Remove legend entries for some factors levels

First, since the variable used for the fill is numeric, convert it to a factor (for example, under a new name a2) and set the labels of the factor levels as needed. Each level needs a distinct label, so the first five levels simply keep their numbers.

training_results.barplot$a2 <- factor(training_results.barplot$a,
labels = c("1", "2", "3", "4", "5", "Best", "Suggested", "Worst"))

Now use this new variable for fill =. This produces the legend labels you need. With the breaks = argument of scale_fill_manual() you can set which levels are shown in the legend, but remove the labels = argument: the two arguments can be used together only if they have the same length.

ggplot(training_results.barplot, mapping = aes(x = name, y = wer, fill = a2))  + 
geom_bar(stat = "identity") +
scale_fill_manual(breaks = c("Best", "Suggested", "Worst"),
values = c("#555555", "#777777", "#555555", "#777777",
"#555555", "green", "orange", "red"))

Here is the data used for this answer:

training_results.barplot<-structure(list(a = c(1L, 2L, 1L, 8L, 3L, 4L, 5L, 6L, 7L, 1L, 
1L, 1L), b = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L
), c = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L), name = structure(1:12, .Label = c("1+1+1",
"1+1+2", "1+1+3", "1+1+4", "1+1+5", "1+2+1", "1+2+2", "1+2+3",
"1+2+4", "1+2+5", "1+3+1", "1+3+2"), class = "factor"), corr = c(66.63,
66.66, 66.81, 66.57, 66.89, 66.63, 66.82, 66.74, 67, 66.9, 66.68,
66.76), acc = c(59.15, 59.29, 59.42, 59.08, 59.34, 59.1, 59.45,
59.31, 59.5, 59.19, 59.16, 59.23), H = c(4167L, 4169L, 4178L,
4163L, 4183L, 4167L, 4179L, 4174L, 4190L, 4184L, 4170L, 4175L
), D = c(238L, 235L, 226L, 223L, 226L, 240L, 228L, 225L, 226L,
230L, 227L, 226L), S = c(1849L, 1850L, 1850L, 1868L, 1845L, 1847L,
1847L, 1855L, 1838L, 1840L, 1857L, 1853L), I = c(468L, 461L,
462L, 468L, 472L, 471L, 461L, 465L, 469L, 482L, 470L, 471L),
N = c(6254L, 6254L, 6254L, 6254L, 6254L, 6254L, 6254L, 6254L,
6254L, 6254L, 6254L, 6254L), wer = c(40.85, 40.71, 40.58,
40.92, 40.66, 40.9, 40.55, 40.69, 40.5, 40.81, 40.84, 40.77
)), .Names = c("a", "b", "c", "name", "corr", "acc", "H",
"D", "S", "I", "N", "wer"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"))

ggplot: how to remove unused factor levels from a facet?

Setting scales = "free" in facet_grid() will do the trick:

facet_grid( ~ fac, scales = "free")
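
A minimal sketch of the effect, using hypothetical data in which each facet group has its own set of x categories:

```r
library(ggplot2)

# Hypothetical data: categories a and b occur only in g1, c and d only in g2.
toy <- data.frame(
  fac   = c("g1", "g1", "g2", "g2"),
  name  = c("a", "b", "c", "d"),
  value = c(1, 2, 3, 4)
)

# With free scales, each panel shows only the x categories present in it.
p <- ggplot(toy, aes(x = name, y = value)) +
  geom_col() +
  facet_grid(~ fac, scales = "free")
```

When only the x axis should vary between panels, scales = "free_x" is the narrower choice and keeps a shared y axis.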

Removing a factor from ggplot color legend when multiple are specified

You can use ?scale_color_discrete to specify the breaks. In your case this could be something like the following:

ggplot(Data, aes(x=StudyArea))+
geom_point(aes(y=value, color=variable),size=3, shape=1)+
geom_errorbar(aes(ymin=value-SE, ymax=value+SE, color=variable),lty = 2, cex=0.75)+
geom_point(aes(y=InSituPred, color=StudyArea),size=3, shape=1)+
geom_errorbar(aes(ymin=InSituPred-InSituSE, ymax=InSituPred+InSituSE, color=StudyArea),lty=1,cex=0.75)+
geom_point(aes(y=Obs, color=StudyArea),shape="*",size=12) +
scale_color_discrete(breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC"))

EDIT: Yes, it is possible to specify the colors. Since I don't really understand what coloring scheme you want, here are some examples (not all are meant entirely seriously).

p <- ggplot(Data, aes(x=StudyArea))+
geom_point(aes(y=value, color=variable),size=3, shape=1)+
geom_errorbar(aes(ymin=value-SE, ymax=value+SE, color=variable),lty = 2, cex=0.75)+
geom_point(aes(y=InSituPred, color=StudyArea),size=3, shape=1)+
geom_errorbar(aes(ymin=InSituPred-InSituSE, ymax=InSituPred+InSituSE, color=StudyArea),lty=1,cex=0.75)+
geom_point(aes(y=Obs, color=StudyArea),shape="*",size=12)
p + scale_color_manual(name="Study Area \nPrediction",
values=c("red", "blue", "darkgreen","red","blue","darkgreen"),
breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC"))
p + scale_color_manual(name="Study Area \nPrediction",
values=c("black", "black", "black", "red","blue","darkgreen"),
breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC"))
p + scale_color_manual(name="Study Area \nPrediction",
values=c("white", "yellow", "pink", "red","blue","darkgreen"),
breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC"))

ADDITION: This would be much easier if you cleaned and restructured your data before plotting. Here's my attempt:

df <- with(Data, data.frame(area=rep(StudyArea, 2),
exarea=c(variable,rep(variable[c(1,4,7)], 3)),
value=c(value, InSituPred),
se=c(SE, InSituSE),
obs = rep(Obs, 2),
situ=rep(c("in", "ex"), each=nrow(Data))))
df <- df[!duplicated(df),]

Then the plotting becomes much easier:

p <- ggplot(df, aes(x=area))+
geom_point(aes(y=value, color=exarea),size=3, shape=1)+
geom_errorbar(aes(ymin=value-se, ymax=value+se, color=exarea, lty=situ), cex=0.75)+
geom_point(aes(y=obs, color=exarea),shape="*",size=12)

p + scale_color_manual(name="Study Area \nPrediction",
values=c("red", "blue", "darkgreen"),
breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC")) +
scale_linetype_manual(name="Situ",
values=c(1,2),
breaks=c("in", "ex"),
labels=c("InSitu", "ExSitu"))

EDIT2: It is possible to use the original data for this. You have to put the lty inside the aes-function and then use scale_linetype_manual as before. Here it goes:

p <- ggplot(Data, aes(x=StudyArea))+
geom_point(aes(y=value, color=variable),size=3, shape=1)+
geom_errorbar(aes(ymin=value-SE, ymax=value+SE, color=variable, lty="2"), cex=0.75)+
geom_point(aes(y=InSituPred, color=StudyArea),size=3, shape=1)+
geom_errorbar(aes(ymin=InSituPred-InSituSE, ymax=InSituPred+InSituSE, color=StudyArea, lty="1"),cex=0.75)+
geom_point(aes(y=Obs, color=StudyArea),shape="*",size=12)
p + scale_color_manual(name="Study Area \nPrediction",
values=c("red", "blue", "darkgreen","red","blue","darkgreen"),
breaks=c("ExSituAAA", "ExSituBBB", "ExSituCCC")) +
scale_linetype_manual(name="Situ",
values=c(1,2),
breaks=c("1", "2"),
labels=c("InSitu", "ExSitu"))

Restructuring the data is usually the better practice, though. Any further changes to this code will be very hard to make, and the code is already rather difficult to read. So if it is at all possible to restructure your dataset (it usually is), consider taking the approach mentioned above.

ggplot2 0.9.0 automatically dropping unused factor levels from plot legend?

Yes, you want to add drop = FALSE to your colour scale:

ggplot(subset(df,fruit == "apple"),aes(x = year,y = qty,colour = fruit)) + 
geom_point() +
scale_colour_discrete(drop = FALSE)
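
For a self-contained version, here is a sketch with hypothetical data (the df from the question is not shown). It relies on subset() keeping the unused levels on the factor, which is what gives drop = FALSE something to display.

```r
library(ggplot2)

# Hypothetical data: fruit has three levels, but only "apple" survives the subset.
df <- data.frame(
  year  = rep(2010:2012, times = 3),
  qty   = c(5, 7, 6, 3, 4, 2, 8, 9, 7),
  fruit = factor(rep(c("apple", "banana", "cherry"), each = 3))
)
apples <- subset(df, fruit == "apple")

# drop = FALSE keeps the unused levels on the colour scale.
# show.legend = TRUE is an addition here, not part of the original answer:
# recent ggplot2 releases may otherwise leave the keys for unused levels blank.
p <- ggplot(apples, aes(x = year, y = qty, colour = fruit)) +
  geom_point(show.legend = TRUE) +
  scale_colour_discrete(drop = FALSE)
```

If the factor had been passed through droplevels() first, the extra levels would be gone from the data and drop = FALSE would have nothing to keep.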

Remove unused factor levels from a ggplot bar plot

One easy option is to use na.omit() on your data frame df to remove the rows that contain NA:

ggplot(na.omit(df), aes(x=name,y=var1)) + geom_bar(stat = "identity")

Given your update, the following

ggplot(df[!is.na(df$var1), ], aes(x=name,y=var1)) + geom_bar(stat = "identity")

works OK and only considers NA in var1. Given that you are only plotting name and var1, you can also apply na.omit() to a data frame containing only those variables:

ggplot(na.omit(df[, c("name", "var1")]), aes(x=name,y=var1)) + geom_bar(stat = "identity")

