Make Frequency Histogram for Factor Variables

It seems like you want barplot(prop.table(table(animals))), which draws one bar per level, with height equal to that level's share of the observations.

However, this is not a histogram.
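A minimal sketch of what that one-liner does, step by step. The animals vector below is made-up illustration data, not the asker's:

```r
# Hypothetical data standing in for the asker's `animals` factor
animals <- factor(c("cat", "dog", "dog", "bird", "cat", "dog"))

counts <- table(animals)       # absolute frequency of each level
props  <- prop.table(counts)   # relative frequency; the values sum to 1

barplot(props, ylab = "Relative frequency")
```

Dropping prop.table() and plotting counts directly gives absolute frequencies on the y-axis instead.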

Create histogram (not barplot) from categorical variable

Thanks to Zheyuan Li I already have my answer. Instead of forcing a histogram, I can simply create a barplot that looks exactly like a histogram:

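par(oma = c(2, 0, 0, 0))  # extra outer margin so the rotated labels are not cut off
tab <- table(hhincome2)
barplot(tab, space = 0,   # set space between bars to zero
        ylab = "Frequency", main = "Net household income",
        border = "black", col = "grey", las = 2)
# With space = 0 each bar is 1 unit wide, so the category borders sit at 0, 1, ..., number of levels
axis(1, at = 0:length(tab), labels = FALSE)  # tick marks on the x-axis at the category borders
box()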

Edit: I just found an alternative way:

h <- hist(as.numeric(hhincome2),  # as.numeric() converts a factor to its integer level codes
          breaks = seq(0.5, nlevels(hhincome2) + 0.5, by = 1),  # one unit-width bin per level
          xlab = "", ylab = "Frequency",
          main = "Net household income\n(with normal distribution curve)",
          border = "black", col = "grey", las = 2,
          xaxt = "n")  # suppress the x-axis, which would display level codes instead of labels
# Add the factor level labels at the positions of their integer codes
axis(1, at = seq_len(nlevels(hhincome2)), labels = levels(hhincome2), las = 2)
box()

This way it's possible to add a normal distribution curve as well:

x <- as.numeric(hhincome2)  # integer level codes
xfit <- seq(min(x), max(x), length.out = 1100)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))
yfit <- yfit * diff(h$mids[1:2]) * length(x)  # rescale density to counts: bin width times n
lines(xfit, yfit, col = "black", lwd = 2)
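The rescaling step multiplies the density by the bin width and the number of observations, so that the area under the curve matches the total area of the bars. A small self-contained check of that idea, using simulated integer codes as a stand-in (assumed data, not hhincome2):

```r
set.seed(42)
x <- sample(1:8, 500, replace = TRUE)   # stand-in for as.numeric(hhincome2)

h <- hist(x, breaks = seq(0.5, 8.5, by = 1), plot = FALSE)
bin_width <- diff(h$mids[1:2])          # distance between bin midpoints

# Total area of the bars: sum of counts times bin width
bar_area <- sum(h$counts) * bin_width

# Area under the rescaled normal curve, integrated numerically
scaled_density <- function(v) dnorm(v, mean = mean(x), sd = sd(x)) * bin_width * length(x)
curve_area <- integrate(scaled_density, -Inf, Inf)$value

# The two areas agree up to numerical integration error
c(bar_area = bar_area, curve_area = curve_area)
```

Note that the level codes are treated as equally spaced numbers here, which is only meaningful if the income categories are ordered and roughly evenly spaced.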

Get a histogram plot of factor frequencies (summary)

Update in light of the clarified question

set.seed(1)
dat2 <- data.frame(fac = factor(sample(LETTERS, 100, replace = TRUE)))
hist(table(dat2), xlab = "Frequency of Level Occurrence", main = "")

gives:

[Figure: histogram of the frequency of occurrence of the factor levels]

Here we just apply hist() directly to the result of table(dat2). table(dat2) gives the frequency of each level of the factor, and hist() produces the histogram of those frequencies.
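To make the two steps concrete, here is a small sketch with a hand-made factor (hypothetical data, chosen so the counts are easy to follow). It uses plot = FALSE to inspect the bins rather than draw them:

```r
# Hypothetical factor: a occurs 3 times, b and c twice, d once
fac <- factor(c("a", "a", "a", "b", "b", "c", "c", "d"))

freq <- table(fac)   # per-level frequencies: a = 3, b = 2, c = 2, d = 1

# Histogram of those frequencies with bins (0,1], (1,2], (2,3]
h <- hist(freq, breaks = 0:3, plot = FALSE)
h$counts             # how many levels fall into each frequency bin
```

Here h$counts is 1, 2, 1: one level occurs once, two levels occur twice, and one level occurs three times.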


Original

There are several possibilities. Your data:

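## factor() is explicit here because data.frame() no longer converts strings to factors by default (R >= 4.0)
dat <- data.frame(fac = factor(rep(LETTERS[1:4], times = c(3,3,1,5))))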

Here are three, from column one, top to bottom:

  • The default plot method for class "table", which plots the data as histogram-like bars
  • A bar plot, which is probably what you meant by histogram. Notice the low information-to-ink ratio here
  • A dot plot or dot chart, which shows the same information as the other plots but uses far less ink per unit of information. Preferred.

Code to produce them:

layout(matrix(1:4, ncol = 2))
plot(table(dat), main = "plot method for class \"table\"")
barplot(table(dat), main = "barplot")
tab <- as.numeric(table(dat))
names(tab) <- names(table(dat))
dotchart(tab, main = "dotchart or dotplot")
## or just this
## dotchart(table(dat))
## and ignore the warning
layout(1)

this produces:

[Figure: the three one-dimensional plots]

If you just have your data in variable factor (bad name choice by the way) then table(factor) can be used rather than table(dat) or table(dat$fac) in my code examples.
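As a quick illustration that the three forms yield the same counts, here is a sketch using the example data from above. The name fac_vec is a hypothetical stand-in for a free-standing variable:

```r
dat <- data.frame(fac = factor(rep(LETTERS[1:4], times = c(3, 3, 1, 5))))
fac_vec <- dat$fac        # hypothetical free-standing factor

t_df  <- table(dat)       # from the one-column data frame
t_col <- table(dat$fac)   # from the column itself
t_vec <- table(fac_vec)   # from the free-standing vector

# All three hold the same counts per level: A = 3, B = 3, C = 1, D = 5
t_vec
```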

For completeness, package lattice is more flexible when it comes to producing the dot plot as we can get the orientation you want:

require(lattice)
with(dat, dotplot(fac, horizontal = FALSE))

giving:

[Figure: lattice dotplot version]

And a ggplot2 version:

require(ggplot2)
p <- ggplot(data.frame(Freq = tab, fac = names(tab)), aes(fac, Freq)) +
geom_point()
p

giving:

[Figure: ggplot2 version]

Plot histograms over factor variables

Something like this?

set.seed(1234)
df <- data.frame(weekday = rep(sort(unique(weekdays(.leap.seconds))), each = 24),
                 hour = rep(1:24, 7),
                 count = sample(2:600, 24 * 7, replace = TRUE))

## weekday names are locale-dependent; these levels assume an English locale
df$weekday <- factor(df$weekday,
                     levels = c("Monday", "Tuesday", "Wednesday",
                                "Thursday", "Friday", "Saturday", "Sunday"),
                     ordered = TRUE)
df$hour <- factor(df$hour)

require(ggplot2)
p <- ggplot(data = df, aes(x = hour))
## hour is a factor, so draw bars with geom_bar(); the aesthetic is "weight"
## (not "weights") and it sums count within each hour
p <- p + geom_bar(aes(weight = count, fill = weekday))
p <- p + scale_fill_brewer(palette = "Set3")
p <- p + facet_wrap(~ weekday, ncol = 1)
p

[Figure: ggplot2 faceted histogram, one panel per weekday]

Plotting a histogram and finding frequencies from data

Plotting a bar plot from one categorical variable is as simple as

library(ggplot2)

ggplot(df1, aes(x)) + geom_bar()

Data

x <- scan(what = character(), text = "
A
A
A
B
B
C
C
C
C
C
D")

df1 <- data.frame(x)
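For the second half of the question, the frequencies themselves, one possible approach is base R's table(). This is a sketch rather than part of the original answer; it rebuilds the same data so that it runs on its own:

```r
# Same values as the data above: A x 3, B x 2, C x 5, D x 1
df1 <- data.frame(x = c(rep("A", 3), rep("B", 2), rep("C", 5), "D"))

counts <- table(x = df1$x)       # absolute frequencies per category
freq   <- as.data.frame(counts)  # data frame with columns x and Freq
freq$Prop <- freq$Freq / sum(freq$Freq)  # relative frequencies

freq
```

The Freq column holds the same counts that geom_bar() computes internally when drawing the bars.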

