Draw Histograms Per Row Over Multiple Columns in R

Draw histograms per row over multiple columns in R

If you use ggplot you won't need to do it as a loop, you can plot them all at once. Also, you need to reformat your data so that it's in long format not short format. You can use the melt function from the reshape package to do so.

library(reshape2)
new.df<-melt(HEIrank11,id.vars="HEI.ID")
names(new.df)=c("HEI.ID","Year","Rank")

substring is just getting rid of the X in each year

library(ggplot2)
ggplot(new.df, aes(x=HEI.ID,y=Rank,fill=substring(Year,2)))+
geom_histogram(stat="identity",position="dodge")

Sample Image

How do I generate a histogram for each column of my table?

If you combine the tidyr and ggplot2 packages, you can use facet_wrap to make a quick set of histograms of each variable in your data.frame.

You need to reshape your data to long form with tidyr::gather, so you have key and value columns like such:

library(tidyr)
library(ggplot2)
# or `library(tidyverse)`

mtcars %>% gather() %>% head()
#> key value
#> 1 mpg 21.0
#> 2 mpg 21.0
#> 3 mpg 22.8
#> 4 mpg 21.4
#> 5 mpg 18.7
#> 6 mpg 18.1

Using this as our data, we can map value as our x variable, and use facet_wrap to separate by the key column:

ggplot(gather(mtcars), aes(value)) + 
geom_histogram(bins = 10) +
facet_wrap(~key, scales = 'free_x')

Sample Image

The scales = 'free_x' is necessary unless your data is all of a similar scale.

You can replace bins = 10 with anything that evaluates to a number, which may allow you to set them somewhat individually with some creativity. Alternatively, you can set binwidth, which may be more practical, depending on what your data looks like. Regardless, binning will take some finesse.

Plot multiple histograms based on dataframe in R

Try this:

library(dplyr)
library(tidyr)
library(ggplot2)
#Code
df %>%
pivot_longer(-Client) %>%
ggplot(aes(x=name,y=value))+
geom_bar(stat = 'identity',aes(fill=factor(Client)))+
facet_wrap(.~Client,scales = 'free')

Output:

Sample Image

In this case you would need a bar plot. Or this for histogram:

#Code 2
df %>%
pivot_longer(-Client) %>%
ggplot(aes(x=value))+
geom_histogram(aes(fill=factor(Client)))+
facet_wrap(.~Client,scales = 'free')

Output:

Sample Image

Some data used:

#Data
df <- structure(list(Client = 1:7, Model_1 = c(10.34, 0.97, 2.01, 0.57,
0.68, 0.55, 10.68), Model_2 = c(0.22, 0.6, 0.15, 0.94, 0.65,
3.59, 1.08), Model_3 = c(0.62, 0.04, 0.27, 0.11, 0.26, 0.06,
0.07), Model_4 = c(0.47, 0.78, 0.49, 0.66, 0.41, 0.01, 0.16),
Model_5 = c(1.96, 0.19, 0, 0, 0.5, 5.5, 0.2)), class = "data.frame", row.names = c(NA,
-7L))

Plot histograms from data frame based on conditions as group_by style

facet_wrap() can do that.

It looks like lat is currently a continuous variable. Stratifying by that would quickly get out of hand if there are many possible values, so I'd consider categorising it with cut() before passing it to facet_wrap().

library(dplyr)
library(ggplot2)
df |>
mutate(lat_grp = cut(lat, breaks = c(55, 60, 66))) |>
ggplot(aes(day_number)) +
geom_histogram(color="darkblue", fill="white", bins=366) +
xlim(0,400) +
xlab("Day n°") + ylab("Count") +
facet_wrap(vars(decade, lat_grp))

How do I create a histogram in r for a 2 column data?

If each row of the data.frame represents a user -

set.seed(1)
df <- data.frame(user = letters, twitter_count = rpois(26, lambda = 4) + 1)
hist(df$twitter_count)

Sample Image

Plotting the distribution for multiple columns

You don't need to combine your dataframe all over again. What you need is either a density plot or histogram.

Also as good practice, load only the packages required for plotting, in this case it would be maybe ggplot2 and tidyr.

For example, I just used an example with 5 of the column names I can see in your data:

library(tidyr)
library(ggplot2)

WKA_ohneJB = data.frame(dummyvar=1:10000,sapply(1:5,rnorm,n=10000))
colnames(WKA_ohneJB)[-1] = c("BASKETS_NZ","PIS","PIS_AP","PIS_DV","PIS_PL")
head(WKA_ohneJB)

dummyvar BASKETS_NZ PIS PIS_AP PIS_DV PIS_PL
1 1 0.92088518 0.9167877 1.956920 4.695379 4.349631
2 2 0.05335686 2.8225161 3.059749 4.317281 5.985579
3 3 1.00141759 3.5743033 2.499662 4.761415 5.886588
4 4 -1.31231486 2.5335004 5.396917 4.364643 5.866026
5 5 -0.65336724 0.2647117 3.203358 4.838659 4.437011
6 6 0.78769080 0.3630670 2.516433 3.826074 3.741611

To one of them do:

ggplot(WKA_ohneJB,aes(x=PIS)) + geom_histogram()

Sample Image

Or:

ggplot(WKA_ohneJB,aes(x=PIS)) + geom_density()

Sample Image

To plot everything at one go, you can try to pivot it long, as you have done with melt, but I don't know if your machine can handle it, so try it for a few variables first:

var_to_plot = c("BASKETS_NZ","PIS","PIS_AP","PIS_DV","PIS_PL")
dummyvar = "dummyvar"
ggplot(pivot_longer(WKA_ohneJB[,c(var_to_plot,dummyvar)],-dummyvar),
aes(x=value)) +
geom_histogram() +
facet_wrap(~name)

Sample Image

If melting the data.frame is too intensive, just use baseR plot:

# means 2 rows, 3 columns
par(mfrow=c(2,3))
for(i in var_to_plot){hist(WKA_ohneJB[,i],xlab=i,main="")}

Sample Image

How to make 2 histograms of a row from a table using half the n of columns per graph (R)?

i think it because each element is unique, for example in your tiny example

`

table( as.numeric( data1))

31.8003 63.033 67.3098
1 1 1

`

this is like uniform distribution, it is for that reason your problem graph
(exist only one frequency)

i create data and i put my own example

data=cbind(matrix(NA,5,4),rbind(
abs(rnorm(70,54,19)),
abs(rnorm(70,0.78,1.3)),
abs(rnorm(70,27,14)),
abs(rnorm(70,3.1,0.51)),
abs(rnorm(70,1.3,0.99))
))
for (i in seq(nrow(data))) {

win.graph()

par(mfcol=c(1,2))

data1 = data[i, 5:39]

hist(as.numeric(data1),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

data2 = data[i, 40:74]

hist(as.numeric(data2),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

}

if you want to do one or one you can do that

win.graph()

    par(mfcol=c(1,2)) 

data1 = data[1, 5:39]

hist(as.numeric(data1),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

data2 = data[1, 40:74]

hist(as.numeric(data2),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

Sample Image

and if you want to do all row in your case i think this code have to function,

`

for (i in seq(nrow(data))) {

win.graph()

par(mfcol=c(1,2))

data1 = data[i, 5:39]

hist(as.numeric(data1),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

data2 = data[i, 40:74]

hist( as.numeric( data2),
main="Expression levels for TSPAN6 in non-tumor tissue",
xlab="Patient",
ylab="Expression level value",
border = "black",
col = "black")

}

`



Related Topics



Leave a reply



Submit