Plotting PCA Biplot with ggplot2

Plotting PCA biplot with ggplot2

Maybe this will help. It is adapted from code I wrote some time ago, and it now draws arrows as well.

library(ggplot2)

PCbiplot <- function(PC, x = "PC1", y = "PC2") {
  # PC is a prcomp object
  data <- data.frame(obsnames = row.names(PC$x), PC$x)
  plot <- ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_text(alpha = .4, size = 3, aes(label = obsnames))
  # Reference lines through the origin (use size= instead of linewidth= on ggplot2 < 3.4.0)
  plot <- plot + geom_hline(yintercept = 0, linewidth = .2) + geom_vline(xintercept = 0, linewidth = .2)
  datapc <- data.frame(varnames = rownames(PC$rotation), PC$rotation)
  # Ratio of the score range to the loading range on each axis; the smaller
  # ratio keeps the arrows inside the cloud of observations
  mult <- min(
    diff(range(data[, y])) / diff(range(datapc[, y])),
    diff(range(data[, x])) / diff(range(datapc[, x]))
  )
  # Arrow end points: loadings stretched to 70% of the available range
  datapc$v1 <- .7 * mult * datapc[, x]
  datapc$v2 <- .7 * mult * datapc[, y]
  plot <- plot + coord_equal() + geom_text(data = datapc, aes(x = v1, y = v2, label = varnames), size = 5, vjust = 1, color = "red")
  plot <- plot + geom_segment(data = datapc, aes(x = 0, y = 0, xend = v1, yend = v2), arrow = arrow(length = unit(0.2, "cm")), alpha = 0.75, color = "red")
  plot
}

fit <- prcomp(USArrests, scale. = TRUE)
PCbiplot(fit)

You may want to change the text size, transparency, and colors to taste; it would be easy to make them parameters of the function.
Note: this works with prcomp, but your example uses princomp, which stores the scores in $scores and the loadings in $loadings and names the components Comp.1, Comp.2, and so on. You may need to adapt the code accordingly.
Note 2: the code for geom_segment() is borrowed from the mailing list post linked in a comment on the original question.
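
One possible way to handle the princomp case without touching the function is to repackage the fit under the element names that PCbiplot() reads. This is only a sketch: as_prcomp_like() is a hypothetical helper name, and cor = TRUE is assumed here as the rough counterpart of scaling in prcomp.

```r
# Hypothetical helper: expose a princomp fit under prcomp-style element names
as_prcomp_like <- function(fit) {
  stopifnot(inherits(fit, "princomp"))
  list(
    x = fit$scores,                   # observation scores (prcomp stores these in $x)
    rotation = unclass(fit$loadings)  # variable loadings as a plain matrix
  )
}

fit2 <- princomp(USArrests, cor = TRUE)
pc_like <- as_prcomp_like(fit2)

# princomp names its components Comp.1, Comp.2, ... rather than PC1, PC2, ...
colnames(pc_like$x)

# The column names then have to be passed explicitly:
# PCbiplot(pc_like, x = "Comp.1", y = "Comp.2")
```

Signs and scaling of the components can differ slightly between the two functions, so the resulting plot may be mirrored relative to the prcomp version.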

A self-written code for biplot in ggplot2

The problem is with your geom_text layer:

geom_text(data = loadings, aes(x=PC2, y=PC3, label=label),color="#006400")

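Both loadings$PC2 and loadings$PC3 have length 4, but label has length 150. Within a layer, every aesthetic must have either length 1 or the same length as the layer's data, so these do not go together. The labels need to be a column of loadings with one entry per row.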
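
A minimal sketch of one way to line the lengths up, assuming the PCA was run on the four numeric columns of iris (the original data is not shown, so pca and the construction of loadings here are stand-ins): store the labels in the same data frame as the coordinates.

```r
library(ggplot2)

# Assumption: a PCA on the four numeric iris columns stands in for the asker's data
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# One row per variable, with the label stored next to its coordinates
loadings <- data.frame(pca$rotation, label = rownames(pca$rotation))

# label is now looked up in loadings, so it has the same length (4) as PC2 and PC3
p <- ggplot(loadings, aes(x = PC2, y = PC3)) +
  geom_text(aes(label = label), color = "#006400")
p
```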

How to display observations in pca biplot?

You can specify geom.ind = "text":

fviz_pca_biplot(pca,geom.ind="text",labelsize=2,
col.var = "#2E9FDF",col.ind = "#696969")

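To keep the labels from overlapping, also set repel = TRUE, which uses the ggrepel package. The ggrepel.max.overlaps option controls how many overlaps are tolerated before a label is dropped: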

options(ggrepel.max.overlaps = 20)
fviz_pca_biplot(pca,geom.ind="text",labelsize=2,
col.var = "#2E9FDF",col.ind = "#696969",repel=TRUE)

Problems Plotting PCA in R with ggplot2

Here is an example. The trick is to use Comps <- as.data.frame(mypca$x) to isolate the components and then add them to the original data. Using cbind() with Comps[,c(1,2)] attaches only the first two components. Here I used the iris dataset:

library(ggplot2) # stat_ellipse() comes from ggplot2
#Data
data("iris")
#PCA
mypca <- prcomp(iris[,-5])
#Isolate components
Comps <- as.data.frame(mypca$x)
#Extract components and bind to original data
newiris <- cbind(iris,Comps[,c(1,2)])
#Plot
ggplot(newiris, aes(x=PC1, y=PC2, col = Species, fill = Species)) +
stat_ellipse(geom = "polygon", col= "black", alpha =0.5)+
geom_point(shape=21, col="black")

For the data you shared, the only change needed is to leave out the NA action. Here is the code for the data you shared:

#Code
ggplot(pcat, aes(x=PC1, y=PC2, col = `Time point`, fill = `Time point`)) +
stat_ellipse(geom = "polygon", col= "black", alpha =0.5)+
geom_point(shape=21, col="black")

R: add calibrated axes to PCA biplot in ggplot2

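As an alternative, you could remove the default panel box and axes altogether and draw a smaller rectangle in the plot region instead. Clipping the lines so they do not clash with the text labels is a bit tricky, but this might work. Note that the code picks the panel grob by position (g$grobs[[4]]), so it depends on ggplot2 internals and may need adjusting for other versions.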

library(ggplot2)

df <- data.frame(x = -1:1, y = -1:1)
dfLabs <- data.frame(x = c(1, -1, 1/2), y = c(-0.75, -0.25, 1),
labels = paste0("V", 1:3))
p <- ggplot(data = df, aes(x = x, y = y)) +
geom_blank() +
geom_blank(data=dfLabs, aes(x = x, y = y)) +
geom_text(data = dfLabs, mapping = aes(label = labels)) +
geom_abline(intercept = rep(0, 3), slope = c(-0.75, 0.25, 2)) +
theme_grey() +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank())

library(grid)
element_grob.element_custom <- function(element, ...) {
rectGrob(0.5,0.5, 0.8, 0.8, gp=gpar(fill="grey95"))
}

panel_custom <- function(...){ # dummy wrapper
structure(
list(...),
class = c("element_custom","element_blank", "element")
)

}

p <- p + theme(panel.background=panel_custom())


clip_layer <- function(g, layer="segment", width=1, height=1){
id <- grep(layer, names(g$grobs[[4]][["children"]]))
newvp <- viewport(width=unit(width, "npc"),
height=unit(height, "npc"), clip=TRUE)
g$grobs[[4]][["children"]][[id]][["vp"]] <- newvp

g
}

g <- ggplotGrob(p)
g <- clip_layer(g, "segment", 0.85, 0.85)
grid.newpage()
grid.draw(g)

Plotting PCA biplot with autoplot: modify arrow thickness

The problem is that ggfortify has already created a ggplot2 object. So if you don't want to recreate the plot by hand (which would be the cleaner solution here), you have to modify the existing plot in the following way:

Old code

library(ggplot2)
library(ggfortify)

df <- iris[c(1, 2, 3, 4)]
iris.pca <- prcomp(df)

d <- autoplot(iris.pca, data=iris, colour="Species", loadings=TRUE, loadings.colour = "black", scale = 1)+
scale_colour_manual(values=c("forestgreen","red","blue")) +
scale_fill_manual(values=c("forestgreen","red","blue")) +
scale_shape_manual(values=c(25,22,23))+
theme_bw()

Modifications

d$layers[[2]]$aes_params$size <- 0.5
d$layers[[2]]$geom_params$arrow$length <- unit(6, units = "points")
d

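This sets the size aesthetic of the arrow lines by hand and shortens the arrowheads. Layer 2 is the loadings (arrow) layer that autoplot() added on top of the points. On ggplot2 3.4.0 and later the line width aesthetic is named linewidth, so if setting size has no effect, try d$layers[[2]]$aes_params$linewidth instead.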
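
The layer index 2 is specific to this plot. As a hedged sketch of how the index could be looked up instead of hard-coded, the following uses a plain ggplot2 plot built on mtcars as a stand-in for the autoplot() result; find_layer() is a hypothetical helper, not part of ggplot2 or ggfortify.

```r
library(ggplot2)

# Hypothetical helper: index of the first layer drawn with the given geom class
find_layer <- function(p, geom_class = "GeomSegment") {
  which(vapply(p$layers, function(l) inherits(l$geom, geom_class), logical(1)))[1]
}

# Stand-in plot: a point layer followed by an arrow (segment) layer
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_segment(aes(x = 0, y = 0, xend = wt, yend = mpg), arrow = arrow())

i <- find_layer(p)

# Same modification as above, applied to whichever layer holds the arrows
p$layers[[i]]$geom_params$arrow$length <- unit(6, units = "points")
p
```

The same lookup should apply to the object returned by autoplot(), since it is an ordinary ggplot2 plot, but that has not been checked here.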