Plotting PCA biplot with ggplot2
Maybe this will help. It's adapted from code I wrote some time back, and it now draws arrows as well.
library(ggplot2)
PCbiplot <- function(PC, x="PC1", y="PC2") {
# PC is a prcomp object; x and y name the components to plot
data <- data.frame(obsnames=row.names(PC$x), PC$x)
plot <- ggplot(data, aes_string(x=x, y=y)) + geom_text(alpha=.4, size=3, aes(label=obsnames))
plot <- plot + geom_hline(yintercept=0, size=.2) + geom_vline(xintercept=0, size=.2)
datapc <- data.frame(varnames=rownames(PC$rotation), PC$rotation)
# Ratio of the score range to the loading range, so arrows fit the plot
mult <- min(
(max(data[,y]) - min(data[,y])) / (max(datapc[,y]) - min(datapc[,y])),
(max(data[,x]) - min(data[,x])) / (max(datapc[,x]) - min(datapc[,x]))
)
datapc <- transform(datapc,
v1 = .7 * mult * (get(x)),
v2 = .7 * mult * (get(y))
)
plot <- plot + coord_equal() + geom_text(data=datapc, aes(x=v1, y=v2, label=varnames), size = 5, vjust=1, color="red")
plot <- plot + geom_segment(data=datapc, aes(x=0, y=0, xend=v1, yend=v2), arrow=arrow(length=unit(0.2,"cm")), alpha=0.75, color="red")
plot
}
fit <- prcomp(USArrests, scale.=TRUE)
PCbiplot(fit)
You may want to change the size of the text, as well as the transparency and colors, to taste; it would be easy to make them parameters of the function.
Note: it occurred to me that this works with prcomp but your example is with princomp. You may, again, need to adapt the code accordingly.
Note2: the code for geom_segment() is borrowed from the mailing list post linked from a comment on the OP.
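For the princomp case mentioned in the note, a minimal sketch of the adaptation might look like this. It relies only on base R: princomp stores the scores in $scores and the loadings in $loadings, and names the columns Comp.1, Comp.2, and so on. The names fit2, scores, and loadings are illustrative and not part of the original function.

```r
# Sketch: pull the two data frames a biplot needs out of a princomp object.
# prcomp uses $x and $rotation; princomp uses $scores and $loadings.
fit2 <- princomp(USArrests, cor = TRUE)

# One row per observation, with the component scores
scores <- data.frame(obsnames = rownames(fit2$scores), fit2$scores)

# One row per variable; unclass() drops the "loadings" class to get a plain matrix
loadings <- data.frame(varnames = rownames(fit2$loadings),
                       unclass(fit2$loadings))

# Columns are Comp.1, Comp.2, ... rather than PC1, PC2, ...
head(scores[, c("obsnames", "Comp.1", "Comp.2")])
loadings[, c("varnames", "Comp.1", "Comp.2")]
```

Inside PCbiplot, these two data frames would stand in for data and datapc, and the call would use x="Comp.1", y="Comp.2" instead of the PC names.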
A self-written code for biplot in ggplot2
The problem is with your geom_text layer:
geom_text(data = loadings, aes(x=PC2, y=PC3, label=label),color="#006400")
Both loadings$PC2 and loadings$PC3 have length 4, but label has length 150. These do not go together: every aesthetic in a layer must have the same length as the layer's data (or length 1).
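One way to resolve the mismatch, sketched under assumptions: the question's own data is not shown here, so this uses iris (150 rows, 4 numeric variables) as a stand-in, and it assumes the labels are meant to be the variable names. Storing the labels as a column of loadings guarantees they have the same length as PC2 and PC3.

```r
library(ggplot2)

# Assumption: iris stands in for the data from the question
pca <- prcomp(iris[, 1:4], scale. = TRUE)

scores   <- as.data.frame(pca$x)         # 150 rows, one per observation
loadings <- as.data.frame(pca$rotation)  # 4 rows, one per variable

# Keep the labels inside the loadings data frame: 4 labels for 4 rows
loadings$label <- rownames(loadings)

p <- ggplot(scores, aes(x = PC2, y = PC3)) +
  geom_point(alpha = 0.4) +
  geom_text(data = loadings, aes(x = PC2, y = PC3, label = label),
            color = "#006400")
p
```

Because label is now looked up in loadings rather than in the global environment, the text layer sees three vectors of length 4.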
How to display observations in pca biplot?
You can specify geom.ind = "text":
fviz_pca_biplot(pca,geom.ind="text",labelsize=2,
col.var = "#2E9FDF",col.ind = "#696969")
You can also set repel = TRUE, which uses ggrepel to keep the labels from overlapping:
options(ggrepel.max.overlaps = 20)
fviz_pca_biplot(pca,geom.ind="text",labelsize=2,
col.var = "#2E9FDF",col.ind = "#696969",repel=TRUE)
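If you would rather not depend on factoextra, roughly the same effect for the observation labels can be sketched with ggrepel directly. This is an illustration, not what fviz_pca_biplot does internally; it assumes pca is a prcomp fit (USArrests is used here only as example data) and uses geom_text_repel with its max.overlaps argument in place of the global option.

```r
library(ggplot2)
library(ggrepel)

# Assumption: pca is a prcomp fit; USArrests is only example data
pca <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(name = rownames(pca$x), pca$x)

# Repelled text labels for the observations, one per row of scores
p <- ggplot(scores, aes(x = PC1, y = PC2, label = name)) +
  geom_text_repel(size = 2, colour = "#696969", max.overlaps = 20)
p
```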
Problems Plotting PCA in R with ggplot2
Also check this example. The trick is to use Comps <- as.data.frame(mypca$x) to isolate the components and then add them to the original data. You can then use cbind() with Comps[,c(1,2)] to bind only the first two components. Here I used the iris dataset:
library(ggplot2)
library(ggforce)
#Data
data("iris")
#PCA
mypca <- prcomp(iris[,-5])
#Isolate components
Comps <- as.data.frame(mypca$x)
#Extract components and bind to original data
newiris <- cbind(iris,Comps[,c(1,2)])
#Plot
ggplot(newiris, aes(x=PC1, y=PC2, col = Species, fill = Species)) +
stat_ellipse(geom = "polygon", col= "black", alpha =0.5)+
geom_point(shape=21, col="black")
For the data you shared, the only change is to skip the NA action. Here is the code with your data:
#Code
ggplot(pcat, aes(x=PC1, y=PC2, col = `Time point`, fill = `Time point`)) +
stat_ellipse(geom = "polygon", col= "black", alpha =0.5)+
geom_point(shape=21, col="black")
R: add calibrated axes to PCA biplot in ggplot2
Maybe as an alternative, you could remove the default panel box and axes altogether, and draw a smaller rectangle in the plot region instead. Clipping the lines not to clash with the text labels is a bit tricky, but this might work.
df <- data.frame(x = -1:1, y = -1:1)
dfLabs <- data.frame(x = c(1, -1, 1/2), y = c(-0.75, -0.25, 1),
labels = paste0("V", 1:3))
p <- ggplot(data = df, aes(x = x, y = y)) +
geom_blank() +
geom_blank(data=dfLabs, aes(x = x, y = y)) +
geom_text(data = dfLabs, mapping = aes(label = labels)) +
geom_abline(intercept = rep(0, 3), slope = c(-0.75, 0.25, 2)) +
theme_grey() +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank())
library(grid)
element_grob.element_custom <- function(element, ...) {
rectGrob(0.5,0.5, 0.8, 0.8, gp=gpar(fill="grey95"))
}
panel_custom <- function(...){ # dummy wrapper
structure(
list(...),
class = c("element_custom","element_blank", "element")
)
}
p <- p + theme(panel.background=panel_custom())
clip_layer <- function(g, layer="segment", width=1, height=1){
id <- grep(layer, names(g$grobs[[4]][["children"]]))
newvp <- viewport(width=unit(width, "npc"),
height=unit(height, "npc"), clip=TRUE)
g$grobs[[4]][["children"]][[id]][["vp"]] <- newvp
g
}
g <- ggplotGrob(p)
g <- clip_layer(g, "segment", 0.85, 0.85)
grid.newpage()
grid.draw(g)
Plotting PCA biplot with autoplot: modify arrow thickness
The problem is that ggfortify has already created a ggplot2 object. So if you don't want to recreate the plot by hand (which would be the cleaner solution here), you have to modify the existing plot in the following way:
Old code
library(ggplot2)
library(ggfortify)
df <- iris[c(1, 2, 3, 4)]
iris.pca<-(prcomp(df))
d <- autoplot(iris.pca, data=iris, colour="Species", loadings=TRUE, loadings.colour = "black", scale = 1)+
scale_colour_manual(values=c("forestgreen","red","blue")) +
scale_fill_manual(values=c("forestgreen","red","blue")) +
scale_shape_manual(values=c(25,22,23))+
theme_bw()
Modifications
d$layers[[2]]$aes_params$size <- 0.5
d$layers[[2]]$geom_params$arrow$length <- unit(6, units = "points")
d
This manually sets the size aesthetic for the arrow lines and shrinks the arrowheads.
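The index 2 in d$layers[[2]] depends on the order in which the layers were added, so it can help to look the layer up by its geom instead. The following is a sketch on a hypothetical stand-in plot (p, built from mtcars) rather than the autoplot() result; it assumes the arrows are drawn by a segment layer, and the internal layer structure it touches may differ between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical stand-in for the autoplot() result: points plus one arrow layer
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  annotate("segment", x = 2, y = 15, xend = 3, yend = 20, arrow = arrow())

# Name of the geom behind each layer, e.g. "GeomPoint", "GeomSegment"
geoms <- vapply(p$layers, function(l) class(l$geom)[1], character(1))

# Position of the segment (arrow) layer, found by name instead of hard-coded
seg <- which(geoms == "GeomSegment")

# Shrink the arrowheads on that layer, as in the modification above
p$layers[[seg]]$geom_params$arrow$length <- unit(6, units = "points")
p
```

With the real plot, printing geoms first shows which layer holds the loadings arrows before anything is changed.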