Plot Correlation Matrix into a Graph

Plot correlation matrix into a graph

Quick, dirty, and in the ballpark:

library(lattice)

#Build the horizontal and vertical axis information
hor <- c("214", "215", "216", "224", "211", "212", "213", "223", "226", "225")
ver <- paste("DM1-", hor, sep="")

#Build the fake correlation matrix
nrowcol <- length(ver)
cor <- matrix(runif(nrowcol*nrowcol, min=0.4), nrow=nrowcol, ncol=nrowcol, dimnames = list(hor, ver))
diag(cor) <- 1  #A variable is perfectly correlated with itself

#Build the plot
rgb.palette <- colorRampPalette(c("blue", "yellow"), space = "rgb")
levelplot(cor, main="stage 12-14 array correlation matrix", xlab="", ylab="", col.regions=rgb.palette(120), cuts=100, at=seq(0,1,0.01))


Plot correlation matrix using pandas

You can use pyplot.matshow() from matplotlib:

import matplotlib.pyplot as plt

plt.matshow(dataframe.corr())
plt.show()

Edit:

A comment asked how to change the axis tick labels. Here's a deluxe version that is drawn on a bigger figure, has axis labels that match the dataframe, and adds a colorbar legend for interpreting the color scale.

I'm including how to adjust the size and rotation of the labels, and I'm using a figure ratio that makes the colorbar and the main figure come out the same height.


EDIT 2:
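Older pandas versions silently drop non-numerical columns in df.corr(), and pandas 2.0 and later raise an error on them unless numeric_only=True is passed. Either way, .select_dtypes(['number']) should be used when defining the x and y labels to avoid an unwanted shift of the labels (included in the code below).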

# Keep only the numeric columns so the matrix and the tick labels stay aligned
numeric = df.select_dtypes(['number'])

f = plt.figure(figsize=(19, 15))
plt.matshow(numeric.corr(), fignum=f.number)
plt.xticks(range(numeric.shape[1]), numeric.columns, fontsize=14, rotation=45)
plt.yticks(range(numeric.shape[1]), numeric.columns, fontsize=14)
cb = plt.colorbar()
cb.ax.tick_params(labelsize=14)
plt.title('Correlation Matrix', fontsize=16)
plt.show()
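
To see why the tick labels need the numeric-only column list, here is a small sketch with a hypothetical DataFrame (the column names and values are made up for illustration). It assumes pandas is installed and computes the correlation on the numeric columns only, which behaves the same across pandas versions.

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
df = pd.DataFrame({
    "label": ["a", "b", "c", "d"],
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 4.1, 5.9, 8.2],
})

numeric = df.select_dtypes(["number"])  # drops the "label" column
corr = numeric.corr()                   # 2 x 2 matrix

# Ticks built from df.columns would give three labels for a 2 x 2 matrix,
# so every label would sit one position off. The numeric column list matches.
print(len(df.columns), corr.shape[1])
print(list(numeric.columns) == list(corr.columns))
```

The first line printed shows the mismatch in counts (three columns in the DataFrame, two in the matrix). The second confirms that the numeric column list matches the matrix columns one to one.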


Chord plot for a correlation matrix in R

Assuming you are using the circlize package, you can set the colours manually over a range by building a colour-mapping function with colorRamp2() and passing it to the col argument of chordDiagram():

library(circlize)
cols = colorRamp2(c(-1,0,1),c("blue","white","red"),transparency = 0.3)
chordDiagram(mat,col=cols,big.gap=30)

I used the matrix to plot the chord diagram, but the data frame should produce the same results. However, I don't understand what you mean when you say all links are represented even if the value is 0: S1 to E1 is 0, for example, and there is no link between the two.
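
For readers who want to see what a value-to-colour mapping function does, here is a minimal Python sketch of the idea. It is not the circlize implementation: the helper name color_ramp is hypothetical, and it interpolates linearly in RGB and clamps out-of-range values, whereas colorRamp2() interpolates in LAB space by default, so the exact colours will differ.

```python
def color_ramp(breaks, colors):
    """Return a function mapping a number to an (r, g, b) tuple.

    Values are linearly interpolated in RGB between neighbouring
    breakpoints and clamped to the end colours outside the range.
    """
    def to_rgb(value):
        if value <= breaks[0]:
            return colors[0]
        if value >= breaks[-1]:
            return colors[-1]
        segments = zip(zip(breaks, breaks[1:]), zip(colors, colors[1:]))
        for (lo, hi), (c_lo, c_hi) in segments:
            if lo <= value <= hi:
                t = (value - lo) / (hi - lo)  # position inside the segment
                return tuple(round(a + t * (b - a)) for a, b in zip(c_lo, c_hi))
    return to_rgb


BLUE, WHITE, RED = (0, 0, 255), (255, 255, 255), (255, 0, 0)
cols = color_ramp([-1, 0, 1], [BLUE, WHITE, RED])

print(cols(-1))   # (0, 0, 255)
print(cols(0))    # (255, 255, 255)
print(cols(0.5))  # (255, 128, 128)
```

Strong negative correlations come out blue, values near zero come out white, and strong positive correlations come out red, which is the diverging scheme used in the R snippet.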

Correlation matrix in Plotly

I think we can use df.corr() in pandas to calculate the correlation coefficient and make a heatmap of it.

import plotly.graph_objects as go
import random
import numpy as np
import pandas as pd

# Random example data standing in for the real dataset
df = pd.DataFrame({'danceability': np.random.rand(100),
                   'energy': np.random.rand(100),
                   'loudness': np.random.rand(100),
                   'mode': random.choices([0,1], k=100)})
df_corr = df.corr()

print(df_corr)
              danceability    energy   loudness       mode
danceability      1.000000  0.061570   0.048222  -0.060654
energy            0.061570  1.000000   0.019930   0.042376
loudness          0.048222  0.019930   1.000000  -0.044294
mode             -0.060654  0.042376  -0.044294   1.000000

fig = go.Figure()
fig.add_trace(
    go.Heatmap(
        x=df_corr.columns,
        y=df_corr.index,
        z=np.array(df_corr),
        text=df_corr.values,
        texttemplate='%{text:.2f}'
    )
)
fig.show()
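
As a sanity check on what the heatmap displays, here is a plain-Python sketch of the Pearson correlation matrix, which is the method df.corr() uses by default. The helper names pearson and corr_matrix and the sample columns are hypothetical, chosen only for illustration. The sketch assumes every column has the same length and is not constant.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def corr_matrix(columns):
    """Nested dict of pairwise correlations, keyed by column name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns: y rises with x, z falls as x rises.
data = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
}
m = corr_matrix(data)

for name, row in m.items():
    print(name, [round(v, 2) for v in row.values()])
```

The matrix is symmetric with ones on the diagonal, which is why the heatmap mirrors itself across the diagonal.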
