Python : Plot heatmap for large matrix
I can think of two options using NumPy arrays.
- If your data is mostly greater than zero but contains a lot of zeros, set vmin so that the color scale starts just above zero:
vmin = some_value_higher_than_zero
plt.matshow(k, aspect='auto', vmin=vmin)
- Set all zeros to NaN; NaN cells are automatically left out of the plot (k must have a float dtype for this to work):
k[k == 0.0] = np.nan
plt.matshow(k, aspect='auto')
NB: both imshow and matshow work here.
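One caveat with the NaN option: NaN only exists for floating-point dtypes, so assigning np.nan into an integer array raises a ValueError. Below is a minimal sketch of a helper that works on a float copy instead. The name zeros_to_nan and the sample data are hypothetical, not part of the original answer.

```python
import numpy as np

def zeros_to_nan(k):
    """Return a float copy of k with exact zeros replaced by NaN."""
    out = np.array(k, dtype=float)  # copy; NaN needs a float dtype
    out[out == 0] = np.nan
    return out

# Hypothetical integer input; the original array is left unchanged.
k = np.array([[0, 2, 0],
              [3, 0, 5]])
masked = zeros_to_nan(k)

# masked can now be passed to plt.matshow(masked, aspect='auto');
# the NaN cells are left blank.
```

Working on a copy also keeps the original zeros available in case they are needed later.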
Another option, when your matrix is really sparse, is to use a scatter plot.
x, y = k.nonzero()  # x holds the row indices, y the column indices
plt.scatter(x, y, s=100, c=k[x, y])  # color each point by its value in k
How to render a heatmap for a large array
- The original code didn't generate a plot for me.
- Changing fig, ax = plt.subplots() to plt.figure(figsize=(14, 14)) worked to create the plot.
- At figsize=(10, 10), the figure didn't render in Jupyter, but the correct image did save to a file.
- A figure smaller than figsize=(14, 14) wouldn't render in Jupyter.
- At
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# create matrix
size = 10000
similarity_matrix = np.random.rand(size, size)
# plot matrix
# create figure and set size
plt.figure(figsize=(14, 14))
# add heatmap
sns.heatmap(similarity_matrix, vmin=0, vmax=1)
# save the figure
plt.savefig('test.png', dpi=600)
# show the figure; this was slow
plt.show()
Heatmap for a large matrix and get the clear labels
Since you stated that you need all the labels, the only way I see is to reduce the font size. You can do this by setting the cexCol and cexRow parameters in your call to heatmap(), for example:
heatmap(as.matrix(iris[,1:3]), cexRow = 0.1, cexCol = 0.1)
How to plot a heatmap of a big matrix with matplotlib (45K * 446)
I solved this by downsampling the matrix to a smaller one.
I tried two approaches:
- Strided sampling: to downsample a matrix of 45k rows to 1k rows, I took one row out of every 45.
- Group averaging: to downsample 45k rows to 1k rows, I split the rows into 1k groups of 45 adjacent rows and used the mean of each group as its representative row.
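Both approaches can be sketched with NumPy slicing and reshaping. This is an illustration under assumptions, not the answerer's actual code: the function names, the step of 45 and the small stand-in matrix (4 columns instead of 446, to keep it light) are made up for the example.

```python
import numpy as np

def downsample_stride(m, step):
    """Keep one row out of every `step` rows."""
    return m[::step]

def downsample_mean(m, step):
    """Average each group of `step` adjacent rows.

    Trailing rows that do not fill a complete group are dropped.
    """
    n_groups = m.shape[0] // step
    trimmed = m[:n_groups * step]
    return trimmed.reshape(n_groups, step, m.shape[1]).mean(axis=1)

# Hypothetical stand-in for the 45k-row matrix.
m = np.arange(45_000 * 4, dtype=float).reshape(45_000, 4)

by_stride = downsample_stride(m, 45)  # 1000 rows
by_mean = downsample_mean(m, 45)      # 1000 rows
```

Strided sampling is cheaper but can miss narrow features that fall between the sampled rows. Averaging keeps their contribution at the cost of some smoothing.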
Hope it helps.
How to make clustered heatmap of a large dataset look nicer?
The problem is your vmax = 1 argument. If you look at the maximum value in the whole dataset using new_matrix.max().max(), it is about 0.17, so with vmax = 1 the data only uses the lowest part of the color scale.
So either remove vmax or set it to a lower value.
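One way to choose a lower vmax is to derive it from the data rather than hard-coding it. Here is a minimal sketch, assuming the values can be converted to a NumPy array. The helper name data_vmax, its percentile parameter and the random stand-in data are hypothetical.

```python
import numpy as np

def data_vmax(values, percentile=100.0):
    """Upper color limit taken from the data itself, ignoring NaNs.

    percentile=100 gives the maximum; a lower value (e.g. 99) keeps
    a few outliers from stretching the color scale.
    """
    arr = np.asarray(values, dtype=float)
    return float(np.nanpercentile(arr, percentile))

# Hypothetical stand-in for new_matrix, with values below 0.17.
rng = np.random.default_rng(0)
new_matrix = rng.uniform(0.0, 0.17, size=(50, 20))

vmax = data_vmax(new_matrix)
# vmax can then replace the hard-coded vmax=1 in the plotting call.
```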