Plot Two Histograms on Single Chart With Matplotlib

Plot two histograms on single chart with matplotlib

Here is a working example:

import random
import numpy
from matplotlib import pyplot

x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

bins = numpy.linspace(-10, 10, 100)

pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()


Plot two Histograms with Matplotlib and Python

The seaborn library can help. The calls below draw the kernel density estimate (KDE) curve of each list on the same axes:

import matplotlib.pyplot as plt
import seaborn as sns


# Your Data
list_1 = [1,2,3,4,5,1,2,3,4,5,6,3,4,5,1,3,4,5,4,5,6,8,9,12,3,3,3,4,3,4,5,6,5,6,7,8,9,5,3,2,4,5,2,3,4,11,13,4,5,3,5,6,7,11,13,3,4,5,4,5]
list_2 = [4,5,6,7,8,9,4,5,6,7,8,9,5,6,7,8,9,6,7,8,9,12,15,16,11,12,7,8,9,7,8,9,5,6,7,8,9,7,8,9,8,9,11,10,12,16,7,8,9,10,10,8,9,8,9,10,10,10,15,16,19]

# Draw the KDE curve of each list on the same axes
# (sns.distplot is deprecated; sns.kdeplot covers the KDE-only case)
fig = plt.figure(figsize=(15,5))
ax = fig.add_subplot(111)

sns.kdeplot(list_1, ax=ax)
sns.kdeplot(list_2, ax=ax)

plt.show()


How to have two histograms but not stacked?

You can combine the two data series in a single plot by passing them as a list to plt.hist.

Check the documentation of the plt.hist function. For the x parameter it says: "Input values, this takes either a single array or a sequence of arrays which are not required to be of the same length."

import matplotlib.pyplot as plt
import seaborn as sns

# `members` is assumed to be a pandas DataFrame that has these two columns
sns.set(palette="Reds_r")
plt.figure(figsize=(15,10))
members_injury_height = members["injury_height_metres"]
members_death_height = members["death_height_metres"]

# Label each series here so that a single legend call picks them up
plt.hist([members_injury_height, members_death_height], 20, label=['injured', 'dead'])
plt.legend(loc='upper right')
plt.xlabel("Heights")
plt.ylabel("% of members dead/injured")
plt.title("Distribution of heights at which members were injured or died")
fond = plt.gca()
fond.set_facecolor('whitesmoke')

This draws the two series as side-by-side bars within each bin rather than stacking them.
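The members table used above is not available here, so the following is only a minimal sketch of the same idea with synthetic data. The array names, the sample sizes and the height values are made up for illustration; the only point is that plt.hist / Axes.hist accepts a list of arrays of different lengths and bins them over a common range.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Synthetic stand-ins for the two height columns (hypothetical values, in metres);
# the two samples deliberately have different lengths
injury_heights = rng.normal(6500, 600, size=300)
death_heights = rng.normal(7200, 700, size=120)

fig, ax = plt.subplots(figsize=(8, 5))
# A list of arrays gives one group of side-by-side bars per bin
counts, edges, _ = ax.hist(
    [injury_heights, death_heights],
    bins=20,
    label=["injured", "dead"],
)
ax.set_xlabel("Height (m)")
ax.set_ylabel("Number of members")
ax.legend(loc="upper right")
# plt.show()  # uncomment to display the figure
```

The returned counts hold one row per input series, and both rows refer to the same 21 bin edges.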

Plot two histograms on the same graph and have their columns sum to 100

It sounds like you don't want the normed/density kwarg in this case. You're already using weights. If you multiply your weights by 100 and leave out the normed=True option, you should get exactly what you had in mind.

For example:

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1)

x = np.random.normal(5, 2, 10000)
y = np.random.normal(2, 1, 3000000)

xweights = 100 * np.ones_like(x) / x.size
yweights = 100 * np.ones_like(y) / y.size

fig, ax = plt.subplots()
ax.hist(x, weights=xweights, color='lightblue', alpha=0.5)
ax.hist(y, weights=yweights, color='salmon', alpha=0.5)

ax.set(title='Histogram Comparison', ylabel='% of Dataset in Bin')
ax.margins(0.05)
ax.set_ylim(bottom=0)
plt.show()


On the other hand, what you're currently doing (weights together with normed, which current matplotlib calls density) would result in the following (note the units on the y-axis):

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1)

x = np.random.normal(5, 2, 10000)
y = np.random.normal(2, 1, 3000000)

xweights = 100 * np.ones_like(x) / x.size
yweights = 100 * np.ones_like(y) / y.size

fig, ax = plt.subplots()
# normed was removed in matplotlib 3.1; density is its replacement
ax.hist(x, weights=xweights, color='lightblue', alpha=0.5, density=True)
ax.hist(y, weights=yweights, color='salmon', alpha=0.5, density=True)

ax.set(title='Histogram Comparison', ylabel='% of Dataset in Bin')
ax.margins(0.05)
ax.set_ylim(bottom=0)
plt.show()

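The difference in y-axis units between the two versions can also be checked numerically. The sketch below is an illustration rather than part of the original answer: it uses np.histogram, which Axes.hist relies on for the binning, with a smaller synthetic sample and an arbitrary choice of 10 bins.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(5, 2, 10_000)

# Weights that make every sample worth 100/N percent
weights = 100 * np.ones_like(x) / x.size

# Weights only: bar heights are percentages and add up to 100
percent, edges = np.histogram(x, bins=10, weights=weights)

# Weights plus density=True: heights are rescaled so that the *area*
# under the bars is 1, which cancels the factor of 100 in the weights
dens, _ = np.histogram(x, bins=10, weights=weights, density=True)
area = (dens * np.diff(edges)).sum()

print(percent.sum())  # total of the bar heights, in percent
print(area)           # total area under the density bars
```

So with weights alone the y-axis reads as "% of dataset in bin", while adding density turns it back into a probability density whose values depend on the bin width.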

How to plot these histograms next to each other

Take sample data with 3 columns:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

d = {'col1': [1, 2, 3, 4, 5, 6, 3, 5],
'col2': [1, 2, 2, 2, 5, 6, 3, 5],
'col3': [1, 2, 2, 2, 2, 2, 2, 6], }
df = pd.DataFrame(data=d)

df


You can use subplots from matplotlib to see each column in a separate histogram:

fig, axs = plt.subplots(len(df.columns), figsize=(8,12))

# One histogram per column, each on its own axes
for ax, col in zip(axs, df.columns):
    ax.hist(df[col], bins=10)
    ax.set_title(col)

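If "next to each other" means side by side in a single row, one possible variant is to ask plt.subplots for one row and one column of axes per DataFrame column. This is a sketch under that assumption; the figure size and the sharey=True choice are illustrative, not taken from the original answer.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6, 3, 5],
    'col2': [1, 2, 2, 2, 5, 6, 3, 5],
    'col3': [1, 2, 2, 2, 2, 2, 2, 6],
})

# One row, one axes per DataFrame column; sharey keeps the count
# axes comparable across the panels
fig, axs = plt.subplots(1, len(df.columns), figsize=(12, 4), sharey=True)

for ax, col in zip(axs, df.columns):
    ax.hist(df[col], bins=10)
    ax.set_title(col)

axs[0].set_ylabel('Frequency')
fig.tight_layout()
# plt.show()  # uncomment to display the figure
```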

How to create multiple histograms on separate graphs with matplotlib?

This is probably when you want to use matplotlib's object-oriented interface. There are a couple ways that you could handle this.

First, you might want each plot on an entirely separate figure, in which case matplotlib lets you keep track of the various figures.

import numpy as np
import matplotlib.pyplot as plt

a = np.random.normal(size=200)
b = np.random.normal(size=200)

fig1 = plt.figure()
ax1 = fig1.add_subplot(1, 1, 1)
n, bins, patches = ax1.hist(a)
ax1.set_xlabel('Angle a (degrees)')
ax1.set_ylabel('Frequency')

fig2 = plt.figure()
ax2 = fig2.add_subplot(1, 1, 1)
n, bins, patches = ax2.hist(b)
ax2.set_xlabel('Angle b (degrees)')
ax2.set_ylabel('Frequency')

Or, you can divide your figure into multiple subplots and plot a histogram on each of those, in which case matplotlib lets you keep track of the various subplots.

fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2)

n, bins, patches = ax1.hist(a)
ax1.set_xlabel('Angle a (degrees)')
ax1.set_ylabel('Frequency')

n, bins, patches = ax2.hist(b)
ax2.set_xlabel('Angle b (degrees)')
ax2.set_ylabel('Frequency')

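The three numbers in add_subplot are the number of rows, the number of columns, and the index of the subplot (counted from 1), so add_subplot(2, 1, 2) is the second plot in a grid of 2 rows and 1 column.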
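As a rough illustration of how the add_subplot numbers map to positions, the sketch below places three axes in a 2 by 2 grid. The variable names and titles are made up for the example.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# add_subplot(nrows, ncols, index): the index counts from 1,
# left to right and then top to bottom
top_left = fig.add_subplot(2, 2, 1)
bottom_left = fig.add_subplot(2, 2, 3)
# The three-digit shorthand 224 means the same as (2, 2, 4)
bottom_right = fig.add_subplot(224)

for ax, name in [(top_left, "index 1"),
                 (bottom_left, "index 3"),
                 (bottom_right, "index 4")]:
    ax.set_title(name)
# plt.show()  # uncomment to display the figure
```

The slot for index 2 (top right) is simply left empty.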

Compare two images and output their histograms

  1. Plot the histograms of both black-and-white images on the same axes, then display the figure.

  2. Choose the line colors with the color argument of plot().

Code:

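import cv2
import matplotlib.pyplot as plt

# Load both images as grayscale (flag 0 is cv2.IMREAD_GRAYSCALE)
img1 = cv2.imread('image_1.jpg', 0)
img2 = cv2.imread('image_2.jpg', 0)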

# histogram of both images
hist1 = cv2.calcHist([img1],[0],None,[256],[0,256])
hist2 = cv2.calcHist([img2],[0],None,[256],[0,256])

# plot both the histograms and mention colors of your choice
plt.plot(hist1,color='red')
plt.plot(hist2, color='green')

plt.show()


To answer your second question, store the values in hist1 and hist2 as two separate columns in a dataframe. Save the dataframe as an Excel or CSV file. You can later open the file, select the columns, and plot them.
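A minimal sketch of that last step, under some assumptions: the two images are replaced by random 8-bit arrays, the histograms are computed with np.histogram instead of cv2.calcHist so that the example runs without OpenCV, and the column names and the "intensity" index name are arbitrary. With real cv2.calcHist output, which has shape (256, 1), hist1.ravel() would give the flat column needed here.

```python
import io
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic 8-bit grayscale "images" standing in for img1 and img2
img1 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
img2 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# 256-bin intensity histograms, one count per gray level 0..255
hist1, _ = np.histogram(img1, bins=256, range=(0, 256))
hist2, _ = np.histogram(img2, bins=256, range=(0, 256))

# One column per image; the row index is the gray level
df = pd.DataFrame({"hist1": hist1, "hist2": hist2})
df.index.name = "intensity"

# Written to an in-memory buffer here; pass a file name such as
# "histograms.csv" (hypothetical) to df.to_csv to save to disk instead
buffer = io.StringIO()
df.to_csv(buffer)

# Reading the CSV back recovers the same two columns
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="intensity")
```

The restored frame can then be plotted directly, for example with restored.plot().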

Matplotlib: How to make two histograms have the same bin width?

A consistent approach that works in most cases, without having to worry about the range of each dataset, is to pool the datasets into one, compute the bin edges from the pooled data, and then plot each dataset with those edges:

import numpy as np
import matplotlib.pyplot as plt

a = np.random.random(100) * 0.5           # a uniform distribution
b = 1 - np.random.normal(size=100) * 0.1  # a normal distribution
bins = np.histogram(np.hstack((a, b)), bins=40)[1]  # get the bin edges
plt.hist(a, bins)
plt.hist(b, bins)
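The same pooling idea can also be written with np.histogram_bin_edges, which returns only the edges. This is a sketch of an equivalent variant, not the original answer's code; the seed, the alpha value and the labels are illustrative additions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
a = rng.random(100) * 0.5            # uniform on [0, 0.5)
b = 1 - rng.normal(size=100) * 0.1   # normal, centred on 1

# Bin edges computed once over the pooled data, so both histograms
# share the same edges and therefore the same bin width
edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=40)

fig, ax = plt.subplots()
counts_a, _, _ = ax.hist(a, bins=edges, alpha=0.5, label='a')
counts_b, _, _ = ax.hist(b, bins=edges, alpha=0.5, label='b')
ax.legend()
# plt.show()  # uncomment to display the figure
```

Because the edges span the pooled minimum and maximum, every sample of both datasets falls into one of the 40 bins.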
