How to add multiple annotations to a bar plot
With pandas
- Tested with pandas v1.2.4
Imports and Load Data
import pandas as pd
import matplotlib.pyplot as plt
# create the dataframe from values in the OP
counts = [29227, 102492, 53269, 504028, 802994]
df = pd.DataFrame(data=counts, columns=['counts'], index=['A','B','C','D','E'])
# add a percent column
df['%'] = df.counts.div(df.counts.sum()).mul(100).round(2)
# display(df)
counts %
A 29227 1.96
B 102492 6.87
C 53269 3.57
D 504028 33.78
E 802994 53.82
Plot using matplotlib from version 3.4.2
- Use matplotlib.pyplot.bar_label
- See How to add value labels on a bar chart for additional details and examples with .bar_label.
- Tested with pandas v1.2.4, which is using matplotlib as the plot engine.
- Some formatting can be done with the fmt parameter, but more sophisticated formatting should be done with the labels parameter.
ax = df.plot(kind='barh', y='counts', figsize=(10, 5), legend=False, width=.75,
title='This is the plot generated by all code examples in this answer')
# customize the label to include the percent
labels = [f' {v.get_width()}\n {df.iloc[i, 1]}%' for i, v in enumerate(ax.containers[0])]
# set the bar label
ax.bar_label(ax.containers[0], labels=labels, label_type='edge', size=13)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
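The difference between the fmt and labels parameters noted above can be sketched as follows. This is a minimal illustration rather than part of the original answer: it reuses the counts from this answer, draws the same bars on two axes (an arrangement assumed here only so the two labelling styles can sit side by side), and uses the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

counts = [29227, 102492, 53269, 504028, 802994]
df = pd.DataFrame(data=counts, columns=['counts'], index=['A', 'B', 'C', 'D', 'E'])
df['%'] = df.counts.div(df.counts.sum()).mul(100).round(2)

# two axes with the same bars, one per labelling style
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
df.plot(kind='barh', y='counts', legend=False, ax=ax1)
df.plot(kind='barh', y='counts', legend=False, ax=ax2)

# fmt: a single printf-style format applied to every bar value
fmt_texts = ax1.bar_label(ax1.containers[0], fmt='%.0f', padding=3)

# labels: arbitrary strings, here combining the count and the percent column
custom = [f'{c:,} ({p}%)' for c, p in zip(counts, df['%'].tolist())]
label_texts = ax2.bar_label(ax2.containers[0], labels=custom, padding=3)
```

fmt is enough when every label is the bar value in a different number format; labels is needed as soon as the text depends on something other than the bar value, such as the percent column.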
Plot using matplotlib before version 3.4.2
# plot the dataframe
ax = df.plot(kind='barh', y='counts', figsize=(10, 5), legend=False, width=.75)
for i, y in enumerate(ax.patches):
    # get the percent label
    label_per = df.iloc[i, 1]
    # add the value label
    ax.text(y.get_width()+.09, y.get_y()+.3, str(round((y.get_width()), 1)), fontsize=10)
    # add the percent label here
    ax.text(y.get_width()+.09, y.get_y()+.1, str(f'{round((label_per), 2)}%'), fontsize=10)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
Original Answer without pandas
- Tested with matplotlib v3.3.4
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(figsize=(10, 5))
counts = [29227, 102492, 53269, 504028, 802994]
# calculate percents
percents = [100*x/sum(counts) for x in counts]
y_ax = ('A','B','C','D','E')
y_tick = np.arange(len(y_ax))
ax.barh(range(len(counts)), counts, align = "center", color = "tab:blue")
ax.set_yticks(y_tick)
ax.set_yticklabels(y_ax, size = 8)
#annotate bar plot with values
for i, y in enumerate(ax.patches):
    label_per = percents[i]
    ax.text(y.get_width()+.09, y.get_y()+.3, str(round((y.get_width()), 1)), fontsize=10)
    # add the percent label here
    # ax.text(y.get_width()+.09, y.get_y()+.3, str(round((label_per), 2)), ha='right', va='center', fontsize=10)
    ax.text(y.get_width()+.09, y.get_y()+.1, str(f'{round((label_per), 2)}%'), fontsize=10)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
- You can play with the positioning.
- Other formatting options mentioned by JohanC:
  - Print both parts of the text in one string with a \n in between to get a "natural" line spacing: str(f'{round((y.get_width()), 1)}\n{round((label_per), 2)}%')
  - ax.text(..., va='center') to vertically center and be able to use a slightly larger font.
  - ax.set_xlim(0, max(counts) * 1.18) to get a bit more space for the text.
  - Start each line of text with a space to get a natural "horizontal" padding. str(f' {round((label_per), 2)}%'), note the space before {.
  - y.get_width()+.09 is extremely close to y.get_width() when these values are in the tens of thousands.
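The formatting options listed above can be combined in one loop. The following is a sketch of how they might fit together, not code from the original answer; the font size and the choice to anchor the text at the bar's vertical midpoint are assumptions made for the illustration, and the Agg backend is used so it runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

counts = [29227, 102492, 53269, 504028, 802994]
percents = [100 * x / sum(counts) for x in counts]
names = ('A', 'B', 'C', 'D', 'E')

fig, ax = plt.subplots(figsize=(10, 5))
ax.barh(names, counts, align='center', color='tab:blue')

texts = []
for bar, pct in zip(ax.patches, percents):
    # one string with a newline; the leading space on each line acts as horizontal padding
    label = f' {round(bar.get_width(), 1)}\n {round(pct, 2)}%'
    # anchor at the end of the bar and at its vertical midpoint, then center vertically
    t = ax.text(bar.get_width(), bar.get_y() + bar.get_height() / 2,
                label, va='center', fontsize=12)
    texts.append(t)

# leave extra room on the right for the text
ax.set_xlim(0, max(counts) * 1.18)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
```

Because the padding comes from the space character rather than a data-unit offset such as +.09, it stays visually constant regardless of the scale of the x-axis.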
Multiple annotations on a seaborn bar chart
You can use ax.bar_label to set custom labels. No need for annotations and loops.
I'm assuming the below example is what you mean by "plot the corresponding percentage values on the bars", but it can be adjusted flexibly.
Note that this doesn't show values smaller than 1%, since those would be overlapping the x-axis and the other label. This can also be easily adjusted below.
The docs have some instructive examples.
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1, figsize=(15, 8))
plots = sns.barplot(x="STRUD", y="Struct_Count", data=df2, ax=ax)
ax.bar_label(ax.containers[0])
ax.bar_label(ax.containers[0],
labels=[f'{e}%' if e > 1 else "" for e in df2.Perc],
label_type="center")
plt.title("Distribution of STRUCT")
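Since df2 is not shown in this answer, here is a self-contained sketch of the same idea with made-up data. The category names, counts, percentages, and the threshold variable are all hypothetical, and plain matplotlib bars are used instead of seaborn to keep the example dependency-free; the bar_label calls are the same either way.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

# hypothetical stand-in for df2: categories, counts, and their percentages
categories = ['a', 'b', 'c', 'd']
struct_count = [500, 300, 195, 5]
perc = [50.0, 30.0, 19.5, 0.5]

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(categories, struct_count)
bars = ax.containers[0]

# first label: the count, at the top edge of each bar (the default label_type)
ax.bar_label(bars)

# second label: the percentage, centered in the bar, blank at or below the threshold
threshold = 1
center = ax.bar_label(bars,
                      labels=[f'{p}%' if p > threshold else '' for p in perc],
                      label_type='center')
```

Lowering threshold to 0 would label every bar, at the cost of possible overlap on very short bars.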
How do I annotate a barplot made from 2 different arrays?
I am able to annotate by using axes.patches instead of plot.patches.
import matplotlib.pyplot as plt

x = [' A6', ' Q2', ' Q5', ' A5', ' A1', ' A4', ' Q3', ' A3']
y = [748, 822, 877, 882, 1347, 1381, 1417, 1929]
fig, ax = plt.subplots(figsize=(10, 7))
ax.bar(x, y)
# annotate each bar with its height, offset 8 points above the top edge
for bar in ax.patches:
    ax.annotate(text=bar.get_height(),
                xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()),
                ha='center',
                va='center',
                size=15,
                xytext=(0, 8),
                textcoords='offset points')
plt.xlabel("Car Model")
plt.ylabel("Car Frequency")
plt.title("Frequency of Most Popular Audi Cars")
plt.ylim(bottom=0)
plt.show()
Creating and Annotating a Grouped Barplot in Python
There are other ways to convert the data to a long (vertical) format, but here the dataframe is unstacked and a bar chart is drawn from the long data. Then get the x-axis position and height of each bar, and annotate it. In my code, I have placed the text at half the bar height.
import matplotlib.pyplot as plt
import seaborn as sns

# reshape the wide dataframe to long format
df_long = df.unstack().to_frame(name='value')
df_long = df_long.swaplevel()
df_long.reset_index(inplace=True)
df_long.columns = ['group', 'status', 'value']

fig, ax = plt.subplots(figsize=(12, 8))
g = sns.barplot(data=df_long, x='group', y='value', hue='status', ax=ax)
# place each value at half the height of its bar
for bar in g.patches:
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width() / 2., 0.5 * height, int(height),
            ha='center', va='center', color='white')
plt.show()
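The wide dataframe from the question is not shown here, so the reshape step can be hard to follow. The sketch below runs it on a small hypothetical dataframe (the group and status names are invented) and compares it with DataFrame.melt, which is one of the "other ways" to get the same long format.

```python
import pandas as pd

# hypothetical wide dataframe: the index holds the groups, the columns hold the statuses
df = pd.DataFrame({'open': [3, 5], 'closed': [7, 2]}, index=['g1', 'g2'])

# the reshape used in the answer
df_long = df.unstack().to_frame(name='value')  # index levels: (status, group)
df_long = df_long.swaplevel()                  # index levels: (group, status)
df_long = df_long.reset_index()
df_long.columns = ['group', 'status', 'value']

# an equivalent reshape with melt
df_melt = (df.rename_axis('group')
             .reset_index()
             .melt(id_vars='group', var_name='status', value_name='value'))
```

Both results have one row per (group, status) pair, which is the shape seaborn's barplot expects when hue is used.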
How to plot a stacked bar with annotations for multiple groups
- This is easier to implement as a stacked bar plot, as such, reshape the dataframe with pandas.crosstab and plot using pandas.DataFrame.plot with kind='bar' and stacked=True
  - This should not be implemented with plt.hist because it's more convoluted, and it's easier to use the pandas plot method directly.
  - Also a histogram is more appropriate when the x values are a continuous range of numbers, not discrete categorical values.
- ct.iloc[:, :-1] selects all but the last column, 'tot', to be plotted as bars.
- Use matplotlib.pyplot.bar_label to add annotations
  - ax.bar_label(ax.containers[2], padding=3) uses label_type='edge' by default, which results in annotating the edge with the cumulative sum ('center' annotates with the patch value), as shown in this answer.
  - The [2] in ax.containers[2] selects only the top containers to annotate with the cumulative sum. The containers are 0 indexed from the bottom.
  - See this answer for additional details and examples
  - This answer shows how to do annotations the old way, without .bar_label. I do not recommend it.
  - This answer shows how to customize labels to prevent annotations for values under a given size.
- Tested in python 3.10, pandas 1.3.5, matplotlib 3.5.1
Load and Shape the DataFrame
import pandas as pd
# load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url)
# reshape the dataframe
ct = pd.crosstab(df.countries, df.type)
# total medals per country, which is necessary to sort the bars
ct['tot'] = ct.sum(axis=1)
# sort
ct = ct.sort_values(by='tot', ascending=False)
# display(ct)
type bronze gold silver tot
countries
USA 33 39 41 113
China 18 38 32 88
ROC 23 20 28 71
GB 22 22 21 65
Japan 17 27 14 58
Australia 22 17 7 46
Italy 20 10 10 40
Germany 16 10 11 37
Netherlands 14 10 12 36
France 11 10 12 33
Plot
# colors listed in the same order as the plotted columns: bronze, gold, silver
colors = ("#CD7F32", "gold", "silver")
cd = dict(zip(ct.columns, colors))
# plot the medals columns
title = 'Country Medal Count for Tokyo 2020'
ax = ct.iloc[:, :-1].plot(kind='bar', stacked=True, color=cd, title=title,
figsize=(12, 5), rot=0, width=1, ec='k' )
# annotate each container with individual values
for c in ax.containers:
    ax.bar_label(c, label_type='center')
# annotate the top containers with the cumulative sum
ax.bar_label(ax.containers[2], padding=3)
# pad the spacing between the number and the edge of the figure
ax.margins(y=0.1)
- An alternative way to annotate the top with the sum is to use the 'tot' column for custom labels, but as shown, this is not necessary.
labels = ct.tot.tolist()
ax.bar_label(ax.containers[2], labels=labels, padding=3)
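To see why label_type='edge' on the top container yields the cumulative sum while label_type='center' yields the individual value, the sketch below labels the same container both ways and collects the resulting text. It uses only the first two rows of the table above, typed in by hand rather than loaded from the URL, and the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# first two rows of the crosstab shown above, without the 'tot' column
ct = pd.DataFrame({'bronze': [33, 18], 'gold': [39, 38], 'silver': [41, 32]},
                  index=['USA', 'China'])

ax = ct.plot(kind='bar', stacked=True, rot=0)

# one container per column, indexed from the bottom of the stack
top = ax.containers[2]

# 'center' labels with the height of the segment itself
center = ax.bar_label(top, label_type='center')
# 'edge' labels with the position of the segment's outer edge, i.e. bottom + height
edge = ax.bar_label(top, label_type='edge', padding=3)

center_text = [t.get_text() for t in center]  # silver counts
edge_text = [t.get_text() for t in edge]      # totals per country
ax.margins(y=0.1)
```

The edge text matches the 'tot' column in the table, which is why passing custom labels from 'tot' is not required.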
How to plot and annotate grouped bars
- The easiest solution is to use pandas. This puts the data in an object which easily facilitates further analysis, and the plot API properly manages the spacing of grouped bars.
  - This implementation uses only 6 lines of code, compared to 18 lines.
- Use pandas.DataFrame.plot, which uses matplotlib as the default plotting backend. Columns are plotted as the bar groups and the index is the independent axis.
- From matplotlib 3.4.2, .bar_label should be used for annotations on bars.
- See How to add value labels on a bar chart for additional information and examples about using .bar_label, and How to plot and annotate a grouped bar chart for an additional example of grouped bars.
- Tested in python 3.9.7, pandas 1.3.4, matplotlib 3.4.3
import pandas as pd
import matplotlib.pyplot as plt
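# sample values, matching the dataframe displayed below
labels = ['Account_1', 'Account_2', 'Account_3', 'Account_4']
oct_data = [10, 24, 25, 30]
nov_data = [12, 42, 21, 78]
# create a dict with the data
data = {'October': oct_data, 'November': nov_data}
# create the dataframe with the labels as the index
df = pd.DataFrame(data, index=labels)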
# display(df)
October November
Account_1 10 12
Account_2 24 42
Account_3 25 21
Account_4 30 78
# plot the dataframe
ax = df.plot(kind='bar', figsize=(10, 6), rot=0, ylabel='Cost ($)', color=['#7f6d5f', '#557f2d'])
# iterate through each group of container (bar) objects
for c in ax.containers:
    # annotate the container group
    ax.bar_label(c, label_type='center')
plt.show()
How to plot and annotate grouped bars in seaborn / matplotlib
Data
- The data needs to be converted to a long format using .melt
- Because of the scale of values, 'log' is used for the yscale
- All of the categories in 'cats' are included for the example.
  - Select only the desired columns before melting, or use dfl = dfl[dfl.cats.isin(['sub', 'vc'])] to filter for the desired 'cats'.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# setup dataframe
data = {'vc': [76, 47, 140, 106, 246],
'tv': [29645400, 28770702, 50234486, 30704017, 272551386],
'sub': [66100, 15900, 44500, 37000, 76700],
'name': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)
vc tv sub name
0 76 29645400 66100 a
1 47 28770702 15900 b
2 140 50234486 44500 c
# convert to long form
dfl = (df.melt(id_vars='name', var_name='cats', value_name='values')
.sort_values('values', ascending=False).reset_index(drop=True))
name cats values
0 e tv 272551386
1 c tv 50234486
2 d tv 30704017
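The two ways of limiting the plotted categories mentioned in the notes above can be sketched as follows. The variable names dfl_a and dfl_b are introduced only for this illustration; the data is the same dictionary used in this answer.

```python
import pandas as pd

data = {'vc': [76, 47, 140, 106, 246],
        'tv': [29645400, 28770702, 50234486, 30704017, 272551386],
        'sub': [66100, 15900, 44500, 37000, 76700],
        'name': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)

# option 1: select only the desired columns before melting
dfl_a = df[['name', 'vc', 'sub']].melt(id_vars='name', var_name='cats', value_name='values')

# option 2: melt everything, then filter the long dataframe on 'cats'
dfl = df.melt(id_vars='name', var_name='cats', value_name='values')
dfl_b = dfl[dfl.cats.isin(['sub', 'vc'])]
```

Either result can be passed to sns.barplot in place of dfl; with 'tv' removed, the log scale may no longer be needed.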
Updated as of matplotlib v3.4.2
- Use matplotlib.pyplot.bar_label
  - .bar_label works for matplotlib, seaborn, and pandas plots.
- See How to add value labels on a bar chart for additional details and examples with .bar_label.
- Tested with seaborn v0.11.1, which is using matplotlib as the plot engine.
# plot
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(x='name', y='values', data=dfl, hue='cats', ax=ax)
ax.set_xticklabels(ax.get_xticklabels(), rotation=0)
ax.set_yscale('log')
for c in ax.containers:
    # set the bar label
    ax.bar_label(c, fmt='%.0f', label_type='edge', padding=1)
# pad the spacing between the number and the edge of the figure
ax.margins(y=0.1)
Plot with seaborn v0.11.1
- Using matplotlib before version 3.4.2
- Note that using .annotate and .patches is much more verbose than with .bar_label.
# plot
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(x='name', y='values', data=dfl, hue='cats', ax=ax)
ax.set_xticklabels(ax.get_xticklabels(), rotation=0)
ax.set_yscale('log')
# annotate each bar with its height, offset 7 points above the top edge
for p in ax.patches:
    ax.annotate(f"{p.get_height():.0f}", (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', xytext=(0, 7), textcoords='offset points')
How to plot and annotate a grouped bar chart
Imports and DataFrame
import pandas as pd
import matplotlib.pyplot as plt
# given the following code to create the dataframe
file="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DV0101EN/labs/coursera/Topic_Survey_Assignment.csv"
df=pd.read_csv(file, index_col=0)
df.sort_values(by=['Very interested'], axis=0, ascending=False, inplace=True)
# all columns are being divided by 2233, so those lines can be replaced with the following single line
df = df.div(2233)
# display(df)
Very interested Somewhat interested Not interested
Data Analysis / Statistics 0.755934 0.198836 0.026870
Machine Learning 0.729512 0.213614 0.033139
Data Visualization 0.600090 0.328706 0.045678
Big Data (Spark / Hadoop) 0.596507 0.326467 0.056874
Deep Learning 0.565607 0.344828 0.060905
Data Journalism 0.192118 0.484102 0.273175
Using matplotlib v3.4.2 and later
- Uses matplotlib.pyplot.bar_label and pandas.DataFrame.plot
- Some formatting can be done with the fmt parameter, but more sophisticated formatting should be done with the labels parameter, as shown in How to add multiple annotations to a barplot.
- See How to add value labels on a bar chart for additional details and examples using .bar_label
# your colors
colors = ['#5cb85c', '#5bc0de', '#d9534f']
# plot with annotations is probably easier
p1 = df.plot(kind='bar', color=colors, figsize=(20, 8), rot=0, ylabel='Percentage', title="The percentage of the respondents' interest in the different data science Area")
for p in p1.containers:
    p1.bar_label(p, fmt='%.2f', label_type='edge')
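Because the values here are fractions while the axis is labelled as a percentage, this is a case where the labels parameter may be more convenient than fmt. The sketch below is an illustration, not part of the original answer: it uses only the first two rows of the dataframe displayed above, typed in by hand, and the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# first two rows of the dataframe displayed above (values are fractions of 2233)
df = pd.DataFrame({'Very interested': [0.755934, 0.729512],
                   'Somewhat interested': [0.198836, 0.213614],
                   'Not interested': [0.026870, 0.033139]},
                  index=['Data Analysis / Statistics', 'Machine Learning'])

colors = ['#5cb85c', '#5bc0de', '#d9534f']
ax = df.plot(kind='bar', color=colors, figsize=(10, 5), rot=0, ylabel='Percentage')

shown = []
for c in ax.containers:
    # build the label text from the bar values, formatted as percentages
    labels = [f'{v:.1%}' for v in c.datavalues]
    ax.bar_label(c, labels=labels, label_type='edge')
    shown.append(labels)
```

The '.1%' format spec multiplies by 100 and appends the percent sign, so the underlying data does not need to be rescaled.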
Using matplotlib before v3.4.2
- w = 0.8 / 3 will resolve the issue, given the current code.
- However, generating the plot can be accomplished more easily with pandas.DataFrame.plot
# your colors
colors = ['#5cb85c', '#5bc0de', '#d9534f']
# plot with annotations is probably easier
p1 = df.plot.bar(color=colors, figsize=(20, 8), ylabel='Percentage', title="The percentage of the respondents' interest in the different data science Area")
p1.set_xticklabels(p1.get_xticklabels(), rotation=0)
for p in p1.patches:
    p1.annotate(f'{p.get_height():0.2f}', (p.get_x() + p.get_width() / 2., p.get_height()), ha = 'center', va = 'center', xytext = (0, 10), textcoords = 'offset points')
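The question's original plotting code is not reproduced here, so the role of w = 0.8 / 3 can only be illustrated under an assumption: that three series were drawn per category with manually offset ax.bar calls. In that setup, dividing a total group width of 0.8 by the number of series keeps the bars of one group adjacent without spilling into the next group. The values below are rounded from the table above and are for illustration only.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# illustrative values: rows are categories, columns are the three series
values = np.array([[0.76, 0.20, 0.03],
                   [0.73, 0.21, 0.03],
                   [0.60, 0.33, 0.05]])
n_groups, n_series = values.shape

w = 0.8 / n_series       # each group occupies 0.8 of the unit spacing between ticks
x = np.arange(n_groups)  # tick position of each group

fig, ax = plt.subplots()
for j in range(n_series):
    # shift each series so that the group as a whole is centered on its tick
    offset = (j - (n_series - 1) / 2) * w
    ax.bar(x + offset, values[:, j], width=w)

# left and right extent of the first group, expected to be -0.4 and 0.4
first_group = [p for p in ax.patches if p.get_x() < 0.5]
group_left = min(p.get_x() for p in first_group)
group_right = max(p.get_x() + p.get_width() for p in first_group)
```

pandas.DataFrame.plot performs this bookkeeping internally, which is why the answer recommends it over manual offsets.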