Python: Plotting percentage in seaborn bar plot
You could pass your own function as the estimator argument of sns.barplot. From the docs:
estimator : callable that maps vector -> scalar, optional
Statistical function to estimate within each categorical bin.
For your case you could define the function as a lambda:
sns.barplot(x='group', y='Values', data=df, estimator=lambda x: sum(x==0)*100.0/len(x))
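Because the estimator is just a callable that maps a vector to a scalar, it can be checked on its own before being handed to seaborn. A minimal sketch, assuming the goal is the percentage of zeros per group; the name percent_zero and the sample values are made up for illustration:

```python
def percent_zero(values):
    """Map a vector to a scalar: the percentage of entries equal to 0."""
    values = list(values)
    return 100.0 * sum(v == 0 for v in values) / len(values)


# Hypothetical sample: three of the four values are zero.
print(percent_zero([0, 1, 0, 0]))  # 75.0

# Assumed usage with the DataFrame from the question:
# sns.barplot(x='group', y='Values', data=df, estimator=percent_zero)
```

A named function behaves the same as the lambda but is easier to test and reuse.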
How to add percentages on top of bars in seaborn
The seaborn.catplot organizing function returns a FacetGrid, which gives you access to the fig, the ax, and its patches. If you add the labels when nothing else has been plotted, you know which bar-patches came from which variables. From @LordZsolt's answer I picked up the order argument to catplot: I like making that explicit because now we aren't relying on the barplot function using the order we think of as default.
import seaborn as sns
from itertools import product

titanic = sns.load_dataset("titanic")

class_order = ['First', 'Second', 'Third']
hue_order = ['child', 'man', 'woman']
# seaborn draws all bars of the first hue level, then the second, etc.,
# so the patches come hue-first, class-second
bar_order = product(hue_order, class_order)

catp = sns.catplot(data=titanic, kind='count',
                   x='class', hue='who',
                   order=class_order,
                   hue_order=hue_order)

# As long as we haven't plotted anything else into this axis,
# we know the rectangles in it are our barplot bars
# and we know the order, so we can match up graphic and calculations:
spots = zip(catp.ax.patches, bar_order)
for patch, (who, cls) in spots:
    class_total = len(titanic[titanic['class'] == cls])
    class_who_total = len(titanic[(titanic['class'] == cls) &
                                  (titanic['who'] == who)])
    height = patch.get_height()
    catp.ax.text(patch.get_x(), height + 3,
                 '{:1.2f}'.format(class_who_total / class_total))
    # checking the patch order, not for final:
    # catp.ax.text(patch.get_x(), -3, cls[0] + who[0])
This produces a count plot in which each bar is labelled with its fraction of the class total.
An alternate approach is to do the sub-summing explicitly, e.g. with the excellent pandas, and plot with matplotlib, and also do the styling yourself. (Though you can get quite a lot of styling from the sns context even when using matplotlib plotting functions. Try it out.)
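The alternate approach can be sketched as follows, assuming pandas.crosstab for the sub-summing and the pandas plotting wrapper around matplotlib for the drawing. The small DataFrame and the plot_fractions helper are made up for illustration; they only mimic the titanic columns used above.

```python
import pandas as pd

# Hypothetical stand-in for the titanic columns used above.
df = pd.DataFrame({
    'class': ['First', 'First', 'First', 'First', 'Third', 'Third'],
    'who':   ['man', 'woman', 'woman', 'child', 'man', 'man'],
})

# Sub-sum explicitly: share of each 'who' within its class (rows sum to 1).
fractions = pd.crosstab(df['class'], df['who'], normalize='index')
print(fractions)


def plot_fractions(fractions):
    """Grouped bar chart; pandas hands the drawing to matplotlib."""
    ax = fractions.plot(kind='bar', rot=0)
    ax.set_ylabel('fraction of class')
    return ax
```

Calling plot_fractions(fractions) returns the matplotlib Axes, so labels can then be added with ax.text or ax.annotate in the same way as in the loop above.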
Add labels as percentages instead of counts on a grouped bar graph in seaborn
Use groupby.transform to compute the split percentages per day:
df['Customers (%)'] = df.groupby('Day')['Customers'].transform(lambda x: x / x.sum() * 100)
# Day Customers Time Customers (%)
# 0 Mon 44 M 57.142857
# 1 Tue 46 M 50.000000
# 2 Wed 49 M 49.494949
# 3 Thur 59 M 54.629630
# 4 Fri 54 M 47.368421
# 5 Mon 33 E 42.857143
# 6 Tue 46 E 50.000000
# 7 Wed 50 E 50.505051
# 8 Thur 49 E 45.370370
# 9 Fri 60 E 52.631579
Then plot this new Customers (%) column and label the bars using ax.bar_label (with percentage formatting via the fmt param):
ax = sns.barplot(x='Day', y='Customers (%)', hue='Time', data=df)
for container in ax.containers:
    ax.bar_label(container, fmt='%.0f%%')
Note that ax.bar_label requires matplotlib 3.4.0 or later.
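On older matplotlib versions, a similar result can be sketched by annotating the bar patches by hand. This is only a fallback sketch: the helper name label_bars_percent is made up, and it assumes ax.patches holds nothing but the bars.

```python
def label_bars_percent(ax, fmt='%.0f%%'):
    """Write each bar's height above it, formatted as a percentage."""
    for patch in ax.patches:
        height = patch.get_height()
        # Anchor the text at the horizontal centre of the bar, on its top edge.
        ax.annotate(fmt % height,
                    (patch.get_x() + patch.get_width() / 2, height),
                    ha='center', va='bottom')


# Assumed usage with the plot above:
# ax = sns.barplot(x='Day', y='Customers (%)', hue='Time', data=df)
# label_bars_percent(ax)
```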
How can I plot a Seaborn Percentage Bar graph using a dictionary?
A bar plot with the dictionary keys on the x-axis and the dictionary values divided by the total as bar heights. Optionally, a PercentFormatter can be set as the display format. Note that the values need to be converted from string to numeric so they can be used as bar heights.
Also note that using dict as a variable name can complicate future code, because it shadows the built-in dict type, which can then no longer be called in that scope.
from matplotlib import pyplot as plt
from matplotlib.ticker import PercentFormatter
fruit_dict = {'apple': '12', 'orange': '9', 'banana': '9', 'kiwi': '3'}
for f in fruit_dict:
    fruit_dict[f] = int(fruit_dict[f])
total = sum(fruit_dict.values())
plt.bar(fruit_dict.keys(), [v/total for v in fruit_dict.values()], color='salmon')
plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax=1, decimals=0))
plt.grid(axis='y')
plt.show()
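The warning about naming a variable dict can be seen in a few lines. In this sketch the shadowing is confined to a function (the name shadowing_demo is made up) so that the built-in stays usable everywhere else:

```python
def shadowing_demo():
    dict = {'apple': '12'}   # shadows the built-in dict inside this function
    try:
        dict(orange='9')     # now calls the dictionary instance, not the type
    except TypeError as exc:
        return str(exc)


print(shadowing_demo())
```

The call fails with a TypeError because the name dict now refers to a dictionary object, which is why a name such as fruit_dict is the safer choice.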
How to calculate percentage for row counts in groupby in Python and generate bar plot
I believe this is what you are looking for:
import matplotlib.pyplot as plt
# count rows per status, largest group first
counts = df.groupby('Status').size().sort_values(ascending=False)
temp_df = counts / counts.sum() * 100
ax = temp_df.plot(kind='bar')
ax.bar_label(ax.containers[0])
plt.show()
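The same percentages can also be obtained with value_counts(normalize=True), which counts and sorts in one step. A sketch on made-up data: only the Status column name comes from the question, while the values and the plot_percentages helper are assumptions.

```python
import pandas as pd

# Hypothetical data; only the 'Status' column name comes from the question.
df = pd.DataFrame({'Status': ['open', 'closed', 'open', 'open',
                              'closed', 'pending', 'open', 'closed']})

# Relative frequency per status, sorted descending, scaled to percent.
pct = df['Status'].value_counts(normalize=True) * 100
print(pct)


def plot_percentages(pct):
    """Bar chart with percentage labels (bar_label needs matplotlib >= 3.4)."""
    ax = pct.plot(kind='bar')
    ax.bar_label(ax.containers[0], fmt='%.1f%%')
    return ax
```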
How to add percentages on countplot in seaborn
First, note that in matplotlib and seaborn, a subplot is called an "ax". Giving such a subplot a name such as "p3" or "plot" leads to unnecessary confusion when studying the documentation and online example code.
The bars in the seaborn bar plot are organized by hue: first all the bars belonging to the first hue value, then the second, and so on. So, in the given example, first come all the blue, then all the orange and finally all the green bars. This makes looping through ax.patches somewhat complicated. Luckily, the same patches are also available via ax.containers, where each hue group forms a separate container of bars.
Here is some example code:
import seaborn as sns
import matplotlib.pyplot as plt

def percentage_above_bar_relative_to_xgroup(ax):
    # one list of bar heights per hue group
    all_heights = [[p.get_height() for p in bars] for bars in ax.containers]
    for bars in ax.containers:
        for i, p in enumerate(bars):
            # total height of all bars at the same x position
            total = sum(xgroup[i] for xgroup in all_heights)
            percentage = f'{(100 * p.get_height() / total) :.1f}%'
            ax.annotate(percentage, (p.get_x() + p.get_width() / 2, p.get_height()), size=11, ha='center', va='bottom')

df = sns.load_dataset("titanic")
plt.figure(figsize=(12, 8))
ax3 = sns.countplot(x="class", hue="who", data=df)
ax3.set(xlabel='Class', ylabel='Count')
percentage_above_bar_relative_to_xgroup(ax3)
plt.show()
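To see what the helper computes without drawing anything, the same arithmetic can be run on plain lists. A sketch with made-up bar heights, laid out like ax.containers (one inner list per hue level, one entry per x position); the function name is hypothetical:

```python
def percentages_by_xgroup(all_heights):
    """Percentage of each bar relative to the total of its x position."""
    # zip(*...) regroups the hue-major heights by x position.
    totals = [sum(column) for column in zip(*all_heights)]
    return [[100 * h / t for h, t in zip(row, totals)] for row in all_heights]


# Made-up heights: three hue levels, two x positions.
heights = [[10, 20],
           [30, 20],
           [60, 60]]
print(percentages_by_xgroup(heights))
# [[10.0, 20.0], [30.0, 20.0], [60.0, 60.0]]
```

Within each x position the percentages add up to 100, which is the "relative to x-group" behaviour of the annotation function.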
Graphing percentage data in seaborn
I suggest you organize your DataFrame more like this; it will make it much easier to plot and organize this type of data.
Rather than keeping your DataFrame as you have it, transpose it into two simple columns like so:
name value
debt_consolidation 0.152388
credit_card 0.115689
all_other 0.170111
and so on. With this layout you can plot your data in seaborn directly:
sns.barplot(x="name",y="value", data = df)
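If the data currently sits in a single wide row (an assumption, since the original DataFrame is not shown), melt is one way to get the two-column layout. The three values are the ones from the table above:

```python
import pandas as pd

# Assumed original shape: one row, one column per loan purpose.
wide = pd.DataFrame([{'debt_consolidation': 0.152388,
                      'credit_card': 0.115689,
                      'all_other': 0.170111}])

# Reshape to long form: one row per purpose, columns 'name' and 'value'.
df = wide.melt(var_name='name', value_name='value')
print(df)
```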
How to plot groupby as percentage in seaborn?
You can use barplot here. I wasn't 100% sure what you actually want to achieve, so I developed several solutions.
Frequency of successful (unsuccessful) per total successful (unsuccessful)
import matplotlib.pyplot as plt
import seaborn as sns
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
mainDf['frequency'] = 0 # a dummy column to refer to
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_group = counts.div(counts.groupby('successful').transform('sum')).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_group, ax=ax)
Frequency of successful (unsuccessful) per group
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
mainDf['frequency'] = 0 # a dummy column to refer to
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_group = counts.div(counts.groupby(col).transform('sum')).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_group, ax=ax)
Frequency of successful (unsuccessful) per total
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
mainDf['frequency'] = 0 # a dummy column to refer to
total = len(mainDf)
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_total = counts.div(total).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_total, ax=ax)
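The three normalizations differ only in what each count is divided by. A compact way to compare them, sketched here with pandas.crosstab rather than the groupby/div approach above, on made-up data (only the column names weekday and successful follow the question):

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
df = pd.DataFrame({
    'weekday':    ['Mon'] * 6 + ['Tue'] * 2,
    'successful': ['yes', 'yes', 'yes', 'no', 'no', 'no', 'yes', 'no'],
})

# Share within each 'successful' value (each column sums to 1).
per_outcome = pd.crosstab(df['weekday'], df['successful'], normalize='columns')
# Share within each weekday (each row sums to 1).
per_group = pd.crosstab(df['weekday'], df['successful'], normalize='index')
# Share of the whole table (all cells sum to 1).
per_total = pd.crosstab(df['weekday'], df['successful'], normalize='all')

# Long form, the layout sns.barplot(x=..., y='frequency', hue='successful') takes.
long_form = per_group.reset_index().melt(
    id_vars='weekday', var_name='successful', value_name='frequency')
print(long_form)
```

For the Mon/yes cell the three tables give 0.75, 0.5 and 0.375 respectively, which makes it easy to check which variant a plot is actually showing.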
plot percent bar in seaborn from dataframe
Not my most elegant code, but this works:
import pandas as pd
stacked = pd.DataFrame({'Scenario': list('ABCD'), 'Male': [79, 59, 420, 208], 'Female': [217, 408, 330, 1330]})
pd.DataFrame(stacked.apply(lambda x: {'Scenario': x.Scenario, 'Male_pct': x.Male / (x.Male + x.Female), 'Female_pct': x.Female / (x.Male + x.Female)}, axis=1,).tolist())
Then just plot the result the same way you were already plotting.
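A vectorized variant is also possible. This is a sketch, not the answerer's method: it divides the two count columns by their row totals with DataFrame.div instead of building dictionaries row by row, and the result variable name is made up.

```python
import pandas as pd

stacked = pd.DataFrame({'Scenario': list('ABCD'),
                        'Male': [79, 59, 420, 208],
                        'Female': [217, 408, 330, 1330]})

# Divide each count by its row total; axis=0 aligns the totals with the rows.
counts = stacked[['Male', 'Female']]
pct = counts.div(counts.sum(axis=1), axis=0).add_suffix('_pct')

# Put the scenario labels back next to the fractions.
result = pd.concat([stacked[['Scenario']], pct], axis=1)
print(result)
```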