Python: Plotting Percentage in Seaborn Bar Plot

Python: Plotting percentage in seaborn bar plot

You can pass your own function as the estimator argument of sns.barplot. From the docs:

estimator : callable that maps vector -> scalar, optional

Statistical function to estimate within each categorical bin.

For your case, you could define the function as a lambda:

sns.barplot(x='group', y='Values', data=df, estimator=lambda x: sum(x==0)*100.0/len(x))
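
To see which numbers such an estimator puts on the bars, here is a minimal sketch. The column names group and Values follow the call above; the data and the helper name pct_zero are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical data: two groups, each containing some zeros in 'Values'
df = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "Values": [0, 0, 1, 2, 0, 3, 4, 5],
})

def pct_zero(x):
    """Percentage of entries in x that are equal to zero."""
    return (x == 0).sum() * 100.0 / len(x)

# The bar heights, computed directly with pandas for comparison
expected = df.groupby("group")["Values"].apply(pct_zero)
print(expected)

# The same function passed as the estimator
ax = sns.barplot(x="group", y="Values", data=df, estimator=pct_zero)
```

Here group a has two zeros out of four values and group b has one out of four, so the bars should sit at 50 and 25.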

How to add percentages on top of bars in seaborn

The seaborn.catplot organizing function returns a FacetGrid, which gives you access to the fig, the ax, and its patches. If you add the labels when nothing else has been plotted you know which bar-patches came from which variables. From @LordZsolt's answer I picked up the order argument to catplot: I like making that explicit because now we aren't relying on the barplot function using the order we think of as default.

import seaborn as sns
from itertools import product

titanic = sns.load_dataset("titanic")

class_order = ['First', 'Second', 'Third']
hue_order = ['child', 'man', 'woman']
# seaborn draws all bars of the first hue level, then all bars of the second,
# and so on, so the patches are ordered by hue first and by class second
bar_order = product(hue_order, class_order)

catp = sns.catplot(data=titanic, kind='count',
                   x='class', hue='who',
                   order=class_order,
                   hue_order=hue_order)

# As long as we haven't plotted anything else into this axis,
# we know the rectangles in it are our barplot bars
# and we know the order, so we can match up graphic and calculations:

for patch, (who, cls) in zip(catp.ax.patches, bar_order):
    class_total = len(titanic[titanic['class'] == cls])
    class_who_total = len(titanic[(titanic['class'] == cls) &
                                  (titanic['who'] == who)])
    height = patch.get_height()
    catp.ax.text(patch.get_x(), height + 3,
                 '{:1.2f}'.format(class_who_total / class_total))

    # checking the patch order, not for final:
    # catp.ax.text(patch.get_x(), -3, cls[0] + who[0])

This produces a bar plot of the three-by-three variable values, with the subset calculations as text labels.

An alternative approach is to do the sub-summing explicitly, e.g. with pandas, plot with matplotlib, and do the styling yourself. (You can still get quite a lot of styling from the seaborn context even when using matplotlib plotting functions.)
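
A minimal sketch of that alternative follows. It assumes a small made-up table that reuses the class and who column names from the titanic example; the rows themselves are invented, and pd.crosstab is just one way to do the sub-summing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-in for the two titanic columns used above
people = pd.DataFrame({
    "class": ["First", "First", "First", "First", "Second", "Second",
              "Third", "Third", "Third", "Third"],
    "who": ["man", "woman", "woman", "child", "man", "woman",
            "man", "man", "man", "child"],
})

# Share of each 'who' value within its class; every row sums to 1
shares = pd.crosstab(people["class"], people["who"], normalize="index")

# pandas plots through matplotlib and returns the Axes
ax = shares.plot(kind="bar", rot=0)
ax.set_ylabel("Share within class")

# Label every bar with its own height
for patch in ax.patches:
    ax.annotate(f"{patch.get_height():.2f}",
                (patch.get_x() + patch.get_width() / 2, patch.get_height()),
                ha="center", va="bottom")
```

Because the labels are read from the bar heights, no assumption about the patch order is needed here.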

Add labels as percentages instead of counts on a grouped bar graph in seaborn

Use groupby.transform to compute the split percentages per day:

df['Customers (%)'] = df.groupby('Day')['Customers'].transform(lambda x: x / x.sum() * 100)

#     Day  Customers Time  Customers (%)
# 0   Mon         44    M      57.142857
# 1   Tue         46    M      50.000000
# 2   Wed         49    M      49.494949
# 3  Thur         59    M      54.629630
# 4   Fri         54    M      47.368421
# 5   Mon         33    E      42.857143
# 6   Tue         46    E      50.000000
# 7   Wed         50    E      50.505051
# 8  Thur         49    E      45.370370
# 9   Fri         60    E      52.631579
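
For a self-contained check of the transform step, the sketch below rebuilds the DataFrame from the printed output above and confirms that each day's shares add up to 100. The variable day_totals is only there for the check.

```python
import pandas as pd

# Data reconstructed from the printed output above
df = pd.DataFrame({
    "Day": ["Mon", "Tue", "Wed", "Thur", "Fri"] * 2,
    "Customers": [44, 46, 49, 59, 54, 33, 46, 50, 49, 60],
    "Time": ["M"] * 5 + ["E"] * 5,
})

# Divide each row by the total of its own day
df["Customers (%)"] = (
    df.groupby("Day")["Customers"].transform(lambda x: x / x.sum() * 100)
)

# Morning and evening shares of the same day should add up to 100
day_totals = df.groupby("Day")["Customers (%)"].sum()
print(df)
print(day_totals)
```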

Then plot this new Customers (%) column and label the bars using ax.bar_label (with percentage formatting via the fmt param):

ax = sns.barplot(x='Day', y='Customers (%)', hue='Time', data=df)

for container in ax.containers:
    ax.bar_label(container, fmt='%.0f%%')

Note that ax.bar_label requires matplotlib 3.4.0 or later.

How can I plot a Seaborn Percentage Bar graph using a dictionary?

You can create a bar plot with the dictionary keys on the x-axis and the dictionary values divided by their total as bar heights. Optionally, a PercentFormatter can be set as the display format. Note that the values need to be converted from strings to numbers so that they can be used as bar heights.

Also note that using dict as a variable name can complicate later code, because it shadows the built-in dict type, which then can no longer be reached under that name.

from matplotlib import pyplot as plt
from matplotlib.ticker import PercentFormatter

fruit_dict = {'apple': '12', 'orange': '9', 'banana': '9', 'kiwi': '3'}
for f in fruit_dict:
    fruit_dict[f] = int(fruit_dict[f])
total = sum(fruit_dict.values())
plt.bar(fruit_dict.keys(), [v/total for v in fruit_dict.values()], color='salmon')
plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax=1, decimals=0))
plt.grid(axis='y')
plt.show()

How to calculate percentage for row counts in groupby in Python and generate bar plot

I believe this is what you are looking for:

import matplotlib.pyplot as plt

# Row counts per status, largest first, as percentages of the total
counts = df.groupby('Status').size().sort_values(ascending=False)
temp_df = counts / counts.sum() * 100

ax = temp_df.plot(kind='bar')

ax.bar_label(ax.containers[0])

plt.show()
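
A shorter route to the same percentages is value_counts with normalize=True. The sketch below compares the two on a made-up Status column; the data are invented, and the names pct and via_groupby are only for illustration.

```python
import pandas as pd

# Hypothetical data with a 'Status' column
df = pd.DataFrame({"Status": ["open"] * 5 + ["closed"] * 3 + ["pending"] * 2})

# value_counts sorts by count (descending) and, with normalize=True,
# returns fractions instead of counts
pct = df["Status"].value_counts(normalize=True) * 100

# The same numbers via the groupby route
counts = df.groupby("Status").size().sort_values(ascending=False)
via_groupby = counts / counts.sum() * 100

print(pct)
print(via_groupby)
```

Either series can then be plotted with .plot(kind='bar') in the same way.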

How to add percentages on countplot in seaborn

First, note that in matplotlib and seaborn, a subplot is called an "ax". Giving such a subplot a name such as "p3" or "plot" leads to unnecessary confusion when studying the documentation and online example code.

The bars in the seaborn bar plot are organized, starting with all the bars belonging to the first hue value, then the second, etc. So, in the given example, first come all the blue, then all the orange and finally all the green bars. This makes looping through ax.patches somewhat complicated. Luckily, the same patches are also available via ax.containers, where each hue group forms a separate container of bars.

Here is some example code:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

def percentage_above_bar_relative_to_xgroup(ax):
    all_heights = [[p.get_height() for p in bars] for bars in ax.containers]
    for bars in ax.containers:
        for i, p in enumerate(bars):
            total = sum(xgroup[i] for xgroup in all_heights)
            percentage = f'{(100 * p.get_height() / total) :.1f}%'
            ax.annotate(percentage, (p.get_x() + p.get_width() / 2, p.get_height()), size=11, ha='center', va='bottom')

df = sns.load_dataset("titanic")

plt.figure(figsize=(12, 8))
ax3 = sns.countplot(x="class", hue="who", data=df)
ax3.set(xlabel='Class', ylabel='Count')

percentage_above_bar_relative_to_xgroup(ax3)
plt.show()

This produces bar plots with percentages per x group.

Graphing percentage data in seaborn

I suggest reorganizing your DataFrame; it will make this type of data much easier to plot.

Instead of keeping the DataFrame as you have it, transpose it into two simple columns, like so:

name                   value
debt_consolidation  0.152388
credit_card         0.115689
all_other           0.170111

and so on. You can then plot the data in seaborn with:

sns.barplot(x="name",y="value", data = df)
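
If the data start out as a single wide row with one column per category (an assumption, since the original layout is not shown), DataFrame.melt is one way to get the two-column shape. The name wide is hypothetical; the three values are the ones listed above.

```python
import pandas as pd

# Assumed starting layout: one wide row, one column per category
wide = pd.DataFrame({
    "debt_consolidation": [0.152388],
    "credit_card": [0.115689],
    "all_other": [0.170111],
})

# melt moves the column labels into 'name' and the cell values into 'value'
df = wide.melt(var_name="name", value_name="value")
print(df)
```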

How to plot groupby as percentage in seaborn?

You can use barplot here. I wasn't 100% sure of what you actually want to achieve so I developed several solutions.

Frequency of successful (unsuccessful) per total successful (unsuccessful)

fig, axes = plt.subplots(2, 2, figsize=(15, 10))

mainDf['frequency'] = 0 # a dummy column to refer to
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_group = counts.div(counts.groupby('successful').transform('sum')).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_group, ax=ax)

Frequency of successful (unsuccessful) per group

fig, axes = plt.subplots(2, 2, figsize=(15, 10))

mainDf['frequency'] = 0 # a dummy column to refer to
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_group = counts.div(counts.groupby(col).transform('sum')).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_group, ax=ax)

Frequency of successful (unsuccessful) per total

fig, axes = plt.subplots(2, 2, figsize=(15, 10))

mainDf['frequency'] = 0 # a dummy column to refer to
total = len(mainDf)
for col, ax in zip(['page_name', 'weekday', 'type', 'industry'], axes.flatten()):
    counts = mainDf.groupby([col, 'successful']).count()
    freq_per_total = counts.div(total).reset_index()
    sns.barplot(x=col, y='frequency', hue='successful', data=freq_per_total, ax=ax)
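
To compare the three normalizations side by side on one column, pd.crosstab with its normalize argument is a compact alternative to the dummy-column approach. This is a sketch on invented data: main_df is a hypothetical stand-in for mainDf, and coding successful as 0/1 is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for mainDf: one grouping column and a 0/1 outcome
main_df = pd.DataFrame({
    "weekday": ["Mon"] * 3 + ["Tue"] * 5,
    "successful": [1, 1, 0, 1, 1, 0, 0, 0],
})

rows, cols = main_df["weekday"], main_df["successful"]

# Share of each outcome's total (each column sums to 1)
per_outcome = pd.crosstab(rows, cols, normalize="columns")
# Share within each weekday (each row sums to 1)
per_group = pd.crosstab(rows, cols, normalize="index")
# Share of all rows (the whole table sums to 1)
per_total = pd.crosstab(rows, cols, normalize="all")

# Long form with a 'frequency' column, the shape sns.barplot expects
long_form = per_group.stack().rename("frequency").reset_index()
print(long_form)
```

The long_form table can be passed to sns.barplot with x='weekday', y='frequency' and hue='successful'.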

Plot percent bar in seaborn from dataframe

Not my most elegant code, but this works:

import pandas as pd


stacked = pd.DataFrame({'Scenario': list('ABCD'), 'Male': [79, 59, 420, 208], 'Female': [217, 408, 330, 1330]})

pd.DataFrame(stacked.apply(lambda x: {'Scenario': x.Scenario, 'Male_pct': x.Male / (x.Male + x.Female), 'Female_pct': x.Female / (x.Male + x.Female)}, axis=1,).tolist())

Then just plot it the way you were already plotting.
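
The same percentages can also be computed without a row-wise apply, by dividing the two count columns by their row totals. This is a sketch of an alternative, not the approach above; the names totals, pct and result are introduced here for illustration.

```python
import pandas as pd

stacked = pd.DataFrame({
    "Scenario": list("ABCD"),
    "Male": [79, 59, 420, 208],
    "Female": [217, 408, 330, 1330],
})

# Row totals across the two count columns
totals = stacked[["Male", "Female"]].sum(axis=1)

# Divide each count column by the row total and rename to *_pct
pct = stacked[["Male", "Female"]].div(totals, axis=0).add_suffix("_pct")

result = pd.concat([stacked[["Scenario"]], pct], axis=1)
print(result)
```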


