Plot Different Color for Different Categorical Levels Using Matplotlib

Imports and Sample DataFrame

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns # for sample data
from matplotlib.lines import Line2D # for legend handle

# DataFrame used for all options
df = sns.load_dataset('diamonds')

   carat      cut color clarity  depth  table  price     x     y     z
0   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
1   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
2   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31

With matplotlib

You can pass plt.scatter a c argument, which allows you to select the colors. The following code defines a colors dictionary to map the diamond colors to the plotting colors.

fig, ax = plt.subplots(figsize=(6, 6))

colors = {'D':'tab:blue', 'E':'tab:orange', 'F':'tab:green', 'G':'tab:red', 'H':'tab:purple', 'I':'tab:brown', 'J':'tab:pink'}

ax.scatter(df['carat'], df['price'], c=df['color'].map(colors))

# add a legend
handles = [Line2D([0], [0], marker='o', color='w', markerfacecolor=v, label=k, markersize=8) for k, v in colors.items()]
ax.legend(title='color', handles=handles, bbox_to_anchor=(1.05, 1), loc='upper left')

plt.show()

df['color'].map(colors) replaces each diamond color grade (D to J) with the plotting color assigned to it in the dictionary, so every point is drawn in the color of its category.
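If typing the dictionary by hand is tedious, it can be derived from the levels that occur in the column. The following is only a minimal sketch: the demo DataFrame holds made-up rows standing in for the diamonds data, colors_auto is a hypothetical name, and the tab10 colormap is an arbitrary choice that offers ten distinct colors.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import to_hex

# Made-up rows standing in for the diamonds data
demo = pd.DataFrame({
    "carat": [0.23, 0.31, 0.90, 1.20],
    "price": [326, 335, 2800, 5400],
    "color": ["E", "J", "E", "D"],
})

# One plotting color per level, taken in order from the tab10 colormap
levels = sorted(demo["color"].unique())
cmap = plt.get_cmap("tab10")
colors_auto = {level: to_hex(cmap(i)) for i, level in enumerate(levels)}

# Same mapping step as above: category label -> plotting color
point_colors = demo["color"].map(colors_auto)

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(demo["carat"], demo["price"], c=point_colors)
```

Sorting the levels keeps the assignment stable between runs. tab10 has only ten entries, so a column with more levels would need a larger colormap.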

With seaborn

You can use seaborn, which is a wrapper around matplotlib that applies nicer-looking defaults (a matter of taste, I know) and also adds some plotting functions of its own.

For this you could use seaborn.lmplot with fit_reg=False, which stops it from fitting and drawing a regression line.

  • sns.scatterplot(x='carat', y='price', data=df, hue='color', ec=None) produces the same kind of plot.

Selecting hue='color' tells seaborn to split and plot the data based on the unique values in the 'color' column.

sns.lmplot(x='carat', y='price', data=df, hue='color', fit_reg=False)

With pandas.DataFrame.groupby & pandas.DataFrame.plot

If you don't want to use seaborn, you can use pandas.DataFrame.groupby to split the data by color and then plot each group with matplotlib. You have to assign the colors manually as you go; here is an example:

fig, ax = plt.subplots(figsize=(6, 6))

grouped = df.groupby('color')
for key, group in grouped:
    group.plot(ax=ax, kind='scatter', x='carat', y='price', label=key, color=colors[key])
plt.show()

This code assumes the same DataFrame as above, and then groups it based on color. It then iterates over these groups, plotting for each one. To select a color, I've created a colors dictionary, which can map the diamond color (for instance D) to a real color (for instance tab:blue).
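When the exact colors do not matter, the dictionary can be skipped. As a rough sketch on made-up rows (demo is a hypothetical stand-in for the diamonds DataFrame), calling ax.scatter once per group without a color argument lets matplotlib take the next color from its default cycle each time:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the diamonds data
demo = pd.DataFrame({
    "carat": [0.2, 0.3, 0.5, 0.7, 1.0, 1.1],
    "price": [350, 400, 1500, 2600, 5000, 5200],
    "color": ["D", "E", "J", "D", "E", "J"],
})

fig, ax = plt.subplots(figsize=(6, 6))
for key, group in demo.groupby("color"):
    # No color given: each scatter call uses the next color in the cycle
    ax.scatter(group["carat"], group["price"], label=key)

ax.set_xlabel("carat")
ax.set_ylabel("price")
ax.legend(title="color")
```

The default cycle has ten colors, so with more than ten levels the colors start to repeat and an explicit dictionary becomes the safer choice.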

Matplotlib scatter color by categorical factors

As your traceback tells you, you can't pass a string to the color parameter. You can pass either colors, or an array of values that it will interpret as colors itself.

See:
http://matplotlib.org/api/pyplot_api.html?highlight=plot#matplotlib.pyplot.plot

There is probably a more elegant way, but one implementation would be the following (I used the following dataset: https://raw.githubusercontent.com/pydata/pandas/master/pandas/tests/data/iris.csv):

import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cmx
from pandas import read_csv

df = read_csv('iris.csv')

# Scatter of petal length (x) against petal width (y)
x = df['PetalLength']
y = df['PetalWidth']

# Get unique names of species (sorted so the color order is reproducible)
uniq = sorted(df['Name'].unique())

# Set the color map to match the number of species
hot = plt.get_cmap('hot')
cNorm = colors.Normalize(vmin=0, vmax=len(uniq))
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=hot)

# Plot each species with its own color and legend label
for i in range(len(uniq)):
    indx = df['Name'] == uniq[i]
    plt.scatter(x[indx], y[indx], s=15, color=scalarMap.to_rgba(i), label=uniq[i])

plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.title('Petal Width vs Length')
plt.legend(loc='upper left')
plt.show()
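Because scatter also accepts an array of numbers that it maps to colors, another option is to convert the category to integer codes and draw everything in one call. This is only a sketch: the rows are made up rather than read from iris.csv, viridis is an arbitrary colormap, and it assumes a matplotlib version that provides legend_elements on the object returned by scatter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows with the same column names as iris.csv
df_demo = pd.DataFrame({
    "PetalLength": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
    "PetalWidth": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
             "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
})

# factorize returns one integer code per row plus the unique labels
codes, names = pd.factorize(df_demo["Name"])

fig, ax = plt.subplots()
sc = ax.scatter(df_demo["PetalLength"], df_demo["PetalWidth"],
                c=codes, cmap="viridis", s=15)

# One legend handle per distinct code, relabelled with the species names
handles, _ = sc.legend_elements()
ax.legend(handles, list(names), title="Name", loc="upper left")
ax.set_xlabel("Petal Length")
ax.set_ylabel("Petal Width")
```

The codes follow the order in which the labels first appear, so the legend handles and the names line up by position.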

get different colors for different categorical levels in a single line chart python Plotly

Perhaps you are looking for the color parameter?

See below:

import pandas as pd
import plotly.express as px

# Read the sample data from the SO question (copied to the clipboard)
df = pd.read_clipboard()

fig = px.line(df, x='Day-Shift', y='Production', color='Brand')
fig.show()

Seaborn set color for unique categorical over several pair-plots

For this use case, seaborn allows a dictionary as palette. The dictionary will assign a color to each hue value.

Here is an example of how such a dictionary could be created for your data:

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'tsne_one': np.random.randn(10),
                    'tsne-two': np.random.randn(10),
                    'mode': np.random.choice(['foot', 'metro', 'bike'], 10)})
df2 = pd.DataFrame({'tsne_one': np.random.randn(10),
                    'tsne-two': np.random.randn(10),
                    'mode': np.random.choice(['car', 'metro', 'bike'], 10)})
df3 = pd.DataFrame({'tsne_one': np.random.randn(10),
                    'tsne-two': np.random.randn(10),
                    'mode': np.random.choice(['foot', 'bus', 'metro', 'bike'], 10)})
df4 = pd.DataFrame({'tsne_one': np.random.randn(10),
                    'tsne-two': np.random.randn(10),
                    'mode': np.random.choice(['car', 'bus', 'metro', 'bike'], 10)})
modes = pd.concat([df['mode'] for df in (df1, df2, df3, df4)], ignore_index=True).unique()
colors = sns.color_palette('hls', len(modes))
palette = {mode: color for mode, color in zip(modes, colors)}

fig, axs = plt.subplots(2, 2, figsize=(12,6))
for df, ax in zip((df1, df2, df3, df4), axs.flatten()):
    sns.scatterplot(x='tsne_one', y='tsne-two', hue='mode', data=df, palette=palette, legend='full', alpha=0.7, ax=ax)

plt.tight_layout()
plt.show()

How to use categorical colour scheme in matplotlib with missing values?

I haven't found a solution with seaborn, but you can tweak the lambda function to make it work in matplotlib. If the species is in your colour dictionary, the colour is looked up there; if the species is null, the function returns a separate fallback colour (grey here).

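import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

iris = sns.load_dataset('iris')

# colour dictionary
colours = {'setosa':'skyblue', 'versicolor':'orangered', 'virginica':'forestgreen'}

# look the colour up in the dictionary, falling back to grey for missing species
col_convert = np.vectorize(lambda x: 'grey' if pd.isnull(x) else colours[x])

plt.scatter(iris.sepal_length, iris.sepal_width, c=col_convert(iris.species))
plt.show()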
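A pandas-only variant of the same lookup, given here as a sketch on a made-up Series (species and point_colours are hypothetical names), maps the labels through the dictionary and then fills whatever could not be mapped:

```python
import numpy as np
import pandas as pd

colours = {'setosa': 'skyblue', 'versicolor': 'orangered', 'virginica': 'forestgreen'}

# Made-up labels, two of them missing
species = pd.Series(['setosa', np.nan, 'virginica', 'versicolor', None])

# map gives NaN for anything not in the dictionary; fillna supplies the fallback
point_colours = species.map(colours).fillna('grey')
print(point_colours.tolist())
```

The result can be passed to plt.scatter as c. One difference from the lambda: a species that is present but not listed in the dictionary also turns grey here, whereas colours[x] would raise a KeyError.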

Different color for single line plot in plotly based on category (Green & red)

To color a single line by value, split it into segments, one for each pair of consecutive points, and choose each segment's color by comparing its two endpoints. The code iterates over the DataFrame rows, compares each row with the next one, and adds a two-point trace that is green when the value rises and red otherwise. Finally, the traces are updated so that duplicate legend entries are hidden.

import plotly.graph_objects as go
from plotly.subplots import make_subplots

# ty is the DataFrame from the question, with columns 'tag', 'num1' and 'num2'
# and a default integer index (0, 1, 2, ...)
fig = make_subplots(specs=[[{"secondary_y": True}]])
x = ty['tag']
y1 = ty['num1']

# first series as a single blue line
fig.add_trace(go.Scatter(x=x, y=y1,
                         mode='lines',
                         marker_color='blue',
                         name='macd'), secondary_y=False)

# second series: one two-point trace per segment, green if rising, red otherwise
for i, row in ty.iterrows():
    if i <= len(ty)-2:
        if row['num2'] < ty.loc[i+1,'num2']:
            colors = 'green'
        else:
            colors = 'red'
        fig.add_trace(go.Scatter(x=[row['tag'], ty.loc[i+1,'tag']],
                                 y=[row['num2'], ty.loc[i+1,'num2']],
                                 mode='lines',
                                 marker_color=colors,
                                 name='signal',
                                 ), secondary_y=False)

# show only the first legend entry for each trace name
names = set()
fig.for_each_trace(
    lambda trace:
        trace.update(showlegend=False)
        if (trace.name in names) else names.add(trace.name))
fig.show()
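The comparison between each row and the next can also be worked out up front with pandas, which makes the green/red rule easy to inspect before any traces are added. This is only a sketch of that step: ty_demo holds made-up values, and segments is a hypothetical helper table that is not part of the code above.

```python
import pandas as pd

# Made-up values with the same column names as the question's DataFrame
ty_demo = pd.DataFrame({'tag': [1, 2, 3, 4, 5],
                        'num2': [3, 5, 4, 4, 6]})

# Pair every row with the following row; the last row has no successor
nxt = ty_demo.shift(-1)
segments = pd.DataFrame({
    'x0': ty_demo['tag'], 'x1': nxt['tag'],
    'y0': ty_demo['num2'], 'y1': nxt['num2'],
}).iloc[:-1]

# Same rule as the loop: green if the value rises, red otherwise
rising = segments['y0'] < segments['y1']
segments['color'] = rising.map({True: 'green', False: 'red'})
print(segments)
```

Each row of segments then describes one two-point trace: x0 and x1 for x, y0 and y1 for y, and color for the line. A flat segment comes out red, as in the loop.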
