Categorical Bubble Plot for Mapping Studies

Categorical bubble plot for mapping studies

Here is ggplot2 solution. First, added radius as new variable to your data frame.

grid$radius <- sqrt( grid$count / pi )

You should play around with size of the points and text labels inside the plot to perfect fit.

library(ggplot2)
ggplot(grid,aes(Var1,Var2))+
geom_point(aes(size=radius*7.5),shape=21,fill="white")+
geom_text(aes(label=count),size=4)+
scale_size_identity()+
theme(panel.grid.major=element_line(linetype=2,color="black"),
axis.text.x=element_text(angle=90,hjust=1,vjust=0))

Sample Image

Indicator species plot (Bubble plot with categorical axes)

Do you mean something like this?

df %>%
gather(Direction, Value, -Species) %>%
mutate(
Species = as.factor(Species),
Orientation = as.factor(Direction)) %>%
ggplot(aes(Direction, Species)) +
geom_point(aes(size = Value))

Sample Image

For setting the range, take a look at scale_size, scale_radius and scale_size_area.

Python - Categorical bubble plot

I think a scatter plot is perfectly suitable to create this kind of categorical bubble plot.

Create the dataframe:

import pandas as pd
df = pd.DataFrame([[.3,.2,.4],[.1,.4,.1]], columns=list("ABC"), index=list("XY"))

Option 1: unstack the DataFrame

dfu = df.unstack().reset_index()
dfu.columns = list("XYS")

This creates a table like

   X  Y    S
0 A X 0.3
1 A Y 0.1
2 B X 0.2
3 B Y 0.4
4 C X 0.4
5 C Y 0.1

which you can plot column-wise. Since the sizes of scatters are points one would need to multiply the S column with some large number, like 5000 to get large bubbles.

import matplotlib.pyplot as plt
dfu["S"] *= 5000
plt.scatter(x="X", y="Y", s="S", data=dfu)
plt.margins(.4)
plt.show()

Option 2: create grid

Using e.g. numpy, one may create a grid of the dataframe's columns and index such that one may then plot a scatter of the flattened grid. Again one would need to multiply the dataframe values by some large number.

import numpy as np
import matplotlib.pyplot as plt

x,y = np.meshgrid(df.columns, df.index)

df *= 5000
plt.scatter(x=x.flatten(), y=y.flatten(), s=df.values.flatten())
plt.margins(.4)
plt.show()

In both cases the result would look like

Sample Image

Plot bubbleplot of entire dataframe with categorical x and y labels

First reshape your dataframe:

df2 = (df.rename_axis('month')
.reset_index()
.melt(id_vars=['month'], var_name='day')
)
>>> df2.head()
month day value
0 JAN MON 31
1 FEB MON 77
2 MAR MON 39
3 APR MON 100
4 MAY MON 60

Then, plot using seaborn.scatterplot:

import seaborn as sns
sns.scatterplot(data=df2, x='month', y='day', size='value')

seaborn bubbleplot

Order of categorical x and y axis in bubble chart reversed? How to unreverse?

Again, I don't understand what this question is all about. I like the plot, it's simple to follow. In my opinion, the ggplot's x and y variables are incorrectly defined. Also added scale_y_discrete(limits = rev) to change the order top-to-bottom.

Sample code:

pindalad_joonisele$asutus <- factor(pindalad_joonisele$asutus, levels = c("Eesti Draamateater", "Eesti Noorsooteater",
"Rahvusooper Estonia", "Rakvere Teatrimaja", "Sakala Teatrimaja",
"Vene Teater"))

pindalad_joonisele$hoone_liik <- factor(pindalad_joonisele$hoone_liik, levels = c("teatrihoone", "ladu", "töökojad", "muu"))

ggplot(pindalad_joonisele,
aes(x = hoone_liik,
y = asutus,
colour = hoone_liik,
size = hoone_suletud_netopind)) +
geom_point() +
geom_text(aes(label = hoone_suletud_netopind),
colour = "white",
size = 3.5) +
scale_x_discrete(position = "top")+
scale_y_discrete(limits = rev) # reverse y axis
scale_size_continuous(range = c(13, 35)) + # Adjust as required.
scale_colour_manual(values = prx_cols)+
labs(x = NULL, y = NULL) +
theme(legend.position = "none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.text.x = element_text(size=11),
axis.text.y = element_text(size=11),
axis.ticks = element_blank())

Plot:

Sample Image

Bubble Plot by Shape per data split type

Seaborn has a lot of in-depth customization packed into simple parameters. For your code, you simply want to add a keyword parameter to your sns.scatterplot() function:

style = 'split',

This will change the markers according to the categorical values, although it will pick the defaults. If you want more control over the specific markers being used, you can pass another parameter to map the categorical values to a specific marker:

markers = {'train': 'X', 'valid':'s'},

The marker codes can be found on the Matplotlib website (https://matplotlib.org/3.1.0/api/markers_api.html).

The final code should look like:

sns.scatterplot(x="y", 
y="y_pred",
size="interval_size",
data=df,
alpha=0.65,
c=interval_size,
cmap='viridis',
hue = 'split',
s = (interval_size**2)*50,
style = 'split',
markers = {'train': 'X', 'valid':'s'},
)


Related Topics



Leave a reply



Submit