Adding breaks to count (y axis) of a histogram according to the count min-max range in R?
You can make a function for breaks that takes the limits of the axis as its argument.
From the documentation of scale_continuous, breaks can take:
A function that takes the limits as input and returns breaks as output
Here is an example, where I go from 0 to the maximum y axis limit by 1. (I use 0 instead of the minimum count because histograms start at 0.)
The x in the function is the limits of the axis in the plot, as calculated by ggplot() or as set by the user.
byone = function(x) {
  seq(0, max(x), by = 1)
}
You can pass this function to breaks in scale_y_continuous(). The limits are pulled directly from the plot and passed to the first argument of the function.
ggplot(df, aes(user_id)) +
geom_histogram() +
scale_y_continuous(breaks = byone)
R Need to put a break in y axis
You can create a bit of a work-around for this issue. I'm assuming you want the plot to be more readable. You can reduce the magnitude of the very large values to make them more comparable to the smaller values while keeping the original axis labels. That is,
library(plotrix)  # provides gap.plot() and axis.break()
y2 <- ifelse(y>25000, y/100 + 25000, y)
gap.plot(x,y2,gap=c(25000,25000), gap.axis="y",ylim=c(0,2000000/100 + 25000),
ytics=c(seq(0,25000,by=5000),seq(27000,45000,by=5000)),
yticlab=c(seq(0,25000,by=5000),seq(200000,2000000,by=500000)))
axis.break(axis=2,breakpos=25000,style="slash")
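The rescaling trick can be checked with a little arithmetic. The Python sketch below reuses the threshold (25000) and factor (100) from the R snippet; the function names compress and expand are hypothetical and only illustrate how tick positions and tick labels correspond.

```python
THRESHOLD = 25000  # values above this are compressed (taken from the R snippet)
FACTOR = 100       # compression factor (taken from the R snippet)

def compress(y):
    """Map an original value to its plotted position."""
    return y / FACTOR + THRESHOLD if y > THRESHOLD else y

def expand(y2):
    """Inverse of compress: recover the original value from a plotted position."""
    return (y2 - THRESHOLD) * FACTOR if y2 > THRESHOLD else y2

# Labels shown above the break, and the positions at which they are drawn
labels = [200000, 700000, 1200000, 1700000]
positions = [compress(v) for v in labels]
print(positions)
```

The positions come out as 27000, 32000, 37000 and 42000, which is why ytics uses seq(27000, 45000, by=5000) while yticlab uses seq(200000, 2000000, by=500000).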
gnuplot, break y-axis in two parts
My answer is based on the example and comment on this website. In essence, you are looking for a broken y-axis, which can be achieved using the multiplot option:
reset
unset key
bm = 0.15
lm = 0.12
rm = 0.95
gap = 0.03
size = 0.75
kk = 0.5 # relative height of bottom plot
y1 = 0.0; y2 = 15.0; y3 = 180.0; y4 = 220.0
set style histogram columnstacked
set style data histograms
set key autotitle columnheader
set style fill solid 1.0 border -1
set multiplot
set border 1+2+8
set xtics nomirror
set ytics nomirror
set lmargin at screen lm
set rmargin at screen rm
set bmargin at screen bm
set tmargin at screen bm + size * kk
set yrange [y1:y2]
plot for [i=2:6] 'test.dat' using i
unset xtics
unset xlabel
set border 2+4+8
set bmargin at screen bm + size * kk + gap
set tmargin at screen bm + size + gap
set yrange [y3:y4]
plot for [i=2:6] 'test.dat' using i
unset multiplot
Remarks:
- The first block of code specifies your plot size, as well as the min/max values of the two y-axis ranges.
- The second block of code sets your histogram style. I suggest you use filled columns (otherwise, you might get the wrong colors where the plot is cut off by the y-range).
- The third block starts your multiplot environment and sets up the lower plot (4th block).
- The fifth block of code defines your upper plot.
- To customize your plot, change the parameters in the first block (bm, lm, rm, gap, size, kk and yi).
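To see where the two panels end up on the page, the margin expressions from the gnuplot script can be evaluated directly. The following Python sketch only repeats that arithmetic; the function name panel_margins is made up for illustration.

```python
def panel_margins(bm, size, kk, gap):
    """Return ((bottom, top) of the lower panel, (bottom, top) of the upper panel).

    All values are screen fractions, as used by
    'set bmargin at screen ...' and 'set tmargin at screen ...'.
    """
    lower = (bm, bm + size * kk)
    upper = (bm + size * kk + gap, bm + size + gap)
    return lower, upper

# Parameter values from the first block of the gnuplot script
lower, upper = panel_margins(bm=0.15, size=0.75, kk=0.5, gap=0.03)
print("lower panel:", lower)
print("upper panel:", upper)
```

With these values the lower panel spans roughly 0.15 to 0.525 of the screen height and the upper panel roughly 0.555 to 0.93, so kk controls how the total height size is split and gap is the blank strip between the panels.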
And this is the result:
Histogram with breaking axis and interlaced colorbar
Have you seen this example? It's for a broken y-axis plot in matplotlib.
Hope this helps.
Combining with pandas this gives:
import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO
data = """\
a b c d e
alpha 5.51 0.60 -0.12 26.90 76284.53
beta 3.39 0.94 -0.17 -0.20 -0.20
gamma 7.98 3.34 -1.41 7.74 28394.93
delta 2.29 1.24 0.40 0.29 0.28
"""
df = pd.read_csv(StringIO(data), sep=r'\s+')
f, axis = plt.subplots(2, 1, sharex=True)
df.plot(kind='bar', ax=axis[0])
df.plot(kind='bar', ax=axis[1])
axis[0].set_ylim(20000, 80000)
axis[1].set_ylim(-2, 30)
axis[1].legend().set_visible(False)
axis[0].spines['bottom'].set_visible(False)
axis[1].spines['top'].set_visible(False)
axis[0].xaxis.tick_top()
axis[0].tick_params(labeltop=False)
axis[1].xaxis.tick_bottom()
d = .015
kwargs = dict(transform=axis[0].transAxes, color='k', clip_on=False)
axis[0].plot((-d,+d),(-d,+d), **kwargs)
axis[0].plot((1-d,1+d),(-d,+d), **kwargs)
kwargs.update(transform=axis[1].transAxes)
axis[1].plot((-d,+d),(1-d,1+d), **kwargs)
axis[1].plot((1-d,1+d),(1-d,1+d), **kwargs)
plt.show()
Formatting a broken y axis in python matplotlib
OK, well I have made some edits and it now works (just not quite as I'd originally intended) and there is a new solution here which should be pushed to the matplotlib page soon.
The key code is this section:
# arguments to pass to plot, just so we don't keep repeating them
kwargs = dict(transform=ax.transAxes, color='k', clip_on=False)
ax.plot((-d, +d), (-d, +d), **kwargs) # top-left diagonal
ax.plot((1 - d, 1 + d), (-d, +d), **kwargs) # top-right diagonal
kwargs.update(transform=ax2.transAxes) # switch to the bottom axes
ax2.plot((-d, +d), (1 - d, 1 + d), **kwargs) # bottom-left diagonal
ax2.plot((1 - d, 1 + d), (1 - d, 1 + d), **kwargs) # bottom-right diagonal
You can amend it to:
axis_break1 = 450
axis_break2 = 9951
x_min = -0.75
x_max = len(data.index)
l = 0.2 # "break" line length
kwargs = dict(color="k", clip_on=False, linewidth=1)
ax.plot((x_min - l, x_min + l), (axis_break2, axis_break2), **kwargs)  # top-left
ax.plot((x_max - l, x_max + l), (axis_break2, axis_break2), **kwargs)  # top-right
ax2.plot((x_min - l, x_min + l), (axis_break1, axis_break1), **kwargs)  # bottom-left
ax2.plot((x_max - l, x_max + l), (axis_break1, axis_break1), **kwargs)  # bottom-right
Which leaves us with a neat (if slightly less fancy) result.
Or a revised (and more elegant) version (from ImportanceOfBeingErnest):
d = .25 # proportion of vertical to horizontal extent of the slanted line
kwargs = dict(marker=[(-1, -d), (1, d)], markersize=12,
linestyle="none", color='k', mec='k', mew=1, clip_on=False)
ax.plot([0, 1], [0, 0], transform=ax.transAxes, **kwargs)
ax2.plot([0, 1], [1, 1], transform=ax2.transAxes, **kwargs)
which results in diagonal lines as originally intended.
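One way to see why the marker-based version behaves more predictably is to compare the on-screen angle of the break marks in both approaches. The sketch below is only geometry, under the assumption that a segment from (-d, -d) to (+d, +d) in axes coordinates is stretched by the axes' pixel width and height, while a marker is drawn without distortion. The function names and pixel sizes are hypothetical.

```python
import math

def axes_fraction_angle(width_px, height_px):
    """Angle (degrees) of a (-d, -d) -> (+d, +d) segment drawn in axes coordinates.

    The segment spans 2*d*width_px horizontally and 2*d*height_px vertically,
    so d cancels and the angle depends only on the shape of the axes.
    """
    return math.degrees(math.atan2(height_px, width_px))

def marker_angle(d):
    """Angle (degrees) of a marker with vertices (-1, -d) and (1, d)."""
    return math.degrees(math.atan2(d, 1))

# Two stacked axes of equal width but different heights (made-up sizes)
print(axes_fraction_angle(400, 300))  # taller axes: steeper mark
print(axes_fraction_angle(400, 150))  # shorter axes: flatter mark
print(marker_angle(0.25))             # same for both axes
```

Under these assumptions, marks drawn in axes coordinates are only parallel when both axes have the same shape, whereas the marker angle is set by d alone.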