How to Output a Stem and Leaf Plot as a Plot

Here is one simple example:

plot.new()
tmp <- capture.output(stem(iris$Petal.Length))
text( 0,1, paste(tmp, collapse='\n'), adj=c(0,1), family='mono' )

If you want to overlay the display on a histogram, you probably want to call the text function on each element of tmp separately rather than pasting them into one string. Functions like strheight and strwidth will be useful for working out the coordinates.
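As a rough sketch of that per-line approach, the code below draws each captured line over a histogram, spacing the lines with strheight. The 0.6 text size and the 1.5 line-spacing factor are arbitrary choices for illustration, not values from the original answer.

```r
x <- iris$Petal.Length
stem_lines <- capture.output(stem(x))

hist(x, col = "grey90", border = "white",
     main = "Histogram with stem-and-leaf overlay")

# Limits of the plot region in user coordinates: c(x1, x2, y1, y2)
usr <- par("usr")

# Height of one line of monospaced text, padded to leave some line spacing
line_h <- strheight("M", units = "user", cex = 0.6, family = "mono") * 1.5

# One y position per captured line, working down from the top of the plot
y_pos <- usr[4] - line_h * seq_along(stem_lines)

# text() is vectorised, so each element of stem_lines gets its own position
text(usr[1], y_pos, stem_lines, adj = c(0, 0.5), family = "mono", cex = 0.6)
```

Because each line has its own coordinates, individual lines can be moved, coloured, or dropped without touching the rest.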

The gplots and plotrix packages also provide functions for plotting text and adding tables to plots, and other packages probably offer similar functions.
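For instance, textplot from gplots accepts a character vector, so the captured lines can be handed to it directly. This is only a sketch: it assumes gplots is installed, and the halign and valign settings are one possible layout.

```r
stem_lines <- capture.output(stem(iris$Petal.Length))

if (requireNamespace("gplots", quietly = TRUE)) {
  # textplot() draws the character vector as text in a new plot
  gplots::textplot(stem_lines, halign = "left", valign = "top")
} else {
  message("gplots is not installed; skipping the textplot() example")
}
```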

Stem-and-leaf display graphical output with subset highlight in R

One possibility is a back-to-back stem-and-leaf plot, with the values to be highlighted on one side and the rest on the other. The stem.leaf.backback function in the aplpack package (see ?stem.leaf.backback) can do this for you. All you really need to do is write some code to separate the highlighted values from the rest. Consider:

library(aplpack)
xs <- c(rep(10,3), rep(20, 2), rep(30, 5))
xs_to_highlight <- c(10, 30)
# match() returns the first position of each value, so exactly one 10 and one 30 are removed
other_xs <- xs[-match(xs_to_highlight, xs)]
stem.leaf.backback(x=xs_to_highlight, y=other_xs, m=1)
# ____________________________
#   1 | 2: represents 12, leaf unit: 1
#   xs_to_highlight      other_xs
# ____________________________
#    1             0| 1 |00         2
#                   | 2 |00        (2)
#    1             0| 3 |0000       4
# ____________________________
# n:               2     8
# ____________________________

stem and leaf plot in R wrong '|' position

This is not an issue of integer vs. double. The issue is that the range of values is too small to create a big enough stem-and-leaf plot with the second digit.

You can change the tolerance to get the result you want by passing a value for atom:

stem(x1,atom=10)

The output is:

  The decimal point is 1 digit(s) to the right of the |

   8 | 0000111122222333444
   8 | 555666666777778888999999999
   9 | 000111122223333444444
   9 | 55556666666677777788888999999
  10 | 0000
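To see the effect side by side, the sketch below compares the default call with atom = 10. The vector x is made-up data with the same size and range as the output above (100 integers from 80 to 100); the original x1 is not shown here, so this is an assumption.

```r
# Hypothetical data: 100 integers spanning 80 to 100
x <- rep(80:100, length.out = 100)

# Default tolerance: the narrow range makes stem() use one stem per integer
default_out <- capture.output(stem(x))

# Larger atom: stem() treats the range as wider and picks coarser stems
atom_out <- capture.output(stem(x, atom = 10))

cat(default_out, sep = "\n")
cat(atom_out, sep = "\n")
```

The scale argument of stem is another way to change the plot length, should atom not give the layout you need.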

how to add stem-leaf plot in R subplot

You need to select the columns of df you want to plot (assuming it is a data frame), for instance with the $ operator. Furthermore, stem prints text to the console rather than drawing a plot, so you have to capture that text and draw it with the text() function (see How to output a stem and leaf plot as a plot).

Here is an example using the mtcars dataset:

par(mfrow=c(2,2))
hist(mtcars$cyl, col="orange")
hist(mtcars$mpg, col="green")
boxplot(mtcars$hp, main="Boxplot",col = "yellow")
plot.new()
tmp <- capture.output(stem(mtcars$drat))
# Adjust the size of the text with the cex parameter
text(0, 1, paste(tmp, collapse='\n'), adj=c(0,1), family='mono', cex=1)
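If several panels need a stem-and-leaf display, the capture-and-draw steps can be wrapped in a small function. The name stem_panel and its arguments are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: draws the stem() output for x in the current panel
stem_panel <- function(x, main = "Stem-and-leaf", cex = 0.8, ...) {
  out <- capture.output(stem(x, ...))  # extra arguments go to stem()
  plot.new()
  title(main = main)
  text(0, 1, paste(out, collapse = "\n"), adj = c(0, 1),
       family = "mono", cex = cex)
  invisible(out)                       # return the captured lines
}

op <- par(mfrow = c(1, 2))
hist(mtcars$drat, col = "orange", main = "Histogram of drat")
stem_out <- stem_panel(mtcars$drat, main = "Stem-and-leaf of drat")
par(op)  # restore the previous layout
```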

Grouping of Stem-Leaf plot in R

This isn't going to win any beauty contest, but you can definitely use a combination of cut and some string processing to create your own grouped stem function.

Here's an example function, commented so you can extend it to suit your actual needs:

grouped_stem <- function(invec, n = 3) {
  # Breaks every 10, from the tens at or below the minimum to the tens
  # just above the maximum, so that no value falls outside the intervals
  cuts <- seq((min(invec) %/% 10) * 10, (max(invec) %/% 10 + 1) * 10, 10)
  # For pretty labels in `cut`: drop the last digit of each break
  labs <- sub("(.*).$", "\\1", cuts)
  labs <- replace(labs, !nzchar(labs), "0")
  # List of the values according to their `cut` intervals
  temp <- split(invec, cut(invec, cuts, labs[-length(labs)], right = FALSE))
  # Only interested in the last digit
  temp <- relist(sub(".*(.)$", "\\1", unlist(temp, use.names = FALSE)), temp)
  # Paste the values together. Add in a "*" that we can get rid of later if not required
  combined <- vapply(temp, function(y) sprintf("%s*", paste(y, collapse = "")), character(1L))
  # Split by number of groups of tens per stem
  splits <- split(combined, ((seq_along(combined) - 1) %/% n))
  # Construct the stems and leaves
  stems <- vapply(splits, function(x) {
    paste(names(x)[1], names(x)[length(x)], sep = " to ")
  }, character(1L))
  leaves <- vapply(splits, function(x) {
    sub("[*]$", "", paste(x, sep = "", collapse = ""))
  }, character(1L))
  # Print and store
  cat(sprintf(sprintf("%%%ss | %%s", max(nchar(stems)) + 2), stems, leaves), sep = "\n")
  invisible(setNames(as.list(leaves), stems))
}

Run on your sample data, it produces:

grouped_stem(data)
## 0 to 2 | 67*179*26
## 3 to 5 | 2478*15699*368
## 6 to 8 | 24457**56
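The core of the approach is cut followed by split. The sketch below shows just that step on a small made-up vector (vals is hypothetical, not the sample data from the question), using arithmetic instead of string processing to get the last digit.

```r
# Hypothetical values, for illustration only
vals <- c(6, 7, 11, 17, 19, 22, 26, 32)

# Assign each value to its group of ten: [0,10), [10,20), [20,30), [30,40)
decade <- cut(vals, breaks = seq(0, 40, by = 10), labels = 0:3, right = FALSE)

# Keep only the last digit of each value, grouped by decade
leaves <- split(vals %% 10, decade)

# Collapse each group into one leaf string
leaf_strings <- sapply(leaves, paste, collapse = "")
print(leaf_strings)
#     0     1     2     3
#  "67" "179"  "26"   "2"
```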

