How Is Ggplot2 Plus Operator Defined

How is ggplot2 plus operator defined?

If you dissect +.gg we have:

> ggplot2:::`+.gg`
function (e1, e2)
{
    e2name <- deparse(substitute(e2))
    if (is.theme(e1))
        add_theme(e1, e2, e2name)
    else if (is.ggplot(e1))
        add_ggplot(e1, e2, e2name)
}

Besides add_theme, what you're interested in is add_ggplot, which can be accessed with ggplot2:::add_ggplot. The latter, a long yet very organized function, reveals more "cascading" functions that dispatch on what is being added.

That being said, R "knows" which function to apply when "+" is used on an object of class gg because of S3 method dispatch. You can find the starting point in the ggplot2 GitHub repository, in ggproto.R, on which I think most of ggplot2's behaviour depends.
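To see the dispatch mechanism without reading the ggplot2 source, here is a minimal base R sketch. The names toyplot, toylayer and `+.toyplot` are made up for illustration; this is not how ggplot2 stores layers, only the same S3 pattern in miniature.

```r
# Hypothetical "toyplot" object: a list of layers with a class attribute.
toyplot <- function() structure(list(layers = list()), class = "toyplot")

# Hypothetical "toylayer" object that can be added to a toyplot.
toylayer <- function(name) structure(list(name = name), class = "toylayer")

# S3 method for `+`: R calls this when an operand has class "toyplot".
`+.toyplot` <- function(e1, e2) {
  # Capture the expression passed as e2 so it can be shown in error messages
  e2name <- deparse(substitute(e2))
  if (!inherits(e2, "toylayer")) {
    stop("Can't add `", e2name, "` to a toyplot object", call. = FALSE)
  }
  # Return e1 modified by appending e2 to its layers
  e1$layers <- c(e1$layers, list(e2))
  e1
}

p <- toyplot() + toylayer("points") + toylayer("smooth")
length(p$layers)
```

Each `+` returns the modified left-hand object, which is why the calls can be chained.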

Is that what you're looking for?

Plus sign between ggplot2 and other function (R)

The function definition that @Richard Scriven refers to in a comment is defined in plot-construction.r, which might make it clearer. You'll need to go through the source to see exactly what those two (unexported) functions do (which one is called depends on whether the LHS of the call is a theme or a ggplot object), but the names should give you a pretty good idea. The return value is e1 modified by "adding" e2.

"+.gg" <- function(e1, e2) {
  # Get the name of what was passed in as e2, and pass along so that it
  # can be displayed in error messages
  e2name <- deparse(substitute(e2))

  if (is.theme(e1)) add_theme(e1, e2, e2name)
  else if (is.ggplot(e1)) add_ggplot(e1, e2, e2name)
}

So, yes, + is overloaded for objects inheriting class gg (all ggplot2 objects).

I think 'pipe' (@alistaire's comment) is a misleading analogy; this is very much in the style of the standard Ops group generic.
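For comparison, this is roughly what a method for the whole Ops group looks like. It is a sketch in base R with a made-up class called metres; ggplot2 itself defines a method for the single operator `+`, not a full Ops method.

```r
# Hypothetical class "metres" wrapping a numeric vector.
metres <- function(x) structure(list(value = x), class = "metres")

# One Ops method covers +, -, *, /, ==, < and the other binary operators in the group.
# .Generic holds the name of the operator that triggered the dispatch.
# (Unary operators such as -x are not handled in this sketch.)
Ops.metres <- function(e1, e2) {
  v1 <- if (inherits(e1, "metres")) e1$value else e1
  v2 <- if (inherits(e2, "metres")) e2$value else e2
  result <- get(.Generic)(v1, v2)
  # Arithmetic keeps the class; comparisons return plain logicals
  if (.Generic %in% c("+", "-", "*", "/")) metres(result) else result
}

a <- metres(3) + metres(4)
a$value
metres(3) < metres(4)
```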

What exactly is ggplot doing with the `+` operator?

I'll answer the first question. You should ask the second question in a separate posting.

R lets you override most operators. The easiest way to do it is using the "S3" object system. This is a very simple system where you attach an attribute named "class" to the object, and that affects how R processes some functions. (The ones this applies to are called "generic functions". There are other functions that don't pay any attention to the class.)

Each ggplot2 function returns an object with a class. You can use the class() function to get the class. For example, class(ggplot(data = mtcars)) is a character vector containing c("gg", "ggplot"), and class(geom_histogram(bins = 10, color = "purple", fill = "white")) is the vector c("LayerInstance", "Layer", "ggproto", "gg").

If you ask for methods("+") you'll see all the classes with methods defined for addition, and that includes "gg", so R will call that method to process the addition in the expression you used.
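To make the "class attribute" idea concrete, here is a minimal sketch in base R. The class name celsius and the method format.celsius are made up for illustration and have nothing to do with ggplot2.

```r
# A plain list becomes an S3 object once it carries a "class" attribute.
temp <- list(value = 21.5)
class(temp) <- "celsius"   # "celsius" is a hypothetical class name

# format() is a generic function, so R looks for format.celsius
# before falling back to format.default.
format.celsius <- function(x, ...) paste(x$value, "degrees C")

format(temp)               # dispatches to format.celsius
inherits(temp, "celsius")  # checks the class attribute

# No length method exists for "celsius", so the built-in behaviour applies.
length(temp)
```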

What is the difference between the + operator in ggplot2 and the %>% operator in magrittr?

Piping is very different from ggplot2's addition. What the pipe operator, %>%, does is take the result of the left-hand side and put it as the first argument of the function on the right-hand side. For example:

1:10 %>% mean()
# [1] 5.5

This is exactly equivalent to mean(1:10). The pipe is more useful for replacing deeply nested function calls, e.g.,

x = factor(2008:2012)
x_num = as.numeric(as.character(x))
# could be rewritten to read from left-to-right as
x_num = x %>% as.character() %>% as.numeric()

but this is all explained nicely over at "What does %>% mean in R?"; you should read through that for a couple more examples.
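If it helps to see the mechanism, here is a deliberately simplified pipe-like operator written in base R. The name %then% is made up, and this is not magrittr's implementation: it only accepts a bare function on the right-hand side, whereas %>% also accepts calls such as mean() and placeholders.

```r
# Simplified sketch: pass the left-hand side as the first (and only)
# argument of the function on the right-hand side.
`%then%` <- function(lhs, rhs) rhs(lhs)

x <- factor(2008:2012)

# Reads left to right, like the %>% version above
x_num <- x %then% as.character %then% as.numeric

# Same result as the nested call
identical(x_num, as.numeric(as.character(x)))
```

Nothing here combines two objects; the operator only rearranges a function call, which is the key difference from ggplot2's +.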

Using this knowledge, we can re-write your pipe examples as nested functions and see that they still do the same things; but now it (hopefully) is obvious why #4 doesn't work:

# 3. This is acceptable ggplot2 syntax
ggplot(data = mtcars) + geom_point(aes(x=wt, y = mpg))

# 4. This is not
geom_point(aes(ggplot(data = mtcars), x=wt, y = mpg))

ggplot2 includes a special "+" method for ggplot objects, which it uses to add layers to plots. I didn't know until you asked your question that it also works with the aes() function, but apparently that's defined as well. These are all specially defined within ggplot2. The use of + in ggplot2 predates the pipe, and while the usage is similar, the functionality is quite different.

As an interesting side-note, Hadley Wickham (the creator of ggplot2) said that:

...if I'd discovered the pipe earlier, there never would've been a ggplot2, because you could write ggplot graphics as

ggplot(mtcars, aes(wt, mpg)) %>%
  geom_point() %>%
  geom_smooth()

What is the %||% operator (used in ggplot2) and where is it defined?

Your initial approach was good, one additional trick would be to add backticks to your query:

R> ?`%||%`

This brings up the help page for null-default from purrr, which describes it as "This infix function makes it easy to replace NULLs with a default value".

In use:

R> 1 %||% 2
[1] 1
R> NULL %||% 2
[1] 2
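The behaviour is simple enough to sketch in one line of base R. This is an illustrative definition under the assumption that the operator only checks for NULL; check the package source for the exact implementation. The opts list is a made-up example.

```r
# Minimal sketch of a null-default operator: return x unless it is NULL.
`%||%` <- function(x, y) if (is.null(x)) y else x

opts <- list(colour = "purple")   # hypothetical options list

opts$colour %||% "black"          # element present, so its value is kept
opts$fill %||% "white"            # element missing (NULL), so the default is used
```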

Meaning of @ operator in R language?

meuse is an S4 object

isS4(meuse)
[1] TRUE

If you take the structure of meuse with str(meuse), you'll see some fields are denoted with your @ operator, including one called data. These fields are called slots, and they are accessed with @ in much the same way that elements of a list or data frame are accessed with the $ operator. So meuse@data gives you the data portion of the meuse object.

str(meuse)

Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 155 obs. of 12 variables:
.. ..$ cadmium: num [1:155] 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
.. ..$ copper : num [1:155] 85 81 68 81 48 61 31 29 37 24 ...
.. ..$ lead : num [1:155] 299 277 199 116 117 137 132 150 133 80 ...
.. ..$ zinc : num [1:155] 1022 1141 640 257 269 ...
.. ..$ elev : num [1:155] 7.91 6.98 7.8 7.66 7.48 ...
.. ..$ dist : num [1:155] 0.00136 0.01222 0.10303 0.19009 0.27709 ...
.. ..$ om : num [1:155] 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
.. ..$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
.. ..$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
.. ..$ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
.. ..$ dist.m : num [1:155] 50 30 150 270 380 470 240 120 240 420 ...
..@ coords.nrs : int [1:2] 1 2
..@ coords : num [1:155, 1:2] 181072 181025 181165 181298 18130

See how that subsetting is working?

str(meuse@data)
'data.frame': 155 obs. of 12 variables:
$ cadmium: num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
$ copper : num 85 81 68 81 48 61 31 29 37 24 ...
$ lead : num 299 277 199 116 117 137 132 150 133 80 ...
$ zinc : num 1022 1141 640 257 269 ...
$ elev : num 7.91 6.98 7.8 7.66 7.48 ...
$ dist : num 0.00136 0.01222 0.10303 0.19009 0.27709 ...
$ om : num 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
$ landuse: Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
$ dist.m : num 50 30 150 270 380 470 240 120 240 420 ...
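If you don't have the sp package at hand, the same behaviour can be reproduced with a small S4 class of your own. The class ToyPoints and its contents below are made up for illustration; only the slot names echo the output above.

```r
library(methods)  # loaded by default in interactive sessions; explicit for scripts

# Hypothetical S4 class with two slots
setClass("ToyPoints", slots = c(data = "data.frame", coords = "matrix"))

pts <- new("ToyPoints",
           data   = data.frame(zinc = c(1022, 1141)),
           coords = matrix(c(1, 2, 3, 4), ncol = 2))

isS4(pts)            # confirms this is an S4 object
slotNames(pts)       # lists the available slots
pts@data$zinc        # @ picks the slot, $ then picks a column inside it
slot(pts, "coords")  # functional equivalent of pts@coords
```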

How to use the rlang `!!!` operator to define a function that wraps around a ggplot call? (Error: Can't use `!!!` at top level)

Your specific use case is an example in ?aes. aes automatically quotes its arguments. One can simply directly pass the dots. Try:

plot_points3 <- function(d, ...){
  print(aes(...))
  ggplot(d, aes(...)) + geom_point(alpha = 0.1)
}
plot_points3(df, x = x, y = y, color = z)

This nicely prints:

Aesthetic mapping: 
* `x` -> `x`
* `y` -> `y`
* `colour` -> `z`

And yields the required plot.

What does eg %+% do? in R

The ultimate reason is that if you do both general-purpose programming and numerical computations, it is useful to have a large complement of binary operators available. For example, if you store numbers in two-dimensional arrays, you may want to multiply the arrays elementwise, or you may want to compute the matrix product of two arrays. In Matlab these two operators are .* and *; in R they are * and %*%. Python has resisted attempts to add new operators, and so numpy differentiates between the two kinds of product by having two classes: the array class is multiplied elementwise, the matrix class is multiplied in the linear-algebra sense.

Another example from Python is that for lists, plus means concatenation: [1,2,3]+[4,5] == [1,2,3,4,5]. But for numpy arrays, plus means elementwise addition: array([1,2]) + array([4,5]) == array([5,7]). If your code needs to do both, you have to convert between classes or use function notation, which can lead to cumbersome-looking code, especially where mathematics is involved.

So it would sometimes be convenient to have more operators available for use, and you might not know in advance what sorts of operators a particular application calls for. Therefore, the implementors of R have chosen to treat as operators anything named like %foo%, and several examples exist: %in% is set membership, %x% is Kronecker product, %o% is outer product. For an example of a language that has taken this to the extreme, see Fortress (section 16 of the specification starts with the rules for operator names).
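Defining such an operator yourself takes one line. The sketch below uses the name %+% for string concatenation purely as an illustration; it is a made-up definition and would mask any %+% exported by a package loaded earlier.

```r
# Any function whose name has the form %something% can be used as a binary operator.
# Hypothetical definition: %+% concatenates strings.
`%+%` <- function(lhs, rhs) paste0(lhs, rhs)

"gg" %+% "plot" %+% "2"

# Built-in operators of the same kind
3 %in% 1:5      # set membership
1:2 %o% 1:3     # outer product, a 2 x 3 matrix
```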

In the blog post you mentioned, the author is using the ggplot2 graphing package, which defines %+% to mean some kind of combination of two plot elements. Really it seems to add a method to the bare + (which is a generic function so you can define what it means for user-defined objects), but it also defines %+% so that you can use the ggplot2 meaning of + (whatever it is) for other objects. If you install ggplot2, type require(ggplot2) and ?`%+%` to see the documentation of that operator, and methods(`+`) to see that a new definition has been added to +.

Adding objects together in R (like ggplot layers)

You just need to define a method for the generic function +. (At the link in your question, that method is "+.gg", designed to be dispatched by arguments of class "gg".)

## Example data of a couple different classes
dd <- mtcars[1, 1:4]
mm <- as.matrix(dd)

## Define method to be dispatched when one of its arguments has class data.frame
`+.data.frame` <- function(x,y) rbind(x,y)

## Any of the following three calls will dispatch the method
dd + dd
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110
dd + mm
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110
mm + dd
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110

