What's the Real Meaning About 'Everything That Exists Is an Object' in R

The function is.object only checks whether the object has a "class" attribute, so it does not have the same meaning as in the slogan.

For instance:

x <- 1
attributes(x) # it does not have a class attribute
NULL
is.object(x)
[1] FALSE
class(x) <- "my_class"
attributes(x) # now it has a class attribute
$class
[1] "my_class"
is.object(x)
[1] TRUE
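
As a side note to the example above, here is a minimal sketch showing that an object without a "class" attribute still has an implicit class, so class() returns something even when is.object() is FALSE. The variable name y is only for illustration.

```r
# A plain numeric vector: no "class" attribute, but an implicit class.
y <- 1
is.object(y)        # FALSE: no class attribute is set
attr(y, "class")    # NULL
class(y)            # "numeric" (the implicit class)

# Setting the attribute is what makes is.object() return TRUE.
class(y) <- "my_class"
is.object(y)        # TRUE
```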

Now, trying to answer your real question, about the slogan, this is how I would put it. Everything that exists in R is an object in the sense that it is a kind of data structure that can be manipulated. I think this is better understood with functions and expressions, which are not usually thought of as data.

Taking a quote from Chambers (2008):

The central computation in R is a function call, defined by the
function object itself and the objects that are supplied as the
arguments. In the functional programming model, the result is defined
by another object, the value of the call. Hence the traditional motto
of the S language: everything is an object—the arguments, the value,
and in fact the function and the call itself: All of these are defined
as objects. Think of objects as collections of data of all kinds. The data contained and the way the data is organized depend on the class from which the object was generated.

Take the expression mean(rnorm(100), trim = 0.9), for example. Until it is evaluated, it is an object very much like any other, so you can change its elements just as you would with a list. For instance:

call <- substitute(mean(rnorm(100), trim = 0.9))
call[[2]] <- substitute(rt(100,2 ))
call
mean(rt(100, 2), trim = 0.9)
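
To make the "call as data" idea a bit more concrete, the sketch below inspects an unevaluated call with ordinary list-style tools. It uses quote() rather than substitute() and the name cl, both chosen here for illustration.

```r
# Capture the call without evaluating it.
cl <- quote(mean(rnorm(100), trim = 0.9))

class(cl)     # "call"
length(cl)    # 3: the function name plus two arguments
cl[[1]]       # the symbol `mean`
cl$trim       # 0.9, accessed by argument name

# Modify a named argument in place, just like a list element.
cl$trim <- 0.1
deparse(cl)   # "mean(rnorm(100), trim = 0.1)"
```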

Or take a function, like rnorm:

rnorm
function (n, mean = 0, sd = 1) 
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>

You can change its default arguments just like a simple object, like a list, too:

formals(rnorm)[2] <- 100
rnorm
function (n, mean = 100, sd = 1) 
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
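
The same kind of manipulation can be tried on a small user-defined function, which avoids shadowing a function from stats. This is only a sketch; the function f and its arguments are made up for the example.

```r
# A hypothetical function used only for this illustration.
f <- function(n, mean = 0) n + mean

# The formals (argument list) and the body are both modifiable objects.
formals(f)$mean <- 100
f(1)                       # 101

body(f) <- quote(n * mean)
f(2)                       # 200
```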

Taking one more time from Chambers (2008):

The key concept is that expressions for evaluation are themselves
objects; in the traditional motto of the S language, everything is an
object. Evaluation consists of taking the object representing an
expression and returning the object that is the value of that
expression.

So going back to our call example, the call is an object which represents another object. When evaluated, it becomes that other object, which in this case is the numeric vector with one number: -0.008138572.

set.seed(1)
eval(call)
[1] -0.008138572
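
A deterministic sketch of the same idea: a single call object can be evaluated against different data, and each evaluation returns a new value object. The names expr_xy, x and y are assumptions made for this example.

```r
# One call object...
expr_xy <- quote(x + y)

# ...evaluated with two different sets of values.
r1 <- eval(expr_xy, list(x = 1, y = 2))     # 3
r2 <- eval(expr_xy, list(x = 10, y = 20))   # 30
```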

And that would take us to the second slogan, which you did not mention, but usually comes together with the first one: "Everything that happens is a function call".

Taking again from Chambers (2008), he actually qualifies this statement a little bit:

Nearly everything that happens in R results from a function call.
Therefore, basic programming centers on creating and refining
functions.

So what that means is that almost every transformation of data that happens in R is a function call. Even a simple thing, like a parenthesis, is a function in R.

So taking the parenthesis as an example, you can actually redefine it to do things like this:

`(` <- function(x) x + 1
(1)
[1] 2

Which is not a good idea, but it illustrates the point. So this is how I would sum it up: everything that exists in R is an object because it is data that can be manipulated. And (almost) everything that happens is a function call, which is an evaluation of one object that gives you another object.
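
The claim that operators and syntax are function calls can also be checked without redefining anything. A small sketch, using backticks to call a few built-in operators by name (the vector v is just example data):

```r
# Operators are ordinary functions that can be called by name.
`+`(1, 2)                  # same as 1 + 2
v <- c(10, 20, 30)
`[`(v, 2)                  # same as v[2]
`if`(TRUE, "yes", "no")    # same as if (TRUE) "yes" else "no"

# Because they are functions, they can be passed to other functions.
sapply(1:3, `-`, 1)        # 0 1 2
```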

What kind of object is `...`?

What an interesting question!

Dot-dot-dot ... is an object (John Chambers is right!) and it's a type of pairlist. Well, I searched the documentation, so I'd like to share it with you:

R Language Definition document says:

The ‘...’ object type is stored as a type of pairlist. The components of ‘...’ can be accessed in the usual pairlist manner from C code, but is not easily accessed as an object in interpreted code. The object can be captured as a list.

Another chapter defines pairlists in detail:

Pairlist objects are similar to Lisp’s dotted-pair lists.

Pairlists are handled in the R language in exactly the same way as generic vectors (“lists”).

Help on Generic and Dotted Pairs says:

Almost all lists in R internally are Generic Vectors, whereas traditional dotted pair lists (as in LISP) remain available but rarely seen by users (except as formals of functions).

And a nice summary is here at Stack Overflow!
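
In interpreted code, the usual way to work with ... is to capture it as a list, as the Language Definition quote suggests. A minimal sketch, where the function name describe_dots is hypothetical:

```r
# Hypothetical helper that inspects whatever is passed through `...`.
describe_dots <- function(...) {
  args <- list(...)          # capture the dots as an ordinary list
  list(
    n     = ...length(),     # number of arguments in `...` (R >= 3.5.0)
    names = names(args),     # argument names, if any
    first = ..1              # the first element of `...`
  )
}

res <- describe_dots(a = 1, b = "two")
res$n       # 2
res$names   # "a" "b"
res$first   # 1
```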

R scoping question: Object exists but I can't do anything with it

The catch is that CheckNumber exists in the parent frame (from where little_fun is called) but not in the parent environment (where little_fun is defined).

Test with additional code in little_fun:

little_fun <- function(){
    print(paste("CheckNumber exists in parent frame =",
                exists("CheckNumber", where = parent.frame())))
    ## present in parent environment?
    print(paste("CheckNumber exists in parent environment =",
                exists("CheckNumber", where = parent.env(environment()))))
    print(paste("CheckNumber exists in current frame =", exists("CheckNumber")))
    if(exists("CheckNumber", where = parent.frame())){
        print(CheckNumber + 2)
    }
}

To make CheckNumber available, define it in the same environment as little_fun or in a higher-level one, not in a sibling environment (big_fun is a sibling of little_fun inside the global environment, unless you, for example, define little_fun inside big_fun).

Anyhow, supplying the value as a function argument, as in little_fun(CheckNumber = 5), stops functions from groping around in parent environments for same-named variables. Functions that depend on variables other than their arguments are not easy to reuse in other code.
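
The difference can be sketched as follows. The original big_fun is not shown here, so the version below is a hypothetical reconstruction, and the names little_fun_scoped and little_fun_arg are invented for the comparison. It assumes no global variable called CheckNumber exists.

```r
# Relies on lexical scoping: looks for CheckNumber where it was DEFINED.
little_fun_scoped <- function() CheckNumber + 2

# Takes the value explicitly as an argument.
little_fun_arg <- function(CheckNumber) CheckNumber + 2

# Hypothetical caller that defines CheckNumber locally.
big_fun <- function() {
  CheckNumber <- 5
  scoped <- tryCatch(little_fun_scoped(),
                     error = function(e) "CheckNumber not found")
  list(scoped = scoped, arg = little_fun_arg(CheckNumber))
}

out <- big_fun()
out$scoped   # the lookup fails: big_fun's frame is not an enclosing environment
out$arg      # 7
```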

(Background explanation in Chapter 7, "Environments", of Hadley Wickham's Advanced R.)

How to check if object (variable) is defined in R?

You want exists():

R> exists("somethingUnknown")
[1] FALSE
R> somethingUnknown <- 42
R> exists("somethingUnknown")
[1] TRUE
R>
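
One caveat worth sketching: by default exists() also searches enclosing environments, so it reports TRUE for any name it can reach, including base functions. The environment e and the variable name answer below are assumptions for the example.

```r
# A fresh environment holding a single variable.
e <- new.env()
assign("answer", 42, envir = e)

exists("answer", envir = e)                   # TRUE: defined in e
exists("mean", envir = e)                     # TRUE: found via enclosing environments
exists("mean", envir = e, inherits = FALSE)   # FALSE: not defined in e itself
```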

What happens during evaluation in R?

This is going to be an incomplete answer, but it seems your question is about the nature of the "internal representation." In essence, R's parser takes arbitrary R code, removes irrelevant stuff (like superfluous whitespace) and creates a nested set of expressions to evaluate. We can use pryr::call_tree() to see what is going on.

Take a simple expression that only uses mathematical operators:

> 1 + 2 - 3 * 4 / 5
[1] 0.6

The result respects R's operator precedence rules. But what is actually happening? First, the parser converts whatever is typed into an "expression":

> parse(text = "1 + 2 - 3 * 4 / 5")
expression(1 + 2 - 3 * 4 / 5)

This expression masks a deeper complexity:

> library("pryr")
> call_tree(parse(text = "1 + 2 - 3 * 4 / 5"))
\- ()
  \- `-
  \- ()
    \- `+
    \-  1
    \-  2
  \- ()
    \- `/
    \- ()
      \- `*
      \-  3
      \-  4
    \-  5

This expression is a nested set of four function calls: "-"() is applied to the results of "+"() and "/"(), and "/"() is in turn applied to the result of "*"(). Thus, it can be rewritten as a deeply nested expression:

> "-"("+"(1,2), "/"("*"(3,4), 5))
[1] 0.6
> call_tree(parse(text = '"-"("+"(1,2), "/"("*"(3,4), 5))'))
\- ()
  \- `-
  \- ()
    \- `+
    \-  1
    \-  2
  \- ()
    \- `/
    \- ()
      \- `*
      \-  3
      \-  4
    \-  5
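
If pryr is not installed, the same nesting can be explored with base R only, since a call can be indexed like a list. This is a rough sketch rather than a replacement for call_tree(); the name e is chosen for the example.

```r
# Capture the expression without evaluating it.
e <- quote(1 + 2 - 3 * 4 / 5)

e[[1]]        # the outermost function: `-`
e[[2]]        # its first argument: 1 + 2
e[[3]]        # its second argument: 3 * 4/5
e[[3]][[1]]   # the function of that sub-call: `/`

eval(e)       # 0.6
```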

Multi-line expressions are also parsed into individual expressions:

> parse(text = "1; 2; 3")
expression(1, 2, 3)
> parse(text = "1\n2\n3")
expression(1, 2, 3)
> call_tree(parse(text = "1; 2; 3"))
\-  1

\-  2

\-  3

These call trees are then evaluated.

Thus, when R's read-eval-print loop executes, it parses the code typed in the interpreter or sourced from a file into this call tree structure, sequentially evaluates each function call, and then prints the result (unless an error occurs). Errors occur when a parsable line of code cannot be fully evaluated:

> call_tree(parse(text = "2 + 'A'"))
\- ()
  \- `+
  \-  2
  \-  "A"

And a parsing failure occurs when a typable line of code cannot be parsed into a call tree:

> parse(text = "2 + +")
Error in parse(text = "2 + +") : <text>:2:0: unexpected end of input
1: 2 + +
   ^
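
The two failure modes can be separated in code by wrapping each step in tryCatch(). A minimal sketch; the result strings are placeholders chosen for this example, not R's own error messages.

```r
# Step 1 succeeds: the text is syntactically valid.
parsed <- parse(text = "2 + 'A'")

# Step 2 fails: the call cannot be evaluated.
eval_result <- tryCatch(eval(parsed[[1]]),
                        error = function(e) "evaluation error")

# Here step 1 itself fails: the text cannot be parsed at all.
parse_result <- tryCatch(parse(text = "2 + +"),
                         error = function(e) "parse error")

eval_result    # "evaluation error"
parse_result   # "parse error"
```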

That's not a complete story, but perhaps it gets you part way to understanding.

R: How to check whether object exists inside function?

We can do this by specifying a specific environment for exists to search in and tell it to only look there, and not in the enclosing environments.

The where= argument tells exists where to look for that object. We can either specify it explicitly with environment() (which returns the current environment), or rely on the default value of -1, which refers to the environment from which exists is called.

Either way, the key is to set inherits=FALSE to limit it to only the specified environment. Otherwise, it also looks in the enclosing environments (like R_GlobalEnv) which we don't want:

x <- 1
fun <- function(){exists("x", inherits = FALSE)}

fun()
[1] FALSE

However, if we define x in the environment of the function, it returns TRUE:

fun <- function(){
    x <- 3
    exists("x", inherits = FALSE)}
fun()
[1] TRUE

The example mentioned with explicitly defined environment:

fun <- function(){exists("x", where = environment(), inherits = FALSE)}

What's happening with the default where=-1 argument? The documentation says that if you provide it with an integer, it selects the environment based on "the position in the search list". We can see that .GlobalEnv is at position 1, followed by attached packages:

rm(list=ls()) # clear .GlobalEnv
x <- 3  # Add x to .GlobalEnv
ls()    # Show that x is the only thing in .GlobalEnv
[1] "x"

search() # Show the search list
 [1] ".GlobalEnv"        "package:lubridate" "package:forcats"   "package:stringr"  
 [5] "package:dplyr"     "package:purrr"     "package:readr"     "package:tidyr"    
 [9] "package:tibble"    "package:ggplot2"   "package:tidyverse" "tools:rstudio"    
[13] "package:stats"     "package:graphics"  "package:grDevices" "package:utils"    
[17] "package:datasets"  "package:methods"   "Autoloads"         "package:base"

Now we run this function which checks for different objects in different environments by integer value:

fun <- function(){
    y <- 3
    k <- function(){
        z <- 3
        print(exists('z', -1, inherits=FALSE))
        print(exists('x', 1, inherits=FALSE))
        print(exists('y', parent.frame(), inherits=FALSE))}
    k()
    print(exists('x', -1, inherits=FALSE))
    print(exists('y', -1, inherits=FALSE))
    print(exists('x', 1, inherits=FALSE))
    print(exists('ymd', 2, inherits=FALSE))
    print(exists('last2', 3, inherits=FALSE))
    print(exists('str_detect', 4, inherits=FALSE))
}

> fun()
[1] TRUE   # Inside k(), -1 is the function env for k() - z is there
[1] TRUE   # 1 is .GlobalEnv, x is there
[1] TRUE   # to choose parent env with y, we need to specify it with parent.frame()
[1] FALSE  # -1 is the function env, x not in function env
[1] TRUE   # -1 is the function env, y is in function env
[1] TRUE   # 1 is .GlobalEnv, x is in .GlobalEnv
[1] TRUE   # position 2 is the lubridate package
[1] TRUE   # position 3 is the forcats package
[1] TRUE   # position 4 is the stringr package

From this we can see that -1 is always the local environment of the current function, while 1 is .GlobalEnv and higher numbers are attached packages as listed by search(). If you want to specify with more detail, for instance to look in fun() from within k(), then you need to specify the environment explicitly, either with a relative function like parent.frame() as above or by getting the environment as an object and referring directly as below:

fun <- function(){
    y <- 3
    env <- environment()
    k <- function(e){
        z <- 3
        print(exists('y', inherits=FALSE))
        print(exists('y', where=e, inherits=FALSE))
        }
    k(env)
}
fun()
[1] FALSE
[1] TRUE
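
The pattern above can be wrapped in a small helper. The sketch below is an assumption-laden illustration: has_local and demo are hypothetical names, and the helper uses parent.frame() as a default argument so that it checks the caller's local environment.

```r
# Hypothetical helper: is `name` defined directly in the caller's environment?
has_local <- function(name, env = parent.frame()) {
  exists(name, envir = env, inherits = FALSE)
}

x <- 1   # defined outside the function

demo <- function() {
  y <- 3
  c(x = has_local("x"), y = has_local("y"))
}

res <- demo()
res   # x is FALSE (not local to demo), y is TRUE
```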

How are values assigned to the result of a function call?

We can show what's going on more clearly by writing a little function to create objects of S3 class "box", to represent a box with a specified height, width and length:

make_box <- function(height, width, length) {
    structure(list(height = height, width = width, length = length), 
              class = "box")
}

We also want a function to be able to retrieve the width of box objects:

width <- function(b)
{
  if(!inherits(b, "box")) stop("Can only get width of objects of class box")
  return(b$width)
}

For clarity, we will also define a print method for our box class:

print.box <- function(x, ...) {
  # Print methods take (x, ...) to match the print generic
  cat("A box of dimension", paste(x$height, x$width, x$length, sep = " x "),
      "and volume", x$height * x$width * x$length, "\n")
  invisible(x)
}

So now we can create a box:

my_box <- make_box(height = 1, width = 2, length = 3)

my_box
#> A box of dimension 1 x 2 x 3 and volume 6

And extract its width:

width(my_box)
#> [1] 2

However, if we try to assign a new width using the assignment syntax, we throw an error:

width(my_box) <- 3
#> Error in width(my_box) <- 3: could not find function "width<-"

Notice that there is no such function. We need to tell R what we mean with this syntax:

`width<-` <- function(b, value)
{
  b$width <- value
  return(b)
}

So now when we do:

width(my_box) <- 3

We get no error, and we can see that my_box has had its width member updated:

my_box
#> A box of dimension 1 x 3 x 3 and volume 9

This works because R knows that when we use the "assign to a function call" syntax, it is supposed to look for the appropriate replacement function. It therefore effectively interprets width(my_box) <- 3 as my_box <- `width<-`(my_box, value = 3).

Created on 2021-09-23 by the reprex package (v2.0.0)
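
To check the equivalence directly, the sketch below compares the assignment syntax with an explicit call to the replacement function, both for the box example and for the built-in names<-. It redefines minimal versions of make_box and width<- so that it runs on its own; the names b1, b2, v and w are made up for the comparison.

```r
# Minimal versions of the definitions from the answer above.
make_box <- function(height, width, length) {
  structure(list(height = height, width = width, length = length),
            class = "box")
}
`width<-` <- function(b, value) {
  b$width <- value
  b
}

# Assignment syntax versus an explicit call to the replacement function.
b1 <- make_box(1, 2, 3)
width(b1) <- 3
b2 <- `width<-`(make_box(1, 2, 3), value = 3)
identical(b1, b2)   # TRUE

# Built-in replacement functions follow the same pattern.
v <- c(1, 2)
names(v) <- c("a", "b")
w <- `names<-`(c(1, 2), value = c("a", "b"))
identical(v, w)     # TRUE
```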

How to check if an R object has a certain attribute?

Two ways:

%in% names(attributes(..)):

"labels" %in% names(attributes(my_vector))
# [1] FALSE
"labels" %in% names(attributes(my_vector_labelled))
# [1] TRUE

is.null(attr(..,"")):

is.null(attr(my_vector, "labels"))
# [1] TRUE                                   # NOT present
is.null(attr(my_vector_labelled, "labels"))
# [1] FALSE                                  # present

(Perhaps !is.null(attr(..)) is preferred?)
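
One difference between the two approaches is worth a sketch: attr() falls back to partial matching of the attribute name unless exact = TRUE is given, whereas the %in% test only matches full names. The vector lab_vec below is made-up example data.

```r
# Example data with a "labels" attribute.
lab_vec <- structure(1:3, labels = c("a", "b", "c"))

"labels" %in% names(attributes(lab_vec))           # TRUE
!is.null(attr(lab_vec, "labels", exact = TRUE))    # TRUE

# A truncated name still matches unless exact = TRUE is used.
!is.null(attr(lab_vec, "lab"))                     # TRUE (partial match)
!is.null(attr(lab_vec, "lab", exact = TRUE))       # FALSE
"lab" %in% names(attributes(lab_vec))              # FALSE
```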
