Difference between [ ] and [[ ]] in R

What is the difference between [ ] and [[ ]] in R?

[ ] = always returns an object of the same class as the original (out of the basic object classes), and can select more than one element of an object

[[ ]] = extracts a single element from a list or data frame; the returned object (out of the basic object classes) is not necessarily a list or data frame
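A minimal sketch of the two behaviours on a plain list (the list x and its element names are made up for illustration):

```r
# A small list with mixed element types (illustrative names only).
x <- list(a = 1:3, b = "text", c = TRUE)

sub_single <- x["a"]          # [ keeps the container: a list of length 1
elem       <- x[["a"]]        # [[ extracts the element itself: an integer vector
sub_multi  <- x[c("a", "b")]  # [ can select several elements at once

class(sub_single)  # "list"
class(elem)        # "integer"
length(sub_multi)  # 2
```

Note that x[[c("a", "b")]] does not select two elements; [[ always yields a single element.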

Is there a technical difference between = and <-

Yes there is. This is what the help page of '=' says:

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

With "can be used" the help file means assigning an object here. In a function call you can't assign an object with = because = means assigning arguments there.

Basically, if you use <- then you assign a variable that you will be able to use in your current environment. For example, consider:

matrix(1,nrow=2)

This just makes a 2 row matrix. Now consider:

matrix(1,nrow<-2)

This also gives you a two row matrix, but now we also have an object called nrow which evaluates to 2! What happened is that in the second call we didn't set the argument nrow to 2; we assigned 2 to an object called nrow and passed that value as the second argument of matrix, which happens to be nrow.
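To see the side effect directly, here is a small sketch. demo_assign() is a hypothetical helper that only wraps the two calls in a function, so that ls() lists just the variables created inside it:

```r
# demo_assign() is a hypothetical wrapper, used only to get a clean scope.
demo_assign <- function() {
  m1 <- matrix(1, nrow = 2)        # '=' matches the argument named nrow
  before <- "nrow" %in% ls()       # no local variable nrow was created
  m2 <- matrix(1, nrow <- 2)       # '<-' assigns, then 2 is passed positionally
  after <- "nrow" %in% ls()        # now a local variable nrow exists
  list(same_dim = identical(dim(m1), dim(m2)), before = before, after = after)
}

res <- demo_assign()
res$same_dim  # TRUE: both calls build a 2 x 1 matrix
res$before    # FALSE
res$after     # TRUE
```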

Edit:

As for the edited question: both are the same. The use of = or <- can cause a lot of discussion as to which one is best. Many style guides advocate <- and I agree with that, but do keep spaces around <- assignments or they can become quite hard to interpret. If you don't use spaces (you should, except on twitter), I prefer =, and never use ->!

But really it doesn't matter what you use as long as you are consistent in your choice. Using = on one line and <- on the next results in very ugly code.

Difference between [] and $ operators for subsetting

Below we will use the one-row data frame in order to provide briefer output:

mtcars1 <- mtcars[1, ]

Note the differences among these. We can use class as in class(mtcars["hp"]) to investigate the class of the return value.

The first two correspond to the code in the question and return a data frame and plain vector respectively. The key differences between [ and $ are that [ (1) can specify multiple columns, (2) allows passing of a variable as the index and (3) returns a data frame (although see examples later on) whereas $ (1) can only specify a single column, (2) the index must be hard coded and (3) it returns a vector.

mtcars1["hp"]  # returns data frame
##            hp
## Mazda RX4 110

mtcars1$hp # returns plain vector
## [1] 110

Other examples where index is a single element. Note that the first and second examples below are actually the same as drop = TRUE is the default.

mtcars1[, "hp"] # returns plain vector
## [1] 110  

mtcars1[, "hp", drop = TRUE] # returns plain vector
## [1] 110

mtcars1[, "hp", drop = FALSE] # returns data frame
##            hp
## Mazda RX4 110

Also there is the [[ operator which is like the $ operator except it can accept a variable as the index whereas $ requires the index to be hard coded:

mtcars1[["hp"]] # returns plain vector
## [1] 110

Others where index specifies multiple elements. $ and [[ cannot be used with multiple elements so these examples only use [:

mtcars1[c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = FALSE] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = TRUE] # returns list
## $mpg
## [1] 21
## 
## $hp
## [1] 110

[

mtcars[foo] can return more than one column if foo is a vector with more than one element, e.g. mtcars[c("hp", "mpg")], and in all cases the return value is a data.frame even if foo has only one element (as it does in the question).

There is also mtcars[, foo, drop = FALSE] which returns the same value as mtcars[foo] so it always returns a data frame. With drop = TRUE it returns the column itself if foo specifies a single column. If foo specifies multiple columns, it returns a list rather than a data.frame only when the result has a single row (as with mtcars1 above); with more rows it is still a data frame.
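A short sketch of how drop = TRUE behaves for data frames, using the built-in mtcars data. The variable names are only for illustration, and the behaviour shown is what [.data.frame does in current versions of R as far as I know:

```r
cols <- c("mpg", "hp")

full    <- mtcars[, cols, drop = TRUE]   # 32 rows, 2 columns: still a data frame
one_row <- mtcars[1, cols, drop = TRUE]  # 1 row, 2 columns: dropped to a plain list
single  <- mtcars[, "hp", drop = TRUE]   # 1 column: the column vector itself

is.data.frame(full)     # TRUE
is.data.frame(one_row)  # FALSE, it is a list
is.numeric(single)      # TRUE
```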

[[

On the other hand mtcars[[foo]] only works if foo has one element and it returns that column, not a data frame.

$

mtcars$hp also only works for a single column, like [[, and returns the column, not a data frame containing that column.

mtcars$hp is like mtcars[["hp"]]; however, there is no possibility to pass a variable index with $. One can only hard-code the index with $.
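A quick sketch of that last point, with the column name held in a variable (col is just an illustrative name):

```r
col <- "hp"                  # column name stored in a variable

by_double <- mtcars[[col]]   # [[ evaluates col: the hp column as a vector
by_single <- mtcars[col]     # [ evaluates col too, but returns a one-column data frame
by_dollar <- mtcars$col      # $ looks for a column literally called "col": NULL

identical(by_double, mtcars$hp)  # TRUE
is.data.frame(by_single)         # TRUE
is.null(by_dollar)               # TRUE
```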

subset

Note that this works:

subset(mtcars, hp > 150)

returning a data frame containing those rows where the hp column exceeds 150:

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8

other objects

The above pertain to data frames but other objects that can use $, [ and [[ will have their own rules. In particular if m is a matrix, e.g. m <- as.matrix(BOD), then m[, 1] is a vector, not a one column matrix, but m[, 1, drop = FALSE] is a one column matrix. m[[1]] and m[1] are both the first element of m, not the first column. m$a does not work at all.
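The matrix rules can be checked with a few lines. This is only a sketch using the built-in BOD data set; the tryCatch() wrapper is there just so that the failing $ call does not stop the script:

```r
m <- as.matrix(BOD)               # numeric matrix with columns Time and demand

col_vec <- m[, 1]                 # dimensions dropped: a plain vector
col_mat <- m[, 1, drop = FALSE]   # still a matrix, with one column

first_a <- m[1]                   # first element of m, not the first column
first_b <- m[[1]]                 # likewise the first element

# $ is not defined for matrices (atomic vectors), so this raises an error.
dollar <- tryCatch(m$Time, error = function(e) "error")

is.null(dim(col_vec))  # TRUE
dim(col_mat)           # 6 1
```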

help

See ?Extract for more information. Also ?"$", ?"[" and ?"[[" all get to the same page, as well.

The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe

The R Language Definition is handy for answering these types of questions:

http://cran.r-project.org/doc/manuals/R-lang.html#Indexing

R has three basic indexing operators, with syntax displayed by the following examples
    x[i]
    x[i, j]
    x[[i]]
    x[[i, j]]
    x$a
    x$"a"
For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.

The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.
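The recursive indexing described in that last paragraph is easy to miss, so here is a small sketch (the nested list is invented for illustration):

```r
# A nested list, made up for this example.
x <- list(a = list(b = 10, c = "deep"), d = 1:3)

by_name <- x[[c("a", "c")]]  # same as x[["a"]][["c"]]
by_pos  <- x[[c(1, 1)]]      # same as x[[1]][[1]]
sub     <- x[c("a", "d")]    # [ with a vector: a list of the two selected elements

by_name      # "deep"
by_pos       # 10
length(sub)  # 2
```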

What's the difference between `=` and `<-` in R?

From the R help page for the assignment operators:

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

In R programming, what's the difference between & vs &&, and | vs ||

The longer forms && and || can only handle a single logical test on each side of the operator, whereas & and | work element-wise on whole vectors. For example:

a <- c(T, F, F, F)
b <- c(T, F, F, F)
a && b

Returns
[1] TRUE

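Because only the first element of each of a and b is tested! (That was the behaviour when this was written. R 4.2.0 gives a warning here, and from R 4.3.0 on, passing a vector longer than one to && or || is an error.)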
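For comparison, a sketch of the vectorized form with the same a and b. It uses only & together with all(), plus explicit [1] indexing for the scalar test, so it should run the same way on old and new versions of R:

```r
a <- c(TRUE, FALSE, FALSE, FALSE)
b <- c(TRUE, FALSE, FALSE, FALSE)

elementwise <- a & b        # one result per position
all_true    <- all(a & b)   # collapse to a single value for use in if()
first_only  <- a[1] && b[1] # what a && b used to test implicitly

elementwise  # TRUE FALSE FALSE FALSE
all_true     # FALSE
first_only   # TRUE
```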

Edit:

Consider the following, where we 'rotate' a and b after each && test:

a <- c(T, F, T, F)
b <- c(T, F, F, T)
for (i in seq_along(a)){
  cat(paste0("'a' is: ", paste0(a, collapse=", "), " and\n'b' is: ", paste0(b, collapse=", "),"\n"))
  print(paste0("'a && b' is: ", a && b))
  a <- c(a[2:length(a)], a[1])
  b <- c(b[2:length(b)], b[1])
}

Gives us:

'a' is: TRUE, FALSE, TRUE, FALSE and
'b' is: TRUE, FALSE, FALSE, TRUE
[1] "'a && b' is: TRUE"
'a' is: FALSE, TRUE, FALSE, TRUE and
'b' is: FALSE, FALSE, TRUE, TRUE
[1] "'a && b' is: FALSE"
'a' is: TRUE, FALSE, TRUE, FALSE and
'b' is: FALSE, TRUE, TRUE, FALSE
[1] "'a && b' is: FALSE"
'a' is: FALSE, TRUE, FALSE, TRUE and
'b' is: TRUE, TRUE, FALSE, FALSE
[1] "'a && b' is: FALSE"

Additionally, && and || stop evaluating as soon as the result is determined (short-circuit evaluation), whereas & and | always evaluate both sides:

FALSE & a_not_existing_object
TRUE | a_not_existing_object

Returns:

Error: object 'a_not_existing_object' not found
Error: object 'a_not_existing_object' not found

But:

FALSE && a_not_existing_object
TRUE || a_not_existing_object

Returns:

[1] FALSE

[1] TRUE

Because FALSE AND anything is always FALSE, and TRUE OR anything is always TRUE, so the right-hand side never needs to be evaluated.

This last behavior of && and || is especially useful if you want to check in your control-flow for an element that may not exist:

if (exists("a_not_existing_object") && a_not_existing_object > 42) {...}

This way the evaluation stops after the first expression evaluates to FALSE and the a_not_existing_object > 42 part is not even attempted!
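Here is a runnable sketch of that guard pattern. The environment env, the helper is_big() and the variable names are all hypothetical; an explicit environment is used only so the example does not depend on what happens to be in your workspace:

```r
# env and is_big() are made-up names for this illustration.
env <- new.env()
assign("present", 50, envir = env)
assign("small", 1, envir = env)

is_big <- function(name) {
  # get() is only reached when exists() returned TRUE (short-circuit).
  exists(name, envir = env, inherits = FALSE) && get(name, envir = env) > 42
}

is_big("present")  # TRUE
is_big("small")    # FALSE
is_big("absent")   # FALSE, and no "object not found" error
```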

R difference between [[]] and []

all_data[1]=list(5,6) gives you a warning (not an error) that the lengths aren't the same. You can't fit a two-element list into a one-element slice of a list. It's like trying x <- 1; x[1] <- 1:2.

But you can set one element of a list to contain another list, which is why all_data[[1]]=list(5,6) works.
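A small sketch of the three cases, assuming all_data is an ordinary list (the question's actual data is not shown, so list(1, 2, 3) is a stand-in):

```r
# Stand-in data; the real all_data from the question is not known.
all_data <- list(1, 2, 3)
all_data[[1]] <- list(5, 6)          # element 1 now holds a two-element list

all_data2 <- list(1, 2, 3)
all_data2[1] <- list(list(5, 6))     # [ wants one replacement per selected slot

# Two replacement items for one slot: R warns and uses only the first item.
warned <- tryCatch({
  tmp <- list(1, 2, 3)
  tmp[1] <- list(5, 6)
  FALSE
}, warning = function(w) TRUE)

identical(all_data, all_data2)  # TRUE
length(all_data)                # 3
warned                          # TRUE
```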

Percent difference between two numbers in R

This really seems more of a question about how to find a percentage difference in general, which is something you can easily google. Nonetheless, to calculate a % difference:

((original value - new value) / original value) * 100.

So,

((combined_data2015_2019$Economy_GDPperCapita_2015 - 
  combined_data2015_2019$Economy_GDPpercapita_2019) / 
  combined_data2015_2019$Economy_GDPperCapita_2015) * 100

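((1.29025 - 1.34) / 1.29025) * 100 = about -3.86, that is, the new value is roughly 3.86% higher than the original (the sign is negative because the formula subtracts the new value from the original).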
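If you need this more than once, the formula can be wrapped in a function. This is just a sketch: pct_diff() and the small gdp data frame are hypothetical, and only the pair 1.29025 and 1.34 comes from the example above:

```r
# pct_diff() is a made-up helper implementing (original - new) / original * 100.
pct_diff <- function(original, new) {
  (original - new) / original * 100
}

single <- pct_diff(1.29025, 1.34)   # about -3.86

# Arithmetic in R is vectorized, so whole columns work too (illustrative data).
gdp <- data.frame(gdp_2015 = c(1.29025, 1.0),
                  gdp_2019 = c(1.34, 0.9))
gdp$pct <- pct_diff(gdp$gdp_2015, gdp$gdp_2019)
```

Swap the two terms in the numerator if you would rather have increases come out positive.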

Hopefully this answers your question.
