The Difference Between Bracket [ ] and Double Bracket [[ ]] For Accessing the Elements of a List or Dataframe

The R Language Definition is handy for answering these types of questions:

http://cran.r-project.org/doc/manuals/R-lang.html#Indexing

R has three basic indexing operators, with syntax displayed by the following examples
    x[i]
    x[i, j]
    x[[i]]
    x[[i, j]]
    x$a
    x$"a"
For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.

The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.
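The list behaviour described above can be sketched as follows. The list x and its element names are made up for illustration; only base R is used.

```r
# A hypothetical list with a nested component
x <- list(a = 1:3, b = "text", c = list(d = 10, e = 20))

sub  <- x["a"]     # single bracket: a list of length 1 that contains a
elem <- x[["a"]]   # double bracket: the integer vector itself

class(sub)    # "list"
class(elem)   # "integer"

# [ accepts a vector of indices and returns a list of the selected elements
two <- x[c("a", "b")]
length(two)   # 2

# A vector index with [[ is applied in turn to the list, then to the
# selected component; this is the same as x[["c"]][["e"]]
nested <- x[[c("c", "e")]]
nested        # 20
```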

Difference between single and double bracket in calling columns

A data.frame is a list whose columns all have the same length. Using [[ extracts a column as a vector, while [ returns a data.frame with one or more columns. Another way to get a vector with [ is to include the comma, which marks the index explicitly as a column index; when a single column is selected this way, the default drop = TRUE simplifies the result to a vector:

myDataset[, 1]

If we still want a single-column data.frame:

myDataset[, 1, drop = FALSE]
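To try these forms side by side, here is a minimal sketch. The contents of myDataset are an assumption made up for illustration, since the original question's data is not shown.

```r
# Hypothetical stand-in for the data frame in the question
myDataset <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

as_vector <- myDataset[["score"]]           # numeric vector
as_frame  <- myDataset["score"]             # one-column data.frame
dropped   <- myDataset[, 2]                 # vector, because drop defaults to TRUE
kept      <- myDataset[, 2, drop = FALSE]   # one-column data.frame

class(as_vector)   # "numeric"
class(as_frame)    # "data.frame"
class(dropped)     # "numeric"
class(kept)        # "data.frame"
```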

What is the difference between [ ] and [[ ]] in R?

[ ] = generally returns an object of the same class as the original (among the basic object classes) and can select more than one element of an object

[[ ]] = extracts a single element from a list or data frame; the returned object (among the basic object classes) is not necessarily a list or data frame

Single vs double square brackets in Python

A list inside a list is called a nested list. In the following example, my_movies_1 has length 1 and its inner list has length 9. The inner list is accessed using my_movies_1[0].

my_movies_1 = [['How I Met Your Mother', 'Friends', 'Silicon Valley', 'The Wire', 'Breaking Bad', 'Family Guy', 'Game of Thrones', 'South Park', 'Rick and Morty']]

On the other hand, the following list is not nested and has a length of 9:

my_movies_2 = ['How I Met Your Mother', 'Friends', 'Silicon Valley', 'The Wire', 'Breaking Bad', 'Family Guy', 'Game of Thrones', 'South Park', 'Rick and Morty']

How they are related:

Here my_movies_1[0] gives you a list equal to my_movies_2.

The difference between double bracket `[[...]]` and single bracket `[...]` indexing in Pandas

Consider this:

Source DF:

In [79]: df
Out[79]:
   Brains  Bodies
0      42      34
1      32      23

Selecting one column results in a pandas Series:

In [80]: df['Brains']
Out[80]:
0    42
1    32
Name: Brains, dtype: int64

In [81]: type(df['Brains'])
Out[81]: pandas.core.series.Series

Selecting a subset of the DataFrame results in a DataFrame:

In [82]: df[['Brains']]
Out[82]:
   Brains
0      42
1      32

In [83]: type(df[['Brains']])
Out[83]: pandas.core.frame.DataFrame

Conclusion: the second approach takes a list of column names, so it can select multiple columns, and it returns a DataFrame. The first selects a single column and returns a Series.

Demo:

In [84]: df = pd.DataFrame(np.random.rand(5,6), columns=list('abcdef'))

In [85]: df
Out[85]:
          a         b         c         d         e         f
0  0.065196  0.257422  0.273534  0.831993  0.487693  0.660252
1  0.641677  0.462979  0.207757  0.597599  0.117029  0.429324
2  0.345314  0.053551  0.634602  0.143417  0.946373  0.770590
3  0.860276  0.223166  0.001615  0.212880  0.907163  0.437295
4  0.670969  0.218909  0.382810  0.275696  0.012626  0.347549

In [86]: df[['e','a','c']]
Out[86]:
          e         a         c
0  0.487693  0.065196  0.273534
1  0.117029  0.641677  0.207757
2  0.946373  0.345314  0.634602
3  0.907163  0.860276  0.001615
4  0.012626  0.670969  0.382810

If we specify only one column in the list, we still get a DataFrame, with one column:

In [87]: df[['e']]
Out[87]:
          e
0  0.487693
1  0.117029
2  0.946373
3  0.907163
4  0.012626

R difference between [[]] and []

all_data[1] = list(5, 6) gives you a warning (not an error) that the lengths don't match: you can't replace a one-element sub-list with a two-element list, so R keeps only the first value. It's like trying x <- 1; x[1] <- 1:2.

But you can set one element of a list to contain another list, which is why all_data[[1]]=list(5,6) works.
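A minimal sketch of both assignments. The initial contents of all_data are an assumption, since the question's data is not shown; suppressWarnings() is used only to keep the output quiet.

```r
# Hypothetical starting list
all_data <- list(1, 2, 3)

# [ with a longer replacement: R warns and uses only the first value
suppressWarnings(all_data[1] <- list(5, 6))
first_after_bracket <- all_data[[1]]
first_after_bracket    # 5
length(all_data)       # still 3

# [[ stores the whole two-element list as a single element
all_data[[1]] <- list(5, 6)
length(all_data[[1]])  # 2
length(all_data)       # still 3
```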

Difference between [] and $ operators for subsetting

Below we use a one-row data frame to keep the output brief:

mtcars1 <- mtcars[1, ]

Note the differences among these. We can use class as in class(mtcars["hp"]) to investigate the class of the return value.

The first two correspond to the code in the question and return a data frame and plain vector respectively. The key differences between [ and $ are that [ (1) can specify multiple columns, (2) allows passing of a variable as the index and (3) returns a data frame (although see examples later on) whereas $ (1) can only specify a single column, (2) the index must be hard coded and (3) it returns a vector.

mtcars1["hp"]  # returns data frame
##            hp
## Mazda RX4 110

mtcars1$hp # returns plain vector
## [1] 110

Other examples where the index is a single element. Note that the first and second examples below are equivalent, because drop = TRUE is the default when a single column is selected.

mtcars1[, "hp"] # returns plain vector
## [1] 110  

mtcars1[, "hp", drop = TRUE] # returns plain vector
## [1] 110

mtcars1[, "hp", drop = FALSE] # returns data frame
##            hp
## Mazda RX4 110

Also there is the [[ operator which is like the $ operator except it can accept a variable as the index whereas $ requires the index to be hard coded:

mtcars1[["hp"]] # returns plain vector
## [1] 110

Others where index specifies multiple elements. $ and [[ cannot be used with multiple elements so these examples only use [:

mtcars1[c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = FALSE] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = TRUE] # returns list
## $mpg
## [1] 21
## 
## $hp
## [1] 110

[

mtcars[foo] can return more than one column if foo is a vector with more than one element, e.g. mtcars[c("hp", "mpg")], and in all cases the return value is a data.frame even if foo has only one element (as it does in the question).

There is also mtcars[, foo, drop = FALSE], which returns the same value as mtcars[foo], so it always returns a data frame. With drop = TRUE it returns the column itself if foo specifies a single column. If foo specifies multiple columns, drop = TRUE returns a list rather than a data.frame only when the data frame has a single row (as with mtcars1 above); otherwise the result is still a data frame.

[[

On the other hand mtcars[[foo]] only works if foo has one element and it returns that column, not a data frame.

$

mtcars$hp also only works for a single column, like [[, and returns the column, not a data frame containing that column.

mtcars$hp is like mtcars[["hp"]]; however, there is no possibility to pass a variable index with $. One can only hard-code the index with $.
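The point about variable indices can be checked with a short sketch. The variable name col_name is made up for illustration; mtcars is the built-in dataset.

```r
col_name <- "hp"   # hypothetical variable holding a column name

via_double <- mtcars[[col_name]]   # the hp column as a plain vector
via_single <- mtcars[col_name]     # one-column data frame
via_dollar <- mtcars$col_name      # looks for a column called "col_name"

identical(via_double, mtcars$hp)   # TRUE
class(via_single)                  # "data.frame"
is.null(via_dollar)                # TRUE: mtcars has no such column
```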

subset

Note that this works:

subset(mtcars, hp > 150)

returning a data frame containing those rows where the hp column exceeds 150:

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8

other objects

The above pertain to data frames but other objects that can use $, [ and [[ will have their own rules. In particular if m is a matrix, e.g. m <- as.matrix(BOD), then m[, 1] is a vector, not a one column matrix, but m[, 1, drop = FALSE] is a one column matrix. m[[1]] and m[1] are both the first element of m, not the first column. m$a does not work at all.
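The matrix behaviour can be sketched using the built-in BOD data frame mentioned above. try() is used here only so that the failing $ call does not stop the script.

```r
m <- as.matrix(BOD)   # BOD is a built-in data frame with two columns

col_vec <- m[, 1]                 # plain vector, dimensions dropped
col_mat <- m[, 1, drop = FALSE]   # one-column matrix

is.null(dim(col_vec))   # TRUE
dim(col_mat)            # number of rows, then 1

# With a single index, both operators return the first element, not a column
first_dbl <- m[[1]]
first_sgl <- m[1]

# $ is not defined for matrices, so this call signals an error
dollar_fails <- inherits(try(m$Time, silent = TRUE), "try-error")
dollar_fails            # TRUE
```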

help

See ?Extract for more information. ?"$", ?"[" and ?"[[" all lead to the same help page.
