How to Index an Element of a List Object in R

How to index an element of a list object in R

Indexing a list is done using double bracket, i.e. hypo_list[[1]] (e.g. have a look here: http://www.r-tutor.com/r-introduction/list). BTW: read.table does not return a table but a dataframe (see value section in ?read.table). So you will have a list of dataframes, rather than a list of table objects. The principal mechanism is identical for tables and dataframes though.

Note: In R, the index for the first entry is a 1 (not 0 like in some other languages).

Dataframes

l <- list(anscombe, iris)   # put dfs in list
l[[1]] # returns anscombe dataframe

anscombe[1:2, 2] # access first two rows and second column of dataset
[1] 10 8

l[[1]][1:2, 2] # the same but selecting the dataframe from the list first
[1] 10 8

Table objects

tbl1 <- table(sample(1:5, 50, rep=T))
tbl2 <- table(sample(1:5, 50, rep=T))
l <- list(tbl1, tbl2) # put tables in a list

tbl1[1:2] # access first two elements of table 1

Now with the list

l[[1]]                 # access first table from the list

1 2 3 4 5
9 11 12 9 9

l[[1]][1:2] # access first two elements in first table

1 2
9 11

R - Multi-level list indexing

Although the title of the question referst to multi-level list indexing, and the syntax mylist[['a']][['b']][['c']] is the same that one would use to retrieve an element of a multi-level list, the differences that you're observing actually arise from using the same syntax for creation (or not) of multi-level lists.

To show this, we can first explicitly create the multi-level (nested) lists, and then check that the indexing works as expected both for matrices and for single numbers.

mymatrix=matrix(1:4,nrow=2)
list_b=list(c=mymatrix)
list_a=list(b=list_b)
mynestedlist1=list(a=list_a)
str( mynestedlist1 )
# List of 1
# $ a:List of 1
# ..$ b:List of 1
# .. ..$ c: int [1:2, 1:2] 1 2 3 4

mynumber=2
list_e=list(f=mynumber)
list_d=list(e=list_e)
mynestedlist2=list(d=list_d)
str( mynestedlist2 )
# List of 1
# $ d:List of 1
# ..$ e:List of 1
# .. ..$ f: num 2

( Note that I've created the lists in sequential steps for clarity; the could have been all rolled-together in a single line, like: mynestedlist2=list(d=list(e=list(f=mynumber))) )

Anyway, now we'll check that indexing works Ok:

str(mynestedlist1[['a']][['b']][['c']])
# int [1:2, 1:2] 1 2 3 4
str(mynestedlist1$a$b$c)
# int [1:2, 1:2] 1 2 3 4

str(mynestedlist2[['d']][['e']][['f']])
# num 2
str(mynestedlist2$d$e$f)
# num 2

# and, just to check that we don't 'skip the last level':
str(mynestedlist2[['d']][['e']])
# List of 1
# $ f: num 2

So the direct answer to the question 'which of these methods should be used for indexing a multi-level list in R' is: 'any of them - they're all ok'.

So what's going on with the examples in the question, then?

Here, the same syntax is being used to try to implicitly create lists, and since the structure of the nested list is not specified explicitly, this relies on whether R can infer the structure that you want.

In the first and third examples, there's no ambiguity, but each for a different reason:

First example:

mynestedlist1=list()
mynestedlist1[['a']][['b']][['c']]=mymatrix

We've specified that mynestedlist1 is a list. But its elements could be any kind of object, until we assign them. In this case, we put into the element named 'a' an object with an element 'b' that contains an object with an element 'c' that is a matrix. Since there's no R object that can contain a matrix in a single element except a list, the only way to achieve this assignment is by creating a nested list.

Third example:

mynestedlist3=list()
mynestedlist3$g$h$i=mynumber

In this case, we've used the $ notation, which only applies to lists (or to data types that are similar/equivalent to lists, like dataframes). So, again, the only way to follow the instructions of this assignment is by creating a nested list.

Finally, the pesky second example, but starting with a simpler variant of it:

mylist2=list()
mylist2[['c']][['d']]=mynumber

Here there's an ambiguity. We've specified that mylist2 is a list, and we've put into the element named 'c' an object with an element 'd' that contains a single number. This element could have been a list, but it can also be a simple vector, and in this case R chooses this as the simpler option:

str(mylist2)
# List of 1
# $ c: Named num 2
# ..- attr(*, "names")= chr "d"

Contrast this to the behaviour when trying to assign a matrix using exactly the same syntax: in this case, the only way follow the syntax would be by creating another, nested, list inside the first one.

What about the full second example mylist2[['c']][['d']][['e']]=mynumber, where we try to assign a number named 'e' to the just-created but still-empty object 'd'?

This seems rather unclear, and this may be the reason for the different behaviours of different versions of R (as reported in the comments to the question). In the question, the action taken by R has been to assign the number while dropping its name, similarly to:

myvec=vector(); myvec2=vector()
myvec[['a']]=1
myvec2[['b']]=2
myvec[['a']]=myvec2
str(myvec)
# Named num 2
# - attr(*, "names")= chr "a"

However, the syntax alone doesn't seem to force this behaviour, so it would be sensible to avoid relying on this behaviour when trying to create nested lists, or lists of vectors.

Access lapply index names inside FUN

Unfortunately, lapply only gives you the elements of the vector you pass it.
The usual work-around is to pass it the names or indices of the vector instead of the vector itself.

But note that you can always pass in extra arguments to the function, so the following works:

x <- list(a=11,b=12,c=13) # Changed to list to address concerns in commments
lapply(seq_along(x), function(y, n, i) { paste(n[[i]], y[[i]]) }, y=x, n=names(x))

Here I use lapply over the indices of x, but also pass in x and the names of x. As you can see, the order of the function arguments can be anything - lapply will pass in the "element" (here the index) to the first argument not specified among the extra ones. In this case, I specify y and n, so there's only i left...

Which produces the following:

[[1]]
[1] "a 11"

[[2]]
[1] "b 12"

[[3]]
[1] "c 13"

UPDATE Simpler example, same result:

lapply(seq_along(x), function(i) paste(names(x)[[i]], x[[i]]))

Here the function uses "global" variable x and extracts the names in each call.

Return list index from a list of tidygraph objects in R?

You may try -

getTreeListNumber <- function(listObj, level, rank){
which(sapply(myList, function(x) {
nodes <- tidygraph::activate(x, nodes) %>% tibble::as_tibble()
all(nodes$level == level & nodes$rank == rank)
}))
}

getTreeListNumber(myList, 1, 2)
#[1] 2

Is there an R function for finding the index of an element in a vector?

The function match works on vectors:

x <- sample(1:10)
x
# [1] 4 5 9 3 8 1 6 10 7 2
match(c(4,8),x)
# [1] 1 5

match only returns the first encounter of a match, as you requested. It returns the position in the second argument of the values in the first argument.

For multiple matching, %in% is the way to go:

x <- sample(1:4,10,replace=TRUE)
x
# [1] 3 4 3 3 2 3 1 1 2 2
which(x %in% c(2,4))
# [1] 2 5 9 10

%in% returns a logical vector as long as the first argument, with a TRUE if that value can be found in the second argument and a FALSE otherwise.



Related Topics



Leave a reply



Submit