Finding Elements of Lists in R

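As alexwhan says, grep is the function to use. However, be careful about using it with a list: it isn't doing what you might think it's doing. For example, with z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f")) (the list implied by the as.character output further down):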

grep("c", z)
[1] 1 2 3 # ?

grep(",", z)
[1] 1 2 3 # ???

What's happening behind the scenes is that grep coerces its second argument to character, using as.character. When applied to a list, as.character returns the character representation of each element, as obtained by deparsing it. (Modulo an unlist.)

as.character(z)
[1] "c(\"a\", \"b\", \"c\")" "c(\"b\", \"d\", \"e\")" "c(\"a\", \"e\", \"f\")"

cat(as.character(z))
c("a", "b", "c") c("b", "d", "e") c("a", "e", "f")

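This is what grep is working on. Both "c" and "," match all three elements because those characters occur in the deparsed text itself, not because of the data.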

If you want to run grep on a list, a safer method is to use lapply. This returns another list, which you can operate on to extract what you're interested in.

res <- lapply(z, function(ch) grep("a", ch))
res
[[1]]
[1] 1

[[2]]
integer(0)

[[3]]
[1] 1

# which vectors contain a search term
sapply(res, function(x) length(x) > 0)
[1] TRUE FALSE TRUE
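
If only the TRUE/FALSE answer is needed, the two steps above can probably be collapsed into one. This is a minimal sketch, assuming z is the list of character vectors implied by the as.character output earlier. The fixed = TRUE argument is an addition here, so that the search term is treated as a literal string rather than a regular expression.

```r
# z is assumed; it matches the deparsed output shown earlier
z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f"))

# grepl() gives one logical per string; any() reduces that to one per vector
has_a <- vapply(z, function(ch) any(grepl("a", ch, fixed = TRUE)), logical(1))
has_a

# indices of the list elements that contain the search term
which(has_a)
```

vapply is used instead of sapply so that the result is guaranteed to be a logical vector, even if z is empty.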

elements from list of list in R

You can try:

sapply(l, `[`, "v")

$v
numeric(0)

$v
numeric(0)

$v
numeric(0)

$v
[1] 1.227

$v
[1] 1.227

$v
[1] 15.2

Or if you mean a vector containing values from each list:

vec <- sapply(l, `[`, "v")
vec[lengths(vec) == 0] <- NA
unlist(vec)

     v      v      v      v      v      v
    NA     NA     NA  1.227  1.227 15.200
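
A type-stable variant of the same idea can be sketched with vapply. The list l below is hypothetical (the original data is not shown), and get_v is a helper name made up for this illustration; it assumes each element is a list that may or may not carry a non-empty numeric v.

```r
# Hypothetical data: some elements have an empty v, one has no v at all
l <- list(list(v = numeric(0)), list(v = 1.227), list(v = 15.2), list(w = 3))

# Return the first value of v, or NA when v is missing or empty
get_v <- function(x) {
  val <- x[["v"]]
  if (length(val) == 0) NA_real_ else val[[1]]
}

vec <- vapply(l, get_v, numeric(1))
vec
```

This avoids the intermediate list and the separate NA replacement step, at the cost of only keeping the first value when v has more than one.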

In R, find elements of a vector in a list using vectorization

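We can do this with match on the unlisted values. In the benchmark below it is the fastest of the approaches tried by a clear margin (median of about 10 microseconds versus about 20 or more for the others).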

v1 <- c(1, 200, 4000)
L1 <- list(1:4, 1:4*100, 1:4*1000)

sequence(lengths(L1))[match(v1, unlist(L1))]
# [1] 1 2 4
sequence(lengths(L1))[which(unlist(L1) %in% v1)]
# [1] 1 2 4

library(microbenchmark)
library(tidyverse)

microbenchmark(
akrun_sapply = {sapply(L1, function(x) which(x %in% v1))},
akrun_Vectorize = {Vectorize(function(x) which(x %in% v1))(L1)},
akrun_mapply = {mapply(function(x, y) which(x %in% y), L1, v1)},
akrun_mapply_match = {mapply(match, v1, L1)},
akrun_map2 = {purrr::map2_int(L1, v1, ~ .x %in% .y %>% which)},
CPak = {setNames(rep(1:length(L1), times=lengths(L1)), unlist(L1))[as.character(v1)]},
zacdav = {sequence(lengths(L1))[match(v1, unlist(L1))]},
zacdav_which = {sequence(lengths(L1))[which(unlist(L1) %in% v1)]},
times = 10000
)

Unit: microseconds
              expr     min       lq      mean   median       uq        max neval
      akrun_sapply  18.187  22.7555  27.17026  24.6140  27.8845   2428.194 10000
   akrun_Vectorize  60.119  76.1510  88.82623  83.4445  89.9680   2717.420 10000
      akrun_mapply  19.006  24.2100  29.78381  26.2120  29.9255   2911.252 10000
akrun_mapply_match  14.136  18.4380  35.45528  20.0275  23.6560 127960.324 10000
        akrun_map2 217.209 264.7350 303.64609 277.5545 298.0455   9204.243 10000
              CPak  15.741  19.7525  27.31918  24.7150  29.0340    235.245 10000
            zacdav   6.649   9.3210  11.30229  10.4240  11.5540   2399.686 10000
      zacdav_which   7.364  10.2395  12.22632  11.2985  12.4515   2492.789 10000
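
To see why the sequence/match approach works, it may help to look at the intermediate vectors. The sketch below reuses v1 and L1 from above; the names flat, pos_in_element, element_id and hit are introduced only for this illustration.

```r
v1 <- c(1, 200, 4000)
L1 <- list(1:4, 1:4 * 100, 1:4 * 1000)

flat <- unlist(L1)                              # all 12 values in one vector
pos_in_element <- sequence(lengths(L1))         # 1 2 3 4 1 2 3 4 1 2 3 4
element_id <- rep(seq_along(L1), lengths(L1))   # 1 1 1 1 2 2 2 2 3 3 3 3

# position in flat of the first occurrence of each value of v1
hit <- match(v1, flat)

pos_in_element[hit]   # position of each match inside its list element
element_id[hit]       # which list element each match came from
```

Note that match returns NA for a value that does not occur anywhere in the list, so the corresponding results would be NA as well.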

Find element in a list

One-liner:

lapply(list_fruits[sapply(list_fruits, "[[", "third") %in% fruits], "[[", "first")
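
The one-liner is easier to follow when split into steps. The data below is hypothetical, since the original list_fruits and fruits are not shown; it assumes every element is a list with a single string in "third".

```r
# Hypothetical data: each element has a "first" and a "third" field
list_fruits <- list(
  list(first = "A", third = "apple"),
  list(first = "B", third = "banana"),
  list(first = "C", third = "cherry")
)
fruits <- c("apple", "cherry")

# 1. pull the "third" field out of every element
third_vals <- vapply(list_fruits, function(x) x[["third"]], character(1))

# 2. keep only the elements whose "third" value is in fruits
keep <- third_vals %in% fruits

# 3. extract the "first" field from the elements that were kept
firsts <- lapply(list_fruits[keep], `[[`, "first")
firsts
```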

Find the number of elements in a list

Here, using a small example of list v:

v = vector("list",4)
v[[1]] = 1:5
v[[2]] = 1:50
v[[3]] = NA

> v
[[1]]
[1] 1 2 3 4 5

[[2]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50

[[3]]
[1] NA

[[4]]
NULL

And you get the count by doing:

l = unlist(lapply(v,length))

which gives:

> l
[1] 5 50 1 0

If you don't want to count NA values:

l = unlist(lapply(v,function(x)length(x[!is.na(x)])))

and you get:

> l
[1] 5 50 0 0

EDIT: from @markus and @A5C1D2H2I1M1N2O1R2T1 comments

As mentioned by @markus, this is much simpler with lengths:

> lengths(v)
[1] 5 50 1 0

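And as mentioned by @A5C1D2H2I1M1N2O1R2T1, you can get rid of the NA count by doing the following. Note that is.na on a list is only TRUE for elements that are a single NA, so NAs inside longer vectors are still counted: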

> replace(lengths(v), is.na(v), 0)
[1] 5 50 0 0
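
To discount NA values inside longer vectors as well, one option is to count the non-NA values of each element directly. This is a sketch on a hypothetical list v2 (a variation of v with an NA inside the second element); the names non_na and single_na_only are made up for the comparison.

```r
# Hypothetical variation of v: the second element has an NA in the middle
v2 <- list(1:5, c(1, NA, 3), NA, NULL)

# count non-NA values per element
non_na <- vapply(v2, function(x) sum(!is.na(x)), integer(1))
non_na

# for comparison: this only zeroes elements that are a single NA
single_na_only <- replace(lengths(v2), is.na(v2), 0)
single_na_only
```

The two results differ for the second element, where replace still counts the embedded NA.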

how to see if any element of a list contains only a certain value in R

We may need to wrap the comparison with all. Loop over the list of matrices with sapply and create a logical expression (x == 0); wrapping it with all returns a single TRUE/FALSE per matrix. If all values excluding NAs (na.rm = TRUE) are 0, this returns TRUE, otherwise FALSE.

sapply(my_list, function(x) all(x == 0, na.rm = TRUE))
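
To answer whether any element of the list qualifies, the per-matrix result can be reduced once more with any or which. The my_list below is hypothetical, as the original data is not shown.

```r
# Hypothetical data: four 2x2 matrices
my_list <- list(
  matrix(0, 2, 2),                 # all zeros
  matrix(c(0, 1, NA, 0), 2, 2),    # contains a 1
  matrix(c(0, NA, 0, 0), 2, 2),    # zeros and an NA
  matrix(NA_real_, 2, 2)           # only NAs
)

all_zero <- vapply(my_list, function(x) all(x == 0, na.rm = TRUE), logical(1))
all_zero

any(all_zero)     # is there at least one such matrix?
which(all_zero)   # which ones?
```

One thing to be aware of: a matrix that contains only NAs also yields TRUE, because all() over an empty set of values is TRUE once the NAs are removed.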

Is there a way to view a list

I use str to see the structure of any object, especially complex lists.

RStudio also shows you the structure when you click the blue arrow next to the object in the data window.
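
A small sketch of str on a nested list follows. The object x is hypothetical; max.level is an argument of str that limits how deep the nesting is displayed, which can help with very large lists.

```r
# Hypothetical nested list
x <- list(
  id = 1L,
  tags = c("a", "b"),
  nested = list(m = matrix(0, 2, 2), f = mean)
)

str(x)                  # full structure, including the nested list
str(x, max.level = 1)   # top level only
```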

Proper way to access list elements in R

These methods give different outputs.

[ ] returns a list.

[[ ]] returns the object that is stored in the list.

For a named list, List$name and List[["name"]] return the same object as List[[ ]] with the corresponding index, while List["name"] returns a list. Consider the following example:

> List <- list(A = 1,B = 2,C = 3,D = 4)
> List[1]
$A
[1] 1

> class(List[1])
[1] "list"
> List[[1]]
[1] 1
> class(List[[1]])
[1] "numeric"
> List$A
[1] 1
> class(List$A)
[1] "numeric"
> List["A"]
$A
[1] 1

> class(List["A"])
[1] "list"
> List[["A"]]
[1] 1
> class(List[["A"]])
[1] "numeric"
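
A few further differences may be worth knowing; the sketch below reuses List from above, and abc is a hypothetical second list. It shows selecting several elements with [ ], asking for a name that does not exist, and the partial name matching that $ performs but [[ ]] does not do by default.

```r
List <- list(A = 1, B = 2, C = 3, D = 4)

# [ ] can select several elements at once and always returns a list
sub <- List[c("A", "C")]
length(sub)

# [[ ]] with a name that does not exist returns NULL
List[["Z"]]

# $ does partial matching on names; [[ ]] is exact unless told otherwise
abc <- list(alpha = 1)
abc$al                       # matches "alpha"
abc[["al"]]                  # NULL, exact matching by default
abc[["al", exact = FALSE]]   # matches "alpha"
```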

How to extract elements from list of lists

rapply offers yet another option:

unique(rapply(t1, function(x) head(x, 1)))
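
As a rough illustration of what this does: rapply applies the function to every non-list leaf of the nested list and, by default, unlists the results. The t1 below is hypothetical, since the original data is not shown.

```r
# Hypothetical nested list of character vectors
t1 <- list(
  list(a = c("x", "y"), b = c("z", "w")),
  list(a = c("x", "q"))
)

# first value of every leaf, flattened into one vector
firsts <- rapply(t1, function(x) head(x, 1))
firsts

# drop the duplicates
unique(firsts)
```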

