Finding Elements of Lists in R
As alexwhan says, grep is the function to use. However, be careful about using it with a list. It isn't doing what you might think it's doing. For example:
grep("c", z)
[1] 1 2 3 # ?
grep(",", z)
[1] 1 2 3 # ???
What's happening behind the scenes is that grep coerces its 2nd argument to character, using as.character. When applied to a list, what as.character returns is the character representation of that list as obtained by deparsing it. (Modulo an unlist.)
as.character(z)
[1] "c(\"a\", \"b\", \"c\")" "c(\"b\", \"d\", \"e\")" "c(\"a\", \"e\", \"f\")"
cat(as.character(z))
c("a", "b", "c") c("b", "d", "e") c("a", "e", "f")
This is what grep is working on.
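The list z is never defined in the answer. Judging from the as.character output above, it was presumably built as shown below; that definition is an assumption. Under it, a minimal sketch of why both patterns match every element:

```r
# Assumed definition of z, reconstructed from the as.character() output
z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f"))

# grep() sees one deparsed string per list element, e.g. c("a", "b", "c")
deparsed <- as.character(z)

# Each string begins with the letter "c" (from the c( call) and contains
# commas, so both patterns hit all three elements
hits_c <- grep("c", deparsed)
hits_comma <- grep(",", deparsed)
```

The matches come from the deparsed text, not from the values stored in the list.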
If you want to run grep on a list, a safer method is to use lapply. This returns another list, which you can operate on to extract what you're interested in.
res <- lapply(z, function(ch) grep("a", ch))
res
[[1]]
[1] 1
[[2]]
integer(0)
[[3]]
[1] 1
# which vectors contain a search term
sapply(res, function(x) length(x) > 0)
[1] TRUE FALSE TRUE
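If only the TRUE/FALSE answer is needed, the two steps above can be folded into one. This is a sketch of an alternative, not the answer's own method: it swaps in grepl and vapply, and it assumes z is the list of character vectors reconstructed from the output earlier.

```r
# Assumed definition of z (not given in the original answer)
z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f"))

# grepl() returns one logical per string; any() collapses that to a single
# value per list element, and vapply() guarantees a logical vector result
has_a <- vapply(z, function(ch) any(grepl("a", ch, fixed = TRUE)), logical(1))

# Indices of the list elements that contain the search term
which_a <- which(has_a)
```

Here fixed = TRUE treats the pattern as a literal string, which avoids surprises when the search term contains regex metacharacters.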
Elements from a list of lists in R
You can try:
sapply(l, `[`, "v")
$v
numeric(0)
$v
numeric(0)
$v
numeric(0)
$v
[1] 1.227
$v
[1] 1.227
$v
[1] 15.2
Or if you mean a vector containing values from each list:
vec <- sapply(l, `[`, "v")
vec[lengths(vec) == 0] <- NA
unlist(vec)
v v v v v v
NA NA NA 1.227 1.227 15.200
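The input l is not shown in the answer. The sketch below assumes a hypothetical l whose elements are lists with a numeric field v that may be empty, which is consistent with the output above. It uses vapply as a type-checked variation on the same idea, filling empty entries with NA in one pass:

```r
# Hypothetical input: each element has a numeric field "v", sometimes empty
l <- list(
  list(v = numeric(0)), list(v = numeric(0)), list(v = numeric(0)),
  list(v = 1.227), list(v = 1.227), list(v = 15.2)
)

# Return the single value of v, or NA_real_ when v is empty;
# vapply() errors if any result is not exactly one numeric value
vals <- vapply(
  l,
  function(el) if (length(el$v) == 1) el$v else NA_real_,
  numeric(1)
)
```

Unlike the sapply version, the result here is an unnamed numeric vector.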
In R, find elements of a vector in a list using vectorization
We can do this; it seems to be the fastest approach by far.
v1 <- c(1, 200, 4000)
L1 <- list(1:4, 1:4*100, 1:4*1000)
sequence(lengths(L1))[match(v1, unlist(L1))]
# [1] 1 2 4
sequence(lengths(L1))[which(unlist(L1) %in% v1)]
# [1] 1 2 4
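To make the one-liner easier to follow, here is the same computation broken into its intermediate vectors. The variable names flat, pos_within and element are illustrative, and the last step (recovering which list element each value came from) is an addition that is not part of the original answer.

```r
v1 <- c(1, 200, 4000)
L1 <- list(1:4, 1:4 * 100, 1:4 * 1000)

# Flatten the list: 1 2 3 4 100 200 300 400 1000 2000 3000 4000
flat <- unlist(L1)

# Position of every flattened value inside its own list element:
# 1 2 3 4 1 2 3 4 1 2 3 4
pos_within <- sequence(lengths(L1))

# Index of the list element every flattened value came from:
# 1 1 1 1 2 2 2 2 3 3 3 3
element <- rep(seq_along(L1), lengths(L1))

# Where each value of v1 sits in the flattened vector
idx <- match(v1, flat)

found_pos <- pos_within[idx]      # position within the list element
found_element <- element[idx]     # which list element holds the value
```

Note that match returns only the first occurrence of each value, so duplicates across list elements would need the %in% variant.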
library(microbenchmark)
library(tidyverse)
microbenchmark(
akrun_sapply = {sapply(L1, function(x) which(x %in% v1))},
akrun_Vectorize = {Vectorize(function(x) which(x %in% v1))(L1)},
akrun_mapply = {mapply(function(x, y) which(x %in% y), L1, v1)},
akrun_mapply_match = {mapply(match, v1, L1)},
akrun_map2 = {purrr::map2_int(L1, v1, ~ .x %in% .y %>% which)},
CPak = {setNames(rep(1:length(L1), times=lengths(L1)), unlist(L1))[as.character(v1)]},
zacdav = {sequence(lengths(L1))[match(v1, unlist(L1))]},
zacdav_which = {sequence(lengths(L1))[which(unlist(L1) %in% v1)]},
times = 10000
)
Unit: microseconds
expr                    min        lq       mean    median        uq         max  neval
akrun_sapply         18.187   22.7555   27.17026   24.6140   27.8845    2428.194  10000
akrun_Vectorize      60.119   76.1510   88.82623   83.4445   89.9680    2717.420  10000
akrun_mapply         19.006   24.2100   29.78381   26.2120   29.9255    2911.252  10000
akrun_mapply_match   14.136   18.4380   35.45528   20.0275   23.6560  127960.324  10000
akrun_map2          217.209  264.7350  303.64609  277.5545  298.0455    9204.243  10000
CPak                 15.741   19.7525   27.31918   24.7150   29.0340     235.245  10000
zacdav                6.649    9.3210   11.30229   10.4240   11.5540    2399.686  10000
zacdav_which          7.364   10.2395   12.22632   11.2985   12.4515    2492.789  10000
Find element in a list
One-liner:
lapply(list_fruits[sapply(list_fruits, "[[", "third") %in% fruits], "[[", "first")
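Neither list_fruits nor fruits is defined in the answer. The sketch below assumes hypothetical data in which every element of list_fruits is a list with fields first, second and third, and fruits is a character vector to match against third. The values are invented for illustration.

```r
# Hypothetical data; the field names come from the one-liner, the values do not
list_fruits <- list(
  list(first = "red",    second = "round", third = "apple"),
  list(first = "green",  second = "oval",  third = "kiwi"),
  list(first = "yellow", second = "long",  third = "banana")
)
fruits <- c("apple", "banana")

# Step 1: pull the "third" field out of every element
third_vals <- sapply(list_fruits, "[[", "third")

# Step 2: keep the elements whose "third" is in fruits, then take their "first"
res <- lapply(list_fruits[third_vals %in% fruits], "[[", "first")
```

With this data, res is a list holding "red" and "yellow".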
Find the number of elements in a list
Here, using a small example of list v:
v = vector("list",4)
v[[1]] = 1:5
v[[2]] = 1:50
v[[3]] = NA
> v
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50
[[3]]
[1] NA
[[4]]
NULL
And you get the count by doing:
l = unlist(lapply(v,length))
and
> l
[1] 5 50 1 0
If you don't want to count NA values:
l = unlist(lapply(v,function(x)length(x[!is.na(x)])))
and you get:
> l
[1] 5 50 0 0
EDIT: from @markus and @A5C1D2H2I1M1N2O1R2T1 comments
As mentioned by @markus, you can make this much simpler by doing:
> lengths(v)
[1] 5 50 1 0
And as mentioned by @A5C1D2H2I1M1N2O1R2T1, you can get rid of NA count by doing:
> replace(lengths(v), is.na(v), 0)
[1] 5 50 0 0
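One point worth checking before picking between the two NA-handling variants: is.na applied to a list is TRUE only for elements that are a single NA, so the replace version does not discount NAs inside longer vectors. A small sketch with a hypothetical list v2 (not from the original answer) shows the difference:

```r
# Hypothetical list: one vector with an embedded NA, one lone NA, one NULL
v2 <- list(c(1, NA, 3), NA, NULL)

# Zeroes out only the element that is a single NA
by_replace <- replace(lengths(v2), is.na(v2), 0)

# Counts the non-NA values inside every element
by_count <- vapply(v2, function(x) sum(!is.na(x)), integer(1))
```

Here by_replace is 3 0 0 while by_count is 2 0 0, so the per-element count is the safer choice when vectors may contain some NAs.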
How to see if any element of a list contains only a certain value in R
We may need to wrap with all: loop over the list of matrices with sapply, create a logical expression (x == 0), and wrap it with all to return a single TRUE/FALSE. If all values excluding NAs (na.rm = TRUE) are 0, this returns TRUE, or else FALSE.
sapply(my_list, function(x) all(x == 0, na.rm = TRUE))
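The list my_list is not shown in the answer. The sketch below assumes a hypothetical list of small numeric matrices to illustrate the result, including one edge case: with na.rm = TRUE, a matrix that is entirely NA also yields TRUE, because all of an empty set is TRUE. The stricter variant is an addition, not part of the original answer.

```r
# Hypothetical list of matrices
my_list <- list(
  zeros    = matrix(0, 2, 2),
  zeros_na = matrix(c(0, NA, 0, 0), 2, 2),
  mixed    = matrix(c(0, 1, 0, 0), 2, 2),
  all_na   = matrix(NA_real_, 2, 2)
)

# Same test as above, with vapply() to guarantee a logical vector
only_zero <- vapply(my_list, function(x) all(x == 0, na.rm = TRUE), logical(1))

# Stricter variant: additionally require at least one non-NA value
only_zero_strict <- vapply(
  my_list,
  function(x) !all(is.na(x)) && all(x == 0, na.rm = TRUE),
  logical(1)
)
```

Wrapping either result in any() answers the question of whether at least one element qualifies.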
Is there a way to view a list
I use str to see the structure of any object, especially complex lists.
RStudio shows you the structure when you click the blue arrow in the data window.
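As a quick illustration of str, here is a sketch on a hypothetical nested list x. The max.level argument limits how deep the display goes, which helps with large lists.

```r
# Hypothetical nested list
x <- list(a = 1:3, b = list(c = "x", d = c(TRUE, FALSE)))

# Full structure, including the nested list
str(x)

# Top level only
str(x, max.level = 1)
```

The first line of the output reports the type and length of the object, followed by one line per component.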
Proper way to access list elements in R
All these methods give different outputs:
[ ] returns a list.
[[ ]] returns the object that is stored in the list.
If it is a named list, then List$name or List[["name"]] returns the same as List[[ ]], while List["name"] returns a list. Consider the following example:
> List <- list(A = 1,B = 2,C = 3,D = 4)
> List[1]
$A
[1] 1
> class(List[1])
[1] "list"
> List[[1]]
[1] 1
> class(List[[1]])
[1] "numeric"
> List$A
[1] 1
> class(List$A)
[1] "numeric"
> List["A"]
$A
[1] 1
> class(List["A"])
[1] "list"
> List[["A"]]
[1] 1
> class(List[["A"]])
[1] "numeric"
How to extract elements from list of lists
rapply offers yet another option:
unique(rapply(t1, function(x) head(x, 1)))
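The input t1 is not shown in the answer. A minimal sketch, assuming a hypothetical nested list of character vectors: rapply visits every leaf vector, head(x, 1) keeps its first value, and with the default how = "unlist" the results come back as one flat vector.

```r
# Hypothetical nested list of character vectors
t1 <- list(
  list(c("a", "b"), c("c", "d")),
  list(c("a", "e"))
)

# First value of every leaf vector, flattened
firsts <- rapply(t1, function(x) head(x, 1))

# Drop repeated values
uniq_firsts <- unique(firsts)
```

With this data, firsts is "a" "c" "a" and uniq_firsts is "a" "c".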