In R, Extract Part of Object from List

sapply is going to apply some function to every element in your list. In your case, you want to access each element in a (nested) list. sapply is certainly capable of this. For instance, if you want to access the first child of every element in your list:

sapply(listJson, "[[", 1)

Or if you wanted to access the item named "favorited", you could use:

sapply(listJson, "[[", "favorited")

Note that the [ operator takes a subset of the list you're working with. So when you access myList[1], you still have a list; it's just of length 1. However, if you reference myList[[1]], you get the contents of the first element of your list (which may or may not be another list). That is why you use the [[ operator in sapply: you want to get down to the contents of each element.
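A small sketch of the difference between [ and [[, using a made-up list (myList and its contents are assumptions for illustration, not the asker's data):

```r
# Hypothetical nested list standing in for parsed JSON
myList <- list(
  list(id = 1, favorited = TRUE),
  list(id = 2, favorited = FALSE)
)

single <- myList[1]    # "[" keeps the container: still a list, of length 1
inner  <- myList[[1]]  # "[[" returns the contents of the first element

class(single)   # "list"
length(single)  # 1
names(inner)    # "id" "favorited"

# Applying "[[" to every element pulls out the named item from each one
fav <- sapply(myList, "[[", "favorited")
fav             # TRUE FALSE
```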

How to extract elements of a list in R?

I like the purrr::map family of functions for their ease of passing functions and arguments. Two quick options for extracting those elements are grep with value = TRUE, which returns the matching strings rather than just their indices, and stringr::str_subset, which does the same.

The regex here matches strings that begin with "OTU", followed by 1 or more digits to the end.

Both methods scale for multiple matches at a time: I added an item "OTU1234" in the last list element to illustrate this.

dl <- list(
`56` = c("OTU2998", "UniRef90_A0A1Z9FS94", "UniRef90_A0A257ESC3", "UniRef90_A0A293NAV3", "UniRef90_A0A2E1NMU8", "UniRef90_A0A2E1NPX9", "UniRef90_A0A2E1NQL1", "UniRef90_A0A2E1NRD2", "UniRef90_X0UC66"),
`57` = c("OTU3820", "UniRef90_A0A1Z9H3N2", "UniRef90_A0A2D5I161", "UniRef90_A0A2E6PRN5"),
`58` = c("OTU4452", "UniRef90_A0A1Z9KBI8", "UniRef90_A0A2E1VTI6", "UniRef90_A0A2G2KCN6", "UniRef90_UPI000BFEC744"),
`59` = c("OTU0245", "UniRef90_A0A1Z9MPM9", "UniRef90_A0A2E2ME98", "UniRef90_A0A2E8X9N7", "OTU1234")
)

purrr::map(dl, ~grep("^OTU\\d+$", ., value = TRUE))
#> $`56`
#> [1] "OTU2998"
#>
#> $`57`
#> [1] "OTU3820"
#>
#> $`58`
#> [1] "OTU4452"
#>
#> $`59`
#> [1] "OTU0245" "OTU1234"
purrr::map(dl, stringr::str_subset, "^OTU\\d+$")
# same output as above
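
To see what the pattern accepts and rejects, here is a minimal base R check on a few made-up strings (the candidates vector is an assumption for illustration only):

```r
# Made-up test strings: only "OTU" followed by digits, and nothing else, should match
candidates <- c("OTU2998", "OTU12a", "xOTU1", "UniRef90_X0UC66", "OTU")

# ^ and $ anchor the match to the whole string; \\d+ requires at least one digit
matches <- grepl("^OTU\\d+$", candidates)
candidates[matches]   # "OTU2998"
```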

Extract values from list of arbitrary depth

1) rrapply. Flatten m using rrapply, giving r, then separate the name and code fields of unlist(r) using tapply, remove the dimensions using c, convert the result to a data.frame, and set the order of the columns.

Note that this is not hard coded to name and code and would work with other fields and numbers of fields.

library(rrapply)

r <- rrapply(m, f = c, how = "flatten")
nms <- names(r)
as.data.frame(c(tapply(unname(r), nms, unlist)))[unique(nms)]

giving:

  name code
1  Bob   12
2 Mary   15

An alternative to the final two lines of code above would be:

out <- unstack(stack(r))
out[] <- lapply(out, type.convert)

If there could be other fields in m in addition to name and code that we want ignored then use this in place of the statement that defines r above:

cond <- function(x, .xname) .xname %in% c("name", "code")
r <- rrapply(m, cond, c, how = "flatten")

2) Base R. A base R solution is the following, which unlists m and then uses tapply as in (1), grouping by the suffixes of names(r). Like (1), this is a general approach that is not hard coded to name and code. Note that the tools package comes with R, so it counts as base R.

r <- unlist(m)
nms <- tools::file_ext(names(r))
as.data.frame(c(tapply(unname(r), nms, unlist)))[unique(nms)]
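
The input m is not shown in this answer, so the following is only a sketch of the base R approach on a hypothetical m whose records sit at different depths. The names a, b and x are invented; the approach relies on the outer levels being named, so that unlist produces dotted names ending in the field name.

```r
# Hypothetical nested input (not the asker's data): records at different depths
m <- list(
  a = list(name = "Bob", code = 12),
  b = list(x = list(name = "Mary", code = 15))
)

r <- unlist(m)                     # names: a.name, a.code, b.x.name, b.x.code
nms <- tools::file_ext(names(r))   # text after the last dot: "name" or "code"

# Group values by field name, drop the array dimensions with c(),
# and restore the original field order with unique(nms)
out <- as.data.frame(c(tapply(unname(r), nms, unlist)))[unique(nms)]
out
```

Because unlist coerces everything to character here, the code column comes back as text; the type.convert step shown in (1) is one way to turn it back into numbers.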

How to extract elements from a list with mixed elements

If you want to extract the third element of each list element you can do:

List <- list(c(1:3), c(4:6), c(7:9))
lapply(List, '[[', 3) # This returns a list with only the third element
unlist(lapply(List, '[[', 3)) # This returns a vector with the third element

Using your example, and taking @GSee's comment into account, you can do:

yourList <- list(c("","668","12345_s_at","667", "4.899777748","49.53333333",
"10.10930207", "1.598228663","5.087437057"),
c("","376", "6789_at", "375", "4.899655078","136.3333333",
"27.82508792", "2.20223398", "5.087437057"),
c("", "19265", "12351_s_at", "19264", "4.897730912",
"889.3666667", "181.5874908","1.846451572","5.087437057" ))

sapply(yourList, '[[', 3)
# [1] "12345_s_at" "6789_at"    "12351_s_at"

Next time you can provide some data using dput on a portion of your dataset so we can reproduce your problem easily.
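
As a rough illustration of the dput suggestion (the shortened yourList below is a stand-in, not the full data): dput prints R code that recreates the object, and that printed code is what you would paste into a question.

```r
# Shortened, hypothetical stand-in for the data above
yourList <- list(
  c("", "668", "12345_s_at"),
  c("", "376", "6789_at"),
  c("", "19265", "12351_s_at")
)

# Print a reproducible text representation of the first two elements
dput(head(yourList, 2))

# Round trip: the deparsed text evaluates back to the same object
txt <- paste(deparse(head(yourList, 2)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
identical(rebuilt, head(yourList, 2))
```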

How to extract elements from list of lists

rapply offers yet another option:

unique(rapply(t1, function(x) head(x, 1)))
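
The object t1 is not shown here, so this is only a sketch on an invented list of lists. rapply visits every leaf vector, the function keeps the first value of each, and unique removes repeats.

```r
# Hypothetical list of lists; each leaf is a character vector
t1 <- list(
  list(c("a", "b"), c("a", "c")),
  list(c("d", "e"))
)

# First value of every leaf, flattened to a vector, duplicates dropped
firsts <- unique(rapply(t1, function(x) head(x, 1)))
firsts   # "a" "d"
```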

Extract object from list using dplyr

If you prefer map to sapply for this, you can do

library(purrr)
map(gof_stats, ~ .x[["cvm"]])

If you just like pipes you could do

gof_stats %>% sapply("[[", "cvm")

Your question is about lists, not data frames, so dplyr doesn't really apply. You may want to look up ?magrittr::multiply_by to see a list of other aliases from the package that defines %>% as you seem to like piping. For example, magrittr::extract2 is an alias of [[ that can be easily used in the middle of a piping chain.

As for your bonus, I would pre-filter the list to remove NULL elements before attempting to extract things.
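
One way to sketch that pre-filtering step in base R, assuming a made-up gof_stats in which one element is NULL (the element names and values are invented):

```r
# Hypothetical results list; the second fit failed and left a NULL
gof_stats <- list(
  a = list(cvm = 0.1, ks = 0.2),
  b = NULL,
  c = list(cvm = 0.3, ks = 0.4)
)

# Drop NULL elements first, then extract "cvm" from what remains
non_null <- Filter(Negate(is.null), gof_stats)
cvm <- sapply(non_null, "[[", "cvm")
cvm
```

Filtering first avoids asking [[ to index into NULL, which would otherwise put NULL entries into the result and stop sapply from simplifying it to a vector.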

Extracting data from lists of different levels using a function

How about using a recursive function?

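# Descend through the first element of each nested list
# until something that is not a plain list is reached
df_func <- function(x) {
  tmp <- x[[1]]
  # inherits() stays a single TRUE/FALSE even when tmp has several classes
  if (inherits(tmp, "list")) {
    df_func(tmp)
  } else {
    tmp
  }
}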
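
A quick usage sketch on an invented nested list. The helper is restated under the hypothetical name first_leaf so that the block runs on its own.

```r
# Same idea as the recursive function above: follow the first element downwards
first_leaf <- function(x) {
  first <- x[[1]]
  if (inherits(first, "list")) first_leaf(first) else first
}

# Hypothetical input: the first non-list value sits three levels deep
nested <- list(list(list(42, "b"), "c"), "d")
first_leaf(nested)   # 42
```

Note that this only ever follows the first element at each level, so values in other branches are ignored.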

How to extract values from string lists in R?

While not what you've asked, it looks as though you used capture.output(.) to capture those strings. Instead of trying to extract the strings from the captured output, I suggest you get the real numbers from the objects themselves.

M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("F", "M"),
party = c("Democrat","Independent", "Republican"))
Xsq <- chisq.test(M)
names(Xsq)
# [1] "statistic" "parameter" "p.value" "method" "data.name" "observed" "expected" "residuals" "stdres"
Xsq[c("statistic","p.value")]
# $statistic
# X-squared
# 30.07015
# $p.value
# [1] 2.953589e-07

Since you mention having a list of these, it's easy to work with that as well. For instance, if you have a list of test results as in

Xsq2 <- lapply(list(M, M), chisq.test)
Xsq2
# [[1]]
# Pearson's Chi-squared test
# data: X[[i]]
# X-squared = 30.07, df = 2, p-value = 2.954e-07
# [[2]]
# Pearson's Chi-squared test
# data: X[[i]]
# X-squared = 30.07, df = 2, p-value = 2.954e-07
lapply(Xsq2, `[`, c("statistic", "p.value"))
# [[1]]
# [[1]]$statistic
# X-squared
# 30.07015
# [[1]]$p.value
# [1] 2.953589e-07
# [[2]]
# [[2]]$statistic
# X-squared
# 30.07015
# [[2]]$p.value
# [1] 2.953589e-07

which can be easily converted into a data.frame with:

do.call(rbind.data.frame, lapply(Xsq2, `[`, c("statistic", "p.value")))
# statistic p.value
# 1 30.07015 2.953589e-07
# 2 30.07015 2.953589e-07

Extract names of objects from list

Making a small tweak to the inner function and using lapply on an index, instead of on the list itself, gets this doing what you want:

x <- c("yes", "no", "maybe", "no", "no", "yes")
y <- c("red", "blue", "green", "green", "orange")
list.xy <- list(x=x, y=y)

WORD.C <- function(WORDS){
  require(wordcloud)

  L2 <- lapply(WORDS, function(x) as.data.frame(table(x), stringsAsFactors = FALSE))

  # Takes a data frame and the text you want to display
  FUN <- function(X, text){
    dev.new()  # portable alternative to windows(), which only exists on Windows
    wordcloud(X[, 1], X[, 2], min.freq = 1)
    mtext(text, 3, padj = -4.5, col = "red")  # label the plot with the element's name
  }

  # Create the sequence 1, ..., length(L2), loop over it, and use an
  # anonymous function to pass both the data frame and its name to FUN.
  lapply(seq_along(L2), function(i){FUN(L2[[i]], names(L2)[i])})

  # Since you asked about loops, the equivalent for loop would be
  # (you could use 1:length(L2) instead of seq_along(L2) if you wanted to):
  # for(i in seq_along(L2)){
  #   FUN(L2[[i]], names(L2)[i])
  # }
}

WORD.C(list.xy)