Extract Names of Deeply Nested Lists

Extracting data from a deeply nested list

I was going to suggest purrr::pluck(), but then, reading through the documentation, I discovered you can actually just use purrr::map().

You're very close: you need to pass a list of accessors to map() rather than a character vector, and there's an accessor you've missed.

nestedlist %>% map( list('data', 1, 'name') )

[[1]]
[[1]][[1]]
[1] "john"

[[1]][[2]]
[1] "litz"

[[2]]
[[2]][[1]]
[1] "frank"

[[2]][[2]]
[1] "doe"

Get names at deepest level of a nested list in R

Here is one possible approach, using only base R. The following function f replaces each terminal node (or "leaf") of a recursive list x with the sequence of names leading up to it. It treats unnamed lists like named lists with all names equal to "", which is a useful generalization.

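f <- function(x, s = NULL) {
  # Leaf reached: return the accumulated path of names
  if (!is.list(x)) {
    return(s)
  }
  nms <- names(x)
  # Unnamed lists are treated as if every name were ""
  if (is.null(nms)) {
    nms <- character(length(x))
  }
  # Recurse into each element, appending its name to the path so far
  Map(f, x = x, s = Map(c, list(s), nms))
}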

f(lst)
$title
[1] "title"

$author
[1] "author"

$date
[1] "date"

$`header-includes`
[1] "header-includes"

$output
$output$pdf_document
$output$pdf_document$citation_package
[1] "output" "pdf_document" "citation_package"

$`biblio-style`
[1] "biblio-style"

$bibliography
[1] "bibliography"

$papersize
[1] "papersize"

How to get common elements in a deep nested list: my two solutions work but take some time

You can convert each nested array at the second level into a set of tuples, where each lowest-level array (e.g. [0,4]) becomes one element of the set.
The conversion into tuples is required because lists are not hashable.
Once you have each nested list of lists as a set, simply find their intersection.

set.intersection(*[set(tuple(elem) for elem in sublist) for sublist in ary])
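As a small worked example of the one-liner above, here is a sketch on made-up data. The contents of ary are hypothetical and only assume the shape described in the answer: a list of sublists, each holding two-element lists.

```python
# Hypothetical input: each top-level entry is a list of [start, end] pairs.
ary = [
    [[0, 4], [1, 2], [3, 5]],
    [[1, 2], [0, 4], [7, 8]],
    [[0, 4], [9, 9], [1, 2]],
]

# Lists are unhashable, so convert each pair to a tuple before building the sets.
sets = [set(tuple(elem) for elem in sublist) for sublist in ary]

# set.intersection accepts any number of sets via unpacking (it needs at least one).
common = set.intersection(*sets)

# Sets are unordered; sort for a stable result and convert back to lists.
common_pairs = sorted(list(pair) for pair in common)
print(common_pairs)  # [[0, 4], [1, 2]]
```

Note that an empty ary would raise a TypeError here, because set.intersection called through the class needs at least one set to start from.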

How to access very first object in differently deep nested lists?

Using a while loop:

x <- list1
while (inherits(x <- x[[1]], "list")) {}
x
#> Time Series:
#> Start = 1
#> End = 100
#> Frequency = 1
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#> [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
#> [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
#> [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
#> [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
#> [91] 91 92 93 94 95 96 97 98 99 100

x <- list2
while (inherits(x <- x[[1]], "list")) {}
x
#> Time Series:
#> Start = 1
#> End = 100
#> Frequency = 1
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#> [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
#> [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
#> [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
#> [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
#> [91] 91 92 93 94 95 96 97 98 99 100

Extract and align specific elements from deeply nested list into R dataframe

Using tidyr, we can unnest the list by combining a bunch of calls to unnest_wider() and unnest_longer():

library(tidyr)

tibble(conditions) |>
unnest_wider(conditions) |>
unnest_longer(Phrases) |>
unnest_wider(Phrases) |>
unnest_longer(Mappings) |>
unnest_wider(Mappings) |>
unnest_longer(MappingCandidates) |>
unnest_wider(MappingCandidates) |>
unnest_longer(MatchedWords)
#> # A tibble: 4 × 8
#> PMID PhraseText MappingScore CandidateScore CandidateCUI CandidateMatched CandidatePreferred MatchedWords
#> <dbl> <chr> <dbl> <dbl> <chr> <chr> <chr> <list>
#> 1 1 Hodgkin Lymphoma 1000 1000 C075655 Hodgkins Lymphoma Hodgkins Lymphoma <chr [2]>
#> 2 1 Hodgkin Lymphoma 1000 850 C095659 Lymphoma Lymphoma <chr [1]>
#> 3 2 Plaque Psoriasis 1000 1000 C0125609 Plaque Psoriasis Plaque Psoriasis <chr [2]>
#> 4 2 Plaque Psoriasis 1000 750 C0320011 Psoriasis Psoriasis <chr [1]>

Another approach, perhaps easier to generalize, uses rrapply() from the rrapply package. Here rrapply() is called twice with the option how = "bind": once to bind together all repeated MappingCandidates, and once to bind the other nodes (PMID, Phrases, PhraseText, MappingScore):

library(rrapply)

## bind MappingCandidates
candidateNodes <- rrapply(
conditions,
how = "bind",
options = list(namecols = TRUE, coldepth = 8)
)
candidateNodes
#> L1 L2 L3 L4 L5 L6 L7 CandidateScore CandidateCUI CandidateMatched CandidatePreferred MatchedWords.1
#> 1 1 Phrases 1 Mappings 1 MappingCandidates 1 1000 C075655 Hodgkins Lymphoma Hodgkins Lymphoma hodgkin, lymphoma
#> 2 1 Phrases 1 Mappings 1 MappingCandidates 2 850 C095659 Lymphoma Lymphoma lymphoma
#> 3 2 Phrases 1 Mappings 1 MappingCandidates 1 1000 C0125609 Plaque Psoriasis Plaque Psoriasis plaque, psoriasis
#> 4 2 Phrases 1 Mappings 1 MappingCandidates 2 750 C0320011 Psoriasis Psoriasis psoriasis

## bind other nodes
otherNodes <- rrapply(
conditions,
condition = \(x, .xparents) !"MappingCandidates" %in% .xparents,
how = "bind",
options = list(namecols = TRUE)
)
otherNodes
#> L1 PMID Phrases.1.PhraseText Phrases.1.Mappings.1.MappingScore
#> 1 1 1 Hodgkin Lymphoma 1000
#> 2 2 2 Plaque Psoriasis 1000

## merge into single data.frame
allNodes <- merge(candidateNodes, otherNodes, by = "L1")
allNodes
#> L1 L2 L3 L4 L5 L6 L7 CandidateScore CandidateCUI CandidateMatched CandidatePreferred MatchedWords.1 PMID Phrases.1.PhraseText Phrases.1.Mappings.1.MappingScore
#> 1 1 Phrases 1 Mappings 1 MappingCandidates 1 1000 C075655 Hodgkins Lymphoma Hodgkins Lymphoma hodgkin, lymphoma 1 Hodgkin Lymphoma 1000
#> 2 1 Phrases 1 Mappings 1 MappingCandidates 2 850 C095659 Lymphoma Lymphoma lymphoma 1 Hodgkin Lymphoma 1000
#> 3 2 Phrases 1 Mappings 1 MappingCandidates 1 1000 C0125609 Plaque Psoriasis Plaque Psoriasis plaque, psoriasis 2 Plaque Psoriasis 1000
#> 4 2 Phrases 1 Mappings 1 MappingCandidates 2 750 C0320011 Psoriasis Psoriasis psoriasis 2 Plaque Psoriasis 1000

R: Find object by name in deeply nested list

Here's a function that returns the first match, if one is found:

find_name <- function(haystack, needle) {
  if (hasName(haystack, needle)) {
    # Direct hit at this level
    haystack[[needle]]
  } else if (is.list(haystack)) {
    # Search each child in turn and stop at the first match
    for (obj in haystack) {
      ret <- Recall(obj, needle)
      if (!is.null(ret)) return(ret)
    }
  } else {
    NULL
  }
}

find_name(my_list, "XY01")

We avoid lapply so that the loop can stop early once a match is found.

The list pruning is really a separate issue, and is better handled by a different function. This should work:

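list_prune <- function(list, depth=1) {
  if (!is.list(list)) return(list)
  if (depth>1) {
    # Not at the target depth yet: descend one level
    lapply(list, list_prune, depth = depth-1)
  } else {
    # At the target depth: drop every element that is itself a list
    Filter(function(x) !is.list(x), list)
  }
}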

Then you could do

list_prune(find_name(my_list, "XY01"), 1)

or with pipes

find_name(my_list, "XY01") %>% list_prune(1)

Extracting values from complex and deeply nested list of dictionaries using python?

The main idea here is to convert the data to a dataframe, filter the nested column, and then convert the dataframe back to a list of rows.

Code:

Step 1:

import pandas as pd

df = pd.json_normalize(complex_data)
# Keep only the 'test456' entry from the 'C' dict nested under 'B'
df[2] = df[2].apply(lambda x: {k: v for k, v in dict(map(dict.popitem, x['B']))['C'].items() if k == 'test456'})
df

#Output

                0               1                              2
0 {'A': 'test1'} {'A': 'test2'} {'test456': {'A': '111def'}}
1 {'A': 'test3'} {'A': 'test4'} {'test456': {'A': '999def'}}

Step 2:

desired_output = df.values.tolist()
desired_output

#output

[[{'A': 'test1'}, {'A': 'test2'}, {'test456': {'A': '111def'}}],
[{'A': 'test3'}, {'A': 'test4'}, {'test456': {'A': '999def'}}]]

Update: you can guard against None or {} values using if..else, as below:

df[2].apply(lambda x: {} if len(x['B'])==0 else({} if not x['B'][-1] else ({'test456':x['B'][-1]['C']['test456']} if 'test456' in  x['B'][-1]['C'].keys() else {})))
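The nested conditional above is hard to read, so here is a sketch of the same guard logic as a named function. The name pick_test456 and the sample rows are hypothetical, and the cell shape (a dict whose 'B' entry is a list of dicts, the last of which may hold a 'C' dict) is inferred from the lambda rather than from the original data. Unlike the lambda, this version also tolerates a missing 'B' or 'C' key.

```python
def pick_test456(cell, key="test456"):
    """Return {key: value} from the last entry of cell['B'], or {} if absent."""
    entries = cell.get("B") or []
    # Empty list, or a last entry that is None or {}: nothing to extract.
    if not entries or not entries[-1]:
        return {}
    inner = entries[-1].get("C", {})
    return {key: inner[key]} if key in inner else {}


# Hypothetical cells covering each branch of the conditional.
rows = [
    {"B": [{"C": {"test456": {"A": "111def"}, "other": {"A": "x"}}}]},
    {"B": []},
    {"B": [{}]},
    {"B": [{"C": {"other": {"A": "y"}}}]},
]
results = [pick_test456(r) for r in rows]
print(results)  # [{'test456': {'A': '111def'}}, {}, {}, {}]
```

With the dataframe from Step 1, this could be applied as df[2].apply(pick_test456), assuming the cells really have the shape described above.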

