Extract Names of Deeply Nested Lists

Extracting data from a deeply nested list

I was going to suggest purrr::pluck(), but reading through the documentation I discovered you can just use purrr::map().

You're very close: you need to pass a list of accessors to map() rather than a character vector, and there's an accessor you've missed.

library(purrr)
nestedlist %>% map(list('data', 1, 'name'))

[[1]]
[[1]][[1]]
[1] "john"

[[1]][[2]]
[1] "litz"

[[2]]
[[2]][[1]]
[1] "frank"

[[2]][[2]]
[1] "doe"
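
For anyone doing the same thing in Python, the "list of accessors" idea can be sketched with functools.reduce. This is only an illustration: the pluck helper and the shape of nestedlist are assumptions on my part, and Python indexes from 0 where R indexes from 1.

```python
from functools import reduce
from operator import getitem


def pluck(obj, path):
    """Follow a sequence of keys/indices into a nested structure."""
    return reduce(getitem, path, obj)


# Hypothetical data, shaped to mirror the output shown above.
nestedlist = [
    {"data": [{"name": ["john", "litz"]}]},
    {"data": [{"name": ["frank", "doe"]}]},
]

# Same accessor path as the R call, with index 0 instead of 1.
names = [pluck(item, ["data", 0, "name"]) for item in nestedlist]
print(names)
```

Unlike a lookup with a default, this sketch raises KeyError or IndexError when a step in the path is missing.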

Get names at deepest level of a nested list in R

Here is one possible approach, using only base R. The following function f replaces each terminal node (or "leaf") of a recursive list x with the sequence of names leading up to it. It treats unnamed lists like named lists with all names equal to "", which is a useful generalization.

f <- function(x, s = NULL) {
  if (!is.list(x)) {
    return(s)
  }
  nms <- names(x)
  if (is.null(nms)) {
    nms <- character(length(x))
  }
  Map(f, x = x, s = Map(c, list(s), nms))
}

f(lst)

$title
[1] "title"

$author
[1] "author"

$date
[1] "date"

$`header-includes`
[1] "header-includes"

$output
$output$pdf_document
$output$pdf_document$citation_package
[1] "output"           "pdf_document"     "citation_package"

$`biblio-style`
[1] "biblio-style"

$bibliography
[1] "bibliography"

$papersize
[1] "papersize"

How to get common elements in a deep nested list: my two solutions work but take some time

You can convert each nested list at the second level into a set of tuples, where each lowest-level list (e.g. [0, 4]) becomes one element of the set.
The conversion to tuples is required because lists are not hashable.
Once each nested list of lists is a set, simply take their intersection.

set.intersection(*[set(tuple(elem) for elem in sublist) for sublist in ary])
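
Here is the one-liner broken into steps, as a minimal sketch. The contents of ary below are made up for illustration; I am assuming it is a list of groups, each holding two-element lists.

```python
# Hypothetical sample data: three groups of [start, end] pairs.
ary = [
    [[0, 4], [1, 2], [3, 5]],
    [[1, 2], [0, 4], [7, 8]],
    [[0, 4], [9, 9], [1, 2]],
]

# Turn each group into a set of tuples (lists are not hashable, tuples are).
groups = [set(tuple(elem) for elem in sublist) for sublist in ary]

# Intersect all sets; the result holds the pairs present in every group.
common = set.intersection(*groups)

print(sorted(common))
```

Keep in mind that set.intersection(*groups) raises a TypeError when ary is empty, so guard against that case if it can occur.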

How to access very first object in differently deep nested lists?

Using a while loop:

x <- list1
while (inherits(x <- x[[1]], "list")) {}
x
#> Time Series:
#> Start = 1 
#> End = 100 
#> Frequency = 1 
#>   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
#>  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
#>  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
#>  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
#>  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
#>  [91]  91  92  93  94  95  96  97  98  99 100

x <- list2
while (inherits(x <- x[[1]], "list")) {}
x
#> Time Series:
#> Start = 1 
#> End = 100 
#> Frequency = 1 
#>   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
#>  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
#>  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
#>  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
#>  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
#>  [91]  91  92  93  94  95  96  97  98  99 100

Extract and align specific elements from deeply nested list into R dataframe

Using tidyr, we can unnest the list by combining a bunch of calls to unnest_wider() and unnest_longer():

library(tidyr)

tibble(conditions) |>
  unnest_wider(conditions) |>
  unnest_longer(Phrases) |>
  unnest_wider(Phrases) |>
  unnest_longer(Mappings) |>
  unnest_wider(Mappings) |>
  unnest_longer(MappingCandidates) |>
  unnest_wider(MappingCandidates) |>
  unnest_longer(MatchedWords)
#> # A tibble: 4 × 8
#>    PMID PhraseText       MappingScore CandidateScore CandidateCUI CandidateMatched  CandidatePreferred MatchedWords
#>   <dbl> <chr>                   <dbl>          <dbl> <chr>        <chr>             <chr>              <list>      
#> 1     1 Hodgkin Lymphoma         1000           1000 C075655      Hodgkins Lymphoma Hodgkins Lymphoma  <chr [2]>   
#> 2     1 Hodgkin Lymphoma         1000            850 C095659      Lymphoma          Lymphoma           <chr [1]>   
#> 3     2 Plaque Psoriasis         1000           1000 C0125609     Plaque Psoriasis  Plaque Psoriasis   <chr [2]>   
#> 4     2 Plaque Psoriasis         1000            750 C0320011     Psoriasis         Psoriasis          <chr [1]>

Another approach (perhaps easier to generalize) uses rrapply() from the rrapply package. Here rrapply() is called twice with the option how = "bind": once to bind together all repeated MappingCandidates and once to bind the other nodes (PMID, Phrases, PhraseText, MappingScore):

library(rrapply)

## bind MappingCandidates
candidateNodes <- rrapply(
  conditions, 
  how = "bind", 
  options = list(namecols = TRUE, coldepth = 8)
)
candidateNodes 
#>   L1      L2 L3       L4 L5                L6 L7 CandidateScore CandidateCUI  CandidateMatched CandidatePreferred    MatchedWords.1
#> 1  1 Phrases  1 Mappings  1 MappingCandidates  1           1000      C075655 Hodgkins Lymphoma  Hodgkins Lymphoma hodgkin, lymphoma
#> 2  1 Phrases  1 Mappings  1 MappingCandidates  2            850      C095659          Lymphoma           Lymphoma          lymphoma
#> 3  2 Phrases  1 Mappings  1 MappingCandidates  1           1000     C0125609  Plaque Psoriasis   Plaque Psoriasis plaque, psoriasis
#> 4  2 Phrases  1 Mappings  1 MappingCandidates  2            750     C0320011         Psoriasis          Psoriasis         psoriasis

## bind other nodes
otherNodes <- rrapply(
  conditions, 
  condition = \(x, .xparents) !"MappingCandidates" %in% .xparents, 
  how = "bind", 
  options = list(namecols = TRUE)
)
otherNodes
#>   L1 PMID Phrases.1.PhraseText Phrases.1.Mappings.1.MappingScore
#> 1  1    1     Hodgkin Lymphoma                              1000
#> 2  2    2     Plaque Psoriasis                              1000

## merge into single data.frame
allNodes <- merge(candidateNodes, otherNodes, by = "L1")
allNodes
#>   L1      L2 L3       L4 L5                L6 L7 CandidateScore CandidateCUI  CandidateMatched CandidatePreferred    MatchedWords.1 PMID Phrases.1.PhraseText Phrases.1.Mappings.1.MappingScore
#> 1  1 Phrases  1 Mappings  1 MappingCandidates  1           1000      C075655 Hodgkins Lymphoma  Hodgkins Lymphoma hodgkin, lymphoma    1     Hodgkin Lymphoma                              1000
#> 2  1 Phrases  1 Mappings  1 MappingCandidates  2            850      C095659          Lymphoma           Lymphoma          lymphoma    1     Hodgkin Lymphoma                              1000
#> 3  2 Phrases  1 Mappings  1 MappingCandidates  1           1000     C0125609  Plaque Psoriasis   Plaque Psoriasis plaque, psoriasis    2     Plaque Psoriasis                              1000
#> 4  2 Phrases  1 Mappings  1 MappingCandidates  2            750     C0320011         Psoriasis          Psoriasis         psoriasis    2     Plaque Psoriasis                              1000

R: Find object by name in deeply nested list

Here's a function that returns the first match, if one is found:

find_name <- function(haystack, needle) {
  if (hasName(haystack, needle)) {
    # Direct hit at this level
    haystack[[needle]]
  } else if (is.list(haystack)) {
    # Search each child in turn and stop at the first match
    for (obj in haystack) {
      ret <- Recall(obj, needle)
      if (!is.null(ret)) return(ret)
    }
  } else {
    NULL
  }
}

find_name(my_list, "XY01")

We avoid lapply() so the loop can exit early once a match is found.

List pruning is really a separate issue, better handled by a different function. This should work:

list_prune <- function(list, depth=1) {
  if (!is.list(list)) return(list)
  if (depth>1) {
    lapply(list, list_prune, depth = depth-1)
  } else  {
    Filter(function(x) !is.list(x), list)
  }
}

Then you could do

list_prune(find_name(my_list, "XY01"), 1)

or with pipes

find_name(my_list, "XY01") %>% list_prune(1)
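
If you need the same first-match lookup in Python, a rough equivalent over nested dicts and lists might look like this. It is only a sketch: the sample data, the default parameter and the sentinel are my own assumptions, not part of the R answer.

```python
_MISSING = object()  # sentinel so that a stored None still counts as a match


def find_name(haystack, needle, default=None):
    """Depth-first search for the first value stored under key `needle`."""
    if isinstance(haystack, dict):
        if needle in haystack:
            return haystack[needle]
        children = haystack.values()
    elif isinstance(haystack, (list, tuple)):
        children = haystack
    else:
        return default
    for child in children:
        found = find_name(child, needle, _MISSING)
        if found is not _MISSING:
            return found  # stop at the first match
    return default


# Hypothetical nested data for illustration.
my_list = {"a": {"b": [{"XY01": {"value": 1, "sub": {"deep": 2}}}]}, "c": 3}

print(find_name(my_list, "XY01"))
```

As in the R version, a plain loop is used rather than a comprehension so the search can return as soon as something is found.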

Extracting values from complex and deeply nested list of dictionaries using python?

The main idea is to convert the dicts to a DataFrame, then convert the DataFrame back to a list of rows.

Code:

Step 1:

import pandas as pd

df = pd.json_normalize(complex_data)
df[2] = df[2].apply(lambda x: {k:v for k , v in dict(map(dict.popitem, x['B']))['C'].items() if k=='test456'})
df

#Output

                0               1                              2
0   {'A': 'test1'}  {'A': 'test2'}  {'test456': {'A': '111def'}}
1   {'A': 'test3'}  {'A': 'test4'}  {'test456': {'A': '999def'}}

Step 2:

desired_output = df.values.tolist()
desired_output

#output

[[{'A': 'test1'}, {'A': 'test2'}, {'test456': {'A': '111def'}}],
 [{'A': 'test3'}, {'A': 'test4'}, {'test456': {'A': '999def'}}]]

Update: you can guard against None or {} values using if ... else, as below:

df[2].apply(lambda x: {} if len(x['B'])==0 else({} if not x['B'][-1] else ({'test456':x['B'][-1]['C']['test456']} if 'test456' in  x['B'][-1]['C'].keys() else {})))
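
The nested conditional is hard to read on one line, so here is the same logic unrolled into a named function. This is a sketch: extract_test456 is a hypothetical name and the sample rows are invented to exercise each branch.

```python
def extract_test456(x):
    """Return {'test456': ...} from the last entry of x['B'], or {} if absent."""
    entries = x["B"]
    if not entries:  # 'B' is an empty list
        return {}
    last = entries[-1]
    if not last:  # last entry is None or {}
        return {}
    inner = last["C"]
    if "test456" in inner:
        return {"test456": inner["test456"]}
    return {}


# Hypothetical rows covering each branch.
rows = [
    {"B": [{"C": {"test123": {"A": "111abc"}}}, {"C": {"test456": {"A": "111def"}}}]},
    {"B": []},
    {"B": [{"C": {"test456": {"A": "1"}}}, None]},
    {"B": [{"C": {"other": 1}}]},
]

results = [extract_test456(row) for row in rows]
print(results)
```

It should drop in wherever the lambda is used, e.g. df[2].apply(extract_test456).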