Determine level of nesting in R?
A little recursive function can do this for you:
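depth <- function(this, thisdepth = 0) {
  if (!is.list(this)) {
    # a leaf: report how many lists were entered to reach it
    return(thisdepth)
  } else {
    # thisdepth + 1 also covers an empty list, where lapply() returns nothing
    return(max(thisdepth + 1, unlist(lapply(this, depth, thisdepth = thisdepth + 1))))
  }
}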
If you've got the testthat package, here's a test set:
l1=list(1,2,3)
l2=list(1,2,l1,4)
l3=list(1,l1,l2,5)
require(testthat)
expect_equal(depth(l1),1)
expect_equal(depth(l2),2)
expect_equal(depth(l3),3)
Apologies for using lower-case L in variable names. Readability fail.
How to run function on the deepest level only in a nested list?
We can recursively descend lst to find the maximum depth, and then use that to recursively descend again, applying unique only at the maximum depth. No packages are used.
maxDepth <- function(x, depth = 0) {
  if (is.list(x)) max(sapply(x, maxDepth, depth + 1))
  else depth
}
lstUnique <- function(x, depth = maxDepth(x)) {
  if (depth == 0) unique(x)
  else if (is.list(x)) lapply(x, lstUnique, depth - 1)
  else x
}
lstUnique(lst)
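The answer never shows lst itself, so here is a minimal sketch with a made-up input (the names a to e and the values are assumptions for illustration only). It repeats the two functions so it can be run on its own.

```r
maxDepth <- function(x, depth = 0) {
  if (is.list(x)) max(sapply(x, maxDepth, depth + 1))
  else depth
}
lstUnique <- function(x, depth = maxDepth(x)) {
  if (depth == 0) unique(x)
  else if (is.list(x)) lapply(x, lstUnique, depth - 1)
  else x
}

# Hypothetical input: leaves sit at depths 1, 2 and 3
lst <- list(a = c(1, 1, 2),
            b = list(c = c(3, 3, 4),
                     d = list(e = c(5, 5, 6))))

mx  <- maxDepth(lst)    # depth of the deepest leaf
res <- lstUnique(lst)   # unique() is applied to lst$b$d$e only
str(res)
```

Only the leaf e is deduplicated; a and c keep their repeated values because they are not at the maximum depth.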
Variation using rapply
A variation of the above is to recursively add a class to each leaf equal to its depth. Then we can use rapply three times. First, use rapply to extract the classes and take the maximum to find the maximum depth. Second, use rapply to apply unique on just the nodes having the maximum depth class. Third, remove any remaining classes that were not removed by unique because the node was not at maximum depth. (The third rapply, i.e. the last line of code below, could be omitted if it is ok to leave some leaves with the classes we added.)
addDepth <- function(x, depth = 0) {
  if (is.list(x)) lapply(x, addDepth, depth + 1)
  else structure(x, class = format(depth))
}
lst2 <- addDepth(lst)
mx <- max(as.numeric(rapply(lst2, class))) # max depth
lst3 <- rapply(lst2, unique, classes = format(mx), how = "replace")
rapply(lst3, as.vector, how = "replace")
Note on rapply
Note that if you instead wanted to run unique on all leaves, rather than just on the maximum-depth leaves, then rapply in base R would work.
rapply(lst, unique, how = "replace")
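As a quick illustration of the difference, here is a sketch on a made-up lst (the input is an assumption, it is not given in the answer). With how = "replace" the list structure and names are kept and every leaf is passed through unique.

```r
# Hypothetical input with repeated values at every depth
lst <- list(a = c(1, 1, 2),
            b = list(c = c(3, 3, 4),
                     d = list(e = c(5, 5, 6))))

# unique() is applied to every leaf, whatever its depth
all_unique <- rapply(lst, unique, how = "replace")
str(all_unique)
```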
data.tree
This alternative does require the use of a package. First we create a data.tree, dt, and then traverse it, applying unique to the nodes that satisfy the filterFun.
library(data.tree)
dt <- as.Node(lst)
dt$Do(function(x) x$"1" <- unique(x$"1"),
      filterFun = function(x) x$level == dt$height)
print(dt, "1")
rrapply
The rrapply package provides an enhancement to rapply which can also pass a position vector whose length equals the depth, so we can use it first to calculate the maximum depth mx and then again to apply unique only at that depth. (The rrapply call has been updated to use how = "unlist", as opposed to applying unlist afterwards, as per the suggestion in the comments.)
library(rrapply)
mx <- max(rrapply(lst, f = function(x, .xpos) length(.xpos), how = "unlist"))
uniq_mx <- function(x, .xpos) if (length(.xpos) == mx) unique(x) else x
rrapply(lst, is.numeric, uniq_mx)
Test whether one factor is nested in another
Suppose you have two factors f and g, and want to know whether g is nested in f.
Method 1: For people who love linear algebra
Consider the design matrix for two factors:
Xf <- model.matrix(~ f + 0)
Xg <- model.matrix(~ g + 0)
If g is nested in f, then the column space of Xf must be a subspace of the column space of Xg. In other words, for any linear combination of Xf's columns, y = Xf %*% bf, the equation Xg %*% bg = y can be solved exactly.
y <- Xf %*% rnorm(ncol(Xf)) ## some random linear combination on `Xf`'s columns
c(crossprod(round(.lm.fit(Xg, y)$residuals, 8))) ## least squares residuals
## if this is 0, you have nesting.
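To make the check concrete, here is a minimal sketch on made-up factors (f, g and h below are assumptions for illustration): g is nested in f by construction, while h crosses the levels of f.

```r
# Hypothetical data: each level of g occurs inside exactly one level of f
f <- factor(c("A", "A", "A", "B", "B", "B"))
g <- factor(c("a1", "a1", "a2", "b1", "b2", "b2"))
# h is crossed with f: both of its levels occur in A and in B
h <- factor(c("u", "v", "u", "v", "u", "v"))

Xf <- model.matrix(~ f + 0)
Xg <- model.matrix(~ g + 0)
Xh <- model.matrix(~ h + 0)

set.seed(1)
y <- Xf %*% rnorm(ncol(Xf))   # random linear combination of Xf's columns

# residual sum of squares after projecting y onto each column space
rss_nested  <- c(crossprod(round(.lm.fit(Xg, y)$residuals, 8)))
rss_crossed <- c(crossprod(round(.lm.fit(Xh, y)$residuals, 8)))
```

rss_nested comes out as 0 because y is constant within each level of g, whereas rss_crossed is positive because the columns of Xh cannot reproduce y.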
Method 2: For people who love statistics
We check contingency table:
M <- table(f, g)
If all columns have only one non-zero entry, you have g nested in f. In other words:
all(colSums(M > 0L) == 1L)
## `TRUE` if you have nesting
Comment: For any method, you can squeeze the code into one line easily.
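For instance, the contingency-table check could be wrapped as below. This is only a sketch: the function name is_nested_in and the example factors are made up for illustration.

```r
# TRUE if every level of g appears within exactly one level of f
is_nested_in <- function(g, f) all(colSums(table(f, g) > 0L) == 1L)

# Hypothetical data
f <- factor(c("A", "A", "A", "B", "B", "B"))
g <- factor(c("a1", "a1", "a2", "b1", "b2", "b2"))
h <- factor(c("u", "v", "u", "v", "u", "v"))

g_in_f <- is_nested_in(g, f)   # g is nested in f
h_in_f <- is_nested_in(h, f)   # h is crossed with f, not nested
f_in_g <- is_nested_in(f, g)   # nesting is not symmetric
```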
Check if a list is nested or not
You can use the is.list function:
any(sapply(x.1, is.list))
[1] FALSE
any(sapply(x.2, is.list))
[1] TRUE
As a function isNested:
isNested <- function(l) {
  stopifnot(is.list(l))
  for (i in l) {
    if (is.list(i)) return(TRUE)
  }
  return(FALSE)
}
Instead of testing all list elements, the function stops as soon as it detects a nested list.
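The lists x.1 and x.2 are not defined in the answer, so here is a short usage sketch with made-up values for them. It also shows one thing to keep in mind: a data frame is itself a list, so a list holding a data frame counts as nested under this test.

```r
isNested <- function(l) {
  stopifnot(is.list(l))
  for (i in l) {
    if (is.list(i)) return(TRUE)
  }
  return(FALSE)
}

# Hypothetical inputs
x.1 <- list(1, "a", TRUE)           # flat list
x.2 <- list(1, list(2, 3))          # contains a list
x.3 <- list(1, data.frame(a = 1))   # a data frame is also a list

flat   <- isNested(x.1)
nested <- isNested(x.2)
with_df <- isNested(x.3)

# A non-list argument is rejected by stopifnot()
not_a_list <- tryCatch(isNested(1:3), error = function(e) "error")
```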
`group_by` and keep grouping levels as nested data frame's name
You need to add setNames in the map step:
library(tidyverse)
warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = setNames(map(means,
                               ~emmeans::contrast(.x, "pairwise", infer = c(TRUE, TRUE))), .$tension))
If you want to name all the list outputs, use across:
warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise", infer = c(TRUE, TRUE))),
         across(models:p_cont, setNames, .$tension)) -> result
result$jt
#$L
# model term df1 df2 F.ratio p.value
# wool 1 Inf 5.653 0.0174
#$M
# model term df1 df2 F.ratio p.value
# wool 1 Inf 1.253 0.2630
#$H
# model term df1 df2 F.ratio p.value
# wool 1 Inf 2.321 0.1277
Finding first place in nested list where some member doesn't exist
We can loop through the nested list with lapply, subset the elements based on the logical condition built with %in%, and then use Filter to drop the empty elements of the list:
Filter(length, lapply(li, function(x) Filter(length,
  lapply(x, function(nm) nm[!"occupation" %in% names(nm)]))))
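Since li is not shown, here is a sketch on a made-up two-level list (the group and person names are assumptions). Entries that have an occupation member become empty and are filtered out, so only the entries missing it survive.

```r
# Hypothetical input: p2 is the only entry without "occupation"
li <- list(
  grp1 = list(p1 = list(name = "Ann", occupation = "dev"),
              p2 = list(name = "Bob")),
  grp2 = list(p3 = list(name = "Cy", occupation = "ops"))
)

res <- Filter(length, lapply(li, function(x) Filter(length,
  lapply(x, function(nm) nm[!"occupation" %in% names(nm)]))))

str(res)
```

Here grp2 disappears entirely because all of its entries have an occupation, and grp1 keeps only p2.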
Nested lists: how to define the size before entering data
You can do that using a recursive function.
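rec.list <- function(len){
  if(length(len) == 1){
    # innermost level: a list of `len` empty (NULL) slots
    vector("list", len)
  } else {
    # seq_len() also behaves correctly when len[1] is 0
    lapply(seq_len(len[1]), function(...) rec.list(len[-1]))
  }
}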
l <- rec.list(c(2, 3, 3, 4, 2, 3, 3))
Or perhaps with a 7-d list array? It might look bizarre at first, but it is a perfectly valid data structure.
l <- vector("list", 2*3*3*4*2*3*3)
dim(l) <- c(2, 3, 3, 4, 2, 3, 3)
l[[1,1,1,1,1,1,1]] <- "content"
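To compare the two layouts on something small enough to inspect, here is a sketch with three dimensions instead of seven (the sizes 2, 3, 4 are an arbitrary choice for illustration). The nested list is indexed one level at a time, while the list array takes all indices at once.

```r
rec.list <- function(len) {
  if (length(len) == 1) {
    vector("list", len)
  } else {
    lapply(seq_len(len[1]), function(...) rec.list(len[-1]))
  }
}

# Nested list: 2 lists of 3 lists of 4 empty slots
l <- rec.list(c(2, 3, 4))
sizes <- c(length(l), length(l[[1]]), length(l[[1]][[1]]))
l[[2]][[3]][[4]] <- "content"

# List array with the same shape
a <- vector("list", 2 * 3 * 4)
dim(a) <- c(2, 3, 4)
a[[2, 3, 4]] <- "content"
```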