How to Find All Possible Subsets of a Set Iteratively in R

How do I find all possible subsets of a set iteratively in R?

No need for loops (lapply or otherwise):

combn(1:4,2)
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 1 1 2 2 3
# [2,] 2 3 4 3 4 4

Example with calculating the sums of combinations:

combn(1:4,2,FUN=sum)
# [1] 3 4 5 5 6 7

An example with a user defined function:

x <- 11:14
combn(1:4,2,FUN=function(i,a) sum(a[i]),a=x)
#[1] 23 24 25 25 26 27

Here (in the anonymous function) i is the combination used as index and argument a is a vector to which I pass x.

And the same with a user-defined named function:

fun <- function(i,a) sum(a[i])
combn(1:4,2,FUN=fun,a=x)

finding all possible subsets of a dataframe

You can use split and paste like so:

split(z, paste(z$b, z$c, z$d))

But the tricky part of your question is how to programmatically combine the variables in columns 2:end without knowing beforehand the number of columns, their names or values. We can use a function like below to paste the values by row in columns 2:end

apply(df, 1, function(i) paste(i[-1], collapse=""))

Now combine with split

split(z, apply(z, 1, function(i) paste(i[-1], collapse="")))

How to get all subsets of a set? (powerset)

The Python itertools page has exactly a powerset recipe for this:

from itertools import chain, combinations

def powerset(iterable):
"powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

Output:

>>> list(powerset("abcd"))
[(), ('a',), ('b',), ('c',), ('d',), ('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'd'), ('b', 'c', 'd'), ('a', 'b', 'c', 'd')]

If you don't like that empty tuple at the beginning, you can just change the range statement to range(1, len(s)+1) to avoid a 0-length combination.

How to iterate column values to find out all possible combinations in R?

This won't (can't) be fast for many IDs. If it is too slow, you need to parallelize or implement it in a compiled language (e.g., using Rcpp).

We sort vals. We can then create all combination of two items grouped by ID. We exclude ID's with 1 item. Finally we tabulate the result.

library(data.table)
setDT(example)
setorder(example, id, vals)
example[, if (.N > 1) split(combn(vals, 2), 1:2), by = id][, .N, by = c("1", "2")]
# 1 2 N
# 1: a b 3
# 2: a c 2
# 3: a d 3
# 4: a e 1
# 5: b c 2
# 6: b d 2
# 7: b e 1
#<...>

All N Combinations of All Subsets

This is what I would do, with, e.g., s=1:2:

1) Represent subsets with a 0/1 matrix for each element's membership.

subsets = as.matrix(do.call(expand.grid,replicate(length(s),0:1,simplify=FALSE)))

which gives

     Var1 Var2
[1,] 0 0
[2,] 1 0
[3,] 0 1
[4,] 1 1

Here, the first row is the empty subset; the second, {1}; the third, {2}; and the fourth, {1,2}. To get the subset itself, use mysubset = s[subsets[row,]], where row is the row of the subset you want.

2) Represent pairs of subsets as pairs of rows of the matrix:

pairs <- expand.grid(Row1=1:nrow(subsets),Row2=1:nrow(subsets))

which gives

   Row1 Row2
1 1 1
2 2 1
3 3 1
4 4 1
5 1 2
6 2 2
7 3 2
8 4 2
9 1 3
10 2 3
11 3 3
12 4 3
13 1 4
14 2 4
15 3 4
16 4 4

Here, the fourteenth row corresponds to the second and fourth rows of subsets, so {1} & {1,2}. This assumes the order of the pair matters (which is implicit in taking the Cartesian product). To recover the subsets, use mypairosubsets=lapply(pairs[p,],function(r) s[subsets[r,]]) where p is the row of the pair you want.

Expanding beyond pairs to the P(s)^n case (where P(s) is the power set of s) would look like

setsosets = as.matrix(do.call(expand.grid,replicate(n,1:nrow(subsets),simplify=FALSE)))

Here, each row will have a vector of numbers. Each number corresponds to a row in the subsets matrix.


Making copies of the elements of s is probably not necessary for whatever you are doing after this. However, you could do it from here by using lapply(1:nrow(pairs),function(p)lapply(pairs[p,],function(r) s[subsets[r,]])), which starts like...

[[1]]
[[1]]$Row1
integer(0)

[[1]]$Row2
integer(0)

[[2]]
[[2]]$Row1
[1] 1

[[2]]$Row2
integer(0)

How to find all subsets of a set in JavaScript? (Powerset of array)

Here is one more very elegant solution with no loops or recursion, only using the map and reduce array native functions.

const getAllSubsets =       theArray => theArray.reduce(        (subsets, value) => subsets.concat(         subsets.map(set => [value,...set])        ),        [[]]      );
console.log(getAllSubsets([1,2,3]));

Finding all the subsets of a set

It's very simple to do this recursively. The basic idea is that for each element, the set of subsets can be divided equally into those that contain that element and those that don't, and those two sets are otherwise equal.

  • For n=1, the set of subsets is {{}, {1}}
  • For n>1, find the set of subsets of 1,...,n-1 and make two copies of it. For one of them, add n to each subset. Then take the union of the two copies.

Edit To make it crystal clear:

  • The set of subsets of {1} is {{}, {1}}
  • For {1, 2}, take {{}, {1}}, add 2 to each subset to get {{2}, {1, 2}} and take the union with {{}, {1}} to get {{}, {1}, {2}, {1, 2}}
  • Repeat till you reach n


Related Topics



Leave a reply



Submit