Intersect All Possible Combinations of List Elements

Intersect all possible combinations of list elements

combn works with list structures as well, you just need a little unlist'ing of the result to use intersect...

# Get the combinations of names of list elements
nms <- combn( names(l) , 2 , FUN = paste0 , collapse = "" , simplify = FALSE )

# Make the combinations of list elements
ll <- combn( l , 2 , simplify = FALSE )

# Intersect the list elements
out <- lapply( ll , function(x) length( intersect( x[[1]] , x[[2]] ) ) )

# Output with names
setNames( out , nms )
#$AB
#[1] 2

#$AC
#[1] 2

#$AD
#[1] 0

#$BC
#[1] 1

#$BD
#[1] 0

#$CD
#[1] 1

The intersection of all combinations of n lists

Map of intersections of intersections in Java 7

Three steps of processing the original lists: first collect a map of intersections for each element Map<Integer,Set<Integer>>, then for this map collect a map of intersections of intersections Map<Set<Integer>,Set<Integer>>, and then append the larger intersections sets to the smaller intersections sets if they intersect.

Try it online!

Original lists List<List<Integer>>:

List 0: [1, 2, 6, 5, 4, 3]
List 1: [3, 7, 2, 9, 5, 4]
List 2: [2, 6, 7, 1, 4]

1 - Map of intersections Map<Integer,Set<Integer>>:

Element: 1 is in lists: [0, 2]
Element: 2 is in lists: [0, 1, 2]
Element: 3 is in lists: [0, 1]
Element: 4 is in lists: [0, 1, 2]
Element: 5 is in lists: [0, 1]
Element: 6 is in lists: [0, 2]
Element: 7 is in lists: [1, 2]
Element: 9 is in lists: [1]

2 - Map of intersections of intersections Map<Set<Integer>,Set<Integer>>:

Lists: [0, 1, 2] contain elements: [2, 4]
Lists: [0, 1] contain elements: [3, 5]
Lists: [0, 2] contain elements: [1, 6]
Lists: [1, 2] contain elements: [7]

3 - Map of intersections of intersections after appending the larger intersections sets to the smaller intersections sets if they intersect:

Lists: [0, 1, 2] contain elements: [2, 4]
Lists: [0, 1] contain elements: [2, 3, 4, 5]
Lists: [0, 2] contain elements: [1, 2, 4, 6]
Lists: [1, 2] contain elements: [2, 4, 7]

Java 7 code:

List<List<Integer>> lists = Arrays.asList(
Arrays.asList(1, 2, 6, 5, 4, 3),
Arrays.asList(3, 7, 2, 9, 5, 4),
Arrays.asList(2, 6, 7, 1, 4));
// map of intersections:
// key - element of the list,
// value - set of indexes of lists,
// i.e. where this element occurs
Map<Integer, Set<Integer>> map1 = new TreeMap<>();
for (int i = 0; i < lists.size(); i++) {
// output the original list
System.out.println("List " + i + ": " + lists.get(i));
for (int element : lists.get(i)) {
// pull out the set of intersections
Set<Integer> set = map1.remove(element);
// create it if it doesn't exist
if (set == null) set = new TreeSet<>();
// add index of current list
set.add(i);
// put into the map
map1.put(element, set);
}
}
// intermediate output
for (Map.Entry<Integer, Set<Integer>> entry : map1.entrySet())
System.out.println("Element: " + entry.getKey()
+ " is in lists: " + entry.getValue());
// custom comparator for the map of intersections of intersections
Comparator<Set<Integer>> comparator = new Comparator<Set<Integer>>() {
@Override
public int compare(Set<Integer> o1, Set<Integer> o2) {
// group intersections that are equal
if (o1.containsAll(o2) && o2.containsAll(o1)) return 0;
// compare by number of intersections in reverse order
int val = Integer.compare(o2.size(), o1.size());
if (val != 0) return val;
// if sizes are equal compare hashCodes
return Integer.compare(o1.hashCode(), o2.hashCode());
}
};
// map of intersections of intersections:
// key - set of indexes of lists
// value - set of elements
TreeMap<Set<Integer>, Set<Integer>> map2 = new TreeMap<>(comparator);
for (Map.Entry<Integer, Set<Integer>> entry : map1.entrySet()) {
// set of intersecting elements
Set<Integer> key = entry.getValue();
// filter out unique elements
if (key.size() == 1) continue;
// pull out the set of intersecting elements
Set<Integer> value = map2.remove(key);
// create it if it doesn't exist
if (value == null) value = new TreeSet<>();
// add current element
value.add(entry.getKey());
// put into the map
map2.put(key, value);
}
// intermediate output
for (Map.Entry<Set<Integer>, Set<Integer>> entry : map2.entrySet())
System.out.println("Lists: " + entry.getKey()
+ " contain elements: " + entry.getValue());
// append the larger intersections sets to the
// smaller intersections sets if they intersect
for (Map.Entry<Set<Integer>, Set<Integer>> entry : map2.entrySet()) {
// for each entry process the values of other
// entries with less number of intersections
Map<Set<Integer>, Set<Integer>> tailMap = map2.tailMap(entry.getKey(), false);
for (Map.Entry<Set<Integer>, Set<Integer>> other : tailMap.entrySet())
// if the intersection set of the current entry contains
// all intersections from the set of another entry
if (entry.getKey().containsAll(other.getKey()))
// then add all intersecting elements of
// the current entry to another entry
other.getValue().addAll(entry.getValue());
}
// final output
for (Map.Entry<Set<Integer>, Set<Integer>> entry : map2.entrySet())
System.out.println("Lists: " + entry.getKey()
+ " contain elements: " + entry.getValue());

See also: The intersection of all combinations of n sets

The intersection of all combinations of n sets

Here is a solution inspired by MapReduce: Simplified Data Processing on Large Clusters, which can therefore be written in a distributed way if you want.

Map all of your elements in sets to pairs [element, set]. Group this list by element. You can just sort and pull out elements. Or you can create a hash of arrays whose keys are the elements and values are the list of sets that element is found in. Then for each element, emit a list of [combination of sets, element]. Group that by combination. And now for each non-empty combination, you can just read off the elements in it.

If you wish to distribute the computation with a real map-reduce, the first map would map to a key of element, and value of set. The first reduce would just store by element the list of sets it is in. The second map would emit for each element one key for each combination of sets it is in, with the element as the value. And the second reduce would store your final answer.

It may help to work your example in detail. You start with:

Set 1 = 1, 10, 6, 11, 14, 3
Set 2 = 3, 7, 11, 9, 5
Set 3 = 11, 6, 9, 1, 4

You map that to the list:

[1, Set 1], [10, Set 1], [6, Set 1], [11, Set 1], [14, Set 1], [3, Set 1],
[3, Set 2], [7, Set 2], [11, Set 2], [9, Set 2], [5, Set 2],
[11, Set 3], [6, Set 3], [9, Set 3], [1, Set 3], [4, Set 3],

Now group by element (I did it by hand by sorting) to get:

1: Set 1, Set 3
3: Set 1, Set 2
4: Set 3
5: Set 2
6: Set 1, Set 3
7: Set 2
9: Set 2, Set 3
10: Set 1
11: Set 1, Set 2, Set 3
14: Set 1

And now we do the second mapping (skipping the elements that are only in one set) to get:

[(Set 1, Set 3), 1],
[(Set 1, Set 2), 3],
[(Set 1, Set 3), 6],
[(Set 2, Set 3), 9],
[(Set 1, Set 2), 11],
[(Set 1, Set 3), 11],
[(Set 2, Set 3), 11],
[(Set 1, Set 2, Set 3), 11]

Group that by combination of sets and we get:

(Set 1, Set 2): 3, 11
(Set 1, Set 3): 1, 6, 11
(Set 2, Set 3): 9, 11
(Set 1, Set 2, Set 3): 11

Intersection of pairs of sets (any possible combination)

We can use combn to get the pairwise intersect between the elements, get the lengths of the list elements and find the sum

sum(lengths(combn(list(S1, S2, S3), 2,
FUN = function(x) Reduce(intersect, x), simplify = FALSE)))
#[1] 5

If there are many objects of the same pattern 'S' followed by some digits, use mget to get those all into a list instead of writing them manually

lst1 <- mget(ls(pattern = '^S\\d+$'))
sum(lengths(combn(lst1, 2,
FUN = function(x) Reduce(intersect, x), simplify = FALSE)))
#[1] 5

R: How to Intersect multiple vectors that gives all possible combination

I would suggest saving the sets in a list, then you could iterate over the elements of the list, e.g.:

sets2intersect <- list(set1, set2, set3,set4,set5)

lapply(unique(unlist(sets2intersect)), function(i){
which(sapply(sets2intersect, function(x) any(i == x)))
})

[1]]
[1] 1 2 4

[[2]]
[1] 1 2

[[3]]
[1] 1 4

[[4]]
[1] 1 3

[[5]]
[1] 2

[[6]]
[1] 3

[[7]]
[1] 5

If you want to rename your list, to know which element was used, you can do:

result<- lapply(unique(unlist(sets2intersect)), function(i){
which(sapply(sets2intersect, function(x) any(i == x)))
})
names(result) <- unique(unlist(sets2intersect))

$g1
[1] 1 2 4

$g2
[1] 1 2

$g3
[1] 1 4

$g4
[1] 1 3

$g8
[1] 2

$g17
[1] 3

$g5
[1] 5

What is the best way to find ALL possible intersections of multiple sets?

If you're only looking for intersections of two sets, you can simply do nested for loops:

Set1 = {1,2,3,4,5}
Set2 = {4,5,6,7}
Set3 = {6,7,8,9,10}
Set4 = {1,8,9,15}
sets = [Set1,Set2,Set3,Set4]
for i,s1 in enumerate(sets[:-1]):
for j,s2 in enumerate(sets[i+1:]):
print(f"Set{i+1} and Set{i+j+2} = {s1&s2}")

# Set1 and Set2 = {4, 5}
# Set1 and Set3 = set()
# Set1 and Set4 = {1}
# Set2 and Set3 = {6, 7}
# Set2 and Set4 = set()
# Set3 and Set4 = {8, 9}

If you're looking for intersections of any number of these sets then you can use combinations() from itertools to produce a power set of indices and perform the intersection for each combination:

from itertools import combinations
for comboSize in range(2,len(sets)):
for combo in combinations(range(len(sets)),comboSize):
intersection = sets[combo[0]]
for i in combo[1:]: intersection = intersection & sets[i]
print(" and ".join(f"Set{i+1}" for i in combo),"=",intersection)

Set1 and Set2 = {4, 5}
Set1 and Set3 = set()
Set1 and Set4 = {1}
Set2 and Set3 = {6, 7}
Set2 and Set4 = set()
Set3 and Set4 = {8, 9}
Set1 and Set2 = {4, 5}
Set1 and Set3 = set()
Set1 and Set4 = {1}
Set2 and Set3 = {6, 7}
Set2 and Set4 = set()
Set3 and Set4 = {8, 9}
Set1 and Set2 and Set3 = set()
Set1 and Set2 and Set4 = set()
Set1 and Set3 and Set4 = set()
Set2 and Set3 and Set4 = set()
  • A "power set" is when you take all possible combinations of various sizes from a set of values. itertools documentation has a recipe for that.
  • In this case we are only interested in combinations of 2,3,..., n-1 sizes. Hence the loop on comboSize in range(2,len(sets))
  • For each of these sizes, we get combination of indexes in the sets list using itertool's combinations function. e.g. for comboSize=3 and 4 items in sets, combo will get: (0, 1, 2) (0, 1, 3) (0, 2, 3) (1, 2, 3)
  • The intersection set will intersect all the sets at these indexes using the & operator (set intersection) starting with the first index (combo[0]) and intersecting the remaining indexes (combo[1:]) into a single set.
  • The print function joins the set identifiers (f"Set{i+1}") with an " and " string and prints the resulting intersection on the same line.
  • f"Set{i+1}" is a format string that replaces {i+1} with the set index from combo (+1).

Python 2.** Finding the union of all possible intersections of two-list combinations from a list of lists

A "more pythonic" solution:

import itertools
txArray=[[u'1'],[u'2'],[u'2', u'3']]
# generate all possible pairs from txArray, and intersect them
ix=[set(p[0]).intersection(p[1]) for p in itertools.combinations(txArray,2)]
# calculate the union of the list of sets
set.union(*ix)


Related Topics



Leave a reply



Submit