Set Partitions in Python

get all the partitions of the set python with itertools

I wrote this one for fun:

def partition(a_list):
    yield [[x] for x in a_list]   
    for i in range(1, len(a_list) + 1):
        _l = a_list[:]
        yield [_l.pop(i-1), _l]
    yield a_list

my_list = [1, 2, 3]
print list(partition(my_list))

#or

for p in partition(my_list):
    print p

Find all unordered partitions of a list with a minimum number of elements

Here is one long-ish solution. No rejection is used in generating partitions, so in that sense this may be somewhat efficient. Still, there are lots of things to optimize.

Example:

list(get_partitions(range(3), 1))
# [[[0, 1, 2]], [[0], [1, 2]], [[1], [0, 2]], [[2], [0, 1]], [[0], [1], [2]]]

Here is an outline of how this works:

First, it is useful to define a helper split function that takes a list lst and an integer n and return all ways to split the list into two groups, one of size n and another of size len(lst) - n.
Second, we need to solve an easier version of the problem: how to split a list lst into n groups each of size k. Of course, this is only possible when len(lst) = n * k. This is implemented in get_partitions_same_size function. The idea is to always include the first element of lst in the first group and recurse.
Third, we need to find all integer partitions of len(lst). I copied code from this thread.
Fourth, for each integer partition p of len(lst), we need to find all ways to partition lst according to p.
- For example, say len(lst) == 7 and p = 3 + 2 + 2. In this case, we can choose any three elements for the first group, any remaining two for the second group, and there is no choice to be made for the final third group.
- This can introduce redundancy as we can get two partitions that only differ in the order of the last two groups.
- To deal with this issue, we can represent a partition by counting the number of groups of each size. In this example, p corresponds to p_scheme = [(3, 1), (2, 2)]. The function get_partitions_helper takes in a list lst and a "partition scheme" p_scheme, and returns all corresponding partitions without double-counting. This is where get_partitions_same_size from step two is used.
Finally, everything comes together in get_partitions: we loop over integer partitions of len(lst) and return all possible list partitions corresponding to each possible integer partition.

This is a fun problem and comments on bugs and optimizations are very welcome.

from itertools import combinations
from collections import Counter

# from this thread:
# https://stackoverflow.com/questions/10035752/elegant-python-code-for-integer-partitioning
def partitions(n, I=1):
    yield (n,)
    for i in range(I, n//2 + 1):
        for p in partitions(n-i, i):
            yield (i,) + p

def split(lst, n):
  '''
  return all ways to split lst into two groups, 
  with n and len(lst) - n elements respectively
  '''
  assert len(lst) >= n
  # handle special case of len(lst) == 2 * n
  if len(lst) == 2 * n:
    for first, second in split(lst[1:], n-1):
      yield [lst[0], *first], second
  else:
    for comb in combinations(range(len(lst)), n):
      comb = set(comb)
      first = [x for i, x in enumerate(lst) if i in comb]
      second = [x for i, x in enumerate(lst) if i not in comb]
      yield first, second

def get_partitions_same_size(lst, n, k):
  # print(lst, n, k)
  'return all ways to partition lst into n parts each of size k up to order'
  if len(lst) != n * k:
    print(lst, n, k)
  assert len(lst) == n * k
  if n == 0 and len(lst) == 0:
    yield []
  # case when group size is one
  elif k == 1:
    yield [[x] for x in lst]
  # otherwise, without loss, the first element of lst goes into the first group
  else:
    for first, rest in split(lst[1:], k-1):
      for rec_call in get_partitions_same_size(rest, n-1, k):
        yield [[lst[0], *first], *rec_call]

def get_partitions_helper(lst, p_scheme):
  """
  return all ways to partition lst into groups according to a partition scheme p_scheme

  p_scheme describes an integer partition of len(lst)
  for example, if len(lst) == 5, then possible integer partitions are:
  [(5,), (1, 4), (1, 1, 3), (1, 1, 1, 2), (1, 1, 1, 1, 1), (1, 2, 2), (2, 3)]
  for each, we count the number of groups of a given size
  the corresponding partition schemes are:
  [[(5, 1)],
  [(1, 1), (4, 1)],
  [(1, 2), (3, 1)],
  [(1, 3), (2, 1)],
  [(1, 5)],
  [(1, 1), (2, 2)],
  [(2, 1), (3, 1)]]
  """
  if not lst and not p_scheme:
    yield []
    return 
  assert len(lst) == sum(a * b for a, b in p_scheme)
  group_size, group_count = p_scheme[0]
  num_elts = group_size * group_count
  for first, second in split(lst, num_elts):
    for firsts in get_partitions_same_size(first, group_count, group_size):
      for seconds in get_partitions_helper(second, p_scheme[1:]):
        yield [*firsts, *seconds]

def get_partitions(lst, min_):
  """
  get all partitions of lst into groups s.t. each group has at least min_ elements
  up to order (of groups and elements within a group)
  """
  for partition in partitions(len(lst), min_):
    p_scheme = list(Counter(partition).items())
    yield from get_partitions_helper(lst, p_scheme)

Is there a fast algorithm for generating all partitions of a set into subsets of size 2 (and one subset of size 1)?

Partition should be viewed as a set and two partition differing only by order should be considered as the same one. So there are only 3 partitions of number set (1,2,3,4).

the number of partitions should be N!/(N/2)!/2^(N/2). Using Stirling's formula, it is approx. Sqrt(2)*(N/e)^(N/2) where e=2.71828... and very huge.

I leveraged @VirtualScooter's code and provide the recursive version of Partition, which runs faster than his itertools version (note this is not an apple-apple comparison because my Partition has no repeats).


import itertools
import timeit
t3 = (1, 2, 3)
t4 = (1, 2, 3, 4)
t6 = (1, 2, 3, 4, 5, 6)

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks.
        Code from Python itertools page
    """
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

def partitionIntoPairs(collection):
    perms = itertools.permutations(collection)
    for p in perms:
        group = list(grouper(p, 2))
        if group[-1][-1] is None:
            group[-1] = (group[-1][0],)
        yield group

def Partition(Indexable):
    if len(Indexable)<=2:
        yield [Indexable]
    elif len(Indexable)%2==1:
        for i,x in enumerate(Indexable):
            for s in Partition(Indexable[:i]+Indexable[i+1:]):
                yield [[x]]+s
    else:
        for i,x in enumerate(Indexable):
            if i==0:
                x0=x
            else:
                for s in Partition(Indexable[1:i]+Indexable[i+1:]):
                    yield [[x0,x]]+s
def comp_timeit(collection, repeats=1_000):
    s1 = f"l1 = list(Partition({collection}))"
    s2 = f"l1 = list(partitionIntoPairs({collection}))"
    t1 = timeit.timeit(s1, globals=globals(),number=repeats)
    t2 = timeit.timeit(s2, globals=globals(),number=repeats)
    print(f"partition, {repeats:_} runs: {t1:,.4f}")
    print(f"itertools, {repeats:_} runs: {t2:,.4f}")
for p in Partition(t4):
    print(p)
comp_timeit(t3)
comp_timeit(t4)
comp_timeit(t6)

Speed up set partition generation by skipping ones with subsets smaller or larger than X, or partitions with more or less than Y subsets

Just use recursion.

Note, the arrays returned each time are modified in place for efficiency. If you're going to do anything interesting with them, you probably want to make your own copies.

def split_partition (elements, min_size, max_size, partition=None, rest=None, i = 0):
    if partition is None:
        partition = []
    if rest is None:
        rest = []

    # The first element always goes in.
    if i == 0:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
    elif len(partition) + len(elements) - i < min_size:
        pass # Too few solutions.
    elif max_size == len(partition):
        for j in range(i, len(elements)):
            rest.append(elements[j])
        yield (partition, rest) # Just the one solution.
        for j in range(i, len(elements)):
            rest.pop()
    elif i == len(elements):
        yield (partition, rest) # Just the one solution.
    else:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.append(partition.pop())
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.pop()

def desired_partitions (elements, min_size, max_size, min_count, max_count):
    if len(elements) < min_size * min_count:
        pass # No solutions possible.
    elif max_size * max_count < len(elements):
        pass # No solutions possible.
    else:
        for partition, rest in split_partition(elements, min_size, max_size):
            if 0 == len(rest):
                yield [partition]
            else:
                for split in desired_partitions(rest, min_size, max_size, min_count - 1, max_count - 1):
                    split.append(partition)
                    yield split
                    split.pop()

for split in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3):
    print(split)

Set partitions into k groups (including the NULL set) in Python

Here is a solution in Sagemath, using NumPy arrays and itertools. The idea is same as in your code: create OrderedSetPartitions and beef them up with empty sets. To do this without too many loops, NumPy arrays are used: the key part is partitions[:, s] = P where certain columns of a 2D array partitions, initially filled with empty sets, are replaced by nonempty sets coming from OrderedSetPartitions.

import numpy as np
from itertools import combinations
A = Set([1, 2, 3, 4, 5])        # Sage set, not Python set
k = 3                           # number of elements in partition
all_partitions = np.array(OrderedSetPartitions(A, k).list())
for i in range(k-1, 0, -1):
    P = np.array(OrderedSetPartitions(A, i).list()) if i > 1 else [[A]]
    for s in combinations(range(k), i):
        partitions = np.empty((len(P), k), dtype=object)
        partitions[:, :] = [[Set()]]
        partitions[:, s] = P
        all_partitions = np.vstack((all_partitions, partitions))
print all_partitions

The output is a double NumPy array. You can return all_partitions.tolist() if a Python list is desired.

Technicalities

Sage sets (created with Set([1,2,3])) and Python sets (created with set([1,2,3]) or {1,2,3,4,5}) are objects of different classes. Within Sagemath, the output looks better for Sage sets: they are shown as {1,2,3} while Python sets are displayed as set([1,2,3]). For this reason, Sage sets are to be preferred within Sagemath. Also, OrderedSetPartitions returns Sage sets.

But it takes a bit more effort to get NumPy to play along with Sage sets: in particular, I couldn't get np.full to accept empty Sage set Set() as a filling object. This is the reason for using np.empty and then filling it in.

A similar issue is responsible for the case i == 1 being treated differently: NumPy tries to cast [[Set([1,2,3,4,5])]] to a three-dimensional array of numbers instead of a two-dimensional array containing one Sage set object.

Set ordered partitions in Python of maximum size n

As mentioned in the comments, with n distinct elements and k blocks or slices, it is sufficient to generate all choices of putting k-1 separators in n-1 possible positions:

from itertools import combinations

def segmentations(a, k):
    n = len(a)
    assert 1 <= k <= n, (n, k)

    def split_at(js):
        i = 0

        for j in js:
            yield a[i:j]
            i = j

        yield a[i:]

    for separations in combinations(range(1, n), k - 1):
        yield list(split_at(separations))

This generates divisions into exactly k parts, and it is trivial to modify it to generate up to k parts. It also makes it clear that there are indeed C(n-1, k-1) elements in the result for exactly k parts. Now, C(1000, 8) is 24,115,080,524,699,431,125. Probably better to try a different approach altogether?

Algorithm to produce all partitions of a list in order

You can think of the problem as follows: each of the partitions you want are characterized by a integer between 0 and 2^(n-1). Each 1 in the binary representation of such a number corresponds to a "partition break" between two consecutive numbers, e.g.

 a b|c|d e|f
  0 1 1 0 1

so the number 01101 corresponds to the partition {a,b},{c},{d,e},{f}. To generate the partition from a known parition number, loop through the list and slice off a new subset whenever the corresponding bit it set.

I can understand your pain reading the fashionable functional-programming-flavored Ruby example. Here's a complete example in Python if that helps.

array = ['a', 'b', 'c', 'd', 'e']
n = len(array)

for partition_index in range(2 ** (n-1)):

    # current partition, e.g., [['a', 'b'], ['c', 'd', 'e']]
    partition = []

    # used to accumulate the subsets, e.g., ['a', 'b']
    subset = []

    for position in range(n):

        subset.append(array[position])

        # check whether to "break off" a new subset
        if 1 << position & partition_index or position == n-1:
            partition.append(subset)
            subset = []

    print partition