Intelligently Generating Combinations of Combinations

intelligently generating combinations of combinations

Well, there's (30_C5*25_C5*20_C5*15_C5*10_C5*5_C5)/6! = 30!/(6!*5!⁶) = 123,378,675,083,039,376 different partitons of 30 into groups of 5, so generating them all will take some time, no matter what method you use.

In general, though, a good method to selecting such a partition is to use some ordering on the elements, and find the grouping for the highest ungrouped element, and then group the rest.

     find_partition = lambda do |elts|
        if elts.empty?
         [[]]
        else
         highest = elts.pop
         elts.combination(4).map do |others|
            find_partition[elts - others].map { |part| part << [highest,*others] }
         end.inject(:+)
        end
     end
     find_partition[(1..30).to_a]

This way you're only generating each partition once

Algorithm to generate every combination of n elements split into k sets

I don't know of any algorithm names for this (though one probably exists), but the approach I mentioned in the comments avoids dealing with duplicates and is as efficient as I imagine you can get.

It seems like you could get some improvement by turning the problem on its head: Each letter has to go into one of four buckets, and the buckets have limited space, so recursively try putting each letter into each bucket that has room for it. This way you're only producing combinations rather than permutations.

Here's a C# implementation. It can generate 10,000,000 combinations in under 30 seconds, and 2/3 of that time is spent just in building the string outputs:

void Main()
{
    // Tweak these starting values to create smaller subsets if you want.
    var letters = Enumerable.Range(0, 26).Select(i => (char)('a' + i)).ToList();
    var buckets = new[]{new Bucket(6), new Bucket(6), new Bucket(7), new Bucket(7)};
    // I'm only taking 100 values because otherwise this would take a really long time.
    var combos = Combos(letters, 0, buckets).Take(100);
    foreach (var combo in combos)
    {
        Console.WriteLine(combo);
    }
}

public class Bucket : List<char>
{
    public int MaxLoad {get; private set;}
    public Bucket(int capacity) : base(capacity)
    {
        MaxLoad = capacity;
    }
}

// Define other methods and classes here
IEnumerable<string> Combos(IList<char> letters, int currentIndex, Bucket[] buckets)
{
    if(currentIndex == letters.Count){
        yield return string.Join("|", buckets.Select(b => string.Join(",", b)));
        yield break;
    }
    var currentLetter = letters[currentIndex];
    foreach (var bucket in buckets)
    {
        if(bucket.Count < bucket.Capacity)
        {
            bucket.Add(currentLetter);
            foreach (var possibility in Combos(letters, currentIndex + 1, buckets))
            {
                yield return possibility;
            }
            bucket.Remove(currentLetter);
        }
    }
}

Sample output:

a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,s|t,u,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,t|s,u,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,u|s,t,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,v|s,t,u,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,w|s,t,u,v,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,x|s,t,u,v,w,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,y|s,t,u,v,w,x,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,r,z|s,t,u,v,w,x,y
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,t|r,u,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,u|r,t,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,v|r,t,u,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,w|r,t,u,v,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,x|r,t,u,v,w,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,y|r,t,u,v,w,x,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,s,z|r,t,u,v,w,x,y
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,u|r,s,v,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,v|r,s,u,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,w|r,s,u,v,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,x|r,s,u,v,w,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,y|r,s,u,v,w,x,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,t,z|r,s,u,v,w,x,y
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,u,v|r,s,t,w,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,u,w|r,s,t,v,x,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,u,x|r,s,t,v,w,y,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,u,y|r,s,t,v,w,x,z
a,b,c,d,e,f|g,h,i,j,k,l|m,n,o,p,q,u,z|r,s,t,v,w,x,y
...

One nice feature of the approach I've given is that you can process the results as they are produced--you don't have to wait for the whole list to be generated, and you don't need to have all the combinations in memory at the same time.

But be aware that you're going to end up with a lot of combinations--quite possibly more than the computer can generate in any reasonable amount of time regardless of algorithmic efficiency. If Vincent's estimate of 10^12 is correct, for example, it would take roughly a year using the code above. You might be able to optimize it down to a month or so. Parallelization might take it down to a week on a really strong computer.

Generating combinations of edges under a certain condition

EDIT: updated to acommodate the requirements of "only two start/end edges"

I'm not sure what interface you have in mind, but from what I understood it looks like you can use filter() to select a subset of edges that "start in 0" or "end in n".

>>> edges = [(0,1), (0,5), (0,2), (5,3), (2,9), (4,6), (6,9), (3,9), (0,9)]
>>> edges_start = filter(lambda e: e[0] == 0, edges)
>>> edges_end = filter(lambda e: e[1] == 9, edges)
>>> edges_end
[(2, 9), (6, 9), (3, 9), (0, 9)]
>>> edges_start
[(0, 1), (0, 5), (0, 2), (0, 9)]

Now you can use itertools.combinations() to generate all possible pairs from each of the lists. Here's an example:

>>> import itertools
>>> list(itertools.combinations(edges_start, 2))
[((0, 1), (0, 5)), ((0, 1), (0, 2)), ((0, 1), (0, 9)), ((0, 5), (0, 2)), ((0, 5), (0, 9)), ((0, 2), (0, 9))]

Now you can plug in itertools.product() to generate all combinations of "pair from one list" and "pair from other list":

>>> edges_start_pairs = list(itertools.combinations(edges_start, 2))
>>> edges_end_pairs = list(itertools.combinations(edges_end, 2))
>>> pairs = list(itertools.product(edges_start_pairs, edges_end_pairs))

That's it! You can "flatten" the data structure if you like, but this is optional:

>>> flat_pairs = [list(p[0]+p[1]) for p in pairs]

Now let's pretty-print the results:

>>> from pprint import pprint
>>> pprint(flat_pairs)
[[(0, 1), (0, 5), (2, 9), (6, 9)],
 [(0, 1), (0, 5), (2, 9), (3, 9)],
 [(0, 1), (0, 5), (2, 9), (0, 9)],
 [(0, 1), (0, 5), (6, 9), (3, 9)],
 [(0, 1), (0, 5), (6, 9), (0, 9)],
 [(0, 1), (0, 5), (3, 9), (0, 9)],
 [(0, 1), (0, 2), (2, 9), (6, 9)],
 [(0, 1), (0, 2), (2, 9), (3, 9)],
 [(0, 1), (0, 2), (2, 9), (0, 9)],
 [(0, 1), (0, 2), (6, 9), (3, 9)],
 [(0, 1), (0, 2), (6, 9), (0, 9)],
 [(0, 1), (0, 2), (3, 9), (0, 9)],
 [(0, 1), (0, 9), (2, 9), (6, 9)],
 [(0, 1), (0, 9), (2, 9), (3, 9)],
 [(0, 1), (0, 9), (2, 9), (0, 9)],
 [(0, 1), (0, 9), (6, 9), (3, 9)],
 [(0, 1), (0, 9), (6, 9), (0, 9)],
 [(0, 1), (0, 9), (3, 9), (0, 9)],
 [(0, 5), (0, 2), (2, 9), (6, 9)],
 [(0, 5), (0, 2), (2, 9), (3, 9)],
 [(0, 5), (0, 2), (2, 9), (0, 9)],
 [(0, 5), (0, 2), (6, 9), (3, 9)],
 [(0, 5), (0, 2), (6, 9), (0, 9)],
 [(0, 5), (0, 2), (3, 9), (0, 9)],
 [(0, 5), (0, 9), (2, 9), (6, 9)],
 [(0, 5), (0, 9), (2, 9), (3, 9)],
 [(0, 5), (0, 9), (2, 9), (0, 9)],
 [(0, 5), (0, 9), (6, 9), (3, 9)],
 [(0, 5), (0, 9), (6, 9), (0, 9)],
 [(0, 5), (0, 9), (3, 9), (0, 9)],
 [(0, 2), (0, 9), (2, 9), (6, 9)],
 [(0, 2), (0, 9), (2, 9), (3, 9)],
 [(0, 2), (0, 9), (2, 9), (0, 9)],
 [(0, 2), (0, 9), (6, 9), (3, 9)],
 [(0, 2), (0, 9), (6, 9), (0, 9)],
 [(0, 2), (0, 9), (3, 9), (0, 9)]]

Generating All Combinations of List n Levels Deep in Java

I believe I made what you are looking for. The code is split up into four separate methods(You didn't make any distinction if it had to be only 1):

public static <T extends Comparable<? super T>> List<List<List<T>>> level(List<List<T>> items, int level) {
    List<List<List<T>>> result = new ArrayList<>();
    if(level == 1) {
        for(List<T> item : items) {
            result.add(Collections.singletonList(item));
        }
        return result;
    }

    for(int i = 0; i < level; i++) {
        if(i == 0) {
            for(List<T> item : items)
                result.add(Collections.singletonList(item));
            continue;
        }

        List<List<List<T>>> newResult = new ArrayList<>();
        for(List<List<T>> item : result) {
            List<List<List<T>>> combined = new ArrayList<>();
            List<T> first = item.get(0);
            for(int j = 0; j < items.size(); j++) {
                List<List<T>> current = new ArrayList<>();
                List<T> it = items.get(j);
                current.addAll(item);
                current.add(it);

                combined.add(current);
            }

            newResult.addAll(combined);
        }

        result = newResult;
    }

    clean(result);
    return result;
}

This is what does the majority of the algorithm. First, like in the function you provided, it checks to see if the level is 1, in which case you can just return a list of list, like in the given method. After that, we loop over however many levels we have. We first check to see if the current level is 1, in which case it will just do the same thing as if the method were called with a level of 1. Next we create a new list called newResult. This variable is basically a temporary value of result to prevent a concurrent modification exception. Then we loop through every value already in result. We create a few new values based on whatever value we have, and add them to combined. We then add combined into newResult, and the algorithm is basically over. Now, those loops don't count for when all of the values are the same, for example, [1, 2], [1, 2], [1, 2], [1, 2]. So we call the clean method to remove those cases.

public static <T extends Comparable<? super T>> void clean(List<List<List<T>>> list) {
    List<List<List<T>>> removals = new ArrayList<>();
    for(List<List<T>> item : list) {
        if(!check(item))
            removals.add(item);
    }
    for(List<List<T>> item : removals) {
        list.remove(item);
    }
}

This method runs through everything inside of the given list, and runs the check method on the element. That method is explained further down. If the element is not valid, it is marked for removal, and is removed in the next loop.

public static <T extends Comparable<? super T>> boolean check(List<List<T>> list) {
    if(list.size() < 2) return true;

    for(int i = 1; i < list.size(); i++) {
        List<T> previous = list.get(i-1);
        List<T> item = list.get(i);

        if(notEqual(previous, item)){
            return true;
        }
    }

    return false;
}

This loop checks to see if a given list is valid, by comparing one list against another, until it finds two that aren't the same. When that happens, the list is valid, and returns true. If not, it will never return, will break out of the loop, and return false.

public static <T extends Comparable<? super T>> boolean notEqual(List<T> a, List<T> b) {
    for(int i = 0; i < Math.min(a.size(), b.size()); i++) {
        T ao = a.get(i);
        T bo = b.get(i);

        if(ao.compareTo(bo) != 0)
            return true;
    }

    return false;
}

This method takes two input lists, and checks to see if the elements inside of them aren't equal. It loops over both lists, getting the elements at the same index, and compares them to one another. If they are not equal, it returns true, and otherwise, it finishes the loop and returns false.

Please note that this is simply a proof of concept and not a finished version. A lot of aspects of it could definitely be improved upon, but this is a working version of what you asked for.

Link to working version on jDoodle

If you have any questions about any aspect of it, or want anything clarified, don't hesitate to ask!

Edit:
I have revised the algorithm to incorporate what you have asked for. Here is the new code:

public static <T extends Comparable<? super T>> List<List<List<T>>> level(List<List<T>> items, int minLevel, int maxLevel) {
    List<List<List<T>>> result = new ArrayList<>();

    for(int i = minLevel; i < maxLevel+1; i++) {
        result.addAll(level(items, i));
    }

    return result;
}

This is the overloaded method that will allow you to specify the range of levels you want. Given a minimum and maximum level, it will return a new list containing all of the levels within that range, inclusive. As you said, it is relatively trivial, as a simple loop.

public static <T extends Comparable<? super T>> List<List<List<T>>> level(List<List<T>> items, int level) {
    List<List<List<T>>> result = new ArrayList<>();
    if(level == 1) {
        for(List<T> item : items) {
            result.add(Collections.singletonList(item));
        }
        return result;
    }

    for(int i = 0; i < level; i++) {
        if(i == 0) {
            for(List<T> item : items)
                result.add(Collections.singletonList(item));
            continue;
        }

        List<List<List<T>>> newResult = new ArrayList<>();
        for(List<List<T>> item : result) {
            if(item.size() < i)
                continue;

            List<List<List<T>>> combined = new ArrayList<>();
            List<T> first = item.get(0);
            for(int j = 0; j < items.size(); j++) {
                List<List<T>> current = new ArrayList<>();
                List<T> it = items.get(j);
                current.addAll(item);
                current.add(it);

                combined.add(current);
            }

            newResult.addAll(combined);
        }

        result = newResult;
    }

    List<List<List<T>>> removals = new ArrayList<>();
    for(List<List<T>> item : result) {
        if(!check(item))
            removals.add(item);
    }
    for(List<List<T>> item : removals) {
        result.remove(item);
    }

    return result;
}

Here is the revised method. I removed the clean method, and just put it inside of the level method, as it was only called once. I don't think it is really possible, at least with the current code to run that clean method during the algorithm, because at this point in time, the way it works is it generates all possible combinations for a given level, then goes to the next one. If combinations that are the same were removed, on the next level, those combinations wouldn't be added.

Here is an example:
Say I have [1, 2], [1, 3], [2, 3]. If I went to level two, I would have the combinations specified in your question. Pretty obvious right? Well, if I then went on to level 3, only using the results from level 2, I would miss out on all combinations containing [1, 2], [1, 2] [...], since that is not in the given list. This is an issue with the algorithm, and could definitely be improved upon.

I plan on further refactoring this, to make it check inside the algorithm, but it might take me a hot minute to do that.

New working version in jDoodle

Edit 2:
Incorporating the clean method inside of the algorithm was actually much simpler than I initially thought. Here is the new code with a couple of comments:

public static <T extends Comparable<? super T>> List<List<List<T>>> level(List<List<T>> items, int level) {
    List<List<List<T>>> result = new ArrayList<>();
    for(int i = 0; i < level; i++) {
        if(i == 0) { // If level is 0, we can just add the items as singleton lists to the result
            for(List<T> item : items)
                result.add(Collections.singletonList(item));
            continue;
        }

        List<List<List<T>>> newResult = new ArrayList<>(); // Temporary items that will be added
        for(List<List<T>> item : result) {
            if(item.size() < i) // Make sure we are manipulating items that are on the previous level
                continue;

            List<List<List<T>>> combined = new ArrayList<>(); // The temporary values for this specific item
            for(int j = 0; j < items.size(); j++) {
                List<List<T>> current = new ArrayList<>(); // The current list with the value
                current.addAll(item); // Add the current items from result to the list
                current.add(items.get(j)); // Add the current item from items to the list

                if (i == level-1 && !check(current)) { // If this is the last level, and the current list shouldn't be added, skip adding
                    continue;
                }

                combined.add(current); // Add the current list to the combined values
            }

            newResult.addAll(combined); // Add all of the lists in combined to the new result
        }

        result = newResult; // Make result equal to the new result
    }

    return result;
}

Now what it does is, when adding a new combination to the list, it first checks if the current level is the final one. If so, it actually checks the list, and if it is not valid, it skips actually adding it.

I again plan on completely rewriting the algorithm in a much more intelligent format, but this code completely works right now.

Working version on jDoodle

Algorithm to match most possible combinations

Your algorithm is generally right way to go, but there is one improvement you can do. Since you only care about make exact matches of a given party size, I would map each party to a list that contains all parties of a given size:
IE: {1: [parties of size 1], 2: [parties of size 2], ...}

Then take a look at this:

https://en.wikipedia.org/wiki/Partition_%28number_theory%29

The tricky bit is we want to match things as ideally as possible. Basically we want to start with the least number of parties to most number of parties that need to be combined. IE for party size of 8: 8, 7 + 1, 6 + 2, 5 + 3 and so on. Once those have all been matched then we look at the ones that require combining 3 parties (always in order of most to least): 6 + 1 + 1, 5 + 2 + 1, 4 + 2 + 2... then the ones with 4 parts, then 5 parts, 6 parts, 7 parts, and lastly 8 parts (all ones).

Looks like there are 22 partitions of 8, you could likely just hard code them and loop over them. Depending on your max number of parties, you could build a partitions table for all number of parties you need.

This is roughly how that algorithm would work on your example list:

1, 6, 3, 8, 2, 1, 5, 3, 2

{party size: number of parties remaining}
{8: 1, 6: 1, 5: 1, 3: 2, 2: 2, 1: 1}
{6: 1, 5: 1, 3: 2, 2: 2, 1: 1} => (8)
{5: 1, 3: 2, 2: 1, 1: 1} => (8), (6,2)
{3: 1, 2: 1, 1: 1} => (8), (6,2), (5,3)
{3: 1, 2: 1, 1: 1}

If you keep an eye on the total of the remaining parties you would stop checking there are 3 + 2 + 1 = 6 < 8, so you can't create any additional valid parties. I believe this creates the idea number of parties.

Example where you aimed to have party of 7:

2,2,4,3,1,5
{5: 1, 4: 1, 3: 1, 2: 2}
{4: 1, 3: 1, 2: 1} => (5,2),
{2: 1} => (5,2),(4,3)

Again at this point, it's impossible to make a party of 7.

As for generating partitions, this seems solid:

https://www.geeksforgeeks.org/generate-unique-partitions-of-an-integer/

Whether you need to optimize or not depends on max party size. 16 has 231 partitions, but 32 has 8349 still not bad, but 64 has 1741630, that could be bad unless you have a lot of parties.
http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Partitions/partitions.html#pcalc1

Edit:
Only problem here is that small parties may be disadvantaged. In this case, you may reverse the search order, so it starts looking at parties of min size (all ones) instead of min size (full party). I would probably reverse the search order every say 3-10 party searches. Depending on how often you do it.

You might also want to try doing both directions, then just picking the one with the better results.

In reverse order:

1, 6, 3, 8, 2, 1, 5, 3, 2
{8: 1, 6: 1, 5: 1, 3: 2, 2: 2, 1: 1}
{8: 1, 6: 1, 5: 1, 3: 1} => (1,2,2,3)
{6: 1} => (1,2,2,3),(3,5),(8)

While in this case, both ways created 3 full groups, I doubt this will always be the case. However, notice this approach did reduce it down to only 1 party of 6, instead of a party of 3, 2 and 1.

This approach basically seeks to reduce the number of parties as much as possible. The other approach aims to maximize the number of full groups. Both have their uses and recommend you use both, just a question of how often you use one vs the other.

Hmmm a 3rd option might to attack it from both sides, though in this case, you have other issues, with it being biased against those in the middle.

1, 6, 3, 8, 2, 1, 5, 3, 2
{8: 1, 6: 1, 5: 1, 3: 2, 2: 2, 1: 1}
{6: 1, 5: 1, 3: 2, 2: 2, 1: 1} => (8)
{6: 1, 5: 1, 3: 1} => (8),(1,2,2,3)
{6: 1} => (8),(1,2,2,3),(5,3)

Really, you could randomly shuffle all of the partitions and run the algorithm that way too if you want to make things more interesting, though doubt that would yield above average results. Main point is that integer partitions of N are what you need to be looking at.

Mathematica enumerate combinations

if you need a "formula" for the 'nth' tuple it looks like this:

{ Floor[(# - 1)/16      ] + 1,
  Floor[Mod[# - 1, 16]/4] + 1 , 
  Floor[Mod[# - 1, 4]   ] + 1 } & /@ Range[64] ==
  Tuples[Range[4], 3]

True

so then if you want say the 12'th combination of your sets you could do something like this:

({ 
     Floor[(# - 1)/16] + 1,
     Floor[Mod[# - 1, 16]/4 + 1] ,
     Mod[# - 1, 4] + 1 } &@12);
{{a, b, c, d}[[%[[1]]]], {e, f, g, h}[[%[[2]]]], {i, j, k, 
   l}[[%[[3]]]]}

{a, g, l}

note that whatever you are doing it is almost always best to use the built in object oriented functions.

 Tuples[{{a, b, c, d}, {e, f, g, h}, {i, j, k, l}}][[12]]

{a, g, l}

Edit: for completeness a generalization of the first expression:

listlen = 6;
nsamp = 4;
Table[Floor[Mod[# - 1, listlen^i]/listlen^(i - 1) + 1], {i, nsamp, 
     1, -1}] & /@ Range[listlen^nsamp] ==
  Tuples[Range[listlen], nsamp]

True

Recursion vs bitmasking for getting all combinations of vector elements

The number of combinations you can get is 2^n, where n is the number of your elements. You can interpret every integer from 0 to 2^n -1 as a mask. In your example (elements 1, 2, 3) you have 3 elements and the masks would therefore be 000, 001, 010, 011, 100, 101, 110, and 111. Let every place in the mask represent one of your elements. For place that has a 1, take the corresponding element, otherwise if the place contains a 0, leave the element out. For example the the number 5 would be the mask 101 and it would generate this combination: 1, 3.

If you want to have a fast and relatively short code for it, you could do it like this:

#include <cstdio>
#include <vector>

using namespace std;

int main(){

    vector<int> elements;

    elements.push_back(1);
    elements.push_back(2);
    elements.push_back(3);

    //  1<<n is essentially pow(2, n), but much faster and only for integers
    //  the iterator i will be our mask i.e. its binary form will tell us which elements to use and which not
    for (int i=0;i<(1<<elements.size());++i){
        printf("Combination #%d:", i+1);
        for (int j=0;j<elements.size();++j){
            //  1<<j shifts the 1 for j places and then we check j-th binary digit of i
            if (i&(1<<j)){
                printf(" %d", elements[j]);
            }
        }
        printf("\n");
    }

    return 0;
}

Intelligently Generating Combinations of Combinations