Number of Combinations (N Choose R) in C++

Which is the better way to calculate nCr?

Both approaches will save time, but the first one is very prone to integer overflow.

Approach 1:

This approach generates the result in the shortest time (at most n/2 iterations), and the risk of overflow can be reduced by ordering the multiplications and divisions carefully:

long long C(int n, int r) {
    if (r > n - r) r = n - r; // because C(n, r) == C(n, n - r)
    long long ans = 1;

    for (int i = 1; i <= r; i++) {
        ans *= n - r + i;
        ans /= i;
    }

    return ans;
}

This code multiplies the numerator terms starting from the smaller end. Because the product of any k consecutive integers is divisible by k!, every division is exact. Overflow is still possible, though. Another useful trick is to divide n - r + i and i by their GCD before doing the multiplication and division, although overflow can still occur when the result itself is too large.
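
The GCD trick can be sketched as follows. This is only an illustration, assuming C++17 for std::gcd; the name chooseGcd is hypothetical and not part of the original answer. After the common factor is removed, the reduced divisor is coprime to the reduced multiplier, so it must divide the running result. That allows the division to happen before the multiplication, and every intermediate value stays at or below the final answer.

```cpp
#include <numeric> // std::gcd (C++17)

// Hypothetical helper: C(n, r) with the common factor cancelled at each step.
unsigned long long chooseGcd(unsigned long long n, unsigned long long r) {
    if (r > n) return 0;
    if (r > n - r) r = n - r; // C(n, r) == C(n, n - r)
    unsigned long long ans = 1;
    for (unsigned long long i = 1; i <= r; ++i) {
        unsigned long long num = n - r + i;
        unsigned long long den = i;
        const unsigned long long g = std::gcd(num, den);
        num /= g;
        den /= g;
        // den is coprime to num and ans * num / den is an integer,
        // so den divides ans exactly: divide first, then multiply.
        ans /= den;
        ans *= num;
    }
    return ans;
}
```

With this ordering, overflow should only occur when C(n, r) itself does not fit in an unsigned long long.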

Approach 2:

In this approach, you actually build up Pascal's Triangle. The dynamic programming version is much faster than the recursive one (the former is O(n^2) while the latter is exponential). However, it needs O(n^2) memory too.

#define MAX 100 // assuming we need first 100 rows
// note: the middle entries of rows above 66 do not fit in a signed 64-bit
// long long, so lower MAX or use a wider type if those values are needed
long long triangle[MAX + 1][MAX + 1];

void makeTriangle() {
    int i, j;

    // initialize the first row
    triangle[0][0] = 1; // C(0, 0) = 1

    for (i = 1; i < MAX; i++) {
        triangle[i][0] = 1; // C(i, 0) = 1
        for (j = 1; j <= i; j++) {
            triangle[i][j] = triangle[i - 1][j - 1] + triangle[i - 1][j];
        }
    }
}

long long C(int n, int r) {
    return triangle[n][r];
}

Then you can look up any C(n, r) in O(1) time.

If you need a particular C(n, r) (i.e. the full triangle is not needed), then the memory consumption can be made O(n) by overwriting the same row of the triangle, top to bottom.

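#define MAX 100
long long row[MAX + 1];

long long C(int n, int r) { // long long: an int return type would truncate the result
    // reset the buffer so repeated calls do not accumulate stale values
    for (int j = 0; j <= n; j++) row[j] = 0;

    // initialize by the first row
    row[0] = 1; // this is the value of C(0, 0)

    for (int i = 1; i <= n; i++) {
        for (int j = i; j > 0; j--) {
            // from the recurrence C(n, r) = C(n - 1, r - 1) + C(n - 1, r)
            row[j] += row[j - 1];
        }
    }

    return row[r];
}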

The inner loop is started from the end to simplify the calculations. If you start it from index 0, you'll need another variable to store the value being overwritten.
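
For comparison, here is a sketch of the forward variant described in the previous paragraph, with the extra variable that holds the value about to be overwritten. The name chooseForward and the use of std::vector are choices made for this illustration, and it assumes 0 <= r <= n.

```cpp
#include <vector>

// Hypothetical forward-updating variant of the single-row approach.
long long chooseForward(int n, int r) {
    std::vector<long long> row(n + 1, 0);
    row[0] = 1; // C(0, 0)

    for (int i = 1; i <= n; ++i) {
        long long prev = row[0]; // old value of row[j - 1]
        for (int j = 1; j <= i; ++j) {
            const long long cur = row[j]; // save before overwriting
            row[j] = cur + prev;          // C(i, j) = C(i-1, j) + C(i-1, j-1)
            prev = cur;
        }
    }

    return row[r];
}
```

The backward loop avoids the two temporaries, which is why the answer prefers it.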

Calculating the Amount of Combinations

Here's an ancient algorithm which is exact and doesn't overflow unless the result is too big for an unsigned long long:

unsigned long long
choose(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0;
    }
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        r *= n--;
        r /= d;
    }
    return r;
}

This algorithm is also in Knuth's "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms" I think.

UPDATE: There's a small possibility that the algorithm will overflow on the line:

r *= n--;

for very large n. A naive upper bound is sqrt(std::numeric_limits<unsigned long long>::max()), which means an n less than roughly 4,000,000,000.
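
One way to make that failure visible is to test the multiplication before performing it. The sketch below is an assumption-laden variation, not part of the original answer: the name chooseChecked and the std::optional return type (C++17) are invented for the example.

```cpp
#include <limits>
#include <optional>

// Hypothetical checked variant: returns std::nullopt instead of wrapping
// around when the intermediate product r * n would overflow.
std::optional<unsigned long long>
chooseChecked(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0ULL;
    }
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        // n >= 1 here because d <= k <= original n
        if (r > std::numeric_limits<unsigned long long>::max() / n) {
            return std::nullopt;
        }
        r *= n--;
        r /= d;
    }
    return r;
}
```

For example, chooseChecked(5000000000ULL, 2) reports an overflow even though the true result would fit, because the product is formed before the division by 2.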

Combinatoric 'N choose R' in java math?

The apache-commons "Math" supports this in
org.apache.commons.math4.util.CombinatoricsUtils

Algorithm to return all combinations of k elements from n

Art of Computer Programming Volume 4: Fascicle 3 has a ton of these that might fit your particular situation better than how I describe.

Gray Codes

One issue you will run into quickly is memory: even a set of 20 elements causes problems, since 20C3 = 1140. If you want to iterate over the combinations, it's best to use a modified Gray code algorithm so you aren't holding all of them in memory. These generate the next combination from the previous one and avoid repetitions. There are many of these for different uses: do we want to maximize the differences between successive combinations, minimize them, et cetera?
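
To illustrate the idea of producing the next combination from the previous one without storing them all, here is a small sketch. Note that it steps through combinations in plain lexicographic order, which is simpler than a Gray code and does not minimize the change between successive combinations. The function name nextCombination is hypothetical, and the sketch assumes idx holds strictly increasing indices in [0, n).

```cpp
#include <cstddef>
#include <vector>

// Advances idx to its lexicographic successor among the k-combinations
// of {0, ..., n-1}. Returns false (leaving idx unchanged) at the last one.
bool nextCombination(std::vector<std::size_t>& idx, std::size_t n) {
    const std::size_t k = idx.size();
    for (std::size_t pos = k; pos-- > 0;) {
        // the largest value allowed at position pos is n - k + pos
        if (idx[pos] < n - k + pos) {
            ++idx[pos];
            // reset everything to the right to the smallest valid values
            for (std::size_t j = pos + 1; j < k; ++j) {
                idx[j] = idx[j - 1] + 1;
            }
            return true;
        }
    }
    return false;
}
```

Starting from {0, 1, 2} with n = 20 and calling this until it returns false visits all 1140 combinations while only ever holding k indices.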

Some of the original papers describing gray codes:

  1. Some Hamilton Paths and a Minimal Change Algorithm
  2. Adjacent Interchange Combination Generation Algorithm

Here are some other papers covering the topic:

  1. An Efficient Implementation of the Eades, Hickey, Read Adjacent Interchange Combination Generation Algorithm (PDF, with code in Pascal)
  2. Combination Generators
  3. Survey of Combinatorial Gray Codes (PostScript)
  4. An Algorithm for Gray Codes

Chase's Twiddle (algorithm)

Phillip J Chase, `Algorithm 382: Combinations of M out of N Objects' (1970)

The algorithm in C...

Index of Combinations in Lexicographical Order (Buckles Algorithm 515)

You can also reference a combination by its index (in lexicographical order). Since the index reflects how much the elements have changed, counted from right to left, we can construct a procedure that recovers a combination from its index.

Take the set {1,2,3,4,5,6}, from which we want three elements. In {1,2,3}, the difference between consecutive elements is one; the elements are in order and the differences are minimal. {1,2,4} has one change and is lexicographically number 2. So each 'change' in the last place accounts for one step in the lexicographical ordering. A single change in the second place, as in {1,3,4}, accounts for more steps because it is in the second place (proportional to the number of elements in the original set).

The method I've described is a deconstruction, from the set to the index; we need to do the reverse, which is much trickier. This is how Buckles solves the problem. I wrote some C to compute them, with minor changes: I used the index of the sets rather than a number range to represent the set, so we are always working from 0...n.

Note:

  1. Since combinations are unordered, {1,3,2} = {1,2,3}; we order them to be lexicographical.
  2. This method has an implicit 0 to start the set for the first difference.

Index of Combinations in Lexicographical Order (McCaffrey)

There is another way: its concept is easier to grasp and program, but it lacks the optimizations of Buckles. Fortunately, it also does not produce duplicate combinations:

The set x_k...x_1 in N that maximizes i = C(x_1,k) + C(x_2,k-1) + ... + C(x_k,1), where C(n,r) = {n choose r}.

For example: 27 = C(6,4) + C(5,3) + C(2,2) + C(1,1). So, the 27th lexicographical combination of four things is {1,2,5,6}; those are the indexes into whatever set you want to look at. The example below (OCaml) requires a choose function, which is left to the reader:

(* this will find the [x] combination of a [set] list when taking [k] elements *)
let combination_maccaffery set k x =
  (* maximize function -- maximize a that is aCb *)
  (* return largest c where c < i and choose(c,i) <= z *)
  let rec maximize a b x =
    if (choose a b) <= x then a else maximize (a-1) b x
  in
  let rec iterate n x i = match i with
    | 0 -> []
    | i ->
      let max = maximize n i x in
      max :: iterate n (x - (choose max i)) (i-1)
  in
  if x < 0 then failwith "errors" else
  let idxs = iterate (List.length set) x k in
  List.map (List.nth set) (List.sort (-) idxs)
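
For readers following along in C++, the greedy decomposition can be sketched like this. It is a loose port of the index-finding part only, written for illustration: the names decompose and choose are hypothetical, and the final step of mapping the indexes back onto a set is omitted.

```cpp
#include <vector>

// Multiplicative n-choose-k; returns 0 when k > n.
unsigned long long choose(unsigned long long n, unsigned long long k) {
    if (k > n) return 0;
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        r *= n--;
        r /= d;
    }
    return r;
}

// Hypothetical helper: writes x as C(a_k, k) + C(a_{k-1}, k-1) + ... + C(a_1, 1),
// choosing the largest possible a_i (not exceeding n) at each step.
std::vector<unsigned> decompose(unsigned long long x, unsigned k, unsigned n) {
    std::vector<unsigned> parts;
    for (unsigned i = k; i > 0; --i) {
        unsigned a = n;
        // step down until C(a, i) fits into what is left of x
        while (a > 0 && choose(a, i) > x) {
            --a;
        }
        parts.push_back(a);
        x -= choose(a, i);
    }
    return parts;
}
```

Calling decompose(27, 4, 7) yields 6, 5, 2, 1, matching the worked example 27 = C(6,4) + C(5,3) + C(2,2) + C(1,1).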

A small and simple combinations iterator

The following two algorithms are provided for didactic purposes. They implement an iterator and (a more general) folder over all combinations.
They are as fast as possible, having the complexity O(nCk). The memory consumption is bounded by k.

We will start with the iterator, which calls a user-provided function for each combination:

let iter_combs n k f =
  let rec iter v s j =
    if j = k then f v
    else for i = s to n - 1 do iter (i::v) (i+1) (j+1) done in
  iter [] 0 0
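
The same iterator idea can be sketched in C++ for comparison. This is an illustrative translation rather than the author's code: the names iterCombs and iterCombsRec are hypothetical, and unlike the OCaml version (which prepends to a list) it hands the callback the indices in ascending order.

```cpp
#include <functional>
#include <vector>

using CombCallback = std::function<void(const std::vector<int>&)>;

// Recursive helper: extends cur with indices >= start until it has k entries.
static void iterCombsRec(int n, int k, int start,
                         std::vector<int>& cur, const CombCallback& f) {
    if (static_cast<int>(cur.size()) == k) {
        f(cur);
        return;
    }
    for (int i = start; i < n; ++i) {
        cur.push_back(i);
        iterCombsRec(n, k, i + 1, cur, f);
        cur.pop_back(); // backtrack before trying the next index
    }
}

// Calls f once for every k-combination of {0, ..., n-1}.
void iterCombs(int n, int k, const CombCallback& f) {
    std::vector<int> cur;
    iterCombsRec(n, k, 0, cur, f);
}
```

Only the current combination is kept in memory, so the working storage is proportional to k.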

A more general version calls the user-provided function along with a state variable, starting from the initial state. Since we need to pass the state from one call to the next, we use recursion instead of the for-loop:

let fold_combs n k f x =
  let rec loop i s c x =
    if i < n then
      loop (i+1) s c @@
      let c = i::c and s = s + 1 and i = i + 1 in
      if s < k then loop i s c x else f c x
    else x in
  loop 0 0 [] x

What's wrong with this function that I wrote to calculate NCR?

Consider what happens when you do comb(6, 2). In the first recursive call, the return expression becomes:

return (5 / 2) * comb(6, 1);

The (5 / 2) performs integer division and yields 2, which is not correct.

Since nCr is guaranteed to be an integer, you can fix the equation by computing the whole numerator before dividing by the denominator, like this:

return (n - r + 1) * comb(n , r - 1) / r ;


Note that if you are concerned with the numerator value overflowing an int, you can restructure the equation, or use another formula where it's easier to cancel out terms earlier.
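
Put together, a complete function using the corrected return expression might look like the sketch below. The original question's code is not shown here, so the base case and the long long return type are assumptions made for this example.

```cpp
// Sketch of the recursive nCr with the multiplication done before the division.
long long comb(int n, int r) {
    if (r == 0) {
        return 1; // assumed base case: C(n, 0) = 1
    }
    // (n - r + 1) * C(n, r - 1) equals r * C(n, r), so the division is exact
    return (n - r + 1) * comb(n, r - 1) / r;
}
```

With this ordering, comb(6, 2) evaluates 5 * comb(6, 1) / 2 = 5 * 6 / 2 = 15 rather than truncating 5 / 2 first.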

Calculate value of n choose k

You could use the Multiplicative formula for this:

C(n, k) = (n * (n - 1) * ... * (n - k + 1)) / (k * (k - 1) * ... * 1) = product for i = 1..k of (n + 1 - i) / i

http://en.wikipedia.org/wiki/Binomial_coefficient#Multiplicative_formula


