Algorithm to Find Which Number in a List Sum Up to a Certain Number

Algorithm to find which number in a list sum up to a certain number

This problem reduces to the 0-1 Knapsack Problem, where you are trying to find a set with an exact sum. The solution depends on the constraints, in the general case this problem is NP-Complete.

However, if the maximum search sum (let's call it S) is not too high, then you can solve the problem using dynamic programming. I will explain it using a recursive function and memoization, which is easier to understand than a bottom-up approach.

Let's code a function f(v, i, S), such that it returns the number of subsets in v[i:] that sums exactly to S. To solve it recursively, first we have to analyze the base (i.e.: v[i:] is empty):

S == 0: The only subset of [] has sum 0, so it is a valid subset. Because of this, the function should return 1.
S != 0: As the only subset of [] has sum 0, there is not a valid subset. Because of this, the function should return 0.

Then, let's analyze the recursive case (i.e.: v[i:] is not empty). There are two choices: include the number v[i] in the current subset, or not include it. If we include v[i], then we are looking subsets that have sum S - v[i], otherwise, we are still looking for subsets with sum S. The function f might be implemented in the following way:

def f(v, i, S):
  if i >= len(v): return 1 if S == 0 else 0
  count = f(v, i + 1, S)
  count += f(v, i + 1, S - v[i])
  return count

v = [1, 2, 3, 10]
sum = 12
print(f(v, 0, sum))

By checking f(v, 0, S) > 0, you can know if there is a solution to your problem. However, this code is too slow, each recursive call spawns two new calls, which leads to an O(2^n) algorithm. Now, we can apply memoization to make it run in time O(n*S), which is faster if S is not too big:

def f(v, i, S, memo):
  if i >= len(v): return 1 if S == 0 else 0
  if (i, S) not in memo:  # <-- Check if value has not been calculated.
    count = f(v, i + 1, S, memo)
    count += f(v, i + 1, S - v[i], memo)
    memo[(i, S)] = count  # <-- Memoize calculated result.
  return memo[(i, S)]     # <-- Return memoized value.

v = [1, 2, 3, 10]
sum = 12
memo = dict()
print(f(v, 0, sum, memo))

Now, it is possible to code a function g that returns one subset that sums S. To do this, it is enough to add elements only if there is at least one solution including them:

def f(v, i, S, memo):
  # ... same as before ...

def g(v, S, memo):
  subset = []
  for i, x in enumerate(v):
    # Check if there is still a solution if we include v[i]
    if f(v, i + 1, S - x, memo) > 0:
      subset.append(x)
      S -= x
  return subset

v = [1, 2, 3, 10]
sum = 12
memo = dict()
if f(v, 0, sum, memo) == 0: print("There are no valid subsets.")
else: print(g(v, sum, memo))

Disclaimer: This solution says there are two subsets of [10, 10] that sums 10. This is because it assumes that the first ten is different to the second ten. The algorithm can be fixed to assume that both tens are equal (and thus answer one), but that is a bit more complicated.

Algorithm to find a list of numbers that sum up to a certain number

Python has an extensive standard library which is pretty fast and useful.
In this case, itertools' combinations_with_replacement does the trick.

You just need to pick the combinations that sum up to N

For example, this code

import itertools as it

N = 14
SIZE = 6

lst = range(N+1)
sum_n_combs = [
    comb for comb in it.combinations_with_replacement(lst, SIZE)
    if sum(comb) == N
]
print(sum_n_combs)

produces

[(0, 0, 0, 0, 0, 14), (0, 0, 0, 0, 1, 13), (0, 0, 0, 0, 2, 12), (0, 0, 0, 0, 3, 11), (0, 0, 0, 0, 4, 10), (0, 0, 0, 0, 5, 9), (0, 0, 0, 0, 6, 8), (0, 0, 0, 0, 7, 7), (0, 0, 0, 1, 1, 12), (0, 0, 0, 1, 2, 11), (0, 0, 0, 1, 3, 10), (0, 0, 0, 1, 4, 9), (0, 0, 0, 1, 5, 8), (0, 0, 0, 1, 6, 7), (0, 0, 0, 2, 2, 10), (0, 0, 0, 2, 3, 9), (0, 0, 0, 2, 4, 8), (0, 0, 0, 2, 5, 7), (0, 0, 0, 2, 6, 6), (0, 0, 0, 3, 3, 8), (0, 0, 0, 3, 4, 7), (0, 0, 0, 3, 5, 6), (0, 0, 0, 4, 4, 6), (0, 0, 0, 4, 5, 5), (0, 0, 1, 1, 1, 11), (0, 0, 1, 1, 2, 10), (0, 0, 1, 1, 3, 9), (0, 0, 1, 1, 4, 8), (0, 0, 1, 1, 5, 7), (0, 0, 1, 1, 6, 6), (0, 0, 1, 2, 2, 9), (0, 0, 1, 2, 3, 8), (0, 0, 1, 2, 4, 7), (0, 0, 1, 2, 5, 6), (0, 0, 1, 3, 3, 7), (0, 0, 1, 3, 4, 6), (0, 0, 1, 3, 5, 5), (0, 0, 1, 4, 4, 5), (0, 0, 2, 2, 2, 8), (0, 0, 2, 2, 3, 7), (0, 0, 2, 2, 4, 6), (0, 0, 2, 2, 5, 5), (0, 0, 2, 3, 3, 6), (0, 0, 2, 3, 4, 5), (0, 0, 2, 4, 4, 4), (0, 0, 3, 3, 3, 5), (0, 0, 3, 3, 4, 4), (0, 1, 1, 1, 1, 10), (0, 1, 1, 1, 2, 9), (0, 1, 1, 1, 3, 8), (0, 1, 1, 1, 4, 7), (0, 1, 1, 1, 5, 6), (0, 1, 1, 2, 2, 8), (0, 1, 1, 2, 3, 7), (0, 1, 1, 2, 4, 6), (0, 1, 1, 2, 5, 5), (0, 1, 1, 3, 3, 6), (0, 1, 1, 3, 4, 5), (0, 1, 1, 4, 4, 4), (0, 1, 2, 2, 2, 7), (0, 1, 2, 2, 3, 6), (0, 1, 2, 2, 4, 5), (0, 1, 2, 3, 3, 5), (0, 1, 2, 3, 4, 4), (0, 1, 3, 3, 3, 4), (0, 2, 2, 2, 2, 6), (0, 2, 2, 2, 3, 5), (0, 2, 2, 2, 4, 4), (0, 2, 2, 3, 3, 4), (0, 2, 3, 3, 3, 3), (1, 1, 1, 1, 1, 9), (1, 1, 1, 1, 2, 8), (1, 1, 1, 1, 3, 7), (1, 1, 1, 1, 4, 6), (1, 1, 1, 1, 5, 5), (1, 1, 1, 2, 2, 7), (1, 1, 1, 2, 3, 6), (1, 1, 1, 2, 4, 5), (1, 1, 1, 3, 3, 5), (1, 1, 1, 3, 4, 4), (1, 1, 2, 2, 2, 6), (1, 1, 2, 2, 3, 5), (1, 1, 2, 2, 4, 4), (1, 1, 2, 3, 3, 4), (1, 1, 3, 3, 3, 3), (1, 2, 2, 2, 2, 5), (1, 2, 2, 2, 3, 4), (1, 2, 2, 3, 3, 3), (2, 2, 2, 2, 2, 4), (2, 2, 2, 2, 3, 3)]

Finding all possible combinations of numbers to reach a given sum

This problem can be solved with a recursive combinations of all possible sums filtering out those that reach the target. Here is the algorithm in Python:

def subset_sum(numbers, target, partial=[]):
    s = sum(partial)

    # check if the partial sum is equals to target
    if s == target: 
        print "sum(%s)=%s" % (partial, target)
    if s >= target:
        return  # if we reach the number why bother to continue
    
    for i in range(len(numbers)):
        n = numbers[i]
        remaining = numbers[i+1:]
        subset_sum(remaining, target, partial + [n]) 
   

if __name__ == "__main__":
    subset_sum([3,9,8,4,5,7,10],15)

    #Outputs:
    #sum([3, 8, 4])=15
    #sum([3, 5, 7])=15
    #sum([8, 7])=15
    #sum([5, 10])=15

This type of algorithms are very well explained in the following Stanford's Abstract Programming lecture - this video is very recommendable to understand how recursion works to generate permutations of solutions.

Edit

The above as a generator function, making it a bit more useful. Requires Python 3.3+ because of yield from.

def subset_sum(numbers, target, partial=[], partial_sum=0):
    if partial_sum == target:
        yield partial
    if partial_sum >= target:
        return
    for i, n in enumerate(numbers):
        remaining = numbers[i + 1:]
        yield from subset_sum(remaining, target, partial + [n], partial_sum + n)

Here is the Java version of the same algorithm:

package tmp;

import java.util.ArrayList;
import java.util.Arrays;

class SumSet {
    static void sum_up_recursive(ArrayList<Integer> numbers, int target, ArrayList<Integer> partial) {
       int s = 0;
       for (int x: partial) s += x;
       if (s == target)
            System.out.println("sum("+Arrays.toString(partial.toArray())+")="+target);
       if (s >= target)
            return;
       for(int i=0;i<numbers.size();i++) {
             ArrayList<Integer> remaining = new ArrayList<Integer>();
             int n = numbers.get(i);
             for (int j=i+1; j<numbers.size();j++) remaining.add(numbers.get(j));
             ArrayList<Integer> partial_rec = new ArrayList<Integer>(partial);
             partial_rec.add(n);
             sum_up_recursive(remaining,target,partial_rec);
       }
    }
    static void sum_up(ArrayList<Integer> numbers, int target) {
        sum_up_recursive(numbers,target,new ArrayList<Integer>());
    }
    public static void main(String args[]) {
        Integer[] numbers = {3,9,8,4,5,7,10};
        int target = 15;
        sum_up(new ArrayList<Integer>(Arrays.asList(numbers)),target);
    }
}

It is exactly the same heuristic. My Java is a bit rusty but I think is easy to understand.

C# conversion of Java solution: (by @JeremyThompson)

public static void Main(string[] args)
{
    List<int> numbers = new List<int>() { 3, 9, 8, 4, 5, 7, 10 };
    int target = 15;
    sum_up(numbers, target);
}

private static void sum_up(List<int> numbers, int target)
{
    sum_up_recursive(numbers, target, new List<int>());
}

private static void sum_up_recursive(List<int> numbers, int target, List<int> partial)
{
    int s = 0;
    foreach (int x in partial) s += x;

    if (s == target)
        Console.WriteLine("sum(" + string.Join(",", partial.ToArray()) + ")=" + target);

    if (s >= target)
        return;

    for (int i = 0; i < numbers.Count; i++)
    {
        List<int> remaining = new List<int>();
        int n = numbers[i];
        for (int j = i + 1; j < numbers.Count; j++) remaining.Add(numbers[j]);

        List<int> partial_rec = new List<int>(partial);
        partial_rec.Add(n);
        sum_up_recursive(remaining, target, partial_rec);
    }
}

Ruby solution: (by @emaillenin)

def subset_sum(numbers, target, partial=[])
  s = partial.inject 0, :+
# check if the partial sum is equals to target

  puts "sum(#{partial})=#{target}" if s == target

  return if s >= target # if we reach the number why bother to continue

  (0..(numbers.length - 1)).each do |i|
    n = numbers[i]
    remaining = numbers.drop(i+1)
    subset_sum(remaining, target, partial + [n])
  end
end

subset_sum([3,9,8,4,5,7,10],15)

Edit: complexity discussion

As others mention this is an NP-hard problem. It can be solved in exponential time O(2^n), for instance for n=10 there will be 1024 possible solutions. If the targets you are trying to reach are in a low range then this algorithm works. So for instance:

subset_sum([1,2,3,4,5,6,7,8,9,10],100000) generates 1024 branches because the target never gets to filter out possible solutions.

On the other hand subset_sum([1,2,3,4,5,6,7,8,9,10],10) generates only 175 branches, because the target to reach 10 gets to filter out many combinations.

If N and Target are big numbers one should move into an approximate version of the solution.

Algorithm to find which numbers from a list of size n sum to another number

Interesting answers. Thank you for the pointers to Wikipedia - whilst interesting - they don't actually solve the problem as stated as I was looking for exact matches - more of an accounting/book balancing problem than a traditional bin-packing / knapsack problem.

I have been following the development of stack overflow with interest and wondered how useful it would be. This problem came up at work and I wondered whether stack overflow could provide a ready-made answer (or a better answer) quicker than I could write it myself. Thanks also for the comments suggesting this be tagged homework - I guess that is reasonably accurate in light of the above.

For those who are interested, here is my solution which uses recursion (naturally) I also changed my mind about the method signature and went for List> rather than decimal[][] as the return type:

public class Solver {

    private List<List<decimal>> mResults;

    public List<List<decimal>> Solve(decimal goal, decimal[] elements) {

        mResults = new List<List<decimal>>();
        RecursiveSolve(goal, 0.0m, 
            new List<decimal>(), new List<decimal>(elements), 0);
        return mResults; 
    }

    private void RecursiveSolve(decimal goal, decimal currentSum, 
        List<decimal> included, List<decimal> notIncluded, int startIndex) {

        for (int index = startIndex; index < notIncluded.Count; index++) {

            decimal nextValue = notIncluded[index];
            if (currentSum + nextValue == goal) {
                List<decimal> newResult = new List<decimal>(included);
                newResult.Add(nextValue);
                mResults.Add(newResult);
            }
            else if (currentSum + nextValue < goal) {
                List<decimal> nextIncluded = new List<decimal>(included);
                nextIncluded.Add(nextValue);
                List<decimal> nextNotIncluded = new List<decimal>(notIncluded);
                nextNotIncluded.Remove(nextValue);
                RecursiveSolve(goal, currentSum + nextValue,
                    nextIncluded, nextNotIncluded, startIndex++);
            }
        }
    }
}

If you want an app to test this works, try this console app code:

class Program {
    static void Main(string[] args) {

        string input;
        decimal goal;
        decimal element;

        do {
            Console.WriteLine("Please enter the goal:");
            input = Console.ReadLine();
        }
        while (!decimal.TryParse(input, out goal));

        Console.WriteLine("Please enter the elements (separated by spaces)");
        input = Console.ReadLine();
        string[] elementsText = input.Split(' ');
        List<decimal> elementsList = new List<decimal>();
        foreach (string elementText in elementsText) {
            if (decimal.TryParse(elementText, out element)) {
                elementsList.Add(element);
            }
        }

        Solver solver = new Solver();
        List<List<decimal>> results = solver.Solve(goal, elementsList.ToArray());
        foreach(List<decimal> result in results) {
            foreach (decimal value in result) {
                Console.Write("{0}\t", value);
            }
            Console.WriteLine();
        }

        Console.ReadLine();
    }
}

I hope this helps someone else get their answer more quickly (whether for homework or otherwise).

Cheers...

Given list of integers, find the sets of numbers whose sum = a target number, minimising the total amount that each set goes over the target

I got 238 as well, with backtracking.

The program below tries all available new combinations of the state:

(index, sum_a, count_a, sum_b, count_b...)

  where index is the index in prices, and sum_x and count_x
  are the distinct sum x and the count of how many bins
  with that sum we currently have.

It returns the recorded value achieved from that state if the state was previously seen, and avoids searching a branch known to lead to a result greater than or equal to the smallest recorded result so far. It recurses using a single object to store the state in memory and updates it as it backtracks.

There's also a tweak-able bound on how large a bin is allowed be (the line, if (new_sum >= 1.2 * 500)).

function getPrefixSums(A){
  let pfxs = new Array(A.length);
  pfxs[-1] = 0;
  A.map((x, i) => pfxs[i] = A[i] + pfxs[i-1]);
  return pfxs;
}

function add(sum, new_sum, dp){
  if (dp.state[sum] == 1)
    delete dp.state[sum];
  else if (dp.state.hasOwnProperty(sum))
    dp.state[sum] -= 1;
  if (dp.state.hasOwnProperty(new_sum))
    dp.state[new_sum] += 1;
  else
    dp.state[new_sum] = 1;
  if (new_sum > 500)
    dp.total += new_sum - 500;
  else if (new_sum < 500);
    dp.remaining -= new_sum - sum;
}

function remove(sum, new_sum, dp){
  if (sum > 0 && !dp.state.hasOwnProperty(sum))
    dp.state[sum] = 1;
  else if (sum > 0)
    dp.state[sum] += 1;
  if (dp.state[new_sum] == 1)
    delete dp.state[new_sum];
  else if (dp.state.hasOwnProperty(new_sum))
    dp.state[new_sum] -= 1;
  if (new_sum > 500)
    dp.total -= new_sum - 500;
  else if (new_sum < 500);
    dp.remaining += new_sum - sum;
}

function g(prices, pfxs, i, dp, memo){
  const sorted = Object.entries(dp.state).sort(([sum1, count1], [sum2, count2]) =>
    Number(sum1) - Number(sum2));
  const key = String([i, sorted]);
  
  if (memo.hasOwnProperty(key))
    return memo[key];

  if (dp.total >= dp.best)
    return memo[key] = Infinity;
    
  if (i == prices.length){
    if (Object.keys(dp.state).some(x => Number(x) < 500))
      return memo[key] = Infinity;
    dp.best = Math.min(dp.best, dp.total);
    return memo[key] = dp.total;
  }
    
  let best = Infinity;
  let bestSum = -1;
    
  // Add bin
  if (pfxs[pfxs.length-1] - pfxs[i-1] >= 500 + dp.remaining){
    dp.remaining += 500;
    add(0, prices[i], dp);
    
    const candidate = g(prices, pfxs, i+1, dp, memo);
      
    best = candidate;
    bestSum = 0;
    
    dp.remaining -= 500;
    remove(0, prices[i], dp);
  }

  // Add to existing bin;
  for (let sum in dp.state){
    const new_sum = Number(sum) + prices[i];
    
    if (new_sum >= 1.2 * 500)
      continue;
      
    add(sum, new_sum, dp);
      
    const candidate = g(prices, pfxs, i+1, dp, memo);

    if (candidate < best){
      best = candidate;
      bestSum = sum;
    }
      
    remove(sum, new_sum, dp);
  }
  
  return memo[key] = best;
}

function backtrack(prices, total, memo){
  let m = [];
  
  for (let i in memo)
    if (memo[i] == total)
      m.push(i);
 
  m = m.map(x => x.split(',').map(Number));
  m.sort((a, b) => a[0] - b[0]);
  
  function validate(added, removed){
    return added.length == 1 &&
      removed.length < 2 &&
      !added.some(([sum, count]) => count > 1) &&
      !removed.some(([sum, count]) => count > 1);
  }

  function back(i, prev_idx, dp){
    if (i == m.length)
      return dp;

    const [idx, ...cts] = m[i];
    const _dp = cts.reduce(function(acc, x, i){
      if (!(i & 1))
        acc[x] = cts[i+1];
      return acc;
    }, {});

    if (idx == prev_idx)
      return back(i + 1, prev_idx, dp);

    let added = [];
    let removed = [];

    for (let sum in _dp){
      if (!dp.hasOwnProperty(sum))
        added.push([sum, _dp[sum]]);
      else if (dp[sum] > _dp[sum])
        removed.push([sum, dp[sum].count - _dp[sum]]);
    }
    for (let sum in dp){
      if (!_dp.hasOwnProperty(sum))
        removed.push([sum, dp[sum]]);
    }

    if (!validate(added, removed))
      return back(i + 1, prev_idx, dp);

    const [[new_sum, _]] = added;
    let old_bin = [];

    if (removed.length){
      const [[old_sum, _]] = removed;
      const len = dp[old_sum].bins.length;
      old_bin = dp[old_sum].bins[len - 1];

      if (dp[old_sum].count == 1){
        delete dp[old_sum];

      } else {
        dp[old_sum].count -= 1;
        dp[old_sum].bins.length = len - 1;
      }
    }

    if (dp[new_sum]){
      dp[new_sum].count += 1;
      dp[new_sum].bins.push(old_bin.concat(prices[idx-1]));
    } else {
      dp[new_sum] = {
        count: 1,
        bins: [old_bin.concat(prices[idx-1])]
      }
    }

    return back(i + 1, idx, dp);
  }

  function get_dp(row){
    const [idx, ...cts] = row;
    return cts.reduce(function(acc, x, i){
      if (!(i & 1)){
        acc[x] = {
          count: cts[i+1],
          bins: new Array(cts[i+1]).fill(null).map(_ => [x])
        };
      }
      return acc;
    }, {});
  }
  
  const dp = get_dp(m[1]);
  
  return back(2, 1, dp);
}

function f(prices){
  const pfxs = getPrefixSums(prices);
  const dp = {
    state: {'0': 1},
    total: 0,
    remaining: 0,
    best: Infinity
  };
  const memo = {};
  const result = g(prices, pfxs, 0, dp, memo);

  const _dp = backtrack(prices, result, memo);
  const bins = Object.values(_dp).flatMap(x => x.bins);

  return [result, bins];
}

var prices = [425, 105, 185, 185, 185, 98, 145, 155, 125, 125, 135, 295, 295, 155, 125];

console.log(JSON.stringify(prices));
console.log(JSON.stringify(f(prices)));

list all the permutation to sum up some numbers to a dest number

First - note that even finding if there is any subset that sums to the desired number is NP-Complete and is known as the subset sum problem, so there is no known polynomial solution for it.

Now, regarding the specific issue, here are some options:

First there is of course the obvious "generate all subsets and check the sum" way. Note that if your elements are all non-negative, you can use branch and bound and terminate a large portion of the possibilities before you actually develop them (If you found a subset X with sum(X) == s, and you are looking for the number n < s - you can be sure any set containing X will NOT find the solution). Something along the lines of:

findSubsets(list,sol,n):
  if (list.empty() and n == 0): #found a feasible subset!
     print sol 
     return
  else if (n < 0): #bounding non feasible solutions
     return 
  else if (list.empty()): #a solution that sums to a smaller number then n
     return
  e <- list.removeAndReturnFirst()
  sol <- sol.add(e)
  findSubsets(list,sol,n-e)
  sol <- sol.removeLast()
  findSubsets(list,sol,n)
  list.addFirst(e) #cleanup, return the list to original state

Invoke with findSubsets(list,[],n) where list is your list of items, n is the desired number and [] is an empty list.

Note that it can be easily parallelized if needed, there is no real syncrhonization needed between two subsets explored.

Another alternative if the list contains only integers is using Dynamic Programming for solving the subset sum problem. Once you have the matrix you can re-create all the elements from the table by going back in the table. This similar question discusses how to get a list from the knapsack DP solution. The principles of the two problems are pretty much the same.

Algorithm to Find Which Number in a List Sum Up to a Certain Number