What Is Memoization and How to Use It in Python

Memoization Practice

Some issues:

  • The first function is overwritten by the second, so you actually only use the second. Hence, there is no memoization happening
  • The assignment says that your function should take an array, but you provide it a set. {2, 3} is a set in Python, while [2, 3] is a list.

If the first function were used:

  • It has a commented line that is actually needed.
  • It takes an argument that is initialised once (only once!) with ={}. This means that if your main code makes multiple calls (like in your example), memo will not be reset between calls, and therefore the results will be wrong.

So, use a different name for the first function and remove the initial value for memo. Avoid the repetition of code in the second function, and just let it call the first one, making sure it passes an empty memo dictionary:

def howSumRec(target_num, listt, memo):
if target_num in memo:
return memo[target_num]
if target_num == 0:
return []
if target_num < 0:
return None
for num in listt:
remainder = target_num - num
remainder_result = howSumRec(remainder, listt, memo)
if remainder_result is not None:
memo[target_num]= *remainder_result, num
return memo[target_num]
memo[target_num] = None
return None

def howSum(target_num, listt):
return howSumRec(target_num, listt, {})

You could still improve on this, if needed, as this does not take advantage of the fact that the order in which you add terms is not important. So you could make sure that a recursive call never looks at terms that are positioned earlier in the list than the last term that was added:

def howSumRec(target_num, listt, i, memo):
if target_num in memo:
return memo[target_num]
if target_num == 0:
return []
if target_num < 0:
return None
for j in range(i, len(listt)):
num = listt[j]
remainder = target_num - num
remainder_result = howSumRec(remainder, listt, j, memo)
if remainder_result is not None:
memo[target_num]= *remainder_result, num
return memo[target_num]
memo[target_num] = None
return None

def howSum(target_num, listt):
return howSumRec(target_num, listt, 0, {})

It is important that with this version, listt is really a list, and not a set, so call it as:

print(howSum(7, [2, 3]))
print(howSum(7, [5, 3, 4, 7]))
print(howSum(7, [2, 4]))
print(howSum(8, [2, 3, 5]))
print(howSum(300, [7, 14]))

What is memoization good for and is it really all that helpful?

The popular factorial answer here is something of a toy answer. Yes, memoization is useful for repeated invocations of that function, but the relationship is trivial — in the "print factorial(N) for 0..M" case you're simply reusing the last value.

Many of the other examples here are just 'caching'. Which is useful, but it ignores the awesome algorithmic implications that the word memoization carries for me.

Far more interesting are cases where different branches of single invocation of a recursive function hits identical sub-problems but in a non-trivial pattern such that actually indexing into some cache is actually useful.

For example, consider n dimensional arrays of integers whos absolute values sum to k. E.g. for n=3,k=5 [1,-4,0], [3,-1,1], [5,0,0], [0,5,0] would be some examples. Let V(n,k) be the number of possible unique arrays for a given n,k. Its definition is:

V(n,0)=1; V(0,k)=0; V(n,k) = V(n-1,k) + V(n,k-1) + V(n-1,k-1);

This function gives 102 for n=3,k=5.

Without memoization this quickly becomes very slow to compute for even fairly modest numbers. If you visualize the processing as a tree, each node an invocation of V() expanding to three children you'd have 186,268,135,991,213,676,920,832 V(n,0)=1 leaves in the computation of V(32,32)... Implemented naively this function rapidly becomes uncomputable on available hardware.

But many of the child branches in the tree are exact duplicates of each other though not in some trivial way that could easily be eliminated like the factorial function. With memoization we can merge all those duplicate branches. In fact, with memoization V(32,32) only executes V() 1024 (n*m) times which is a speedup of a factor of 10^21 (which gets larger as n,k grows, obviously) or so in exchange for a fairly small amount of memory. :) I find this kind of fundamental change to the complexity of an algorithm far more exciting than simple caching. It can make intractable problems easy.

Because python numbers are naturally bignums you can implement this formula in python with memoization using a dictionary and tuple keys in only 9 lines. Give it a shot and try it without the memoization.

Memoization Handler

You can memoize without having to resort to eval.

A (very basic) memoizer:

def memoized(f):
cache={}
def ret(*args):
if args in cache:
return cache[args]
else:
answer=f(*args)
cache[args]=answer
return answer
return ret

@memoized
def fibonacci(n):
if n==0 or n==1:
return 1
else:
return fibonacci(n-1)+fibonacci(n-2)

print fibonacci(100)


Related Topics



Leave a reply



Submit