Avoiding Nested for Loops

avoiding nested for loops python

Foreword

Nested loops are not a bad thing per se. They are only bad, if there are used for problems, for which better algorithm have been found (better and bad in terms of efficiency regarding the input size). Sorting of a list of integers for example is such a problem.

Analyzing the Problem

The size

In your case above you have three lists, all of size 4. This makes 4 * 4 * 4 = 64 possible combinations of them, if a comes always before b and b before c. So you need at least 64 iterations!

Your approach

In your approach we have 4 iterations for each possible value of a, 4 iterations for each possible value of b and the same for c. So we have 4 * 4 * 4 = 64 iterations in total. So in fact your solution is quite good!
As there is no faster way of listening all combinations, your way is also the best one.

The style

Regarding the style one can say that you can improve your code by better variable names and combining some of the for loops. E.g. like that:

def replaceVar(expressions):
"""
Takes a list of expressions and returns a list of expressions with
evaluated variables.
"""
evaluatedExpressions = list()

valuesOfA = [1, 8, 12, 13]
valuesOfB = [1, 2, 3, 4]
valuesOfC = [5, 9, 2, 7]

for expression in expressions:
for valueOfA in valuesOfA:
for valueOfB in valuesOfB:
for valueOfC in valuesOfC:
newExpression = expression.\
replace('a', str(valueOfA)).\
replace('b', str(valueOfB)).\
replace('c', str(valueOfC))
evaluatedExpressions.append(newExpression)

print(evaluatedExpressions)
return evaluatedExpressions

print(replaceVar(['b-16+(c-(a+11))', 'a-(c-5)+a-b-10']))

Notice however that the amount of iterations remain the same!

Itertools

As Kevin noticed, you could also use itertools to generate the cartesian product. Internally it will do the same as what you did with the combined for loops:

import itertools

def replaceVar(expressions):
"""
Takes a list of expressions and returns a list of expressions with
evaluated variables.
"""
evaluatedExpressions = list()

valuesOfA = [1, 8, 12, 13]
valuesOfB = [1, 2, 3, 4]
valuesOfC = [5, 9, 2, 7]

for expression in expressions:
for values in itertools.product(valuesOfA, valuesOfB, valuesOfC):
valueOfA = values[0]
valueOfB = values[1]
valueOfC = values[2]
newExpression = expression.\
replace('a', str(valueOfA)).\
replace('b', str(valueOfB)).\
replace('c', str(valueOfC))
evaluatedExpressions.append(newExpression)

print(evaluatedExpressions)
return evaluatedExpressions

print(replaceVar(['b-16+(c-(a+11))', 'a-(c-5)+a-b-10']))

How to avoid nested for loops in this particular case?

The reason that this bit of code is slow is not that you have nested loops, but that you implemented a O(n2) algorithm.

You should use a logical array to indicate nodes that are "closed". This will make the lookup much more efficient.

Consider you have N = 37901 nodes. Initialize your Closed array thus:

Closed = false(1,N);

Now to check if node F(j) is closed, you can simply do Closed(F(j)), which will be true if it's closed.

Now your loop can be replaced with

F(Closed(F)) = [];
Closed(current) = true;

because Closed(F) is an array of the same size as F that is true for the closed nodes. You can use this array to index into F, and remove all the closed nodes. I'm deleting those nodes, rather than assigning 0 to them, so that F can always be used to index into Closed. If we write 0 there, it's no longer a logical array. If you need to not change the shape of F, you'd have to do some additional tests before indexing.

Note that Closed = [Closed current] is also a lot slower than Closed(current) = true, because it creates a new array that the old Closed array is copied into.


Note that you can remove the nested loop in your code, as below, but it will not necessarily be any faster, as the algorithm is still O(n2) (you're comparing each of the elements in F to each of the elements of Closed).

for j=1:length(F)
if any(F(j) == Closed)
F(j) = 0;
end
end

how to avoid a nested for loop in this problem

Stream::flatMap should be used here to provide "access" to the lowest level of nested lists assuming the getters are implemented properly in the sample classes:

List<Test> input = ...; // define input data

List<AThirdObject> nestedList = input
.stream() // Stream<Test>
.flatMap(t -> t.getTestList().stream()) // Stream<AnotherObject>
.flatMap(ao -> ao.getAnotherList().stream()) // Stream<AThirdObject>
.collect(Collectors.toList());

Avoiding nested for loops

Here's how to use product:

x1 = range(min1, max1, step1)
x2 = range(min2, max2, step2)
x3 = range(min3, max3, step3)
...

for v1, v2, v3, v4, v5, v6 in itertools.product(x1, x2, x3, x4, x5, x6):
do_something_with(v1, v2, v3, v4, v5, v6)

or a bit more compactly:

ranges = [
range(min1, max1, step1),
range(min2, max2, step2),
range(min3, max3, step3),
...
]

for v1, v2, v3, v4, v5, v6 in itertools.product(*ranges):
do_something_with(v1, v2, v3, v4, v5, v6)

Method to avoid this nested for-loop

So, from what I've understood out of your code... You're looking to:

  1. Compute the difference between all values of x.
  2. Determine if it's less than error where error ranges from [0, 100).
  3. Select all subarrays of size 5.

Assuming my interpretation is correct... You can actually vectorize this and avoid for-loops, like your intuition led you to believe. Ultimately, if my interpretation is incorrect, this should, at least, give you a decent start to creating a vectorized version of your desired code. /p>

Updated Solution (considers 5-tuples)

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])

overlaps = {}

for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
# np.convolve is analogous to a sliding window sum
quint = np.convolve(diffs == margin, np.ones(5), "valid")
index = np.nonzero(quint == 5)[0]
if index.size > 0:
overlaps[margin] = [list(range(i, i + 5)) for i in index]


Original Solution (doesn't consider 5-tuples)

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])

overlaps = {}

for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
index = np.nonzero(diff == margin)[0]
if idx.size > 0:
overlaps[margin] = idx

In case you're unfamiliar with numpy, .size gives you the total size of the ndarray. (So a 3D array with shapes (10, 20, 30) has a size of 6000.)

Avoiding nested loops when comparing objects in Python

Given that matching objects have to have similar ids (ID in Script1Object and number in Script2Object), and assuming that script2_list items all have unique numbers, you can create a mapping from number to the object using a dictionary. Then, as you're iterating, you can get a Script2Object from the dictionary by its number directly (without having to loop through the entire script2_list). You can then call each match function on both objects as you were doing before:

script1_list = script1()
script2_list = script2()
script2_dict = {obj.number: obj for obj in script2_list}

for item1 in script1_list:
item2 = script2_dict.get(item1.ID, None)
if item2 is None:
print('no item2 found for this id: ', item1.ID)
# do something if there's no matching id
if is_match(item1, item2):
do_matching_action()
elif is_different_match(item1, item2):
do_other_matching_action()

if is_bad(item1):
do_error_action()

Trying to avoid nested for loops

Firstly you can cut down on the code in each loop and tidy up with os.path.join:

for i in folders:
for j in subfolders:
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
os.makedirs(subfolder_path)
for k in files:
file_path = os.path.join(subfolderpath, "files-{j}-{k}.txt")
open(file_path, "w")

And then the first 2 loops can be made into 1 with itertools.product:

import itertools

for i,j in itertools.product(folders, subfolders):
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
os.makedirs(subfolder_path)
for k in files:
file_path = os.path.join(subfolderpath, "files-{j}-{k}.txt")
open(file_path, "w")

But how about creating a function to create the file path if it doesn't exist? Then we can condense to a single for loop.

def open_and_create(folder_path, file_name, *a):
os.makedirs(folder_path, exist_ok=True)
return open(os.path.join(folder_path, file_name) *a)

for i,j,k in itertools.product(folders, subfolders, files):
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
open_and_create(subfolder_path, "files-{j}-{k}.txt", 'w')

how to avoid nested loop in python and check condition on all combination of elements in multi list

You need itertools.product:

import itertools
for cs in itertools.product(*[object1, object2, object3]):
if sum(cs) < 9:
print(cs[0], cs[1], cs[2])

# OR
# for c1, c2, c3 in itertools.product(*[object1, object2, object3]):
# if (c1+c2+c3) < 9:
# print(c1,c2,c3)

Output:

4 1 1
4 1 0
4 2 1
4 2 0
4 3 1
4 3 0
4 4 0
5 1 1
5 1 0
5 2 1
5 2 0
5 3 0
3 1 1
3 1 0
3 2 1
3 2 0
3 3 1
3 3 0
3 4 1
3 4 0


Related Topics



Leave a reply



Submit