avoiding nested for loops python
Foreword
Nested loops are not a bad thing per se. They are only bad, if there are used for problems, for which better algorithm have been found (better and bad in terms of efficiency regarding the input size). Sorting of a list of integers for example is such a problem.
Analyzing the Problem
The size
In your case above you have three lists, all of size 4. This makes 4 * 4 * 4 = 64 possible combinations of them, if a comes always before b and b before c. So you need at least 64 iterations!
Your approach
In your approach we have 4 iterations for each possible value of a, 4 iterations for each possible value of b and the same for c. So we have 4 * 4 * 4 = 64 iterations in total. So in fact your solution is quite good!
As there is no faster way of listening all combinations, your way is also the best one.
The style
Regarding the style one can say that you can improve your code by better variable names and combining some of the for loops. E.g. like that:
def replaceVar(expressions):
"""
Takes a list of expressions and returns a list of expressions with
evaluated variables.
"""
evaluatedExpressions = list()
valuesOfA = [1, 8, 12, 13]
valuesOfB = [1, 2, 3, 4]
valuesOfC = [5, 9, 2, 7]
for expression in expressions:
for valueOfA in valuesOfA:
for valueOfB in valuesOfB:
for valueOfC in valuesOfC:
newExpression = expression.\
replace('a', str(valueOfA)).\
replace('b', str(valueOfB)).\
replace('c', str(valueOfC))
evaluatedExpressions.append(newExpression)
print(evaluatedExpressions)
return evaluatedExpressions
print(replaceVar(['b-16+(c-(a+11))', 'a-(c-5)+a-b-10']))
Notice however that the amount of iterations remain the same!
Itertools
As Kevin noticed, you could also use itertools
to generate the cartesian product. Internally it will do the same as what you did with the combined for loops:
import itertools
def replaceVar(expressions):
"""
Takes a list of expressions and returns a list of expressions with
evaluated variables.
"""
evaluatedExpressions = list()
valuesOfA = [1, 8, 12, 13]
valuesOfB = [1, 2, 3, 4]
valuesOfC = [5, 9, 2, 7]
for expression in expressions:
for values in itertools.product(valuesOfA, valuesOfB, valuesOfC):
valueOfA = values[0]
valueOfB = values[1]
valueOfC = values[2]
newExpression = expression.\
replace('a', str(valueOfA)).\
replace('b', str(valueOfB)).\
replace('c', str(valueOfC))
evaluatedExpressions.append(newExpression)
print(evaluatedExpressions)
return evaluatedExpressions
print(replaceVar(['b-16+(c-(a+11))', 'a-(c-5)+a-b-10']))
How to avoid nested for loops in this particular case?
The reason that this bit of code is slow is not that you have nested loops, but that you implemented a O(n2) algorithm.
You should use a logical array to indicate nodes that are "closed". This will make the lookup much more efficient.
Consider you have N = 37901
nodes. Initialize your Closed
array thus:
Closed = false(1,N);
Now to check if node F(j)
is closed, you can simply do Closed(F(j))
, which will be true
if it's closed.
Now your loop can be replaced with
F(Closed(F)) = [];
Closed(current) = true;
because Closed(F)
is an array of the same size as F
that is true for the closed nodes. You can use this array to index into F
, and remove all the closed nodes. I'm deleting those nodes, rather than assigning 0 to them, so that F
can always be used to index into Closed
. If we write 0 there, it's no longer a logical array. If you need to not change the shape of F
, you'd have to do some additional tests before indexing.
Note that Closed = [Closed current]
is also a lot slower than Closed(current) = true
, because it creates a new array that the old Closed
array is copied into.
Note that you can remove the nested loop in your code, as below, but it will not necessarily be any faster, as the algorithm is still O(n2) (you're comparing each of the elements in F
to each of the elements of Closed
).
for j=1:length(F)
if any(F(j) == Closed)
F(j) = 0;
end
end
how to avoid a nested for loop in this problem
Stream::flatMap
should be used here to provide "access" to the lowest level of nested lists assuming the getters are implemented properly in the sample classes:
List<Test> input = ...; // define input data
List<AThirdObject> nestedList = input
.stream() // Stream<Test>
.flatMap(t -> t.getTestList().stream()) // Stream<AnotherObject>
.flatMap(ao -> ao.getAnotherList().stream()) // Stream<AThirdObject>
.collect(Collectors.toList());
Avoiding nested for loops
Here's how to use product
:
x1 = range(min1, max1, step1)
x2 = range(min2, max2, step2)
x3 = range(min3, max3, step3)
...
for v1, v2, v3, v4, v5, v6 in itertools.product(x1, x2, x3, x4, x5, x6):
do_something_with(v1, v2, v3, v4, v5, v6)
or a bit more compactly:
ranges = [
range(min1, max1, step1),
range(min2, max2, step2),
range(min3, max3, step3),
...
]
for v1, v2, v3, v4, v5, v6 in itertools.product(*ranges):
do_something_with(v1, v2, v3, v4, v5, v6)
Method to avoid this nested for-loop
So, from what I've understood out of your code... You're looking to:
- Compute the difference between all values of
x
. - Determine if it's less than
error
whereerror
ranges from[0, 100)
. - Select all subarrays of size 5.
Assuming my interpretation is correct... You can actually vectorize this and avoid for-loops, like your intuition led you to believe. Ultimately, if my interpretation is incorrect, this should, at least, give you a decent start to creating a vectorized version of your desired code. /p>
Updated Solution (considers 5-tuples)import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])
overlaps = {}
for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
# np.convolve is analogous to a sliding window sum
quint = np.convolve(diffs == margin, np.ones(5), "valid")
index = np.nonzero(quint == 5)[0]
if index.size > 0:
overlaps[margin] = [list(range(i, i + 5)) for i in index]
Original Solution (doesn't consider 5-tuples)import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])
overlaps = {}
for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
index = np.nonzero(diff == margin)[0]
if idx.size > 0:
overlaps[margin] = idx
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])
overlaps = {}
for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
# np.convolve is analogous to a sliding window sum
quint = np.convolve(diffs == margin, np.ones(5), "valid")
index = np.nonzero(quint == 5)[0]
if index.size > 0:
overlaps[margin] = [list(range(i, i + 5)) for i in index]
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1, 1000, 20), columns=["x"])
overlaps = {}
for margin in range(0, 100):
diffs = np.abs(df["x"].values - np.roll(df["x"], margin))
index = np.nonzero(diff == margin)[0]
if idx.size > 0:
overlaps[margin] = idx
In case you're unfamiliar with numpy
, .size
gives you the total size of the ndarray
. (So a 3D array with shapes (10, 20, 30)
has a size of 6000
.)
Avoiding nested loops when comparing objects in Python
Given that matching objects have to have similar ids (ID
in Script1Object
and number
in Script2Object
), and assuming that script2_list
items all have unique number
s, you can create a mapping from number
to the object using a dictionary. Then, as you're iterating, you can get a Script2Object
from the dictionary by its number
directly (without having to loop through the entire script2_list
). You can then call each match function on both objects as you were doing before:
script1_list = script1()
script2_list = script2()
script2_dict = {obj.number: obj for obj in script2_list}
for item1 in script1_list:
item2 = script2_dict.get(item1.ID, None)
if item2 is None:
print('no item2 found for this id: ', item1.ID)
# do something if there's no matching id
if is_match(item1, item2):
do_matching_action()
elif is_different_match(item1, item2):
do_other_matching_action()
if is_bad(item1):
do_error_action()
Trying to avoid nested for loops
Firstly you can cut down on the code in each loop and tidy up with os.path.join
:
for i in folders:
for j in subfolders:
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
os.makedirs(subfolder_path)
for k in files:
file_path = os.path.join(subfolderpath, "files-{j}-{k}.txt")
open(file_path, "w")
And then the first 2 loops can be made into 1 with itertools.product
:
import itertools
for i,j in itertools.product(folders, subfolders):
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
os.makedirs(subfolder_path)
for k in files:
file_path = os.path.join(subfolderpath, "files-{j}-{k}.txt")
open(file_path, "w")
But how about creating a function to create the file path if it doesn't exist? Then we can condense to a single for loop.
def open_and_create(folder_path, file_name, *a):
os.makedirs(folder_path, exist_ok=True)
return open(os.path.join(folder_path, file_name) *a)
for i,j,k in itertools.product(folders, subfolders, files):
subfolder_path = os.path.join(path, f"folder{i}", f"subfolder{j}")
open_and_create(subfolder_path, "files-{j}-{k}.txt", 'w')
how to avoid nested loop in python and check condition on all combination of elements in multi list
You need itertools.product
:
import itertools
for cs in itertools.product(*[object1, object2, object3]):
if sum(cs) < 9:
print(cs[0], cs[1], cs[2])
# OR
# for c1, c2, c3 in itertools.product(*[object1, object2, object3]):
# if (c1+c2+c3) < 9:
# print(c1,c2,c3)
Output:
4 1 1
4 1 0
4 2 1
4 2 0
4 3 1
4 3 0
4 4 0
5 1 1
5 1 0
5 2 1
5 2 0
5 3 0
3 1 1
3 1 0
3 2 1
3 2 0
3 3 1
3 3 0
3 4 1
3 4 0
Related Topics
I Don't Understand This Python _Del_ Behaviour
Get Human Readable Version of File Size
Generate Correlated Data in Python (3.3)
Ruby Equivalent of Python's "Dir"
Scripting Http More Effeciently
How to Avoid Http Error 429 (Too Many Requests) Python
Efficiently Convert Uneven List of Lists to Minimal Containing Array Padded with Nan
Renaming a Virtualenv Folder Without Breaking It
How to Get the Index of a Maximum Element in a Numpy Array Along One Axis
Nested Ssh Using Python Paramiko
SQL Join or R's Merge() Function in Numpy
Floating Point Math in Different Programming Languages
Learning Ruby from Python; Differences and Similarities
How to Convert a File into a Dictionary
How to Use Cookies in Python Requests
Python Giving Filenotfounderror for File Name Returned by Os.Listdir