Iterate through pairs of items in a Python list
From the itertools recipes:
from itertools import tee
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

for v, w in pairwise(a):
    ...
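On Python 3.10 and later the same overlapping pairs are available as itertools.pairwise, so the recipe only needs to be defined on older versions. A minimal sketch, assuming a fallback is wanted; the sample list `values` is made up for illustration:

```python
from itertools import tee

try:
    # Available in the standard library from Python 3.10 onward.
    from itertools import pairwise
except ImportError:
    # Fallback: the recipe shown above, for older interpreters.
    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

values = [1, 7, 3, 5]  # hypothetical sample data
overlapping = list(pairwise(values))
print(overlapping)  # [(1, 7), (7, 3), (3, 5)]
```

An input with fewer than two items simply yields no pairs.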
Iterating over every two elements in a list
You need a pairwise() (or grouped()) implementation.
def pairwise(iterable):
    "s -> (s0, s1), (s2, s3), (s4, s5), ..."
    a = iter(iterable)
    return zip(a, a)

for x, y in pairwise(l):
    print("%d + %d = %d" % (x, y, x + y))
Or, more generally:
def grouped(iterable, n):
    "s -> (s0,s1,s2,...sn-1), (sn,sn+1,sn+2,...s2n-1), (s2n,s2n+1,s2n+2,...s3n-1), ..."
    return zip(*[iter(iterable)]*n)

for x, y in grouped(l, 2):
    print("%d + %d = %d" % (x, y, x + y))
In Python 2, you should import izip from itertools as a replacement for Python 3's built-in zip() function.
All credit to martineau for his answer to my question. I have found this to be very efficient, as it iterates over the list only once and does not create any unnecessary lists in the process.
N.B.: This should not be confused with the pairwise recipe in Python's own itertools documentation, which yields s -> (s0, s1), (s1, s2), (s2, s3), ..., as pointed out by @lazyr in the comments.
A small addition for those who would like to do type checking with mypy on Python 3:
from typing import Iterable, Tuple, TypeVar

T = TypeVar("T")

def grouped(iterable: Iterable[T], n: int = 2) -> Iterable[Tuple[T, ...]]:
    """s -> (s0,s1,s2,...sn-1), (sn,sn+1,sn+2,...s2n-1), ..."""
    return zip(*[iter(iterable)] * n)
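The `zip(*[iter(iterable)] * n)` idiom works because the list holds n references to one and the same iterator, so zip pulls n consecutive items for every tuple. A side effect is that a trailing group shorter than n is dropped silently. The sketch below shows that behaviour next to a padded variant; `grouped_padded` and its `fillvalue` parameter are hypothetical names chosen for this example:

```python
from itertools import zip_longest

def grouped(iterable, n=2):
    """s -> (s0, ..., sn-1), (sn, ..., s2n-1), ...; drops a short tail."""
    # [iter(iterable)] * n is n references to ONE iterator, so each
    # output tuple consumes n consecutive items from the input.
    return zip(*[iter(iterable)] * n)

def grouped_padded(iterable, n=2, fillvalue=None):
    """Like grouped(), but pads the final group instead of dropping it."""
    return zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)

data = [1, 2, 3, 4, 5]
print(list(grouped(data, 2)))         # [(1, 2), (3, 4)]
print(list(grouped_padded(data, 2)))  # [(1, 2), (3, 4), (5, None)]
```

Which variant fits depends on whether an incomplete final group is meaningful for the data at hand.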
Iterate through pairs of items
Here a loop is used in which each pair of rows is joined with np.hstack. The row labels to select are passed to df.loc as a list in square brackets. On the last iteration the code ends up in the else block, where the last and the first index are used, so the final row is paired with the first one.
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1.0, 2.0, 2.0, 1.5], 'y': [1.1, 1.0, 2.0, 3.0]})
hist = len(df)
for i in range(0, hist):
    if i <= (hist - 2):
        print(np.hstack(df.loc[[i, i + 1], :].values))
    else:
        print(np.hstack(df.loc[[i, 0], :].values))
Output
[1. 1.1 2. 1. ]
[2. 1. 2. 2.]
[2. 2. 1.5 3. ]
[1.5 3. 1. 1.1]
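The same wrap-around pairing can also be sketched without pandas by zipping a list with a copy of itself rotated by one position. This is only an illustration under the assumption that the rows fit in a plain list; `cyclic_pairs` and `rows` are hypothetical names, with `rows` holding the same values as the DataFrame above:

```python
def cyclic_pairs(items):
    """Yield (item, next_item) pairs, pairing the last item with the first."""
    items = list(items)
    # Rotate the list left by one so position i lines up with item i + 1.
    return zip(items, items[1:] + items[:1])

rows = [(1.0, 1.1), (2.0, 1.0), (2.0, 2.0), (1.5, 3.0)]  # same values as df
for current, following in cyclic_pairs(rows):
    # Tuple concatenation plays the role of np.hstack on two rows.
    print(current + following)
```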
Iterate over a list, getting multiple items at once
The way that works is usually the proper way...
You could in principle zip the list with an offset slice of itself:
myList = [1, 2, 3, 4, 5]
for one, two in zip(myList, myList[1:]):
    print(one, two, sep=",")
Note that zip ends on the shortest given iterable, so it will finish on the shorter slice; there is no need to also shorten the full myList parameter.
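To make the stopping behaviour concrete, here is a small sketch comparing the slice with itertools.islice, which skips the first element lazily instead of copying the list. The variable names `with_slice` and `with_islice` are chosen only for this example:

```python
from itertools import islice

my_list = [1, 2, 3, 4, 5]

# zip stops as soon as the shorter input (the offset one) is exhausted.
with_slice = list(zip(my_list, my_list[1:]))

# islice skips the first element lazily instead of copying the list.
with_islice = list(zip(my_list, islice(my_list, 1, None)))

print(with_slice)   # [(1, 2), (2, 3), (3, 4), (4, 5)]
print(with_slice == with_islice)  # True
```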
Iterate over all pairs of consecutive items in a list
Just use zip
>>> l = [1, 7, 3, 5]
>>> for first, second in zip(l, l[1:]):
...     print(first, second)
...
1 7
7 3
3 5
If you use Python 2 (not suggested), you might consider using the izip function in itertools for very long lists, where you don't want zip to build a full list of pairs.
import itertools
for first, second in itertools.izip(l, l[1:]):
    ...
How can I iterate over overlapping (current, next) pairs of values from a list?
Here's a relevant example from the itertools module docs:
import itertools
def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)
For Python 2, you need itertools.izip instead of zip:
import itertools
def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return itertools.izip(a, b)
How this works:
First, two parallel iterators, a and b, are created (the tee() call), both pointing to the first element of the original iterable. The second iterator, b, is moved one step forward (the next(b, None) call). At this point a points to s0 and b points to s1. Both a and b can traverse the original iterator independently; zip (izip in Python 2) takes the two iterators and makes pairs of the returned elements, advancing both iterators at the same pace.
One caveat: the tee() function produces two iterators that can advance independently of each other, but this comes at a cost. If one of the iterators advances further than the other, then tee() needs to keep the consumed elements in memory until the second iterator consumes them too (it cannot 'rewind' the original iterator). Here it doesn't matter because one iterator is only one step ahead of the other, but in general it's easy to use a lot of memory this way.
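The steps described above can be traced by hand on a tiny input. This is only an illustrative sketch; the string "abcd" and the names `skipped` and `pairs` are made up for the example:

```python
from itertools import tee

# A one-shot iterator: it cannot be rewound, which is why tee() must buffer.
a, b = tee(iter("abcd"))

skipped = next(b, None)   # b moves one step ahead; tee keeps 'a' for iterator a
pairs = list(zip(a, b))   # a starts at 'a', b starts at 'b'

print(skipped)  # a
print(pairs)    # [('a', 'b'), ('b', 'c'), ('c', 'd')]

# The None default in next(b, None) avoids StopIteration on empty input.
empty_a, empty_b = tee(iter([]))
print(next(empty_b, None))            # None
print(list(zip(empty_a, empty_b)))    # []
```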
And since tee() can take an n parameter, this can also be used for more than two parallel iterators:
def threes(iterator):
    "s -> (s0, s1, s2), (s1, s2, s3), (s2, s3, s4), ..."
    a, b, c = itertools.tee(iterator, 3)
    next(b, None)
    next(c, None)
    next(c, None)
    return zip(a, b, c)
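For an arbitrary window size, writing one tee() branch per position gets unwieldy. One possible alternative, sketched here with a bounded collections.deque rather than tee(), keeps only the current window in memory; the function name `sliding_window` is an assumption for this example, not something defined in the answer above:

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """s -> (s0, ..., sn-1), (s1, ..., sn), (s2, ..., sn+1), ..."""
    it = iter(iterable)
    # Pre-fill the window with the first n - 1 items.
    window = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        window.append(item)   # maxlen drops the oldest item automatically
        yield tuple(window)

print(list(sliding_window([1, 2, 3, 4, 5], 3)))
# [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

With n=2 this produces the same overlapping pairs as pairwise(), and an input shorter than n yields nothing.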
Operation on every pair of element in a list
Check out product() in the itertools module. It does exactly what you describe.
import itertools
my_list = [1, 2, 3, 4]
for pair in itertools.product(my_list, repeat=2):
    foo(*pair)
This is equivalent to:
my_list = [1, 2, 3, 4]
for x in my_list:
    for y in my_list:
        foo(x, y)
Edit: There are two very similar functions as well, permutations() and combinations(). To illustrate how they differ:
product() generates every possible pairing of elements, including all duplicates:
1,1 1,2 1,3 1,4
2,1 2,2 2,3 2,4
3,1 3,2 3,3 3,4
4,1 4,2 4,3 4,4
permutations() generates all unique orderings of each unique pair of elements, eliminating the x,x duplicates:
. 1,2 1,3 1,4
2,1 . 2,3 2,4
3,1 3,2 . 3,4
4,1 4,2 4,3 .
Finally, combinations() only generates each unique pair of elements, in lexicographic order:
. 1,2 1,3 1,4
. . 2,3 2,4
. . . 3,4
. . . .
All three of these functions were introduced in Python 2.6.
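The three grids above can be reproduced directly. A short sketch, using the same my_list as before; the result names `all_pairs`, `ordered_pairs` and `unordered_pairs` are chosen only for this example:

```python
from itertools import combinations, permutations, product

my_list = [1, 2, 3, 4]

all_pairs = list(product(my_list, repeat=2))      # includes (x, x)
ordered_pairs = list(permutations(my_list, 2))    # (x, y) and (y, x), no (x, x)
unordered_pairs = list(combinations(my_list, 2))  # each pair once

print(len(all_pairs), len(ordered_pairs), len(unordered_pairs))  # 16 12 6
print(unordered_pairs)
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```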
How to iterate over two sorted lists in largest pairs order in Python
This uses the roundrobin recipe that Karl mentioned (copied verbatim from the itertools recipes; it could also be imported from more-itertools). I think this will be faster, since all the hard work is done in the C code of the various itertools functions.
from itertools import repeat, chain, cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

def pairs(a, b):
    aseen = []
    bseen = []
    def agen():
        for aa in a:
            aseen.append(aa)
            yield zip(repeat(aa), bseen)
    def bgen():
        for bb in b:
            bseen.append(bb)
            yield zip(aseen, repeat(bb))
    return chain.from_iterable(roundrobin(agen(), bgen()))

a = ['C', 'B', 'A']
b = [3, 2, 1]
print(*pairs(a, b))
Output:
('C', 3) ('B', 3) ('C', 2) ('B', 2) ('A', 3) ('A', 2) ('C', 1) ('B', 1) ('A', 1)
Benchmark with two iterables of 2000 elements each:
50 ms 50 ms 50 ms Kelly
241 ms 241 ms 242 ms Karl
Alternatively, if the two iterables can be iterated multiple times, we don't need to save what we've seen:
def pairs(a, b):
    def agen():
        for i, x in enumerate(a):
            yield zip(repeat(x), islice(b, i))
    def bgen():
        for i, x in enumerate(b, 1):
            yield zip(islice(a, i), repeat(x))
    return chain.from_iterable(roundrobin(agen(), bgen()))
(Will add to the benchmark later... Should be a little slower than my first solution.)
An extreme map/itertools version of that:
from itertools import chain, count, islice, repeat

def pairs(a, b):
    # roundrobin() is the recipe defined above.
    return chain.from_iterable(roundrobin(
        map(zip,
            map(repeat, a),
            map(islice, repeat(b), count())),
        map(zip,
            map(islice, repeat(a), count(1)),
            map(repeat, b))
    ))