Iterate Through Pairs of Items in a Python List

Iterate through pairs of items in a Python list

From the itertools recipes:

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

for v, w in pairwise(a):
    ...
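On Python 3.10 and later, the standard library provides this helper as itertools.pairwise, so the recipe only needs to be defined on older versions. A minimal sketch, assuming you want one name that works on both; the sample list `values` is made up for illustration:

```python
try:
    # Available in the standard library since Python 3.10.
    from itertools import pairwise
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

values = [1, 7, 3, 5]  # hypothetical sample data
overlapping = list(pairwise(values))
print(overlapping)  # [(1, 7), (7, 3), (3, 5)]
```

An input with fewer than two items simply yields no pairs.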

Iterating over every two elements in a list

You need a pairwise() (or grouped()) implementation.

def pairwise(iterable):
    "s -> (s0, s1), (s2, s3), (s4, s5), ..."
    a = iter(iterable)
    return zip(a, a)

for x, y in pairwise(l):
    print("%d + %d = %d" % (x, y, x + y))

Or, more generally:

def grouped(iterable, n):
    "s -> (s0,s1,s2,...sn-1), (sn,sn+1,sn+2,...s2n-1), (s2n,s2n+1,s2n+2,...s3n-1), ..."
    return zip(*[iter(iterable)]*n)

for x, y in grouped(l, 2):
    print("%d + %d = %d" % (x, y, x + y))

In Python 2, you should import izip from itertools as a replacement for Python 3's built-in zip() function.

All credit to martineau for his answer to my question. I have found this to be very efficient, as it iterates over the list only once and does not create any unnecessary lists in the process.

N.B: This should not be confused with the pairwise recipe in Python's own itertools documentation, which yields s -> (s0, s1), (s1, s2), (s2, s3), ..., as pointed out by @lazyr in the comments.
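One detail worth knowing about the grouped() approach: because zip stops at the shortest input, a trailing incomplete group is silently dropped. The sketch below shows that behavior next to a padded variant built on itertools.zip_longest. The name `grouped_padded` and the sample list are hypothetical, chosen only for this illustration:

```python
from itertools import zip_longest

def grouped(iterable, n):
    """Non-overlapping groups of n; an incomplete final group is dropped."""
    return zip(*[iter(iterable)] * n)

def grouped_padded(iterable, n, fillvalue=None):
    """Hypothetical variant: pad the final group instead of dropping it."""
    return zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)

data = [1, 2, 3, 4, 5]  # odd length on purpose
dropped = list(grouped(data, 2))
padded = list(grouped_padded(data, 2))
print(dropped)  # [(1, 2), (3, 4)]
print(padded)   # [(1, 2), (3, 4), (5, None)]
```

Which behavior is right depends on whether a leftover element is meaningful in your data.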

Little addition for those who would like to do type checking with mypy on Python 3:

from typing import Iterable, Tuple, TypeVar

T = TypeVar("T")

def grouped(iterable: Iterable[T], n: int = 2) -> Iterable[Tuple[T, ...]]:
    """s -> (s0,s1,s2,...sn-1), (sn,sn+1,sn+2,...s2n-1), ..."""
    return zip(*[iter(iterable)] * n)

Iterate through pairs of items

Here a loop joins each row with the next one using np.hstack. The list passed to df.loc in square brackets selects the two row indices to combine. On the last iteration the code falls into the else block, which pairs the last index with the first.

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1.0, 2.0, 2.0, 1.5], 'y': [1.1, 1.0, 2.0, 3.0]})

hist = len(df)
for i in range(0, hist):
    if i <= (hist - 2):
        print(np.hstack(df.loc[[i, i + 1], :].values))
    else:
        print(np.hstack(df.loc[[i, 0], :].values))

Output

[1.  1.1 2.  1. ]
[2. 1. 2. 2.]
[2. 2. 1.5 3. ]
[1.5 3. 1. 1.1]
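The wrap-around logic itself does not depend on pandas. As a rough sketch, assuming plain Python sequences, the if/else branch can be replaced by a modulo on the index. The function name `cyclic_pairs` is hypothetical, and the rows are the same values as the DataFrame above:

```python
def cyclic_pairs(items):
    """Yield (items[i], items[i + 1]) and finally (items[-1], items[0])."""
    n = len(items)
    for i in range(n):
        # The modulo wraps the partner of the last element back to index 0.
        yield items[i], items[(i + 1) % n]

rows = [(1.0, 1.1), (2.0, 1.0), (2.0, 2.0), (1.5, 3.0)]
wrapped = list(cyclic_pairs(rows))
print(wrapped[0])   # ((1.0, 1.1), (2.0, 1.0))
print(wrapped[-1])  # ((1.5, 3.0), (1.0, 1.1))
```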

Iterate over a list, getting multiple items at once

The way that works is usually the proper way...

You could in principle zip the list with an offset slice of itself:

myList = [1, 2, 3, 4, 5]
for one, two in zip(myList, myList[1:]):
    print(one, two, sep=",")

Note that zip ends on the shortest given iterable, so it will finish on the shorter slice; no need to also shorten the full myList parameter.
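The slice myList[1:] copies the list. If that copy matters, one possible alternative, sketched here on the assumption that the input is a list or another re-iterable sequence, is to skip the first element lazily with itertools.islice:

```python
from itertools import islice

my_list = [1, 2, 3, 4, 5]

# islice skips the first element lazily, so no second list is built.
lazy_pairs = list(zip(my_list, islice(my_list, 1, None)))
sliced_pairs = list(zip(my_list, my_list[1:]))
print(lazy_pairs)  # [(1, 2), (2, 3), (3, 4), (4, 5)]
```

Both forms produce the same pairs; they differ only in whether the shifted copy is materialized.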

Iterate over all pairs of consecutive items in a list

Just use zip

>>> l = [1, 7, 3, 5]
>>> for first, second in zip(l, l[1:]):
...     print(first, second)
...
1 7
7 3
3 5

If you use Python 2 (not suggested) you might consider using the izip function in itertools for very long lists where you don't want to create a new list.

import itertools

for first, second in itertools.izip(l, l[1:]):
    ...

How can I iterate over overlapping (current, next) pairs of values from a list?

Here's a relevant example from the itertools module docs:

import itertools

def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)

For Python 2, you need itertools.izip instead of zip:

import itertools

def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return itertools.izip(a, b)

How this works:

First, two parallel iterators, a and b, are created (the tee() call), both pointing to the first element of the original iterable. The second iterator, b, is moved one step forward (the next(b, None) call). At this point a points to s0 and b points to s1. Both a and b can traverse the original iterator independently: the zip function (izip in Python 2) takes the two iterators and makes pairs of the returned elements, advancing both iterators at the same pace.

One caveat: the tee() function produces two iterators that can advance independently of each other, but this comes at a cost. If one of the iterators advances further than the other, then tee() needs to keep the consumed elements in memory until the second iterator consumes them too (it cannot 'rewind' the original iterator). Here it doesn't matter because one iterator is only 1 step ahead of the other, but in general it's easy to use a lot of memory this way.
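To make the role of tee() concrete, here is a small sketch that feeds pairwise() a generator, which can be consumed only once and cannot be sliced. The generator `countdown` is a hypothetical stand-in for any one-shot source:

```python
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def countdown(n):
    """Hypothetical one-shot source: it cannot be sliced or rewound."""
    while n > 0:
        yield n
        n -= 1

result = list(pairwise(countdown(4)))
print(result)  # [(4, 3), (3, 2), (2, 1)]
```

Since b stays exactly one element ahead of a, tee() only ever has to buffer a single item here.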

And since tee() can take an n parameter, this can also be used for more than two parallel iterators:

def threes(iterator):
    "s -> (s0, s1, s2), (s1, s2, s3), (s2, s3, s4), ..."
    a, b, c = itertools.tee(iterator, 3)
    next(b, None)
    next(c, None)
    next(c, None)
    return zip(a, b, c)
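For an arbitrary window size, writing out one next() call per offset gets unwieldy. A different technique, sketched here along the lines of the sliding-window recipe in the itertools documentation, keeps the current window in a collections.deque instead of using tee(). Treat it as one possible generalization, not as the method used above:

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """s -> (s0, ..., s[n-1]), (s1, ..., s[n]), ..."""
    it = iter(iterable)
    # Pre-fill with the first n - 1 items; maxlen makes old items fall off.
    window = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        window.append(item)
        yield tuple(window)

triples = list(sliding_window([1, 2, 3, 4, 5], 3))
print(triples)  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

With n=2 this yields the same overlapping pairs as pairwise().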

Operation on every pair of element in a list

Check out product() in the itertools module. It does exactly what you describe.

import itertools

my_list = [1, 2, 3, 4]
for pair in itertools.product(my_list, repeat=2):
    foo(*pair)

This is equivalent to:

my_list = [1, 2, 3, 4]
for x in my_list:
    for y in my_list:
        foo(x, y)

Edit: There are two very similar functions as well, permutations() and combinations(). To illustrate how they differ:

product() generates every possible pairing of elements, including all duplicates:

1,1  1,2  1,3  1,4
2,1  2,2  2,3  2,4
3,1  3,2  3,3  3,4
4,1  4,2  4,3  4,4

permutations() generates all unique orderings of each unique pair of elements, eliminating the x,x duplicates:

 .   1,2  1,3  1,4
2,1   .   2,3  2,4
3,1  3,2   .   3,4
4,1  4,2  4,3   .

Finally, combinations() only generates each unique pair of elements, in lexicographic order:

 .   1,2  1,3  1,4
 .    .   2,3  2,4
 .    .    .   3,4
 .    .    .    .

All three of these functions were introduced in Python 2.6.
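The three grids above can be checked directly. This short sketch runs all three functions on the same four-element list and compares how many pairs each produces:

```python
from itertools import combinations, permutations, product

my_list = [1, 2, 3, 4]

all_pairs = list(product(my_list, repeat=2))  # 4 * 4 = 16 pairs
ordered = list(permutations(my_list, 2))      # 16 - 4 = 12 pairs, no x,x
unordered = list(combinations(my_list, 2))    # 12 / 2 = 6 pairs

print(len(all_pairs), len(ordered), len(unordered))  # 16 12 6
print(unordered)  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```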

How to iterate over two sorted lists in largest pairs order in Python

This uses the roundrobin recipe that Karl mentioned (copied verbatim from the itertools recipes; it could also be imported from more-itertools). I think this will be faster, since all the hard work is done in the C code of the various itertools functions.

from itertools import repeat, chain, cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

def pairs(a, b):
    aseen = []
    bseen = []
    def agen():
        for aa in a:
            aseen.append(aa)
            yield zip(repeat(aa), bseen)
    def bgen():
        for bb in b:
            bseen.append(bb)
            yield zip(aseen, repeat(bb))
    return chain.from_iterable(roundrobin(agen(), bgen()))

a = ['C', 'B', 'A']
b = [3, 2, 1]
print(*pairs(a, b))

Output:

('C', 3) ('B', 3) ('C', 2) ('B', 2) ('A', 3) ('A', 2) ('C', 1) ('B', 1) ('A', 1)

Benchmark with two iterables of 2000 elements each:

 50 ms   50 ms   50 ms  Kelly
241 ms  241 ms  242 ms  Karl

Alternatively, if the two iterables can be iterated multiple times, we don't need to save what we've seen:

def pairs(a, b):
    def agen():
        for i, x in enumerate(a):
            yield zip(repeat(x), islice(b, i))
    def bgen():
        for i, x in enumerate(b, 1):
            yield zip(islice(a, i), repeat(x))
    return chain.from_iterable(roundrobin(agen(), bgen()))

(Will add to the benchmark later... Should be a little slower than my first solution.)

An extreme map/itertools version of that:

from itertools import chain, count, islice, repeat

def pairs(a, b):
    return chain.from_iterable(roundrobin(
        map(zip,
            map(repeat, a),
            map(islice, repeat(b), count())),
        map(zip,
            map(islice, repeat(a), count(1)),
            map(repeat, b))
    ))

