How Does zip(*[iter(s)]*n) Work in Python

How does zip(*[iter(s)]*n) work in Python?

iter() returns an iterator over a sequence. [x] * n produces a list of length n in which every element is the same object x. *arg unpacks a sequence into positional arguments for a function call. With n = 3, you are therefore passing the same iterator to zip() three times, and zip() pulls the next item from that one iterator each time it fills a slot in a tuple.

x = iter([1,2,3,4,5,6,7,8,9])
print(list(zip(x, x, x)))  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
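The claim that the list holds one iterator repeated n times can be checked directly. This is only a small sketch; the names `s`, `n`, `args`, and `same_object` are illustrative and not part of the original answer.

```python
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3

# [iter(s)] * n repeats a reference to ONE iterator; it does not copy it.
args = [iter(s)] * n
same_object = all(arg is args[0] for arg in args)

# zip pulls one item per argument, so each tuple takes n consecutive items.
chunks = list(zip(*args))

print(same_object)  # True
print(chunks)       # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
```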

Python zip function with iterator vs iterable

The difference is that with [iter(string)] * 3, zip receives three references to a single iterator. With [string] * 3, zip creates a separate iterator for each argument. The shorter output without duplicates comes from zip draining that single shared iterator.
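A side-by-side comparison may make the contrast concrete. The sample value of `string` below is an assumption chosen for illustration.

```python
string = "abcdef"

# One shared iterator: every argument advances the same stream.
shared = list(zip(*[iter(string)] * 3))

# Three references to the string itself: zip calls iter() on each
# argument, so every argument gets its own independent iterator.
independent = list(zip(*[string] * 3))

print(shared)           # [('a', 'b', 'c'), ('d', 'e', 'f')]
print(independent[:2])  # [('a', 'a', 'a'), ('b', 'b', 'b')]
```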

See what is meaning of [iter(list)]*2 in python? for more details on how [iter(...)] * 2 works and causes potentially unexpected results.

See the canonical answer List of lists changes reflected across sublists unexpectedly if the [...] * 3 aliasing behavior is surprising.

Python iterator and zip

An iterator is like a stream of items. You can only look at the items in the stream one at a time and you only ever have access to the first element. To look at something in the stream, you need to remove it from the stream and once you take something from the top of the stream, it's gone from the stream for good.
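The "gone for good" behaviour of a stream can be seen in a few lines. This is a minimal sketch with made-up data.

```python
stream = iter(["a", "b", "c"])

first = next(stream)      # takes "a" out of the stream
remaining = list(stream)  # drains what is left
after = list(stream)      # nothing comes back a second time

print(first, remaining, after)  # a ['b', 'c'] []
```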

When you call zip(i, i), zip first looks at the first stream and takes an item out. Then it looks at the second stream (which happens to be the same stream as the first one) and takes an item out. Then it makes a tuple out of those two items and repeats this over and over until there is nothing left in the stream.

Maybe it's easier to see if I were to write the zip function in pure Python (with only 2 arguments for simplicity). It would look something like this [1]:

def zip(a, b):
    out = []
    try:
        while True:
            item1 = next(a)
            item2 = next(b)
            out.append((item1, item2))
    except StopIteration:
        return out

Now imagine the case that you are talking about where a and b are the same object. In that case, we just call next twice on the iterator (i in your example case) which will just take the first two items from i in sequence and pack them into a tuple.

Once we've understood why zip(i, i) behaves the way it does, zip(*([i] * 2)) isn't too hard. Let's read the expression from the inside out...

[i] * 2

That just creates a new list (of length 2) where both of the elements are references to the iterator i. So it's the same thing as zip(*[i, i]) (it's just more convenient to write when you want to repeat something many more than 2 times). * unpacking is a common idiom in Python and you can find more information in the Python tutorial. The gist of it is that Python takes the iterable and "unpacks" it as if each item of the iterable was a separate positional argument to the function. So:

zip(*[i, i])

does the same thing as:

zip(i, i)

And now Bob's our uncle. We've just come full-circle since zip(i, i) is where this discussion started.

[1] This example code is simplified in more ways than just accepting only 2 arguments. For example, the real zip calls iter on its arguments so that it works for any iterable (not just iterators), but this should be enough to get the point across...
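For readers who want the simplifications lifted, here is a slightly fuller sketch that accepts any number of iterables and calls iter on each. It is a rough model only, not the real implementation, and the name `zip_sketch` is hypothetical.

```python
def zip_sketch(*iterables):
    """Rough pure-Python model of zip; not the real implementation."""
    # Calling iter() on an iterator returns the iterator itself, so
    # passing the same iterator twice still yields shared state.
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return  # stop as soon as any input runs dry
        yield tuple(row)


i = iter([1, 2, 3, 4])
pairs = list(zip_sketch(i, i))
print(pairs)  # [(1, 2), (3, 4)]
```

Because it is a generator, this version is lazy like Python 3's zip rather than returning a list.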

zip iterator missing last elements: how to join lines together

 $'import sys,itertools\nfor x in itertools.izip_longest(*[iter(sys.stdin)]*50): print(",".join(x).replace("\\n",""))\n'

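Maybe this? It uses izip_longest (zip_longest in Python 3) so that leftover lines at the end are not dropped. Note that the padding value defaults to None, which ",".join() rejects, so pass fillvalue="" if the number of lines may not be a multiple of 50.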
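As a standalone Python 3 sketch of the same idea: the helper name `join_lines`, its parameters, and the in-memory `demo` stream standing in for sys.stdin are all assumptions made for illustration.

```python
import io
from itertools import zip_longest


def join_lines(stream, size, sep=","):
    """Yield `size` lines at a time joined by `sep` (hypothetical helper)."""
    stripped = (line.rstrip("\n") for line in stream)
    # The default fill value None marks padding, which is filtered out
    # before joining so the last group can be shorter.
    for group in zip_longest(*[stripped] * size):
        yield sep.join(item for item in group if item is not None)


demo = io.StringIO("a\nb\nc\nd\ne\n")  # stands in for sys.stdin
result = list(join_lines(demo, 2))
print(result)  # ['a,b', 'c,d', 'e']
```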

How does iterating over 3 elements at the same time using a combination of zip, * and *3 work?

Unwinding layers of "cleverness", you may find this equivalent spelling easier to follow:

x = iter(accounts_iter)
for a, b, c in zip(*[x, x, x]):
    print(a, b, c)

which is, in turn, equivalent to the even less-clever:

x = iter(accounts_iter)
for a, b, c in zip(x, x, x):
    print(a, b, c)

Now it should start to become clear. There is only a single iterator object, x. On each iteration, zip(), under the covers, calls next(x) 3 times, once for each iterator object passed to it. But it's the same iterator object here each time. So it delivers the first 3 next(x) results, and leaves the shared iterator object waiting to deliver its 4th result next. Lather, rinse, repeat.

BTW, I suspect you're parsing *[iter(accounts_iter)]*3 incorrectly in your head. The trailing *3 happens first, and then the prefix * is applied to the 3-element list *3 created. f(*iterable) is a shortcut for calling f() with a variable number of arguments, one for each object iterable delivers.
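To make both points tangible, the sketch below builds the repeated list first and then performs by hand the three next(x) calls that zip() makes per output tuple. The `accounts` list is placeholder data, not the asker's real input.

```python
accounts = ["a1", "a2", "a3", "a4", "a5", "a6"]  # placeholder data

x = iter(accounts)
repeated = [x] * 3  # the trailing *3 runs first: [x, x, x]
assert repeated[0] is repeated[1] is repeated[2]

# What zip does for one output tuple: one next() per argument.
first = (next(x), next(x), next(x))
second = (next(x), next(x), next(x))

print(first)   # ('a1', 'a2', 'a3')
print(second)  # ('a4', 'a5', 'a6')
```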

Why does x,y = zip(*zip(a,b)) work in Python?

The asterisk in Python is documented in the Python tutorial, under Unpacking Argument Lists.
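A short illustration of how that unpacking makes zip(*zip(a, b)) act as an "unzip". The sample lists are assumptions.

```python
a = [1, 2, 3]
b = ["x", "y", "z"]

pairs = list(zip(a, b))  # [(1, 'x'), (2, 'y'), (3, 'z')]

# zip(*pairs) is zip((1, 'x'), (2, 'y'), (3, 'z')): the first items of
# every pair form one tuple, the second items form another.
x, y = zip(*pairs)

print(x)  # (1, 2, 3)
print(y)  # ('x', 'y', 'z')
```

Note that the results are tuples, not lists, so x == tuple(a) rather than x == a.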

What does this function do? (Python iterators)

Given:

>>> li
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

There is a common Python idiom that combines zip, iter, and the * operator to partition a flat list into tuples of length n:

>>> n=3
>>> list(zip(*([iter(li)] * n)))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14), (15, 16, 17), (18, 19, 20)]

However, if the length of the list is not a multiple of n, the leftover items at the end are dropped:

>>> n=4
>>> list(zip(*([iter(li)] * n)))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15), (16, 17, 18, 19)]

You can use zip_longest (izip_longest in Python 2) to keep every item, padding the incomplete final tuple with a fill value (None by default):

>>> from itertools import zip_longest
>>> list(zip_longest(*([iter(li)] * n)))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15), (16, 17, 18, 19), (20, None, None, None)]
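If padding is not wanted either, one alternative (a different technique from the zip idiom, shown here only as a sketch) is to pull n items at a time with itertools.islice so the last chunk is simply shorter. The helper name `chunked` is hypothetical.

```python
from itertools import islice


def chunked(iterable, n):
    """Yield tuples of up to n items; the last one may be shorter."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, n))
        if not chunk:
            return
        yield chunk


li = list(range(21))
chunks = list(chunked(li, 4))
print(chunks[-2:])  # [(16, 17, 18, 19), (20,)]
```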

What exactly is going on here? (Python 3.7.6)

To understand what is happening, we want to analyze the statement

for chunk in map(''.join, zip(*[iter(strng)]*sz))

Inside out:

  1. iter(strng) returns an iterator that, each time it is advanced with next or by a loop, consumes one element (a character) of strng and returns it.
  2. [iter(strng)] is a list whose only element is that iterator.
  3. [iter(strng)]*sz is the concatenation of sz copies of the list: a list that contains the same iterator object sz times, literally the same object and not sz separate iterators.
  4. *[iter(strng)]*sz is equivalent to *[it, ..., it], where it is that single iterator, and, when used in a function argument list, unpacks its contents: the function sees its arguments as (it, ..., it).
  5. zip(*[iter(strng)]*sz) is hence equivalent to zip(it, ..., it).
  6. At each iteration zip takes the next element of each of its arguments and places them in a tuple, but because all the arguments refer to the same iterator, the first tuple returned by zip contains the first sz characters of strng, the second contains characters sz+1 through 2*sz, and so on. Characters left over when len(strng) is not a multiple of sz are dropped.
  7. Finally, each of these tuples is passed to ''.join(), so we get a series of strings, each sz characters long, spanning the original strng.

That's it.
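The walk-through above can be run on a small input. The values of `strng` and `sz` below are assumptions picked so that the length is not a multiple of sz, which shows the dropped tail.

```python
strng = "abcdefgh"
sz = 3

chunks = list(map("".join, zip(*[iter(strng)] * sz)))
print(chunks)  # ['abc', 'def']  ("gh" is dropped: zip stops early)

# Slicing keeps the trailing partial chunk, but it needs a sequence
# rather than an arbitrary iterable.
sliced = [strng[i:i + sz] for i in range(0, len(strng), sz)]
print(sliced)  # ['abc', 'def', 'gh']
```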

Is it possible to use zip() with a step parameter?

Not with zip, but you can slice instead:
[row[:3] for row in grid[:3]]
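The original question is not shown here, so the following is a guess at one reading of "step": taking every second pair from two inputs. Both variants use only standard slicing and itertools.islice; the sample data is made up.

```python
from itertools import islice

a = [0, 1, 2, 3, 4, 5]
b = "abcdef"

# Slice the inputs before zipping (works for sequences) ...
by_slicing = list(zip(a[::2], b[::2]))

# ... or step through the zipped stream (works for any iterables).
by_islice = list(islice(zip(a, b), 0, None, 2))

print(by_slicing)  # [(0, 'a'), (2, 'c'), (4, 'e')]
print(by_islice)   # [(0, 'a'), (2, 'c'), (4, 'e')]
```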


