How does zip(*[iter(s)]*n) work in Python?
iter() returns an iterator over a sequence. [x] * n produces a list of length n in which every element is the same object x. *arg unpacks a sequence into arguments for a function call. Therefore you're passing the same iterator 3 times to zip(), and it pulls an item from the iterator each time.
x = iter([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(list(zip(x, x, x)))  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
Python zip function with iterator vs iterable
The difference is that for [iter(string)] * 3, zip receives three aliases of a single iterator. For [string] * 3, zip creates a unique iterator per argument. The shorter output without duplicates is the result of zip exhausting the single aliased iterator.
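The two spellings can be compared side by side. This is only a small sketch; the sample string and the variable names are made up for illustration.

```python
string = "abcdef"  # made-up sample input

# One iterator, referenced three times: zip pulls successive items from it.
aliased = [iter(string)] * 3
assert aliased[0] is aliased[1] is aliased[2]
grouped = list(zip(*aliased))
print(grouped)  # [('a', 'b', 'c'), ('d', 'e', 'f')]

# One string, referenced three times: zip calls iter() on each argument,
# so every argument gets its own independent iterator.
repeated = list(zip(*[string] * 3))
print(repeated[:2])  # [('a', 'a', 'a'), ('b', 'b', 'b')]
```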
See "What is meaning of [iter(list)]*2 in python?" for more details on how [iter(...)] * 2 works and causes potentially unexpected results.
See the canonical answer "List of lists changes reflected across sublists unexpectedly" if the [...] * 3 aliasing behavior is surprising.
Python iterator and zip
An iterator is like a stream of items. You can only look at the items in the stream one at a time and you only ever have access to the first element. To look at something in the stream, you need to remove it from the stream and once you take something from the top of the stream, it's gone from the stream for good.
When you call zip(i, i), zip first looks at the first stream and takes an item out. Then it looks at the second stream (which happens to be the same stream as the first one) and takes an item out. Then it makes a tuple out of those two items and repeats this over and over until there is nothing left in the stream.
Maybe it's easier to see if I were to write the zip function in pure Python (with only 2 arguments for simplicity). It would look something like this [1]:
def zip(a, b):
    out = []
    try:
        while True:
            # Take one item from each argument in turn.
            item1 = next(a)
            item2 = next(b)
            out.append((item1, item2))
    except StopIteration:
        # Stop as soon as either argument runs out.
        return out
Now imagine the case that you are talking about, where a and b are the same object. In that case, we just call next twice on the iterator (i in your example case), which will just take the first two items from i in sequence and pack them into a tuple.
Once we've understood why zip(i, i) behaves the way it does, zip(*([i] * 2)) isn't too hard. Let's read the expression from the inside out...
[i] * 2
That just creates a new list (of length 2) where both of the elements are references to the iterator i. So it's the same thing as zip(*[i, i]) (it's just more convenient to write when you want to repeat something many more than 2 times). * unpacking is a common idiom in Python and you can find more information in the Python tutorial. The gist of it is that Python takes the iterable and "unpacks" it as if each item of the iterable was a separate positional argument to the function. So:
zip(*[i, i])
does the same thing as:
zip(i, i)
And now Bob's our uncle. We've just come full-circle since zip(i, i) is where this discussion started.
[1] This example code is simplified in more ways than just accepting only 2 arguments. For example, the real zip calls iter on its input arguments so that it works for any iterable (not just iterators), but this should be enough to get the point across...
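For the curious, here is a slightly less simplified sketch that accepts any number of iterables and calls iter on each one, as the footnote describes. The name my_zip is made up for this example, and the built-in zip is implemented in C, so treat this as an approximation of its behavior rather than its actual source.

```python
def my_zip(*iterables):
    """Rough pure-Python stand-in for zip (hypothetical name)."""
    # iter() on an iterator returns that same iterator, so passing one
    # iterator twice still yields two references to a single stream.
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                # The first exhausted argument ends the whole zip.
                return
        yield tuple(row)


i = iter(range(6))
pairs = list(my_zip(i, i))
print(pairs)  # [(0, 1), (2, 3), (4, 5)]

mixed = list(my_zip([1, 2, 3], "ab"))
print(mixed)  # [(1, 'a'), (2, 'b')]
```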
zip iterator missing last elements how to join lines together
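$'import sys,itertools\nfor x in itertools.zip_longest(*[iter(sys.stdin)]*50, fillvalue=""): print(",".join(x).replace("\\n",""))\n'
maybe? This uses zip_longest (izip_longest in Python 2) so that extra items at the end are not dropped. The fillvalue="" matters: with the default fill value of None, ",".join(x) would raise a TypeError on a padded final group.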
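Spelled out as a regular Python 3 function, the same idea might look like the sketch below. The helper name join_lines, its parameters, and the in-memory sample stream are all made up for illustration; in the one-liner above the stream would be sys.stdin and the group size 50.

```python
import io
from itertools import zip_longest


def join_lines(stream, n, sep=","):
    """Yield groups of up to n lines from stream, joined by sep (hypothetical helper)."""
    args = [iter(stream)] * n  # n references to one line iterator
    for group in zip_longest(*args, fillvalue=""):
        # Strip trailing newlines and skip the "" padding in the last group.
        yield sep.join(line.rstrip("\n") for line in group if line)


sample = io.StringIO("a\nb\nc\nd\ne\n")  # stand-in for sys.stdin
result = list(join_lines(sample, 2))
print(result)  # ['a,b', 'c,d', 'e']
```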
How does iterating over 3 elements at the same time using a combination of zip, * and *3 work?
Unwinding layers of "cleverness", you may find this equivalent spelling easier to follow:
x = iter(accounts_iter)
for a, b, c in zip(*[x, x, x]):
    print(a, b, c)
which is, in turn, equivalent to the even less-clever:
x = iter(accounts_iter)
for a, b, c in zip(x, x, x):
    print(a, b, c)
Now it should start to become clear. There is only a single iterator object, x. On each iteration, zip(), under the covers, calls next(x) 3 times, once for each iterator object passed to it. But it's the same iterator object here each time. So it delivers the first 3 next(x) results, and leaves the shared iterator object waiting to deliver its 4th result next. Lather, rinse, repeat.
BTW, I suspect you're parsing *[iter(accounts_iter)]*3 incorrectly in your head. The trailing *3 happens first, and then the prefix * is applied to the 3-element list that *3 created. f(*iterable) is a shortcut for calling f() with a variable number of arguments, one for each object iterable delivers.
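The order of evaluation can be made visible by doing the two steps separately. This is just an illustrative sketch; the sample accounts list is invented here and stands in for accounts_iter.

```python
accounts = ["a1", "a2", "a3", "a4", "a5", "a6"]  # made-up sample data

# Step 1: the trailing *3 builds a 3-element list.
iterators = [iter(accounts)] * 3
assert len(iterators) == 3
# All three elements are the very same iterator object.
assert all(it is iterators[0] for it in iterators)

# Step 2: the prefix * unpacks that list into three positional arguments.
triples = list(zip(*iterators))
print(triples)  # [('a1', 'a2', 'a3'), ('a4', 'a5', 'a6')]
```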
Why does x,y = zip(*zip(a,b)) work in Python?
The asterisk in Python is documented in the Python tutorial, under Unpacking Argument Lists.
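A short example may help here. Zipping two lists produces pairs, and unpacking those pairs back into zip regroups them by position, which recovers the original columns (as tuples). The sample lists are made up for illustration.

```python
a = [1, 2, 3]
b = ["x", "y", "z"]

pairs = list(zip(a, b))
print(pairs)  # [(1, 'x'), (2, 'y'), (3, 'z')]

# zip(*pairs) is the same call as zip((1, 'x'), (2, 'y'), (3, 'z')),
# which yields exactly two tuples: all first items, then all second items.
x, y = zip(*pairs)
print(x)  # (1, 2, 3)
print(y)  # ('x', 'y', 'z')
```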
What does this function do? (Python iterators)
Given:
>>> li
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
There is a common Python idiom of using zip in combination with iter and the * operator to partition a flat list into a list of tuples of length n (in Python 3, zip returns an iterator, so wrap it in list to see the result):
>>> n = 3
>>> list(zip(*([iter(li)] * n)))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14), (15, 16, 17), (18, 19, 20)]
However, if the overall length is not an even multiple of n, the leftover items at the end are dropped:
>>> n = 4
>>> list(zip(*([iter(li)] * n)))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15), (16, 17, 18, 19)]
You can use itertools.zip_longest (izip_longest in Python 2) to keep the complete list, with the incomplete final tuple padded with a fill value (None by default):
>>> from itertools import zip_longest
>>> list(zip_longest(*([iter(li)] * n)))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15), (16, 17, 18, 19), (20, None, None, None)]
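The idiom is often wrapped in a small helper, similar in spirit to the grouper recipe in the itertools documentation. The sketch below is one possible version, and the fill value of -1 is chosen arbitrarily for the demonstration.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into tuples of length n, padding the last one with fillvalue."""
    args = [iter(iterable)] * n  # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)


li = list(range(21))
chunks = list(grouper(li, 4, fillvalue=-1))
print(len(chunks))  # 6
print(chunks[-1])   # (20, -1, -1, -1)
```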
What exactly is going on here? (Python 3.7.6)
To understand what is happening, we want to analyze the statement
for chunk in map(''.join, zip(*[iter(strng)]*sz))
Inside out:
- iter(strng) returns an iterator that, each time it is accessed using next or in a loop, consumes an element (a character) of strng and returns said element.
- [iter(strng)] is a list; its only element is the iterator.
- [iter(strng)]*sz is the concatenation of sz copies of the list, [iter(strng), ..., iter(strng)], containing sz times the same iterator object, I mean literally the same iterator object.
- *[iter(strng)]*sz is equivalent to *[iter(strng), ..., iter(strng)] and, when used in a function argument list, unpacks its contents: the function sees its list of arguments as (iter(strng), ..., iter(strng)).
- zip(*[iter(strng)]*sz) is hence equivalent to zip(iter(strng), ..., iter(strng)).
- At each iteration zip takes the next element of each of its arguments and places them in a tuple, but because the various references to iter all refer to the same, original instance of iter(strng), the first tuple returned by zip contains the first sz characters of strng, the second contains characters sz+1 to 2*sz, and so on.
- Finally, each of these tuples is the argument of ''.join(), so we have a series of strings, each sz characters long, spanning the original strng.
That's it.
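Running the statement on a small input shows the result of the steps above. The sample strings are made up; note that when the length of strng is not a multiple of sz, the leftover characters are silently dropped, because zip stops at the first exhausted argument.

```python
sz = 3

strng = "abcdefghi"  # length is a multiple of sz
chunks = [chunk for chunk in map(''.join, zip(*[iter(strng)] * sz))]
print(chunks)  # ['abc', 'def', 'ghi']

longer = "abcdefghij"  # one extra character
truncated = list(map(''.join, zip(*[iter(longer)] * sz)))
print(truncated)  # ['abc', 'def', 'ghi'] (the trailing 'j' is lost)
```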
Is it possible to use zip() with a step parameter?
No zip, but [row[:3] for row in grid[:3]]
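As a sketch of what that slicing does, here it is on a made-up 5x5 grid. The second part is an addition not mentioned in the answer: if a step over zip's output is really wanted, itertools.islice accepts one.

```python
from itertools import islice

# Made-up 5x5 grid for illustration: the value encodes (row, column).
grid = [[r * 10 + c for c in range(5)] for r in range(5)]

# The slicing approach from the answer: the top-left 3x3 block.
top_left = [row[:3] for row in grid[:3]]
print(top_left)  # [[0, 1, 2], [10, 11, 12], [20, 21, 22]]

# zip itself takes no step argument, but islice can skip through its output.
every_other = list(islice(zip(grid[0], grid[1]), 0, None, 2))
print(every_other)  # [(0, 10), (2, 12), (4, 14)]
```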