How to Split an Iterable in Constant-Size Chunks

how to split an iterable in constant-size chunks

This is probably more efficient (faster)

def batch(iterable, n=1):
l = len(iterable)
for ndx in range(0, l, n):
yield iterable[ndx:min(ndx + n, l)]

for x in batch(range(0, 10), 3):
print x

Example using list

data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # list of data 

for x in batch(data, 3):
print(x)

# Output

[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9, 10]

It avoids building new lists.

How do I split a list into equally-sized chunks?

Here's a generator that yields evenly-sized chunks:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in range(0, len(lst), n):
yield lst[i:i + n]
import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74]]

For Python 2, using xrange instead of range:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in xrange(0, len(lst), n):
yield lst[i:i + n]

Below is a list comprehension one-liner. The method above is preferable, though, since using named functions makes code easier to understand. For Python 3:

[lst[i:i + n] for i in range(0, len(lst), n)]

For Python 2:

[lst[i:i + n] for i in xrange(0, len(lst), n)]

How can I consume an iterable in batches (equally sized chunks)?

I had to write one for myself, which I'm sharing here for me and the others to find here easily:

// subsequently yield iterators of given `size`
// these have to be fully consumed
function* batches(iterable, size) {
const it = iterable[Symbol.iterator]();
while (true) {
// this is for the case when batch ends at the end of iterable
// (we don't want to yield empty batch)
let {value, done} = it.next();
if (done) return value;

yield function*() {
yield value;
for (let curr = 1; curr < size; curr++) {
({value, done} = it.next());
if (done) return;

yield value;
}
}();
if (done) return value;
}
}

It yields generators, not Arrays for example. You have to fully consume each batch before calling next() on it again.

Split a generator into chunks without pre-walking it

One way would be to peek at the first element, if any, and then create and return the actual generator.

def head(iterable, max=10):
first = next(iterable) # raise exception when depleted
def head_inner():
yield first # yield the extracted first element
for cnt, el in enumerate(iterable):
yield el
if cnt + 1 >= max: # cnt + 1 to include first
break
return head_inner()

Just use this in your chunk generator and catch the StopIteration exception like you did with your custom exception.


Update: Here's another version, using itertools.islice to replace most of the head function, and a for loop. This simple for loop in fact does exactly the same thing as that unwieldy while-try-next-except-break construct in the original code, so the result is much more readable.

def chunks(iterable, size=10):
iterator = iter(iterable)
for first in iterator: # stops when iterator is depleted
def chunk(): # construct generator for next chunk
yield first # yield element from for loop
for more in islice(iterator, size - 1):
yield more # yield more elements from the iterator
yield chunk() # in outer generator, yield next chunk

And we can get even shorter than that, using itertools.chain to replace the inner generator:

def chunks(iterable, size=10):
iterator = iter(iterable)
for first in iterator:
yield chain([first], islice(iterator, size - 1))

How do I split a list into equally-sized chunks?

Here's a generator that yields evenly-sized chunks:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in range(0, len(lst), n):
yield lst[i:i + n]
import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74]]

For Python 2, using xrange instead of range:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in xrange(0, len(lst), n):
yield lst[i:i + n]

Below is a list comprehension one-liner. The method above is preferable, though, since using named functions makes code easier to understand. For Python 3:

[lst[i:i + n] for i in range(0, len(lst), n)]

For Python 2:

[lst[i:i + n] for i in xrange(0, len(lst), n)]

How do I split a list into equally-sized chunks?

Here's a generator that yields evenly-sized chunks:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in range(0, len(lst), n):
yield lst[i:i + n]
import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74]]

For Python 2, using xrange instead of range:

def chunks(lst, n):
"""Yield successive n-sized chunks from lst."""
for i in xrange(0, len(lst), n):
yield lst[i:i + n]

Below is a list comprehension one-liner. The method above is preferable, though, since using named functions makes code easier to understand. For Python 3:

[lst[i:i + n] for i in range(0, len(lst), n)]

For Python 2:

[lst[i:i + n] for i in xrange(0, len(lst), n)]

Splitting up a python list in chunks based on length of items

Something like this should work

def _splicegen(maxchars, stringlist):
"""
Return a list of slices to print based on maxchars string-length boundary.
"""
runningcount = 0 # start at 0
tmpslice = [] # tmp list where we append slice numbers.
for i, item in enumerate(stringlist):
runningcount += len(item)
if runningcount <= int(maxchars):
tmpslice.append(i)
else:
yield tmpslice
tmpslice = [i]
runningcount = len(item)
yield(tmpslice)

Also see the textwrap module



Related Topics



Leave a reply



Submit