How do I make a flat list out of a list of lists?
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
    for item in sublist:
        flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
def flatten(l):
    return [item for sublist in l for item in sublist]
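As a quick check, here is a minimal usage sketch (the sample data is only illustrative). Note that the comprehension removes exactly one level of nesting, so any deeper sublists are left intact.

```python
def flatten(l):
    # One pass over the outer list, one pass over each sublist.
    return [item for sublist in l for item in sublist]

nested = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(flatten(nested))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Only one level is removed: the inner [3, 4] stays a list.
print(flatten([[1, 2], [[3, 4], 5]]))  # [1, 2, [3, 4], 5]
```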
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
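The same comparison can also be run from inside Python rather than from the shell. The sketch below is one possible way to do it, assuming Python 3 (where reduce has to be imported from functools); the helper name best_time and its default repeat counts are illustrative choices, and absolute timings will differ from the figures quoted above depending on machine and interpreter version.

```python
import timeit
from functools import reduce  # reduce is no longer a builtin in Python 3

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

candidates = {
    "comprehension": lambda: [item for sublist in l for item in sublist],
    "sum": lambda: sum(l, []),
    "reduce": lambda: reduce(lambda x, y: x + y, l),
}

def best_time(func, number=100, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

if __name__ == "__main__":
    for name, func in candidates.items():
        print(f"{name}: {best_time(func) * 1e6:.1f} usec per loop")
```

All three candidates produce the same flat list; only the running time differs.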
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists: as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; the total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * L * (L-1) / 2, which is approximately I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
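The arithmetic can be checked with a small counting model. This is only a sketch of the cost argument, not a measurement of CPython internals: it assumes L sublists of I items each, and the function names are made up for illustration. The first function counts how often an item that is already in the intermediate result has to be copied again, which is I times the sum of x for x from 1 to L excluded.

```python
def recopies_with_repeated_concat(num_sublists, items_per_sublist):
    """Count how many already-placed items get copied again when the
    result is rebuilt with `result = result + sublist` at every step."""
    recopies = 0
    result_len = 0
    for _ in range(num_sublists):
        # A new list is allocated, so everything accumulated so far is copied.
        recopies += result_len
        result_len += items_per_sublist
    return recopies

def copies_with_comprehension(num_sublists, items_per_sublist):
    """The comprehension writes each item into the result exactly once."""
    return num_sublists * items_per_sublist

L, I = 396, 3  # illustrative sizes, not taken from the benchmark above
print(recopies_with_repeated_concat(L, I))  # 234630
print(copies_with_comprehension(L, I))      # 1188
```

Doubling L roughly quadruples the first number but only doubles the second.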
How to flatten list of lists?
Use SelectMany:
var legalEntityIds =
    query.SelectMany(x => x.LegalEntities).Select(y => y.LegalEntityId).ToList();
or, using query syntax:
var legalEntityIds = (
    from item in query
    from legalEntity in item.LegalEntities
    select legalEntity.LegalEntityId
).ToList();
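For readers following the Python answers on this page, the same flatten-then-project shape can be sketched with a nested comprehension. This is a rough analogue, not a translation of any real API: the LegalEntity and Item classes and the sample data below are hypothetical stand-ins for whatever `query` holds in the C# code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LegalEntity:          # hypothetical stand-in for the C# type
    legal_entity_id: int

@dataclass
class Item:                 # hypothetical element of `query`
    legal_entities: List[LegalEntity] = field(default_factory=list)

query = [
    Item([LegalEntity(1), LegalEntity(2)]),
    Item([]),
    Item([LegalEntity(3)]),
]

# Flatten the inner collections and project the id in one pass,
# mirroring SelectMany(...).Select(...).ToList().
legal_entity_ids = [
    entity.legal_entity_id
    for item in query
    for entity in item.legal_entities
]
print(legal_entity_ids)  # [1, 2, 3]
```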
Flatten an irregular (arbitrarily nested) list of lists
Using generator functions can make your example easier to read and improve performance.
Python 2
Using the Iterable ABC added in 2.6:
from collections import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable) and not isinstance(x, basestring):
            for item in flatten(x):
                yield item
        else:
            yield x
Python 3
In Python 3, basestring is no more, but the tuple (str, bytes) gives the same effect. Also, yield from delegates to the nested generator and yields its items one at a time, replacing the explicit inner loop.
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x
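A short usage sketch for the Python 3 version follows; the sample data is illustrative. Because flatten is a generator function, the result has to be consumed (here with list) before anything is produced, and strings and bytes come out whole rather than being split into characters.

```python
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        # Strings and bytes are iterable, but are treated as leaves here.
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

irregular = [1, [2, [3, "ab"]], (4, 5), b"xy", []]
print(list(flatten(irregular)))  # [1, 2, 3, 'ab', 4, 5, b'xy']
```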
How can I completely flatten a list (of lists (of lists) ... )
Unfortunately there's no direct built-in that completely flattens a data structure even when sub-lists are wrapped in item containers.
Some possible solutions:
Gather/take
You've already come up with a solution like this, but deepmap can take care of all the tree iteration logic to simplify it. Its callback is called once for every leaf node of the data structure, so using take as the callback means that gather will collect a flat list of the leaf values:
sub reallyflat (+@list) { gather @list.deepmap: *.take }
Custom recursive function
You could use a subroutine like this to recursively slip lists into their parent:
multi reallyflat (@list) { @list.map: { slip reallyflat $_ } }
multi reallyflat (\leaf) { leaf }
Another approach would be to recursively apply <> to sub-lists to free them of any item containers they're wrapped in, and then call flat on the result:
sub reallyflat (+@list) {
    flat do for @list {
        when Iterable { reallyflat $_<> }
        default { $_ }
    }
}
Multi-dimensional array indexing
The postcircumfix [ ] operator can be used with a multi-dimensional subscript to get a flat list of leaf nodes up to a certain depth, though unfortunately the "infinite depth" version is not yet implemented:
say @ab[*;*]; # (a (b c) (d) e f [a (b c)] x (y z) w)
say @ab[*;*;*]; # (a b c d e f a (b c) x y z w)
say @ab[*;*;*;*]; # (a b c d e f a b c x y z w)
say @ab[**]; # HyperWhatever in array index not yet implemented. Sorry.
Still, if you know the maximum depth of your data structure this is a viable solution.
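The idea of flattening only down to a known depth can be illustrated outside Raku as well. The following is a Python sketch of that idea, not Raku code and not a model of item containers: the function name flatten_to_depth is hypothetical, and it only descends into lists and tuples.

```python
def flatten_to_depth(xs, depth):
    """Yield items of `xs`, descending at most `depth` levels
    into nested lists or tuples."""
    for x in xs:
        if depth > 0 and isinstance(x, (list, tuple)):
            yield from flatten_to_depth(x, depth - 1)
        else:
            yield x

nested = ["a", ["b", ["c", ["d"]]], "e"]
print(list(flatten_to_depth(nested, 1)))  # ['a', 'b', ['c', ['d']], 'e']
print(list(flatten_to_depth(nested, 2)))  # ['a', 'b', 'c', ['d'], 'e']
print(list(flatten_to_depth(nested, 3)))  # ['a', 'b', 'c', 'd', 'e']
```

As with the subscript approach, choosing a depth at least as large as the deepest nesting yields a fully flat result.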
Avoiding containerization
The built-in flat function can flatten deeply nested lists of lists just fine. The problem is just that it doesn't descend into item containers (Scalars). Common sources of unintentional item containers in nested lists are:
- An Array (but not List) wraps each of its elements in a fresh item container, no matter if it had one before.
  - How to avoid: Use Lists of Lists instead of Arrays of Arrays, if you don't need the mutability that Array provides. Binding with := can be used instead of assignment, to store a List in a @ variable without turning it into an Array:
    my @a := 'a', ('b', 'c' );
    my @b := ('d',), 'e', 'f', @a;
    say flat @b; # (d e f a b c)
- $ variables are item containers.
  - How to avoid: When storing a list in a $ variable and then inserting it as an element into another list, use <> to decontainerize it. The parent list's container can also be bypassed using | when passing it to flat:
    my $a = (3, 4, 5);
    my $b = (1, 2, $a<>, 6);
    say flat |$b; # (1 2 3 4 5 6)
Flattening a list of strings and lists to work on each item
A more concise approach is:
for entry in initial_list:
    for term in ([entry] if isinstance(entry, str) else entry):
        do_something(term)
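To make the loop runnable on its own, here is a minimal sketch with sample data. Both initial_list and the body of do_something are placeholders chosen for illustration; the original question does not say what the per-item work is.

```python
def do_something(term):
    # Hypothetical placeholder for the per-item work.
    return term.upper()

initial_list = ["alpha", ["beta", "gamma"], "delta"]  # illustrative data

results = []
for entry in initial_list:
    # Wrap a lone string in a list so both cases iterate the same way.
    for term in ([entry] if isinstance(entry, str) else entry):
        results.append(do_something(term))

print(results)  # ['ALPHA', 'BETA', 'GAMMA', 'DELTA']
```

Without the isinstance check, iterating over a bare string would process it character by character.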