Flatten Nested Lists in a List

How do I make a flat list out of a list of lists?

Given a list of lists l,

flat_list = [item for sublist in l for item in sublist]

which means:

flat_list = []
for sublist in l:
    for item in sublist:
        flat_list.append(item)

is faster than the shortcuts posted so far, such as sum(l, []) or reduce with +. (l is the list to flatten.)

Here is the corresponding function:

def flatten(l):
    return [item for sublist in l for item in sublist]

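As evidence, you can use the timeit module in the standard library (the commands below were run on Python 2, where reduce is a builtin; on Python 3 it must be imported from functools):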

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
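For readers on Python 3, the same comparison can be sketched as a small script. This is only a rough sketch: the `candidates` dictionary and the loop count of 200 are choices made here for illustration, and the absolute numbers will differ from the timings above depending on machine and interpreter version.

```python
import timeit
from functools import reduce  # reduce is not a builtin in Python 3

# Same test data as in the command-line runs above.
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

# Hypothetical helper mapping: a label for each approach being compared.
candidates = {
    "comprehension": lambda: [item for sublist in l for item in sublist],
    "sum": lambda: sum(l, []),
    "reduce": lambda: reduce(lambda x, y: x + y, l),
}

# All three approaches should build the same flat list.
results = {name: fn() for name, fn in candidates.items()}

loops = 200
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=loops)
    print(f"{name:>13}: {seconds / loops * 1e6:.1f} usec per loop")
```

Only the relative ordering of the three timings is of interest here, not the exact values.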

Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists. As the intermediate result list keeps getting longer, each step allocates a new intermediate result list and copies over all the items of the previous intermediate result (plus a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied again L-1 times, the second I items L-2 times, and so on. The total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * L * (L-1) / 2, or roughly I * (L**2)/2.

The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
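The copy count in the explanation can be checked with a small simulation. This is a sketch, not a measurement of the interpreter: the function name `recopied_items` is made up here, and it only counts how many already-placed items the repeated + has to copy again.

```python
def recopied_items(num_sublists, items_per_sublist):
    """Count how many already-placed items are copied again by repeated +."""
    result = []
    recopied = 0
    for _ in range(num_sublists):
        # result + sublist builds a new list, copying everything already in result.
        recopied += len(result)
        result = result + [0] * items_per_sublist
    return recopied

L, I = 100, 3
print(recopied_items(L, I))    # counted by simulation
print(I * L * (L - 1) // 2)    # closed form: I * (1 + 2 + ... + (L - 1))
```

Both lines print the same number, which grows quadratically in L, whereas the list comprehension places each of the L * I items exactly once.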

Flatten nested lists in a list

In R, loop through the list, unlist each element recursively, then wrap the result in a list again:

lapply(LIST2, function(i) list(unlist(i, recursive = TRUE)))

Flatten nested list in pandas containing NaN

This should work for arbitrarily nested lists:

from collections.abc import Iterable
def flatten(l):
    for el in l:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el
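As a quick check of the generator, here is a small usage sketch. The `nested` input is made up for illustration; the function body is repeated so the snippet runs on its own.

```python
from collections.abc import Iterable

def flatten(l):
    for el in l:
        # Recurse into any iterable except strings and bytes, which stay whole.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el

# Hypothetical input mixing lists, a tuple, a string and a bytes object.
nested = [1, [2, [3, "four"]], (5, [6.0]), b"seven"]
print(list(flatten(nested)))
# → [1, 2, 3, 'four', 5, 6.0, b'seven']
```

Note that any other iterable, such as a tuple or a set, is unpacked as well, and that very deep nesting can hit Python's recursion limit.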

So, recreating your df:

import pandas as pd
df = pd.DataFrame([[[[float('nan')],[float('nan'), 'DE']]],
                   [[[float('nan'), ['IT', 'DE']]]],
                   [[[['FR']]]],
                   [[[['AE'], float('nan'), ['AE',  'MT'], ['MX']]]]],columns=['country'])

# Flatten each cell, drop NaN, and remove duplicates while keeping first-seen order
df['country'] = df['country'].apply(lambda x: list(dict.fromkeys(i for i in flatten(x) if not pd.isna(i))))

This gives the following output:

    country
0   [DE]
1   [IT, DE]
2   [FR]
3   [AE, MT, MX]

Flattening List of Dict containing multiple nested lists using pandas json_normalize

You can use:

df = pd.json_normalize(users_info)
df_addresses = df['Addresses.records'].explode().apply(pd.Series)
df_addresses.rename(columns={col:f'Addresses.{col}' for col in df_addresses.columns}, inplace=True)

df_education = df['Education.records'].explode().apply(pd.Series)
df_education.rename(columns={col:f'Education.{col}' for col in df_education.columns}, inplace=True)

cols = [col for col in df.columns if col not in ['Addresses.records', 'Education.records']]
df = df[cols].join(df_addresses).join(df_education)
df.dropna(axis=1, how='all', inplace=True)
print(df)

OUTPUT

Id Name Country.Name Addresses.addressId Addresses.line1 Addresses.city Education.Degree Education.Id
0    21  ABC    Country 1                  12        xyz, 102            PQR        Bachelors           45
0    21  ABC    Country 1                  12        xyz, 102            PQR          Masters           49
0    21  ABC    Country 1                  13        YTR, 102            NMS        Bachelors           45
0    21  ABC    Country 1                  13        YTR, 102            NMS          Masters           49
1    26  PEW    Country 2                  10         BTR, 12            UYT        Bachelors           45
1    26  PEW    Country 2                  10         BTR, 12            UYT          Masters           49
1    26  PEW    Country 2                 123         MEQW, 6            KJH        Bachelors           45
1    26  PEW    Country 2                 123         MEQW, 6            KJH          Masters           49
2   214  TUF          NaN                 NaN             NaN            NaN        Bachelors           45
2   214  TUF          NaN                 NaN             NaN            NaN          Masters           49
3  2609  JJU    Country 2                  10         BTR, 12            UYT              NaN          NaN
3  2609  JJU    Country 2                 123         MEQW, 6            KJH              NaN          NaN
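The output above has one row per (address, education) pair for each user, because both exploded frames are joined on the same repeated index. The pairing can be sketched in plain Python as follows. This is an illustration only: the `users_info` records here are hypothetical, shaped like what the output implies, and `cross_rows` is a made-up helper, not part of pandas.

```python
from itertools import product

# Hypothetical input, shaped like the records implied by the output above.
users_info = [
    {"Id": 21, "Name": "ABC",
     "Addresses": {"records": [{"addressId": 12}, {"addressId": 13}]},
     "Education": {"records": [{"Degree": "Bachelors"}, {"Degree": "Masters"}]}},
    {"Id": 214, "Name": "TUF",
     "Education": {"records": [{"Degree": "Bachelors"}, {"Degree": "Masters"}]}},
]

def cross_rows(user):
    """Yield one flat dict per (address, education) pair for a single user."""
    # A missing or empty record list contributes one empty placeholder,
    # so the user still appears in the result.
    addresses = user.get("Addresses", {}).get("records") or [{}]
    education = user.get("Education", {}).get("records") or [{}]
    base = {"Id": user["Id"], "Name": user["Name"]}
    for addr, edu in product(addresses, education):
        row = dict(base)
        row.update({f"Addresses.{k}": v for k, v in addr.items()})
        row.update({f"Education.{k}": v for k, v in edu.items()})
        yield row

rows = [row for user in users_info for row in cross_rows(user)]
for row in rows:
    print(row)
```

With two addresses and two degrees, the first user expands to four rows; the second user, who has no addresses in this made-up data, expands to two. If one row per user and record type is wanted instead, the two record lists need to be kept in separate tables rather than joined together.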
