List of Lists into Numpy Array

Converting a list of lists into a 2D numpy array

If your lists are NOT all the same length at each nesting level, you can't do a straightforward conversion to a NumPy array, because an array with 2 or more dimensions has to be rectangular: every sub-list at a given depth must contain the same number of elements.

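So you can't convert [[1,2],[3,4,5]] to a 2D numpy array directly. Applying np.array gives you, at best, a 2 element numpy array where each element is a list object: array([list([1, 2]), list([3, 4, 5])], dtype=object). Older NumPy versions build this with a deprecation warning; from NumPy 1.24 on you have to pass dtype=object explicitly, otherwise np.array raises a ValueError. I believe this is the issue you are facing.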

You can't create a 2D matrix that looks like this, for example:

[[1,2,3,?],
[4,5,6,7]]

What you may need to do is pad the elements of each list of lists of lists to a fixed length (equal lengths for each dimension) before converting to a NumPy array.
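
As a rough illustration of the padding idea, here is a minimal sketch for the 2-level case. The fill value 0 and the names ragged, pad_value and padded are my own choices for the example, not something from your data; pick a fill value that can't be confused with real entries.

```python
import numpy as np

ragged = [[1, 2], [3, 4, 5]]   # example data, row lengths differ
pad_value = 0                  # assumed fill value; choose what suits your data

# The longest row decides the width of the padded matrix
width = max(len(row) for row in ragged)

# Extend every shorter row with pad_value up to that width
padded = [row + [pad_value] * (width - len(row)) for row in ragged]

arr = np.array(padded)
print(arr.shape)
print(arr)
```

For deeper nesting the same padding would have to be repeated at each level before the conversion.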

I would recommend iterating over each of the lists of lists of lists as done in the code I have written below to flatten your data, then transforming it the way you want.


If your lists are of the same length, then there should not be a problem with numpy version 1.18.5 or above.

import numpy as np

a = [[[1,2],[3,4]],[[5,6],[7,8]]]
np.array(a)
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

However, if you still can't get the list of lists of lists to convert, you may need to iterate over each element to flatten the list first, and then turn it into a numpy array with the required shape, as below:

a = [[[1,2],[3,4]],[[5,6],[7,8]]]
flat_a = [item for sublist in a for subsublist in sublist for item in subsublist]
np.array(flat_a).reshape(2,2,2)
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
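
If you'd rather not hard-code the (2,2,2), the target shape can be read off the nested list itself. This is only a sketch, and it assumes the nesting is regular (every sub-list at the same depth has the same length); the variable name shape is mine.

```python
import numpy as np

a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Read the size of each dimension from the first element at every depth
# (assumes all sub-lists at a given depth have equal length)
shape = (len(a), len(a[0]), len(a[0][0]))

# Flatten by walking all three nesting levels
flat_a = [item for sublist in a for subsublist in sublist for item in subsublist]

arr = np.array(flat_a).reshape(shape)
print(arr.shape)
```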

converting list of lists into 1-D numpy array of lists

In your first case, np.array gives us a warning (in new enough numpy versions). That should tell us something: using np.array to make ragged arrays is not ideal. np.array is meant to create regular multidimensional arrays, with numeric (or string) dtypes. Creating an object dtype array like this is a fallback option.

In [96]: sample_list = [["hello", "world"], ["foo"], ["alpha", "beta", "gamma"], []]
In [97]: arr = np.array(sample_list)
<ipython-input-97-ec7d58f98892>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  arr = np.array(sample_list)
In [98]: arr
Out[98]:
array([list(['hello', 'world']), list(['foo']),
       list(['alpha', 'beta', 'gamma']), list([])], dtype=object)

In many ways such an array is a debased list, not a true array.

In the second case it can work as intended (by the developers, if not you!):

In [99]: sample_list = [["hello"], ["world"], ["foo"], ["bar"]]
In [100]: arr = np.array(sample_list)
In [101]: arr
Out[101]:
array([['hello'],
       ['world'],
       ['foo'],
       ['bar']], dtype='<U5')

To work around that, I recommend making an object dtype array of the right size, and populating it from the list:

In [102]: arr = np.empty(len(sample_list), object)
In [103]: arr
Out[103]: array([None, None, None, None], dtype=object)
In [104]: arr[:] = sample_list
In [105]: arr
Out[105]:
array([list(['hello']), list(['world']), list(['foo']), list(['bar'])],
      dtype=object)
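
If you need this more than once, the same idea can be wrapped in a small helper. This is a sketch; the name to_object_array is made up for the example. It fills the array element by element instead of using arr[:] = ..., which sidesteps any question of how a given NumPy version broadcasts a nested list on assignment.

```python
import numpy as np

def to_object_array(seq):
    """Return a 1-D object array whose elements are the items of seq, unchanged."""
    arr = np.empty(len(seq), dtype=object)
    for i, item in enumerate(seq):
        # Integer index: the list itself is stored as a single element
        arr[i] = item
    return arr

ragged = to_object_array([["hello", "world"], ["foo"], ["alpha", "beta", "gamma"], []])
regular = to_object_array([["hello"], ["world"], ["foo"], ["bar"]])
print(ragged.shape, regular.shape)
```

Both calls give a 1-D array of lists, whether or not the inner lists happen to have equal lengths.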

Converting a List of Lists into a numpy array

To create a list of numpy arrays (one array per sub-list):

import numpy

# arrays is your list of lists
np_arrays = []

for array in arrays:
    np_arrays.append(numpy.array(array))
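
For completeness, here is a self-contained variant written as a list comprehension. The contents of arrays are invented for the example; the sub-lists may have different lengths, since each one becomes its own array.

```python
import numpy

arrays = [[1, 2, 3], [4, 5], [6]]   # made-up sample input

# One independent 1-D array per sub-list
np_arrays = [numpy.array(array) for array in arrays]

print([a.shape for a in np_arrays])
```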

Make a numpy array of sets from a list of lists

This can be done efficiently using the Union-Find algorithm from graph theory (see https://www.geeksforgeeks.org/union-find-algorithm-set-2-union-by-rank/)

We consider each sublist as a vertex in a graph.

Two vertices are connected if their sublists overlap (i.e. intersect).

Union-find provides an efficient method of grouping the vertices into disjoint subsets, so that sublists which overlap, directly or through a chain of other sublists, end up in the same subset.

from collections import defaultdict

# a structure to represent a graph
class Graph:

    def __init__(self, num_of_v):
        self.num_of_v = num_of_v
        self.edges = defaultdict(list)

    # graph is represented as an
    # adjacency list: vertex -> list of neighbours
    def add_edge(self, u, v):
        self.edges[u].append(v)

class Subset:
    def __init__(self, parent, rank):
        self.parent = parent
        self.rank = rank

    def __repr__(self):
        # __repr__ must return a string (returning a dict raises TypeError)
        return str(self)

    def __str__(self):
        return 'Subset(parent='+str(self.parent)+', rank='+str(self.rank)+ ')'

# A utility function to find set of an element
# node(uses path compression technique)
def find(subsets, node):
    if subsets[node].parent != node:
        subsets[node].parent = find(subsets, subsets[node].parent)
    return subsets[node].parent

# A function that does union of two sets
# of u and v(uses union by rank).
# u and v must be set representatives (roots), i.e. results of find()
def union(subsets, u, v):

    # Attach smaller rank tree under root
    # of high rank tree(Union by Rank)
    if subsets[u].rank > subsets[v].rank:
        subsets[v].parent = u
    elif subsets[v].rank > subsets[u].rank:
        subsets[u].parent = v

    # If ranks are same, then make one as
    # root and increment its rank by one
    else:
        subsets[v].parent = u
        subsets[u].rank += 1

def find_disjoint_sets(graph):

    # One singleton set per vertex to start with
    subsets = []

    for u in range(graph.num_of_v):
        subsets.append(Subset(u, 0))

    # Iterate through all edges of graph and
    # find sets of both vertices of every edge.
    # If they are already in the same set there is
    # nothing to do, otherwise merge the two sets.
    for u in graph.edges:
        for v in graph.edges[u]:
            # Look up u's representative again on every pass:
            # it can change after a union
            u_rep = find(subsets, u)
            v_rep = find(subsets, v)

            if u_rep != v_rep:
                union(subsets, u_rep, v_rep)

    return subsets

def generate_groups(lst):
    """ Finds disjoint sublists in lst. Performs a union of sublists that intersect """
    # Generate graph
    g = Graph(len(lst))

    # Loop over all pairs of sublists,
    # Place an edge in the graph for sublists that intersect
    for i1, v1 in enumerate(lst):
        set_v1 = set(v1)
        for i2, v2 in enumerate(lst):
            if i2 > i1 and set_v1.intersection(v2):
                g.add_edge(i1, i2)

    # Disjoint subsets of sublists
    subsets = find_disjoint_sets(g)

    # Union of sublists which are non-disjoint (i.e. have the same parent)
    d = {}
    for i in range(len(lst)):
        sublist_index = find(subsets, i)
        if sublist_index not in d:
            d[sublist_index] = set()

        d[sublist_index] = d[sublist_index].union(lst[i])

    return d

# Test Code
lst = [[2],[5],[5,8,16],[7,9,12],[9,20]]

d = generate_groups(lst)
print(d)

Output

{0: {2}, 1: {8, 16, 5}, 3: {9, 12, 20, 7}}
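
generate_groups returns a dict, so one more step is needed if you really want a numpy array of sets. A minimal sketch, starting from the output printed above (the names groups and sets_arr are mine). The array has to be dtype=object, and it is filled element by element so that each set is stored as a single item.

```python
import numpy as np

# Result of generate_groups(lst) for the test list, copied from the output
groups = {0: {2}, 1: {8, 16, 5}, 3: {9, 12, 20, 7}}

# Object array with one slot per group
sets_arr = np.empty(len(groups), dtype=object)
for i, group in enumerate(groups.values()):
    sets_arr[i] = group

print(sets_arr)
```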

Convert a list of lists to numpy array in python

You can set the dtype to object.

>>> import numpy as np
>>> np.array([[1, 2, 3, (2, 4)], [3, 4, 8, 9], [2, 3, 5, (3, 7)]], dtype=object)
array([[1, 2, 3, (2, 4)],
       [3, 4, 8, 9],
       [2, 3, 5, (3, 7)]], dtype=object)

Note that there's probably not a good reason to create this array in the first place. The main strength of numpy is fast operations on flat sequences of numeric data. With dtype=object you are storing pointers to full-fledged Python objects, just like in a list.
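
To make the difference concrete, here is a small sketch comparing a numeric array with an object one. The sample values and the names numeric and mixed are made up for the example.

```python
import numpy as np

numeric = np.array([[1, 2], [3, 4]])                   # regular integer array
mixed = np.array([[1, (2, 4)], [3, 9]], dtype=object)  # cells hold arbitrary Python objects

print(numeric.dtype.kind)   # 'i' means a native integer dtype
print(mixed.dtype)
print(type(mixed[0, 1]))    # the tuple is stored as-is, not unpacked into numbers
```

The object array still has a 2D shape, but each cell is just a reference to a Python object, so numpy can't use its fast numeric routines on it.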

Here is a good answer explaining the object dtype.

Convert list of lists of lists to 2D np array

The pandas DataFrame constructor is really flexible. You can cast any list to a DataFrame.

import pandas as pd

# lst is the list of lists of lists from the question
df = pd.DataFrame(lst)
df.shape # (4, 5)
df


But as the other comments say, there's not much you could do with this dataframe. One of the main reasons to store data as a df is to use vectorized methods but that's not possible with this.

A more sensible approach is to construct a multi index dataframe where each "column" in lst is its own column.

import numpy as np

# reshape 3D -> 2D + build df
df = pd.DataFrame(np.reshape(lst, (len(lst), -1)))
# convert the columns to a 5x3 multi-index
df.columns = pd.MultiIndex.from_arrays(np.divmod(df.columns, len(lst[0][0])))
df



