Getting list of lists into pandas DataFrame
Call the pd.DataFrame constructor directly:
df = pd.DataFrame(table, columns=headers)
df
Heading1 Heading2
0 1 2
1 3 4
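For a self-contained version of the call above, here is a minimal sketch. The values of table and headers are assumptions chosen to match the printed output; the original snippet does not show how they were defined.

```python
import pandas as pd

# Assumed sample inputs (not shown in the original snippet).
headers = ["Heading1", "Heading2"]
table = [[1, 2], [3, 4]]  # each inner list becomes one row

df = pd.DataFrame(table, columns=headers)
print(df)
```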
How to turn a list of lists into columns of a pandas dataframe?
You can try using df.explode and df.apply:
import pandas as pd
df = pd.DataFrame(data= {'Generation': 0, 'Route_set':[[[20., 19., 47., 56.], [21., 34., 78., 34.]]]})
df['route1']=df['Route_set'].apply(lambda x: x[0])
df['route2']=df['Route_set'].apply(lambda x: x[1])
df = df.explode(['route1', 'route2'], ignore_index=True)
df2 = df[df.columns.difference(['Route_set', 'Generation'])]
| | route1 | route2 |
|---:|---------:|---------:|
| 0 | 20 | 21 |
| 1 | 19 | 34 |
| 2 | 47 | 78 |
| 3 | 56 | 34 |
Or you can just create a new dataframe with the values like this:
import pandas as pd
df = pd.DataFrame(data= {'Generation': 0, 'Route_set':[[[20., 19., 47., 56.], [21., 34., 78., 34.]]]})
df1 = pd.DataFrame.from_dict(dict(zip(['route1', 'route2'], df.Route_set.to_numpy()[0])), orient='index').transpose()
| | route1 | route2 |
|---:|---------:|---------:|
| 0 | 20 | 21 |
| 1 | 19 | 34 |
| 2 | 47 | 78 |
| 3 | 56 | 34 |
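If the nested list is available on its own, a shorter route may be to build the frame row-wise and transpose it. This is only a sketch of an alternative, not the method used above; the variable name route_set is an assumption standing in for the single cell of the Route_set column.

```python
import pandas as pd

# Assumed stand-in for df.Route_set.to_numpy()[0] from the example above.
route_set = [[20., 19., 47., 56.], [21., 34., 78., 34.]]

# One row per route, then transpose so that each route becomes a column.
routes = pd.DataFrame(route_set).T
routes.columns = [f"route{i}" for i in range(1, len(route_set) + 1)]
print(routes)
```

Routes of unequal length would be padded with NaN by the constructor before the transpose.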
Update 1:
import pandas as pd
df = pd.DataFrame(data= {'Generation': 0, 'Route_set':[
[[20.0, 19.0, 47.0, 56.0, 43.0, 53.0, 18.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 51.0, 46.0, 37.0, 2.0, 57.0, 49.0, 36.0, 25.0, 5.0, 4.0, 34.0], [54.0, 23.0, 5.0, 46.0, 34.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 48.0, 46.0, 35.0, 25.0, 27.0, 52.0, 8.0, 39.0, 22.0, 51.0, 28.0], [57.0, 16.0, 45.0, 25.0, 49.0, 38.0, 0.0, 46.0, 13.0, 18.0, 19.0, 20.0], [21.0, 11.0, 6.0, 33.0, 25.0, 49.0, 57.0, 29.0, 12.0, 3.0, -1.0, -1.0], [9.0, 15.0, 47.0, 42.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [51.0, 25.0, 22.0, 14.0, 39.0, 8.0, 40.0, 0.0, 10.0, 26.0, 32.0, 47.0], [1.0, 33.0, 24.0, 46.0, 56.0, 30.0, 48.0, 51.0, -1.0, -1.0, -1.0, -1.0], [25.0, 31.0, 50.0, 17.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 12.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 41.0, 47.0, 15.0, 46.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [14.0, 44.0, 39.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 51.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 49.0, 5.0, 20.0, 37.0, 46.0, 36.0, 25.0, 39.0, 51.0, 48.0, -1.0], [5.0, 0.0, 33.0, 55.0, 25.0, 48.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [51.0, 32.0, 33.0, 24.0, 35.0, 8.0, 25.0, 4.0, 46.0, 1.0, 7.0, -1.0], [5.0, 25.0, 34.0, 46.0, 1.0, 9.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [38.0, 57.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [12.0, 57.0, 49.0, 25.0, 9.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]],
]})
data = df.Route_set.to_numpy()[0]
df = pd.DataFrame.from_dict(dict(zip(['route{}'.format(i) for i in range(1, len(data)+1)], [data[i] for i in range(len(data))])), orient='index').transpose()
df = df.apply(lambda x: x.explode() if 'route' in x.name else x)
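# Order columns by route number (route1, route2, ..., route20) and keep the result.
df = df[sorted(df.columns, key=lambda name: int(name[len('route'):]))]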
print(df.to_markdown())
| | route1 | route2 | route3 | route4 | route5 | route6 | route7 | route8 | route9 | route10 | route11 | route12 | route13 | route14 | route15 | route16 | route17 | route18 | route19 | route20 |
|---:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| 0 | 20 | 20 | 54 | 57 | 57 | 21 | 9 | 51 | 1 | 25 | 57 | 20 | 14 | 20 | 57 | 5 | 51 | 5 | 38 | 12 |
| 1 | 19 | 51 | 23 | 48 | 16 | 11 | 15 | 25 | 33 | 31 | 12 | 41 | 44 | 51 | 49 | 0 | 32 | 25 | 57 | 57 |
| 2 | 47 | 46 | 5 | 46 | 45 | 6 | 47 | 22 | 24 | 50 | -1 | 47 | 39 | 25 | 5 | 33 | 33 | 34 | -1 | 49 |
| 3 | 56 | 37 | 46 | 35 | 25 | 33 | 42 | 14 | 46 | 17 | -1 | 15 | 25 | -1 | 20 | 55 | 24 | 46 | -1 | 25 |
| 4 | 43 | 2 | 34 | 25 | 49 | 25 | 25 | 39 | 56 | -1 | -1 | 46 | -1 | -1 | 37 | 25 | 35 | 1 | -1 | 9 |
| 5 | 53 | 57 | -1 | 27 | 38 | 49 | -1 | 8 | 30 | -1 | -1 | -1 | -1 | -1 | 46 | 48 | 8 | 9 | -1 | -1 |
| 6 | 18 | 49 | -1 | 52 | 0 | 57 | -1 | 40 | 48 | -1 | -1 | -1 | -1 | -1 | 36 | -1 | 25 | -1 | -1 | -1 |
| 7 | -1 | 36 | -1 | 8 | 46 | 29 | -1 | 0 | 51 | -1 | -1 | -1 | -1 | -1 | 25 | -1 | 4 | -1 | -1 | -1 |
| 8 | -1 | 25 | -1 | 39 | 13 | 12 | -1 | 10 | -1 | -1 | -1 | -1 | -1 | -1 | 39 | -1 | 46 | -1 | -1 | -1 |
| 9 | -1 | 5 | -1 | 22 | 18 | 3 | -1 | 26 | -1 | -1 | -1 | -1 | -1 | -1 | 51 | -1 | 1 | -1 | -1 | -1 |
| 10 | -1 | 4 | -1 | 51 | 19 | -1 | -1 | 32 | -1 | -1 | -1 | -1 | -1 | -1 | 48 | -1 | 7 | -1 | -1 | -1 |
| 11 | -1 | 34 | -1 | 28 | 20 | -1 | -1 | 47 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
pandas: convert list of lists to dataframe
The quotes show that each inner list holds a single string, which can be extracted as the first element using my_list[0]. Each string needs to be split, for example with a list comprehension, before the data goes into the dataframe.
There seems to be a typo (a missing field) in the last line of the data, so I corrected it by adding 'null'.
import pandas as pd
data = [['1,er,2,Fado de Padd,1\'18"1,H,6,2600,J. Dekker,17 490 €,A. De Wrede,1,6'],
['2,e,7,Elixir Normand,1\'18"2,H,7,2600,S. Schoonhoven,24 755 €,S. Schoonhoven,14'],
['3,e,3,Give You All of Me,1\'18"2,H,5,2600,JF. Van Dooyeweerd,17 600 €,JF. Van Dooyeweerd,10'],
['4,e,4,Gouritch,1\'18"3,H,5,2600,BJ. Crebas,20 700 €,BJ. Crebas,32'],
['5,e,1,Franky du Cap Vert,1\'18"4,H,6,2600,JH. Mieras,15 536 €,N. De Vreede,65'],
['6,e,10,Défi Magik,1\'18"0,H,8,2620,F. Verkaik,44 865 €,AW. Bosscha,6,3'],
['7,e,9,Fleuron,1\'18"2,H,6,2620,M. Brouwer,44 830 €,D. Brouwer,7,3'],
['8,e,8,Dream Gibus,1\'18"6,H,8,2620,R. Ebbinge,33 330 €,Mme A. Lehmann,36'],
['9,e,5,Beau Gaillard,1\'19"5,H,10,2600,A. Bakker,20 140 €,N. De Vreede,44'],
['0,DAI,6,Bikini de Larcy,null,H,10,2600,D. Den Dubbelden,21 834 €,N. Rip,52']]
df = pd.DataFrame([line[0].split(',') for line in data])
print(df)
Output
0 1 2 3 4 5 6 7 8 \
0 1 er 2 Fado de Padd 1'18"1 H 6 2600 J. Dekker
1 2 e 7 Elixir Normand 1'18"2 H 7 2600 S. Schoonhoven
2 3 e 3 Give You All of Me 1'18"2 H 5 2600 JF. Van Dooyeweerd
3 4 e 4 Gouritch 1'18"3 H 5 2600 BJ. Crebas
4 5 e 1 Franky du Cap Vert 1'18"4 H 6 2600 JH. Mieras
5 6 e 10 Défi Magik 1'18"0 H 8 2620 F. Verkaik
6 7 e 9 Fleuron 1'18"2 H 6 2620 M. Brouwer
7 8 e 8 Dream Gibus 1'18"6 H 8 2620 R. Ebbinge
8 9 e 5 Beau Gaillard 1'19"5 H 10 2600 A. Bakker
9 0 DAI 6 Bikini de Larcy null H 10 2600 D. Den Dubbelden
9 10 11 12
0 17 490 € A. De Wrede 1 6
1 24 755 € S. Schoonhoven 14 None
2 17 600 € JF. Van Dooyeweerd 10 None
3 20 700 € BJ. Crebas 32 None
4 15 536 € N. De Vreede 65 None
5 44 865 € AW. Bosscha 6 3
6 44 830 € D. Brouwer 7 3
7 33 330 € Mme A. Lehmann 36 None
8 20 140 € N. De Vreede 44 None
9 21 834 € N. Rip 52 None
Second method with the same output:
df = pd.DataFrame(data)[0].str.split(',', expand=True)
Third method with similar output:
from io import StringIO
stringdata = StringIO('\n'.join([line[0] for line in data]))
df = pd.read_csv(stringdata, sep=',', header=None)
However, please note that of the three, the first method (list comprehension) is likely the most efficient for data of this size.
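The relative speed of these approaches can depend on the pandas version and the size of the data, so it may be worth measuring on your own input. Below is a rough timing sketch using timeit; the sample rows are made up in the style of the data above, and the function names are hypothetical.

```python
import timeit
import pandas as pd

# Made-up rows in the style of the data above, repeated to get a larger sample.
data = [['1,er,2,Fado de Padd,H,6,2600'],
        ['2,e,7,Elixir Normand,H,7,2600']] * 1000

def with_comprehension():
    # First method: split each string in plain Python.
    return pd.DataFrame([line[0].split(',') for line in data])

def with_str_split():
    # Second method: let pandas split the single string column.
    return pd.DataFrame(data)[0].str.split(',', expand=True)

t_comp = timeit.timeit(with_comprehension, number=5)
t_split = timeit.timeit(with_str_split, number=5)
print(f"list comprehension: {t_comp:.4f}s, str.split: {t_split:.4f}s")
```

The printed numbers will vary between machines, so treat them as a comparison on your setup rather than a general result.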
convert list of lists to pandas data frame
You can flatten the list and drop duplicates from your dataframe.
# import toolboxes
import pandas as pd
from itertools import chain
# get data
data = [[{'timestamp': 1648558320942, 'price': 47876.0},
{'timestamp': 1648558320942, 'price': 47876.0}],
[{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0}],
[{'timestamp': 1648558326768, 'price': 47876.0}]]
# flatten, create df and drop duplicates
a = list(chain.from_iterable(data))
df = pd.DataFrame(a)
df = df.drop_duplicates()
Output:
print(df)
timestamp price
0 1648558320942 47876.0
2 1648558321945 47881.0
5 1648558326768 47876.0
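Note that the index in the output keeps the labels of the surviving rows (0, 2, 5). If a clean 0..n-1 index is preferred, one option is the ignore_index argument of drop_duplicates (available in pandas 1.0 and later). A small sketch, using a shortened copy of the data as an assumption:

```python
import pandas as pd

# Shortened, already flattened sample based on the data above.
rows = [{'timestamp': 1648558320942, 'price': 47876.0},
        {'timestamp': 1648558320942, 'price': 47876.0},
        {'timestamp': 1648558321945, 'price': 47881.0}]

# ignore_index=True relabels the remaining rows 0..n-1.
df = pd.DataFrame(rows).drop_duplicates(ignore_index=True)
print(df)
```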
Pandas DataFrame to List of Lists
You could access the underlying array and call its tolist method:
>>> df = pd.DataFrame([[1,2,3],[3,4,5]])
>>> lol = df.values.tolist()
>>> lol
[[1, 2, 3], [3, 4, 5]]
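If the column names should travel with the data, they can be prepended as a header row. The following is a small sketch; the column names a, b and c are assumptions added for illustration, and to_numpy() is used here as the newer spelling of .values.

```python
import pandas as pd

# Column names are hypothetical; the original example uses default labels.
df = pd.DataFrame([[1, 2, 3], [3, 4, 5]], columns=["a", "b", "c"])

rows = df.to_numpy().tolist()               # data rows only
with_header = [df.columns.tolist()] + rows  # header row first
print(with_header)
```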
How to convert list of lists into a Pandas dataframe in python
If you only need named columns, pass mylist together with the column names:
df = pd.DataFrame(mylist,columns=columns)
print (df)
year score_1 score_2 score_3 score_4 score_5
0 2000 0.5 0.3 0.8 0.9 0.8
1 2001 0.5 0.6 0.8 0.9 0.9
2 2002 0.5 0.3 0.8 0.8 0.8
3 2003 0.9 0.9 0.9 0.9 0.8
But if you need the years as the index, use a dictionary comprehension with DataFrame.from_dict:
df = pd.DataFrame.from_dict({x[0]: x[1:] for x in mylist},columns=columns[1:], orient='index')
print (df)
score_1 score_2 score_3 score_4 score_5
2000 0.5 0.3 0.8 0.9 0.8
2001 0.5 0.6 0.8 0.9 0.9
2002 0.5 0.3 0.8 0.8 0.8
2003 0.9 0.9 0.9 0.9 0.8
And if you also need to set the index name, add DataFrame.rename_axis:
d = {x[0]: x[1:] for x in mylist}
df = pd.DataFrame.from_dict(d,columns=columns[1:], orient='index').rename_axis(columns[0])
print (df)
score_1 score_2 score_3 score_4 score_5
year
2000 0.5 0.3 0.8 0.9 0.8
2001 0.5 0.6 0.8 0.9 0.9
2002 0.5 0.3 0.8 0.8 0.8
2003 0.9 0.9 0.9 0.9 0.8
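The snippets above do not show mylist and columns. The sketch below reconstructs them from the printed output (an assumption, including the years being integers) and shows set_index as another way to get the year-indexed frame; it is an alternative to from_dict, not the approach used above.

```python
import pandas as pd

# Inputs reconstructed from the printed output above (assumed, not original).
columns = ['year', 'score_1', 'score_2', 'score_3', 'score_4', 'score_5']
mylist = [[2000, 0.5, 0.3, 0.8, 0.9, 0.8],
          [2001, 0.5, 0.6, 0.8, 0.9, 0.9],
          [2002, 0.5, 0.3, 0.8, 0.8, 0.8],
          [2003, 0.9, 0.9, 0.9, 0.9, 0.8]]

# Build the frame with all columns, then move 'year' into the index.
df = pd.DataFrame(mylist, columns=columns).set_index('year')
print(df)
```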
How do I extract a list of lists from a Pandas DataFrame?
Try:
print(df.groupby("Person").agg(list)["Movies"].to_list())
Prints:
[['ET'], ['Apollo 13', '12 Angry Men'], ['Citizen Kane']]
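To try this end to end, here is a minimal sketch with an invented input frame. The person names are hypothetical; only the movie titles and the column names Person and Movies come from the snippet above.

```python
import pandas as pd

# Hypothetical input; the original question's DataFrame is not shown.
df = pd.DataFrame({
    "Person": ["Alice", "Bob", "Bob", "Carol"],
    "Movies": ["ET", "Apollo 13", "12 Angry Men", "Citizen Kane"],
})

# Collect each person's movies into a list, then gather the lists.
movies_per_person = df.groupby("Person")["Movies"].agg(list).to_list()
print(movies_per_person)
```

Because groupby sorts the group keys by default, the inner lists follow the sorted order of Person, while the movies within each list keep their original row order.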