Pandas DataFrame to List of Lists
You can access the underlying NumPy array and call its tolist method:
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3],[3,4,5]])
>>> lol = df.values.tolist()
>>> lol
[[1, 2, 3], [3, 4, 5]]
Under Python 2 the same values may print with an L suffix (1L, 2L, ...).
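A minimal sketch of the same idea, assuming the goal is one inner list per row. It uses DataFrame.to_numpy(), which the pandas documentation recommends over .values; the column names x, y, z are made up for illustration.

```python
import pandas as pd

# Column names are hypothetical, chosen only for this example.
df = pd.DataFrame([[1, 2, 3], [3, 4, 5]], columns=["x", "y", "z"])

# One inner list per row; tolist() converts NumPy scalars to plain Python ints.
rows = df.to_numpy().tolist()
print(rows)

# Optionally prepend the column names as a header row.
with_header = [df.columns.tolist()] + rows
print(with_header)
```

Note that the index is not part of the result; call df.reset_index() first if it should be included.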
How to convert a Python Dataframe to List of Lists?
Loop over the columns of your DataFrame, select each column, and convert it to a list. Note that this produces one inner list per column, not per row:
lst = [df[i].tolist() for i in df.columns]
Example:
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})
print(df)
print('list', [df[i].tolist() for i in df.columns])
Output:
   a  b
0  1  5
1  2  6
2  3  7
3  4  8
list [[1, 2, 3, 4], [5, 6, 7, 8]]
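Because the two answers so far return different orientations, a short side-by-side sketch may help. It reuses the same two-column frame and only assumes that "list of lists" can mean either one list per column or one list per row.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]})

# One inner list per column (the comprehension shown above).
by_column = [df[c].tolist() for c in df.columns]

# One inner list per row (the underlying-array approach).
by_row = df.values.tolist()

print(by_column)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(by_row)     # [[1, 5], [2, 6], [3, 7], [4, 8]]

# Transposing the frame first turns the row-wise result into the column-wise one.
by_column_via_transpose = df.T.values.tolist()
```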
pandas: convert list of lists to dataframe
The quotes show that each inner list holds a single string, which can be extracted as its first element with my_list[0]. Each of these strings has to be split, for example in a list comprehension, before the data is put into the DataFrame.
The last line of the data seems to contain a typo (a missing value), so I corrected it by adding 'null'.
import pandas as pd
data = [['1,er,2,Fado de Padd,1\'18"1,H,6,2600,J. Dekker,17 490 €,A. De Wrede,1,6'],
['2,e,7,Elixir Normand,1\'18"2,H,7,2600,S. Schoonhoven,24 755 €,S. Schoonhoven,14'],
['3,e,3,Give You All of Me,1\'18"2,H,5,2600,JF. Van Dooyeweerd,17 600 €,JF. Van Dooyeweerd,10'],
['4,e,4,Gouritch,1\'18"3,H,5,2600,BJ. Crebas,20 700 €,BJ. Crebas,32'],
['5,e,1,Franky du Cap Vert,1\'18"4,H,6,2600,JH. Mieras,15 536 €,N. De Vreede,65'],
['6,e,10,Défi Magik,1\'18"0,H,8,2620,F. Verkaik,44 865 €,AW. Bosscha,6,3'],
['7,e,9,Fleuron,1\'18"2,H,6,2620,M. Brouwer,44 830 €,D. Brouwer,7,3'],
['8,e,8,Dream Gibus,1\'18"6,H,8,2620,R. Ebbinge,33 330 €,Mme A. Lehmann,36'],
['9,e,5,Beau Gaillard,1\'19"5,H,10,2600,A. Bakker,20 140 €,N. De Vreede,44'],
['0,DAI,6,Bikini de Larcy,null,H,10,2600,D. Den Dubbelden,21 834 €,N. Rip,52']]
df = pd.DataFrame([line[0].split(',') for line in data])
print(df)
Output:
   0    1   2                   3       4  5   6     7                   8  \
0  1   er   2        Fado de Padd  1'18"1  H   6  2600           J. Dekker
1  2    e   7      Elixir Normand  1'18"2  H   7  2600      S. Schoonhoven
2  3    e   3  Give You All of Me  1'18"2  H   5  2600  JF. Van Dooyeweerd
3  4    e   4            Gouritch  1'18"3  H   5  2600          BJ. Crebas
4  5    e   1  Franky du Cap Vert  1'18"4  H   6  2600          JH. Mieras
5  6    e  10          Défi Magik  1'18"0  H   8  2620          F. Verkaik
6  7    e   9             Fleuron  1'18"2  H   6  2620          M. Brouwer
7  8    e   8         Dream Gibus  1'18"6  H   8  2620          R. Ebbinge
8  9    e   5       Beau Gaillard  1'19"5  H  10  2600           A. Bakker
9  0  DAI   6     Bikini de Larcy    null  H  10  2600    D. Den Dubbelden

          9                  10  11    12
0  17 490 €         A. De Wrede   1     6
1  24 755 €      S. Schoonhoven  14  None
2  17 600 €  JF. Van Dooyeweerd  10  None
3  20 700 €          BJ. Crebas  32  None
4  15 536 €        N. De Vreede  65  None
5  44 865 €         AW. Bosscha   6     3
6  44 830 €          D. Brouwer   7     3
7  33 330 €      Mme A. Lehmann  36  None
8  20 140 €        N. De Vreede  44  None
9  21 834 €              N. Rip  52  None
Second method with the same output:
df = pd.DataFrame(data)[0].str.split(',', expand=True)
Third method with similar output:
from io import StringIO
stringdata = StringIO('\n'.join([line[0] for line in data]))
df = pd.read_csv(stringdata, sep=',', header=None)
However, please note that the first method (the list comprehension) is likely still the most efficient of the three, since it avoids the overhead of the .str accessor and of CSV parsing.
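The difference between "same" and "similar" output can be checked on a small input. The sketch below uses made-up rows rather than the race data, and it assumes the first row is the longest one, since as far as I know read_csv with header=None takes the column count from the first line. It suggests that the two split-based methods keep every field as a string, while read_csv infers numeric types.

```python
from io import StringIO

import pandas as pd

# Hypothetical data in the same shape as above: one comma-separated string per inner list.
data = [["1,a,10,x"], ["2,b,20"]]

# Method 1: split each string in a list comprehension.
df1 = pd.DataFrame([line[0].split(",") for line in data])

# Method 2: let pandas split the single string column.
df2 = pd.DataFrame(data)[0].str.split(",", expand=True)

# Method 3: parse the joined text as CSV.
csv_text = StringIO("\n".join(line[0] for line in data))
df3 = pd.read_csv(csv_text, sep=",", header=None)

print(type(df1.iloc[0, 0]))  # fields stay strings
print(type(df2.iloc[0, 0]))  # fields stay strings
print(df3.dtypes)            # numeric columns are inferred
```

If the numeric columns are needed as numbers after method 1 or 2, they still have to be converted explicitly, for example with pd.to_numeric.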
How do I extract a list of lists from a Pandas DataFrame?
Try:
print(df.groupby("Person").agg(list)["Movies"].to_list())
Prints:
[['ET'], ['Apollo 13', '12 Angry Men'], ['Citizen Kane']]
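The answer does not show the input frame, so here is a self-contained sketch. The person names are hypothetical; only the column names Person and Movies and the movie titles come from the snippet above. It selects the Movies column before aggregating, which gives the same result as aggregating every column and selecting afterwards.

```python
import pandas as pd

# Person names are made up; the movie titles match the printed result.
df = pd.DataFrame({
    "Person": ["Bob", "Alice", "Bob", "Carol"],
    "Movies": ["Apollo 13", "ET", "12 Angry Men", "Citizen Kane"],
})

# groupby sorts the group keys by default, so the order is Alice, Bob, Carol.
movies_per_person = df.groupby("Person")["Movies"].agg(list).to_list()
print(movies_per_person)

# sort=False keeps the groups in order of first appearance instead.
in_appearance_order = df.groupby("Person", sort=False)["Movies"].agg(list).to_list()
print(in_appearance_order)
```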
Convert pandas df to list of lists with varying length
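The groups attribute of a groupby object is a dict that maps each group key to the index labels of its rows. If you first move the value column into the index, you can read the lists straight from that dict, avoid agg, and speed things up further: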
In [229]: [v.tolist() for v in df.set_index('1').groupby('0').groups.values()]
Out[229]: [[4.3, 3.2, 2.1], [9.1, 2.0], [2.8, 1.7, 0.8, 0.2]]
Timing on 90K rows
df = pd.concat([df] * 10000)
%timeit [v.tolist() for v in df.set_index('1').groupby('0').groups.values()]
15.2 ms ± 425 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df.groupby('0')['1'].agg(list).tolist()
32.8 ms ± 623 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [236]: %%timeit
     ...: d_tuples = [*list(zip(df['0'],df['1']))]
     ...: keys = df['0'].unique()
     ...: list_of_lists = []
     ...: for key in keys:
     ...:     list_of_lists+=[[tup[1] for tup in d_tuples if tup[0] == key]]
     ...:
69.4 ms ± 754 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
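The frame behind these timings is not shown, so the sketch below assumes a nine-row frame with string column names '0' and '1' and made-up group keys a, b, c; the float values are taken from Out[229]. It checks that the groups-based variant and the agg-based variant agree on such input.

```python
import pandas as pd

# Group keys are hypothetical; the values reproduce Out[229].
df = pd.DataFrame({
    "0": ["a", "a", "a", "b", "b", "c", "c", "c", "c"],
    "1": [4.3, 3.2, 2.1, 9.1, 2.0, 2.8, 1.7, 0.8, 0.2],
})

# .groups maps each key to the index labels of its rows. After set_index('1')
# those labels are the values themselves, so no aggregation is needed.
via_groups = [v.tolist() for v in df.set_index("1").groupby("0").groups.values()]

# The more conventional aggregation, for comparison.
via_agg = df.groupby("0")["1"].agg(list).tolist()

print(via_groups)
print(via_agg)
```

The groups-based variant relies on the values being usable as index labels, so it is worth verifying against the agg-based result on your own data before relying on it.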
comparing two list of lists with a dataframe column python
Answer
result = []
for l1, l2 in zip(list1, list2):
    res = df.loc[df["rid"].isin(l1) & df["pid"].isin(l2), "value"].tolist()
    result.append(res)
print(result)
[['chocolate', 'milk'], ['bread']]
Explanation
zip pairs up the elements of the two lists, so the loop is equivalent to
for i in range(len(list1)):
    l1 = list1[i]
    l2 = list2[i]
df["rid"].isin(l1) & df["pid"].isin(l2) combines the two conditions with the element-wise and operator &.
Attention
- The lengths of list1 and list2 must be equal; otherwise zip silently ignores the remaining elements of the longer list.
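Since the question's data is not reproduced here, the following sketch uses a small made-up frame and made-up lists that yield the result shown above. It also adds an explicit length check so that a mismatch raises an error instead of being truncated by zip.

```python
import pandas as pd

# Hypothetical data; only the column names rid, pid, value come from the answer.
df = pd.DataFrame({
    "rid": [1, 2, 3, 4],
    "pid": [10, 20, 30, 40],
    "value": ["chocolate", "milk", "bread", "eggs"],
})
list1 = [[1, 2], [3]]
list2 = [[10, 20], [30]]

# Guard against zip silently dropping elements of the longer list.
if len(list1) != len(list2):
    raise ValueError("list1 and list2 must have the same length")

result = []
for l1, l2 in zip(list1, list2):
    # Keep rows whose rid is in l1 AND whose pid is in l2.
    mask = df["rid"].isin(l1) & df["pid"].isin(l2)
    result.append(df.loc[mask, "value"].tolist())

print(result)  # [['chocolate', 'milk'], ['bread']]
```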
Compare each element in list of lists with a column in a dataframe python
Build a dict:
d = df.set_index('rid').to_dict()['pid']
And use it to build the DataFrame:
pd.DataFrame(((x, [d[el] for el in x]) for x in groups_rids), columns=['groups_rid', 'pid'])
         groups_rid            pid
0        [AX1, AX2]       [P2, P0]
1  [AX6, AX5, AX17]  [P3, P9, P13]
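A runnable sketch of the lookup, assuming a frame with one pid per rid. The frame contents are reconstructed from the output above and are otherwise hypothetical. It uses d.get so that an rid missing from the frame gives None instead of raising KeyError, which is a small departure from the d[el] lookup in the answer.

```python
import pandas as pd

# Hypothetical input consistent with the output shown above.
df = pd.DataFrame({
    "rid": ["AX1", "AX2", "AX5", "AX6", "AX17"],
    "pid": ["P2", "P0", "P9", "P3", "P13"],
})
groups_rids = [["AX1", "AX2"], ["AX6", "AX5", "AX17"]]

# rid -> pid lookup table.
d = df.set_index("rid")["pid"].to_dict()

# One output row per group: the rids and their matching pids.
out = pd.DataFrame(
    [(group, [d.get(el) for el in group]) for group in groups_rids],
    columns=["groups_rid", "pid"],
)
print(out)
```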
Filtering a pandas dataframe based of list of lists
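Lists are not hashable, so convert both sides to tuples before filtering: the lowercased lists in the path column and the lists in slist. If the column holds string representations of lists, they can be parsed with eval (only do this with trusted data):
t = [tuple(x) for x in slist]
df = df[df['path'].apply(lambda x: tuple(eval(str(x).lower()))).isin(t)]
Or, if the column already holds real lists, without eval:
df = df[df['path'].apply(lambda x: tuple([y.lower() for y in x])).isin(t)]
print (df)
    id                                           path
1  102  [Activities (DEV), public, behavior_trackers]
2  103    [Activities (DEV), public, journal_entries]
4  105          [pg-prd (DEV-RR), public, activities]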
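For completeness, here is a self-contained variant on made-up data. It assumes the path column holds real Python lists and that slist is already lowercased; rows 101 and 104 are invented filler. Instead of isin it tests membership in a plain set of tuples, which is a swapped-in technique, not the answer's exact code.

```python
import pandas as pd

# Hypothetical frame; ids 102, 103 and 105 mirror the output above.
df = pd.DataFrame({
    "id": [101, 102, 103, 104, 105],
    "path": [
        ["Activities (DEV)", "public", "users"],
        ["Activities (DEV)", "public", "behavior_trackers"],
        ["Activities (DEV)", "public", "journal_entries"],
        ["pg-prd (DEV-RR)", "public", "users"],
        ["pg-prd (DEV-RR)", "public", "activities"],
    ],
})
slist = [
    ["activities (dev)", "public", "behavior_trackers"],
    ["activities (dev)", "public", "journal_entries"],
    ["pg-prd (dev-rr)", "public", "activities"],
]

# Tuples are hashable, so they can be stored in a set for fast lookup.
wanted = {tuple(x) for x in slist}

# Lowercase each path, turn it into a tuple, and test membership.
mask = df["path"].map(lambda p: tuple(s.lower() for s in p) in wanted)
filtered = df[mask]
print(filtered)
```

The path column itself is left untouched, so the filtered rows keep their original capitalization.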
Create a DataFrame from list in lists (Pandas)
You could fix this with a for loop that unwraps the extra level of nesting:
import pandas as pd
overly_nested = [[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]]
for i, sub_list in enumerate(overly_nested):
    overly_nested[i] = sub_list[0]
df = pd.DataFrame(overly_nested)
print(df)
I'm sure there's a way to do this with zip(); let me experiment and I'll edit if I find it.
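One possible alternative to the in-place loop is a list comprehension that builds a new flat list and leaves the input untouched. The sketch below uses shortened, made-up rows and hypothetical column names, and it assumes every sub-list wraps exactly one row. Under that assumption zip(*...) happens to give the same rows, although the comprehension is probably easier to read.

```python
import pandas as pd

# Shortened hypothetical data with the same extra level of nesting.
overly_nested = [
    [["row A", 559.64, 8.01]],
    [["row B", 520.34, 7.44]],
    [["row C", 556.72, 7.96]],
]

# Take the single inner row out of each wrapper list.
flat = [sub_list[0] for sub_list in overly_nested]

# Column names are made up for illustration.
df = pd.DataFrame(flat, columns=["label", "value", "share"])
print(df)

# With exactly one row per wrapper, zip(*...) yields a single tuple of those rows.
flat_via_zip = list(next(zip(*overly_nested)))
```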