Converting a Data.Frame to a List of Lists

Pandas DataFrame to List of Lists

You could access the underlying array and call its tolist method:

>>> df = pd.DataFrame([[1,2,3],[3,4,5]])
>>> lol = df.values.tolist()
>>> lol
[[1, 2, 3], [3, 4, 5]]
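
As a minimal sketch of the same idea on current pandas versions: the pandas documentation recommends DataFrame.to_numpy() over the .values attribute, and either one gives a row-wise list of lists of plain Python ints. The variable name rows is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4, 5]])

# to_numpy() returns the underlying NumPy array; tolist() converts it
# to nested Python lists, one inner list per row
rows = df.to_numpy().tolist()
print(rows)  # [[1, 2, 3], [3, 4, 5]]

# .values gives the same result here
assert rows == df.values.tolist()
```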

How to convert a pandas DataFrame to a List of Lists?

Loop through the columns of your DataFrame, select each one, and convert it to a list (this gives one inner list per column):

lst = [df[i].tolist() for i in df.columns]

Example:

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})

print(df)
print('list', [df[i].tolist() for i in df.columns])

Output:

   a  b
0  1  5
1  2  6
2  3  7
3  4  8
list [[1, 2, 3, 4], [5, 6, 7, 8]]
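
Note that the comprehension above produces one list per column, while .values.tolist() produces one list per row. The following sketch, using the same example frame, shows the difference; the names by_column and by_row are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})

by_column = [df[c].tolist() for c in df.columns]  # one inner list per column
by_row = df.values.tolist()                       # one inner list per row

print(by_column)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(by_row)     # [[1, 5], [2, 6], [3, 7], [4, 8]]

# Transposing the frame first makes the row-wise call return the column-wise result
assert by_column == df.T.values.tolist()
```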

Converting a data.frame to a list of lists

Using plyr, you can do this:

dlply(df,.(id),c)

To avoid grouping by id if there are multiple (you may need to change the column name; id is unique for me):

dlply(df,1,c)

How to convert dataframe column into list of lists json format?

You can try:

df = df.groupby('dt', as_index=False).agg({'sales': list})
df['sales'] = df['sales'].apply(lambda x: [[e] for e in x])
df.apply(lambda row: pd.Series(row['sales']).to_json(
             f"{out_path}/sales_{row['dt'].replace('-', '_')}.json",
             orient='values', indent=2), axis=1)
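
For a self-contained variant, here is a rough sketch that writes one JSON file per date with the standard json module instead of Series.to_json. The sample values and the temporary directory are assumptions made for illustration; only the column names dt and sales and the file-name pattern come from the snippet above.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question
df = pd.DataFrame({'dt': ['2021-01-01', '2021-01-01', '2021-01-02'],
                   'sales': [10, 20, 30]})

out_path = Path(tempfile.mkdtemp())  # stand-in for the real output directory

written = {}
for dt, sales in df.groupby('dt')['sales']:
    # Wrap every value in its own list: [10, 20] -> [[10], [20]]
    nested = [[v] for v in sales.tolist()]
    target = out_path / f"sales_{dt.replace('-', '_')}.json"
    target.write_text(json.dumps(nested, indent=2))
    written[target.name] = nested

print(written)
```

An explicit loop is used here because the work is a side effect (writing files), which tends to read more clearly than DataFrame.apply.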

Pandas dataframe: converting column of lists to a list

The idea is to first remove missing values with Series.dropna, then convert each list repr to a real list with ast.literal_eval, and finally flatten the nested lists in a list comprehension:

import ast
import numpy as np
import pandas as pd

df = pd.DataFrame({'hashtags': [np.nan, np.nan,
                                "['COVID19']", "['COVID19']",
                                "['CoronaVirusUpdates', 'COVID19']"]})

out = [y for x in df['hashtags'].dropna() for y in ast.literal_eval(x)]
print(out)
['COVID19', 'COVID19', 'CoronaVirusUpdates', 'COVID19']
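
An alternative sketch that stays inside pandas: parse each string with ast.literal_eval and let Series.explode (available since pandas 0.25) do the flattening. This is a variation on the answer above, not a replacement for it.

```python
import ast

import numpy as np
import pandas as pd

df = pd.DataFrame({'hashtags': [np.nan, np.nan,
                                "['COVID19']", "['COVID19']",
                                "['CoronaVirusUpdates', 'COVID19']"]})

# dropna removes the missing rows, literal_eval turns each string into a list,
# explode puts every list element on its own row
out = df['hashtags'].dropna().apply(ast.literal_eval).explode().tolist()
print(out)  # ['COVID19', 'COVID19', 'CoronaVirusUpdates', 'COVID19']
```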

Convert a Pandas Dataframe in a list of lists

Use DataFrame.groupby on both columns and apply a lambda function to get a Series of nested lists:

predictors = ['col_1','col_2','col_3','col_4']
s = (df.groupby(['person_id', 'label'], sort=False)[predictors]
       .apply(lambda x: x.values.tolist()))
print(s)
person_id  label
1          1                      [[4, 5, 7, 8], [1, 3, 4, 6]]
           0                      [[1, 2, 6, 5], [1, 3, 3, 6]]
2          1        [[3, 5, 1, 3], [3, 2, 6, 8], [3, 1, 0, 4]]
dtype: object

And then convert Series to lists:

bags = s.tolist()
print(bags)
[[[4, 5, 7, 8], [1, 3, 4, 6]],
 [[1, 2, 6, 5], [1, 3, 3, 6]],
 [[3, 5, 1, 3], [3, 2, 6, 8], [3, 1, 0, 4]]]

The labels come from the second level of the MultiIndex, via Index.get_level_values:

labels = s.index.get_level_values(1).tolist()
print(labels)
[1, 0, 1]
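
The input frame is not shown in the answer, so the sketch below reconstructs one from the printed output (an assumption) and builds bags and labels with an explicit loop over the groups. It is meant as an equivalent, more step-by-step alternative to the apply-based version.

```python
import pandas as pd

# Sample data reconstructed from the printed output above (assumed, not from the question)
df = pd.DataFrame({
    'person_id': [1, 1, 1, 1, 2, 2, 2],
    'label':     [1, 1, 0, 0, 1, 1, 1],
    'col_1':     [4, 1, 1, 1, 3, 3, 3],
    'col_2':     [5, 3, 2, 3, 5, 2, 1],
    'col_3':     [7, 4, 6, 3, 1, 6, 0],
    'col_4':     [8, 6, 5, 6, 3, 8, 4],
})
predictors = ['col_1', 'col_2', 'col_3', 'col_4']

bags, labels = [], []
# sort=False keeps the groups in order of first appearance
for (person_id, label), group in df.groupby(['person_id', 'label'], sort=False):
    bags.append(group[predictors].values.tolist())  # rows of this group as lists
    labels.append(int(label))                       # plain Python int

print(bags)
print(labels)  # [1, 0, 1]
```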

convert list of lists to pandas data frame

You can flatten the list and drop duplicates from your dataframe.

# import toolboxes
import pandas as pd
from itertools import chain

# get data
data = [[{'timestamp': 1648558320942, 'price': 47876.0},
{'timestamp': 1648558320942, 'price': 47876.0}],
[{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0}],
[{'timestamp': 1648558326768, 'price': 47876.0}]]

# flatten, create df and drop duplicates
a = list(chain.from_iterable(data))
df = pd.DataFrame(a)
df = df.drop_duplicates()

Output:

print(df)
       timestamp    price
0  1648558320942  47876.0
2  1648558321945  47881.0
5  1648558326768  47876.0
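
The index in the output keeps the original row positions (0, 2, 5). If a clean 0..n-1 index is preferred, one option is the ignore_index parameter of drop_duplicates (pandas 1.0 or later). The shortened sample below is hypothetical.

```python
from itertools import chain

import pandas as pd

# Hypothetical, shortened sample in the same shape as the data above
data = [[{'timestamp': 1, 'price': 10.0}, {'timestamp': 1, 'price': 10.0}],
        [{'timestamp': 2, 'price': 11.0}]]

# Flatten the nested lists, build the frame, drop duplicates and renumber the index
df = pd.DataFrame(list(chain.from_iterable(data)))
df = df.drop_duplicates(ignore_index=True)
print(df.index.tolist())  # [0, 1]
```

On older pandas versions, df.drop_duplicates().reset_index(drop=True) has the same effect.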

pandas: convert list of lists to dataframe

The quotes mean that each inner list holds a single string, which can be extracted as the first element using my_list[0]. Each list therefore needs to be split in a list comprehension before it goes into the DataFrame.

There seems to be a typo (a missing value) in the last line of the data, so I corrected it by adding 'null'.

import pandas as pd

data = [['1,er,2,Fado de Padd,1\'18"1,H,6,2600,J. Dekker,17 490 €,A. De Wrede,1,6'],
['2,e,7,Elixir Normand,1\'18"2,H,7,2600,S. Schoonhoven,24 755 €,S. Schoonhoven,14'],
['3,e,3,Give You All of Me,1\'18"2,H,5,2600,JF. Van Dooyeweerd,17 600 €,JF. Van Dooyeweerd,10'],
['4,e,4,Gouritch,1\'18"3,H,5,2600,BJ. Crebas,20 700 €,BJ. Crebas,32'],
['5,e,1,Franky du Cap Vert,1\'18"4,H,6,2600,JH. Mieras,15 536 €,N. De Vreede,65'],
['6,e,10,Défi Magik,1\'18"0,H,8,2620,F. Verkaik,44 865 €,AW. Bosscha,6,3'],
['7,e,9,Fleuron,1\'18"2,H,6,2620,M. Brouwer,44 830 €,D. Brouwer,7,3'],
['8,e,8,Dream Gibus,1\'18"6,H,8,2620,R. Ebbinge,33 330 €,Mme A. Lehmann,36'],
['9,e,5,Beau Gaillard,1\'19"5,H,10,2600,A. Bakker,20 140 €,N. De Vreede,44'],
['0,DAI,6,Bikini de Larcy,null,H,10,2600,D. Den Dubbelden,21 834 €,N. Rip,52']]

df = pd.DataFrame([line[0].split(',') for line in data])
print(df)

Output

   0    1   2                   3       4  5   6     7                   8  \
0  1   er   2        Fado de Padd  1'18"1  H   6  2600           J. Dekker
1  2    e   7      Elixir Normand  1'18"2  H   7  2600      S. Schoonhoven
2  3    e   3  Give You All of Me  1'18"2  H   5  2600  JF. Van Dooyeweerd
3  4    e   4            Gouritch  1'18"3  H   5  2600          BJ. Crebas
4  5    e   1  Franky du Cap Vert  1'18"4  H   6  2600          JH. Mieras
5  6    e  10          Défi Magik  1'18"0  H   8  2620          F. Verkaik
6  7    e   9             Fleuron  1'18"2  H   6  2620          M. Brouwer
7  8    e   8         Dream Gibus  1'18"6  H   8  2620          R. Ebbinge
8  9    e   5       Beau Gaillard  1'19"5  H  10  2600           A. Bakker
9  0  DAI   6     Bikini de Larcy    null  H  10  2600    D. Den Dubbelden

          9                  10  11    12
0  17 490 €         A. De Wrede   1     6
1  24 755 €      S. Schoonhoven  14  None
2  17 600 €  JF. Van Dooyeweerd  10  None
3  20 700 €          BJ. Crebas  32  None
4  15 536 €        N. De Vreede  65  None
5  44 865 €         AW. Bosscha   6     3
6  44 830 €          D. Brouwer   7     3
7  33 330 €      Mme A. Lehmann  36  None
8  20 140 €        N. De Vreede  44  None
9  21 834 €              N. Rip  52  None

Second method with the same output:

df = pd.DataFrame(data)[0].str.split(',', expand=True)

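Third method with similar output (read_csv infers column dtypes, so numeric columns are parsed as numbers and missing trailing fields become NaN rather than None):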

from io import StringIO

stringdata = StringIO('\n'.join([line[0] for line in data]))
df = pd.read_csv(stringdata, sep=',', header=None)

Note, however, that the first method (list comprehension) is likely the most efficient, since it avoids the overhead of the .str accessor and of the CSV parser.
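
As a quick sanity check, the sketch below runs the first two methods on a hypothetical two-row sample (not the original data) and compares the results cell by cell. Missing trailing fields are replaced with empty strings before comparing.

```python
import pandas as pd

# Hypothetical two-row sample in the same shape as the data above;
# the second row has one field fewer than the first
data = [['1,er,2,Fado de Padd,6'],
        ['2,e,7,Elixir Normand']]

df1 = pd.DataFrame([line[0].split(',') for line in data])   # list comprehension
df2 = pd.DataFrame(data)[0].str.split(',', expand=True)     # .str.split

# Compare cell by cell, treating missing trailing fields as empty strings
same = df1.fillna('').values.tolist() == df2.fillna('').values.tolist()
print(df1.shape, df2.shape, same)
```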

Convert a dataframe to a list of lists based on common features

We could use split.default to split the columns based on the names of the data frame, and then use as.list to turn each group into a list, which gives a list of lists.

lapply(split.default(df1, sub("(TP\\d+).*", "\\1", names(df1))), as.list)

#$TP1
#$TP1$TP1.expression
#[1] 3 8 2

#$TP1$TP1.pval
#[1] 0.04 0.03 0.01

#$TP1$TP1.log2fc
#[1] 1.0 0.3 2.1

#$TP2
#$TP2$TP2.expression
#[1] 2.0 4.0 2.1

#$TP2$TP2.pval
#[1] 0.024 0.020 0.010

#$TP2$TP2.log2fc
#[1] -1.0 0.1 3.1

