Pandas DataFrame to List of Lists
You can access the underlying NumPy array and call its tolist method, which returns one inner list per row:
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3],[3,4,5]])
>>> lol = df.values.tolist()
>>> lol
[[1, 2, 3], [3, 4, 5]]
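As a minimal sketch, assuming Python 3 and a reasonably recent pandas, the same row-wise conversion can also be written with DataFrame.to_numpy, which the pandas documentation recommends over .values. The column names and the header row are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Column names "x", "y", "z" are hypothetical, added only for illustration.
df = pd.DataFrame([[1, 2, 3], [3, 4, 5]], columns=["x", "y", "z"])

# One inner list per row; NumPy scalars are converted to plain Python ints.
rows = df.to_numpy().tolist()

# Optionally prepend the column names as a header row.
with_header = [df.columns.tolist()] + rows

print(rows)
print(with_header)
```

Note that a DataFrame with mixed column types is converted to a single common dtype first, so numeric values may come back as floats or generic objects.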
How to convert a Python Dataframe to List of Lists?
Loop through the columns of your DataFrame, select each column, and convert it to a list. Note that this produces one inner list per column, not one per row:
lst = [df[i].tolist() for i in df.columns]
Example:
df = pd.DataFrame({'a' : [1, 2, 3, 4],
                   'b' : [5, 6, 7, 8]})
print(df)
print('list', [df[i].tolist() for i in df.columns])
Output:
   a  b
0  1  5
1  2  6
2  3  7
3  4  8
list [[1, 2, 3, 4], [5, 6, 7, 8]]
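To make the difference between the two orientations concrete, here is a small sketch that builds both the column-wise and the row-wise list of lists from the same frame. The variable names by_column and by_row are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})

# Column-wise: one inner list per column (the list comprehension shown above).
by_column = [df[col].tolist() for col in df.columns]

# Row-wise: one inner list per row.
by_row = df.values.tolist()

# Transposing the row-wise result with zip recovers the column-wise one.
transposed = [list(col) for col in zip(*by_row)]

print(by_column)
print(by_row)
print(transposed == by_column)
```

Which orientation you want depends on the consumer: row-wise lists suit record-style processing, while column-wise lists suit per-feature processing.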
Converting a data.frame to a list of lists
Using plyr, you can do this:
dlply(df,.(id),c)
To avoid grouping by id if it has repeated values (you may need to change the column name; in my data id is unique):
dlply(df,1,c)
How to convert dataframe column into list of lists json format?
You can try grouping by dt, wrapping each sales value in its own list, and writing one JSON file per date:
df = df.groupby('dt',as_index=False).agg({'sales':list})
df['sales'] = df['sales'].apply(lambda x: [[e] for e in x])
df.apply(lambda row: pd.Series(row['sales']).to_json(
    f"{out_path}/sales_{row['dt'].replace('-','_')}.json",
    orient='values',indent=2), axis=1)
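The same steps can be sketched with a plain loop and the standard json module, which some readers may find easier to follow than apply with side effects. This is a swapped-in variant, not the original answer's method: the sample data is hypothetical, and out_path points at a temporary directory as a stand-in for the real output folder.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the real values are not shown in the source.
df = pd.DataFrame({
    "dt": ["2021-01-01", "2021-01-01", "2021-01-02"],
    "sales": [10, 20, 30],
})

# Stand-in for the real output directory.
out_path = Path(tempfile.mkdtemp())

written = {}
for dt, sales in df.groupby("dt")["sales"]:
    # Wrap every value in its own list: [10, 20] -> [[10], [20]]
    rows = [[value] for value in sales.tolist()]
    target = out_path / f"sales_{dt.replace('-', '_')}.json"
    target.write_text(json.dumps(rows, indent=2))
    written[target.name] = rows

print(written)
```

Series.tolist returns plain Python numbers, so json.dumps can serialize them without a custom encoder.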
Pandas dataframe: converting column of lists to a list
The idea is to first remove missing values with Series.dropna, then convert each list repr to a real list with ast.literal_eval, and finally flatten the nested lists in a list comprehension:
import ast
import numpy as np
import pandas as pd
df = pd.DataFrame({'hashtags':[np.nan, np.nan,
                               "['COVID19']", "['COVID19']",
                               "['CoronaVirusUpdates', 'COVID19']"]})
out = [y for x in df['hashtags'].dropna() for y in ast.literal_eval(x)]
print (out)
['COVID19', 'COVID19', 'CoronaVirusUpdates', 'COVID19']
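An alternative sketch that stays inside pandas uses Series.explode (available since pandas 0.25) instead of the nested list comprehension. It is offered as a variant under the same assumption that every non-missing cell holds the string repr of a list.

```python
import ast

import numpy as np
import pandas as pd

df = pd.DataFrame({'hashtags': [np.nan, np.nan,
                                "['COVID19']", "['COVID19']",
                                "['CoronaVirusUpdates', 'COVID19']"]})

# Drop missing cells, parse each string into a list,
# then give every list element its own row and collect the result.
out = (df['hashtags']
       .dropna()
       .apply(ast.literal_eval)
       .explode()
       .tolist())

print(out)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for strings that come from external data.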
Convert a Pandas Dataframe in a list of lists
Use DataFrame.groupby on both columns with a lambda function to build a Series of nested lists:
predictors = ['col_1','col_2','col_3','col_4']
s = (df.groupby(['person_id','label'], sort=False)[predictors]
       .apply(lambda x: x.values.tolist()))
print (s)
person_id  label
1          1                      [[4, 5, 7, 8], [1, 3, 4, 6]]
           0                      [[1, 2, 6, 5], [1, 3, 3, 6]]
2          1        [[3, 5, 1, 3], [3, 2, 6, 8], [3, 1, 0, 4]]
dtype: object
Then convert the Series to a list:
bags = s.tolist()
print (bags)
[[[4, 5, 7, 8], [1, 3, 4, 6]],
[[1, 2, 6, 5], [1, 3, 3, 6]],
[[3, 5, 1, 3], [3, 2, 6, 8], [3, 1, 0, 4]]]
The second level of the MultiIndex can be extracted with Index.get_level_values as well:
labels = s.index.get_level_values(1).tolist()
print (labels)
[1, 0, 1]
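For readers who want to run the steps end to end, here is a self-contained sketch. The input frame is hypothetical, reconstructed from the printed result above, and the dict comprehension is a swapped-in alternative to groupby(...).apply that produces the same bags and labels.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed result above.
df = pd.DataFrame({
    'person_id': [1, 1, 1, 1, 2, 2, 2],
    'label':     [1, 1, 0, 0, 1, 1, 1],
    'col_1':     [4, 1, 1, 1, 3, 3, 3],
    'col_2':     [5, 3, 2, 3, 5, 2, 1],
    'col_3':     [7, 4, 6, 3, 1, 6, 0],
    'col_4':     [8, 6, 5, 6, 3, 8, 4],
})
predictors = ['col_1', 'col_2', 'col_3', 'col_4']

# Map each (person_id, label) pair to the rows of its group;
# sort=False keeps the groups in order of first appearance.
groups = {key: g[predictors].values.tolist()
          for key, g in df.groupby(['person_id', 'label'], sort=False)}

bags = list(groups.values())
labels = [int(label) for _, label in groups]

print(bags)
print(labels)
```

Keeping the groups in a dict also makes it easy to look up a single bag by its (person_id, label) key.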
convert list of lists to pandas data frame
You can flatten the list and drop duplicates from your dataframe.
# import toolboxes
import pandas as pd
from itertools import chain
# get data
data = [[{'timestamp': 1648558320942, 'price': 47876.0},
{'timestamp': 1648558320942, 'price': 47876.0}],
[{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0},
{'timestamp': 1648558321945, 'price': 47881.0}],
[{'timestamp': 1648558326768, 'price': 47876.0}]]
# flatten, create df and drop duplicates
a = list(chain.from_iterable(data))
df = pd.DataFrame(a)
df = df.drop_duplicates()
Output:
print(df)
       timestamp    price
0  1648558320942  47876.0
2  1648558321945  47881.0
5  1648558326768  47876.0
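The output keeps the original row positions (0, 2, 5) as index labels after the duplicates are dropped. If a consecutive index is preferred, a small sketch of the same pipeline with reset_index(drop=True) appended looks like this; the data is a shortened copy of the example, used only for illustration.

```python
from itertools import chain

import pandas as pd

# Shortened copy of the example data.
data = [[{'timestamp': 1648558320942, 'price': 47876.0},
         {'timestamp': 1648558320942, 'price': 47876.0}],
        [{'timestamp': 1648558321945, 'price': 47881.0}],
        [{'timestamp': 1648558326768, 'price': 47876.0}]]

# Flatten one level, build the frame, drop duplicate rows,
# then renumber the index from 0 without keeping the old labels.
df = (pd.DataFrame(list(chain.from_iterable(data)))
        .drop_duplicates()
        .reset_index(drop=True))

print(df)
```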
pandas: convert list of lists to dataframe
The apostrophes mean that each inner list holds a single string, which can be extracted as its first element using my_list[0]. Each list therefore needs to be processed in a list comprehension before it is put into the DataFrame.
There seems to be a typo (a missing value) in the last line of the data, so I corrected it by adding 'null'.
import pandas as pd
data = [['1,er,2,Fado de Padd,1\'18"1,H,6,2600,J. Dekker,17 490 €,A. De Wrede,1,6'],
['2,e,7,Elixir Normand,1\'18"2,H,7,2600,S. Schoonhoven,24 755 €,S. Schoonhoven,14'],
['3,e,3,Give You All of Me,1\'18"2,H,5,2600,JF. Van Dooyeweerd,17 600 €,JF. Van Dooyeweerd,10'],
['4,e,4,Gouritch,1\'18"3,H,5,2600,BJ. Crebas,20 700 €,BJ. Crebas,32'],
['5,e,1,Franky du Cap Vert,1\'18"4,H,6,2600,JH. Mieras,15 536 €,N. De Vreede,65'],
['6,e,10,Défi Magik,1\'18"0,H,8,2620,F. Verkaik,44 865 €,AW. Bosscha,6,3'],
['7,e,9,Fleuron,1\'18"2,H,6,2620,M. Brouwer,44 830 €,D. Brouwer,7,3'],
['8,e,8,Dream Gibus,1\'18"6,H,8,2620,R. Ebbinge,33 330 €,Mme A. Lehmann,36'],
['9,e,5,Beau Gaillard,1\'19"5,H,10,2600,A. Bakker,20 140 €,N. De Vreede,44'],
['0,DAI,6,Bikini de Larcy,null,H,10,2600,D. Den Dubbelden,21 834 €,N. Rip,52']]
df = pd.DataFrame([line[0].split(',') for line in data])
print(df)
Output:
   0    1   2                   3       4  5   6     7                   8  \
0  1   er   2        Fado de Padd  1'18"1  H   6  2600           J. Dekker
1  2    e   7      Elixir Normand  1'18"2  H   7  2600      S. Schoonhoven
2  3    e   3  Give You All of Me  1'18"2  H   5  2600  JF. Van Dooyeweerd
3  4    e   4            Gouritch  1'18"3  H   5  2600          BJ. Crebas
4  5    e   1  Franky du Cap Vert  1'18"4  H   6  2600          JH. Mieras
5  6    e  10          Défi Magik  1'18"0  H   8  2620          F. Verkaik
6  7    e   9             Fleuron  1'18"2  H   6  2620          M. Brouwer
7  8    e   8         Dream Gibus  1'18"6  H   8  2620          R. Ebbinge
8  9    e   5       Beau Gaillard  1'19"5  H  10  2600           A. Bakker
9  0  DAI   6     Bikini de Larcy    null  H  10  2600    D. Den Dubbelden

          9                  10  11    12
0  17 490 €         A. De Wrede   1     6
1  24 755 €      S. Schoonhoven  14  None
2  17 600 €  JF. Van Dooyeweerd  10  None
3  20 700 €          BJ. Crebas  32  None
4  15 536 €        N. De Vreede  65  None
5  44 865 €         AW. Bosscha   6     3
6  44 830 €          D. Brouwer   7     3
7  33 330 €      Mme A. Lehmann  36  None
8  20 140 €        N. De Vreede  44  None
9  21 834 €              N. Rip  52  None
Second method with the same output:
df = pd.DataFrame(data)[0].str.split(',', expand=True)
Third method with similar output:
from io import StringIO
stringdata = StringIO('\n'.join([line[0] for line in data]))
df = pd.read_csv(stringdata, sep=',', header=None)
However, please note that the first method (list comprehension) is still the most efficient!
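As the output above shows, rows with fewer fields are padded with missing values in the trailing columns. The following sketch illustrates that behavior for the first two methods on two hypothetical, shortened rows (not taken from the original data).

```python
import pandas as pd

# Hypothetical, shortened rows: the second one has one field fewer.
data = [['1,er,Fado de Padd,6'],
        ['2,e,Elixir Normand']]

# Method 1: split each string in a list comprehension.
df1 = pd.DataFrame([line[0].split(',') for line in data])

# Method 2: let pandas split the single string column.
df2 = pd.DataFrame(data)[0].str.split(',', expand=True)

# Both frames have 4 columns; the short row is padded with a missing value.
print(df1.shape, df2.shape)
print(df1.isna().sum().sum(), df2.isna().sum().sum())
```

Both methods keep every field as a string, so numeric columns still need an explicit conversion afterwards, for example with pd.to_numeric.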
Convert a dataframe to a list of lists based on common features
We could use split.default to split the columns by the TP prefix in their names, and then use as.list to turn each group into a list, which gives a list of lists.
lapply(split.default(df1, sub("(TP\\d+).*", "\\1", names(df1))), as.list)
#$TP1
#$TP1$TP1.expression
#[1] 3 8 2
#$TP1$TP1.pval
#[1] 0.04 0.03 0.01
#$TP1$TP1.log2fc
#[1] 1.0 0.3 2.1
#$TP2
#$TP2$TP2.expression
#[1] 2.0 4.0 2.1
#$TP2$TP2.pval
#[1] 0.024 0.020 0.010
#$TP2$TP2.log2fc
#[1] -1.0 0.1 3.1
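For comparison, a rough pandas analogue of the R approach can be sketched as follows. This is not the answer's method but a Python counterpart under stated assumptions: the frame df1 is rebuilt from the values printed above, and the nested dict named grouped stands in for R's list of lists.

```python
import re

import pandas as pd

# Values copied from the printed R output above.
df1 = pd.DataFrame({
    "TP1.expression": [3, 8, 2],
    "TP1.pval": [0.04, 0.03, 0.01],
    "TP1.log2fc": [1.0, 0.3, 2.1],
    "TP2.expression": [2.0, 4.0, 2.1],
    "TP2.pval": [0.024, 0.020, 0.010],
    "TP2.log2fc": [-1.0, 0.1, 3.1],
})

# Group columns by their "TP<number>" prefix, mirroring the sub() call in R.
grouped = {}
for name in df1.columns:
    prefix = re.match(r"TP\d+", name).group(0)
    grouped.setdefault(prefix, {})[name] = df1[name].tolist()

print(grouped["TP1"])
```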