Take Multiple Lists into Dataframe

Take multiple lists into dataframe

I think you're almost there. Try removing the extra square brackets around the lists. Also, you don't need to specify the column names when you create a DataFrame from a dict like this, because the dict keys become the column names:

import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
    {'lst1Title': lst1,
     'lst2Title': lst2,
     'lst3Title': lst3
    })

percentile_list
   lst1Title  lst2Title  lst3Title
0          0          0          0
1          1          1          1
2          2          2          2
3          3          3          3
4          4          4          4
5          5          5          5
6          6          6          6
...
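As a small self-contained check of the dict approach, the sketch below uses shorter lists (lst2 here holds made-up values, not data from the question). It also shows that pandas raises a ValueError when the lists have different lengths, so this approach assumes equal-length lists:

```python
import pandas as pd

lst1 = list(range(5))
lst2 = [x * 10 for x in lst1]  # illustrative values only

# Dict keys become the column names, dict values become the columns
df = pd.DataFrame({"lst1Title": lst1, "lst2Title": lst2})
print(df)

# Lists of different lengths are rejected by the constructor
try:
    pd.DataFrame({"lst1Title": lst1, "short": [1, 2]})
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)
```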

If you need a more performant solution, you can use np.column_stack rather than zip as in your first attempt. This gave around a 2x speedup on the example here, but in my opinion it comes at a bit of a cost in readability:

import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]),
                               columns=['lst1Title', 'lst2Title', 'lst3Title'])
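One thing to keep in mind with np.column_stack is that NumPy puts everything into a single array with one common dtype. The sketch below, using made-up lists nums and names, suggests that mixing numbers and strings turns the numbers into strings, whereas the dict approach keeps a separate dtype per column:

```python
import numpy as np
import pandas as pd

nums = [1, 2, 3]          # illustrative data
names = ["a", "b", "c"]   # illustrative data

# column_stack builds one 2-D array, so all values share one dtype
stacked = pd.DataFrame(np.column_stack([nums, names]), columns=["num", "name"])

# A dict of lists keeps each column's own dtype
from_dict = pd.DataFrame({"num": nums, "name": names})

print(stacked["num"].tolist())
print(from_dict["num"].tolist())
```

If all lists are numeric, as in the example above, this is not an issue.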

Pandas, Python - Assembling a Data Frame with multiple lists from loop

The problem is that you overwrite the number variable on every pass through the loop, so its previous value is no longer available once the iteration ends. The solution below adds a column Index{j} to the dataframe in each iteration.

import json
import pandas as pd
from glom import glom  # glom is a package for accessing nested data

# Create an empty dataframe
df = pd.DataFrame()

# Select data into a list
i = 1
target = f'{pathway}\\calls_{i}.json'  # pathway is defined elsewhere, as in the question
with open(target, 'r') as f:  # read the JSON file
    data = json.load(f)

specsA = ('PreviousDraws', ['DrawNumber'])
draw = glom(data, specsA)  # list
print(draw)

# Insert the draw numbers into the dataframe
df['DrawNumbers'] = draw

for j in range(0, 5):
    specsB = ('PreviousDraws', ['WinningNumbers'], [f'{j}'], ['Number'])
    number = glom(data, specsB)  # list
    print(number)
    # Insert each list of numbers as its own column
    df[f'Index{j}'] = number
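The same idea can be tried without the JSON file or glom. The sketch below is only an illustration: the nested data dict is a hypothetical stand-in for the parsed file, and its exact shape is an assumption. It collects each column in a plain dict inside the loop and builds the dataframe once at the end:

```python
import pandas as pd

# Hypothetical stand-in for json.load(f); the real file layout may differ
data = {
    "PreviousDraws": [
        {"DrawNumber": 101,
         "WinningNumbers": [{"Number": 3}, {"Number": 14}, {"Number": 27}]},
        {"DrawNumber": 102,
         "WinningNumbers": [{"Number": 8}, {"Number": 19}, {"Number": 33}]},
    ]
}

draws = data["PreviousDraws"]

# One entry per column; nothing is overwritten between iterations
columns = {"DrawNumbers": [d["DrawNumber"] for d in draws]}
for j in range(3):
    columns[f"Index{j}"] = [d["WinningNumbers"][j]["Number"] for d in draws]

df = pd.DataFrame(columns)
print(df)
```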

create a dataframe from multiple list

You can create a pandas DataFrame from multiple lists using the code below:

import pandas as pd

a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
c = [7, 8, 9, 10]

df = pd.DataFrame(list(zip(a, b, c)),
                  columns=['a', 'b', 'c'])

print(df)
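Note that zip stops at the shortest input, so this approach quietly drops values if the lists differ in length. A minimal sketch with a deliberately shortened list b (made-up data):

```python
import pandas as pd

a = [1, 2, 3, 4]
b = [5, 6, 7]  # one element shorter on purpose

# zip pairs items up only while every list still has one left
rows = list(zip(a, b))
df = pd.DataFrame(rows, columns=["a", "b"])
print(df)
```

Here the trailing 4 from a does not appear in the result.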

Hope this helps!

Python, Take Multiple Lists and Putting into pd.Dataframe

Given

list1 = ['Rank', 'Athlete', 'Distance', 'Runs', 'Longest', 'Avg. Pace', 'Elev. Gain']
list2 = (['1', 'Jack', '57.4 km', '4', '21.7 km', '5:57 /km', '994 m'],
['2', 'Jill', '34.0 km', '2', '17.9 km', '5:27 /km', '152 m'],
['3', 'Kelsey', '32.6 km', '2', '21.3 km', '5:46 /km', '141 m'])

do

pd.DataFrame(list2, columns=list1)

which returns

  Rank Athlete Distance Runs  Longest Avg. Pace Elev. Gain
0    1    Jack  57.4 km    4  21.7 km  5:57 /km      994 m
1    2    Jill  34.0 km    2  17.9 km  5:27 /km      152 m
2    3  Kelsey  32.6 km    2  21.3 km  5:46 /km      141 m
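All values in this dataframe are strings, because that is what the input lists contain. If numeric columns are needed, one possible follow-up is sketched below; the Distance_km column name is made up for the example:

```python
import pandas as pd

list1 = ['Rank', 'Athlete', 'Distance', 'Runs', 'Longest', 'Avg. Pace', 'Elev. Gain']
list2 = (['1', 'Jack', '57.4 km', '4', '21.7 km', '5:57 /km', '994 m'],
         ['2', 'Jill', '34.0 km', '2', '17.9 km', '5:27 /km', '152 m'],
         ['3', 'Kelsey', '32.6 km', '2', '21.3 km', '5:46 /km', '141 m'])

# Each inner list is one row; list1 supplies the column names
df = pd.DataFrame(list2, columns=list1)

# Convert the purely numeric string columns to integers
df = df.astype({'Rank': int, 'Runs': int})

# Strip the unit and parse the distance as a float (hypothetical new column)
df['Distance_km'] = df['Distance'].str.replace(' km', '', regex=False).astype(float)
print(df[['Rank', 'Athlete', 'Runs', 'Distance_km']])
```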

Multiple lists to Pandas DataFrame

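This is not easily supported when the lists have different lengths, but it can be done. One way is to build a DataFrame with one list per row and then transpose it; shorter lists are padded with NaN. Assuming your lists are A, B, and C: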

pd.DataFrame([A, B, C]).T

     0    1    2
0  1.0  5.0  1.0
1  2.0  4.0  2.0
2  3.0  6.0  4.0
3  4.0  7.0  5.0
4  5.0  2.0  6.0
5  NaN  NaN  7.0
6  NaN  NaN  8.0
7  NaN  NaN  9.0
8  NaN  NaN  0.0

Another option is DataFrame.from_dict with the "index" orient, which also keeps the column names:

pd.DataFrame.from_dict({'A' : A, 'B' : B, 'C' : C}, orient='index').T

     A    B    C
0  1.0  5.0  1.0
1  2.0  4.0  2.0
2  3.0  6.0  4.0
3  4.0  7.0  5.0
4  5.0  2.0  6.0
5  NaN  NaN  7.0
6  NaN  NaN  8.0
7  NaN  NaN  9.0
8  NaN  NaN  0.0

A third solution with zip_longest and DataFrame.from_records:

from itertools import zip_longest
pd.DataFrame.from_records(zip_longest(A, B, C), columns=['A', 'B', 'C'])
# pd.DataFrame.from_records(list(zip_longest(A, B, C)), columns=['A', 'B', 'C'])

     A    B  C
0  1.0  5.0  1
1  2.0  4.0  2
2  3.0  6.0  4
3  4.0  7.0  5
4  5.0  2.0  6
5  NaN  NaN  7
6  NaN  NaN  8
7  NaN  NaN  9
8  NaN  NaN  0
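For reference, here is a self-contained sketch with A, B and C reconstructed from the printed output above (an assumption, since the original lists are not shown). It also tries pd.concat on Series, which is not one of the three approaches above but should give a frame of the same shape:

```python
import pandas as pd

# Lists reconstructed from the printed output; treat them as illustrative
A = [1, 2, 3, 4, 5]
B = [5, 4, 6, 7, 2]
C = [1, 2, 4, 5, 6, 7, 8, 9, 0]

# Approach from above: one list per row, then transpose
via_dict = pd.DataFrame.from_dict({'A': A, 'B': B, 'C': C}, orient='index').T

# Alternative: Series are aligned on their index, shorter ones get NaN
via_concat = pd.concat(
    [pd.Series(A, name='A'), pd.Series(B, name='B'), pd.Series(C, name='C')],
    axis=1,
)
print(via_concat)
```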

How to use multiple lists of lists to append new rows to a dataframe?

You can use itertools.chain to flatten each list, construct a dictionary with the flattened lists and cast it to a DataFrame:

from itertools import chain
A, B, C = [list(chain.from_iterable(lst)) for lst in [List_a, List_b, List_c]]
out = pd.DataFrame({'A': A, 'B': B, 'C': C})

Output:

     A   B   C
0    1  16  31
1    2  17  32
2    3  18  33
3    4  19  34
4    5  20  35
5    6  21  36
6    7  22  37
7    8  23  38
8    9  24  39
9   10  25  40
10  11  26  41
11  12  27  42
12  13  28  43
13  14  29  44
14  15  30  45
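To make this runnable on its own, the sketch below defines List_a, List_b and List_c as lists of three sublists each. That nesting is an assumption chosen to reproduce the output above; the lists in the question may be grouped differently:

```python
from itertools import chain

import pandas as pd

# Assumed inputs: values match the output above, the grouping is a guess
List_a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
List_b = [[16, 17, 18, 19, 20], [21, 22, 23, 24, 25], [26, 27, 28, 29, 30]]
List_c = [[31, 32, 33, 34, 35], [36, 37, 38, 39, 40], [41, 42, 43, 44, 45]]

# chain.from_iterable concatenates the sublists into one flat sequence
A, B, C = [list(chain.from_iterable(lst)) for lst in [List_a, List_b, List_c]]
out = pd.DataFrame({'A': A, 'B': B, 'C': C})
print(out)
```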

How can I convert two lists into a dataframe, having one as a list of lists?

  1. You can either pre-process the data using zip then build the DF

    names = ['ABCD', 'LTAP', 'DEFG', 'FFEE']
    list_ids = [[100, 200], [3333], [1500, 99, 870], [2]]

    flat_list = [(item, name) for sublist, name in zip(list_ids, names) for item in sublist]
    id_df = pd.DataFrame(flat_list, columns=['id', 'name'])

    Intermediate flat_list is

    flat_list > [(100, 'ABCD'), (200, 'ABCD'), (3333, 'LTAP'), ...



  2. Or build the df with raw data, then use explode

    id_df = pd.DataFrame({'id': list_ids, 'name': names}).explode('id')

    Intermediate pd.DataFrame({'id': list_ids, 'name': names}) is

                    id  name
    0       [100, 200]  ABCD
    1           [3333]  LTAP
    2  [1500, 99, 870]  DEFG
    3              [2]  FFEE
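After explode, the rows keep the index of the row they came from and the id column has object dtype. A sketch of the second option with two optional clean-up steps (resetting the index and casting id to int):

```python
import pandas as pd

names = ['ABCD', 'LTAP', 'DEFG', 'FFEE']
list_ids = [[100, 200], [3333], [1500, 99, 870], [2]]

id_df = (
    pd.DataFrame({'id': list_ids, 'name': names})
    .explode('id')            # one row per element of each id list
    .reset_index(drop=True)   # replace the repeated index 0, 0, 1, 2, 2, 2, 3
    .astype({'id': int})      # explode leaves object dtype behind
)
print(id_df)
```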

Combine multiple rows of lists into one big list using pandas

Use explode() on the column, then convert the result to a list:

lst = df['fruits'].explode().to_list()
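One detail worth checking on your own data: explode turns an empty list into a single NaN row, so the flattened list can contain NaN. The sketch below uses a made-up fruits column and drops those entries with dropna():

```python
import pandas as pd

# Illustrative data; the second row is an empty list
df = pd.DataFrame({
    'fruits': [['papaya', 'mangosteen', 'banana'], [], ['coconut', 'mango']]
})

exploded = df['fruits'].explode()
print(len(exploded))  # the empty list contributes one NaN row

# Drop the NaN placeholders before converting to a plain list
lst = exploded.dropna().to_list()
print(lst)
```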

Read text files with multiple lists with spacings and commas exist between elements in the lists into pandas dataframe

You can use read_csv, by specifying a separator which will not occur in the lines (e.g. \0) (so that each line will be read as a whole) and ast.literal_eval as a converter for the values:

import ast

pd.read_csv('tropical.txt', header=None, sep='\0', names=['fruits'], converters={ 'fruits' : ast.literal_eval })

Output:

                         fruits
0  [papaya, mangosteen, banana]
1                            []
2              [coconut, mango]
3          [mangosteen, papaya]
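If the separator trick is awkward in your setup, a plain-Python variant is to parse each line yourself and build the dataframe afterwards. This is a different technique from read_csv with converters, sketched here with io.StringIO standing in for tropical.txt; the file contents shown are an assumption based on the output above:

```python
import ast
import io

import pandas as pd

# Stand-in for open('tropical.txt'); the exact file contents are assumed
text = io.StringIO(
    "['papaya', 'mangosteen', 'banana']\n"
    "[]\n"
    "['coconut', 'mango']\n"
    "['mangosteen', 'papaya']\n"
)

# literal_eval safely parses each line as a Python list literal
rows = [ast.literal_eval(line.strip()) for line in text if line.strip()]
df = pd.DataFrame({'fruits': rows})
print(df)
```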

