Create a Data.Frame Where a Column Is a List

Slightly obscurely, from ?data.frame:

If a list or data frame or matrix is passed to ‘data.frame’ it is as
if each component or column had been passed as a separate argument
(except for matrices of class ‘"model.matrix"’ and those protected by
‘I’).

(emphasis added on "those protected by ‘I’").

So

data.frame(a=1:3,b=I(list(1,1:2,1:3)))

seems to work.

Get list from pandas dataframe column or row?

A pandas DataFrame column is a pandas Series when you pull it out, so you can call x.tolist() on it to turn it into a Python list. Alternatively, you can cast it with list(x).

import pandas as pd

data_dict = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(data_dict)

print(f"DataFrame:\n{df}\n")
print(f"column types:\n{df.dtypes}")

col_one_list = df['one'].tolist()

col_one_arr = df['one'].to_numpy()

print(f"\ncol_one_list:\n{col_one_list}\ntype:{type(col_one_list)}")
print(f"\ncol_one_arr:\n{col_one_arr}\ntype:{type(col_one_arr)}")

Output:

DataFrame:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4

column types:
one    float64
two      int64
dtype: object

col_one_list:
[1.0, 2.0, 3.0, nan]
type:<class 'list'>

col_one_arr:
[ 1.  2.  3. nan]
type:<class 'numpy.ndarray'>
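
The question also asks about rows. A minimal sketch, assuming a small made-up frame (the data and variable names here are illustrative, not from the question): a row selected with .loc or .iloc is also a Series, so the same tolist() call applies.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({"one": [1.0, 2.0, 3.0], "two": [1, 2, 3]},
                  index=["a", "b", "c"])

# A single row is a Series, so tolist() works the same way as for a column
row_by_label = df.loc["b"].tolist()    # select by index label
row_by_pos = df.iloc[0].tolist()       # select by integer position

# Every row at once, as a list of lists
all_rows = df.to_numpy().tolist()

print(row_by_label)
print(row_by_pos)
print(all_rows)
```

Note that a row spanning columns of different dtypes is upcast to a common dtype, so the integers from column 'two' come back as floats here.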

Make dataframe from list of lists but each element a column

If the names always match one-to-one, you can simply do:

do.call(rbind, lapply(myList, unlist))
#     L1 L2 L3 a1 a2 a3 b1 b2 b3
#[1,]  1  2  3  1  2  3  1  2  3
#[2,]  4  5  6  4  5  6  4  5  6
#[3,]  7  8  9  7  8  9  7  8  9

Pandas Create Data Frame Column from List

Wouldn't that be:

import pandas as pd

values = ['a', 'b', 'c']  # renamed from `list`, which would shadow the built-in
df = pd.DataFrame({'header': values})

  header
0      a
1      b
2      c

Create a dataframe of a list of dataframes

IIUC, the code below should be equivalent to your loop and produce what you expect:

out = df[df['Store'].between(1, 45)].groupby(['Store', 'Date'])['Weekly_Sales'].sum() \
        .unstack(level='Store').reset_index(drop=True) \
        .rename_axis(columns=None).add_prefix('Store')
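
To see what that chain produces, here is a small self-contained sketch. The column names Store, Date and Weekly_Sales come from the answer; the sample rows are invented for illustration, including a store 46 that the between(1, 45) filter should drop.

```python
import pandas as pd

# Invented sample data; only the column names come from the answer above
df = pd.DataFrame({
    "Store": [1, 1, 2, 2, 46],
    "Date": ["2010-02-05", "2010-02-12", "2010-02-05", "2010-02-12", "2010-02-05"],
    "Weekly_Sales": [10.0, 20.0, 30.0, 40.0, 99.0],
})

out = (
    df[df["Store"].between(1, 45)]                      # keep stores 1..45 only
    .groupby(["Store", "Date"])["Weekly_Sales"].sum()   # one total per store and date
    .unstack(level="Store")                             # stores become columns
    .reset_index(drop=True)                             # drop the Date index
    .rename_axis(columns=None)                          # remove the "Store" axis label
    .add_prefix("Store")                                # 1 -> "Store1", 2 -> "Store2"
)
print(out)
```

The parenthesised form is the same chain as above, just without the trailing backslashes.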

Convert List to Pandas Dataframe Column

Use:

L = ['Thanks You', 'Its fine no problem', 'Are you sure']

#create new df
df = pd.DataFrame({'col':L})
print (df)

                   col
0           Thanks You
1  Its fine no problem
2         Are you sure

df = pd.DataFrame({'oldcol':[1,2,3]})

#add column to existing df
df['col'] = L
print (df)
   oldcol                  col
0       1           Thanks You
1       2  Its fine no problem
2       3         Are you sure
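
One caveat when adding a list to an existing frame: the list has to be as long as the frame. A minimal sketch (the short list and the variable names are made up for illustration) showing the error, and one possible workaround that wraps the list in a Series so pandas aligns on the index and fills the rest with NaN:

```python
import pandas as pd

df = pd.DataFrame({"oldcol": [1, 2, 3]})
short = ["only", "two"]  # hypothetical list, one element too short

# Assigning a plain list of the wrong length raises ValueError
try:
    df["col"] = short
    failed = False
except ValueError as exc:
    failed = True
    print(f"ValueError: {exc}")

# A Series is aligned on the index instead, so the missing row becomes NaN
df["col"] = pd.Series(short)
print(df)
```

Whether NaN padding is acceptable depends on your data; it is only one way to handle the mismatch.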

Thank you DYZ:

#default column name 0
df = pd.DataFrame(L)
print (df)
                     0
0           Thanks You
1  Its fine no problem
2         Are you sure

Python: create a pandas data frame from a list

DataFrame.from_records treats each string as a list of characters, so it expects as many columns as the string has characters.

You can simply use the DataFrame constructor instead.

In [3]: pd.DataFrame(q_list, columns=['q_data'])
Out[3]:
      q_data
0  112354401
1  116115526
2  114909312
3  122425491
4  131957025
5  111373473
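
As a self-contained sketch of the constructor approach: the contents of q_list are assumed here to be strings (which is what the from_records behaviour described above suggests), with values copied from the output shown. The astype(int) step is an optional extra, not part of the original answer.

```python
import pandas as pd

# Assumed: q_list holds strings; values taken from the output above
q_list = ["112354401", "116115526", "114909312"]

# The constructor treats each string as one cell, giving a single column
df = pd.DataFrame(q_list, columns=["q_data"])
print(df)

# Optional: convert the strings to integers if numeric work is needed
df["q_data"] = df["q_data"].astype(int)
print(df.dtypes)
```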

How to create a DataFrame of a single column from a list where the first element is the column name in python

I think you need read_csv here, with the usecols parameter to select only the second column:

df = pd.read_csv('df_seg_sample.csv', usecols=[1])
print (df)
     rev
0  15.31
1  64.90
2  18.36
3  62.85
4  10.31
5  12.84
6  69.95
7  32.81

But if you want to use your own solution, you need to wrap the column name in [] to make it a one-item list, and use only the DataFrame constructor:

data = [x[1] for x in c_reader]
print (data)
['rev', '15.31', '64.9', '18.36', '62.85', '10.31', '12.84', '69.95', '32.81']

df = pd.DataFrame(data[1:], columns=[data[0]])
print (df)
     rev
0  15.31
1   64.9
2  18.36
3  62.85
4  10.31
5  12.84
6  69.95
7  32.81
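
The second approach can be tried without the CSV file. This is a minimal sketch that assumes c_reader is a csv.reader; the in-memory text and the first column name 'id' are invented for illustration. Because csv.reader yields strings, the column holds text until it is converted, which is why the output above shows 64.9 rather than 64.90.

```python
import csv
import io

import pandas as pd

# Invented stand-in for df_seg_sample.csv; 'id' is a hypothetical first column
raw = "id,rev\n1,15.31\n2,64.9\n3,18.36\n"
c_reader = csv.reader(io.StringIO(raw))

# Second field of every row, header included
data = [row[1] for row in c_reader]
print(data)

# First element is the column name, the rest are the values
df = pd.DataFrame(data[1:], columns=[data[0]])

# The values are still strings; convert them if numbers are needed
df["rev"] = pd.to_numeric(df["rev"])
print(df)
```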

Add column in dataframe from list

IIUC, if you make your (unfortunately named) List into an ndarray, you can simply index into it naturally.

>>> import numpy as np
>>> m = np.arange(16)*10
>>> m[df.A]
array([ 0, 40, 50, 60, 150, 150, 140, 130])
>>> df["D"] = m[df.A]
>>> df
    A   B   C    D
0   0 NaN NaN    0
1   4 NaN NaN   40
2   5 NaN NaN   50
3   6 NaN NaN   60
4  15 NaN NaN  150
5  15 NaN NaN  150
6  14 NaN NaN  140
7  13 NaN NaN  130

Here I built a new m, but if you use m = np.asarray(List), the same thing should work: the values in df.A will pick out the appropriate elements of m.
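
Put together as a runnable sketch: the list is called values here rather than List, and its contents are assumed to match the m built above. The explicit to_numpy() call is a cautious choice that does not depend on how numpy handles a Series as an index.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the question's List (same contents as m above)
values = [i * 10 for i in range(16)]
df = pd.DataFrame({"A": [0, 4, 5, 6, 15, 15, 14, 13]})

# Convert the list once, then use column A as positions into it
m = np.asarray(values)
df["D"] = m[df["A"].to_numpy()]
print(df)
```

Each value in column A is used as a position in m, so A must contain valid integer indices for the list.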


Note that if you're using an old version of numpy, you might have to use m[df.A.values] instead. In the past, numpy didn't play well with others, and some refactoring in pandas caused some headaches. Things have improved now.


