Pandas Convert Dataframe to Array of Tuples

Pandas convert dataframe to array of tuples

list(data_set.itertuples(index=False))

As of pandas 0.17.1, the above returns a list of namedtuples.

If you want a list of ordinary tuples, pass name=None as an argument:

list(data_set.itertuples(index=False, name=None))
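
To see the difference between the two calls, here is a minimal sketch. The frame data_set and its columns x and y are made up for illustration and do not come from the original question.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not from the question).
data_set = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})

# Default: each row is a namedtuple, so fields are reachable by column name.
named = list(data_set.itertuples(index=False))
print(named[0].x, named[0].y)  # 1 a

# name=None: each row is an ordinary tuple.
plain = list(data_set.itertuples(index=False, name=None))
print(plain)  # [(1, 'a'), (2, 'b')]
```

The namedtuple form is handy when you want attribute access; the plain form is the one to use if the tuples have to be compared against or stored as regular tuples.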

How to convert dataframe into list of tuples with respect to category?

Use GroupBy.agg (or GroupBy.apply) with tuple, then convert the result to a list:

# All four expressions print the same list, shown once below.
print(df.groupby('id_customer', sort=False)['product_id'].agg(tuple).tolist())
print(df.groupby('id_customer', sort=False)['product_id'].apply(tuple).tolist())

print(list(df.groupby('id_customer', sort=False)['product_id'].agg(tuple)))
print(list(df.groupby('id_customer', sort=False)['product_id'].apply(tuple)))
[('5', '7', '8'), ('5', '30')]
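
The answer does not show the input frame, so here is a self-contained sketch with assumed data chosen to reproduce the output above. The customer ids 1 and 2 are hypothetical; only the column names and product ids are taken from the answer.

```python
import pandas as pd

# Assumed input: two customers, product ids stored as strings.
df = pd.DataFrame({
    "id_customer": [1, 1, 1, 2, 2],
    "product_id": ["5", "7", "8", "5", "30"],
})

# sort=False keeps the groups in order of first appearance.
result = df.groupby("id_customer", sort=False)["product_id"].agg(tuple).tolist()
print(result)  # [('5', '7', '8'), ('5', '30')]
```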

Pandas convert dataframe to array of tuples without None

Use a nested list comprehension with filtering:

data = [tuple([y for y in x if y is not None]) for x in data.values]
print (data)
[('a', 'b', 'c', 'd'), ('a', 'b', 'c')]

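A slower alternative for large data: reshape with stack(), which removes the Nones (this relies on its legacy default of dropping missing values), then group by the first level of the resulting MultiIndex and aggregate each group into a tuple: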

data = data.stack().groupby(level=0).apply(tuple).tolist()
print (data)
[('a', 'b', 'c', 'd'), ('a', 'b', 'c')]
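
Neither snippet shows the input, so here is a runnable sketch of the list comprehension on an assumed frame. The sample rows and dtype=object are assumptions, and pd.notna is swapped in for the `is not None` check so that NaN markers are filtered as well.

```python
import pandas as pd

# Hypothetical input: the second row is one value short, so it ends in None.
# dtype=object is an assumption that keeps None from being converted to NaN.
data = pd.DataFrame([["a", "b", "c", "d"], ["a", "b", "c", None]], dtype=object)

# pd.notna treats both None and NaN as missing, so either marker is dropped.
rows = [tuple(y for y in x if pd.notna(y)) for x in data.to_numpy()]
print(rows)  # [('a', 'b', 'c', 'd'), ('a', 'b', 'c')]
```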

List of tuples for each pandas dataframe slice

I think this should work too:

import pandas as pd
import itertools

df = pd.DataFrame({"A": [1, 2, 3, 1], "B": [2, 2, 2, 2], "C": ["A", "B", "C", "B"]})

tuples_in_df = sorted(tuple(df.to_records(index=False)), key=lambda x: x[0])
output = [[tuple(x)[1:] for x in group] for _, group in itertools.groupby(tuples_in_df, lambda x: x[0])]
print(output)

Out:

[[(2, 'A'), (2, 'B')], [(2, 'B')], [(2, 'C')]]
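
For comparison, much the same result can be had by letting pandas do the grouping instead of itertools. This is a different technique from the answer above, sketched on the same sample frame; it assumes you want the groups ordered by the values in column A.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 1], "B": [2, 2, 2, 2], "C": ["A", "B", "C", "B"]})

# One inner list per value of A; each row contributes a plain (B, C) tuple.
output = [
    list(group[["B", "C"]].itertuples(index=False, name=None))
    for _, group in df.groupby("A")
]
print(output)  # [[(2, 'A'), (2, 'B')], [(2, 'B')], [(2, 'C')]]
```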

Converting pandas dataframe into list of tuples with index

You can iterate over the result of to_records(index=True).

Say you start with this:

In [6]: df = pd.DataFrame({'a': range(3, 7), 'b': range(1, 5), 'c': range(2, 6)}).set_index('a')

In [7]: df
Out[7]:
   b  c
a
3  1  2
4  2  3
5  3  4
6  4  5

then this works, except that it does not include the index (a):

In [8]: [tuple(x) for x in df.to_records(index=False)]
Out[8]: [(1, 2), (2, 3), (3, 4), (4, 5)]

However, if you pass index=True, then it does what you want:

In [9]: [tuple(x) for x in df.to_records(index=True)]
Out[9]: [(3, 1, 2), (4, 2, 3), (5, 3, 4), (6, 4, 5)]
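
As a possible alternative to to_records, itertuples can also include the index. This is a sketch on the same frame, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': range(3, 7), 'b': range(1, 5), 'c': range(2, 6)}).set_index('a')

# index=True puts the index value first; name=None yields ordinary tuples.
with_index = list(df.itertuples(index=True, name=None))
print(with_index)  # [(3, 1, 2), (4, 2, 3), (5, 3, 4), (6, 4, 5)]
```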

Converting dictionary with tuple as key and values as array to a pandas dataframe

Use:

import numpy as np
import pandas as pd

d = {(0, 6): np.array([[1, 2, 3, 0, 1, 1]]),
     (0, 9): np.array([[1, 2, 3, 0, 1, 1]])}

df = pd.DataFrame.from_dict({k: v[0] for k, v in d.items()}, orient='index')
df = df.rename_axis('Key').rename(columns=lambda x: f'v{x+1}').reset_index()
print (df)
      Key  v1  v2  v3  v4  v5  v6
0  (0, 6)   1   2   3   0   1   1
1  (0, 9)   1   2   3   0   1   1

Or:

df = pd.DataFrame(np.vstack(list(d.values()))).rename(columns=lambda x: f'v{x+1}')
df.insert(0,'Key',list(d.keys()))
print (df)
      Key  v1  v2  v3  v4  v5  v6
0  (0, 6)   1   2   3   0   1   1
1  (0, 9)   1   2   3   0   1   1
