"Unpacking" a Factor List from a Data.Frame

The answer depends on the format of category_list. If it is in fact a list for each row, something like either of the following

mydf <- data.frame(ID = paste0('ID',1:3), 
 category_list = I(list(c('cat1','cat2','cat3'),  c('cat2','cat3'), c('cat1'))), 
 xval = 1:3, yval = 1:3)

library(data.table)
mydf <- as.data.frame(data.table(ID = paste0('ID',1:3), 
 category_list = list(c('cat1','cat2','cat3'),  c('cat2','cat3'), c('cat1')), 
 xval = 1:3, yval = 1:3) )

Then you can use plyr and merge to create your long-form data

 library(plyr)
 newdf <- merge(mydf, ddply(mydf, .(ID), summarize, cat_list = unlist(category_list)), by = 'ID')

   ID    category_list xval yval cat_list
1 ID1 cat1, cat2, cat3    1    1     cat1
2 ID1 cat1, cat2, cat3    1    1     cat2
3 ID1 cat1, cat2, cat3    1    1     cat3
4 ID2       cat2, cat3    2    2     cat2
5 ID2       cat2, cat3    2    2     cat3
6 ID3             cat1    3    3     cat1

or a non-plyr approach that doesn't require merge

 do.call(rbind,lapply(split(mydf, mydf$ID), transform, cat_list = unlist(category_list)))
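
For comparison, here is a rough pandas sketch of the same long-form reshaping. It assumes the list column holds plain Python lists; the frame below is only a stand-in that mirrors mydf above, and DataFrame.explode repeats the other columns once per list element.

```python
import pandas as pd

# Stand-in for mydf above, with category_list held as Python lists (an assumption)
mydf = pd.DataFrame({
    "ID": ["ID1", "ID2", "ID3"],
    "category_list": [["cat1", "cat2", "cat3"], ["cat2", "cat3"], ["cat1"]],
    "xval": [1, 2, 3],
    "yval": [1, 2, 3],
})

# Copy the list column, then explode the copy: one output row per list element
newdf = (
    mydf.assign(cat_list=mydf["category_list"])
        .explode("cat_list")
        .reset_index(drop=True)
)
print(newdf[["ID", "xval", "yval", "cat_list"]])
```

Because the original category_list column is left untouched, the result has the same columns as the merged R output.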

Unpacking and merging lists in a column in data.frame

Here's a possible data.table approach

library(data.table)
setDT(dat)[, .(name = c(name, unlist(altNames))), by = id]
#       id  name
#  1: 1001  Joan
#  2: 1002  Jane
#  3: 1002 Janie
#  4: 1002 Janet
#  5: 1002   Jan
#  6: 1003  John
#  7: 1003   Jon
#  8: 1004  Bill
#  9: 1004  Will
# 10: 1005   Tom
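
A pandas sketch of the same idea, for comparison. The input frame is hypothetical: it is reconstructed from the output above, assuming altNames holds a (possibly empty) list per row.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed result above
dat = pd.DataFrame({
    "id": [1001, 1002, 1003, 1004, 1005],
    "name": ["Joan", "Jane", "John", "Bill", "Tom"],
    "altNames": [[], ["Janie", "Janet", "Jan"], ["Jon"], ["Will"], []],
})

# Prepend the primary name to its alternatives so every row has at least one entry
all_names = [[n] + list(alts) for n, alts in zip(dat["name"], dat["altNames"])]

# Explode to one row per name, keeping the id alongside
long_df = (
    pd.DataFrame({"id": dat["id"], "name": all_names})
      .explode("name")
      .reset_index(drop=True)
)
print(long_df)
```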

Unpacking a list-column of multi-row tibbles while preserving the number of rows

Here's one solution using unnest_wider

library(tidyr)
unnest_wider(tmp, y) %>% 
      unnest_wider(a, names_repair = ~gsub('...', 'a', .)) %>% 
      unnest_wider(b, names_repair = ~gsub('...', 'b', .))

New names:
* `` -> ...1
...
New names:
* `` -> ...1
* `` -> ...2
# A tibble: 2 x 5
      x    a1    a2    b1    b2
  <dbl> <dbl> <int> <dbl> <int>
1     1     1    NA     2    NA
2     2     4     5     6     7

Unpacking unknown object from within DataFrame

It seems you have objects of some class with attributes id and name, so you can try to get

{'id': st.id, 'name': st.name}

which means

df['customer_details'] = df['customer_details'].apply(lambda x: {'id': x.id, 'name': x.name})

or directly into separate columns

df['id']   = df['customer_details'].apply(lambda x: x.id)
df['name'] = df['customer_details'].apply(lambda x: x.name)

Example code:

import pandas as pd

class customer:
    def __init__(self, id_, name):
        self.id = id_
        self.name = name
    def __str__(self):
        return '<customer {{id: {}, name: {}}} as x>'.format(self.id, self.name)

data = {
    'trasaction_id': [1,2,3],
    'customer_details': [
        customer('A123', 'Tina'),
        customer('B456', 'Tony'),
        customer('C789', 'Tim')
    ],
}

df = pd.DataFrame(data)
print(df)

# ---

df['id'] = df['customer_details'].apply(lambda x: x.id)
df['name'] = df['customer_details'].apply(lambda x: x.name)
print(df)

df['customer_details'] = df['customer_details'].apply(lambda x: {'id': x.id, 'name': x.name})
print(df)

#new_df = pd.DataFrame( df['customer_details'].to_list() )

Result:

   trasaction_id                        customer_details
0              1  <customer {id: A123, name: Tina} as x>
1              2  <customer {id: B456, name: Tony} as x>
2              3   <customer {id: C789, name: Tim} as x>

   trasaction_id                        customer_details    id  name
0              1  <customer {id: A123, name: Tina} as x>  A123  Tina
1              2  <customer {id: B456, name: Tony} as x>  B456  Tony
2              3   <customer {id: C789, name: Tim} as x>  C789   Tim

   trasaction_id                customer_details    id  name
0              1  {'id': 'A123', 'name': 'Tina'}  A123  Tina
1              2  {'id': 'B456', 'name': 'Tony'}  B456  Tony
2              3   {'id': 'C789', 'name': 'Tim'}  C789   Tim
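
The commented-out line in the example hints at one more step: once the column holds dicts, it can be expanded into its own frame. A minimal sketch, assuming every dict has the same keys; the names details and new_df are just for illustration.

```python
import pandas as pd

# Same shape as the final frame above: one dict per row
df = pd.DataFrame({
    "trasaction_id": [1, 2, 3],
    "customer_details": [
        {"id": "A123", "name": "Tina"},
        {"id": "B456", "name": "Tony"},
        {"id": "C789", "name": "Tim"},
    ],
})

# One column per dict key; reuse the original index so the join lines up
details = pd.DataFrame(df["customer_details"].to_list(), index=df.index)

# Swap the dict column for the expanded columns
new_df = df.drop(columns="customer_details").join(details)
print(new_df)
```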

EDIT: If the column holds strings instead of objects, you can use a regex to extract the values from each string

import pandas as pd
import re

data = {
    'trasaction_id': [1,2,3],
    'customer_details': [
        "<customer {id:'A123', name: 'Tina'} as x >",
        "<customer {id:'B456', name: 'Tony'} as x >",
        "<customer {id:'C789', name: 'Tim'} as x >",
    ]
}

df = pd.DataFrame(data)
print(df)

# ---

df['id'] = df['customer_details'].apply(lambda x: re.search("id:'(.*)',", x)[1])
df['name'] = df['customer_details'].apply(lambda x: re.search("name: '(.*)'}", x)[1])
print(df)

def myfunc(x):
    r = re.search("id:'(.*)', name: '(.*)'}", x)
    return {'id': r[1], 'name': r[2]}

df['customer_details'] = df['customer_details'].apply(myfunc)
print(df)

#new_df = pd.DataFrame( df['customer_details'].to_list() )
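
As an alternative sketch, Series.str.extract can pull both values in one pass. The pattern below is an assumption based on the sample strings above: it uses named groups (which become column names) and [^']* so each capture stops at its closing quote. Rows that do not match would come back as NaN rather than raising an error.

```python
import pandas as pd

df = pd.DataFrame({
    "trasaction_id": [1, 2, 3],
    "customer_details": [
        "<customer {id:'A123', name: 'Tina'} as x >",
        "<customer {id:'B456', name: 'Tony'} as x >",
        "<customer {id:'C789', name: 'Tim'} as x >",
    ],
})

# Named groups become the column names of the extracted frame
pattern = r"id:'(?P<id>[^']*)',\s*name:\s*'(?P<name>[^']*)'"
extracted = df["customer_details"].str.extract(pattern)

# Attach the new columns next to the original string column
df = df.join(extracted)
print(df)
```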

How to unpack a Series of tuples in Pandas?

Maybe this is the most straightforward (and most Pythonic, I guess):

out = out.apply(pd.Series)

If you want to rename the columns to something more meaningful, then:

out.columns=['Kstats','Pvalue']

If you do not want the default name for the index:

out.index.name=None
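
Another way to do the same unpacking is to build the frame directly from the tuples, which usually avoids the per-row overhead of apply. This is a sketch only: the Series below is a made-up stand-in for out, and the column names are the ones used above.

```python
import pandas as pd

# Hypothetical stand-in for `out`: a Series whose values are 2-tuples
out = pd.Series(
    [(0.12, 0.90), (0.45, 0.03), (0.30, 0.25)],
    index=["a", "b", "c"],
)

# Build the frame from the list of tuples, keeping the original index
unpacked = pd.DataFrame(
    out.tolist(), index=out.index, columns=["Kstats", "Pvalue"]
)
print(unpacked)
```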