How to Write a Data-Frame with One Column a List to a File

I can think of a few options, depending on what you're trying to achieve.

If it is for display only, then you might simply want capture.output() or sink(); neither of these would be very convenient to read back into R:

capture.output(dataset, file="myfile.txt")
### Result is a text file that looks like this:
#   a b  c              l
# 1 1 a HI           a, b
# 2 2 b DD        2, 3, 4
# 3 3 c gg 44, 33, 11, 22
# 4 4 d ff chr, ID, i, II
sink("myfile.txt")
dataset
sink()
## Same result as `capture.output()` approach

If you want to be able to read the resulting table back into R (albeit without preserving the fact that column "l" is a list), you can take an approach similar to what @DWin suggested.

In the code below, the dataset2[sapply... line identifies which variables are lists and concatenates them into a single string. Thus, they become simple character variables, allowing you to use write.table().

dataset2 <- dataset # make a copy just to be on the safe side
dataset2[sapply(dataset2, is.list)] <- apply(dataset2[sapply(dataset2, is.list)],
                                             1, function(x)
                                               paste(unlist(x),
                                                     sep=", ", collapse=", "))
str(dataset2)
# 'data.frame': 4 obs. of 4 variables:
# $ a: num 1 2 3 4
# $ b: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
# $ c: Factor w/ 4 levels "DD","ff","gg",..: 4 1 3 2
# $ l: chr "a, b" "2, 3, 4" "44, 33, 11, 22" "chr, ID, i, II"
write.table(dataset2, "myfile.txt", quote=FALSE, sep="\t")
# can be read back in with: dataset3 <- read.delim("myfile.txt")

Write Pandas DataFrame with List in Column to a File

You are almost there; just use ' '.join as the aggregating function for the Receiver column:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Sender': ['Alice', 'Alice', 'Bob', 'Carl', 'Bob', 'Alice'],
                   'Receiver': ['David', 'Eric', 'Frank', 'Ginger', 'Holly', 'Ingrid'],
                   'Emails': [9, 3, 5, 1, 6, 7]
                   })

grouped = df.groupby('Sender')
result = grouped.agg({'Receiver': ' '.join,
                      'Emails': np.sum
                      })

print(result)

Output

                 Receiver  Emails
Sender
Alice   David Eric Ingrid      19
Bob           Frank Holly      11
Carl               Ginger       1

For the sake of completeness, if the Receiver column held ints instead of strings, you could convert the values to strings first and then join:

df = pd.DataFrame({'Sender': ['Alice', 'Alice', 'Bob', 'Carl', 'Bob', 'Alice'],
                   'Receiver': [1, 2, 3, 4, 5, 6],
                   'Emails': [9, 3, 5, 1, 6, 7]
                   })

grouped = df.groupby('Sender')
result = grouped.agg({'Receiver': lambda x: ' '.join(map(str, x)),
                      'Emails': np.sum
                      })

print(result)

Output

       Receiver  Emails
Sender
Alice     1 2 6      19
Bob         3 5      11
Carl          4       1
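
The snippets above stop at printing the aggregated frame, while the question is about getting it into a file. A minimal sketch of that last step, assuming a hypothetical output name result.csv (written to a temporary directory here) and that the space-joined Receiver strings are what you want on disk:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'Sender': ['Alice', 'Alice', 'Bob', 'Carl', 'Bob', 'Alice'],
                   'Receiver': ['David', 'Eric', 'Frank', 'Ginger', 'Holly', 'Ingrid'],
                   'Emails': [9, 3, 5, 1, 6, 7]})

# Join the receivers per sender so the column holds plain strings, not lists
result = df.groupby('Sender').agg({'Receiver': ' '.join, 'Emails': 'sum'})

# 'result.csv' is a hypothetical file name; a temp directory keeps the sketch tidy
out_path = os.path.join(tempfile.mkdtemp(), 'result.csv')
result.to_csv(out_path)  # Sender is the index, so it is written as the first column

# Reading the file back restores the table, with Sender as the index again
restored = pd.read_csv(out_path, index_col='Sender')
print(restored)
```

Because the receivers were joined into one string per row, the file is an ordinary CSV and needs no special handling when it is read back.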

Write a data frame containing a list to csv file

train.user$age <- unlist(train.user$age)

Technically, a data.frame is a list of equal-length vectors, but most functions will assume that all of the columns are atomic vectors and will fail when you try to use a list.

Save a data frame with list-columns as csv file

Create a tibble containing list columns:

library(tibble)

clinic_name <- c('bobo center', 'yoyo plaza', 'lolo market')
drop_in_hours <- list(c("Monday: 2 pm - 5 pm", "Tuesday: 4 pm - 7 pm"))
appointment_hours <- list(c("Monday: 1 pm - 2 pm", "Tuesday: 2 pm - 3 pm"))
services <- list(c("skin graft", "chicken heart replacement"))

tibb <- tibble(clinic_name, drop_in_hours, appointment_hours, services)  # data_frame() is deprecated

print(tibb)

Write a general-purpose function that converts any list columns to character type:

set_lists_to_chars <- function(x) {
  if (is.list(x)) {
    # collapse each list element (one per row) into a comma-separated string
    y <- vapply(x, function(el) paste(unlist(el), collapse = ', '), character(1))
  } else {
    y <- x
  }
  return(y)
}

Apply function to tibble with list columns:

new_frame <- data.frame(lapply(tibb, set_lists_to_chars), stringsAsFactors = FALSE)

new_frame

Write newly formatted dataframe as csv file:

write.csv(new_frame, file='Desktop/clinics.csv')

This is a csv file with the list columns expanded as regular strings.

Here is an all-encompassing function. Just pass in your tibble and a filename:

tibble_with_lists_to_csv <- function(tibble_object, file_path_name) {
  # collapse each element of a list column into a comma-separated string
  set_lists_to_chars <- function(x) {
    if (is.list(x)) {
      vapply(x, function(el) paste(unlist(el), collapse = ', '), character(1))
    } else {
      x
    }
  }
  new_frame <- data.frame(lapply(tibble_object, set_lists_to_chars), stringsAsFactors = FALSE)
  write.csv(new_frame, file = file_path_name)
}

Usage:

tibble_with_lists_to_csv(tibb, '~/Desktop/tibb.csv')

Writing lists of data to a text file column by column

You could just use pandas, e.g.:

import pandas as pd
df = pd.DataFrame({'time(s)' : [0, 0.005, 0.001], 'voltage(V)' : ['0000000000','0000110001','0001100000'], 'current(A)' : ['101101010','101011000','101011000']})

or, if

a = [0, 0.005, 0.001] 
b = ['0000000000','0000110001','0001100000']
c = ['101101010','101011000','101011000']

you could just do

df = pd.DataFrame({'time(s)' : a, 'voltage(V)' : b, 'current(A)' : c})

then

df.to_csv('powdata.txt', sep='\t')
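
Two things to watch with this approach: to_csv also writes the row index as an extra first column, and reading the file back with default settings parses strings such as '0000110001' as integers, which drops the leading zeros. A minimal sketch of a round trip that avoids both, assuming you want the digit strings kept as text (an in-memory buffer stands in for powdata.txt):

```python
import io

import pandas as pd

a = [0, 0.005, 0.001]
b = ['0000000000', '0000110001', '0001100000']
c = ['101101010', '101011000', '101011000']
df = pd.DataFrame({'time(s)': a, 'voltage(V)': b, 'current(A)': c})

# An in-memory buffer stands in for 'powdata.txt' so the sketch has no side effects
buf = io.StringIO()
df.to_csv(buf, sep='\t', index=False)  # index=False drops the 0, 1, 2 row labels

# Read the digit strings back as text so leading zeros survive
buf.seek(0)
restored = pd.read_csv(buf, sep='\t',
                       dtype={'voltage(V)': str, 'current(A)': str})
print(restored)
```

Whether to keep the index is a judgment call; for a plain column-by-column text file it is usually noise.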

Save each dataframe in a list as a CSV, and make the filename the same as the df name - R; lapply

If you want to iterate over both the names and the data.frames, you are better off using mapply to walk both lists at the same time:

mapply(function(dname, data)
         write.csv(data, file = paste0("C:/home/", dname, ".csv"), row.names = FALSE),
       names(df), df)

When iterating over a list via lapply(), the name of the current value is not available. An alternative using lapply is to iterate over the names, not the values:

lapply(names(df), function(dname)
  write.csv(df[[dname]], file = paste0("C:/home/", dname, ".csv"), row.names = FALSE))

How to export a column from a dataframe to a text file with left alignment in Python Pandas?

If the column you want to write is named COLUMN_NAME, you can do:

with open("output.txt", "w") as f_out:
    f_out.write("\n".join(df["COLUMN_NAME"]))

This creates output.txt:

Michael Jordan
Scottie Pippen
Dirk
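
Writing the values as-is already starts every line at the left margin. If the column is not all strings, or you want every line padded to a common width (for instance so that something else can line up after it), one possible variation is sketched here, assuming a toy frame and the same placeholder column name COLUMN_NAME (an in-memory buffer stands in for output.txt):

```python
import io

import pandas as pd

# Toy data; COLUMN_NAME is the placeholder column name used above
df = pd.DataFrame({'COLUMN_NAME': ['Michael Jordan', 'Scottie Pippen', 'Dirk']})

col = df['COLUMN_NAME'].astype(str)   # str.join needs strings, so convert first
width = int(col.str.len().max())      # width of the longest entry
padded = col.str.ljust(width)         # pad on the right, i.e. left-aligned text

# An in-memory buffer stands in for output.txt
f_out = io.StringIO()
f_out.write('\n'.join(padded))
lines = f_out.getvalue().split('\n')
print(lines)
```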

Convert List to Pandas Dataframe Column

Use:

import pandas as pd

L = ['Thanks You', 'Its fine no problem', 'Are you sure']

# create new df
df = pd.DataFrame({'col': L})
print(df)

                   col
0           Thanks You
1  Its fine no problem
2         Are you sure

df = pd.DataFrame({'oldcol': [1, 2, 3]})

# add column to existing df
df['col'] = L
print(df)

   oldcol                  col
0       1           Thanks You
1       2  Its fine no problem
2       3         Are you sure

Thank you DYZ:

# default column name 0
df = pd.DataFrame(L)
print(df)

                     0
0           Thanks You
1  Its fine no problem
2         Are you sure

