How to Append a Whole Dataframe to a CSV in R

OK, so I realised that append = TRUE does work with write.table(); the catch is that write.table() needs the sep argument set explicitly to produce CSV output. This works, writing column headers only when the file does not already exist:

write.table(myDF, "myDF.csv", sep = ",", col.names = !file.exists("myDF.csv"), append = TRUE)

Appending a new line into an existing csv file

You cannot append using write.csv(). Instead you need to use write.table() and specify a few additional parameters. The following will append a new row of data to your csv file, omitting the column headers for the append. That means you only need to include column headers when you write the table the first time; after that, the data should flow in under the same headers.

write.table(dataplus,
            file = "./finances.csv",
            append = TRUE,
            sep = ",",
            row.names = FALSE,
            col.names = FALSE)

append dataframes of different size in same csv file (R)

There isn't really an easy way to do exactly what you want.
But one hacky way is to use the sink() function to redirect all console output to a file.

df1 <- data.frame( A= c(1,2,3,4), B= c("a", "b", "c", "d"))
df2 <- data.frame( A= c(1,2,3))

# start redirecting output
sink(file = "file1.csv")
df1
df2
# close the file
sink()

This of course will not give you a valid CSV (the output is space-separated and includes row names), but you can then read the file back in and rewrite it:

file2 <- read.table(file = "file1.csv", sep = " ", fill = TRUE)
write.csv(file2, file = "file2.csv")

Progressive appending of data from read.csv

If the data is fairly small relative to your available memory, just read the data in and don't worry about it. After you have read in all the data and done some cleaning, save the result using save() and have your analysis scripts read that file back in using load(). Separating reading/cleaning scripts from analysis scripts is a good way to reduce this problem.
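A minimal sketch of that read/clean vs. analyse split; the file names and the cleaning step here are illustrative only:

```r
# Placeholder file paths; in practice these would be your project files.
raw_file   <- file.path(tempdir(), "raw.csv")
clean_file <- file.path(tempdir(), "cleaned.RData")

# --- cleaning script ---
raw <- data.frame(x = c(1, 2, NA), y = c("a", "b", "c"))
write.csv(raw, raw_file, row.names = FALSE)

cleaned <- read.csv(raw_file)
cleaned <- cleaned[!is.na(cleaned$x), ]   # example cleaning step
save(cleaned, file = clean_file)          # fast binary format

# --- analysis script ---
rm(cleaned)
load(clean_file)                          # restores the 'cleaned' object by name
nrow(cleaned)
```

The analysis script never touches the raw CSVs, so the slow reading and cleaning happens only once.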

One way to speed up read.csv() is to use the nrows and colClasses arguments. Since you say that you know the number of rows in each file, telling R this will help speed up the reading. You can extract the column classes from a small sample using

colClasses <- sapply(read.csv(file, nrows = 100), class)

then pass the result to the colClasses argument.
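Putting the two arguments together (the file name and data here are illustrative):

```r
# Write a sample file to stand in for one of your CSVs.
file <- file.path(tempdir(), "big.csv")
write.csv(data.frame(a = 1:1000, b = rep(letters[1:4], 250)),
          file, row.names = FALSE)

# Infer column classes from the first 100 rows only...
colClasses <- sapply(read.csv(file, nrows = 100), class)

# ...then reuse them, together with the known row count, for the full read.
dat <- read.csv(file, nrows = 1000, colClasses = colClasses)
```

With colClasses supplied, R skips the type-guessing pass over every column, which is where much of the speed-up comes from.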

If the data is getting close to being too large, you may consider processing individual files and saving intermediate versions. There are a number of related discussions to managing memory on the site that cover this topic.

On memory usage tricks:
Tricks to manage the available memory in an R session

On using the garbage collector function:
Forcing garbage collection to run in R with the gc() command

R: How do I append data frames to a list?

Congrats on choosing a good way to iteratively add frames to an object :-)

Two ways, depending on how you are working:

csv_list <- vector(mode = "list", length = 3)
csv_list[[1]] <- mtcars[1:2,]
csv_list[[2]] <- mtcars[1:2,]
csv_list
# [[1]]
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
# [[2]]
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
# [[3]]
# NULL

or

csv_list <- list()
csv_list <- c(csv_list, list(mtcars[1:2,]))
csv_list <- c(csv_list, list(mtcars[1:2,]))
csv_list
# [[1]]
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
# [[2]]
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4

Notice that this last one is open-ended: its length is not necessarily what you need it to be in the end. Even if you pre-allocate length 36 as in your code, trying to assign [[37]] (first example) or append a 37th frame (second example) will happily work; there is no bounds-checking in this use-case.
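To see that lack of bounds-checking directly, assigning past the pre-allocated length silently grows the list:

```r
csv_list <- vector(mode = "list", length = 3)
csv_list[[4]] <- mtcars[1:2, ]   # index 4 of a length-3 list: no error
length(csv_list)                 # the list has grown to length 4
```

The untouched slots stay NULL, so a later loop over the list needs to tolerate (or filter out) those empty entries.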

(BTW: unlike data.frames and some other objects, arbitrarily appending to list objects does not scale poorly. For instance, if you tracemem() a data frame and append a row, you'll see a memory shift indicating a copy of all of the frame's data, regardless of how many rows you append. Counter to that, though, if you tracemem(csv_list), you can append to it efficiently with either of the above methods, and the memory address of the list never changes, suggesting storage adjustment is a bit more efficient. That's not to say that it's invulnerable, but it's generally quite good.)

how to write multiple dataframe to a single csv file in a loop in R?

Two methods; the first is likely to be a little faster if neighbours_dataframe is a long list (though I haven't tested this).

Method 1: Convert the list of data frames to a single data frame first

As suggested by jbaums.

library(dplyr)
# rbind_all() has since been deprecated and removed from dplyr;
# bind_rows() is its replacement
neighbours_dataframe_all <- bind_rows(neighbours_dataframe)
write.csv(neighbours_dataframe_all, "karate3.csv", row.names = FALSE)

Method 2: use a loop, appending

As suggested by Neal Fultz.

for(i in seq_along(neighbours_dataframe)) {
  write.table(
    neighbours_dataframe[[i]],
    "karate3.csv",
    append = i > 1,     # overwrite on the first pass, append afterwards
    sep = ",",
    row.names = FALSE,
    col.names = i == 1  # headers only on the first pass
  )
}

