R: Losing Column Names When Adding Rows to an Empty Data Frame

How to add rows to empty data frames with header in R?

Adding rows to a zero-row data.frame behaves differently from adding rows to a data.frame that already contains rows.

From ?rbind

The rbind data frame method first drops all zero-column and zero-row arguments. (If that leaves none, it returns the first argument with columns otherwise a zero-column zero-row data frame.) It then takes the classes of the columns from the first data frame, and matches columns by name (rather than by position). Factors have their levels expanded as necessary (in the order of the levels of the levelsets of the factors encountered) and the result is an ordered factor if and only if all the components were ordered factors. (The last point differs from S-PLUS.) Old-style categories (integer vectors with levels) are promoted to factors.
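To see that rule in action, here is a small sketch. The data frame compData and its column names A and B are assumptions made for illustration, since the data from the original question is not shown here.

```r
# Hypothetical zero-row data frame; the names A and B are assumed
compData <- data.frame(A = numeric(), B = numeric())

# The zero-row argument is dropped, so the bare vector decides the names
lost <- rbind(compData, c(5, 443))
names(lost)   # typically auto-generated names rather than A and B

# A one-row data frame carries its own names, so they survive
kept <- rbind(compData, data.frame(A = 5, B = 443))
names(kept)
```

The second call works only because the non-empty argument already has the right names; the empty data frame contributes nothing to the result.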

You have a number of options.

The most straightforward:

 compData[1, ] <- c(5, 443)
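As a self-contained sketch of the index assignment above (again assuming a hypothetical compData with numeric columns A and B), the column names and types of the empty data frame are kept because the row is written into the existing object instead of being bound to it:

```r
# Hypothetical empty data frame with two numeric columns (names assumed)
compData <- data.frame(A = numeric(), B = numeric())

# Assign into row 1, which does not exist yet, to create it
compData[1, ] <- c(5, 443)

# The same idiom appends after the last row, whatever the current size
compData[nrow(compData) + 1, ] <- c(6, 444)

compData
```

Using nrow(compData) + 1 instead of a literal 1 means the same line works whether the data frame is empty or not.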

A more complicated option is to coerce c(5, 443) to a list or data.frame first:

rbind(compData,setNames(as.list(c(5,443)), names(compData)))

or

rbind(compData,do.call(data.frame,setNames(as.list(c(5,443)), names(compData))))

But in this case you might as well do

do.call(data.frame,setNames(as.list(c(5,443)), names(compData)))

data.table option

You could use the data.table function rbindlist which does less checking and thus preserves the names of the first data.frame

library(data.table)
rbindlist(list(compData, as.list(c(5, 443))))

R: losing column names when adding rows to an empty data frame

The rbind help page specifies that:

For ‘cbind’ (‘rbind’), vectors of zero length (including ‘NULL’) are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)

So, in fact, a is ignored in your rbind call. It is not totally ignored, it seems: because a is a data frame, rbind dispatches to rbind.data.frame:

rbind.data.frame(c(5,6))
# X5 X6
#1 5 6

One way to insert the row could be:

a[nrow(a)+1,] <- c(5,6)
a
# one two
#1 5 6

But there may be a better way to do it depending on your code.
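One caveat with this approach: c() coerces all of its elements to a single type, so it only suits data frames whose columns share a type. For mixed columns, a list can be assigned instead. This is a minimal sketch; the data frame people and its columns are hypothetical.

```r
# Hypothetical empty data frame with one character and one numeric column
people <- data.frame(name = character(), age = numeric(),
                     stringsAsFactors = FALSE)

# A list keeps each value's own type; c("Ann", 30) would turn 30 into "30"
people[nrow(people) + 1, ] <- list("Ann", 30)
people[nrow(people) + 1, ] <- list("Bob", 25)

str(people)
```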

Create empty data frame with column names by assigning a string vector?

How about:

df <- data.frame(matrix(ncol = 3, nrow = 0))
x <- c("name", "age", "gender")
colnames(df) <- x

To do all of this in a one-liner:

setNames(data.frame(matrix(ncol = 3, nrow = 0)), c("name", "age", "gender"))

#[1] name age gender
#<0 rows> (or 0-length row.names)

Or

data.frame(matrix(ncol=3,nrow=0, dimnames=list(NULL, c("name", "age", "gender"))))
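Note that the matrix-based approaches above produce logical columns, because an empty matrix defaults to logical NA. If the column types matter later (for example when appending rows), declaring them up front may be preferable. The types chosen below are assumptions for illustration only.

```r
# Matrix-based construction: every column ends up logical
df <- setNames(data.frame(matrix(ncol = 3, nrow = 0)),
               c("name", "age", "gender"))
sapply(df, class)

# Typed construction: each column gets an explicit (assumed) type
typed <- data.frame(name = character(), age = integer(),
                    gender = character(), stringsAsFactors = FALSE)
sapply(typed, class)
```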

dataframe column names for empty data.frame

Normally, data.frames can be joined with rbind only if they have the same column names.

data1 <- data.frame(x = 1, y = 1)
data2 <- data.frame(x = 2, y = 2)
rbind(data1, data2)

Otherwise, you will get an error.

data1 <- data.frame(xa = 1, ya = 1)
data2 <- data.frame(x = 2, y = 2)
rbind(data1, data2)
# Error in match.names(clabs, names(xi)) : names do not match previous names

However, if one of the data.frames is empty, the non-empty data.frame will govern the features of the new data.frame.

data1 <- data.frame(x = numeric(), y = numeric())
data2 <- data.frame(xa = 2, ya = 2)
rbind(data1, data2)

data1 <- data.frame(xa = 2, ya = 2)
data2 <- data.frame(x = numeric(), y = numeric())
rbind(data1, data2)

In your case, c("a", "b") is coerced to a data.frame before being joined with the other data.frame. The coerced data.frame gets automatically generated column names and, because it is not empty, it governs the features of the result.
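If the columns line up by position and only the names differ, one possible workaround is to rename the second data.frame before binding. This is a sketch reusing the data1 and data2 objects from the failing example; whether matching by position is appropriate depends on your data.

```r
data1 <- data.frame(xa = 1, ya = 1)
data2 <- data.frame(x = 2, y = 2)

# Give data2 the names of data1 so that rbind can match the columns
combined <- rbind(data1, setNames(data2, names(data1)))
combined
```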

A newbie has a question about data frame column names

  1. How did you import the data? You should be able to fix this while importing the data itself. Adding header = TRUE may be enough.

  2. While importing, use stringsAsFactors = FALSE, which avoids turning string values into factors. (This has been the default since R 4.0.0.)

  3. Finally, if you can't do anything in steps 1 and 2, here is a way to fix the data in your current setup.

#Assign column names
colnames(a) <- as.character(unlist(a[1,]))
#Remove 1st row
a <- a[-1, ]
#Convert each column to its appropriate class, keeping strings as character
a <- type.convert(a, as.is = TRUE)
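For steps 1 and 2, a minimal sketch of fixing the problem at import time could look like this. The inline CSV text stands in for your file and is purely an assumption; with a real file you would pass its path instead of text =.

```r
# Hypothetical CSV content whose first line holds the column names
csv_text <- "name,age\nAnn,30\nBob,25"

# header = TRUE takes the names from the first line;
# stringsAsFactors = FALSE keeps strings as character
a <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

str(a)
```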

Programmatically creating a data frame and adding rows to it

empty <- data.frame(a = numeric(), b = factor(), c = character())
filled <- rbind(empty, data.frame(a = 1, b = factor("abc"), c = "def"))

Here it is in action:

> empty <- data.frame(a = numeric(), b = factor(), c = character())
> empty
[1] a b c
<0 rows> (or 0-length row.names)
> empty$a
numeric(0)
> empty$b
factor(0)
Levels:
> empty$c
character(0)
> filled <- rbind(empty, data.frame(a = 1, b = factor("abc"), c = "def"))
> summary(filled)
       a         b          c
 Min.   :1   abc:1   Length:1
 1st Qu.:1           Class :character
 Median :1           Mode  :character
 Mean   :1
 3rd Qu.:1
 Max.   :1
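When many rows are added in a loop, calling rbind on every iteration copies the data frame each time. A common alternative, sketched here with made-up values, is to collect the rows in a list and bind them once at the end.

```r
# Preallocate a list to hold one single-row data frame per iteration
rows <- vector("list", 3)

for (i in seq_along(rows)) {
  # Illustrative values only; columns mirror the a/b/c layout used above
  rows[[i]] <- data.frame(a = i,
                          b = factor(letters[i]),
                          c = paste0("row", i),
                          stringsAsFactors = FALSE)
}

# Bind all rows in one call; factor levels are expanded as needed
filled <- do.call(rbind, rows)
filled
```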

Fill missing values in R data frame without losing row names and column names?

Just try:

df1[is.na(df1)]<-df2[is.na(df1)]
df1
# a b c
#1 1 10 6
#2 2 5 NA
#3 3 2 1
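Since the original df1 and df2 are not shown, here is a self-contained sketch with hypothetical data. It assumes both data frames have the same dimensions and column order, which the logical-matrix indexing relies on.

```r
# Hypothetical data: df1 has gaps, df2 holds the replacement values
df1 <- data.frame(a = c(1, NA, 3), b = c(10, 5, NA),
                  row.names = c("r1", "r2", "r3"))
df2 <- data.frame(a = c(9, 2, 9), b = c(9, 9, 7),
                  row.names = c("r1", "r2", "r3"))

# is.na(df1) is a logical matrix; only the NA cells are overwritten
df1[is.na(df1)] <- df2[is.na(df1)]
df1
```

Because the assignment modifies df1 in place, its row names and column names are left untouched.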

