Use First Row Data as Column Names in R

How to change the first row to be the header in R?

If you don't want to re-read the data into R (which it seems like you don't from the comments), you can do the following. I had to add some zeros to get your data to read completely, so disregard those.

dat
##       V2        V3        V4        V5           V6  V7           V8    V9    V10
## 17   Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 18 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 19 30329         1         0         0 404-321-5711   0            0     0      1
## 20 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 21 80203         0         1         0 303-864-1919   0            0     0      1
## 22 80120         1         0         0 345-098-8890 456            0     0      1

First take the first row as the column names. Next remove the first row. Finish it off by converting the columns to their appropriate types.

names(dat) <- as.matrix(dat[1, ])
dat <- dat[-1, ]
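# as.is = TRUE keeps text columns as character; recent R versions warn if it is omitted
dat[] <- lapply(dat, function(x) type.convert(as.character(x), as.is = TRUE))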
dat
##     Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 1 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 2 30329         1         0         0 404-321-5711   0            0     0      1
## 3 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 4 80203         0         1         0 303-864-1919   0            0     0      1
## 5 80120         1         0         0 345-098-8890 456            0     0      1
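
For readers without the original data, the three steps can be tried on a small made-up data frame. This is only a minimal sketch: the function name header_from_first_row and the sample values are assumptions for illustration, and as.is = TRUE is passed so that text columns stay character rather than becoming factors.

```r
# Hypothetical helper wrapping the three steps above.
header_from_first_row <- function(df) {
  # 1. Use the first row as the column names (as.character guards against factors).
  names(df) <- unname(vapply(df[1, , drop = FALSE], as.character, character(1)))
  # 2. Drop the row that is now the header and reset the row names.
  df <- df[-1, , drop = FALSE]
  rownames(df) <- NULL
  # 3. Re-detect each column's type from its text representation.
  df[] <- lapply(df, function(x) type.convert(as.character(x), as.is = TRUE))
  df
}

# Made-up sample data, as it might look after reading a file without a header.
raw <- data.frame(
  V2 = c("Zip", "74136", "30329"),
  V3 = c("CuCurrent", "0", "1"),
  V6 = c("Contact", "918-491-6998", "404-321-5711"),
  stringsAsFactors = FALSE
)

clean <- header_from_first_row(raw)
str(clean)
```

In this sketch the Zip and CuCurrent columns end up as integers, while Contact stays character because its values cannot be parsed as numbers.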

Put the first row as the column names of my dataframe with dplyr in R

Try this:

library(dplyr)
library(tidyr)

x <- data.frame(
  A = c(letters[1:10]),
  M1 = c(11:20),
  M2 = c(31:40),
  M3 = c(41:50))

x %>%
  gather(key = key, value = value, 2:ncol(x)) %>%
  spread(key = names(x)[1], value = "value")
##   key  a  b  c  d  e  f  g  h  i  j
## 1  M1 11 12 13 14 15 16 17 18 19 20
## 2  M2 31 32 33 34 35 36 37 38 39 40
## 3  M3 41 42 43 44 45 46 47 48 49 50
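
In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshaping can be sketched with the newer functions as follows. This is an alternative to the code above, not what the original answer used, and it assumes tidyr 1.0.0 or later. Like the original, it turns the values of the first column (A) into column names.

```r
library(dplyr)
library(tidyr)

x <- data.frame(
  A  = letters[1:10],
  M1 = 11:20,
  M2 = 31:40,
  M3 = 41:50
)

res <- x %>%
  # Stack every column except A into key/value pairs.
  pivot_longer(cols = -A, names_to = "key", values_to = "value") %>%
  # Spread the values of A out into column names.
  pivot_wider(names_from = A, values_from = value)

res
```

The result is a tibble with one row per original column (M1, M2, M3) and one column per value of A.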

First row as column names in a list of data frames

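The issue arises because the columns are factors, so assigning the first row directly as names picks up the underlying integer codes instead of the labels. Unlist the row and convert it to character first: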

names(x) <- as.character(unlist(x[1,]))
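
Since the question is about a list of data frames, the same line can be applied to every element with lapply(). A minimal sketch, assuming every column is a factor (as in the question); the helper name promote_header and the sample data are made up for illustration.

```r
# Made-up data frame whose columns are all factors, header stuck in row 1.
raw <- data.frame(
  V1 = c("id", "1", "2"),
  V2 = c("name", "ann", "bob"),
  stringsAsFactors = TRUE
)
df_list <- list(first = raw, second = raw)

# Hypothetical helper: promote the first row to column names, then drop it.
promote_header <- function(x) {
  # unlist() on all-factor columns keeps the labels; as.character() extracts them.
  names(x) <- as.character(unlist(x[1, ]))
  x <- x[-1, , drop = FALSE]
  rownames(x) <- NULL
  x
}

fixed <- lapply(df_list, promote_header)
names(fixed$first)
```

If the data frames mix factor and non-factor columns, unlist() can fall back to the integer codes, so converting each column with as.character() before combining is the safer choice there.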

Pasting the first row to the column name within a list

You can use lapply()

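rename_col <- function(x){
  # paste() takes sep; paste0() would treat sep = "_" as one more string to append
  first <- vapply(x[1, , drop = FALSE], as.character, character(1))
  colnames(x) <- paste(colnames(x), first, sep = "_")
  # drop the row that was merged into the names
  x[-1, , drop = FALSE]
}

# df_list is your list of data frames; lapply() returns a new list, so assign the result
df_list <- lapply(df_list, rename_col)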
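
To see what this kind of renaming produces, here is a self-contained sketch on made-up data. The helper name prefix_first_row and the sample list are assumptions for illustration; the helper uses paste() with sep = "_" so that the separator lands between the old name and the first-row value.

```r
# Hypothetical helper: join each column name with its first-row value.
prefix_first_row <- function(x) {
  first <- vapply(x[1, , drop = FALSE], as.character, character(1))
  colnames(x) <- paste(colnames(x), first, sep = "_")
  # Drop the row that was merged into the names.
  x[-1, , drop = FALSE]
}

# Made-up list of data frames with the header stuck in row 1.
sample_list <- list(
  a = data.frame(V1 = c("id", "1", "2"), V2 = c("score", "10", "20"),
                 stringsAsFactors = FALSE),
  b = data.frame(V1 = c("id", "3"), V2 = c("score", "30"),
                 stringsAsFactors = FALSE)
)

renamed <- lapply(sample_list, prefix_first_row)
colnames(renamed$a)
```

Each element of renamed keeps its data rows, and its columns are now called V1_id and V2_score.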

How do I stop R from using the first row of data as the column name?

You say:

After this point it just says that there are 29 rows and 1 column, which is not what I want!

What that is telling you is that you don't have a tab-separated file. From what you've posted there is no way to tell which delimiter the file actually uses, but it is not a tab. You can tell by paying attention to the number of columns: since you got only one column, the read_tsv function didn't find any tabs. You also have the issue that your column names are all different, which could well mean that your files do not have a header line. If you want to see what is in your files, you could do something like:

df.list <- lapply(file.list, function(x) readLines(x, n = 1))
df.list[[1]]

If there are tabs, they will show up as \t escapes when the string is printed to the console this way; use cat() if you want to see them expanded into whitespace.

Generally it is better to determine what delimiters exist by looking at the file with a text editor (but not MS Word).
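
When opening the file in an editor is inconvenient, a rough check can also be done from R by counting candidate delimiters in the first line. A minimal sketch, assuming a small temporary file stands in for one of your files; the candidate list and the variable names are illustrative only.

```r
# Stand-in for one of your files: a comma-separated file written to a temp path.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id,name,score", "1,ann,3.5"), tmp)

# Read only the first line of the file.
first_line <- readLines(tmp, n = 1)

# Count how often each candidate delimiter occurs in that line.
candidates <- c(tab = "\t", comma = ",", semicolon = ";", pipe = "|")
counts <- vapply(
  candidates,
  function(d) lengths(regmatches(first_line, gregexpr(d, first_line, fixed = TRUE))),
  integer(1)
)

counts
names(which.max(counts))
```

A count of zero for tab together with a non-zero count for another character suggests that read_tsv is the wrong reader for that file. The counts are only a hint, since a delimiter character can also appear inside quoted fields.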


