What Does The "More Columns Than Column Names" Error Mean

R read.csv More columns than column names error

That's one wonky CSV file. Multiple headers are tossed about (try pasting it into CSV Fingerprint to see what I mean).

Since I don't know the data, it's impossible to be sure the following produces accurate results for you, but it involves using readLines and other R functions to pre-process the text:

# use readLines to get the data
dat <- readLines("N0_07312014.CSV")

# i had to do this to fix grep errors
Sys.setlocale('LC_ALL','C')

# filter out the repeating, and wonky headers
dat_2 <- grep("Node Name,RTC_date", dat, invert=TRUE, value=TRUE)

# turn that vector into a text connection for read.csv
dat_3 <- read.csv(textConnection(paste0(dat_2, collapse="\n")),
                  header=FALSE, stringsAsFactors=FALSE)

str(dat_3)
## 'data.frame':    308 obs. of  37 variables:
##  $ V1 : chr  "Node 0" "Node 0" "Node 0" "Node 0" ...
##  $ V2 : chr  "07/31/2014" "07/31/2014" "07/31/2014" "07/31/2014" ...
##  $ V3 : chr  "08:58:18" "08:59:22" "08:59:37" "09:00:06" ...
##  $ V4 : chr  "" "" "" "" ...
## .. more
##  $ V36: chr  "" "" "" "" ...
##  $ V37: chr  "0" "0" "0" "0" ...

# grab the headers
headers <- strsplit(dat[1], ",")[[1]]

# how many of them are there?
length(headers)
## [1] 32

# keep only the first 32 columns, which matches the number of headers
dat_4 <- dat_3[,1:32]

# and add the headers
colnames(dat_4) <- headers

str(dat_4)
## 'data.frame':    308 obs. of  32 variables:
##  $ Node Name         : chr  "Node 0" "Node 0" "Node 0" "Node 0" ...
##  $ RTC_date          : chr  "07/31/2014" "07/31/2014" "07/31/2014" "07/31/2014" ...
##  $ RTC_time          : chr  "08:58:18" "08:59:22" "08:59:37" "09:00:06" ...
##  $ N1 Bat (VDC)      : chr  "" "" "" "" ...
##  $ N1 Shinyei (ug/m3): chr  "" "" "0.23" "null" ...
##  $ N1 CC (ppb)       : chr  "" "" "null" "null" ...
##  $ N1 Aeroq (ppm)    : chr  "" "" "null" "null" ...
## ... continues
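
Before pre-processing a file like this, it can help to confirm where the field counts disagree with the header. The following is a minimal sketch on a small made-up file (the file contents and the names tmp, n_fields and bad_lines are assumptions for illustration, not the original data); it uses count.fields to count the fields on each line:

```r
# Hypothetical sample: 2 header names, but 4 fields in every data row
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2,3,4", "5,6,7,8"), tmp)

# read.csv() stops here because the rows are wider than the header;
# capture the error message instead of aborting
msg <- tryCatch(
  { read.csv(tmp); NA_character_ },
  error = function(e) conditionMessage(e)
)
print(msg)

# count.fields() returns the number of fields found on each line
n_fields <- count.fields(tmp, sep = ",")
print(n_fields)

# lines whose field count differs from the header line
bad_lines <- which(n_fields != n_fields[1])
print(bad_lines)
```

If most data lines turn out wider than the header, trimming columns or supplying col.names, as in the answer above, is one way forward.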

more columns than column name on txt file

If myFile contains the path/filename then replace each of the first 4 stretches of whitespace on every line with a comma and then re-read using read.csv. No packages are used.

L <- readLines(myFile) ##
for(i in 1:4) L <- sub("\\s+", ",", L)
DF <- read.csv(text = L)

giving:

> DF
   height Shoesize gender    Location
1     181       44   male city center
4     170       43 female city center
5     172       43 female city center
13    175       42   male out of city
14    181       44   male out of city
15    180       43   male out of city
16    177       43 female out of city
17    133       41   male out of city

Note: For purposes of testing we can use this in place of the line marked ## above. (Note that SO can introduce spaces at the beginnings of the lines so we remove them.)

Lines <- " height Shoesize gender Location
1   181 44   male      city center
4   170 43   female    city center
5   172 43   female    city center
13  175 42   male      out of city
14  181 44   male      out of city
15  180 43   male      out of city
16  177 43   female    out of city
17  133 41   male      out of city"

L <- readLines(textConnection(Lines))
L[-1] <- sub("^\\s+", "", L[-1])
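
To see why only the first 4 stretches of whitespace are replaced, here is a small sketch on made-up lines (assumed to have no leading spaces; the names raw_lines, csv_lines and DF_demo are for illustration only). The spaces inside the Location values are left alone because the loop stops after four substitutions per line:

```r
# Hypothetical input: header has 4 names, data rows have a leading row label
raw_lines <- c("height Shoesize gender Location",
               "1   181 44   male      city center",
               "13  175 42   male      out of city")

# replace the first whitespace run on each line, four times over
csv_lines <- raw_lines
for (i in 1:4) csv_lines <- sub("\\s+", ",", csv_lines)
print(csv_lines)

# the data rows now have one more field than the header,
# so read.csv() treats the first field as row names
DF_demo <- read.csv(text = csv_lines)
print(DF_demo)
```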

import csv-table into R and got multiple errors

You have to define a separator, otherwise R fails to read the data properly. Suppose your data structure is the following:

structure(list(month = 2:5, titles_tmp = structure(c(1L, 1L, 
1L, 1L), .Label = "some text", class = "factor"), info_tmp = structure(c(1L, 
1L, 1L, 1L), .Label = "More text", class = "factor"), unlist.text = structure(c(1L, 
1L, 1L, 1L), .Label = "http://somelink.com", class = "factor")), .Names = c("month", 
"titles_tmp", "info_tmp", "unlist.text"), class = "data.frame", row.names = c(NA, 
-4L))

That means the columns are separated by a single tab, so you need to use sep = "\t" as the data separator. Provided your data file is named "Sz-Iraki2.csv", the following should import your data nicely:

df <- read.csv("Sz-Iraki2.csv", sep = "\t", fileEncoding = "UTF-8")
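
As a rough illustration of what the separator changes, this sketch writes a small tab-separated file to a temporary path (the path and the names df_tab and df_default are assumptions, and the rows are made up to resemble the structure above) and reads it with and without sep = "\t":

```r
# Hypothetical tab-separated file resembling the structure above
tmp <- tempfile(fileext = ".csv")
writeLines(c("month\ttitles_tmp\tinfo_tmp\tunlist.text",
             "2\tsome text\tMore text\thttp://somelink.com",
             "3\tsome text\tMore text\thttp://somelink.com"), tmp)

# with the tab separator each field lands in its own column
df_tab <- read.csv(tmp, sep = "\t", fileEncoding = "UTF-8")
print(dim(df_tab))

# with the default separator (a comma) every line is read as a single field
df_default <- read.csv(tmp, fileEncoding = "UTF-8")
print(dim(df_default))
```

read.delim uses sep = "\t" by default, so it could be used in place of read.csv here.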

Issues importing a csv in R

I finally found the solution!
I was going nuts; even my instructor didn't know how to fix it!

This statement works:

o <- read.csv("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/Occ.txt", header=TRUE, sep="\t", fileEncoding="UTF-16LE")

Like I said in my original question: I tried using fileEncoding="UTF-16LE" and it didn't help. After asking the question, I tried using sep="\t", and it didn't help. But using both of them did the trick!
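
For anyone who wants to reproduce the combination, here is a minimal sketch that builds a small UTF-16LE, tab-separated file in a temporary location and reads it back with both arguments set. The file contents and the names tmp, bytes and o_demo are made up for illustration; this is not the original Occ.txt:

```r
# Hypothetical two-column table, encoded as UTF-16LE with tab separators
tmp <- tempfile(fileext = ".txt")
bytes <- iconv("id\tname\n1\talpha\n2\tbeta\n",
               from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(bytes, tmp)

# both the encoding and the separator have to match the file
o_demo <- read.csv(tmp, header = TRUE, sep = "\t", fileEncoding = "UTF-16LE")
str(o_demo)
```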
