What Does The "More Columns Than Column Names" Error Mean

R read.csv More columns than column names error

That's one wonky CSV file. Multiple headers are tossed about (try pasting it into CSV Fingerprint to see what I mean).

Since I don't know the data, I can't be sure the following produces accurate results for you, but the idea is to pre-process the text with readLines and other base R functions:

# use readLines to get the data
dat <- readLines("N0_07312014.CSV")

# i had to do this to fix grep errors
Sys.setlocale('LC_ALL','C')

# filter out the repeating, and wonky headers
dat_2 <- grep("Node Name,RTC_date", dat, invert=TRUE, value=TRUE)

# turn that vector into a text connection for read.csv
dat_3 <- read.csv(textConnection(paste0(dat_2, collapse="\n")),
                  header=FALSE, stringsAsFactors=FALSE)

str(dat_3)
## 'data.frame': 308 obs. of 37 variables:
## $ V1 : chr "Node 0" "Node 0" "Node 0" "Node 0" ...
## $ V2 : chr "07/31/2014" "07/31/2014" "07/31/2014" "07/31/2014" ...
## $ V3 : chr "08:58:18" "08:59:22" "08:59:37" "09:00:06" ...
## $ V4 : chr "" "" "" "" ...
## .. more
## $ V36: chr "" "" "" "" ...
## $ V37: chr "0" "0" "0" "0" ...

# grab the headers
headers <- strsplit(dat[1], ",")[[1]]

# how many of them are there?
length(headers)
## [1] 32

# limit it to the 32 columns you want (which matches the number of headers)
dat_4 <- dat_3[,1:32]

# and add the headers
colnames(dat_4) <- headers

str(dat_4)
## 'data.frame': 308 obs. of 32 variables:
## $ Node Name : chr "Node 0" "Node 0" "Node 0" "Node 0" ...
## $ RTC_date : chr "07/31/2014" "07/31/2014" "07/31/2014" "07/31/2014" ...
## $ RTC_time : chr "08:58:18" "08:59:22" "08:59:37" "09:00:06" ...
## $ N1 Bat (VDC) : chr "" "" "" "" ...
## $ N1 Shinyei (ug/m3): chr "" "" "0.23" "null" ...
## $ N1 CC (ppb) : chr "" "" "null" "null" ...
## $ N1 Aeroq (ppm) : chr "" "" "null" "null" ...
## ... continues
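
To make the mechanism easier to see, here is a minimal sketch of the same clean-up on a tiny made-up sample. The four sample lines are hypothetical stand-ins for the real file, not its actual contents. count.fields shows that the data rows carry more fields than the header has names, which is the situation the error message describes.

```r
# Hypothetical stand-in for the real file: three header names, data rows
# with one extra trailing field, and the header repeated mid-file.
lines <- c(
  "Node Name,RTC_date,RTC_time",
  "Node 0,07/31/2014,08:58:18,0",
  "Node Name,RTC_date,RTC_time",
  "Node 0,07/31/2014,08:59:22,0"
)

# Count the comma-separated fields on each line to see the mismatch.
tc <- textConnection(lines)
n_fields <- count.fields(tc, sep = ",")
close(tc)
print(n_fields)

# Drop every copy of the header and parse the remainder without one.
body <- grep("^Node Name,", lines, invert = TRUE, value = TRUE)
df <- read.csv(text = body, header = FALSE, stringsAsFactors = FALSE)

# Keep only as many columns as there are header names, then label them.
headers <- strsplit(lines[1], ",")[[1]]
df <- df[, seq_along(headers)]
colnames(df) <- headers
str(df)
```

Whether it is safe to discard the surplus columns depends on the data; here they are assumed to be trailing junk, as in the answer above.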

more columns than column name on txt file

If myFile contains the path/filename, read the lines, replace each of the first 4 stretches of whitespace on every line with a comma, and then re-read the result using read.csv. The remaining spaces (inside Location) are left alone. No packages are used.

L <- readLines(myFile) ##
for(i in 1:4) L <- sub("\\s+", ",", L)
DF <- read.csv(text = L)

giving:

> DF
   height Shoesize gender    Location
1     181       44   male city center
4     170       43 female city center
5     172       43 female city center
13    175       42   male out of city
14    181       44   male out of city
15    180       43   male out of city
16    177       43 female out of city
17    133       41   male out of city

Note: For testing purposes we can use the following in place of the line marked ## above. (Stack Overflow can introduce spaces at the beginnings of the lines, so we remove them.)

Lines <- " height Shoesize gender Location
1 181 44 male city center
4 170 43 female city center
5 172 43 female city center
13 175 42 male out of city
14 181 44 male out of city
15 180 43 male out of city
16 177 43 female out of city
17 133 41 male out of city"

L <- readLines(textConnection(Lines))
L[-1] <- sub("^\\s+", "", L[-1])
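
As a rough, self-contained illustration of why this works, the sketch below first shows the failure and then the fix on a shortened, made-up copy of the data. The helper name split_leading is invented for this example; it just wraps the loop from the answer.

```r
# Shortened sample (hypothetical): a row label first, and a last field
# that itself contains spaces.
lines <- c(
  "height Shoesize gender Location",
  "1 181 44 male city center",
  "13 175 42 male out of city"
)

# Reading on whitespace directly fails, because "city center" and
# "out of city" are split into extra fields.
err <- tryCatch(read.table(text = lines, header = TRUE),
                error = function(e) e)
print(conditionMessage(err))

# Replace only the first n_sep whitespace runs on each line with commas.
# (split_leading is a made-up helper name.)
split_leading <- function(x, n_sep) {
  for (i in seq_len(n_sep)) x <- sub("\\s+", ",", x)
  x
}

csv_lines <- split_leading(lines, 4)
df <- read.csv(text = csv_lines, stringsAsFactors = FALSE)
print(df)
```

Because the header ends up with one field fewer than the data rows, read.csv treats the first column as row names, which is why the numbers 1 and 13 appear as row labels rather than as a column.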

import csv-table into R and got multiple errors

You have to define a separator, otherwise R fails to read the data properly. Suppose your data structure is the following:

structure(list(month = 2:5, titles_tmp = structure(c(1L, 1L, 
1L, 1L), .Label = "some text", class = "factor"), info_tmp = structure(c(1L,
1L, 1L, 1L), .Label = "More text", class = "factor"), unlist.text = structure(c(1L,
1L, 1L, 1L), .Label = "http://somelink.com", class = "factor")), .Names = c("month",
"titles_tmp", "info_tmp", "unlist.text"), class = "data.frame", row.names = c(NA,
-4L))

That means the columns are separated by a single tab, so you need to use sep = "\t" as the data separator. Provided your data file is named "Sz-Iraki2.csv", the following should import your data nicely:

df <- read.csv("Sz-Iraki2.csv", sep = "\t", fileEncoding = "UTF-8")
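
The effect of the separator can be checked on a small temporary file. This is only a sketch: the file contents below are made up to resemble the structure above, and the path comes from tempfile() rather than the real file.

```r
# Write a small tab-separated file to a temporary path (hypothetical data).
path <- tempfile(fileext = ".csv")
writeLines(c("month\ttitles_tmp\tinfo_tmp\tunlist.text",
             "2\tsome text\tMore text\thttp://somelink.com",
             "3\tsome text\tMore text\thttp://somelink.com"), path)

# With the default separator (a comma) each line is read as a single field.
wrong <- read.csv(path, stringsAsFactors = FALSE)
print(ncol(wrong))

# Declaring the tab separator recovers the four columns.
right <- read.csv(path, sep = "\t", stringsAsFactors = FALSE)
print(names(right))

unlink(path)
```

read.delim uses a tab as its default separator, so read.delim(path) would be an alternative for files like this one.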

Issues importing a csv in R

I finally found the solution!
I was going nuts; even my instructor didn't know how to fix it!

This statement works:

o <- read.csv("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/Occ.txt", header=TRUE, sep="\t", fileEncoding="UTF-16LE")

As I said in my original question, I tried using fileEncoding="UTF-16LE" on its own and it didn't help. After asking the question, I tried using sep="\t" on its own, and that didn't help either. But using both of them together did the trick!
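
For anyone who wants to try this combination without the original file, here is a small round-trip sketch. The file contents and column names are invented for illustration; only the sep and fileEncoding arguments mirror the statement above.

```r
# Create a tab-separated text file encoded as UTF-16LE (hypothetical data).
path <- tempfile(fileext = ".txt")
con <- file(path, open = "w", encoding = "UTF-16LE")
writeLines(c("id\tname", "1\talpha", "2\tbeta"), con)
close(con)

# Both arguments are needed: fileEncoding so the bytes are decoded
# correctly, and sep so the decoded lines are split on tabs.
o <- read.csv(path, header = TRUE, sep = "\t", fileEncoding = "UTF-16LE",
              stringsAsFactors = FALSE)
str(o)

unlink(path)
```

Files exported from some Windows tools are often in this format, which would explain why neither argument helps on its own.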


