How to Read \" Double-Quote Escaped Values with read.table in R

How to read \" double-quote escaped values with read.table in R

It seems to me that read.table/read.csv cannot handle escaped quotes.

...But I think I have an (ugly) work-around inspired by @nullglob;

  • First read the file WITHOUT a quote character.
    (This won't handle embedded commas, as @Ben Bolker noted.)
  • Then go through the string columns, strip the surrounding quotes, and unescape the embedded ones:

The test file looks like this (I added a non-string column for good measure):

13,"foo","Fab D\"atri","bar"
21,"foo2","Fab D\"atri2","bar2"

And here is the code:

# Generate test file
writeLines(c("13,\"foo\",\"Fab D\\\"atri\",\"bar\"",
             "21,\"foo2\",\"Fab D\\\"atri2\",\"bar2\""), "foo.txt")

# Read ignoring quotes
tbl <- read.table("foo.txt", as.is=TRUE, quote='', sep=',', header=FALSE, row.names=NULL)

# Go through and cleanup
for (i in seq_len(NCOL(tbl))) {
  if (is.character(tbl[[i]])) {
    x <- tbl[[i]]
    x <- substr(x, 2, nchar(x)-1) # Remove surrounding quotes
    tbl[[i]] <- gsub('\\\\"', '"', x) # Unescape quotes
  }
}

The output is then correct:

> tbl
  V1   V2          V3   V4
1 13  foo  Fab D"atri  bar
2 21 foo2 Fab D"atri2 bar2
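If the fields may also contain commas, one possible variant (a sketch, not part of the answer above) is to rewrite each \" as a doubled quote "" before parsing, since read.csv understands doubled quotes on its own. The sample rows and the temporary file are made up for illustration, and the sketch assumes a backslash never sits directly in front of a closing quote.

```r
# Hypothetical test file: backslash-escaped quotes plus an embedded comma
path <- tempfile(fileext = ".txt")
writeLines(c('13,"foo","Fab D\\"atri, Jr.","bar"',
             '21,"foo2","Fab D\\"atri2","bar2"'), path)

lines <- readLines(path)

# Turn every \" into "" (fixed = TRUE: literal match, no regex involved)
lines <- gsub('\\"', '""', lines, fixed = TRUE)

# Parse the repaired lines with the default quote handling
tbl2 <- read.csv(text = lines, header = FALSE, stringsAsFactors = FALSE)
print(tbl2)
```

Because the quotes stay active during parsing, the comma inside the third field of the first row no longer splits the column.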

How to read quoted text containing escaped quotes

One possibility is to use readLines() to read everything in as is, and then replace the quote character with something else, e.g.:

tt <- readLines("F:/temp/test.txt")
tt <- gsub("([^\\]|^)'","\\1\"",tt) # replace each unescaped ' with "
tt <- gsub("\\\\","",tt) # drop the backslashes that escaped the remaining quotes

This allows you to read the vector tt in using a textConnection:

zz <- textConnection(tt)
read.csv(zz,header=FALSE,quote="\"") # give text input
close(zz)

Not the most beautiful solution, but it works (provided you don't have a " character somewhere in the file, of course...)
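For a self-contained illustration of the same idea, here is a rough sketch that works on an in-memory vector instead of a file. The sample lines are invented, and it uses a Perl-style lookbehind (perl = TRUE) rather than the pattern above, so treat it as one possible variant rather than the answer's exact method.

```r
# Hypothetical input: single-quoted fields with \' escapes and an embedded comma
raw <- c("1,'it\\'s here','a, b'",
         "2,'plain','c'")

# Replace every ' that is NOT preceded by a backslash with "
step1 <- gsub("(?<!\\\\)'", '"', raw, perl = TRUE)

# Unescape the remaining \' into a plain '
step2 <- gsub("\\'", "'", step1, fixed = TRUE)

# Only " is treated as a quote character now, so the apostrophe is ordinary data
res <- read.csv(text = step2, header = FALSE, quote = "\"",
                stringsAsFactors = FALSE)
print(res)
```

The same caveat applies: a literal " anywhere in the input would confuse the final read.csv call.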

How to properly escape a double quote in CSV?

Use 2 quotes: inside a quoted field, each literal " is written as "".

"Samsung U600 24"""
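In R this convention works out of the box. The following is a small sketch (the column names id and model are made up) showing that read.csv decodes the doubled quote and that write.csv produces it again when writing:

```r
# One header line and one record; the field ends with a doubled quote
csv_text <- 'id,model\n1,"Samsung U600 24"""'

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(df$model)

# write.csv doubles embedded quotes (qmethod = "double") when writing
out <- capture.output(write.csv(df, row.names = FALSE))
print(out)
```

So a file that follows the doubling rule round-trips through read.csv and write.csv without any extra cleanup.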

Read csv file in R with double quotes

fread from data.table handles this just fine:

library(data.table)

fread('Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE
A,3,"","I have comma, ha!",I have open double quotes",A,""')
# Type ID NAME CONTENT RESPONSE GRADE SOURCE
#1: A 3 I have comma, ha! I have open double quotes" A

R: read.table with quotation marks

See if this is what you want:

data <- read.table(file = "pos.txt", quote = "")

Quotes are set to " and ' by default for read.table. From your question, I think you are trying to treat them as ordinary data. So, set quote to the empty string.
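Since the contents of pos.txt are not shown, here is a minimal sketch with invented whitespace-separated data that contains a stray apostrophe and a stray double quote. With quote = "" both characters are kept as plain data:

```r
# Hypothetical input lines; the column names word and count are assumptions
txt <- c("word count",
         "don't 3",
         "5\" 7")

# quote = "" disables quote handling, so ' and " are read literally
d <- read.table(text = txt, header = TRUE, quote = "",
                stringsAsFactors = FALSE)
print(d)
```

With the default quote setting, the unmatched ' and " in this input would be interpreted as the start of quoted strings instead of being read as data.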
