Specify custom date format for colClasses argument in read.table/read.csv

You can write your own function that accepts a string and converts it to a Date using the format you want, then use setAs() to register it as an as() coercion method. You can then use the name of that class in colClasses.


setAs("character","myDate", function(from) as.Date(from, format="%d/%m/%Y") )

tmp <- c("1, 15/08/2008", "2, 23/05/2010")
con <- textConnection(tmp)

tmp2 <- read.csv(con, colClasses=c('numeric','myDate'), header=FALSE)

Then modify if needed to work for your data.

Edit ---

You might want to run setClass('myDate') first to avoid the warning (you can ignore the warning, but it gets annoying if you do this a lot, and this one simple call gets rid of it).
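Putting the two pieces together, a minimal sketch might look like the following. The class name myDate comes from the answer above; the variable names csv_lines and parsed are only illustrative, and read.csv(text = ...) is used here in place of a text connection.

```r
library(methods)

# Declare the class first so that setAs() does not warn about an undefined class.
setClass("myDate")

# Register a coercion from character to "myDate" that parses day/month/year.
setAs("character", "myDate", function(from) as.Date(from, format = "%d/%m/%Y"))

# Illustrative input: two rows, no header.
csv_lines <- c("1,15/08/2008", "2,23/05/2010")

# read.csv() calls as(column, "myDate") for the second column.
parsed <- read.csv(text = csv_lines, colClasses = c("numeric", "myDate"),
                   header = FALSE)
str(parsed)
```

The second column comes back as a regular Date vector, because that is what the registered function returns.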

colClasses date and time read.csv

You can't do this while reading the data into R with the colClasses argument, because the data span two "columns" in the CSV file. Instead, load the data and then combine the date and time columns into a single POSIXlt variable:

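dat <- read.csv(textConnection("date,time,val1,val2
20090503,0:05:12,107.25,1
20090503,0:05:17,108.25,20
20090503,0:07:45,110.25,5
20090503,0:07:56,106.25,5"))
dat <- within(dat, Datetime <- as.POSIXlt(paste(date, time), format = "%Y%m%d %H:%M:%S"))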

[I presume the order is year, month, day? If not, use "%Y%d%m %H:%M:%S".]

Which gives:

> head(dat)
date time val1 val2 Datetime
1 20090503 0:05:12 107.25 1 2009-05-03 00:05:12
2 20090503 0:05:17 108.25 20 2009-05-03 00:05:17
3 20090503 0:07:45 110.25 5 2009-05-03 00:07:45
4 20090503 0:07:56 106.25 5 2009-05-03 00:07:56
> str(dat)
'data.frame': 4 obs. of 5 variables:
$ date : int 20090503 20090503 20090503 20090503
$ time : Factor w/ 4 levels "0:05:12","0:05:17",..: 1 2 3 4
$ val1 : num 107 108 110 106
$ val2 : int 1 20 5 5
$ Datetime: POSIXlt, format: "2009-05-03 00:05:12" "2009-05-03 00:05:17" ...

You can now delete date and time if you wish:

> dat <- dat[, -(1:2)]
> head(dat)
val1 val2 Datetime
1 107.25 1 2009-05-03 00:05:12
2 108.25 20 2009-05-03 00:05:17
3 110.25 5 2009-05-03 00:07:45
4 106.25 5 2009-05-03 00:07:56
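As a variation on the approach above, here is a hedged sketch that reads the two columns as character (so they are never turned into factors or integers) and stores the result as POSIXct, which is usually more convenient inside a data frame than POSIXlt. The time zone "UTC" and the names csv_text and obs are assumptions made for the example, not part of the original question.

```r
# Two sample rows in the same layout as the data above.
csv_text <- "date,time,val1,val2
20090503,0:05:12,107.25,1
20090503,0:05:17,108.25,20"

# Read date and time as plain strings; colClasses controls each column's type.
obs <- read.csv(text = csv_text,
                colClasses = c("character", "character", "numeric", "integer"))

# Paste the two strings together and parse them in one step.
# tz = "UTC" is an assumption; use the zone your data were recorded in.
obs$Datetime <- as.POSIXct(paste(obs$date, obs$time),
                           format = "%Y%m%d %H:%M:%S", tz = "UTC")

# Keep only the combined column and the values.
obs <- obs[, c("Datetime", "val1", "val2")]
str(obs)
```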

effect of colClasses in read.csv function

Factors (the data type R uses to store categorical variables) carry their possible levels along with them, and these are printed by default. There are a variety of solutions:

  • use colClasses when reading in the data as you suggested;
  • use stringsAsFactors=FALSE;
  • read the file as usual, then use print(as.character(z1[1]));
  • use print(z1[1], max.levels=0).
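The options in the list above can be compared side by side. This is only a sketch: the two-row data set and the names csv_text, as_factor, as_char and via_colclasses are made up for illustration, and stringsAsFactors = TRUE is set explicitly because recent R versions no longer create factors by default.

```r
csv_text <- "id,name
1,alpha
2,beta"

# Force factors to reproduce the behaviour described above.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
print(as_factor$name[1])                  # value followed by a "Levels:" line
print(as_factor$name[1], max.levels = 0)  # value only, no "Levels:" line
print(as.character(as_factor$name[1]))    # plain character string

# Avoid factors at read time instead.
as_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)
via_colclasses <- read.csv(text = csv_text,
                           colClasses = c("integer", "character"))
```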

Read csv with timestamp to R. Define colClass in table.read

For an unconventional date-time format, one can import the column as character (step 1) and then coerce it via strptime (step 2).

step 1

df <- read.table(file = "data.csv",
                 header = TRUE,
                 sep = ",",
                 dec = ".",
                 colClasses = "character",
                 comment.char = "")

step 2

strptime(df$v1, "%m/%d/%Y %H:%M:%S")

v1 being the name of the column to coerce (in this case a date-time in the unconventional format 12/13/2014 15:16:17). Note that %Y matches a four-digit year, whereas %y matches only two digits.

Setting the sep argument is necessary because read.table defaults to sep = "" (any whitespace).

When using read.csv there is no need to use the sep argument, which defaults to ",".

Using comment.char = "" (when possible) improves reading time.

Useful info at http://cran.r-project.org/doc/manuals/r-release/R-data.pdf
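The two steps can be sketched end to end as follows. The inline data, the column names v1 and v2, and tz = "UTC" are assumptions for the example; the format string matches the 12/13/2014 15:16:17 layout mentioned above.

```r
csv_text <- "v1,v2
12/13/2014 15:16:17,3.5
01/02/2015 08:09:10,4.25"

# Step 1: import every column as character.
raw <- read.csv(text = csv_text, colClasses = "character", comment.char = "")

# Step 2: coerce the date-time column; as.POSIXct() makes the result
# easy to store in a data frame. tz = "UTC" is an assumption.
raw$v1 <- as.POSIXct(strptime(raw$v1, "%m/%d/%Y %H:%M:%S", tz = "UTC"))

# Remaining character columns have to be converted by hand as well.
raw$v2 <- as.numeric(raw$v2)
str(raw)
```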

read.csv2 date formatting in R

When you convert character to date, you need to specify the format if it is not a standard one. The error you got is the result of as.Date("08.09.2016"). But if you do as.Date("08.09.2016", format = "%m.%d.%Y"), it is fine.

I am not sure whether it is possible to pass format to read.csv2 for correct date formatting (maybe not). I would simply read in this date column as factor, then do as.Date(as.character(), format = "%m.%d.%Y") on this column myself.

Generally we use the format "dd/mm/yy". How can I reorganise the date to that format?

Use format(, format = "%d/%m/%y").

A complete example:

format(as.Date("08.09.2016", format = "%m.%d.%Y"), format = "%d/%m/%y")
# [1] "09/08/16"
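Applied to a whole column, the same idea might look like this sketch. The semicolon-separated sample data and the names csv2_text, records and date_label are invented for illustration; the column is read as character here rather than factor, which saves the as.character() step.

```r
csv2_text <- "id;date
1;08.09.2016
2;12.25.2016"

# read.csv2() uses ";" as separator and "," as decimal mark.
records <- read.csv2(text = csv2_text, stringsAsFactors = FALSE)

# Parse the month.day.year strings into Date objects.
records$date <- as.Date(records$date, format = "%m.%d.%Y")

# Produce a dd/mm/yy character label for display.
records$date_label <- format(records$date, format = "%d/%m/%y")
records
```

Note that date_label is a character column; keep the Date column if you still need to sort or do arithmetic on the dates.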

read.csv() and colClasses

You really don't need the [1:24] in every assignment; this is what is causing your problems. You are assigning to a subset of an indexed vector of some description.

The error message appears because you are trying to assign to data[1:24] without data having been assigned previously (in your earlier usage, which you mentioned worked, data was probably a list or data.frame you had created). As such, data is a function (for loading datasets associated with packages, see ?data), and the error you saw is saying exactly that (an ordinary R function is an object of type closure).
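A small sketch of what goes wrong, and of one way around it. The names msg and precip are illustrative only, and the failing assignment is wrapped in tryCatch() and local() so that the example runs to completion without touching the global environment.

```r
# No object called `data` has been created, so the name resolves to the
# function utils::data(), and subset-assignment into a function fails.
msg <- tryCatch(
  local({
    data[1:2] <- c(1, 2)
    "no error"
  }),
  error = function(e) conditionMessage(e)
)
print(msg)

# Creating the container first makes the same kind of assignment valid.
precip <- vector("list", 2)
precip[1:2] <- list(1, 2)
```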

I would suggest something like

# pattern is a regular expression, so escape the dot and anchor the end
Precipfiles <- list.files(pattern = "\\.csv$")
DFlist <- lapply(Precipfiles, read.table, sep = '\t',
                 na.strings = '', header = TRUE)
bigDF <- do.call(rbind, DFlist)

# or, much faster using data.table (requires library(data.table))
bigDF <- rbindlist(DFlist)
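To try the pattern without real data, the following sketch writes two small tab-separated files into a temporary directory and then reads and stacks them with base R only. The file names, column names and values are all made up for the example.

```r
# Create a scratch directory with two tab-separated files.
tmp_dir <- tempfile("precip_")
dir.create(tmp_dir)
write.table(data.frame(station = "A", mm = c(1.2, NA), year = 2001),
            file.path(tmp_dir, "a.csv"),
            sep = "\t", na = "", row.names = FALSE, quote = FALSE)
write.table(data.frame(station = "B", mm = 3.4, year = 2002),
            file.path(tmp_dir, "b.csv"),
            sep = "\t", na = "", row.names = FALSE, quote = FALSE)

# List the files, read each one into a data frame, then bind the rows.
precip_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
precip_list <- lapply(precip_files, read.table, sep = "\t",
                      na.strings = "", header = TRUE)
precip_all <- do.call(rbind, precip_list)
precip_all
```

Using full.names = TRUE avoids having to change the working directory before reading.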

Converting a date in R returns NA

as.Date(gsub("/", "-", td3$date), format = "%m-%d-%Y")
[1] "2016-05-06" "2016-05-07" "2016-04-13" "2016-04-14"
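The usual cause of NA here is a format string that does not match the input. A minimal sketch, using a made-up character vector dates_chr in place of the td3$date column:

```r
dates_chr <- c("05/06/2016", "04/13/2016")

# Separators in the format must match the input; "-" does not match "/".
bad <- as.Date(dates_chr, format = "%d-%m-%Y")
print(bad)

# Matching separators and month/day order parses cleanly.
good <- as.Date(dates_chr, format = "%m/%d/%Y")
print(good)
```

When the format matches the input directly like this, replacing the separators with gsub() first is not needed.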

