sqlSave: Mapping Dataframe Timestamps to SQL Server Timestamps

Two options:

1) Lazy: let the error occur. The table will still be created, so you can change the column(s) to datetime manually in your database. It will work the next time.

2) Correct: use the varTypes argument of sqlSave to set the column type explicitly.

Note that your problem can be stripped down by removing unnecessary stuff. As an aside, I would probably avoid the column name timestamp in SQL Server: I have seen it cause confusion, because the built-in timestamp data type is something entirely different (a synonym for rowversion, not a date/time value).

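library(RODBC)

# One-column data frame holding the current time (Sys.time() is already POSIXct)
mdf = data.frame(timestamp = Sys.time())

# Map the column name to the SQL Server type used when the table is created
varTypes = c(timestamp = "datetime")
channel = odbcConnect("test")
sqlSave(channel, mdf, rownames = FALSE, append = TRUE, varTypes = varTypes)
close(channel)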
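
When a data frame has several date-time columns, the varTypes vector can be built programmatically instead of by hand. The following is a minimal sketch, assuming every POSIXct column should become a SQL Server datetime; the helper name datetime_var_types and the example column names are hypothetical and not part of RODBC.

```r
# Hypothetical helper (not part of RODBC): build a named character vector
# suitable for sqlSave(varTypes = ...) covering every POSIXct column.
datetime_var_types <- function(df, sql_type = "datetime") {
  is_time <- vapply(df, inherits, logical(1), what = "POSIXct")
  stats::setNames(rep(sql_type, sum(is_time)), names(df)[is_time])
}

# Example data frame; the column names are made up for illustration.
events <- data.frame(
  id      = 1:2,
  created = as.POSIXct(c("2020-03-23 12:34:56", "2020-03-24 01:00:00"), tz = "UTC")
)

var_types <- datetime_var_types(events)
print(var_types)

# With an open RODBC channel this would be passed on as in the answer above:
# sqlSave(channel, events, rownames = FALSE, append = TRUE, varTypes = var_types)
```

Columns that are not POSIXct are left out of the vector, so sqlSave falls back to its default type mapping for them.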

RODBC sqlSave table creation problems

After re-reading the RODBC vignette, here's the simple solution that worked:

sqlDrop(db, "df", errors = FALSE)
sqlSave(db, df)

Done.

After experimenting with this for several more days, it seems that the problems stemmed from the use of the additional options, particularly table = or, equivalently, tablename =. Those should be valid options, but somehow they cause problems with my particular combination of RStudio (Windows, 64 bit, desktop version, current build), R (Windows, 64 bit, v3), and/or MS SQL Server 2008.

sqlSave(db, df) will also work without sqlDrop(db, "df") if the table has never existed, but as a best practice I'm writing try(sqlDrop(db, "df", errors = FALSE), silent = TRUE) before all sqlSave statements in my code.

Force dbGetQuery to Return POSIXct Timestamp

TL;DR

(updated a little from my comment)

DBI::dbGetQuery(con, "select cast ( SYSDATETIMEOFFSET() at time zone 'UTC' as DATETIME ) as now")
# now
# 1 2020-03-25 20:30:33.026
Sys.time()
# [1] "2020-03-25 13:30:31.177 PDT"

(The roughly two-second gap is there because the clocks of my laptop and the remote SQL Server are not synced.)

Explanation

The odbc package (which uses the nanodbc C++ library) recognizes SQL Server's DATETIME type and returns it as POSIXct. However, DATETIME does not include a time zone, so casting DATETIMEOFFSET values down to DATETIME can introduce errors if the rows do not all reference the same offset.

DBI::dbExecute(con, "create table r2mt (id INTEGER, tm DATETIMEOFFSET)")
# [1] 0
DBI::dbExecute(con, "insert into r2mt (id,tm) values (1,'2020-03-23 12:34:56 +00:00'),(2,'2020-03-23 12:34:56.500 -04:00')")
# [1] 2

dat <- DBI::dbGetQuery(con, "select id, tm from r2mt")
str(dat)
# 'data.frame': 2 obs. of 2 variables:
# $ id: int 1 2
# $ tm: chr "2020-03-23 12:34:56.0000000 +00:00" "2020-03-23 12:34:56.5000000 -04:00"
# Strip the colon from the trailing UTC offset ("-04:00" -> "-0400") so %z can parse it
as.POSIXct(gsub("([-+]?[0-9]{2}):([0-9]{2})$", "\\1\\2", dat$tm),
           format = "%Y-%m-%d %H:%M:%OS %z")
# [1] "2020-03-23 05:34:56.0 PDT" "2020-03-23 09:34:56.5 PDT"
diff(as.POSIXct(gsub("([-+]?[0-9]{2}):([0-9]{2})$", "\\1\\2", dat$tm),
                format = "%Y-%m-%d %H:%M:%OS %z"))
# Time difference of 4.000139 hours

dat <- DBI::dbGetQuery(con, "select id, cast(tm as DATETIME) as tm from r2mt")
str(dat)
# 'data.frame': 2 obs. of 2 variables:
# $ id: int 1 2
# $ tm: POSIXct, format: "2020-03-23 12:34:56.0" "2020-03-23 12:34:56.5"
diff(dat$tm)
# Time difference of 0.5 secs

(In R, the time zone is an attribute of the vector, the whole column, so will not vary between different elements in that column.)
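
The point about the time zone being an attribute of the whole vector can be seen without a database. This is a small sketch that reuses the two strings returned above and parses them with an explicit tz = "UTC" (an assumption made here so the result does not depend on the local machine). Both elements end up sharing one tzone attribute while still representing different instants.

```r
# The two DATETIMEOFFSET values as they came back from the server (as character).
tm <- c("2020-03-23 12:34:56.0000000 +00:00",
        "2020-03-23 12:34:56.5000000 -04:00")

# Remove the colon inside the trailing offset ("-04:00" -> "-0400") for %z.
tm_no_colon <- sub("([-+][0-9]{2}):([0-9]{2})$", "\\1\\2", tm)

parsed <- as.POSIXct(tm_no_colon, format = "%Y-%m-%d %H:%M:%OS %z", tz = "UTC")

attr(parsed, "tzone")                    # one time zone for the whole vector
format(parsed, "%Y-%m-%d %H:%M:%OS1")    # both elements displayed in UTC
difftime(parsed[2], parsed[1], units = "secs")
```

The per-row offsets are used only while parsing; after that, each element is just an instant, and the display zone is chosen once for the column.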

Since you're trying to do as much in SQL as possible (good idea), when you cast to the DATETIME type, make sure you force a single time zone for every row so that all times are at least comparable.

dat <- DBI::dbGetQuery(con, "select id, cast(tm at time zone 'UTC' as DATETIME) as tm from r2mt")
str(dat)
# 'data.frame': 2 obs. of 2 variables:
# $ id: int 1 2
# $ tm: POSIXct, format: "2020-03-23 12:34:56.0" "2020-03-23 16:34:56.5"

dat <- DBI::dbGetQuery(con, "select id, cast(tm at time zone 'Central European Standard Time' as datetime) as tm from r2mt")
str(dat)
# 'data.frame': 2 obs. of 2 variables:
# $ id: int 1 2
# $ tm: POSIXct, format: "2020-03-23 13:34:56.0" "2020-03-23 17:34:56.5"

(Unfortunately, the time zone names used in SQL Server are not the same as those used in R. I tend to prefer 'UTC' for its lack of ambiguity, but the choice is yours.)
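
Because the zone names differ between SQL Server and R, one option is to fetch everything in UTC and switch zones on the R side. A minimal sketch, assuming that 'Central European Standard Time' corresponds to the R zone name Europe/Warsaw (an assumption based on the commonly used Windows-to-IANA mapping; check OlsonNames() and your own server's zone before relying on it):

```r
# A UTC value like the second row returned by the 'UTC' query above.
tm_utc <- as.POSIXct("2020-03-23 16:34:56", tz = "UTC")

# Assumed R-side counterpart of SQL Server's 'Central European Standard Time'.
r_zone <- "Europe/Warsaw"
stopifnot(r_zone %in% OlsonNames())

# Same instant, displayed in the other zone; the underlying number is unchanged.
tm_local <- tm_utc
attr(tm_local, "tzone") <- r_zone

format(tm_local, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
as.numeric(tm_local) == as.numeric(tm_utc)
```

Changing the tzone attribute only affects how the value is printed, so comparisons and differences computed on the UTC data stay valid.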

RODBC sqlSave() and mapping column names

I'm now doing it this way (maybe that's also what you meant):

colnames(dat) <- c("A", "B")
sqlSave(channel, dat, tablename = "tblTest", rownames=FALSE, append=TRUE)

It works for me. Thanks for your help.
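
If the data frame columns arrive in a different order, or the mapping should be explicit, positional assignment with colnames<- is easy to get wrong. A small sketch of a name-based alternative, where col_map and the column names are hypothetical:

```r
# Example data frame with R-side column names (made up for illustration).
dat <- data.frame(value = c(1.5, 2.5), label = c("x", "y"))

# Hypothetical mapping: names are the R columns, values the table columns.
col_map <- c(label = "B", value = "A")

# Fail early if a column has no mapping, then rename by name, not position.
stopifnot(all(names(dat) %in% names(col_map)))
names(dat) <- unname(col_map[names(dat)])

names(dat)

# With an open RODBC channel, the save call from above would follow:
# sqlSave(channel, dat, tablename = "tblTest", rownames = FALSE, append = TRUE)
```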

Should I use the datetime or timestamp data type in MySQL?

TIMESTAMP columns in MySQL are generally used to track changes to records, and are often updated every time the record is changed. If you want to store a specific value, you should use a DATETIME field.

If you meant that you want to decide between using a UNIX timestamp or a native MySQL datetime field, go with the native DATETIME format. That way you can do calculations within MySQL ("SELECT DATE_ADD(my_datetime, INTERVAL 1 DAY)"), and it is simple to change the format of the value to a UNIX timestamp ("SELECT UNIX_TIMESTAMP(my_datetime)") when you query the record if you want to operate on it with PHP.


