How to Pass R Variable into SQLdf

How can I pass R variable into sqldf?

You can use sprintf:

sqldf(sprintf("select TenScore from data where State_P = '%s'", stateValue))

Pass a variable value as parameter to sqldf

You probably need single quotes around the value returned from Sys.Date(). This should work

TODAY <- Sys.Date()
sqldf(paste0("SELECT * FROM DATA",
" WHERE DATE < '", TODAY, "'",
" ORDER BY DATE DESC LIMIT 1"))

This would generate the following raw query:

SELECT *
FROM DATA
WHERE DATE < '2018-03-29'
ORDER BY DATE DESC
LIMIT 1

This query will return the most recent record in your data happening before today.

Note that building a SQL string using raw concatenation like is generally bad. But if you are just doing it from your R console for some data analysis, then it should be acceptable.

data

DATA <- data.frame(DATE = c(Sys.Date() + 5:10, Sys.Date() - 5:10))

How to include a user defined variable in sqldf query?

We can wrap the $var in a single quote

library(sqldf)
fn$sqldf("Select Iteration, (SUM(Hours) / SUM(Effort)) as Efficiency
From df WHERE Iteration = '$var2'")

Using a reproducible example

fn$sqldf("select cyl, (sum(hp)/sum(wt)) as Efficiency from mtcars where new = '$var2'")
# cyl Efficiency
#1 6 34.21462

If we are doing this on a loop

path_list <- c("Mazda RX4", "Datsun 710", "Valiant")
for(var1 in path_list) print(sqldf(sprintf("select cyl, (sum(hp)/sum(wt)) as Efficiency from mtcars where new = '%s'", var1)))
#cyl Efficiency
#1 6 41.98473
# cyl Efficiency
#1 4 40.08621
# cyl Efficiency
#1 6 30.34682

data

data(mtcars)
mtcars$new <- row.names(mtcars)
var2 <- "Hornet 4 Drive"

R call variable inside sqldf

Variable i will not be replaced in the query as it is. You need sprintf to assign a value to it. (I also don't know if you need to account for break lines but just to be sure I provide it below. Maybe you don't need for sqldf; in which case just remove strwrap).

#let's assume loaddates is the following:
loaddates <- 'something'

One way to get the query as you want, i.e. with no break lines and with i taking the values of loaddates you need:

strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM deuda AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000)

which will output:

[1] "SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, COUNT(*) AS clientes FROM deuda AS D WHERE D.CodEmp = 'TGG' and D.loaddate = something GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext ORDER BY D.DaysRange, D.DaysRangeNext"

Which is what you need in one line with no break lines or with the variable i unassigned.

In your loop it should be:

for (i in loaddates) {

strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM deuda AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000)

}

With your dataset:

library(sqldf)
data <- read.table(text ="
loaddate DaysRange DaysRangeNext
1 2014-03-16 30 30
2 2014-03-16 0 0
3 2014-03-16 0 0
4 2014-03-16 60 NA
5 2014-04-16 30 30
6 2014-04-16 0 30
"
,header = TRUE,stringsAsFactors=F)

loaddates<- unique(sort(data$loaddate))

for (i in loaddates) {

print(sqldf(
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM data AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000) ))
}

Output:

    loaddate DaysRange DaysRangeNext clientes
1 2014-03-16 0 0 2
2 2014-03-16 30 30 1
3 2014-03-16 60 NA 1
loaddate DaysRange DaysRangeNext clientes
1 2014-04-16 0 30 1
2 2014-04-16 30 30 1

R: Passing function argument to sqldf

Common mistake. The column name t1.lat_lon may exist, but in your call to function1, you are referencing a variable named t1.lat_lon, which does not exist.

You have a couple of options, and for simplicity I recommend the first (strings):

  1. Pass it as a string, and use paste or sprintf to form the query string. Such as:

    library(sqldf)
    func1 <- function(x) {
    sqldf(sprintf("select %s from mtcars where cyl > 7", x))
    }
    func1("disp")
  2. If you really must reference it without the quotes, you can do this trickery, but in this case it is unnecessary complexity (for the sake of two quotation mark symbols):

    func2 <- function(x) {
    sqldf(sprintf("select %s from mtcars where cyl > 7",
    deparse(substitute(x))))
    }
    func2(disp)

    This type of dereferencing is used (and useful) when doing something like plot(disp ~ mpg, data = mtcars) (though that does things even slightly differently with a formula). In those situations it adds usefulness at the expense of some code complexity and indirection. If you don't need this (and, based on your example, it does not look like you do), then I would avoid this for the sake of simplicity.

BTW: in both cases, I return the value returned from sqldf, I chose (and I strongly urge) against side-effects as you did in your function. Side effects can be desired but, if avoidable, are opined to provide cleaner and more predictable functions.

Pass R variable to a sql statement

If v is an integer then you don't want to enclose the $v with single quotes - that makes it a string value. Try without the single quotes.

temp <- fn$sqldf("select count(V1) from file_new where V1=$v ")

Use sqldf inside a function with generic column references

Pass the names rather than the columns. Change the sqldf call to fn$sqldf which will enable string interpolation using $. Then in the select statement use $x and $y.

library(sqldf)

Bar_Prep1 <- function(data, x, y) {
dataframe <- fn$sqldf("select
a.[$x] Grp,
AVG(a.[$y]) Mean,
stdev(a.[$y]) SD,
Max(a.[$y]) Max
from data a
group by a.[$x]")
dataframe$RD <- round(dataframe$Mean, digits = 0)
return(dataframe)
}

Bar_Prep1(df, "X", "Y")
## Grp Mean SD Max RD
## 1 a 3.166667 0.7637626 4 3
## 2 b 5.100000 1.0148892 6 5
## 3 c 7.185000 0.7400507 8 7

Note that it would be possible to absorb the rounding into the SQL statement:

Bar_Prep1 <- function(data, x, y) {
fn$sqldf("with tmp as (select
a.[$x] Grp,
AVG(a.[$y]) Mean,
stdev(a.[$y]) SD,
Max(a.[$y]) Max
from data a
group by a.[$x])
select *, round(Mean) RD from tmp")
}


Related Topics



Leave a reply



Submit