Saving dbplyr query (tbl_sql object) to MySQL without saving data locally
Creating a table with the INTO clause (SELECT ... INTO) is SQL Server (and MS Access) specific syntax and is not supported in MySQL. Instead, consider the counterpart statement CREATE TABLE ... SELECT. Also, the meaning of schema differs between RDBMSs; in MySQL, database is synonymous with schema.
Therefore, consider this adjusted version of the SQL build:
sql_query <- glue::glue(
  "CREATE TABLE {db}.{tbl_name}\n AS \n",
  "SELECT * \n",
  "FROM (\n",
  dbplyr::sql_render(input_tbl),
  "\n) AS sub_query"
)
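A minimal sketch of actually running such a statement, assuming an in-memory SQLite database stands in for MySQL (SQLite also accepts CREATE TABLE ... AS SELECT, and its default database is called main). The connection con, the mtcars data, and the names db, tbl_name and select_sql are illustrative assumptions, not part of the original question, and the RSQLite and glue packages are assumed to be installed:

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database stands in for the MySQL connection.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mtcars", mtcars)

# A lazy dbplyr query; nothing is pulled into R here.
input_tbl <- tbl(con, "mtcars") %>% filter(cyl == 4)

db <- "main"               # hypothetical target database (SQLite's default)
tbl_name <- "mtcars_cyl4"  # hypothetical target table name
select_sql <- as.character(dbplyr::sql_render(input_tbl))

sql_query <- glue::glue(
  "CREATE TABLE {db}.{tbl_name} AS\n",
  "SELECT *\n",
  "FROM (\n",
  "{select_sql}",
  "\n) AS sub_query"
)

# The database executes the statement; the rows never travel through R.
DBI::dbExecute(con, as.character(sql_query))
n_rows <- DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM mtcars_cyl4")$n
DBI::dbDisconnect(con)
```

Against MySQL, the same dbExecute() call would be made on the MySQL connection, with db set to the target database name.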
Problem using dbplyr to create a SQL query
show_query() only works on a lazy database table, and you are trying to use it on a data frame. To send the data you read from the CSV to a temporary database object so that the query can be generated, you could use tbl_memdb() and instead do:
data %>%
  tbl_memdb() %>%
  filter(...) %>%
  mutate(...) %>%
  show_query()
dbplyr generating unexpected SQL query
dbplyr is generating the SQL query as I would expect. It has nested one query inside another:
SELECT id, date, type FROM myTable
is a subquery of the outer query
SELECT *
FROM (
  subquery
) q01
WHERE type = 'foobar'
The q01 is the name given to the subquery, in the same way as an alias introduced with the AS keyword. For example: FROM very_long_table_name AS VLTN.
Yes, this nesting is ugly. But many SQL engines have a query optimizer that works out the best way to execute a query. On SQL Server, I have noticed little difference in performance, because the query optimizer finds a faster way to execute the query than the way it is written.
However, it appears that nested queries are known to result in slower performance in MySQL.
One thing that might solve this is changing the order of the select and filter commands in R:
tab %>%
  filter(type == 'foobar') %>%
  select(id, date, type)
This will probably produce the translated query:
SELECT `id`, `date`, `type`
FROM `myTable`
WHERE (`type` = 'foobar')
which should perform better.
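One way to check what each ordering is translated to is to render both against a small in-memory table. This is only a sketch: the table built with memdb_frame() is a hypothetical stand-in for myTable, its values are made up, and it uses SQLite rather than MySQL. Whether the two orderings render differently also depends on the installed dbplyr version.

```r
library(dplyr)
library(dbplyr)

# Assumption: a tiny in-memory SQLite table stands in for myTable.
tab <- memdb_frame(
  id = 1:3,
  date = c("2020-01-01", "2020-01-02", "2020-01-03"),
  type = c("foobar", "other", "foobar"),
  extra = c(10, 20, 30)
)

select_first <- tab %>% select(id, date, type) %>% filter(type == "foobar")
filter_first <- tab %>% filter(type == "foobar") %>% select(id, date, type)

# Print the SQL that each ordering is translated to.
show_query(select_first)
show_query(filter_first)

# Both orderings return the same rows; only the generated SQL may differ.
res_a <- collect(select_first)
res_b <- collect(filter_first)
```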
Connect to a DB using DBplyr
In the example you have linked to, mtcars is a table in datawarehouse. I am going to assume mtcars is in the database you are connecting to, but you can check for this using:
'mtcars' %in% DBI::dbListTables(con)
If you want to query a table in a specific database or schema (not the default), then you need to use in_schema.
Without in_schema:
tbl(con, 'dbo.mtcars')
This produces an SQL query like:
SELECT *
FROM "dbo.mtcars"
where the " characters delimit names. So in this case SQL is looking for a table named dbo.mtcars, not a table named mtcars in dbo.
With in_schema:
tbl(con, in_schema('dbo','mtcars'))
This produces an SQL query like:
SELECT *
FROM "dbo"."mtcars"
So in this case SQL is looking for a table named mtcars in dbo, because each term is quoted with " separately.
How to solve error no applicable method for 'show_query' applied to an object of class data.frame
show_query() translates dplyr syntax into query code for the backend you are using. A database backend using dbplyr will result in an SQL query (just as a data.table backend using dtplyr will result in a DT[i, j, by] query).
A data.frame backend needs no translation, since dplyr runs on it directly, so show_query() has no method for it, hence the error message you're getting.
An easy way to get an SQL query is to turn the data.frame into an in-memory database table with memdb_frame:
memdb_frame(iris) %>%
  filter(Species == "setosa") %>%
  summarise(mean.Sepal.Length = mean(Sepal.Length),
            mean.Petal.Length = mean(Petal.Length)) %>%
  show_query()
<SQL>
SELECT AVG(`Sepal.Length`) AS `mean.Sepal.Length`, AVG(`Petal.Length`) AS `mean.Petal.Length`
FROM `dbplyr_002`
WHERE (`Species` = 'setosa')