How to Insert (Add) a Row to a SQLite Db Table Using Dplyr Package

Is it possible to insert (add) a row to a SQLite db table using dplyr package?

You can perform SQL operations on a database/table created via dplyr, but you have to drop down to RSQLite/DBI calls and change how you create the database/table:

library(dplyr)
library(DBI) # needed for dbSendQuery()

my_db <- src_sqlite("my_db.sqlite3", create=TRUE)
copy_to(my_db, iris, "my_table", temporary=FALSE) # need to set temporary to FALSE

# grab the db connection from the object created by src_sqlite
# and issue the INSERT that way

res <- dbSendQuery(my_db$con,
  'INSERT INTO my_table VALUES (9.9, 9.9, 9.9, 9.9, "new")')
dbClearResult(res) # free the pending result set when done
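If you'd rather not paste values into the SQL string yourself, a parameterized INSERT through DBI is a safer sketch (quoting and type conversion are handled for you). This assumes an in-memory database and an iris-shaped table like the one above:

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "my_table", iris)

# Named placeholders; DBI/RSQLite handle quoting and types
dbExecute(con,
          "INSERT INTO my_table VALUES (:a, :b, :c, :d, :e)",
          params = list(a = 9.9, b = 9.9, c = 9.9, d = 9.9, e = "new"))

n <- dbGetQuery(con, "SELECT COUNT(*) FROM my_table")[[1]]  # 150 + 1 = 151
dbDisconnect(con)
```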

Can't append data to sqlite3 with dplyr db_write_table()

No. You shouldn't use db_write_table() in place of db_insert_table(), since it can't be generalized across backends.

And you shouldn't use the tidyverse versions rather than the corresponding DBI:: functions, since the tidyverse helpers are for internal use and are not designed to be robust enough for end users. See the discussion at https://github.com/tidyverse/dplyr/issues/3120#issuecomment-339034612 :

Actually, I don't think you should be using these functions at all. Despite that SO post, these are not user facing functions. You should be calling the DBI functions directly.

-- Hadley Wickham, package author.
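Following that advice, the DBI-native way to append rows is dbAppendTable() (available in DBI >= 1.0) or dbWriteTable(append = TRUE). A minimal sketch, with a hypothetical my_table seeded from iris:

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "my_table", head(iris))  # 6 rows to start

new_rows <- data.frame(Sepal.Length = 9.9, Sepal.Width = 9.9,
                       Petal.Length = 9.9, Petal.Width = 9.9,
                       Species = "new")

# Appends rows whose columns match the existing table
dbAppendTable(con, "my_table", new_rows)

n <- dbGetQuery(con, "SELECT COUNT(*) FROM my_table")[[1]]  # 6 + 1 = 7
dbDisconnect(con)
```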

Adding column to sqlite database

I'm not aware of a way of doing this with dplyr, but you can do it with RSQLite directly. The problem is not actually with RSQLite; it's that I don't know how to pass a list of new values to mutate(). Note that, in your code, something like this would work:

cars_tbl %>% mutate(new_col = another_column / 3.14)

Anyway, my alternative. I've created a toy cars dataframe.

cars <- data.frame(year=c(1999, 2007, 2009, 2017), model=c("Ford", "Toyota", "Toyota", "BMW"))

I open the connection and actually create the table:

library(RSQLite)

dbcon <- dbConnect(RSQLite::SQLite(), "cars.db")
dbWriteTable(dbcon, name = "cars", value = cars)

Add the new column and check:

dbExecute(dbcon, "ALTER TABLE cars ADD COLUMN new_col TEXT")
dbGetQuery(dbcon, "SELECT * FROM cars")
year model new_col
1 1999 Ford <NA>
2 2007 Toyota <NA>
3 2009 Toyota <NA>
4 2017 BMW <NA>

Then you can update the new column. The only tricky part is that you have to provide a WHERE clause; in this case I use the year.

new_values <- sample(c("A","B","C"), nrow(cars), replace = TRUE) 
new_values
[1] "C" "B" "B" "B"

dbGetPreparedQuery(dbcon, "UPDATE cars SET new_col = ? WHERE year = ?",
                   bind.data = data.frame(new_col = new_values,
                                          year = cars$year))

dbGetQuery(dbcon, "SELECT * FROM cars")
year model new_col
1 1999 Ford C
2 2007 Toyota B
3 2009 Toyota B
4 2017 BMW B

As a unique index, you could always use rownames(cars), but you would have to add it as a column in your dataframe and then in your table.

EDIT after a suggestion by @krlmlr: indeed, it is much better to use dbExecute() instead of the deprecated dbGetPreparedQuery():

dbExecute(dbcon, "UPDATE cars SET new_col = :new_col WHERE year = :year",
          params = data.frame(new_col = new_values,
                              year = cars$year))

EDIT after comments: I did not think about this a few days ago, but even in SQLite you can use the rowid. I've tested this and it works:

dbExecute(dbcon, "UPDATE cars SET new_col = :new_col WHERE rowid = :id",
          params = data.frame(new_col = new_values,
                              id = rownames(cars)))

You do have to make sure that the rowids in the table match your rownames. In any case, you can always fetch the rowids like this:

dbGetQuery(dbcon, "SELECT rowid, * FROM cars")
rowid year model new_col
1 1 1999 Ford C
2 2 2007 Toyota B
3 3 2009 Toyota B
4 4 2017 BMW B
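If you prefer to stay inside dplyr for the new column, one workaround (not an in-place ALTER; the table name cars2 is my own choice) is to derive the column lazily and materialize the result as a new table with compute():

```r
library(dplyr)
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
cars <- data.frame(year = c(1999, 2007, 2009, 2017),
                   model = c("Ford", "Toyota", "Toyota", "BMW"))
dbWriteTable(con, "cars", cars)

# Derive the new column in the database, then write it out as a new table
tbl(con, "cars") %>%
  mutate(new_col = year - 1990) %>%
  compute(name = "cars2", temporary = FALSE)

res <- dbGetQuery(con, "SELECT * FROM cars2")  # 4 rows, 3 columns
dbDisconnect(con)
```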

Select specific rows from sqlite database using dbplyr in R

filter() works with a SQLite db:

library(dplyr)
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, iris, "iris")
iris_db <- tbl(con, "iris")
iris_db %>%
  filter(row_number() %in% c(1, 8, 20))
# Source: lazy query [?? x 5]
# Database: sqlite 3.29.0 [:memory:]
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# <dbl> <dbl> <dbl> <dbl> <chr>
#1 5.1 3.5 1.4 0.2 setosa
#2 5 3.4 1.5 0.2 setosa
#3 5.1 3.8 1.5 0.3 setosa
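Note that row_number() translates to a window function, which needs SQLite 3.25+. If that's unavailable, one alternative sketch is to expose SQLite's implicit rowid in a subquery and filter on that (the alias rid is mine):

```r
library(dplyr)
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, iris, "iris")

# Expose the implicit rowid so we can filter on physical row position
iris_rows <- tbl(con, dbplyr::sql("SELECT rowid AS rid, * FROM iris")) %>%
  filter(rid %in% c(1, 8, 20)) %>%
  collect()

dbDisconnect(con)
```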

How can I append data to a PostgreSQL table with `dplyr` without `collect()`?

dplyr does not include commands to insert or update records in a database, so there is no fully native dplyr solution for this. But you can combine dplyr with regular SQL statements to avoid bringing the data into R.

Let's start by reproducing your steps before the collect() statement

library(dplyr)

pg <- src_postgres()

reg_data <- tbl(pg, "reg_data")

reg_results <- reg_data %>%
  summarize(r_squared = regr_r2(y, x),
            num_obs = regr_count(y, x),
            constant = regr_intercept(y, x),
            slope = regr_slope(y, x),
            mean_analyst_fog = regr_avgx(y, x),
            mean_manager_fog = regr_avgy(y, x))

Now, you could use compute() instead of collect() to create a temporary table in the database.

temp.table.name <- paste0(sample(letters, 10, replace = TRUE), collapse = "")

reg_results <- reg_results %>% compute(name=temp.table.name)

Here temp.table.name is a random table name. Passing name = temp.table.name to compute() assigns this random name to the temporary table that gets created.

Now we will use the RPostgreSQL library to create an insert query that uses the results stored in the temporary table. As the temporary table only lives in the connection created by src_postgres(), we need to reuse it.

library(RPostgreSQL)
copyconn <- pg$con
class(copyconn) <- "PostgreSQLConnection" # I get an error if I don't fix the class

Finally, the insert query:

sql <- paste0("INSERT INTO destination_table SELECT * FROM ", temp.table.name, ";")

dbSendQuery(copyconn, sql)

So, everything is happening in the database and the data is not brought into R.

EDIT

Previous versions of this post broke encapsulation when we obtained temp.table.name from reg_results. This is avoided by using the name = option in compute().
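The same compute()-then-INSERT ... SELECT pattern can be reproduced end to end on SQLite, which is easier to test locally. A sketch (the names destination_table and tmp_summary, and the mtcars summary, are illustrative, not from the original question):

```r
library(dplyr)
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars, "mtcars")
dbExecute(con, "CREATE TABLE destination_table (cyl REAL, avg_mpg REAL)")

# Summarize lazily and materialize as a temporary table in the database...
tbl(con, "mtcars") %>%
  group_by(cyl) %>%
  summarize(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  compute(name = "tmp_summary")

# ...then append it without ever collect()ing into R
dbExecute(con, "INSERT INTO destination_table SELECT * FROM tmp_summary")
n <- dbGetQuery(con, "SELECT COUNT(*) FROM destination_table")[[1]]  # 3 cylinder groups
dbDisconnect(con)
```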

dplyr:: append to postgresql remote source if uid not present

Per @hadley's comment to this answer, you can use dbWriteTable(append = TRUE). Here's an example with SQLite:

library(dplyr)

# Create some example data
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dd <- data.frame(uid = 1:20, value = LETTERS[1:20])
dat <- data.frame(uid = 15:26, value = LETTERS[15:26])
copy_to(con, dd, "my_data",
        temporary = FALSE,
        indexes = list("uid", "value"))

# Identify rows to append
dd_db <- tbl(con, "my_data")
dd_uid <- select(dd_db, uid) %>%
  collect() %>%
  unlist()
dat_to_append <- filter(dat, !(uid %in% dd_uid))

DBI::dbWriteTable(con, "my_data", dat_to_append, append = TRUE)
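The "only if uid is absent" filter can also be pushed into the database entirely, so the existing uids never travel into R. A sketch using a temporary staging table and INSERT ... WHERE NOT EXISTS (the staging table name is my own):

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "my_data", data.frame(uid = 1:20, value = LETTERS[1:20]))
dat <- data.frame(uid = 15:26, value = LETTERS[15:26])

# Stage the candidate rows, then insert only uids not already present
dbWriteTable(con, "staging", dat, temporary = TRUE)
dbExecute(con, "
  INSERT INTO my_data
  SELECT * FROM staging s
  WHERE NOT EXISTS (SELECT 1 FROM my_data m WHERE m.uid = s.uid)")

n <- dbGetQuery(con, "SELECT COUNT(*) FROM my_data")[[1]]  # 20 + 6 = 26
dbDisconnect(con)
```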

