Faster Bulk Inserts in SQLite3

Faster bulk inserts in sqlite3?

You can also try tweaking a few parameters to get extra speed out of it. Specifically, you probably want PRAGMA synchronous = OFF; (at the cost of durability if the machine loses power mid-write).
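As a minimal sketch, here is what applying that pragma looks like from Python's sqlite3 module (the in-memory database is just for illustration; a file path works the same way):

```python
import sqlite3

# in-memory database for the sketch; a file path works the same way
conn = sqlite3.connect(":memory:")

# trade crash-durability for speed: SQLite stops waiting for fsync
conn.execute("PRAGMA synchronous = OFF")

# read the setting back to confirm it took effect (0 == OFF)
level = conn.execute("PRAGMA synchronous").fetchone()[0]
```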

Improve INSERT-per-second performance of SQLite

Several tips:

  1. Put inserts/updates in a transaction.
  2. For older versions of SQLite, consider a less paranoid journal mode (PRAGMA journal_mode). There is NORMAL, and then there is OFF, which can significantly increase insert speed if you're not too worried about the database possibly being corrupted if the OS crashes. If only your application crashes, the data should be fine. Note that in newer versions, the OFF/MEMORY settings are not safe against application-level crashes.
  3. Playing with page sizes makes a difference as well (PRAGMA page_size). Having larger page sizes can make reads and writes go a bit faster as larger pages are held in memory. Note that more memory will be used for your database.
  4. If you have indices, consider calling CREATE INDEX after doing all your inserts. This is significantly faster than creating the index and then doing your inserts.
  5. You have to be quite careful if you have concurrent access to SQLite, as the whole database is locked when writes are done, and although multiple readers are possible, writes will be locked out. This has been improved somewhat with the addition of a WAL in newer SQLite versions.
  6. Take advantage of saving space...smaller databases go faster. For instance, if you have key/value pairs, try making the key an INTEGER PRIMARY KEY if possible, which will replace the implicit unique rowid column in the table.
  7. If you are using multiple threads, you can try using the shared page cache, which will allow loaded pages to be shared between threads, which can avoid expensive I/O calls.
  8. Don't use !feof(file) as your read-loop condition.
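Tips 1, 2, and 4 above can be sketched together in Python (table and column names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None                   # manage transactions manually
conn.execute("PRAGMA journal_mode = MEMORY")  # less paranoid journal (tip 2)

conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")

rows = [(i, "value-%d" % i) for i in range(10000)]

conn.execute("BEGIN")                         # one transaction for all inserts (tip 1)
conn.executemany("INSERT INTO kv (k, v) VALUES (?, ?)", rows)
conn.execute("COMMIT")

# build the index only after the bulk load (tip 4)
conn.execute("CREATE INDEX kv_v ON kv (v)")

count = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
```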

I've also asked similar questions here and here.

Bulk insert performance in SQLite

Appreciate that with your current approach, inserting 1 million rows would require executing 1 million separate round-trip inserts to SQLite. Instead, you could try one of the following two approaches. For more recent versions of SQLite (3.7.11 and later):

INSERT INTO myTable (id, format, size)
VALUES
(%d, '%s', %d),
(%d, '%s', %d),
(%d, '%s', %d),
... (more rows)

For earlier versions of SQLite, you may use an INSERT INTO ... SELECT construct:

INSERT INTO myTable (id, format, size)
SELECT %d, '%s', %d UNION ALL
SELECT %d, '%s', %d UNION ALL
... (more rows)

The basic idea here is that you can try just making a single insert call to SQLite with all of your data, instead of inserting one row at a time.

Not a C person, but here is how you might build the insert string from your C code:

const int MAX_BUF = 1000;  // make this as large as is needed
char *sql_buffer = malloc(MAX_BUF * sizeof(char));
int length = 0;
length += snprintf(sql_buffer + length, MAX_BUF - length,
                   "INSERT INTO myTable (id, format, size) VALUES");
for (int i = 0; i < num_rows; i++) {
    // separate the value tuples with commas after the first one
    length += snprintf(sql_buffer + length, MAX_BUF - length,
                       "%s (%d, '%s', %d)",
                       i == 0 ? "" : ",",
                       row[i].id, row[i].format, row[i].size);
}

rc = sqlite3_exec(db, sql_buffer, NULL, NULL, NULL);
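The same single-statement idea is less error-prone with parameter placeholders than with '%s' string formatting (which risks quoting bugs and SQL injection). Here is a hedged Python sketch; the table and sample data are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, format TEXT, size INTEGER)")

rows = [(1, "png", 1024), (2, "jpg", 2048), (3, "gif", 512)]

# build one INSERT ... VALUES (?,?,?),(?,?,?),... statement
placeholders = ",".join(["(?,?,?)"] * len(rows))
sql = "INSERT INTO myTable (id, format, size) VALUES " + placeholders

# flatten the row tuples into a single parameter list
params = [v for row in rows for v in row]
conn.execute(sql, params)

inserted = conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]
```

Keep in mind that SQLite caps the number of bound parameters per statement, so very large batches still need to be split into chunks.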

Is it possible to insert multiple rows at a time in an SQLite database?

update

As BrianCampbell points out here, SQLite 3.7.11 and above now supports the simpler syntax of the original post. However, the approach shown is still appropriate if you want maximum compatibility across legacy databases.

original answer

If I had privileges, I would bump river's reply: you can insert multiple rows in SQLite, you just need different syntax. To make it perfectly clear, here is the OP's MySQL example:

INSERT INTO 'tablename' ('column1', 'column2') VALUES
('data1', 'data2'),
('data1', 'data2'),
('data1', 'data2'),
('data1', 'data2');

This can be recast into SQLite as:

INSERT INTO 'tablename'
SELECT 'data1' AS 'column1', 'data2' AS 'column2'
UNION ALL SELECT 'data1', 'data2'
UNION ALL SELECT 'data1', 'data2'
UNION ALL SELECT 'data1', 'data2';

a note on performance

I originally used this technique to efficiently load large datasets from Ruby on Rails. However, as Jaime Cook points out, it's not clear this is any faster than wrapping individual INSERTs within a single transaction:

BEGIN TRANSACTION;
INSERT INTO 'tablename' VALUES ('data1', 'data2');
INSERT INTO 'tablename' VALUES ('data3', 'data4');
...
COMMIT;

If efficiency is your goal, you should try this first.

a note on UNION vs UNION ALL

As several people commented, if you use UNION ALL (as shown above), all rows will be inserted, so in this case, you'd get four rows of data1, data2. If you omit the ALL, then duplicate rows will be eliminated (and the operation will presumably be a bit slower). We're using UNION ALL since it more closely matches the semantics of the original post.
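The difference can be checked with a quick Python sketch (the table names are made up; each statement starts from one seed SELECT and appends three duplicate rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_all (c1 TEXT, c2 TEXT)")
conn.execute("CREATE TABLE t_dedup (c1 TEXT, c2 TEXT)")

# UNION ALL keeps every row: 1 seed + 3 duplicates = 4 rows
more = " UNION ALL SELECT 'data1', 'data2'" * 3
conn.execute("INSERT INTO t_all SELECT 'data1', 'data2'" + more)

# plain UNION eliminates duplicates: only 1 row survives
more = " UNION SELECT 'data1', 'data2'" * 3
conn.execute("INSERT INTO t_dedup SELECT 'data1', 'data2'" + more)

all_rows = conn.execute("SELECT COUNT(*) FROM t_all").fetchone()[0]
dedup_rows = conn.execute("SELECT COUNT(*) FROM t_dedup").fetchone()[0]
```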

in closing

P.S.: Please +1 river's reply, as it presented the solution first.

Bulk insert huge data into SQLite using Python

Divide your data into chunks on the fly using generator expressions, and make the inserts inside a transaction. Here's a quote from the SQLite optimization FAQ:

Unless already in a transaction, each SQL statement has a new
transaction started for it. This is very expensive, since it requires
reopening, writing to, and closing the journal file for each
statement. This can be avoided by wrapping sequences of SQL statements
with BEGIN TRANSACTION; and END TRANSACTION; statements. This speedup
is also obtained for statements which don't alter the database.

Here's how your code might look.
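Since the original linked code isn't reproduced here, this is a hedged sketch of the idea: slice any iterable into fixed-size chunks with a generator, then insert each chunk inside its own transaction (table and sizes are made up):

```python
import sqlite3
from itertools import islice

def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (n INTEGER)")

data = ((i,) for i in range(100000))  # generator: nothing materialized up front

for chunk in chunks(data, 10000):
    with conn:                         # one transaction per chunk
        conn.executemany("INSERT INTO samples (n) VALUES (?)", chunk)

total = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
```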

Also, SQLite can import CSV files directly (for example, via the .import command in the sqlite3 command-line shell).

DBD::SQLite fastest way to insert multiple rows

See the example below, where AutoCommit is set to 0:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:SQLite:dbname=pedro.lite","","",
{PrintError => 1, AutoCommit => 0}) or die "Can't connect";

my $sth = $dbh->prepare(q{INSERT INTO purchases VALUES(?,?,?,?)})
or die $dbh->errstr;

while (<DATA>) {
    chomp;
    $sth->execute( split /\|/ );
}

$dbh->commit() or die $dbh->errstr;

__DATA__
Pedro|groceries|apple|1.42
Nitin|tobacco|cigarettes|15.00
Susie|groceries|cereal|5.50
Susie|groceries|milk|4.75
Susie|tobacco|cigarettes|15.00
Susie|fuel|gasoline|44.90
Pedro|fuel|propane|9.60

This defers the commit until all records are inserted. In practice, you may not want to wait that long if there are a lot of inserts; perhaps commit every 5000 inserts, or whatever you feel is best. The trade-off of deferring commits is that if there is an error or the computer shuts down, you will only have the records inserted up to the last commit, which can be an awkward situation.
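The same periodic-commit idea, sketched in Python rather than Perl (the table and the batch size of 5000 are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # take over transaction handling
conn.execute("CREATE TABLE purchases (who TEXT, cat TEXT, item TEXT, price REAL)")

BATCH = 5000
conn.execute("BEGIN")
for i in range(12345):
    conn.execute("INSERT INTO purchases VALUES (?,?,?,?)",
                 ("user-%d" % i, "cat", "item", 1.0))
    if (i + 1) % BATCH == 0:  # commit every BATCH rows, then open a new transaction
        conn.execute("COMMIT")
        conn.execute("BEGIN")
conn.execute("COMMIT")        # commit the final partial batch

n = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
```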

Bulk insert data in SQLite with prepare statements in C++

This code solved my problem.

void insertSAData(const vector<json> &saData) {
    sqlite3_mutex_enter(sqlite3_db_mutex(db));
    char *errorMessage = nullptr;
    sqlite3_exec(db, "PRAGMA synchronous=OFF", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, &errorMessage);  // deprecated in modern SQLite
    sqlite3_exec(db, "PRAGMA journal_mode=MEMORY", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "PRAGMA temp_store=MEMORY", NULL, NULL, &errorMessage);

    sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, &errorMessage);

    char const *szSQL = "INSERT INTO SA_DATA (DATA_ID,P_KEY,AMOUNT,AMOUNT_INDEX) VALUES (?,?,?,?);";
    sqlite3_stmt *stmt = nullptr;
    int rc = sqlite3_prepare_v2(db, szSQL, -1, &stmt, NULL);

    if (rc == SQLITE_OK) {
        for (size_t x = 0; x < saData.size(); x++) {
            // bind the values
            sqlite3_bind_int(stmt, 1, saData[x].at("lastSAId"));
            std::string hash = saData[x].at("public_key");
            sqlite3_bind_text(stmt, 2, hash.c_str(), -1, SQLITE_TRANSIENT);
            sqlite3_bind_int64(stmt, 3, saData[x].at("amount"));
            std::string amount_index = saData[x].at("amount_idx");
            sqlite3_bind_int(stmt, 4, atoi(amount_index.c_str()));

            int retVal = sqlite3_step(stmt);
            if (retVal != SQLITE_DONE) {
                printf("Insert failed! %d\n", retVal);
            }

            sqlite3_reset(stmt);
        }

        sqlite3_exec(db, "COMMIT TRANSACTION", NULL, NULL, &errorMessage);
        sqlite3_finalize(stmt);
    } else {
        fprintf(stderr, "SQL error: %s\n", sqlite3_errmsg(db));
    }

    sqlite3_mutex_leave(sqlite3_db_mutex(db));
}

How to insert 40000 records fast into an sqlite database in an iPad

There are three things that you need to do in order to speed up the insertions:

  • Move the call of sqlite3_open outside the loop. The loop itself is not shown, so I assume it sits outside your code snippet.
  • Add BEGIN TRANSACTION and COMMIT TRANSACTION calls - you need to begin transaction before the insertion loop and end it right after the loop is over.
  • Make formatStringQueryInsertWithTable truly parameterized - Currently it appears that you are not using prepared statements to their fullest, because despite using sqlite3_prepare_v2, you have no calls of sqlite3_bind_XYZ in your code.

Here is a nice post that shows you how to do all of the above. It is plain C, but it will work fine as part of an Objective C program.

char* errorMessage;
sqlite3_exec(mDb, "BEGIN TRANSACTION", NULL, NULL, &errorMessage);
char buffer[] = "INSERT INTO example VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)";
sqlite3_stmt* stmt;
sqlite3_prepare_v2(mDb, buffer, strlen(buffer), &stmt, NULL);
for (unsigned i = 0; i < mVal; i++) {
    std::string id = getID();
    sqlite3_bind_text(stmt, 1, id.c_str(), id.size(), SQLITE_STATIC);
    sqlite3_bind_double(stmt, 2, getDouble());
    sqlite3_bind_double(stmt, 3, getDouble());
    sqlite3_bind_double(stmt, 4, getDouble());
    sqlite3_bind_int(stmt, 5, getInt());
    sqlite3_bind_int(stmt, 6, getInt());
    sqlite3_bind_int(stmt, 7, getInt());
    if (sqlite3_step(stmt) != SQLITE_DONE) {
        printf("Commit Failed!\n");
    }
    sqlite3_reset(stmt);
}
sqlite3_exec(mDb, "COMMIT TRANSACTION", NULL, NULL, &errorMessage);
sqlite3_finalize(stmt);

