Faster bulk inserts in sqlite3?
You can also try tweaking a few parameters to get extra speed out of it. Specifically, you probably want PRAGMA synchronous = OFF.
Improve INSERT-per-second performance of SQLite
Several tips:
- Put inserts/updates in a transaction.
- For older versions of SQLite, consider a less paranoid journal mode (PRAGMA journal_mode). There is NORMAL, and then there is OFF, which can significantly increase insert speed if you're not too worried about the database possibly getting corrupted if the OS crashes. If your application crashes, the data should be fine. Note that in newer versions, the OFF/MEMORY settings are not safe for application-level crashes.
- Playing with page sizes makes a difference as well (PRAGMA page_size). Larger page sizes can make reads and writes a bit faster, since larger pages are held in memory; note that more memory will also be used for your database.
- If you have indices, consider calling CREATE INDEX after doing all your inserts. This is significantly faster than creating the index first and then doing your inserts.
- You have to be quite careful if you have concurrent access to SQLite, as the whole database is locked during writes; although multiple readers are possible, writes will lock them out. This has been improved somewhat with the addition of write-ahead logging (WAL) in newer SQLite versions.
- Take advantage of saving space: smaller databases go faster. For instance, if you have key-value pairs, try making the key an INTEGER PRIMARY KEY if possible, which will replace the implied unique row-number column in the table.
- If you are using multiple threads, you can try using the shared page cache, which allows loaded pages to be shared between threads and can avoid expensive I/O calls.
- Don't use !feof(file)!
I've also asked similar questions here and here.
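The two biggest wins above (a single transaction, plus relaxed synchronous/journal settings) can be sketched with Python's built-in sqlite3 module. The table schema and the bulk_insert helper below are illustrative assumptions, not from the answer:

```python
import sqlite3
import time

def bulk_insert(pragmas, rows):
    # fresh in-memory database for each run
    conn = sqlite3.connect(":memory:")
    for p in pragmas:
        conn.execute(p)
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
    start = time.perf_counter()
    with conn:  # one transaction wrapped around all the inserts
        conn.executemany("INSERT INTO kv VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
    conn.close()
    return count, elapsed

rows = [(i, "value-%d" % i) for i in range(10_000)]
count, secs = bulk_insert(["PRAGMA synchronous = OFF",
                           "PRAGMA journal_mode = MEMORY"], rows)
print(count)  # prints 10000
```

Timing the same call with and without the pragmas (and with one INSERT per transaction) is the easiest way to see how much each tip buys you on your own hardware.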
Bulk insert performance in SQLite
Appreciate that with your current approach, inserting 1 million rows would require executing 1 million separate round-trip inserts to SQLite. Instead, you could try one of the following two approaches. For more recent versions of SQLite:
INSERT INTO myTable (id, format, size)
VALUES
(%d, '%s', %d),
(%d, '%s', %d),
(%d, '%s', %d),
... (more rows)
For earlier versions of SQLite, you may use an INSERT INTO ... SELECT construct:
INSERT INTO myTable (id, format, size)
SELECT %d, '%s', %d UNION ALL
SELECT %d, '%s', %d UNION ALL
... (more rows)
The basic idea here is that you can try just making a single insert call to SQLite with all of your data, instead of inserting one row at a time.
Not a C person, but here is how you might build the insert string from your C code (row and row_count stand in for your own data; note that the string values are not escaped here):
const int MAX_BUF = 1000; // make this as large as is needed
char *sql_buffer = malloc(MAX_BUF);
int length = 0;
length += snprintf(sql_buffer + length, MAX_BUF - length,
    "INSERT INTO myTable (id, format, size) VALUES");
for (int i = 0; i < row_count; i++) {
    // comma-separate each value tuple after the first
    length += snprintf(sql_buffer + length, MAX_BUF - length,
        "%s (%d, '%s', %d)", i ? "," : "",
        row[i].id, row[i].format, row[i].size);
}
rc = sqlite3_exec(db, sql_buffer, NULL, NULL, NULL);
Is it possible to insert multiple rows at a time in an SQLite database?
update
As BrianCampbell points out here, SQLite 3.7.11 and above now support the simpler syntax of the original post. However, the approach shown is still appropriate if you want maximum compatibility with older SQLite versions.
original answer
If I had privileges, I would bump river's reply: you can insert multiple rows in SQLite, you just need different syntax. To make it perfectly clear, the OP's MySQL example:
INSERT INTO 'tablename' ('column1', 'column2') VALUES
('data1', 'data2'),
('data1', 'data2'),
('data1', 'data2'),
('data1', 'data2');
This can be recast into SQLite as:
INSERT INTO 'tablename'
SELECT 'data1' AS 'column1', 'data2' AS 'column2'
UNION ALL SELECT 'data1', 'data2'
UNION ALL SELECT 'data1', 'data2'
UNION ALL SELECT 'data1', 'data2'
a note on performance
I originally used this technique to efficiently load large datasets from Ruby on Rails. However, as Jaime Cook points out, it's not clear this is any faster than wrapping individual INSERTs within a single transaction:
BEGIN TRANSACTION;
INSERT INTO 'tablename' VALUES ('data1', 'data2');
INSERT INTO 'tablename' VALUES ('data3', 'data4');
...
COMMIT;
If efficiency is your goal, you should try this first.
a note on UNION vs UNION ALL
As several people have commented, if you use UNION ALL (as shown above), all rows will be inserted, so in this case you'd get four identical rows of data1, data2. If you omit the ALL, then duplicate rows will be eliminated (and the operation will presumably be a bit slower). We're using UNION ALL here since it more closely matches the semantics of the original post.
in closing
P.S.: Please +1 river's reply, as it presented the solution first.
Bulk insert huge data into SQLite using Python
Divide your data into chunks on the fly using generator expressions, and make the inserts inside a transaction. Here's a quote from the SQLite optimization FAQ:
Unless already in a transaction, each SQL statement has a new
transaction started for it. This is very expensive, since it requires
reopening, writing to, and closing the journal file for each
statement. This can be avoided by wrapping sequences of SQL statements
with BEGIN TRANSACTION; and END TRANSACTION; statements. This speedup
is also obtained for statements which don't alter the database.
Here's how your code might look.
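A minimal sketch of that chunked approach with Python's sqlite3 module; the chunk size, table schema, and the chunked helper are assumptions for illustration, not from the original answer:

```python
import sqlite3
from itertools import islice

def chunked(iterable, size):
    # yield successive lists of at most `size` items, so the full
    # dataset never has to sit in memory at once
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

def bulk_load(conn, rows, chunk_size=10_000):
    conn.execute("CREATE TABLE IF NOT EXISTS data (id INTEGER, value TEXT)")
    for chunk in chunked(rows, chunk_size):
        # each chunk is inserted inside one transaction,
        # committed when the `with` block exits
        with conn:
            conn.executemany("INSERT INTO data VALUES (?, ?)", chunk)

conn = sqlite3.connect(":memory:")
bulk_load(conn, ((i, "row-%d" % i) for i in range(25_000)), chunk_size=10_000)
print(conn.execute("SELECT COUNT(*) FROM data").fetchone()[0])  # prints 25000
```

Because rows is a generator expression, this streams an arbitrarily large dataset through a bounded amount of memory while still paying the journal cost only once per chunk.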
Also, SQLite can import CSV files directly (via the .import command in the sqlite3 shell).
DBD::SQLite fastest way to insert multiple rows
See the example below, where AutoCommit is set to 0:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $dbh = DBI->connect("dbi:SQLite:dbname=pedro.lite","","",
{PrintError => 1, AutoCommit => 0}) or die "Can't connect";
my $sth = $dbh->prepare(q{INSERT INTO purchases VALUES(?,?,?,?)})
or die $dbh->errstr;
while (<DATA>) {
    chomp;
    $sth->execute( split /\|/ );
}
$dbh->commit() or die $dbh->errstr;
__DATA__
Pedro|groceries|apple|1.42
Nitin|tobacco|cigarettes|15.00
Susie|groceries|cereal|5.50
Susie|groceries|milk|4.75
Susie|tobacco|cigarettes|15.00
Susie|fuel|gasoline|44.90
Pedro|fuel|propane|9.60
This defers the commit until all records are inserted. In practice, if there are a lot of inserts, you may not want to wait that long to commit - perhaps commit every 5000 inserts, or whatever you feel is best. The trade-off: by not committing, an error or computer shutdown means you only keep the records written up to the last commit, and everything after it is lost.
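That periodic-commit idea can be sketched as follows (Python's sqlite3 module here rather than DBD::SQLite; the 5000-row batch size matches the suggestion above, and the schema is invented):

```python
import sqlite3

BATCH = 5000  # commit every 5000 inserts

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (who TEXT, cat TEXT, item TEXT, amt REAL)")

pending = 0
for i in range(12_500):
    conn.execute("INSERT INTO purchases VALUES (?, ?, ?, ?)",
                 ("Pedro", "groceries", "item-%d" % i, 1.0))
    pending += 1
    if pending >= BATCH:
        conn.commit()   # a crash now loses at most BATCH uncommitted rows
        pending = 0
conn.commit()           # flush the final partial batch
print(conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0])  # prints 12500
```

This bounds both the transaction size and the amount of work lost on a crash, at the cost of paying the journal overhead once per batch instead of once overall.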
Bulk insert data in SQLite with prepare statements in C++
This code solved my problem.
void insertSAData(const vector<json> &saData) {
    sqlite3_mutex_enter(sqlite3_db_mutex(db));
    char *errorMessage = nullptr;
    sqlite3_exec(db, "PRAGMA synchronous=OFF", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "PRAGMA journal_mode=MEMORY", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "PRAGMA temp_store=MEMORY", NULL, NULL, &errorMessage);
    sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, &errorMessage);
    char const *szSQL = "INSERT INTO SA_DATA (DATA_ID,P_KEY,AMOUNT,AMOUNT_INDEX) VALUES (?,?,?,?);";
    sqlite3_stmt *stmt = nullptr;
    int rc = sqlite3_prepare_v2(db, szSQL, -1, &stmt, NULL);
    if (rc == SQLITE_OK) {
        for (size_t x = 0; x < saData.size(); x++) {
            // bind the values
            sqlite3_bind_int(stmt, 1, saData[x].at("lastSAId"));
            std::string hash = saData[x].at("public_key");
            sqlite3_bind_text(stmt, 2, hash.c_str(), hash.size(), SQLITE_TRANSIENT);
            sqlite3_bind_int64(stmt, 3, saData[x].at("amount"));
            std::string amount_index = saData[x].at("amount_idx");
            sqlite3_bind_int(stmt, 4, atoi(amount_index.c_str()));
            int retVal = sqlite3_step(stmt);
            if (retVal != SQLITE_DONE) {
                printf("Commit Failed! %d\n", retVal);
            }
            sqlite3_reset(stmt);
        }
        sqlite3_exec(db, "COMMIT TRANSACTION", NULL, NULL, &errorMessage);
        sqlite3_finalize(stmt);
    } else {
        fprintf(stderr, "SQL error: %s\n", sqlite3_errmsg(db));
    }
    sqlite3_mutex_leave(sqlite3_db_mutex(db));
}
How to insert 40000 records fast into an sqlite database in an iPad
There are three things that you need to do in order to speed up the insertions:
- Move the call of sqlite3_open outside the loop. (Currently, the loop is not shown, so I assume it is outside your code snippet.)
- Add BEGIN TRANSACTION and COMMIT TRANSACTION calls - begin the transaction before the insertion loop, and commit it right after the loop is over.
- Make formatStringQueryInsertWithTable truly parameterized - currently it appears that you are not using prepared statements to their fullest, because despite using sqlite3_prepare_v2, you have no calls of sqlite3_bind_XYZ in your code.
Here is a nice post that shows you how to do all of the above. It is plain C, but it will work fine as part of an Objective-C program.
char *errorMessage;
sqlite3_exec(mDb, "BEGIN TRANSACTION", NULL, NULL, &errorMessage);
char buffer[] = "INSERT INTO example VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)";
sqlite3_stmt *stmt;
sqlite3_prepare_v2(mDb, buffer, strlen(buffer), &stmt, NULL);
for (unsigned i = 0; i < mVal; i++) {
    std::string id = getID();
    sqlite3_bind_text(stmt, 1, id.c_str(), id.size(), SQLITE_STATIC);
    sqlite3_bind_double(stmt, 2, getDouble());
    sqlite3_bind_double(stmt, 3, getDouble());
    sqlite3_bind_double(stmt, 4, getDouble());
    sqlite3_bind_int(stmt, 5, getInt());
    sqlite3_bind_int(stmt, 6, getInt());
    sqlite3_bind_int(stmt, 7, getInt());
    if (sqlite3_step(stmt) != SQLITE_DONE) {
        printf("Commit Failed!\n");
    }
    sqlite3_reset(stmt);
}
sqlite3_exec(mDb, "COMMIT TRANSACTION", NULL, NULL, &errorMessage);
sqlite3_finalize(stmt);