How to Compare Two SQLite Databases on Linux

How to compare two SQLite databases on Linux

Please have a look at the SQLite Release 3.8.10 which was released on May 7, 2015. This release for the first time contains the sqldiff.exe utility program for computing the differences between two SQLite database files. Most likely this program will also be part of future releases.

The sqldiff.exe command-line line tool should work for all supported operating systems and offers several switches for altering its output behavior. Example usage:

sqldiff [options] database1.sqlite database2.sqlite

If no options are specified, then the output of sqldiff.exe is SQL statements that will transform database1.sqlite (the "source" database) into database2.sqlite (the "destination" database).

However, there are also certain limitations. For example, the sqldiff.exe utility (at least currently) does not display differences in TRIGGERs, VIEWs, or virtual tables.


Sample command and output

I took a simple key-value store database (db1.sqlite) and made
a copy of it (db2.sqlite). I then inserted one key-value pair into
db2.sqlite. After that I ran the following command:

sqldiff db1.sqlite db2.sqlite

and got the following output:

INSERT INTO my_table(rowid,"key",value) VALUES(1,'D:\Test\Test.txt',x'aabbccdd');

UPDATE my_table_size SET counter=1 WHERE rowid=1;

The table my_table_size was automatically updated by a TRIGGER after the key-value pair was inserted to my_table. I then ran sqldiff.exe again, but this time with with db2.sqlite as first argument and db1.sqlite as second argument:

sqldiff db2.sqlite db1.sqlite

and got the following output:

DELETE FROM my_table WHERE rowid=1;

UPDATE my_table_size SET counter=0 WHERE rowid=1;


sqldiff download links

Since SQLite version 3.10.2 which was released on January 20, 2016, the 32-bit binaries for sqldiff can be directly downloaded from the SQLite Download Page. They can be found in the sqlite tools archives for the corresponding operating systems (see the Precompiled Binaries sections). For example, here are the links to the sqlite tools archives of version 3.39.3:

  • SQLite tools for Linux
  • SQLite tools for OS X
  • SQLite tools for Windows

For SQLite versions prior to version 3.10.2, the SQLite website hosts 32-bit binaries for sqldiff, but does not link to them. Here are the links to sqldiff of version 3.8.10:

  • sqldiff for Linux
  • sqldiff for OS X
  • sqldiff for Windows

If you need 64-bit binaries, then you have to download the raw sources and compile them by yourself. (The file sqldiff.c is located in the tool sub-directory of the archive containing the sources.)

Comparing 2 SQLite databases

You might try dumping each SQLite database, using SQLite's .dump command:

$ sqlite3 /path/to/database1 .dump > database1.sql
$ sqlite3 /path/to/database2 .dump > database2.sql

And then comparing the generated files. If your two databases are quite similar, a simple diff might work to show any differences, e.g.:

$ diff -u database1.sql database2.sql

Hope this helps!

Comparing two sqlite databases using Python

If both databases are opened in the same connection (which requires ATTACH), you can do the comparison in SQL:

import sqlite3

db1 = r"C:\Users\X\Documents\sqlitedb\db1.db"
db2 = r"C:\Users\X\Documents\sqlitedb\db2.db"

conn = sqlite3.connect(db1)
conn.execute("ATTACH ? AS db2", [db2])

res1 = conn.execute("""SELECT * FROM main.fdetail
WHERE keyid NOT IN
(SELECT keyid FROM db2.fdetail)
""").fetchall()
res2 = conn.execute("""SELECT * FROM db2.fdetail
WHERE keyid NOT IN
(SELECT keyid FROM main.fdetail)
""").fetchall()

You can also get a single result by combining the queries with UNION ALL.

How to compare two .db files format in android?

I believe that you could utilise the following which takes two Files and compares the schema (sqlite_master) between the two based upon a query which makes a union between the two groups by the name, type and sql columns (if they are the same then there would be 2 per group) and outputs those for which there are NOT 2 rows (i.e. a mismatch). If the returned Cursor has no rows then the Schema match.

public class CompareDBSchemas {

public static boolean areDBSchemasEqual(File db1File, File db2File, boolean compareSizes) {
boolean rv = true;
if (!(db1File.exists() && db2File.exists())) return false;
if (compareSizes) {
if (db1File.length() != db2File.length()) return false;
}
SQLiteDatabase db1 = SQLiteDatabase.openDatabase(db1File.getPath(),null,SQLiteDatabase.OPEN_READWRITE);
db1.execSQL("ATTACH '" +
db2File.getPath() + "' AS other");

/*
WITH cte AS (SELECT * FROM main.sqlite_master UNION ALL SELECT * FROM other.sqlite_master)
SELECT * FROM cte GROUP BY type,name,sql HAVING count() <> 2
*/
Cursor csr = db1.rawQuery("WITH cte AS (" +
"SELECT * FROM main.sqlite_master UNION ALL SELECT * FROM other.sqlite_master" +
") " +
"SELECT * FROM cte GROUP BY type,name,sql HAVING count() <> 2 ",
null
);
if (csr.getCount() > 0) {
rv = false;
}
csr.close();
db1.close();
return rv;
}

public static boolean areDBSchemasEqual(File db1File, File db2File) {
return areDBSchemasEqual(db1File,db2File,false);
}
}
  • Note that the above has the option to detect database size differences, which may detect changed data rather than just a schema change.

Example Usage

The following demonstrates the above being used where 3 databases are created the first and second having the same schema, the third being different (extra column added).

DatabseHlpr.java

public class DatabaseHlpr extends SQLiteOpenHelper {

public static final String TABLE_NAME = "mytable";
public static final String MYTABLE_COL_ID = BaseColumns._ID;
public static final String MYTABLE_COL_COMMON = "common";
public static final String MYTABLE_COL_SCHEMA1ONLY = "schema1only";

private int mSchema;
private SQLiteDatabase mDB;


public DatabaseHlpr(@Nullable Context context, @Nullable String name, int schema) {
super(context, name, null,1);
this.mSchema = schema;
mDB = this.getWritableDatabase();
}

@Override
public void onCreate(SQLiteDatabase db) {
switch (mSchema) {
case 1:
useSchema1(db);
break;
default:
useSchema0(db);
}
}

private void useSchema1(SQLiteDatabase db) {
db.execSQL("CREATE TABLE IF NOT EXISTS " + TABLE_NAME + "(" +
MYTABLE_COL_ID + " INTEGER PRIMARY KEY, " +
MYTABLE_COL_COMMON + " TEXT, " +
MYTABLE_COL_SCHEMA1ONLY + " TEXT " +
")");
}

private void useSchema0(SQLiteDatabase db) {
db.execSQL("CREATE TABLE IF NOT EXISTS " + TABLE_NAME + "(" +
MYTABLE_COL_ID + " INTEGER PRIMARY KEY, " +
MYTABLE_COL_COMMON + " TEXT" +
")");
}

@Override
public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {

}
}

MainActivity.java

public class MainActivity extends AppCompatActivity {

static final String DB1NAME = "db1", DB2NAME = "db2", DB3NAME = "db3";
DatabaseHlpr mDB1Hlpr, mDB2Hlpr, mDB3Hlpr;
File db1File, db2File,db3File;

@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
mDB1Hlpr = new DatabaseHlpr(this,DB1NAME,0);
mDB2Hlpr = new DatabaseHlpr(this,DB2NAME,0);
mDB3Hlpr = new DatabaseHlpr(this,DB3NAME,1);
mDB1Hlpr.close();
mDB2Hlpr.close();
mDB3Hlpr.close();
db1File = new File(this.getDatabasePath(DB1NAME).getPath());
db2File = new File(this.getDatabasePath(DB2NAME).getPath());
db3File = new File(this.getDatabasePath(DB3NAME).getPath());
String result = " the same ";
if (!CompareDBSchemas.areDBSchemasEqual(db1File,db2File,false)) {
result = " NOT the same ";
}
Log.d("RESULTINFO",
"Database Schemas are " + result +
" for " +
"\n\t" +db1File.getPath() +
"\n and \n\t" + db2File.getPath()
);

result = " the same ";
if (!CompareDBSchemas.areDBSchemasEqual(db1File,db3File)) {
result = " NOT the same ";
}

Log.d("RESULTINFO",
"Database Schemas are " + result +
" for " +
"\n\t" +db1File.getPath() +
"\n and \n\t" + db3File.getPath()
);
}
}
  • Note that the above has been written to demonstrate the comparison process and thus File retrieval is simple and convenient. It is not intended to reflect how the files would be retrieved.

Result :-

2019-10-27 07:19:23.688 28976-28976/aso.so58566618dbcompareschema D/RESULTINFO: Database Schemas are  the same  for 
/data/user/0/aso.so58566618dbcompareschema/databases/db1
and
/data/user/0/aso.so58566618dbcompareschema/databases/db2
2019-10-27 07:19:23.693 28976-28976/aso.so58566618dbcompareschema D/RESULTINFO: Database Schemas are NOT the same for
/data/user/0/aso.so58566618dbcompareschema/databases/db1
and
/data/user/0/aso.so58566618dbcompareschema/databases/db3

ADDITIONAL

If you wanted the ability to check row counts (data inserted or deleted) and the actual data (data changed by update) then the following adaption of the CompareDBSchemas class could be used :-

public class CompareDBSchemas {

private static final String
SQLITE_MASTER = "sqlite_master",
SQLITE_MASTER_TYPE_COLUMN = "type",
SQLITE_MASTER_NAME_COLUMN = "name",
SQLITE_MASTER_SQL_COLUMN = "sql",
SQLITE_MASTER_TABLE_TYPE = "table",
SQLITE_SYSTEMTABLES = "sqlite_",
ANDROID_METADATA = "android_metadata",
CTE_NAME = "cte", MAIN_SCHEMA = "main", OTHER_SCHEMA = "other"
;

public static boolean areDBSchemasEqual(File db1File, File db2File, boolean compareSizes, boolean compareRowCounts, boolean compareData) {
boolean rv = true;
if (!(db1File.exists() && db2File.exists())) return false;
if (compareSizes) {
if (db1File.length() != db2File.length()) return false;
}
SQLiteDatabase db1 = SQLiteDatabase.openDatabase(db1File.getPath(),null,SQLiteDatabase.OPEN_READWRITE);
db1.beginTransaction();
db1.execSQL("ATTACH '" +
db2File.getPath() + "' AS " + OTHER_SCHEMA);

/*
WITH cte AS (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_master)
SELECT * FROM cte GROUP BY type,name,sql HAVING count() <> 2
*/
Cursor csr = db1.rawQuery("WITH " + CTE_NAME + " AS (" +
"SELECT * FROM " + MAIN_SCHEMA + "." + SQLITE_MASTER +
" UNION ALL " +
"SELECT * FROM " + OTHER_SCHEMA + "." + SQLITE_MASTER +
") " +
"SELECT * FROM " + CTE_NAME +
" GROUP BY " +
SQLITE_MASTER_TYPE_COLUMN + "," +
SQLITE_MASTER_NAME_COLUMN + "," +
SQLITE_MASTER_SQL_COLUMN +
" HAVING count() <> 2 ",
null
);
if (csr.getCount() > 0) {
rv = false;
}
if (compareRowCounts && rv) {
csr = db1.rawQuery("SELECT * FROM main." + SQLITE_MASTER +
" WHERE " + SQLITE_MASTER_TYPE_COLUMN +
" = '" + SQLITE_MASTER_TABLE_TYPE +
"' AND (" + SQLITE_MASTER_NAME_COLUMN +
" NOT LIKE '" + SQLITE_SYSTEMTABLES +
"%' AND " + SQLITE_MASTER_NAME_COLUMN +
" <> '" + ANDROID_METADATA + "')",null);
while(csr.moveToNext()) {
if (
DatabaseUtils.queryNumEntries(db1,MAIN_SCHEMA +"." + csr.getString(csr.getColumnIndex(SQLITE_MASTER_NAME_COLUMN))) ==
DatabaseUtils.queryNumEntries(db1,OTHER_SCHEMA + "." + csr.getString(csr.getColumnIndex(SQLITE_MASTER_NAME_COLUMN)))
) continue;
rv = false;
break;
}
}
if (compareData && rv) {
csr.moveToPosition(-1);
while (csr.moveToNext()) {
if (
isTableDataTheSame(db1,csr.getString(csr.getColumnIndex(SQLITE_MASTER_NAME_COLUMN)))
) continue;
rv = false;
break;
}
}
db1.endTransaction();
csr.close();
db1.close();
return rv;
}

private static boolean isTableDataTheSame(SQLiteDatabase db, String table) {
boolean rv = true;
Cursor csr = db.rawQuery("PRAGMA table_info("+ table +")",null);
StringBuilder columnConcat = new StringBuilder();
while (csr.moveToNext()) {
if (columnConcat.length() > 0) columnConcat.append("||");
columnConcat.append(csr.getString(csr.getColumnIndex(SQLITE_MASTER_NAME_COLUMN)));
}
csr = db.rawQuery("WITH " + CTE_NAME +
" AS (" +
" SELECT " + columnConcat.toString() +
" AS comparison FROM " + MAIN_SCHEMA + "." + table +
" UNION ALL SELECT " + columnConcat.toString() + " FROM " + OTHER_SCHEMA + "." + table +
") SELECT * FROM " + CTE_NAME +
" GROUP BY comparison HAVING count() <> 2",
null
);
if (csr.getCount() > 0) {
rv = false;
}
csr.close();
return rv;
}

public static boolean areDBSchemasEqual(File db1File, File db2File) {
return areDBSchemasEqual(db1File,db2File,false,false,false);
}
}

sqlite database comparison

This is normal. A database can be fragmented, have gaps where rows were deleted which will affect the exact bytes stored on disc but not what a database client sees. This will depend not just on the current state of the database but also it's history: what is the sequence of inserts, updates, deletes etc that lead to the current state.

Operations like VACUUM don't change the stored data but do do things like reclaim empty space and so on, which would change the md5 sums.

Compare Two Relations in SQL


because the result of a subquery is a relation

Relation is the scientific name for what we call a table in a database and I like the name "table" much better than "relation". A table is easy to imagine. We know them from our school time schedule for instance. Yes, we relate things here inside a table (day and time and the subject taught in school), but we can also relate tables to tables (pupils' timetables with the table of class rooms, the overall subject schedule, and the teacher's timetables). As such, tables in an RDBMS are also related to each other (hence the name relational database management system). I find the name relation for a table quite confusing (and many people use the word "relation" to describe the relations between tables instead).

So, yes, a query result itself is again a table ("relation"). And from tables we can of course select:

select * from (select * from b) as subq;

And then there are scalar queries that return exactly one row and one column. select count(*) from b is such a query. While this is still a table we can select from

select * from (select count(*) as cnt from b) as subq;

we can even use them where we usually have single values, e.g. in the select clause:

select a.*, (select count(*) from b) as cnt from a;

In your query you have two scalar subqueries in your where clause.

With subqueries there is another distinction to make: we have correlated and non-correlated subqueries. The last query I have just shown contains a non-correlated subquery. It selects the count of b rows for every single result row, no matter what that row contains elsewise. A correlated subquery on the other hand may look like this:

select a.*, (select count(*) from b where b.x = a.y) as cnt from a;

Here, the subquery is related to the main table. For every result row we look up the count of b rows matching the a row we are displaying via where b.x = a.y, so the count is different from row to row (but we'd get the same count for a rows sharing the same y value).

Your subqueries are also correlated. As with the select clause, the where clause deals with one row at a time (in order to keep or dismiss it). So we look at one student S1 at a time. For this student we count other students (S2, where S2.sID <> S1.sID) who have the same GPA (and S2.GPA = S1.GPA) and count other students who have the same sizeHS. We only keep students (S1) where there are exactly as many other students with the same GPA as there are with the same sizeHS.


UPDATE

As do dealing with multiple tuples as in

select *
from Student S1
where (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) = (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
);

this is possible in some DBMS, but not in SQL Server. SQL Server doesn't know tuples.

But there are other means to achieve the same. You could just add two subqueries:

select * from student s1
where (...) = (...) -- compare counts here
and (...) = (...) -- compare averages here

Or get the data in the FROM clause and then deal with it. E.g.:

select *
from Student S1
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) sx
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
) sy
where sx.cnt = sy.cnt and sx.avg_grade = sy.avg_grade;


Related Topics



Leave a reply



Submit