How to Analyse a SQLite Query Execution

How can I analyse a Sqlite query execution?

I know of no pretty graphical tools, but all of the information you seek is available from the EXPLAIN keyword.

Consider this database:

sqlite> create table users (name, email);
sqlite> create index user_names on users (name);

A query predicated on email will not use an index:

sqlite> explain select * from users where email='foo';


























































































































































































addropcodep1p2p3p4p5comment
0Trace00000
1String8010foo00
2Goto013000
3OpenRead020200
4Rewind011000
5Column01200
6Ne1102collseq(BINARY)6a
7Column00400
8Column01500
9ResultRow42000
10Next05001
11Close00000
12Halt00000
13Transaction00000
14VerifyCookie05000
15TableLock020users00
16Goto03000

How to understand SQLite `EXPLAIN QUERY PLAN` result?

The line EXECUTE CORRELATED SCALAR SUBQUERY 1 is just there because you're using a different query syntax.

The actual execution is the same in both cases: SQLite goes through all records of the Queue table, and tries to look up the corresponding record in the LastQueue table.

That the estimated record counts are different is of no concern to you, because you know that the actual number of records is the same.

Joins might be better than subqueries in other databases, but in SQLite, which uses only nested loop joins, the only difference is that a join allows SQLite to choose the outer and the inner table in the join.

In any case, you should write the query in the most simple and maintainable way, and optimize it only if you have measured that you get a noticeable and necessary improvement.


Please note that instead of:

WHERE (SELECT ...) IS NOT NULL

it would be more idiomatic to write:

WHERE EXISTS (SELECT ...)

What's the number in SQLite query plan mean?

As the official documentation says, this is the number of rows that the database estimates will be returned.

If there is an index on a seached column, and if you have run ANALYZE, then SQLite can make an estimate based on the actual data. Otherwise, it assumes that tables contain one million rows, and that a search like column > x filters out half the rows.

SQLite: Downsides of ANALYZE

There is another downside. The ANALYZE results may cause the query planner to ignore indexes that you really want to use.

For example suppose you have a table with a boolean column "isSpecial". Most of the rows have isSpecial = 0 but there are a few with isSpecial = 1.

When you do a query SELECT * FROM MyTable WHERE isSpecial = 1, in the absence of ANALYZE data the query planner will assume the index on isSpecial is good and will use it. In this case it will happen to be right. If you were to do isSpecial = 0 then it would still use the index, which would be inefficient, so don't do that.

After you have run ANALYZE, the query planner will know that isSpecial has only two values, so the selectivity of the index is bad. So it won't use it, even in the isSpecial = 1 case above. For it to know that the isSpecial values are very unevenly distributed it would need data that it only gathers when compiled with the SQLITE_ENABLE_STAT4 option. That option is not enabled by default and it has a big downside of its own: it makes the query plan for a prepared statement depend on its bound values, so sqlite will re-prepare the statement much more often. (Possibly every time it's executed, I don't know the details)

tl;dr: running ANALYZE makes it almost impossible to use indexes on boolean fields, even when you know they would be helpful.

Sqlite analyze big table

There are a few things that would help but first of all I would add an index to your database:

create index your_index_name_here on your_table_name_here (your_column_name_here);

I just created a database with 3 columns id, name, number like this with 1 million lines (you have 600+ million lines):

1   R7N47UCYNE  363233
2 BFDXTROHVZ 273710
3 AVN6B8RCS7 949561
4 OHWTUCWDQW 863241
5 7F0B7UDJI4 791392
6 CV5TGBAQBJ 919640
7 PP8UF35C8J 897746
8 CN2J9ZAO63 671596
9 ZPM338I8QN 601521
10 12BFX9LXGE 845189
11 2NFP6DG299 858336
12 T5GH1V609M 692232

Searching for 10 names within my database before the index looked like this:

# (1999841, 'K6IWJK0MNP', 500224)
# (1715030, 'RY0N5VO9UM', 148000)
# (1999694, '474R4EMI3T', 705184)
# (1999539, 'L2YYGSLMUH', 943698)
# (1999425, 'UT352CDTQH', 828853)
# (1714958, 'KKAQ36DZGV', 279517)
# (1714905, 'BZ9CNWL5LS', 320070)
# (1714877, '6X7WCBBIZF', 247905)
# (1714839, '4MSV61HY9I', 263813)
# (1714806, '7NSRANTWEP', 476828)
# Finished in 1.3834 seconds

After the index was created:

# (1999841, 'K6IWJK0MNP', 500224)
# (1715030, 'RY0N5VO9UM', 148000)
# (1999694, '474R4EMI3T', 705184)
# (1999539, 'L2YYGSLMUH', 943698)
# (1999425, 'UT352CDTQH', 828853)
# (1714958, 'KKAQ36DZGV', 279517)
# (1714905, 'BZ9CNWL5LS', 320070)
# (1714877, '6X7WCBBIZF', 247905)
# (1714839, '4MSV61HY9I', 263813)
# (1714806, '7NSRANTWEP', 476828)
# Finished in 0.0006 seconds

Your can also set up a multiple cursors to do the second search rather than do a count/range. Your first query could be:

select_all = "SELECT * FROM your_table"
rows = cursor.execute(select_all).fetchall()
for row in rows:
value = row[0]
.....etc etc

Then use the results to generate the a second query:

allValue = cursor2.execute(f"SELECT * FROM 'T' WHERE value = '{value}'").fetchall()

Finally SQLite might not be the best database to work with for this volume of rows. You could use a more performant database and dump the SQLite one to something like MariaDB:

sqlite3 sample.db .dump > dump.sql

Then into your new database with a command like:

mysql -p -u root -h 127.0.0.1 newdb < dump.sql

Determining execution time of queries in SQLite

From within the sqlite3 command-line program you can do:

.timer ON
select * from my_table;

This will print the CPU time taken for the query.

Query plan For Sqlite

SQLite supports the EXPLAIN keyword, like MySQL, but syntax is slightly different:

EXPLAIN QUERY PLAN SELECT col FROM YOUR_TABLE

Is there a tool to profile sqlite queries?

You have a mix of questions in here. To view what queries are run and how long each takes, you'll need to either modify sqlite3.dll if an application is linking to that or if it's your own application you can write it into your code easier (we do this and long all queries, transactions, timings, etc.).

For individual query analysis, you can use EXPLAIN. It won't tell you timing of individual steps within a query but it will tel you how the query was executed.

http://www.sqlite.org/lang_explain.html

An SQL statement can be preceded by the keyword "EXPLAIN" or by the phrase "EXPLAIN QUERY PLAN". Either modification causes the SQL statement to behave as a query and to return information about how the SQL statement would have operated if the EXPLAIN keyword or phrase had been omitted.

When the EXPLAIN keyword appears by itself it causes the statement to behave as a query that returns the sequence of virtual machine instructions it would have used to execute the command had the EXPLAIN keyword not been present. When the EXPLAIN QUERY PLAN phrase appears, the statement returns high-level information about what indices would have been used.

The output from EXPLAIN and EXPLAIN QUERY PLAN is intended for interactive analysis and troubleshooting only. The details of the output format are subject to change from one release of SQLite to the next. Applications should not use EXPLAIN or EXPLAIN QUERY PLAN since their exact behavior is undocumented, unspecified, and variable.



Related Topics



Leave a reply



Submit