SQLite and Range Query

SQLite and range query

I'd try something like

SELECT PersonName FROM Persons WHERE (Age BETWEEN 20 AND 45) AND (Weight BETWEEN 50 AND 80)

Indexing the Age and Weight columns would help speed up the query.

Here's a nice overview about indexing in SQLite: http://www.tutorialspoint.com/sqlite/sqlite_indexes.htm
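As a rough illustration of the suggestion above, here is a minimal sketch using Python's sqlite3 module. The sample rows and the index names (idx_persons_age, idx_persons_weight) are made up for the example; only the table, the columns and the query come from the answer. EXPLAIN QUERY PLAN is printed so you can check which index, if any, the planner picks on your own data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Persons(PersonName TEXT, Age INTEGER, Weight INTEGER)")
# Hypothetical sample data, for illustration only
db.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [("Ann", 25, 60), ("Bob", 50, 70), ("Cid", 30, 95), ("Dee", 45, 80)],
)

# One index per filtered column (index names are assumptions)
db.execute("CREATE INDEX idx_persons_age ON Persons(Age)")
db.execute("CREATE INDEX idx_persons_weight ON Persons(Weight)")

query = """SELECT PersonName FROM Persons
           WHERE (Age BETWEEN 20 AND 45) AND (Weight BETWEEN 50 AND 80)
           ORDER BY PersonName"""

# BETWEEN is inclusive on both ends, so Dee (45, 80) is returned
names = [row[0] for row in db.execute(query)]
print(names)

# Show the plan the query planner chose for this query
for row in db.execute("EXPLAIN QUERY PLAN " + query):
    print(row)
```

On a table this small the index makes no measurable difference; the plan output is mainly useful on realistic data volumes.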

Search for a range of characters in SQLite

For this you need the operator GLOB which:

uses the Unix file globbing syntax for its wildcards

  • SELECT * FROM t WHERE f GLOB '*[abcde]*'; -- match entries which contain at least one character from 'a' to 'e' inclusive
  • SELECT * FROM t WHERE f GLOB '*[a-e]*'; -- same query as above
  • SELECT * FROM t WHERE f GLOB '*[^x]*'; -- match entries which contain at least one character other than 'x' (to exclude every entry containing an 'x', use f NOT GLOB '*x*')

There is also the ? wildcard, which matches exactly one character. Note that GLOB is case-sensitive.
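To see how these patterns behave, here is a small sketch with Python's sqlite3 module. The table t and column f come from the queries above; the sample values and the helper function match() are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(f TEXT)")
# Hypothetical sample values
db.executemany("INSERT INTO t VALUES (?)", [("apple",), ("xxx",), ("sky",), ("fox",)])

def match(where):
    """Return the sorted values of f that satisfy the given WHERE clause."""
    sql = "SELECT f FROM t WHERE " + where + " ORDER BY f"
    return [row[0] for row in db.execute(sql)]

has_a_to_e = match("f GLOB '*[a-e]*'")   # at least one character in a..e
has_non_x = match("f GLOB '*[^x]*'")     # at least one character that is not 'x'
no_x = match("f NOT GLOB '*x*'")         # no 'x' anywhere in the value
three_chars = match("f GLOB '???'")      # exactly three characters

print(has_a_to_e, has_non_x, no_x, three_chars)
```

Note the difference between the second and third patterns: 'fox' contains an 'x' but still matches '*[^x]*', because it also contains characters other than 'x'.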

How to write A Range Query in Sqlite Android?

This should work. It returns a Cursor over all the matching rows from the table. (Note the space before ORDER BY: without it the number and the keyword run together into an invalid token.)

db.rawQuery("SELECT Description FROM Table_Name WHERE Num BETWEEN "+(inputNumber-Range)+" AND "+(inputNumber+Range) +" ORDER BY (Num- "+inputNumber+")", null);

Edit: I am glad the above query works for you. But I think this query is the correct answer to the question you specifically asked:

db.rawQuery("SELECT Description FROM Table_Name WHERE Num BETWEEN "+(inputNumber-Range)+" AND "+(inputNumber+Range) +" ORDER BY ABS(Num- "+inputNumber+")", null);

Taking the absolute value avoids negative differences, so the rows are sorted by their distance from inputNumber, closest first.
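The same query can be tried outside Android. The sketch below uses Python's sqlite3 module rather than the Android API, and binds the values as parameters instead of concatenating them into the SQL string. The sample rows and the function name nearest_in_range are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Table_Name(Num INTEGER, Description TEXT)")
# Hypothetical sample data
db.executemany(
    "INSERT INTO Table_Name VALUES (?, ?)",
    [(1, "far below"), (8, "just below"), (10, "exact"),
     (13, "above"), (30, "far above")],
)

def nearest_in_range(input_number, rng):
    """Descriptions with Num in [input_number - rng, input_number + rng],
    ordered by distance from input_number (closest first)."""
    sql = """SELECT Description FROM Table_Name
             WHERE Num BETWEEN ? AND ?
             ORDER BY ABS(Num - ?)"""
    params = (input_number - rng, input_number + rng, input_number)
    return [row[0] for row in db.execute(sql, params)]

result = nearest_in_range(10, 5)
print(result)
```

Binding parameters avoids both the missing-space kind of bug and SQL injection when the values come from user input.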

How to use time-series with Sqlite, with fast time-range queries?

First solution

The method (2) detailed in the question seems to work well. In a benchmark, I obtained:

  • naive method, without index: 18 MB database, 86 ms query time
  • naive method, with index: 32 MB database, 12 ms query time
  • method (2): 18 MB database, 12 ms query time

The key point here is to use dt as an INTEGER PRIMARY KEY, so that it is the row id itself (see also Is an index needed for a primary key in SQLite?), stored in the table's B-tree, with no separate hidden rowid column. We thus avoid an extra index that would map dt => rowid: here dt is the row id.

We also use AUTOINCREMENT, which internally creates a sqlite_sequence table that keeps track of the last added ID. This is useful when inserting: two events can have the same timestamp in seconds (this is possible even with millisecond or microsecond timestamps, since the OS could truncate the precision), so we use the maximum of timestamp*10000 and last_added_ID + 1 to make sure the key is unique:

 MAX(?, (SELECT seq FROM sqlite_sequence) + 1)

Code:

import sqlite3, random, time

db = sqlite3.connect('test.db')
# dt is the INTEGER PRIMARY KEY, so it is the rowid itself and needs no extra index
db.execute("CREATE TABLE data(dt INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT);")

t = 1600000000
for i in range(1000*1000):
    if random.randint(0, 100) == 0:  # timestamp advances by 1 second with probability ~1%
        t += 1
    # COALESCE covers the very first insert, when sqlite_sequence has no row yet
    # (otherwise MAX(?, NULL) would be NULL and the first dt would be auto-assigned)
    db.execute("INSERT INTO data(dt, label) VALUES (MAX(?, COALESCE((SELECT seq FROM sqlite_sequence), 0) + 1), 'hello');", (t*10000, ))
db.commit()

# t will range over a window of ~10,000 seconds
t1, t2 = 1600005000*10000, 1600005100*10000  # time range of width 100 seconds (i.e. 1%)
start = time.time()
for _ in db.execute("SELECT 1 FROM data WHERE dt BETWEEN ? AND ?", (t1, t2)):
    pass
print(time.time()-start)


Using a WITHOUT ROWID table

Here is another method, using WITHOUT ROWID, which gives an 8 ms query time. We have to implement an auto-incrementing id ourselves, since AUTOINCREMENT is not available when using WITHOUT ROWID.

WITHOUT ROWID is useful when we want to use a PRIMARY KEY(dt, another_column1, another_column2, id) and avoid having an extra rowid column. Instead of having one B-tree for rowid and one B-tree for (dt, another_column1, ...), we'll have just one.

import sqlite3, random, time

# Assumption: a separate database file, so that this `data` table
# does not clash with the one created in the first solution
db = sqlite3.connect('test_without_rowid.db')
db.executescript("""
CREATE TABLE autoinc(num INTEGER); INSERT INTO autoinc(num) VALUES(0);

CREATE TABLE data(dt INTEGER, id INTEGER, label TEXT, PRIMARY KEY(dt, id)) WITHOUT ROWID;

CREATE TRIGGER insert_trigger BEFORE INSERT ON data BEGIN UPDATE autoinc SET num=num+1; END;
""")

t = 1600000000
for i in range(1000*1000):
    if random.randint(0, 100) == 0:  # timestamp advances by 1 second with probability ~1%
        t += 1
    db.execute("INSERT INTO data(dt, id, label) VALUES (?, (SELECT num FROM autoinc), ?);", (t, 'hello'))
db.commit()

# t will range over a window of ~10,000 seconds
t1, t2 = 1600005000, 1600005100  # time range of width 100 seconds (i.e. 1%)
start = time.time()
for _ in db.execute("SELECT 1 FROM data WHERE dt BETWEEN ? AND ?", (t1, t2)):
    pass
print(time.time()-start)


Roughly-sorted UUID

More generally, the problem is linked to having IDs that are "roughly-sorted" by datetime. More about this:

  • ULID (Universally Unique Lexicographically Sortable Identifier)
  • Snowflake
  • MongoDB ObjectId

All these methods use an ID which is:

[---- timestamp ----][---- random and/or incremental ----]
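The layout above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the actual bit layout of ULID, Snowflake or ObjectId: the 20-bit random suffix (RANDOM_BITS) and the function names make_id and id_range are assumptions chosen for the example. With a millisecond timestamp and 20 random bits, the result still fits in SQLite's signed 64-bit INTEGER.

```python
import secrets
import time

RANDOM_BITS = 20  # hypothetical width of the random suffix

def make_id(timestamp_ms=None):
    """Build an integer ID: [timestamp in ms][RANDOM_BITS random bits]."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    return (timestamp_ms << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)

def id_range(t1_ms, t2_ms):
    """Inclusive ID bounds covering every ID created in [t1_ms, t2_ms]."""
    low = t1_ms << RANDOM_BITS
    high = (t2_ms << RANDOM_BITS) | ((1 << RANDOM_BITS) - 1)
    return low, high

# IDs from a later millisecond always sort after IDs from an earlier one
a = make_id(1600000000000)
b = make_id(1600000000001)
low, high = id_range(1600000000000, 1600000000000)
print(a < b, low <= a <= high)
```

A time-range query then becomes a plain range query on the primary key, for example WHERE id BETWEEN low AND high. Two IDs generated in the same millisecond can still collide if the random parts happen to be equal, which is why the schemes listed above add a counter or a machine identifier.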

Does SQLite support character range with [ ]?

SQLite's LIKE operator does not support the SQL Server-style [ ] character ranges that you want.

You can do it with SUBSTR():

WHERE SUBSTR(LastName, 1, 1) BETWEEN 'B' AND 'L'

or:

WHERE LastName >= 'B' AND LastName < 'M'
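A quick way to check that the two conditions select the same rows is the following sketch with Python's sqlite3 module. The table name People and the sample names are assumptions; only the LastName column and the two WHERE clauses come from the answer. A GLOB variant is included for comparison, since GLOB does accept a [ ] range.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE People(LastName TEXT)")
# Hypothetical sample names; 'baker' shows that the comparison is case-sensitive
db.executemany(
    "INSERT INTO People VALUES (?)",
    [("Adams",), ("Baker",), ("Lopez",), ("Miller",), ("baker",)],
)

def select(where):
    """Return the sorted last names that satisfy the given WHERE clause."""
    sql = "SELECT LastName FROM People WHERE " + where + " ORDER BY LastName"
    return [row[0] for row in db.execute(sql)]

by_substr = select("SUBSTR(LastName, 1, 1) BETWEEN 'B' AND 'L'")
by_compare = select("LastName >= 'B' AND LastName < 'M'")
by_glob = select("LastName GLOB '[B-L]*'")

print(by_substr, by_compare, by_glob)
```

With the default BINARY collation all three are case-sensitive, so 'baker' is not returned. The plain comparison form (>= and <) is the one that can use an index on LastName directly.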

SQLite query GROUP BY range

Yes, it works.

I put some numbers into Excel with your formula below, and it works for me. Your gap value will be returned as the top end of each time range grouping.

SELECT time + (21600000 - (time%21600000)) as gap ...

Using the following instead:

SELECT time - (time%21600000) as gap_bottom ...

would return the bottom end of each time range grouping. You could add this as an additional calculated column and have both returned.

EDIT / PS:

You can also use the SQLite date formatting functions: divide your millisecond epoch time by 1000 to get seconds, then have SQLite interpret the result with the 'unixepoch' modifier:

strftime('%Y-%m-%d %H:%M:%S', datetime(1517418000000 / 1000, 'unixepoch') )

... for ...

SELECT strftime('%Y-%m-%d %H:%M:%S', datetime( (time + (21600000 - (time%21600000))) / 1000, 'unixepoch') ) as gap ...
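To check the bucketing on concrete numbers, here is a small sketch with Python's sqlite3 module. The table name events and the sample timestamps are assumptions; the 21600000 ms (6 hour) bucket width and the two formulas come from the answer.

```python
import sqlite3

SIX_HOURS_MS = 21600000  # bucket width used in the answer above

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events(time INTEGER)")
# Hypothetical epoch timestamps in milliseconds
db.executemany(
    "INSERT INTO events VALUES (?)",
    [(1517418000000,), (1517418000001,), (1517430000000,), (1517440000000,)],
)

# gap_bottom is the start of the 6-hour bucket, gap is its end;
# grouping by either one puts the same rows together
rows = db.execute(
    """SELECT time - (time % ?) AS gap_bottom,
              time + (? - (time % ?)) AS gap,
              COUNT(*) AS n
       FROM events
       GROUP BY gap_bottom
       ORDER BY gap_bottom""",
    (SIX_HOURS_MS, SIX_HOURS_MS, SIX_HOURS_MS),
).fetchall()

for gap_bottom, gap, n in rows:
    print(gap_bottom, gap, n)
```

Each row reports one bucket as [gap_bottom, gap), so gap minus gap_bottom is always the bucket width.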

