Difference Between Int Primary Key and Integer Primary Key SQLite

Difference between INT PRIMARY KEY and INTEGER PRIMARY KEY SQLite

Yes, there is a difference: INTEGER PRIMARY KEY is a special case in SQLite. The database does not create a separate index for the primary key, but makes the column an alias of the ROWID column instead. When you use INT (or any other type name that "maps" to INTEGER affinity internally), a separate unique index is created for the primary key.

That is why you see sqlite_autoindex created for the INT primary key, and no index created for the one of type INTEGER: SQLite reuses a built-in indexing structure for the integer primary key, rendering the autoindex unnecessary.

That is why the INTEGER primary key is more economical, both in terms of storage and in terms of performance.
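
The index behavior described above can be checked with a small sketch using Python's built-in sqlite3 module. The table name `t`, the column names, and the helper `auto_indexes` are made up for illustration; they are not from the original answer.

```python
import sqlite3


def auto_indexes(pk_type):
    """Create a table whose primary key has the given declared type and
    return the names of any indexes SQLite created for it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t (id {pk_type} PRIMARY KEY, v TEXT)")
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 't'"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


print(auto_indexes("INT"))      # ['sqlite_autoindex_t_1']
print(auto_indexes("INTEGER"))  # []
```

Only the INT variant gets an automatic index; the INTEGER variant relies on the table's own rowid b-tree.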

See this link for details.

In SQLITE, does specifying the integer type for Primary Keys matter considering that primary keys must have unique values?

The declared column type (INT, INTEGER, WHATEVER; you can specify virtually any column type) has little bearing: it is an indication of what is intended to be stored in the column. With one exception (to be discussed), it does not restrict the type of data that can be stored. In short, any type of data can be stored in any column, irrespective of the declared column type (bar the exception).

  • see 3.1 Determination of Column Affinity in the link below

SQLite does not differentiate between stored values other than by storage class (NULL, INTEGER, REAL, TEXT, BLOB). If a value is stored as an INTEGER, then it is an integer bound only by the limitation of being stored in at most 8 bytes (64-bit signed).

  • see 2. Storage Classes and Datatypes in the link below
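
As a rough illustration of the two points above (the declared type only sets an affinity, and each value carries its own storage class), here is a minimal sketch using Python's sqlite3 module. The table `demo` and column `n` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# TINYINT contains "INT", so the column gets INTEGER affinity.
conn.execute("CREATE TABLE demo (n TINYINT)")

values = ["abc", "123", 1.5, 2.0, b"\x01\x02", None, 2**40]
conn.executemany("INSERT INTO demo (n) VALUES (?)", [(v,) for v in values])

stored_types = [
    t for (t,) in conn.execute("SELECT typeof(n) FROM demo ORDER BY rowid")
]
for value, storage_class in zip(values, stored_types):
    print(repr(value), "->", storage_class)
```

The affinity converts values where it can do so losslessly ('123' and 2.0 end up as integers), but it rejects nothing: 'abc' stays text, and 2**40 is stored even though the column is declared TINYINT.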

The exception is a column declared specifically as INTEGER PRIMARY KEY, or an INTEGER column that is set as the primary key at the table level. The value stored MUST be an integer, otherwise a datatype mismatch error will occur.

  • as per "Any column in an SQLite version 3 database, except an INTEGER PRIMARY KEY column, may be used to store a value of any storage class." (also in 2. Storage Classes and Datatypes)

So ultimately my question is, will declaring the datatype of a primary key to be specifically one of TINYINT, SMALLINT, MEDIUMINT, BIGINT, UNSIGNED BIG INT, INT2, INT8 make any difference whatsoever?

Not with the listed types (TINYINT ....). As the type names all contain INT, they will have a type affinity of INTEGER, and the column will NOT be an alias of the rowid column.

If you included INTEGER in the list then YES it will make a difference as the column will then be an alias of the rowid column (i.e. it is INTEGER PRIMARY KEY). The column will also be restricted to being an integer value (the columns using the other listed types will not be restricted to integer values).
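
A minimal sketch of that difference, using Python's sqlite3 module. The table names `big_pk` and `int_pk` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_pk (id BIGINT PRIMARY KEY, v TEXT)")   # not a rowid alias
conn.execute("CREATE TABLE int_pk (id INTEGER PRIMARY KEY, v TEXT)")  # rowid alias

# BIGINT PRIMARY KEY: a non-integer value is accepted, and rowid is separate.
conn.execute("INSERT INTO big_pk VALUES ('not a number', 'x')")
big_row = conn.execute("SELECT id, typeof(id), rowid FROM big_pk").fetchone()
print(big_row)  # ('not a number', 'text', 1)

# INTEGER PRIMARY KEY: the same value is rejected.
try:
    conn.execute("INSERT INTO int_pk VALUES ('not a number', 'x')")
    int_error = None
except sqlite3.Error as exc:
    int_error = str(exc)
print(int_error)  # datatype mismatch

# When no id is supplied, the alias and the rowid are the same value.
conn.execute("INSERT INTO int_pk (v) VALUES ('y')")
alias_row = conn.execute("SELECT id, rowid FROM int_pk").fetchone()
print(alias_row)  # (1, 1)
```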

You may wish to refer to Datatypes in SQLite

The following SQL demonstrates some of the above:-

DROP TABLE IF EXISTS example;
CREATE TABLE IF NOT EXISTS example (
rowid_alias_must_be_unique_integer INTEGER PRIMARY KEY, -- INTEGER PRIMARY KEY makes the column an alias of the rowid
col_text TEXT,
col_integer INTEGER,
col_real REAL,
col_BLOB BLOB,
col_anyother this_is_a_stupid_column_type -- will have a type affinity of NUMERIC
);

/* INSERTS first row with a negative rowid */
INSERT INTO example VALUES (-100,'MY TEXT', 340000,34.5678,x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff',100);
/* All subsequent inserts use the generated rowid */
/* the same value is inserted into all the other columns */
INSERT INTO example (col_text,col_integer,col_real,col_blob,col_anyother) VALUES
('MY TEXT','MY TEXT','MY TEXT','MY TEXT','MY TEXT'),
(100,100,100,100,100),
(34.5678,34.5678,34.5678,34.5678,34.5678),
(x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff',x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff',x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff',x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff',x'f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff')
;

SELECT
*,
rowid,
typeof(rowid_alias_must_be_unique_integer),
typeof(col_text),
typeof(col_integer),
typeof(col_real),
typeof(col_blob),
typeof(col_anyother)
FROM example
;
/* WILL FAIL as rowid alias is not an integer */
INSERT INTO example VALUES('a','a','a','a','a','a');
DROP TABLE IF EXISTS example;

The result of the first SELECT will be :-

[Image: result set of the first SELECT]

  • Note that blobs are handled/displayed according to how the tool (Navicat for SQLite) handles the display of blobs.

The last INSERT fails because the value being inserted into the rowid alias is not an integer value e.g. :-

/* WILL FAIL as rowid alias is not an integer */
INSERT INTO example VALUES('a','a','a','a','a','a')
> datatype mismatch
> Time: 0s

  • Note that the answer has not dealt with the intricacies of how the column affinity may affect the extraction of data.

sqlite text as primary key vs autoincrement integers

SQLite does not automatically compress text. So the answer to your question is "no".

Should you use text or an auto-incrementing id as the primary key? This can be a complex question. But happily, the answer is that it doesn't make much difference. That said, there are some considerations:

  • Integers are of fixed length. In general, fixed-length keys are slightly more efficient in B-tree indexes than variable-length keys.
  • If the strings are short (like 1 or 2 or 3 characters), then they may be shorter -- or no longer -- than integers.
  • If you change the string (say, if it is originally misspelled), then using an "artificial" primary key makes this easy: just change the value in one table. Using the string itself as a key can result in lots of updates to lots of tables.

What is the difference between SQLite integer data types like int, integer, bigint, etc.?

From the SQLite3 documentation:

http://www.sqlite.org/datatype3.html

Most SQL database engines (every SQL database engine other than
SQLite, as far as we know) uses static, rigid typing. With static
typing, the datatype of a value is determined by its container - the
particular column in which the value is stored.

SQLite uses a more general dynamic type system. In SQLite, the
datatype of a value is associated with the value itself, not with its
container. The dynamic type system of SQLite is backwards compatible
with the more common static type systems of other database engines in
the sense that SQL statement that work on statically typed databases
should work the same way in SQLite. However, the dynamic typing in
SQLite allows it to do things which are not possible in traditional
rigidly typed databases.

So in MS SQL Server (for example), an "int" == "integer" == 4 bytes/32 bits.

In contrast, an SQLite "integer" can hold whatever you put into it: from a 1-byte char to an 8-byte long long.

The above link lists all types, and gives more details about SQLite "affinity".
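
To illustrate, here is a small sketch (Python's sqlite3 module; the table `n` and column `v` are made-up names) showing that a column declared INT round-trips values well beyond 32 bits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (v INT)")

# From a 1-byte value up to both ends of the signed 64-bit range.
samples = [1, 2**31 - 1, 2**31, 2**63 - 1, -(2**63)]
conn.executemany("INSERT INTO n VALUES (?)", [(s,) for s in samples])

round_trip = [v for (v,) in conn.execute("SELECT v FROM n ORDER BY rowid")]
types = {t for (t,) in conn.execute("SELECT typeof(v) FROM n")}

print(round_trip == samples)  # True
print(types)                  # {'integer'}
```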

The C/C++ interface you're referring to must work with strongly typed languages.

So there are two APIs: sqlite3_column_int(), which returns at most a 4-byte (32-bit) value, and sqlite3_column_int64(), which returns the full 8-byte (64-bit) value.

http://www.sqlite.org/capi3ref.html#sqlite3_int64

Running out of INT datatype for PRIMARY KEY in SQLite

The limit on the number of rows is 18,446,744,073,709,551,616 (2^64) if you make use of negative values. That is beyond what most (any?) devices could store even at 1 byte per row, and no row would be just 1 byte; they would be more.

For more details
Limits In SQLite - Maximum Number Of Rows In A Table
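
The arithmetic behind that figure, plus a quick check that both ends of the range are usable as key values, can be sketched as follows (Python's sqlite3 module; the table `t` and the constant names are invented for the example):

```python
import sqlite3

MIN_ROWID = -(2**63)     # smallest signed 64-bit integer
MAX_ROWID = 2**63 - 1    # largest signed 64-bit integer
possible_rowids = MAX_ROWID - MIN_ROWID + 1
print(possible_rowids)   # 18446744073709551616, i.e. 2**64

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(MIN_ROWID, "lowest"), (MAX_ROWID, "highest")],
)
ids = [i for (i,) in conn.execute("SELECT id FROM t ORDER BY id")]
print(ids == [MIN_ROWID, MAX_ROWID])  # True
```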

I try to resolve this by changing the datatype to BigInt but the navigator crashed

As for the datatype, it does not affect matters except when retrieving data, which is a subtle issue that can be worked around by using implicit or explicit CASTing. The other exception is INTEGER PRIMARY KEY (implicit or explicit, but specifically INTEGER, not INT or BIGINT), which makes the column a rowid alias that can only store integer values. In all other columns, that is, anything except the rowid or an alias of the rowid, any type of value can be stored.

For more details Datatypes In SQLite Version 3

Please what is the best Approach of resolving this?

From SQLite's perspective, nothing needs resolving; there is no issue.

INTEGER PRIMARY KEY vs rowid in SQLite

It would appear that having an alias for the rowid carries an overhead of one byte (I think) per row, which I believe is explained by :-

When an SQL table includes an INTEGER PRIMARY KEY column (which
aliases the rowid) then that column appears in the record as a NULL
value. SQLite will always use the table b-tree key rather than the
NULL value when referencing the INTEGER PRIMARY KEY column.
Database File Format - 2.3. Representation Of SQL Tables.

The 1 byte per row appears to be pretty close according to the following testing:-

Two databases were created with the two differing tables, each loaded with 1,000,000 rows using the following SQL :-

For the First :-

DROP TABLE IF EXISTS points;
CREATE TABLE IF NOT EXISTS points (tags BLOB NOT NULL, lon INTEGER NOT NULL, lat INTEGER NOT NULL);
WITH RECURSIVE counter(tags,lon,lat) AS (SELECT x'00000000', 0,0 UNION ALL SELECT tags, random() AS lon, random() AS lat FROM counter LIMIT 1000000)
INSERT INTO points (tags,lon,lat) SELECT * FROM counter;
SELECT * FROM points;
VACUUM

For the Second (with an alias of the rowid):-

DROP TABLE IF EXISTS points;
CREATE TABLE IF NOT EXISTS points (id INTEGER PRIMARY KEY, tags BLOB NOT NULL, lon INTEGER NOT NULL, lat INTEGER NOT NULL);
WITH RECURSIVE counter(tags,lon,lat) AS (SELECT x'00000000', 0,0 UNION ALL SELECT tags, random() AS lon, random() AS lat FROM counter LIMIT 1000000)
INSERT INTO points (tags,lon,lat) SELECT * FROM counter;
SELECT * FROM points;
VACUUM

The resultant file sizes were 29,484 KB and 30,600 KB respectively.

That is a difference of 30,600 - 29,484 = 1,116 KB; multiplying by 1,024 gives 1,142,784 bytes (not that far off the 1,000,000 rows, with pages and free space probably accounting for the discrepancy).

  • Note the VACUUM command made no difference (as they were new tables there was no expectation that they would.)
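
A scaled-down sketch of the same comparison, for anyone who wants to repeat it without a GUI tool. It uses Python's sqlite3 module and temporary files. It also departs from the SQL above in two ways that are assumptions of this sketch, not part of the original test: it inserts 20,000 rows rather than 1,000,000, and it uses fixed 8-byte integers instead of random() so that the run is repeatable. The helper `db_size` is a made-up name.

```python
import os
import sqlite3
import tempfile


def db_size(create_sql, rows=20000):
    """Build a database file holding one `points` table and return its size in bytes."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        conn.execute(create_sql)
        conn.executemany(
            "INSERT INTO points (tags, lon, lat) VALUES (?, ?, ?)",
            ((b"\x00\x00\x00\x00", 2**62 + i, 2**62 - i) for i in range(rows)),
        )
        conn.commit()
        conn.close()
        return os.path.getsize(path)
    finally:
        os.remove(path)


without_alias = db_size(
    "CREATE TABLE points (tags BLOB NOT NULL, lon INTEGER NOT NULL, lat INTEGER NOT NULL)"
)
with_alias = db_size(
    "CREATE TABLE points (id INTEGER PRIMARY KEY, tags BLOB NOT NULL, "
    "lon INTEGER NOT NULL, lat INTEGER NOT NULL)"
)
print(without_alias, with_alias, with_alias - without_alias)
```

The exact sizes depend on the page size and the SQLite version, so the printed difference should be read as an estimate of the per-row overhead rather than an exact figure.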

SQLite: primary key and sqlitedatabase.insert

From: https://www.sqlite.org/datatype3.html

INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8
bytes depending on the magnitude of the value

As you can see, in SQLite the INTEGER data type is not what int is in Java, but it can store values of up to 8 bytes, just like the long type of Java.

Is there a REAL performance difference between INT and VARCHAR primary keys?

You make a good point that you can avoid some number of joined queries by using what's called a natural key instead of a surrogate key. Only you can assess if the benefit of this is significant in your application.

That is, you can measure the queries in your application that are the most important to be speedy, because they work with large volumes of data or they are executed very frequently. If these queries benefit from eliminating a join, and do not suffer by using a varchar primary key, then do it.

Don't use either strategy for all tables in your database. It's likely that in some cases, a natural key is better, but in other cases a surrogate key is better.

Other folks make a good point that it's rare in practice for a natural key to never change or have duplicates, so surrogate keys are usually worthwhile.


