SQL Select Speed Int VS Varchar

SQL SELECT speed int vs varchar

Int comparisons are faster than varchar comparisons, for the simple fact that ints take up much less space than varchars.

This holds true both for unindexed and indexed access. The fastest way to go is an indexed int column.


As I see you've tagged the question postgresql, you might be interested in the space usage of different data types:

  • int fields occupy between 2 and 8 bytes, with 4 usually being more than enough (-2147483648 to +2147483647)
  • character types occupy 1 byte of overhead for short strings (up to 126 bytes) or 4 bytes for longer ones, plus the actual string.

SQL - performance in varchar vs. int

Should I create a new column with a numeric data type in both tables and join on that column to reduce the time taken by the SQL query?

If you're in a position where you can change the design of the database with ease then yes, your primary key (PK) should be an integer. Unless there is a really good reason for a foreign key (FK) to be a varchar, FKs should be integers as well.

If you can't change the PK or FK fields, then make sure they're indexed properly. Even so, the varchar keys are likely to become a bottleneck eventually.
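
To check that a varchar FK lookup actually uses an index, you can look at the query plan. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for your real database. The customers and orders tables, the customer_code column, and the index name are all hypothetical, and the plan output of PostgreSQL, MySQL, or SQL Server will look different.

```python
import sqlite3


def plan_for_lookup(with_index: bool) -> str:
    """Return SQLite's query plan for a lookup on a text foreign-key column."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: orders references customers through a text key.
    conn.execute("CREATE TABLE customers (code TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_code TEXT REFERENCES customers(code),"
        " total REAL)"
    )
    if with_index:
        conn.execute(
            "CREATE INDEX idx_orders_customer_code ON orders (customer_code)"
        )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, total FROM orders WHERE customer_code = ?",
        ("ACME",),
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_lookup(False))
    print("with index:   ", plan_for_lookup(True))
```

Without the index the plan should report a full scan of orders; with it, the plan should name idx_orders_customer_code.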

Is there a REAL performance difference between INT and VARCHAR primary keys?

You make a good point that you can avoid some number of joined queries by using what's called a natural key instead of a surrogate key. Only you can assess if the benefit of this is significant in your application.

That is, you can measure the queries in your application that are the most important to be speedy, because they work with large volumes of data or they are executed very frequently. If these queries benefit from eliminating a join, and do not suffer by using a varchar primary key, then do it.
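
As a rough illustration of that kind of measurement, here is a small sketch that times the same report against a surrogate-key design (which needs a join) and a natural-key design (which doesn't). It uses Python's built-in sqlite3 module with made-up country and city tables, so the numbers it prints say nothing about your own database or workload; the point is only the shape of the comparison.

```python
import sqlite3
import time


def build_db(n_rows: int = 5000) -> sqlite3.Connection:
    """Create two hypothetical designs holding the same data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Surrogate-key design: the country name lives in a lookup table.
        CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE city_surrogate (
            id INTEGER PRIMARY KEY,
            name TEXT,
            country_id INTEGER REFERENCES country(id)
        );
        -- Natural-key design: the country name itself is stored in each row.
        CREATE TABLE city_natural (
            id INTEGER PRIMARY KEY,
            name TEXT,
            country_name TEXT
        );
        """
    )
    countries = [f"country_{i}" for i in range(50)]
    conn.executemany(
        "INSERT INTO country (id, name) VALUES (?, ?)",
        list(enumerate(countries, start=1)),
    )
    for i in range(n_rows):
        c = i % len(countries)
        conn.execute(
            "INSERT INTO city_surrogate (name, country_id) VALUES (?, ?)",
            (f"city_{i}", c + 1),
        )
        conn.execute(
            "INSERT INTO city_natural (name, country_name) VALUES (?, ?)",
            (f"city_{i}", countries[c]),
        )
    conn.commit()
    return conn


QUERIES = {
    "surrogate_with_join": (
        "SELECT s.name, c.name FROM city_surrogate s "
        "JOIN country c ON c.id = s.country_id ORDER BY s.id"
    ),
    "natural_no_join": (
        "SELECT name, country_name FROM city_natural ORDER BY id"
    ),
}


def time_query(conn: sqlite3.Connection, sql: str, repeats: int = 5):
    """Return (best elapsed seconds, rows) over several runs of one query."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, rows


if __name__ == "__main__":
    db = build_db()
    for label, sql in QUERIES.items():
        elapsed, result = time_query(db, sql)
        print(f"{label}: {elapsed * 1000:.2f} ms for {len(result)} rows")
```

Run it against realistic data volumes and your actual queries before drawing any conclusion.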

Don't use either strategy for all tables in your database. It's likely that in some cases, a natural key is better, but in other cases a surrogate key is better.

Other folks make a good point that it's rare in practice for a natural key to never change or have duplicates, so surrogate keys are usually worthwhile.

SQL performance for string field vs multiple int/varchar fields

The primary question is whether you can run faster than scanning the entire table. The answer is "no" unless a small number of the booleans can be handled separately with indexes.

Your WHERE bools LIKE '%a%c%d%' is a clever trick for ANDing any number of flags together. However, it will need to look at every row, and LIKE is slightly heavyweight.
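
For reference, here is a small sketch of how that LIKE pattern behaves, run through Python's built-in sqlite3 module rather than MySQL. The items table and its sample rows are hypothetical, and the sketch assumes each row stores its flag letters in sorted order, which the pattern relies on.

```python
import sqlite3


def like_pattern(required_flags) -> str:
    """Build a pattern such as '%a%c%d%' from the flags that must all be set."""
    # Assumes every row stores its flag letters in sorted order, so the
    # required letters can only appear in that same order.
    return "%" + "%".join(sorted(required_flags)) + "%"


def demo_flag_strings() -> sqlite3.Connection:
    """Create a hypothetical table with one flag string per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, bools TEXT)")
    conn.executemany(
        "INSERT INTO items (id, bools) VALUES (?, ?)",
        [(1, "abcd"), (2, "acd"), (3, "ad"), (4, "bcd"), (5, "")],
    )
    return conn


def rows_with_flags(conn: sqlite3.Connection, required_flags) -> list:
    """Return ids of rows whose flag string contains every required flag."""
    sql = "SELECT id FROM items WHERE bools LIKE ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (like_pattern(required_flags),))]


if __name__ == "__main__":
    db = demo_flag_strings()
    print(like_pattern("acd"))
    print(rows_with_flags(db, "acd"))
```

The pattern has a leading wildcard, so an ordinary index on bools cannot narrow the search and every row is still examined.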

INT(1) takes 4 bytes plus overhead. TINYINT is what you are fishing for; it takes 1 byte, plus overhead.

A SET with up to 64 bools is another technique. The coding is a bit clumsy, but it is rather efficient.

INT UNSIGNED (for up to 32 flags) or BIGINT UNSIGNED (for up to 64 flags) works similarly to SET and also takes up to 8 bytes. The coding is rather clumsy, though. Let's number the bits starting with 0 for the least significant bit.

WHERE (bools & ( (1 << 0) | (1 << 2) | (1 << 3) ) ) = 
( (1 << 0) | (1 << 2) | (1 << 3) )

would check that bits 0, 2, and 3 are all set. (This is like your test for a,c,d.) A variety of ANDs and ORs are possible with this approach. (You could pre-compute the combined bit value, which is 13 in this example, or use a bit literal: 0b1101.)
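
If it helps to see the arithmetic, here is a minimal sketch of the same mask test, again using Python's built-in sqlite3 module as a stand-in (SQLite supports the same & operator in a WHERE clause). The items table and its sample values are hypothetical.

```python
import sqlite3


def mask_for(bits) -> int:
    """Combine bit positions (0 = least significant) into one integer mask."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


def demo_flag_bits() -> sqlite3.Connection:
    """Create a hypothetical table with one integer of packed flags per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, bools INTEGER)")
    conn.executemany(
        "INSERT INTO items (id, bools) VALUES (?, ?)",
        [(1, 0b1101), (2, 0b1111), (3, 0b0101), (4, 0b1000), (5, 0)],
    )
    return conn


def rows_with_all_bits(conn: sqlite3.Connection, bits) -> list:
    """Return ids of rows in which every requested bit is set."""
    mask = mask_for(bits)
    # (bools & mask) = mask holds only when all bits of the mask are set.
    sql = "SELECT id FROM items WHERE (bools & ?) = ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (mask, mask))]


if __name__ == "__main__":
    db = demo_flag_bits()
    print(mask_for([0, 2, 3]))                # 13, the same value as 0b1101
    print(rows_with_all_bits(db, [0, 2, 3]))  # ids 1 and 2 in the sample data
```

Changing the comparison to (bools & mask) != 0 would instead test whether any of the bits is set.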

The benefit of SET or bits in an INT is the 'speed' within each row. Still, all rows must be tested.

So, I recommend triaging your bools and other attributes: decide which ones need indexing, and which can go into this combined column or, for non-bools, into a combined JSON column.

In MySQL, is it faster to compare with integer or string of integer?

MySQL ultimately runs on some processor, and in general an integer comparison can be done in a single CPU cycle, while string comparisons will generally take multiple cycles, perhaps one cycle per character. See Why is integer comparison faster then string comparison? for more information.

SQL Server Performance: GROUP BY int vs GROUP BY VARCHAR

I would say GROUP BY INT is faster, as only 4 bytes are checked versus n bytes in a varchar field.
