Create a Unique Index on a Non-Unique Column

An index can only index actual rows, not aggregated rows. So, yes, as far as the desired index goes, creating a table with unique values like you mentioned is your only option. Enforce referential integrity with a foreign key constraint from data.day to days.day. This might also be best for performance, depending on the complete situation.
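As a rough sketch of the lookup-table approach, the snippet below uses Python's built-in sqlite3 module as a stand-in for your actual DBMS. The column types, the `add_row` helper, and the sample dates are assumptions made for illustration; only the table names data and days and the column day come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE days (day TEXT PRIMARY KEY);        -- one row per distinct day
    CREATE TABLE data (
        id  INTEGER PRIMARY KEY,
        day TEXT NOT NULL REFERENCES days (day)      -- data.day -> days.day
    );
""")

def add_row(day):
    """Hypothetical helper: register the day in the lookup table, then insert the row."""
    with conn:  # one transaction per call
        conn.execute("INSERT OR IGNORE INTO days (day) VALUES (?)", (day,))
        conn.execute("INSERT INTO data (day) VALUES (?)", (day,))

for d in ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"]:
    add_row(d)

# The primary key on days.day is the unique index over the distinct values.
distinct_days = [row[0] for row in conn.execute("SELECT day FROM days ORDER BY day")]
print(distinct_days)
```

The price is that every write path has to keep days in sync, which is why the foreign key is worth having: an insert into data with an unregistered day fails instead of silently drifting.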

However, since this is about performance, there is an alternative solution: you can use a recursive CTE to emulate a loose index scan:

WITH RECURSIVE cte AS (
   (  -- parentheses required
   SELECT day FROM data ORDER BY 1 LIMIT 1
   )
   UNION ALL
   SELECT (SELECT day FROM data WHERE day > c.day ORDER BY 1 LIMIT 1)
   FROM   cte c
   WHERE  c.day IS NOT NULL  -- exit condition
   )
SELECT day FROM cte
WHERE  day IS NOT NULL;  -- drop the trailing NULL row produced by the last iteration

Parentheses around the first SELECT are required because of the attached ORDER BY and LIMIT clauses. See:

  • Combining 3 SELECT statements to output 1 table

This only needs a plain index on day.
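If you want to try the technique without a Postgres instance, here is a minimal sketch using Python's sqlite3 module. It is an adaptation, not the same statement: as far as I know SQLite does not accept a parenthesized SELECT inside a compound query, so the first term is wrapped in a subquery instead. The index name and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, day TEXT NOT NULL)")
conn.execute("CREATE INDEX data_day_idx ON data (day)")  # plain, non-unique index
conn.executemany(
    "INSERT INTO data (day) VALUES (?)",
    [("2024-01-03",), ("2024-01-01",), ("2024-01-01",), ("2024-01-02",), ("2024-01-03",)],
)

# Each step jumps to the next larger day instead of reading every duplicate.
LOOSE_SCAN = """
WITH RECURSIVE cte(day) AS (
    SELECT day FROM (SELECT day FROM data ORDER BY day LIMIT 1)
    UNION ALL
    SELECT (SELECT day FROM data WHERE day > c.day ORDER BY day LIMIT 1)
    FROM cte c
    WHERE c.day IS NOT NULL          -- exit condition
)
SELECT day FROM cte
WHERE day IS NOT NULL                -- drop the trailing NULL row
ORDER BY day
"""

distinct_days = [row[0] for row in conn.execute(LOOSE_SCAN)]
print(distinct_days)
```

The result matches SELECT DISTINCT day, but the work is proportional to the number of distinct days rather than the number of rows, which pays off when there are many rows per day.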

There are various variants, depending on your actual queries:

  • Optimize GROUP BY query to retrieve latest row per user
  • Unused index in range of dates query
  • Select first row in each GROUP BY group?

More in my answer to your follow-up question:

  • Counting distinct rows using recursive cte over non-distinct index

For mysql, do indexes help for non-unique columns?

You could put a non-unique index on kitchen_id. This allows the dbms to do what's known as an "index-range" scan, which is to say that the dbms does a direct index lookup for the first kitchen_id = 33, and then, because index keys are already sorted, it can read index keys sequentially until it finds one where kitchen_id != 33 and then stop.

How much faster this is than a full table scan depends on the ratio (kitchen 33) / (all kitchens), and the break-even point comes somewhere above 1/2.
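To make the mechanics concrete, here is a toy model of an index range scan in Python. It is not how MySQL stores an index; the sorted list of (kitchen_id, row_id) pairs and the function name are assumptions that only mimic the "look up the first match, then read sequentially" behaviour.

```python
from bisect import bisect_left

# Hypothetical table column: kitchen_id per row, row_id is the list position.
kitchen_ids = [12, 33, 7, 33, 45, 33, 12]

# Toy "index": (kitchen_id, row_id) entries kept in sorted key order.
index = sorted((kitchen_id, row_id) for row_id, kitchen_id in enumerate(kitchen_ids))

def index_range_scan(index, key):
    """Return row ids for key, touching only the matching slice of the index."""
    pos = bisect_left(index, (key,))  # direct lookup of the first entry >= key
    row_ids = []
    while pos < len(index) and index[pos][0] == key:  # stop at the first other key
        row_ids.append(index[pos][1])
        pos += 1
    return row_ids

matches = index_range_scan(index, 33)
print(matches)
```

The loop never looks at entries for other kitchens beyond the first non-matching one, which is where the saving over a full table scan comes from.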

Is there any performance enhancement when we use a unique index instead of a non-unique index?

A unique index won't be any faster to scan than a non-unique one. The only potential benefit in query execution speed is that the optimizer can make certain deductions from the uniqueness and, for example, remove an unnecessary join.

The primary use of unique indexes is to implement table constraints, not to provide a performance advantage over non-unique indexes.

Here is an example:

CREATE TABLE parent (pid bigint PRIMARY KEY);

CREATE TABLE child (
   cid bigint PRIMARY KEY,
   pid bigint UNIQUE REFERENCES parent
);

EXPLAIN (COSTS OFF)
SELECT parent.pid FROM parent LEFT JOIN child USING (pid);

QUERY PLAN
════════════════════
Seq Scan on parent
(1 row)

Without the unique constraint on child.pid (which is implemented by a unique index) the join could not be removed.

SQL Server Clustered Index on Non-Unique Column

The clustered index does not need to be unique, so it is possible.

However, the issue is that each time a new message is inserted, SQL Server needs to find a space for the new row next to the other rows for the same customer. This can often be inefficient, because pages need to be split, resulting in many half-filled pages. And, things get even more complicated if you have deletes on the rows as well.

There are several options. In a busy database, you can leave room on the pages for additional inserts. Or, another option is to partition the table based on the customer id. It all depends.

Under most circumstances, an identity column on the messages table would be the primary key and the clustered key as well. An additional index on the customer table would be sufficient. But, there are definitely alternative structures that can work better in some scenarios.

How to declare secondary or non-unique index in mysql?

Please distinguish between "key" and "index".

The former is a logical concept (it restricts, and therefore changes, the meaning of data) and the latter is physical (it doesn't change the meaning, but can change the performance) [1].

Let's get the basic concepts straight:

  • A "superkey" is any set of attributes that, taken together, are unique.
  • A "candidate key" (or just "key") is a minimal superkey: if you take any attribute away from it, it is no longer unique.
  • All keys are logically equivalent, but we pick one of them as the "primary key" for practical [2] and historical reasons; the rest are called "alternate keys".
  • In the database, you declare the primary key using a PRIMARY KEY constraint, and an alternate key using a UNIQUE constraint on NOT NULL fields.
  • Most DBMSes (MySQL is no exception) will automatically create indexes underneath keys. Nonetheless, they are still separate concepts and some DBMSes will actually allow you to have a key without index.
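The superkey and candidate key definitions can be checked mechanically. The sketch below does so in Python against a small sample of rows; the function names and the sample data are invented for illustration. Keep in mind that a real key is a rule about all data the table may ever hold, so a check against a sample can only disprove a key, never prove one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """A superkey is a candidate key if no proper subset of it is still a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

rows = [
    {"id": 1, "email": "a@example.com", "city": "Oslo"},
    {"id": 2, "email": "b@example.com", "city": "Oslo"},
    {"id": 3, "email": "c@example.com", "city": "Rome"},
]

print(is_superkey(rows, ("id", "city")))       # unique, but not minimal
print(is_candidate_key(rows, ("id", "city")))
print(is_candidate_key(rows, ("id",)))
print(is_candidate_key(rows, ("email",)))
```

In this sample both id and email qualify as keys; picking id as the primary key would leave email as an alternate key, declared with UNIQUE and NOT NULL.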

Unfortunately, MySQL has royally messed up the terminology:

  • MySQL uses the column constraint KEY as a synonym for PRIMARY KEY.
  • MySQL uses the table constraint KEY for an index, not a key, the same as if you had used a CREATE INDEX statement.

So for example, both...

CREATE TABLE T (
    A INT PRIMARY KEY,
    B INT
);

CREATE INDEX T_IE1 ON T (B);

...and...

CREATE TABLE T (
    A INT KEY,
    B INT,
    KEY (B)
);

...mean the same thing: primary key on A (with unique/clustering index on A) and non-unique index on B.


[1] A unique index is an oddball here: it straddles both worlds a little bit. But let's leave that discussion for another time...

[2] For example, InnoDB clusters the table on the primary key.

What is the purpose of non unique indexes in a database?

In the data model for your example email application it would not make sense to add a non unique index to the position attribute because each message has exactly one position and each position only contains one message; in this case the index should be unique.

But consider a possible "Sender" attribute. Many messages can come from the same sender. If your application had a function to find all messages from a particular sender, then it would make sense to add a non-unique index on the sender column to improve performance on that operation.
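A small sketch of that data model, using Python's sqlite3 module as a stand-in for whatever database the application uses. The table layout, index names, and addresses are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (
        id       INTEGER PRIMARY KEY,
        position INTEGER NOT NULL,
        sender   TEXT NOT NULL
    );
    CREATE UNIQUE INDEX messages_position_uq ON messages (position);  -- one message per position
    CREATE INDEX messages_sender_idx ON messages (sender);            -- duplicates allowed
""")

conn.executemany(
    "INSERT INTO messages (position, sender) VALUES (?, ?)",
    [(1, "alice@example.com"), (2, "bob@example.com"), (3, "alice@example.com")],
)

# The non-unique index serves lookups by sender.
from_alice = [
    row[0]
    for row in conn.execute(
        "SELECT position FROM messages WHERE sender = ? ORDER BY position",
        ("alice@example.com",),
    )
]

# The unique index rejects a second message at an occupied position.
try:
    conn.execute("INSERT INTO messages (position, sender) VALUES (1, 'carol@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(from_alice, duplicate_rejected)
```

Two rows with the same sender go in without complaint, while the repeated position is refused; that is the whole difference between the two index types here.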

what is the purpose of non unique index

The purpose of non-unique indexes is to make queries more efficient. The only use is performance.

They can be used in multiple ways; some of them are:

  • Filtering rows based on the index. For instance, get all customers in zip code 12345.
  • Joining to another table. Often foreign keys have indexes for this purpose.
  • Ordering the result set.
  • Facilitating aggregations.

Unique indexes add one more capability, which is guaranteeing data integrity. However, the uniqueness has little to do with how indexes are used efficiently.
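To see a non-unique index being picked up for filtering and ordering, you can ask the query planner directly. The sketch below uses SQLite through Python's sqlite3 module purely because it runs anywhere; the table, the index name, and the `plan` helper are assumptions, and the exact plan text differs between SQLite versions and between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.execute("CREATE INDEX customers_zip_idx ON customers (zip)")  # non-unique

def plan(sql):
    """Hypothetical helper: return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the detail text

# Filtering: all customers in zip code 12345.
filter_plan = plan("SELECT name FROM customers WHERE zip = '12345'")

# Ordering: the planner may walk the index instead of sorting.
order_plan = plan("SELECT name FROM customers ORDER BY zip")

print(filter_plan)
print(order_plan)
```

If the index name shows up in the plan, the index is being used; dropping the index and rerunning shows the fallback to a full scan.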


