Is storing a delimited list in a database column really that bad?
In addition to violating First Normal Form because of the repeating group of values stored in a single column, comma-separated lists have a lot of other more practical problems:
- Can’t ensure that each value is the right data type: no way to prevent 1,2,3,banana,5
- Can’t use foreign key constraints to link values to a lookup table; no way to enforce referential integrity.
- Can’t enforce uniqueness: no way to prevent 1,2,3,3,3,5
- Can’t delete a value from the list without fetching the whole list.
- Can't store a list longer than what fits in the string column.
- Hard to search for all entities with a given value in the list; you have to use an inefficient table-scan. May have to resort to regular expressions, for example in MySQL:
  idlist REGEXP '[[:<:]]2[[:>:]]'
or in MySQL 8.0:
  idlist REGEXP '\\b2\\b'
- Hard to count elements in the list, or do other aggregate queries.
- Hard to join the values to the lookup table they reference.
- Hard to fetch the list in sorted order.
- Hard to choose a separator that is guaranteed not to appear in the values.
To solve these problems, you have to write tons of application code, reinventing functionality that the RDBMS already provides much more efficiently.
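As a rough illustration of what the database gives you for free once the list becomes rows, here is a minimal sketch using Python's built-in sqlite3 module. The table names (item, tag, item_tag) and the sample data are hypothetical, not taken from any real schema, and SQLite stands in for whichever RDBMS you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE tag  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE item (id INTEGER PRIMARY KEY);
    -- One row per (item, tag) pair replaces the comma-separated idlist column.
    CREATE TABLE item_tag (
        item_id INTEGER NOT NULL REFERENCES item(id),
        tag_id  INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (item_id, tag_id)
    );
    INSERT INTO tag VALUES (1, 'red'), (2, 'green'), (3, 'blue');
    INSERT INTO item VALUES (10);
    INSERT INTO item_tag VALUES (10, 1), (10, 2), (10, 3);
""")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Uniqueness and referential integrity are enforced by the schema itself.
duplicate_rejected = rejected("INSERT INTO item_tag VALUES (10, 3)")
dangling_rejected = rejected("INSERT INTO item_tag VALUES (10, 99)")

# Deleting one value no longer requires fetching and rewriting a whole list.
with conn:
    conn.execute("DELETE FROM item_tag WHERE item_id = 10 AND tag_id = 2")

# Counting, joining to the lookup table, and sorting are plain queries.
tag_count = conn.execute(
    "SELECT COUNT(*) FROM item_tag WHERE item_id = 10").fetchone()[0]
tag_names = [row[0] for row in conn.execute(
    "SELECT t.name FROM item_tag it JOIN tag t ON t.id = it.tag_id "
    "WHERE it.item_id = 10 ORDER BY t.name")]
```

None of the checks above needed application code beyond issuing the statement; the constraints do the work.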
Comma-separated lists are wrong enough that I made this the first chapter in my book: SQL Antipatterns, Volume 1: Avoiding the Pitfalls of Database Programming.
There are times when you need to employ denormalization, but as @OMG Ponies mentions, these are exception cases. Any non-relational “optimization” benefits one type of query at the expense of other uses of the data, so be sure you know which of your queries need to be treated so specially that they deserve denormalization.
Getting values of comma separated fields in SQL Server
The easy way is to convert the CSV values to rows for each Id, join those rows with the CITY table, and convert the result back to CSV values. I have written the logic inside the query.
;WITH CTE1 AS
(
    -- Convert CSV to rows
    SELECT Id, LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) AS [NAME]
    FROM
    (
        -- To use a different delimiter, change the ',' passed to REPLACE
        SELECT Id, CAST('<M>' + REPLACE(Name, ',', '</M><M>') + '</M>' AS XML) AS Data
        FROM #TEMP
    ) AS A
    CROSS APPLY Data.nodes('/M') AS Split(a)
)
, CTE2 AS
(
    -- Join each split value to the matching Id in the CITY table
    SELECT T.ID, T.NAME, C.CITYNAME
    FROM CTE1 T
    JOIN #CITY C ON T.NAME = C.ID
)
-- Convert back to CSV format
SELECT DISTINCT ID,
    SUBSTRING(
        (SELECT ', ' + CITYNAME
         FROM CTE2 I
         WHERE I.Id = O.Id
         -- Start at 3 to skip the leading ', ' (two characters)
         FOR XML PATH('')), 3, 200000) AS [VALUES]
FROM CTE2 O
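If it helps to see the three steps outside of T-SQL, here is a small Python sketch of the same idea: split, look up, join back. The data in temp_rows and city is made up to stand in for #TEMP and #CITY, and the function name is hypothetical.

```python
# Hypothetical stand-ins for the #TEMP (Id, Name) and #CITY (ID, CITYNAME) tables.
temp_rows = [(1, "1,2,3"), (2, "2, 3"), (3, "3,99")]
city = {1: "CityA", 2: "CityB", 3: "CityC"}


def ids_to_city_names(csv_ids, lookup):
    # Step 1 (CTE1): split the CSV into trimmed values, one per "row".
    ids = [part.strip() for part in csv_ids.split(",") if part.strip()]
    # Step 2 (CTE2): inner join against the lookup; unknown ids are dropped.
    names = [lookup[int(i)] for i in ids if i.isdigit() and int(i) in lookup]
    # Step 3: convert the matched names back to a CSV string.
    return ", ".join(names)


result = {row_id: ids_to_city_names(csv, city) for row_id, csv in temp_rows}
```

Like the inner join in the query, an id with no matching city (99 here) silently disappears from the output.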
I have comma-separated values in a database column and need to check whether a given value exists among them
You should fix your table design and never store data as comma separated.
You could use FIND_IN_SET
SELECT * FROM colleges where FIND_IN_SET(1, Courses);
If you have spaces after or before comma you could use:
SELECT * FROM colleges where FIND_IN_SET(1, REPLACE(REPLACE(Courses, ', ', ','), ' ,', ','));
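To see why the spaces matter, here is a rough Python emulation of what FIND_IN_SET does. It is only a sketch of the matching behaviour (whole elements, 1-based position, 0 when absent); it ignores NULL handling and collation, and the function is my own, not part of any library.

```python
def find_in_set(needle, csv):
    """Rough emulation: 1-based position of needle in a comma list, else 0."""
    if csv == "":
        return 0
    items = csv.split(",")  # elements are compared whole, spaces included
    try:
        return items.index(str(needle)) + 1
    except ValueError:
        return 0


def strip_comma_spaces(csv):
    # Same normalisation as the nested REPLACE calls in the query above.
    return csv.replace(", ", ",").replace(" ,", ",")


exact = find_in_set(1, "1,2,3")                          # plain list
partial = find_in_set(1, "11,21")                        # no substring matches
spaced = find_in_set(2, "1, 2,3")                        # element is ' 2', not '2'
cleaned = find_in_set(2, strip_comma_spaces("1, 2,3"))   # spaces removed first
```

The spaced case is the one that bites: the element stored in the list is ' 2' with a leading space, so a search for '2' finds nothing until the spaces are stripped.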
matching comma separated string to a database field and sort the result to be in the same order as the comma separated string
You can use ORDER BY FIELD
ORDER BY field(email_address, 'test@test.com','test@test2.com','test@test3.com');
Reference: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_field
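FIELD() is MySQL-specific. If you are on another engine, a CASE expression can play the same role. Below is a sketch using Python's sqlite3 module; the users table and its rows are hypothetical, and the CASE expression is my substitute for FIELD(), not something MySQL requires.

```python
import sqlite3

# The order we want the results in (same addresses as the example above).
wanted = ["test@test.com", "test@test2.com", "test@test3.com"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email_address TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("test@test3.com",), ("test@test.com",), ("test@test2.com",)],
)

# CASE email_address WHEN ? THEN 1 WHEN ? THEN 2 ... ELSE 0 END
# mirrors FIELD(), which returns the 1-based position or 0 when not found.
whens = " ".join(f"WHEN ? THEN {pos}" for pos, _ in enumerate(wanted, start=1))
placeholders = ", ".join("?" for _ in wanted)
sql = (
    "SELECT email_address FROM users "
    f"WHERE email_address IN ({placeholders}) "
    f"ORDER BY CASE email_address {whens} ELSE 0 END"
)

# Parameters are bound in textual order: the IN list first, then the CASE.
ordered = [row[0] for row in conn.execute(sql, wanted + wanted)]
```

Only the positions are interpolated into the SQL text; the addresses themselves are bound as parameters.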
MySQL: How to check if a field value exists in a comma separated field in the same table
You can do it with this:
select *
from tablename
where
concat(',', col2, ',') like
concat('%,aaa:000.', substring_index(col1,'aaa:',-1), '/,%')
If there may or may not be a / at the end, then:
select *
from tablename
where
concat(',', col2, ',') like
concat('%,aaa:000.', substring_index(col1,'aaa:',-1), '/,%')
or
concat(',', col2, ',') like
concat('%,aaa:000.', substring_index(col1,'aaa:',-1), ',%')
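The reason for wrapping both sides in commas is easier to see on a simpler list. This sketch uses Python's sqlite3 module with a made-up table t and column idlist, and SQLite's || operator in place of MySQL's concat(). Note that % and _ inside the search value would still act as LIKE wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, idlist TEXT)")  # hypothetical
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "1,2,3"), (2, "12,20"), (3, "2"), (4, "5,2")],
)

# Naive substring search: '2' also matches inside '12' and '20'.
naive = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE idlist LIKE ('%' || ? || '%') ORDER BY id", ("2",))]

# Wrapped search: add a comma to both ends of the list and of the value,
# so only whole elements match, including the first and last ones.
wrapped = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE (',' || idlist || ',') LIKE ('%,' || ? || ',%') "
    "ORDER BY id", ("2",))]
```

Row 2 ("12,20") shows up in the naive result but not in the wrapped one. Either way the query cannot use an index, so it still scans the whole table.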
Data separated by commas inside a field vs new table
Never, ever, ever choose the separate-by-commas solution. It is a violation of every principle of database design. Create a separate table instead.
In your particular case, create the table with the PRIMARY KEY on (article_id, user_id). The database will then prohibit the entry of duplicate records. Depending on your SQL engine, you can additionally use INSERT OR IGNORE (or equivalent) to avoid throwing exceptions.
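Here is a minimal sketch of that table, using Python's sqlite3 module because SQLite happens to support INSERT OR IGNORE directly. The table name article_user is an assumption; only the two column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE article_user (          -- hypothetical table name
        article_id INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        PRIMARY KEY (article_id, user_id)
    )
""")

# The third pair repeats the first; INSERT OR IGNORE skips it silently.
pairs = [(1, 7), (1, 8), (1, 7)]
conn.executemany("INSERT OR IGNORE INTO article_user VALUES (?, ?)", pairs)

rows = conn.execute(
    "SELECT article_id, user_id FROM article_user "
    "ORDER BY article_id, user_id").fetchall()

# A plain INSERT of a duplicate is refused by the primary key.
try:
    conn.execute("INSERT INTO article_user VALUES (1, 7)")
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True
```

Other engines spell the "ignore" form differently, for example INSERT IGNORE in MySQL or ON CONFLICT DO NOTHING in PostgreSQL.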
The other solution requires you to enforce uniqueness in every application that touches the data.
When to use comma-separated values in a DB Column?
You already know the answer.
First off, your PHP code isn't even close to working, because it only works if user 2 has a single value in LookingFor or Drugs. If either of these columns contains multiple comma-separated values, then IN won't work, even if those values are in the exact same order as User 1's values. What do you expect IN to do if the right-hand side has one or more commas?
Therefore, it's not "easy" to do what you want in PHP. It's actually quite a pain and would involve splitting user 2's fields into single values, writing dynamic SQL with many ORs to do the comparison, and then doing an extremely inefficient query to get the results.
Furthermore, the fact that you even need to write PHP code to answer such a relatively simple question about the intersection of two sets means that your design is badly flawed. This is exactly the kind of problem (relational algebra) that SQL exists to solve. A correct design allows you to solve the problem in the database and then simply implement a presentation layer on top in PHP or some other technology.
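To give an idea of what "solving it in the database" could look like, here is a sketch with one row per user and value, run through Python's sqlite3 module. The table user_looking_for, its columns, and the sample values are all hypothetical; your real schema would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per (user, value) instead of a comma-separated LookingFor column.
    CREATE TABLE user_looking_for (
        user_id     INTEGER NOT NULL,
        looking_for TEXT    NOT NULL,
        PRIMARY KEY (user_id, looking_for)
    );
    INSERT INTO user_looking_for VALUES
        (1, 'friendship'), (1, 'dating'),
        (2, 'dating'),     (2, 'networking'),
        (3, 'networking');
""")

# Set intersection as a self-join: which other users share at least one
# value with the given user, and how many values do they share?
matches = conn.execute("""
    SELECT other.user_id, COUNT(*) AS shared
    FROM user_looking_for AS mine
    JOIN user_looking_for AS other
      ON other.looking_for = mine.looking_for
     AND other.user_id <> mine.user_id
    WHERE mine.user_id = ?
    GROUP BY other.user_id
    ORDER BY other.user_id
""", (1,)).fetchall()
```

With this sample data, user 1 shares exactly one value with user 2 and none with user 3. There is no string splitting and no dynamically built list of ORs.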
Do it correctly and you'll have a much easier time.