Compare Comma Separated Values in SQL

How to compare two comma-separated value in a column, and return only matching values in SQL

You can try the following solution.

It is using XQuery and its FLWOR expression.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(MAX), col2 VARCHAR(MAX));
INSERT INTO @tbl (col1, col2) VALUES
('John, james, steve', 'john, smith, will, james'),
('Mary, Lisa', 'Mary');
-- DDL and sample data population, end

DECLARE @separator CHAR(2) = ', ';

SELECT *
, REPLACE(TRY_CAST('<root>' +
'<source><r><![CDATA[' + REPLACE(col1, @separator, ']]></r><r><![CDATA[') +
']]></r></source>' +
'<target><r><![CDATA[' + REPLACE(col2, @separator, ']]></r><r><![CDATA[') +
']]></r></target>' +
'</root>' AS XML)
.query('
for $x in /root/source/r/text(),
$y in /root/target/r/text()
where lower-case($y) eq lower-case($x)
return data($x)
').value('.', 'NVARCHAR(MAX)'), SPACE(1), ',') AS result
FROM @tbl;

Output

+----+--------------------+--------------------------+------------+
| ID | col1 | col2 | result |
+----+--------------------+--------------------------+------------+
| 1 | John, james, steve | john, smith, will, james | John,james |
| 2 | Mary, Lisa | Mary | Mary |
+----+--------------------+--------------------------+------------+

Comparing Comma Separate Value with Comma Separated Values SQL

This should do it:

CREATE PROCEDURE GetMatches(@input varchar (100))
AS
BEGIN

WITH CTE AS
(
SELECT value AS number
FROM STRING_SPLIT(@input, ',')
)
SELECT CTE.number, A.ColumnName
FROM A
INNER JOIN CTE
ON ',' + A.ColumnName + ',' LIKE '%,' + CTE.number + ',%';

END

You can test the stored procedure like this:

EXEC dbo.GetMatches @input = '101,104';

Comparing comma separated values from two columns of two different tables

You could get the table(s) into first normal form and then compare the compounds that are stored in each row. A starting point could be:

{1} Tokenize each row, and write the tokens into a new table. Give each token its original ID plus a 3-letter prefix, indicating which table the token came from.
{2} Group the rows of the new ("normalized") table by ID, and perform a LISTAGG(). Perform a self join, and find matching "token groups".

{1} Tokenize, create table as select (CTAS)

create table tokens
as
select
ltrim( -- ltrim() and rtrim() remove leading/trailing spaces (blanks)
rtrim(
substr( N.wrapped
, instr( N.wrapped, ',', 1, T.pos ) + 1
, ( instr( N.wrapped, ',', 1, T.pos + 1 ) - instr( N.wrapped, ',', 1, T.pos ) ) - 1
)
)
) token
, N.id
from (
select ',' || name1 || ',' as wrapped, 'T1_' || to_char( id_t1 ) as id from t1 -- names wrapped in commas, (table)_id
union all
select ',' || name2 || ',' , 'T2_' || to_char( id_t2 ) from t2
) N join (
select level as pos -- (max) possible position of char in an existing token
from dual
connect by level <= (
select greatest( -- find the longest string ie max position (query T1 and T2)
( select max( length( name1 ) ) from t1 )
, ( select max( length( name2 ) ) from t2 )
) as pos
from dual
)
) T
on T.pos <= ( length( N.wrapped ) - length( replace( N.wrapped, ',') ) ) - 1
;

The inspiration to tokenize without using CONNECT BY came from this SO answer.

The contents of the TOKENS table will look something like this:

SQL> select * from tokens ;
TOKEN ID
ASCORBIC ACID T1_1
SODIUM HYDROGEN CARBONATE T1_2
CAFFEINE T1_3
PSEUDOEPHEDRINE HYDROCHLORIDE T1_4
PARACETAMOL T1_100
sodium hydroxide T1_110
POTASSIUM HYDROGEN CARBONATE T2_4
SODIUM HYDROGEN CARBONATE T2_5
PARACETAMOL PH. EUR. T2_6
CODEINE PHOSPHATE T2_7
DEXCHLORPHENIRAMINE MALEATE T2_8
DEXCHLORPHENIRAMINE MALEATE T2_10
PARACETAMOL T2_200
...

{2} GROUP BY, LISTAGG, self join

select
S1.id id1
, S2.id id2
, S1.tokengroup_T1
, S2.tokengroup_T2
from
(
select substr( id, 4, length( id ) - 3 ) id
, listagg( token, ' + ' ) within group ( order by token ) tokengroup_T1
from tokens
group by id
having substr( id, 1, 3 ) = 'T1_'
) S1
join
(
select substr( id, 4, length( id ) - 3 ) id
, listagg( token, ' + ' ) within group ( order by token ) tokengroup_T2
from tokens
group by id
having substr( id, 1, 3 ) = 'T2_'
) S2
on S1.tokengroup_T1 = S2.tokengroup_T2
;

-- result
ID1 ID2 TOKENGROUP_T1 TOKENGROUP_T2
4 10 DEXCHLORPHENIRAMINE MALEATE + PSEUDOEPHEDRINE HYDROCHLORIDE DEXCHLORPHENIRAMINE MALEATE + PSEUDOEPHEDRINE HYDROCHLORIDE
110 210 potassium carbonate + sodium hydroxide potassium carbonate + sodium hydroxide
1 4 ASCORBIC ACID + PARACETAMOL + POTASSIUM HYDROGEN CARBONATE ASCORBIC ACID + PARACETAMOL + POTASSIUM HYDROGEN CARBONATE
3 6 CAFFEINE + PARACETAMOL PH. EUR. CAFFEINE + PARACETAMOL PH. EUR.

When doing things this way, you can get the substances into (alphabetical) order, and you can also pick a "delimiter" that you like (we have used '+') here.

ALTERNATIVE

If all that is no use to you, or you think this is too complicated, then you could try using TRANSLATE(). In this case, I'd recommend stripping all spaces/blanks from your dataset (in a query - not altering the original data!) like so:

Query

select 
id1, id2
, name1, name2
from (
select
id_t1 id1
, id_t2 id2
, T1.name1 name1
, T2.name2 name2
from T1
join T2
on translate( replace( T1.name1, ' ', '' ), replace( T2.name2, ' ', '' ), '!' )
= translate( replace( T2.name2, ' ', '' ), replace( T1.name1, ' ', '' ), '!' )
) ;

Result

  ID1   ID2 NAME1                                                                NAME2                                                        
2 5 SODIUM HYDROGEN CARBONATE, SODIUM CARBONATE ANHYDROUS, CITRIC ACID SODIUM HYDROGEN CARBONATE, SODIUM CARBONATE ANHYDROUS
3 6 CAFFEINE, PARACETAMOL PH. EUR. PARACETAMOL PH. EUR.,CAFFEINE
100 10 PARACETAMOL, DEXTROMETHORPHAN, PSEUDOEPHEDRINE, PYRILAMINE DEXCHLORPHENIRAMINE MALEATE, PSEUDOEPHEDRINE HYDROCHLORIDE
110 210 sodium hydroxide, potassium carbonate sodium hydroxide, potassium carbonate

NOTE:
I've added the following rows to your sample data:

-- T1
110, 'sodium hydroxide, potassium carbonate'

-- T2
210, 'sodium hydroxide, potassium carbonate'
211, 'potassium hydroxide, sodium carbonate'

I found that it is easy to use TRANSLATE() in a way that gives you "false positives" ie the substances with ids 110, 210, and 211 will appear to "match". (In other words: I don't think this is the right tool for this job.)

DBFIDDLE here

(follow the link to see the sample tables and queries).

How to compare comma separated values in mysql?

Try find_in_set function:

select users.name,items.price
from orders
join users
on user.id = orders.user_id
join items
on find_in_set(items.id, orders.item_id)
where orders.id = 1

However, store id with comma separated is a kind of bad db design, you'd better add a table to store the relationship between orders and items.

Just take an example:

create table order_item (
order_id varchar(20),
item_id varchar(20)
);

then the query:

select users.name,items.price
from orders
join users
on user.id = orders.user_id
join order_item oi on oi.order_id = orders.id
join items on oi.item_id = items.id
where orders.id = 1
group by orders.id

EDIT:

With [] included, you can try this:

select users.name,items.price
from orders
join users
on user.id = orders.user_id
join items
on find_in_set(items.id, replace(replace(orders.item_id, '[', ''), ']', ''))
where orders.id = 1

Compare Single Value with comma Separated value in SQL Server

Below the code requested:

DECLARE @sec varchar(max)

SET @sec = '2,3,4,5,6,89,52,36,14,';

;WITH CTE_SEC AS (
SELECT value AS sec
FROM
STRING_SPLIT(@sec, ',')
WHERE
RTRIM(value) <> ''
)
SELECT *
FROM
XXX X
INNER JOIN CTE_SEC C ON C.sec = X.yyyyy

STRING_SPLIT() function was introduced with SQL Server 2016



Related Topics



Leave a reply



Submit