How to compare two comma-separated value in a column, and return only matching values in SQL
You can try the following solution.
It is using XQuery and its FLWOR expression.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(MAX), col2 VARCHAR(MAX));
INSERT INTO @tbl (col1, col2) VALUES
('John, james, steve', 'john, smith, will, james'),
('Mary, Lisa', 'Mary');
-- DDL and sample data population, end
DECLARE @separator CHAR(2) = ', ';
SELECT *
, REPLACE(TRY_CAST('<root>' +
'<source><r><![CDATA[' + REPLACE(col1, @separator, ']]></r><r><![CDATA[') +
']]></r></source>' +
'<target><r><![CDATA[' + REPLACE(col2, @separator, ']]></r><r><![CDATA[') +
']]></r></target>' +
'</root>' AS XML)
.query('
for $x in /root/source/r/text(),
$y in /root/target/r/text()
where lower-case($y) eq lower-case($x)
return data($x)
').value('.', 'NVARCHAR(MAX)'), SPACE(1), ',') AS result
FROM @tbl;
Output
+----+--------------------+--------------------------+------------+
| ID | col1 | col2 | result |
+----+--------------------+--------------------------+------------+
| 1 | John, james, steve | john, smith, will, james | John,james |
| 2 | Mary, Lisa | Mary | Mary |
+----+--------------------+--------------------------+------------+
Comparing Comma Separate Value with Comma Separated Values SQL
This should do it:
CREATE PROCEDURE GetMatches(@input varchar (100))
AS
BEGIN
WITH CTE AS
(
SELECT value AS number
FROM STRING_SPLIT(@input, ',')
)
SELECT CTE.number, A.ColumnName
FROM A
INNER JOIN CTE
ON ',' + A.ColumnName + ',' LIKE '%,' + CTE.number + ',%';
END
You can test the stored procedure like this:
EXEC dbo.GetMatches @input = '101,104';
Comparing comma separated values from two columns of two different tables
You could get the table(s) into first normal form and then compare the compounds that are stored in each row. A starting point could be:
{1} Tokenize each row, and write the tokens into a new table. Give each token its original ID plus a 3-letter prefix, indicating which table the token came from.
{2} Group the rows of the new ("normalized") table by ID, and perform a LISTAGG(). Perform a self join, and find matching "token groups".
{1} Tokenize, create table as select (CTAS)
create table tokens
as
select
ltrim( -- ltrim() and rtrim() remove leading/trailing spaces (blanks)
rtrim(
substr( N.wrapped
, instr( N.wrapped, ',', 1, T.pos ) + 1
, ( instr( N.wrapped, ',', 1, T.pos + 1 ) - instr( N.wrapped, ',', 1, T.pos ) ) - 1
)
)
) token
, N.id
from (
select ',' || name1 || ',' as wrapped, 'T1_' || to_char( id_t1 ) as id from t1 -- names wrapped in commas, (table)_id
union all
select ',' || name2 || ',' , 'T2_' || to_char( id_t2 ) from t2
) N join (
select level as pos -- (max) possible position of char in an existing token
from dual
connect by level <= (
select greatest( -- find the longest string ie max position (query T1 and T2)
( select max( length( name1 ) ) from t1 )
, ( select max( length( name2 ) ) from t2 )
) as pos
from dual
)
) T
on T.pos <= ( length( N.wrapped ) - length( replace( N.wrapped, ',') ) ) - 1
;
The inspiration to tokenize without using CONNECT BY came from this SO answer.
The contents of the TOKENS table will look something like this:
SQL> select * from tokens ;
TOKEN ID
ASCORBIC ACID T1_1
SODIUM HYDROGEN CARBONATE T1_2
CAFFEINE T1_3
PSEUDOEPHEDRINE HYDROCHLORIDE T1_4
PARACETAMOL T1_100
sodium hydroxide T1_110
POTASSIUM HYDROGEN CARBONATE T2_4
SODIUM HYDROGEN CARBONATE T2_5
PARACETAMOL PH. EUR. T2_6
CODEINE PHOSPHATE T2_7
DEXCHLORPHENIRAMINE MALEATE T2_8
DEXCHLORPHENIRAMINE MALEATE T2_10
PARACETAMOL T2_200
...
{2} GROUP BY, LISTAGG, self join
select
S1.id id1
, S2.id id2
, S1.tokengroup_T1
, S2.tokengroup_T2
from
(
select substr( id, 4, length( id ) - 3 ) id
, listagg( token, ' + ' ) within group ( order by token ) tokengroup_T1
from tokens
group by id
having substr( id, 1, 3 ) = 'T1_'
) S1
join
(
select substr( id, 4, length( id ) - 3 ) id
, listagg( token, ' + ' ) within group ( order by token ) tokengroup_T2
from tokens
group by id
having substr( id, 1, 3 ) = 'T2_'
) S2
on S1.tokengroup_T1 = S2.tokengroup_T2
;
-- result
ID1 ID2 TOKENGROUP_T1 TOKENGROUP_T2
4 10 DEXCHLORPHENIRAMINE MALEATE + PSEUDOEPHEDRINE HYDROCHLORIDE DEXCHLORPHENIRAMINE MALEATE + PSEUDOEPHEDRINE HYDROCHLORIDE
110 210 potassium carbonate + sodium hydroxide potassium carbonate + sodium hydroxide
1 4 ASCORBIC ACID + PARACETAMOL + POTASSIUM HYDROGEN CARBONATE ASCORBIC ACID + PARACETAMOL + POTASSIUM HYDROGEN CARBONATE
3 6 CAFFEINE + PARACETAMOL PH. EUR. CAFFEINE + PARACETAMOL PH. EUR.
When doing things this way, you can get the substances into (alphabetical) order, and you can also pick a "delimiter" that you like (we have used '+') here.
ALTERNATIVE
If all that is no use to you, or you think this is too complicated, then you could try using TRANSLATE(). In this case, I'd recommend stripping all spaces/blanks from your dataset (in a query - not altering the original data!) like so:
Query
select
id1, id2
, name1, name2
from (
select
id_t1 id1
, id_t2 id2
, T1.name1 name1
, T2.name2 name2
from T1
join T2
on translate( replace( T1.name1, ' ', '' ), replace( T2.name2, ' ', '' ), '!' )
= translate( replace( T2.name2, ' ', '' ), replace( T1.name1, ' ', '' ), '!' )
) ;
Result
ID1 ID2 NAME1 NAME2
2 5 SODIUM HYDROGEN CARBONATE, SODIUM CARBONATE ANHYDROUS, CITRIC ACID SODIUM HYDROGEN CARBONATE, SODIUM CARBONATE ANHYDROUS
3 6 CAFFEINE, PARACETAMOL PH. EUR. PARACETAMOL PH. EUR.,CAFFEINE
100 10 PARACETAMOL, DEXTROMETHORPHAN, PSEUDOEPHEDRINE, PYRILAMINE DEXCHLORPHENIRAMINE MALEATE, PSEUDOEPHEDRINE HYDROCHLORIDE
110 210 sodium hydroxide, potassium carbonate sodium hydroxide, potassium carbonate
NOTE:
I've added the following rows to your sample data:
-- T1
110, 'sodium hydroxide, potassium carbonate'
-- T2
210, 'sodium hydroxide, potassium carbonate'
211, 'potassium hydroxide, sodium carbonate'
I found that it is easy to use TRANSLATE() in a way that gives you "false positives" ie the substances with ids 110, 210, and 211 will appear to "match". (In other words: I don't think this is the right tool for this job.)
DBFIDDLE here
(follow the link to see the sample tables and queries).
How to compare comma separated values in mysql?
Try find_in_set
function:
select users.name,items.price
from orders
join users
on user.id = orders.user_id
join items
on find_in_set(items.id, orders.item_id)
where orders.id = 1
However, store id
with comma separated is a kind of bad db design, you'd better add a table to store the relationship between orders
and items
.
Just take an example:
create table order_item (
order_id varchar(20),
item_id varchar(20)
);
then the query:
select users.name,items.price
from orders
join users
on user.id = orders.user_id
join order_item oi on oi.order_id = orders.id
join items on oi.item_id = items.id
where orders.id = 1
group by orders.id
EDIT:
With []
included, you can try this:
select users.name,items.price
from orders
join users
on user.id = orders.user_id
join items
on find_in_set(items.id, replace(replace(orders.item_id, '[', ''), ']', ''))
where orders.id = 1
Compare Single Value with comma Separated value in SQL Server
Below the code requested:
DECLARE @sec varchar(max)
SET @sec = '2,3,4,5,6,89,52,36,14,';
;WITH CTE_SEC AS (
SELECT value AS sec
FROM
STRING_SPLIT(@sec, ',')
WHERE
RTRIM(value) <> ''
)
SELECT *
FROM
XXX X
INNER JOIN CTE_SEC C ON C.sec = X.yyyyy
STRING_SPLIT()
function was introduced with SQL Server 2016
Related Topics
How to Prevent Deletion of the First Row in Table (Postgresql)
Converting a Time into 12 Hour Format in SQL
Error Converting Data Type Varchar
Compare Strings as Numbers in SQLite3
How to Group by Each Day in Pl/Sql
Find Out the Calling Stored Procedure in SQL Server
How to Create a Check Constraint on a Varchar Column in SQL Server Specifying a Minimum Data Length
How to Declare Input-Output Parameters in SQL Server Stored Procedure/Function
Adding a Column to All User Tables in T-Sql
Inserting New Columns in the Middle of a Table
Calculating How Many Days Are Between Two Dates in Db2
How to Add Sequence Number for Each Element in a Group Using a SQL Query Without Temp Tables
SQL Server 2008 R2 Using Pivot with Varchar Columns Not Working
Update Statement in Oracle Using SQL or Pl/SQL to Update First Duplicate Row Only