SQL: Find Rows and Sort According to Number of Matching Columns

SQL: find rows and sort according to number of matching columns?

There are probably a few ways to optimise the sub-queries, but without using case statements or sub-optimal join clauses:

select
    *
from
(
    select
        selection.CarId,
        selection.Colour,
        selection.Weight,
        selection.Type,
        3 as Relevance
    from
        tblCars as selection
    where
        selection.Colour = 'black' and selection.Weight = 'light' and selection.Type = 'van'
    union all
    select
        cars.CarId,
        cars.Colour,
        cars.Weight,
        cars.Type,
        count(*) as Relevance
    from
        tblCars as cars
    inner join
    (
        select
            byColour.CarId
        from
            tblCars as cars
        inner join
            tblCars as byColour
        on
            cars.Colour = byColour.Colour
        where
            cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
        and
            byColour.CarId <> cars.CarId
        union all
        select
            byWeight.CarId
        from
            tblCars as cars
        inner join
            tblCars as byWeight
        on
            cars.Weight = byWeight.Weight
        where
            cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
        and
            byWeight.CarId <> cars.CarId
        union all
        select
            byType.CarId
        from
            tblCars as cars
        inner join
            tblCars as byType
        on
            cars.Type = byType.Type
        where
            cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
        and
            byType.CarId <> cars.CarId
    ) as matches
    on
        cars.CarId = matches.CarId
    group by
        cars.CarId,
        cars.Colour,
        cars.Weight,
        cars.Type
) as results
order by
    Relevance desc

Output:

CarId  Colour  Weight  Type  Relevance
1      black   light   van   3
3      white   light   van   2
4      blue    light   van   2
5      black   medium  van   2
6      white   medium  van   1
7      blue    medium  van   1
8      black   heavy   limo  1
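
For comparison, the same ranking can be reproduced in one pass by adding up a 0/1 term per matching column. This is a simpler alternative to the query above, not a copy of it. The sketch runs through Python's sqlite3 module so that it is self-contained, and it relies on SQLite evaluating a comparison to 0 or 1. The rows are taken from the output above, except CarId 2, which is a made-up row that matches nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCars (CarId INTEGER PRIMARY KEY, Colour TEXT, Weight TEXT, Type TEXT)"
)
conn.executemany(
    "INSERT INTO tblCars VALUES (?, ?, ?, ?)",
    [
        (1, "black", "light", "van"),
        (2, "white", "heavy", "limo"),  # assumed row: matches no column
        (3, "white", "light", "van"),
        (4, "blue", "light", "van"),
        (5, "black", "medium", "van"),
        (6, "white", "medium", "van"),
        (7, "blue", "medium", "van"),
        (8, "black", "heavy", "limo"),
    ],
)

# In SQLite each comparison yields 0 or 1, so the sum is the number of matching columns.
car_rows = conn.execute(
    """
    SELECT CarId, Colour, Weight, Type, Relevance
    FROM (
        SELECT CarId, Colour, Weight, Type,
               (Colour = 'black') + (Weight = 'light') + (Type = 'van') AS Relevance
        FROM tblCars
    )
    WHERE Relevance > 0
    ORDER BY Relevance DESC, CarId
    """
).fetchall()

for row in car_rows:
    print(row)
```

On engines where a comparison is not an integer, each term would have to be wrapped in a CASE expression instead.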

Sort column values to match order of values in another table column

So you need to update Column2 with the row number according to Column1?

You can use ROW_NUMBER and a CTE:

WITH CTE AS 
(
SELECT Column1, Column2, RN = ROW_NUMBER() OVER (ORDER BY Column1)
FROM MyTable
)
UPDATE CTE SET Column2 = RN;

This updates the table MyTable directly and works because the CTE selects from a single table. If the CTE references more than one table, you have to JOIN the UPDATE target with the CTE.
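
To see what the numbering produces without a SQL Server instance, here is a small sketch using Python's sqlite3 module. SQLite cannot update through a CTE, so the sketch swaps in a correlated subquery over the same ROW_NUMBER() expression. It assumes SQLite 3.25 or later (for window functions), and the three sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Column1 TEXT, Column2 INTEGER)")
conn.executemany(
    "INSERT INTO MyTable (Column1) VALUES (?)",
    [("pear",), ("apple",), ("fig",)],  # assumed sample data
)

# Number the rows by Column1, then copy each row's number into Column2.
conn.execute(
    """
    UPDATE MyTable
    SET Column2 = (
        SELECT ranked.rn
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (ORDER BY Column1) AS rn
            FROM MyTable
        ) AS ranked
        WHERE ranked.rid = MyTable.rowid
    )
    """
)

numbered = conn.execute(
    "SELECT Column1, Column2 FROM MyTable ORDER BY Column2"
).fetchall()
print(numbered)
```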


Sort by first matching number then by second matching number and so on in SQL

Assuming you would have 2 number blocks at most and each number would be 10 digits at most, I created a sample CLR UDF like this for you (DbProject - SQL CLR Database project):

using System.Collections.Generic;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString CustomStringParser(SqlString str)
    {
        int depth = 2;  // 2 numbers at most
        int width = 10; // 10 digits at most

        // A NULL input has no numbers to sort by, so pass NULL through.
        if (str.IsNull)
        {
            return SqlString.Null;
        }

        // Left-pad every run of digits to a fixed width so that
        // string comparison agrees with numeric comparison.
        List<string> numbers = new List<string>();
        var matches = Regex.Matches(str.Value, @"\d+");
        foreach (Match match in matches)
        {
            // long rather than int: a 10-digit number can exceed int.MaxValue.
            numbers.Add(long.Parse(match.Value).ToString().PadLeft(width, '0'));
        }
        return string.Join("", numbers.ToArray()).PadRight(depth * width);
    }
}

I added this to the 'test' database as follows:

IF EXISTS ( SELECT  *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[ufn_MyCustomParser]') AND
type IN ( N'FN', N'IF', N'TF', N'FS', N'FT' ) )
DROP FUNCTION [dbo].[ufn_MyCustomParser]
GO
IF EXISTS ( SELECT *
FROM sys.[assemblies] AS [a]
WHERE [a].[name] = 'DbProject' AND
[a].[is_user_defined] = 1 )
DROP ASSEMBLY DbProject;
GO

CREATE ASSEMBLY DbProject
FROM 'C:\SQLCLR\DbProject\DbProject\bin\Debug\DbProject.dll'
WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION ufn_MyCustomParser ( @csv NVARCHAR(4000))
RETURNS NVARCHAR(4000)
AS EXTERNAL NAME
DbProject.[UserDefinedFunctions].CustomStringParser;
GO

Note: this was done on SQL Server 2012. SQL Server 2017 and later enable the "clr strict security" option by default, which you need to handle before the assembly can be created.

Finally tested with this T-SQL:

declare @MyTable table (col1 varchar(50));
insert into @MyTable values
('Btc0504'),
('Btc0007_Shd_7'),
('Btc0007_Shd_01'),
('Btc0007_Shd_6'),
('MR_Tst_Btc0565'),
('Btc0004_Shd_4'),
('Btc_BwwwQAZtc0605'),
('Btc_Bwwwwe12541edddddtc0605'),
('QARTa1b2');
SELECT * FROM @MyTable
ORDER BY dbo.ufn_MyCustomParser(col1);

Output:

col1
QARTa1b2
Btc0004_Shd_4
Btc0007_Shd_01
Btc0007_Shd_6
Btc0007_Shd_7
Btc0504
MR_Tst_Btc0565
Btc_BwwwQAZtc0605
Btc_Bwwwwe12541edddddtc0605
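
The sort key itself is easy to check outside SQL Server. The following is a rough Python port of the parser's logic, intended only to show what key each string gets. The function name and its defaults mirror the C# code above; nothing here is part of the original CLR project.

```python
import re


def custom_string_parser(text, depth=2, width=10):
    """Build a fixed-width sort key from the digit runs in text."""
    # Strip leading zeros via int(), then left-pad each number to `width` digits.
    numbers = [str(int(run)).zfill(width) for run in re.findall(r"\d+", text)]
    # Right-pad with spaces so keys with fewer numbers still have full length.
    return "".join(numbers).ljust(depth * width)


values = [
    "Btc0504",
    "Btc0007_Shd_7",
    "Btc0007_Shd_01",
    "Btc0007_Shd_6",
    "MR_Tst_Btc0565",
    "Btc0004_Shd_4",
    "Btc_BwwwQAZtc0605",
    "Btc_Bwwwwe12541edddddtc0605",
    "QARTa1b2",
]

sorted_values = sorted(values, key=custom_string_parser)
for value in sorted_values:
    print(repr(custom_string_parser(value)), value)
```

For example, 'Btc0007_Shd_01' maps to '00000000070000000001', which sorts before the key for 'Btc0007_Shd_6'.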

SQL multiple words search, ordered by number of matches

ORDER BY
(
    CASE
        WHEN col LIKE '%red%' THEN 1
        ELSE 0
    END
    +
    CASE
        WHEN col LIKE '%green%' THEN 1
        ELSE 0
    END
    +
    CASE
        WHEN col LIKE '%blue%' THEN 1
        ELSE 0
    END
) DESC

If your DB vendor has IF, you can use it instead of CASE. For example, in MySQL you can write:
IF(col LIKE '%red%', 1, 0) + IF(...)
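
A quick way to try the CASE-sum ordering is an in-memory SQLite database. In this sketch the table name items, its id column, and the four sample rows are assumptions made for illustration; only the ORDER BY idea comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        (1, "yellow taxi"),
        (2, "red car"),
        (3, "green and blue flag"),
        (4, "red green blue stripes"),
    ],
)

# Each CASE contributes 1 when its word is found, so the sum is the match count.
search_rows = conn.execute(
    """
    SELECT id, col,
           CASE WHEN col LIKE '%red%'   THEN 1 ELSE 0 END
         + CASE WHEN col LIKE '%green%' THEN 1 ELSE 0 END
         + CASE WHEN col LIKE '%blue%'  THEN 1 ELSE 0 END AS matches
    FROM items
    ORDER BY matches DESC, id
    """
).fetchall()

for row in search_rows:
    print(row)
```

Note that LIKE '%red%' is a substring test, so it would also count a word such as "bored".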

Sort SQL records based on matched conditions

.... ORDER BY CASE
    WHEN key LIKE '1,2,3,%' THEN 1
    WHEN key LIKE '1,2,%' THEN 2
    ELSE 3
END
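
The CASE expression acts as a priority bucket: the first WHEN that matches wins, so the longer prefix has to be listed first. A minimal sketch with Python's sqlite3 module, assuming a hypothetical table records with made-up rows (the column name key is quoted because it is also a SQL keyword):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE records (id INTEGER PRIMARY KEY, "key" TEXT)')
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        (1, "5,6,7"),
        (2, "1,2,9"),
        (3, "1,2,3,4"),
        (4, "1,20,3"),  # starts with "1,2" but not with "1,2,"
    ],
)

# Lower bucket number sorts first; id breaks ties inside a bucket.
bucket_rows = conn.execute(
    """
    SELECT id, "key"
    FROM records
    ORDER BY CASE
                 WHEN "key" LIKE '1,2,3,%' THEN 1
                 WHEN "key" LIKE '1,2,%'   THEN 2
                 ELSE 3
             END,
             id
    """
).fetchall()

for row in bucket_rows:
    print(row)
```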

How can I return the best matched row first in sort order from a set returned by querying a single search term against multiple columns in Postgres?

Use greatest() in the ORDER BY clause, so that each row is ranked by its best-matching column:

ORDER BY greatest(similarity('12345', foo_text), similarity('12345', bar_text), similarity('12345', foobar_text)) desc
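
The idea is "score every column, keep the best score per row, sort by that". The sketch below shows the same shape in plain Python. It does not reproduce Postgres's similarity(); it swaps in difflib.SequenceMatcher as a stand-in scorer, and the three rows are invented for illustration.

```python
from difflib import SequenceMatcher


def score(term, text):
    # Stand-in for similarity(): a ratio in [0, 1], not the trigram measure.
    return SequenceMatcher(None, term, text).ratio()


rows = [
    {"id": 1, "foo_text": "abc", "bar_text": "12999", "foobar_text": "xyz"},
    {"id": 2, "foo_text": "12345", "bar_text": "zzz", "foobar_text": "qqq"},
    {"id": 3, "foo_text": "hello", "bar_text": "world", "foobar_text": "12340"},
]
columns = ("foo_text", "bar_text", "foobar_text")
term = "12345"

# max() over the columns plays the role of greatest().
ranked = sorted(
    rows,
    key=lambda row: max(score(term, row[col]) for col in columns),
    reverse=True,
)
ranked_ids = [row["id"] for row in ranked]
print(ranked_ids)
```

The row with an exact match in any one column comes first, regardless of which column holds it.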

SQL query to find rows with the most matching keywords

As @a_horse commented: this would be simpler with a normalized design (which would also make other tasks simpler and cleaner), but it is still not trivial.

Also, a PK column of data type character varying(36) is highly suspicious (and inefficient) and should most probably be an integer type or at least a uuid instead.

Here is one possible solution based on your design as is:

WITH cte AS (
   SELECT id, string_to_array(a.keywords, ',') AS keys
   FROM article a
   )
SELECT id, string_agg(b_id, ',') AS best_matches
FROM (
   SELECT a.id, b.id AS b_id
        , row_number() OVER (PARTITION BY a.id ORDER BY ct.ct DESC, b.id) AS rn
   FROM cte a
   LEFT JOIN cte b ON a.id <> b.id AND a.keys && b.keys
   LEFT JOIN LATERAL (
      SELECT count(*) AS ct
      FROM (
         SELECT * FROM unnest(a.keys)
         INTERSECT ALL
         SELECT * FROM unnest(b.keys)
         ) i
      ) ct ON TRUE
   ORDER BY a.id, ct.ct DESC, b.id -- b.id as tiebreaker
   ) sub
WHERE rn < 4
GROUP BY 1;

sqlfiddle (using an integer id instead).

The CTE cte converts the string into an array. You could even have a functional GIN index like that ...

If multiple rows tie for the top 3 picks, you need to define a tiebreaker. In my example, rows with smaller id come first.

Detailed explanation in this recent related answer:

  • Query and order by number of matches in JSON array

The comparison there is between a JSON array and an SQL array, but it's basically the same problem and boils down to the same solution(s). That answer also compares a couple of similar alternatives.

To make this fast, you should at least have a GIN index on the array column (instead of the comma-separated string) and the query wouldn't need the CTE step. A completely normalized design has other advantages, but won't necessarily be faster than an array with GIN index.
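
The logic of the query (count shared keywords per pair, keep the top three per article, break ties by smaller id) can be sketched in plain Python to check expectations on small data. This is an illustration of the logic only, not a replacement for the query. The sample articles are made up, and Counter intersection is used to mimic INTERSECT ALL.

```python
from collections import Counter


def best_matches(articles, limit=3):
    """Map each article id to the ids of its best-matching articles."""
    keys = {aid: Counter(kw.split(",")) for aid, kw in articles.items()}
    result = {}
    for a_id, a_keys in keys.items():
        scored = []
        for b_id, b_keys in keys.items():
            if b_id == a_id:
                continue
            # Multiset intersection, like INTERSECT ALL on the unnested arrays.
            shared = sum((a_keys & b_keys).values())
            if shared > 0:  # same filter as the && (overlap) join condition
                scored.append((-shared, b_id))
        scored.sort()  # most shared keywords first, then smaller id
        result[a_id] = [b_id for _, b_id in scored[:limit]]
    return result


articles = {
    1: "sql,postgres,index",
    2: "sql,postgres,array",
    3: "sql,json",
    4: "python,pandas",
    5: "postgres,index,sql",
}
matches = best_matches(articles)
print(matches)
```

One difference from the SQL: an article with no overlapping partner gets an empty list here, whereas the LEFT JOIN version returns a row with a NULL best_matches value.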

SQL Find most rows that match between two tables

We really could do with some expected output to help clarify the question.

If I understand you correctly, however, this query will get you close to the results you require:

;with cte as
( SELECT t1a.[group] AS Group1
       , t2a.[Group] AS Group2
       , RANK() OVER(PARTITION BY t1a.[group]
                     ORDER BY COUNT(t2a.[Group]) DESC) AS MatchRank
  FROM Table1 t1a
  JOIN Table2 t2a
    ON t1a.member = t2a.member
  GROUP BY t1a.[group], t2a.[Group])
SELECT *
FROM cte
WHERE MatchRank = 1

The query doesn't flag ties, but because RANK() gives tied rows the same rank, it will display any tied results...
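
The tie behaviour is easiest to see on a small data set. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later for RANK()) and splits the work into two CTEs, counting first and ranking second, which is a slight restructuring of the query above. The group and member values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 ([group] TEXT, member TEXT)")
conn.execute("CREATE TABLE Table2 ([group] TEXT, member TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("A", "m1"), ("A", "m2"), ("A", "m3"), ("B", "m4"), ("B", "m5")],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [("X", "m1"), ("X", "m2"), ("Y", "m3"), ("Y", "m4"), ("Z", "m5")],
)

top_matches = conn.execute(
    """
    WITH counts AS (
        SELECT t1.[group] AS Group1, t2.[group] AS Group2, COUNT(*) AS Shared
        FROM Table1 t1
        JOIN Table2 t2 ON t1.member = t2.member
        GROUP BY t1.[group], t2.[group]
    ),
    ranked AS (
        SELECT Group1, Group2, Shared,
               RANK() OVER (PARTITION BY Group1 ORDER BY Shared DESC) AS MatchRank
        FROM counts
    )
    SELECT Group1, Group2, Shared
    FROM ranked
    WHERE MatchRank = 1
    ORDER BY Group1, Group2
    """
).fetchall()

for row in top_matches:
    print(row)
```

Group A has one clear winner (X, with two shared members), while group B shares one member each with Y and Z, so both rows come back with rank 1.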

If you are new to common table expressions (the ;with statement), there is a useful description here.


