What Is The SQL Used to Do a Search Similar to "Related Questions" on Stackoverflow

Searching the same word with different cases

You can either use

 WHERE LOWER(< Name_field >) LIKE '%abc%'

or

 WHERE REGEXP_LIKE(< Name_field >,'abc','i') --where "i" means case "i"nsensitive search

What is Full Text Search vs LIKE

In general, there is a tradeoff between "precision" and "recall". High precision means that fewer irrelevant results are presented (no false positives), while high recall means that fewer relevant results are missing (no false negatives). Using the LIKE operator gives you 100% precision with no concessions for recall. A full text search facility gives you a lot of flexibility to tune down the precision for better recall.

Most full text search implementations use an "inverted index". This is an index where the keys are individual terms, and the associated values are sets of records that contain the term. Full text search is optimized to compute the intersection, union, etc. of these record sets, and usually provides a ranking algorithm to quantify how strongly a given record matches search keywords.
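As a rough illustration of the inverted index described above, here is a minimal Python sketch. The function names and the sample titles are made up for the example, and real engines add tokenization, compression and ranking on top of this.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all(index, terms):
    """AND query: ids of documents that contain every term (set intersection)."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_any(index, terms):
    """OR query: ids of documents that contain at least one term (set union)."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


# Hypothetical sample data, keyed by record id.
docs = {
    1: "Harry Potter and the Goblet of Fire",
    2: "Harry Potter and the Order of the Phoenix",
    3: "The Order of Things",
}
index = build_inverted_index(docs)

print(search_all(index, ["order", "the"]))       # {2, 3}
print(search_any(index, ["goblet", "phoenix"]))  # {1, 2}
```

Each lookup touches only the record sets for the query terms, which is why this structure scales better than scanning every row.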

The SQL LIKE operator can be extremely inefficient. If you apply it to an un-indexed column, a full scan will be used to find matches (just like any query on an un-indexed field). If the column is indexed, matching can be performed against index keys, but with far less efficiency than most index lookups. In the worst case, the LIKE pattern will have leading wildcards that require every index key to be examined. In contrast, many information retrieval systems can enable support for leading wildcards by pre-compiling suffix trees in selected fields.
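To make the cost difference concrete, the following Python sketch models an index as a sorted list of keys. It is only an analogy for a B-tree index, not how any particular database implements LIKE. The helper names are hypothetical, and `examined` counts keys read after the binary search.

```python
from bisect import bisect_left


def prefix_lookup(sorted_keys, prefix):
    """Analogue of LIKE 'prefix%': seek to the first candidate, read a contiguous run."""
    start = bisect_left(sorted_keys, prefix)
    matches, examined = [], 0
    for key in sorted_keys[start:]:
        examined += 1
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches, examined


def substring_lookup(sorted_keys, fragment):
    """Analogue of LIKE '%fragment%': ordering does not help, every key is examined."""
    matches = [key for key in sorted_keys if fragment in key]
    return matches, len(sorted_keys)


keys = sorted([
    "azkaban", "chamber", "goblet", "hallows", "half-blood",
    "phoenix", "prince", "prisoner", "stone",
])

print(prefix_lookup(keys, "pri"))     # (['prince', 'prisoner'], 3)
print(substring_lookup(keys, "one"))  # (['prisoner', 'stone'], 9)
```

The prefix search stops as soon as it leaves the matching range, while the leading-wildcard search has to look at all nine keys.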

Other features typical of full-text search are

  • lexical analysis or tokenization: breaking a block of unstructured text into individual words, phrases, and special tokens
  • morphological analysis, or stemming: collapsing variations of a given word into one index term; for example, treating "mice" and "mouse", or "electrification" and "electric", as the same word
  • ranking: measuring the similarity of a matching record to the query string
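The first two features in the list can be sketched in a few lines of Python. This is a toy: the `SUFFIXES` rules and the `IRREGULAR` table are assumptions invented for the example, and production stemmers (the Porter stemmer, for instance) use far more careful rules.

```python
import re

# Hypothetical toy rules, chosen only so the examples from the text work.
IRREGULAR = {"mice": "mouse"}
SUFFIXES = ("ification", "ic", "ing", "es", "s")


def tokenize(text):
    """Break text into lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Collapse a token to a crude index term."""
    if token in IRREGULAR:
        return IRREGULAR[token]
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Tokenize, then stem every token."""
    return [stem(token) for token in tokenize(text)]


print(tokenize("Blue-cars, 10ml!"))              # ['blue', 'cars', '10ml']
print(stem("mice") == stem("mouse"))             # True
print(stem("electrification"), stem("electric")) # electr electr
```

Because both the indexed text and the query go through the same `index_terms` step, "electric" in a query can match "electrification" in a record.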

Search similar words in MySQL

You could try the query below, which uses an aggregate function (MAX) to keep only the rows with the best score:

SELECT *,
       MATCH(Description) AGAINST('Acifree -O 10ml' IN NATURAL LANGUAGE MODE) AS score
FROM tutorial
WHERE MATCH(Description) AGAINST('Acifree -O 10ml' IN NATURAL LANGUAGE MODE)
  AND MATCH(Description) AGAINST('Acifree -O 10ml' IN NATURAL LANGUAGE MODE) = (
      SELECT MAX(MATCH(Description) AGAINST('Acifree -O 10ml' IN NATURAL LANGUAGE MODE))
      FROM tutorial
  )


output

id  description      score
1   Acifree -O 10ml  0.15835624933242798
2   Acifree O 10ml   0.15835624933242798

When MATCH() is used in a WHERE clause, the rows returned are automatically sorted with the highest relevance first, provided the query has no explicit ORDER BY clause and the search uses the full-text index.
Relevance values are non-negative floating-point numbers.
Zero relevance means no similarity.
Relevance is computed based on:

  • the number of words in the row
  • the number of unique words in that row
  • the total number of words in the collection
  • the number of documents (rows) that contain a particular word

Because you need only the most relevant rows, the query keeps those whose score equals the maximum score.
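The factors listed above are the usual ingredients of a TF-IDF style weight. The Python sketch below is a simplified textbook version and is not MySQL's actual relevance formula. The sample rows are hypothetical. It shows how term frequency within a row and rarity across rows combine into a non-negative score.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query):
    """Score each document against the query with a plain TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents (rows) that contain each term.
    doc_freq = Counter(term for words in tokenized for term in set(words))
    scores = []
    for words in tokenized:
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            if counts[term]:
                tf = counts[term] / len(words)            # share of the row's words
                idf = math.log(n_docs / doc_freq[term])   # rarer terms weigh more
                score += tf * idf
        scores.append(score)
    return scores


# Hypothetical rows, not the data from the answer above.
docs = ["acifree o 10ml", "paracetamol 10ml syrup", "cough syrup 100ml"]
scores = tf_idf_scores(docs, "acifree 10ml")

best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])  # acifree o 10ml
```

A row that matches nothing scores exactly zero, and picking the maximum score mirrors what the MAX() subquery does in the SQL above.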

How to find similar results in SQL Server?

It's difficult to get something that works really well for this sort of thing in SQL Server. Fuzzy matches are really hard to work with when you need to search for spelling mistakes while trying not to get bad matches on things.

For example, the following is one way you could try to do this:

DECLARE @ TABLE (id INT IDENTITY(1, 1), blah NVARCHAR(255));

INSERT @ VALUES ('Harry Potter and the Chamber of Secrets')
,('Harry Potter and the Deathly Hallows: Part 1')
,('Harry Potter and the Deathly Hallows: Part 2')
,('Harry Potter and the Goblet of Fire')
,('Harry Potter and the Half-Blood Prince')
,('Harry Potter and the Order of the Phoenix')
,('Harry Potter and the Prisoner of Azkaban')
,('Harry Potter and the Sorcerer''s Stone');

DECLARE @myVar NVARCHAR(255) = 'deadly halow'; -- returns 2 matches (both parts of Deathly Hallows)
-- SET @myVar = 'hary poter'; -- returns 8 matches, all of them
-- SET @myVar = 'order'; -- returns 1 match (Order of the Phoenix)
-- SET @myVar = 'phoneix'; -- returns 2 matches (Order of the Phoenix and Half-blood Prince, the latter due to a fuzzy match on 'prince')

-- CTE: every title, plus the search term as the row with id 0
WITH CTE AS (
    SELECT id, blah
    FROM @
    UNION ALL
    SELECT 0, @myVar
)
-- CTE2: recursive split of each string into words, one row per word (L is the word position)
, CTE2 AS (
    SELECT id
        , blah
        , SUBSTRING(blah, 1, ISNULL(NULLIF(CHARINDEX(' ', blah), 0) - 1, LEN(blah))) individualWord
        , NULLIF(CHARINDEX(' ', blah), 0) cIndex
        , 1 L
    FROM CTE
    UNION ALL
    SELECT CTE.id
        , CTE.blah
        , SUBSTRING(CTE.blah, cIndex + 1, ISNULL(NULLIF(CHARINDEX(' ', CTE.blah, cIndex + 1), 0) - 1 - cIndex, LEN(CTE.blah)))
        , NULLIF(CHARINDEX(' ', CTE.blah, cIndex + 1), 0)
        , L + 1
    FROM CTE2
    JOIN CTE ON CTE.id = CTE2.id
    WHERE cIndex IS NOT NULL
)
-- Keep the titles in which every search word has at least one close-sounding word
SELECT blah
FROM (
    SELECT X.blah, ROW_NUMBER() OVER (PARTITION BY X.ID, Y.L ORDER BY (SELECT NULL)) RN, Y.wordCount
    FROM CTE2 X
    JOIN (SELECT *, COUNT(*) OVER() wordCount FROM CTE2 WHERE id = 0) Y ON DIFFERENCE(X.individualWord, Y.individualWord) >= 3 AND X.id <> 0) T
WHERE RN = 1
GROUP BY blah
HAVING COUNT(*) = MAX(wordCount);

This splits the search term into individual words, does the same for each title, then uses the DIFFERENCE() function, which compares the SOUNDEX() codes of two values and returns how similar they are on a scale of 0 to 4. For example, SOUNDEX('Halow') is 'H400' and SOUNDEX('Hallows') is 'H420', so DIFFERENCE() returns 3 (because H, 4 and one of the zeroes match). A perfect match scores 4, and the query treats a score of 3 or more as a close match.

Unfortunately, because you need to check for close matches, you get some false positives with this sometimes. I tested it with, for example, 'phoneix' as the input and got a match on 'Half-blood Prince' due to a fuzzy match between 'prince' and 'phoenix'. I'm sure there are ways this could be improved upon, but something like this should work as a basis for what you're trying to achieve.
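For readers who want to see where codes such as 'H400' and 'H420' come from, here is a Python sketch of the standard American Soundex rules. The position-by-position `difference` helper is an assumption made for illustration. SQL Server's own SOUNDEX() and DIFFERENCE() may score some pairs differently, so treat this as an approximation.

```python
# Letter groups for codes 1..6 in American Soundex.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word):
    """Return the four-character Soundex code of a single word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels map to "" and reset prev
        if code and code != prev:
            out.append(code)
        prev = code
        if len(out) == 4:
            break
    return "".join(out).ljust(4, "0")


def difference(a, b):
    """Assumed approximation: count positions where the two codes agree (0 to 4)."""
    return sum(x == y for x, y in zip(soundex(a), soundex(b)))


print(soundex("Halow"), soundex("Hallows"))  # H400 H420
print(difference("Halow", "Hallows"))        # 3
print(difference("order", "order"))          # 4
```

Because the code keeps only the first letter and three consonant classes, quite different words can end up close together, which is where the false positives come from.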

SQL - similar data in column

You could use SOUNDEX to do this.

Sample data;

CREATE TABLE #SampleData (Column1 int, Column2 varchar(10))
INSERT INTO #SampleData (Column1, Column2)
VALUES
(1,'blue car')
,(2,'red doll')
,(3,'blue cars')
,(4,'green tree')
,(5,'red dolly')

The following code will use soundex to create a list of similar entries in column2. It then uses a different sub query to see how many occurrences of that soundex field appear;

SELECT
    a.GroupingField
    ,a.Title
    ,b.SimilarFields
FROM (
    SELECT
        SOUNDEX(Column2) GroupingField
        ,MAX(Column2) Title --Just return a unique title for this soundex group
    FROM #SampleData
    GROUP BY SOUNDEX(Column2)
) a
LEFT JOIN (
    SELECT
        SOUNDEX(Column2) GroupingField
        ,COUNT(Column2) SimilarFields --How many fields are in the soundex group?
    FROM #SampleData
    GROUP BY SOUNDEX(Column2)
) b
    ON a.GroupingField = b.GroupingField
WHERE b.SimilarFields > 1

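The results look like this (I've left the soundex field in to show you what it looks like). Note that B400 and R300 are the codes for "blue" and "red" alone, so in this sample only the first word of each value ends up being compared;

GroupingField   Title       SimilarFields
B400            blue cars   2
R300            red dolly   2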

Some further reading on soundex https://msdn.microsoft.com/en-gb/library/ms187384.aspx

Edit: as per your request, to get back to the original data you can push the results into a temp table. Change the query I've given you by putting an INTO before the FROM clause;

SELECT
    a.GroupingField
    ,a.Title
    ,b.SimilarFields
INTO #Duplicates
FROM (
    SELECT
        SOUNDEX(Column2) GroupingField
        ,MAX(Column2) Title --Just return a unique title for this soundex group
    FROM #SampleData
    GROUP BY SOUNDEX(Column2)
) a
LEFT JOIN (
    SELECT
        SOUNDEX(Column2) GroupingField
        ,COUNT(Column2) SimilarFields --How many fields are in the soundex group?
    FROM #SampleData
    GROUP BY SOUNDEX(Column2)
) b
    ON a.GroupingField = b.GroupingField
WHERE b.SimilarFields > 1

Then use the following query to link back to your original data;

SELECT
    a.GroupingField
    ,a.Title
    ,a.SimilarFields
    ,b.Column1
    ,b.Column2
FROM #Duplicates a
JOIN #SampleData b
    ON a.GroupingField = SOUNDEX(b.Column2)
ORDER BY a.GroupingField

This would give the following result;

GroupingField   Title       SimilarFields   Column1     Column2
B400            blue cars   2               1           blue car
B400            blue cars   2               3           blue cars
R300            red dolly   2               5           red dolly
R300            red dolly   2               2           red doll

Remember to

DROP TABLE #Duplicates

SQL search multiple values in same field

Yes, you can use the SQL IN operator to search for multiple exact values:

SELECT name FROM products WHERE name IN ( 'Value1', 'Value2', ... );

If you want to use LIKE you will need to use OR instead:

SELECT name FROM products WHERE name LIKE '%Value1' OR name LIKE '%Value2';

Using AND (as you tried) requires ALL conditions to be true; using OR requires at least one to be true.
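The difference between the three forms is easy to check against a small in-memory table. The sketch below uses Python's built-in sqlite3 module, and the product names are made up for the example. Other databases should behave the same way for these queries, apart from differences in LIKE case sensitivity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [("red apple",), ("green pear",), ("apple",), ("pear",), ("plum",)],
)


def names(sql, params=()):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql, params))


# IN matches whole values only.
exact = names("SELECT name FROM products WHERE name IN (?, ?)", ("apple", "pear"))
# LIKE ... OR ... matches rows that satisfy at least one pattern.
either = names(
    "SELECT name FROM products WHERE name LIKE ? OR name LIKE ?",
    ("%apple", "%pear"),
)
# LIKE ... AND ... needs one value to end in both "apple" and "pear".
both = names(
    "SELECT name FROM products WHERE name LIKE ? AND name LIKE ?",
    ("%apple", "%pear"),
)

print(exact)   # ['apple', 'pear']
print(either)  # ['apple', 'green pear', 'pear', 'red apple']
print(both)    # []
```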

Search related data in a different database

Joining tables from different databases may not be supported by your RDBMS (PostgreSQL, for example). If it is supported (MSSQL, MySQL), table names should be prefixed with the database name (and schema if needed). You can achieve this in Yii2 using the {{%TableName}} syntax in the tableName() function.

public static function tableName()
{
    return '{{%table_name}}';
}

But be careful with joining tables from different databases if they are located on different servers -- this can be very slow.

If you just want to get related data (the joined tables are not used in WHERE), use with() instead of joinWith(). The related rows are then fetched by a separate query with an IN condition. In most cases this performs better and causes no problems with different sources (or even different DBMS).

->with('nextTab', 'nextTab.nextTab1', 'nextTab.nextTab1.nextTab2')
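The "separate query with IN" idea does not depend on Yii2. The Python sketch below imitates it with two independent SQLite connections standing in for two databases. It is not Yii2's implementation, and the table and column names are hypothetical.

```python
import sqlite3

# Two separate in-memory databases: a JOIN across them is not possible here.
orders_db = sqlite3.connect(":memory:")
orders_db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    INSERT INTO orders VALUES (1, 'ann'), (2, 'bob');
""")
items_db = sqlite3.connect(":memory:")
items_db.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO items VALUES (1, 1, 'A-1'), (2, 1, 'B-2'), (3, 2, 'C-3');
""")

# Query 1: load the primary rows.
orders = orders_db.execute("SELECT id, customer FROM orders ORDER BY id").fetchall()

# Query 2: load all related rows at once with an IN list of the collected keys.
ids = [order_id for order_id, _ in orders]
placeholders = ", ".join("?" for _ in ids)
rows = items_db.execute(
    f"SELECT order_id, sku FROM items WHERE order_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

# Stitch the two result sets together in application code.
items_by_order = {}
for order_id, sku in rows:
    items_by_order.setdefault(order_id, []).append(sku)
result = [(customer, items_by_order.get(order_id, [])) for order_id, customer in orders]

print(result)  # [('ann', ['A-1', 'B-2']), ('bob', ['C-3'])]
```

Only two queries run regardless of how many orders there are, and neither database needs to know about the other.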

