Best Way to Do a Weighted Search Over Multiple Fields in MySQL

Best way to do a weighted search over multiple fields in MySQL?

This approach to weighting search results is probably suitable for you:

SELECT *,
IF(
`name` LIKE "searchterm%", 20,
IF(`name` LIKE "%searchterm%", 10, 0)
)
+ IF(`description` LIKE "%searchterm%", 5, 0)
+ IF(`url` LIKE "%searchterm%", 1, 0)
AS `weight`
FROM `myTable`
WHERE (
`name` LIKE "%searchterm%"
OR `description` LIKE "%searchterm%"
OR `url` LIKE "%searchterm%"
)
ORDER BY `weight` DESC
LIMIT 20

It computes the weight as an expression in the SELECT list and orders the results by it. Three fields are searched in this case, and you can specify a weight per field. It's probably less expensive than unions and probably one of the faster ways in plain MySQL only.
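To see the weights in action, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, so IF(...) is written as CASE and the search term is bound as a parameter. The sample rows and the weighted_search helper are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (name TEXT, description TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        ("widget pro", "a professional tool", "example.com/pro"),
        ("super widget", "contains a widget", "example.com/super"),
        ("gadget", "not a widget at all", "example.com/gadget"),
        ("gizmo", "something else", "example.com/widget"),
        ("unrelated", "nothing here", "example.com/none"),
    ],
)


def weighted_search(conn, term, limit=20):
    """Return (name, weight) rows, highest weight first."""
    sql = """
        SELECT name,
               CASE WHEN name LIKE :prefix THEN 20
                    WHEN name LIKE :anywhere THEN 10
                    ELSE 0 END
             + CASE WHEN description LIKE :anywhere THEN 5 ELSE 0 END
             + CASE WHEN url LIKE :anywhere THEN 1 ELSE 0 END AS weight
        FROM myTable
        WHERE name LIKE :anywhere
           OR description LIKE :anywhere
           OR url LIKE :anywhere
        ORDER BY weight DESC
        LIMIT :limit
    """
    params = {"prefix": term + "%", "anywhere": "%" + term + "%", "limit": limit}
    return conn.execute(sql, params).fetchall()


results = weighted_search(conn, "widget")
for name, weight in results:
    print(weight, name)
```

A name that starts with the term scores 20, a name that merely contains it scores 10, and description and url matches add 5 and 1 on top.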

If you've got more data and need results faster, you can consider using something like Sphinx or Lucene.

Calculate a weighted average for several columns

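Consider building a vector of sum expressions, then joining them with paste(..., collapse = ...) inside the larger SQL statement. Adjust 1:4 to the actual variable range. The line breaks only make the printed query easier to read; they make no difference to the query that is passed to the database.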

sums <- paste0("  sum(var", 1:4, "*time)/sum(time) as var", 1:4)

sql <- paste0('select ID,\n',
paste(sums, collapse = ', \n'), '\n',
'from table1 \n',
'group by ID;')

cat(sql)
# select ID,
# sum(var1*time)/sum(time) as var1,
# sum(var2*time)/sum(time) as var2,
# sum(var3*time)/sum(time) as var3,
# sum(var4*time)/sum(time) as var4
# from table1
# group by ID;

library(RODBC)  # provides odbcConnect() and sqlQuery()

channel <- odbcConnect("redacted", uid = "redacted", case = "nochange")
x <- sqlQuery(channel, sql)
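The same string-building idea can be sketched in Python. This is only an illustration: the sample rows are invented, and sqlite3 is used in place of the ODBC connection so the generated query can actually be run.

```python
import sqlite3

# One "sum(varN*time)/sum(time) as varN" expression per variable.
# range(1, 5) mirrors 1:4 in the R snippet; adjust it to the real columns.
sums = [f"  sum(var{i}*time)/sum(time) as var{i}" for i in range(1, 5)]
sql = "select ID,\n" + ",\n".join(sums) + "\nfrom table1\ngroup by ID;"
print(sql)

# Hypothetical data to run the generated query against.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ID INTEGER, time REAL,"
    " var1 REAL, var2 REAL, var3 REAL, var4 REAL)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1.0, 10.0, 0.0, 2.0, 4.0),
        (1, 3.0, 20.0, 4.0, 2.0, 8.0),
    ],
)
rows = conn.execute(sql).fetchall()
print(rows)
```

For var1 the time-weighted average is (10*1 + 20*3) / (1 + 3) = 17.5. Only generated column names are spliced into the string here; anything that comes from user input should be bound as a parameter instead.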

MySQL/PHP - Need to be able to produce query results with certain columns having more weight than others

Bear with me, this is going to be a strange query, but it seems to work on my end.

SELECT SUM(
IF(year = "1968", 30, 0) +
IF(make = "Ford", 100, 0) +
IF(model = "Mustang", 85, 0) +
IF(color = "Red", 10, 0) +
IF(type = "Sports Car", 50, 0)
) AS `weight`, cars.* FROM cars
WHERE year = "1968"
OR make = "Ford"
OR model = "Mustang"
OR color = "Red"
OR type = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;

Basically, this groups all results by their id (which is necessary for the SUM() function), does some calculations on the different fields, and returns the weight as a total value, which is then sorted from highest to lowest. Also, this will only return results where at least one of the columns matches a supplied value.

Since I don't have an exact copy of your database, run some tests with this on your end and let me know if there's anything that needs to be adjusted.

Expected Results:

+========+======+===========+==========+========+============+
| weight | year | make      | model    | color  | type       |
|========|======|===========|==========|========|============|
| 130    | 1968 | Ford      | Fairlane | Blue   | Roadster   |
| 100    | 2014 | Ford      | Taurus   | Silver | Sedan      |
| 60     | 2015 | Chevrolet | Corvette | Red    | Sports Car |
+========+======+===========+==========+========+============+

So, as you can see, the results list the closest matches first (using your criteria): two Ford vehicles (+100 each), one of them from 1968 (+30), followed by a red sports car (10 + 50).

One more thing: if you also want to display the rest of the results (i.e. results with a weight of 0), simply remove the WHERE ... OR ... clause so the query checks all records. Cheers!
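If you want to try the scoring without a MySQL server, here is a rough Python sketch using sqlite3 as a stand-in. The rows are invented to mirror the expected results, and the criteria list is just one way to keep the weights in one place. Since there is exactly one row per car, the sketch scores each row directly and leaves out SUM() and GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cars (id INTEGER PRIMARY KEY, year INTEGER, make TEXT,"
    " model TEXT, color TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO cars (year, make, model, color, type) VALUES (?, ?, ?, ?, ?)",
    [
        (2015, "Chevrolet", "Corvette", "Red", "Sports Car"),
        (1968, "Ford", "Fairlane", "Blue", "Roadster"),
        (2010, "Toyota", "Prius", "Green", "Hatchback"),  # matches nothing
        (2014, "Ford", "Taurus", "Silver", "Sedan"),
    ],
)

# (column, wanted value, weight) triples taken from the query above.
criteria = [
    ("year", 1968, 30),
    ("make", "Ford", 100),
    ("model", "Mustang", 85),
    ("color", "Red", 10),
    ("type", "Sports Car", 50),
]

# Column names and weights come from the fixed list above, never from user
# input; only the wanted values are bound as parameters.
score = " + ".join(f"CASE WHEN {col} = ? THEN {w} ELSE 0 END" for col, _, w in criteria)
where = " OR ".join(f"{col} = ?" for col, _, _ in criteria)
values = [value for _, value, _ in criteria]

sql = (
    f"SELECT {score} AS weight, year, make, model "
    f"FROM cars WHERE {where} ORDER BY weight DESC"
)
# The placeholders appear twice (score, then WHERE), so the values do too.
rows = conn.execute(sql, values + values).fetchall()
for row in rows:
    print(row)
```

Dropping the WHERE part (and the second copy of the values) would also return the Prius, with a weight of 0.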

Further to the comments below, checking the weight after a LEFT JOIN on a pivot table:

SELECT SUM(
IF(cars.year = "1968", 30, 0) +
IF(cars.make = "Ford", 100, 0) +
IF(cars.model = "Mustang", 85, 0) +
IF(cars.color = "Red", 10, 0) +
IF(types.name = "Sports Car", 50, 0)
) AS `weight`, cars.*, types.* FROM cars
LEFT JOIN cars_types ON cars_types.car_id = cars.id
LEFT JOIN types ON cars_types.type_id = types.id
WHERE cars.year = "1968"
OR cars.make = "Ford"
OR cars.model = "Mustang"
OR cars.color = "Red"
OR types.name = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;

Here is a picture of the LEFT JOIN in practice:

[Image: sample results of the LEFT JOIN query]

As you can see, the Cobalt matches on color (silver) and model (Cobalt) (85 + 10) while the Caliber matches on type (Sports Car) (50). And yes, I know a Dodge Caliber isn't a Sports Car, this was for example's sake. Hope that helped!
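One thing to watch with the joined version: if a car can have more than one row in cars_types, SUM() adds the car-level weights once per joined row. The sketch below (Python with sqlite3 as a stand-in, trimmed to two car columns, invented data, WHERE clause left out for brevity) shows the effect and one possible variant that counts the car-level weights once and takes the type weight with MAX(). Treat it as an assumption about what you want, not the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars (id INTEGER PRIMARY KEY, make TEXT, color TEXT);
    CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars_types (car_id INTEGER, type_id INTEGER);
    INSERT INTO cars VALUES (1, 'Ford', 'Red'), (2, 'Dodge', 'Blue');
    INSERT INTO types VALUES (1, 'Sports Car'), (2, 'Coupe');
    -- car 1 has two types, car 2 has one
    INSERT INTO cars_types VALUES (1, 1), (1, 2), (2, 1);
""")

joined = """
    FROM cars
    LEFT JOIN cars_types ON cars_types.car_id = cars.id
    LEFT JOIN types ON cars_types.type_id = types.id
    GROUP BY cars.id
    ORDER BY weight DESC
"""

# Same shape as the query above: every joined row adds the car-level weights.
summed = conn.execute("""
    SELECT cars.id,
           SUM(CASE WHEN cars.make = 'Ford' THEN 100 ELSE 0 END
             + CASE WHEN cars.color = 'Red' THEN 10 ELSE 0 END
             + CASE WHEN types.name = 'Sports Car' THEN 50 ELSE 0 END) AS weight
""" + joined).fetchall()

# Variant: car-level weights counted once, type weight taken as the best
# match over the joined rows.
once = conn.execute("""
    SELECT cars.id,
           CASE WHEN cars.make = 'Ford' THEN 100 ELSE 0 END
         + CASE WHEN cars.color = 'Red' THEN 10 ELSE 0 END
         + MAX(CASE WHEN types.name = 'Sports Car' THEN 50 ELSE 0 END) AS weight
""" + joined).fetchall()

print(summed)
print(once)
```

With the SUM() form, car 1 scores 160 + 110 = 270 because it has two type rows; with the variant it scores 100 + 10 + 50 = 160.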

How to sort and filter searches on multiple fields in SQL

You can add a weight for every column in your search results.

Here's the code:

SELECT *,
CASE WHEN `artist` LIKE '%$searchquestion%' THEN 1 ELSE 0 END AS artist_match,
CASE WHEN `genres` LIKE '%$searchquestion%' THEN 1 ELSE 0 END AS genres_match,
CASE WHEN `trackname` LIKE '%$searchquestion%' THEN 1 ELSE 0 END AS trackname_match,
CASE WHEN `album_name` LIKE '%$searchquestion%' THEN 1 ELSE 0 END AS album_name_match
FROM p2pm_tracks
WHERE
`artist` LIKE '%$searchquestion%' OR
`genres` LIKE '%$searchquestion%' OR
`trackname` LIKE '%$searchquestion%' OR
`album_name` LIKE '%$searchquestion%'
ORDER BY
`artist_match` DESC,
`genres_match` DESC,
`trackname_match` DESC,
`album_name_match` DESC,
`popularity` DESC
LIMIT $startingpoint, $resultsperpage

This query will gather the results related to:

  • the artist FIRST,
  • THEN the genre,
  • THEN the track's title,
  • THEN the album's name,
  • THEN the popularity of the song
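Here is a small Python sketch of that ordering, using sqlite3 as a stand-in for MySQL with invented rows. The search_tracks helper is hypothetical. It binds the search term and the paging values as parameters rather than interpolating $searchquestion into the SQL string, which also avoids SQL injection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE p2pm_tracks (artist TEXT, genres TEXT, trackname TEXT,"
    " album_name TEXT, popularity INTEGER)"
)
conn.executemany(
    "INSERT INTO p2pm_tracks VALUES (?, ?, ?, ?, ?)",
    [
        ("Other Band", "rock", "Blue Monday", "Hits", 90),  # track title match
        ("Blue Notes", "jazz", "Intro", "First", 10),       # artist match
        ("Someone", "blues", "Song", "Album", 50),          # genre match
        ("Nobody", "pop", "Tune", "Kind of Blue", 99),      # album match
        ("Blue Sky", "folk", "Walk", "Road", 70),           # artist match, more popular
        ("Quiet", "ambient", "Still", "Calm", 100),         # no match
    ],
)


def search_tracks(conn, term, start=0, per_page=10):
    """Return artist names ordered by which column matched, then popularity."""
    sql = """
        SELECT artist,
               CASE WHEN artist LIKE :q THEN 1 ELSE 0 END AS artist_match,
               CASE WHEN genres LIKE :q THEN 1 ELSE 0 END AS genres_match,
               CASE WHEN trackname LIKE :q THEN 1 ELSE 0 END AS trackname_match,
               CASE WHEN album_name LIKE :q THEN 1 ELSE 0 END AS album_name_match
        FROM p2pm_tracks
        WHERE artist LIKE :q OR genres LIKE :q
           OR trackname LIKE :q OR album_name LIKE :q
        ORDER BY artist_match DESC, genres_match DESC, trackname_match DESC,
                 album_name_match DESC, popularity DESC
        LIMIT :per_page OFFSET :start
    """
    params = {"q": f"%{term}%", "per_page": per_page, "start": start}
    return [row[0] for row in conn.execute(sql, params)]


artists = search_tracks(conn, "blue")
print(artists)
```

Note that popularity only breaks ties: the album match with popularity 99 still sorts after every artist, genre and track title match.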

To optimize this query, you should avoid LIKE with a leading wildcard (it cannot use an ordinary index) and use a FULLTEXT search instead.

The optimized code will be:

SELECT *,
CASE WHEN MATCH (artist) AGAINST ('$searchquestion') THEN 1 ELSE 0 END AS artist_match,
CASE WHEN MATCH (genres) AGAINST ('$searchquestion') THEN 1 ELSE 0 END AS genres_match,
CASE WHEN MATCH (trackname) AGAINST ('$searchquestion') THEN 1 ELSE 0 END AS trackname_match,
CASE WHEN MATCH (album_name) AGAINST ('$searchquestion') THEN 1 ELSE 0 END AS album_name_match
FROM p2pm_tracks
WHERE
MATCH (artist) AGAINST ('$searchquestion') OR
MATCH (genres) AGAINST ('$searchquestion') OR
MATCH (trackname) AGAINST ('$searchquestion') OR
MATCH (album_name) AGAINST ('$searchquestion')
ORDER BY
`artist_match` DESC,
`genres_match` DESC,
`trackname_match` DESC,
`album_name_match` DESC,
`popularity` DESC
LIMIT $startingpoint, $resultsperpage

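And make sure that the table's storage engine supports FULLTEXT indexes (MyISAM, or InnoDB from MySQL 5.6 onward) and that you created a FULLTEXT index for each column you want to search. Because the query above matches each column on its own, each column needs its own FULLTEXT index; a combined index on (artist, trackname) only serves MATCH (artist, trackname).
The code for your MySQL table should look like:

CREATE TABLE p2pm_tracks (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
artist VARCHAR(255) NOT NULL,
trackname VARCHAR(255) NOT NULL,
...
...
FULLTEXT (artist),
FULLTEXT (trackname)
) ENGINE=MyISAM;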

For more info, check the following:
- http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
- http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

If you're looking for something more advanced, then look into Solr (based on Lucene), Sphinx, ElasticSearch (based on Lucene) etc.

SQL Weighted averages of multiple rows

You can multiply each value by its weight and then divide by the sum of the weights. For the weighted average by question:

select question, sum(ai.value * a.weight) / sum(a.weight)
from answer_items ai join
answers a
on ai.answer_id = a.id
group by question;

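A quick way to check the arithmetic is a small Python sketch with sqlite3. The schema is an assumption (the question column is placed on answer_items, and the sample values are invented), so adapt it to your actual tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (id INTEGER PRIMARY KEY, weight REAL);
    CREATE TABLE answer_items (answer_id INTEGER, question TEXT, value REAL);
    INSERT INTO answers VALUES (1, 1.0), (2, 3.0);
    INSERT INTO answer_items VALUES
        (1, 'q1', 10.0), (2, 'q1', 20.0),
        (1, 'q2', 4.0),  (2, 'q2', 8.0);
""")

rows = conn.execute("""
    SELECT ai.question,
           SUM(ai.value * a.weight) / SUM(a.weight) AS weighted_avg
    FROM answer_items ai
    JOIN answers a ON ai.answer_id = a.id
    GROUP BY ai.question
    ORDER BY ai.question
""").fetchall()
print(rows)
```

For q1 this is (10*1 + 20*3) / (1 + 3) = 17.5, compared with a plain average of 15.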

MySQL show ranks of multiple columns

Use something like this:

SELECT Name, Height, Weight,
FIND_IN_SET(Height, (SELECT GROUP_CONCAT(Height ORDER BY Height DESC) FROM scores)) AS Height_Rank,
FIND_IN_SET(Weight, (SELECT GROUP_CONCAT(Weight ORDER BY Weight DESC) FROM scores)) AS Weight_Rank
FROM scores
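FIND_IN_SET() returns the position of the first match in the sorted list, so tied values share a rank and the next rank is skipped. Keep in mind that GROUP_CONCAT() output is cut off at group_concat_max_len (1024 bytes by default), which can matter on larger tables. The Python sketch below is not the same query: FIND_IN_SET() is MySQL-specific, so it uses a correlated COUNT(*) in sqlite3 that should give the same ranks. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (Name TEXT, Height INTEGER, Weight INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("Ann", 180, 60), ("Bob", 175, 80), ("Cid", 175, 70), ("Dee", 170, 90)],
)

# Rank = 1 + number of rows with a strictly greater value, so ties share
# a rank and the following rank is skipped.
rows = conn.execute("""
    SELECT Name,
           1 + (SELECT COUNT(*) FROM scores t WHERE t.Height > s.Height) AS Height_Rank,
           1 + (SELECT COUNT(*) FROM scores t WHERE t.Weight > s.Weight) AS Weight_Rank
    FROM scores s
    ORDER BY Name
""").fetchall()
for row in rows:
    print(row)
```

Bob and Cid are both 175 tall, so both get Height_Rank 2 and Dee gets 4.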

MySQL where like with multiple words and order by weight

The simplest way is to add the following order by to the query:

order by (`text` LIKE '%What%') + (`text` LIKE '%year%') desc

If performance is an issue, then you should look into the MySQL full text functions. They significantly speed up many full text searches.
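For an arbitrary list of words, the ORDER BY expression can be generated. A minimal Python sketch, assuming a hypothetical posts table and using sqlite3 as a stand-in (like MySQL, it evaluates a LIKE comparison to 1 or 0):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (text TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?)",
    [("What now?",), ("Nothing relevant",), ("What year is it?",)],
)

words = ["What", "year"]

# One "(text LIKE ?)" term per word; each term evaluates to 1 or 0,
# so their sum is the number of words the row contains.
score = " + ".join("(text LIKE ?)" for _ in words)
where = " OR ".join("text LIKE ?" for _ in words)
params = [f"%{word}%" for word in words]

sql = f"SELECT text, {score} AS hits FROM posts WHERE {where} ORDER BY hits DESC"
# The placeholders appear twice (score, then WHERE), so the params do too.
rows = conn.execute(sql, params + params).fetchall()
print(rows)
```

Only the number of placeholders depends on the word list; the words themselves are bound as parameters.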


