MySQL Find Related Articles

MySQL Find Related Articles

Assuming that title is also unique

SELECT fA.ID, fA.Title
from
Articles bA,
articles_tags bAT,
articles_tags fAT,
Articles fA
where
bA.title = 'some name' AND
bA.id = bAT.Article_Id AND
bAT.Tag_ID = fAT.Tag_ID AND
fAT.Article_ID = fA.ID AND
fA.title != 'some name'
GROUP BY
fA.ID, fA.Title
HAVING
count(*) >= 3

Where to exclude the 'seed' article

Because I don't care exactly WHICH tags I match on, just THAT I match on 3 tags, I only need tag_id and avoid the join to the tags table completely. So now I join the many-to-many table to itself to find the articles which have an overlap.

The problem is that the article will match itself 100% so we need to eliminate that from the results.

You can exclude that record in 3 ways. You can filter it from the table to before joining, you can have it fall out of the join, or you can filter it when you're finished.

If you eliminate it before you begin the join, you're not gaining much of an advantage. You've got thousands or millions of articles and you're only eliminating 1. I also believe this will not be useful based on the best index for the article_tag mapping table.

If you do it as part of the join, the inequality will prevent that clause from being part of the index scan and be applied as a filter after the index scan.

Consider the index on article_tags as (Tag_ID, Article_ID). If I join the index to itself on tag_id = tag_id then I'll immediately define the slice of the index to process by walking the index to each tag_id my 'seed' article has. If I add the clause article_id != article_id, that can't use the index to define the slice to be processed. That means it will be applied as a filter. e.g. Say my first tag is "BLUE". I walk the index to get all the articles which have "BLUE". (by ID of course). Say there are 50 rows. We know that 1 is my seed article and 49 are matches. If I don't include the inequality, I include all 50 records and move on. If I do include the inequality, I then have to check each of the 50 records to see which is my seed and which isn't. The next tag is "Jupiter" and it matches 20,000 articles. Again I have to check each row in that slice of the index to exclude my seed article. After I go through this 2,5,20 times (depends on tags for that seed article), I now have a completely clean set of articles to do the COUNT(*) and HAVING on. If I don't include the inequality as part of my join but instead just filter the SEED ID out after the group by and having then I only do that filer once on a very short list.

mysql query for related articles

Try

select a.* from Article a
inner join ArticleTag at
on at.idArticle = a.idArticle
where at.idTag in (select idTag from ArticleTag where idArticle =5)

or

select a.* from Article a
inner join ArticleTag at on at.idArticle= a.idArticle
inner join ArticleTag at2 on at2.idTag = a.idTag and at2.IdArticle! = at.idArticle
where at2.idArticle = 5

MySQL query to find related posts by tag and sort by popular?

Try this:

SELECT at.article_id,count(*) AS q 
FROM article_tags at
INNER JOIN article_info ai ON at.article_id = ai.article_id
WHERE at.id_tag IN (
SELECT id_tag
FROM article_tags
WHERE article_id=41
)
AND at.article_id!=41
GROUP BY at.article_id
ORDER BY q DESC, ai.article_view DESC

How to get related articles by tag names?

SELECT a1.id, GROUP_CONCAT(DISTINCT a2.id) AS related_articles
FROM articles AS a1
JOIN article_tags AS t1 ON a1.id = t1.article_id
JOIN article_tags AS t2 ON t2.tag_name = t1.tag_name AND t2.article_id != t1.article_id
JOIN articles AS a2 ON a2.id = t2.article_id
GROUP BY a1.id

DEMO

How do I query related articles based on tags with two tables?

if I understand you correctly your query should look something like this

SELECT t2.article_id, COUNT(t2.tag_id) AS matches
FROM (SELECT article_id, tag_id
FROM article_tags
WHERE article_id = 1) t1
INNER JOIN (SELECT article_id, tag_id
FROM article_tags
WHERE article_id != 1) t2
ON t1.tag_id = t2.tag_id
GROUP BY t2.article_id
ORDER BY matches DESC
LIMIT 5;

First sub-query t1 select article_id (I think this is not necessary you can only select tag_id) and tag_id for the article is viewed...

Second sub-query t2 select article_id and tag_id for all other article.

Than we do simple INNER JOIN based on tag_id from both sub-query(this will exclude all tag_id from t2 which not match with tag_id from the first table).

After this we just group and order counted tags...

Here is SQL Fiddle to see how it's work.

GL!

Finding Related Article with the MySQL LIKE Statement

You can use the split function in PHP to break that title up into each word:

$title_words = split($title);

$rel_title = 'select * from tbl_content where status = \'t\' and (1=0';

foreach($title as $word)
{
$word = str_replace(',', '', $word);
$word = mysql_real_escape_string($word);
$rel_title .= ' or article_title like \'%' . $word . '%\'';
}

$rel_title .= ')';

echo $rel_title;

I also used str_replace to remove the comma in the question. Obviously, you can apply this to other characters if you have them.

show related articles by title and tag using php/mysql

Try it:

SELECT a1.itemid, a1.title, group_concat(DISTINCT a2.itemid, CONCAT('@',a2.title) SEPARATOR "||") as related_articles
FROM articles AS a1
JOIN tag_link AS t1 ON a1.itemid = t1.tag_itemid
JOIN
(SELECT a1.itemid as itemid, a1.title as title, t1.tag_id as tag_id
FROM articles AS a1
JOIN tag_link AS t1 ON a1.itemid = t1.tag_itemid) as `a2`
ON a2.tag_id=t1.tag_id
WHERE a2.itemid != a1.itemid
GROUP BY a1.itemid

Live DEMO

Find similar news on SQL based on similar tags

If you don't care about news that don't match at all, then just look out for matching tags.

select
n1.news_id, n1.title,
n2.news_id as compared_news_id, n2.title as compared_news_title,
count(*) as tags_in_common
from news n1
join news n2 on n2.news_id <> n1.news_id
join tags t1 on t1.news_id = n1.news_id
join tags t2 on t2.news_id = n2.news_id and t2.name = t1.name
where n1.newspaper_id = 1
group by n1.news_id, n2.news_id
order by n1.news_id, n2.news_id;

Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=6ff1db3be344c40b82f892654ca08e3a

If you don't want to restrict this to one newspaper, then remove where n1.newspaper_id = 1. In that case, if you want to avoid to have both news1/news5 and news5/news1 in your results, then change n2.news_id <> n1.news_id to n2.news_id > n1.news_id.

You can also change on n2.news_id <> n1.news_id to on n2.news_id <> n1.news_id and n2.newspaper_id <> n1.newspaper_id of course, if you don't want to compare news of the same newspaper.

How to select all articles with a tag and all article's tags with one MySQL query?

This seemed to work ok for me:

select a.*
from
(
SELECT GROUP_CONCAT(t.tag SEPARATOR ', ') AS tags,a.id,a.title,a.content
FROM articles AS a LEFT JOIN tags2all AS ta ON ta.iditem=a.id
AND ta.module='articles' JOIN tags AS t ON ta.idtag=t.id
GROUP BY a.id HAVING a.title IS NOT NULL
) a inner join tags2all t2a on t2a.iditem = a.id
inner join tags t on t.id = t2a.idtag
where t.id = 3;

Here's the DDL and data I used:

create table articles (id int unsigned not null primary key, title varchar(50), content varchar(50));

create table tags (id int unsigned not null primary key, tag varchar(50));

create table tags2all (id int unsigned not null primary key, iditem int unsigned not null, idtag int unsigned not null, module varchar(50));

insert into articles values (1,"Article1","Some content");
insert into articles values (2,"Article2","Some content");
insert into articles values (3,"Article3","Some content");

insert into tags values (1,"Tag1");
insert into tags values (2,"Tag2");
insert into tags values (3,"Tag3");
insert into tags values (4,"Tag4");
insert into tags values (5,"Tag5");
insert into tags values (6,"Tag6");
insert into tags values (7,"Tag7");

insert into tags2all (id,iditem,idtag,module) values (1,1,1,"articles");
insert into tags2all (id,iditem,idtag,module) values (2,1,2,"articles");
insert into tags2all (id,iditem,idtag,module) values (3,1,3,"articles");
insert into tags2all (id,iditem,idtag,module) values (4,2,3,"articles");
insert into tags2all (id,iditem,idtag,module) values (5,2,4,"articles");
insert into tags2all (id,iditem,idtag,module) values (6,2,5,"articles");
insert into tags2all (id,iditem,idtag,module) values (7,3,3,"articles");
insert into tags2all (id,iditem,idtag,module) values (8,3,4,"articles");
insert into tags2all (id,iditem,idtag,module) values (9,3,7,"articles");


Related Topics



Leave a reply



Submit