Recursive Categories with a Single Query

Recursive categories with a single query?

If the tree isn't too large, you can simply build the tree in PHP using some clever references.

$nodeList = array();
$tree = array();

$query = mysql_query("SELECT category_id, name, parent FROM categories ORDER BY parent");
while($row = mysql_fetch_assoc($query)){
$nodeList[$row['category_id']] = array_merge($row, array('children' => array()));
}
mysql_free_result($query);

foreach ($nodeList as $nodeId => &$node) {
if (!$node['parent'] || !array_key_exists($node['parent'], $nodeList)) {
$tree[] = &$node;
} else {
$nodeList[$node['parent']]['children'][] = &$node;
}
}
unset($node);
unset($nodeList);

This will give you the tree structure in $tree with the children in the respective children-slot.

We've done this with fairly large trees ( >> 1000 items) and it's very stable and a lot faster than doing recursive queries in MySQL.

Recursive SELECT Category breadcrumbs with LIKE In a single query

SQL can only (without being "clever" using stored procedure etc) ever return a fixed number of columns.

(FYI - the answer you link to, IMHO is being "clever". It's doing some really in-obvious things with session variables etc, which if they don't hurt performance are hurting readability - so I'm going to try and answer your question from a different angle.)

You could therefore fix (hardcode) the breadcrumb "depth", and with a fixed number of JOIN statements everything becomes very simple.

I'm assuming the breadcrumb depths are intended to be anything between 1 and infinity? i.e. another "collection" of items may be filed under a smaller depth of categories?

This being the case, your GROUP_CONCAT is probably one part of the solution, because it emulates "variable column counts" in SQL. (It returns as 1 column, but can contain a flexible number of delimited values inside.)

Your problem will be that still the nature of SQL can only join one table to another (single) table per JOIN statement. Your breadcrumb data structure is well normalised, and assumes a join per each sub-category its parent.

You could try to dynamically build the SQL - but that's probably going to burn you. You're probably left with two "obvious" options:

  1. Stored procedure. (Go beyond basic SQL.)
  2. Change your schema. (Solve the problem when the data is stored, rather than retrieved.)

Stored procedures could solve this in a number of ways - one obvious option is to iterate through building each breadcrumb procedurally, storing reach in a temporary table, then finally selecting the entire temporary table.

I'm happy to give you guidance on this but I won't do it as part of this answer (unless requested to) because I'm fairly sure the performance will be sufficiently bad that you ultimately won't want to use it.

The other "major" option then is to restructure the schema. In this scenario the level of normalisation you've achieved is making things unduly complex. It's good "academically" and it's good for disk space. But it's not lending itself to solving your problem very well!

Denormalising has one other major trade-off. You'll have more complexity when changing the data in the schema. I'd recommend starting off by writing a routine which "rebuilds" the data (if you take this approach), because otherwise things will get out of sync and you'll spend forever trying to work out what went wrong. (I speak from experience.)

For every matching record (you're comparing user input to CategoryName) you want to return and be able to group by everything which precedes it in the tree. And without doing "clever" stuff.

One (of several) denormalisation approaches would be to maintain a depth * width long list of leaves to ancestors. (Like I said, it's not storage efficient. You'll have to evaluate if that is a problem in a production scenario.) For your example data, it would look like this:

+------------+--------+
| AncestorId | LeafId |
+------------+--------+
| 20081 | 66840 |
| 20081 | 96762 |
| 13851 | 66840 |
| 13851 | 96762 |
| 100904 | 66840 |
| 100904 | 96762 |
| 66840 | 66840 |
| 96762 | 96762 |
+------------+--------+

Thus now you can do something like this:

CREATE TABLE `tree_branches` (
`AncestorId` int(11) NOT NULL,
`LeafId` int(11) NOT NULL
);

INSERT INTO `tree_branches` SET `AncestorId`=20081, `LeafId`=66840;
INSERT INTO `tree_branches` SET `AncestorId`=20081, `LeafId`=96762;
INSERT INTO `tree_branches` SET `AncestorId`=13851, `LeafId`=66840;
INSERT INTO `tree_branches` SET `AncestorId`=13851, `LeafId`=96762;
INSERT INTO `tree_branches` SET `AncestorId`=100904, `LeafId`=66840;
INSERT INTO `tree_branches` SET `AncestorId`=100904, `LeafId`=96762;
INSERT INTO `tree_branches` SET `AncestorId`=66840, `LeafId`=66840;
INSERT INTO `tree_branches` SET `AncestorId`=96762, `LeafId`=96762;

SELECT
GROUP_CONCAT(`breadCrumbCategories`.`CategoryName` SEPARATOR " > ")
FROM `ebay_categories` AS `matchedCategory`
INNER JOIN `tree_branches` AS `matchedCategoryLeaves` ON (`matchedCategoryLeaves`.`AncestorId` = `matchedCategory`.`categoryId`)
INNER JOIN `tree_branches` AS `breadCrumbs` ON (`breadCrumbs`.`LeafId` = `matchedCategoryLeaves`.`LeafId`)
INNER JOIN `ebay_categories` AS `breadCrumbCategories` ON (`breadCrumbCategories`.`CategoryId` = `breadCrumbs`.`ancestorId`)
WHERE
`matchedCategory`.`CategoryName` LIKE "Post%"
GROUP BY
`breadCrumbs`.`LeafId`
;

You should add some sort of sorting for the GROUP_BY to ensure it doesn't do something implicitly unexpected. You could (for example) maintain a level ID for that purpose.

Update:
Once you have understood what I've done above, you should test it with LIKE 'Ant%' and observe the erroneous output. Add a second GROUP BY clause and a DISTINCT to solve the problem caused by user queries matching multiple crumbs which are ancestors to the same leaf.

SELECT
DISTINCT GROUP_CONCAT(`breadCrumbCategories`.`CategoryName` SEPARATOR " > ")
FROM `ebay_categories` AS `matchedCategory`
INNER JOIN `tree_branches` AS `matchedCategoryLeaves` ON (`matchedCategoryLeaves`.`AncestorId` = `matchedCategory`.`categoryId`)
INNER JOIN `tree_branches` AS `breadCrumbs` ON (`breadCrumbs`.`LeafId` = `matchedCategoryLeaves`.`LeafId`)
INNER JOIN `ebay_categories` AS `breadCrumbCategories` ON (`breadCrumbCategories`.`CategoryId` = `breadCrumbs`.`ancestorId`)
WHERE
`matchedCategory`.`CategoryName` LIKE "An%"
GROUP BY
`breadCrumbs`.`LeafId`,
`matchedCategory`.`CategoryId`
;

How to recursively populate sub-categories?

You can give it a try:

SELECT 
t1.title AS 'Main Category',
t2.title AS 'Second Level Category',
t3.title AS 'Third Level Category'
FROM categories AS t1
INNER JOIN categories AS t2 ON t2.parent_id = t1.ID
INNER JOIN categories AS t3 ON t3.parent_id = t2.ID;

You will get output structure like below:

Main Category      Second Level Category        Third Level Category
A a aa
B b bb
C c cc
C c ccc

Reference

SQL - Showing all parents from recursive relationship

Here's one option using multiple outer joins:

select c.*, 
case when c2.id is not null then c2.name end parent_name,
case when c3.id is not null then c3.name end parent_name_2
from category c
left join category c2 on c.parent_id = c2.id
left join category c3 on c2.parent_id = c3.id
  • SQL Fiddle Demo

recursive function to get all the child categories

I had a hard time trying to figure out your function. I think this will do what you want. It gets all the children of a category with ID $id, and also their children (thus getting the whole sub category, sub sub category effect that you wanted).

function categoryChild($id) {
$s = "SELECT ID FROM PLD_CATEGORY WHERE PARENT_ID = $id";
$r = mysql_query($s);

$children = array();

if(mysql_num_rows($r) > 0) {
# It has children, let's get them.
while($row = mysql_fetch_array($r)) {
# Add the child to the list of children, and get its subchildren
$children[$row['ID']] = categoryChild($row['ID']);
}
}

return $children;
}

This function will return:

$var = array(
'categoryChild ID' => array(
'subcategoryChild ID' => array(
'subcategoryChild child 1' => array(),
'subcategoryChild child 2' => array()
)
),
'anotherCategoryChild ID' => array() # This child has no children of its own
);

It basically returns an array with the ID of the child and an array containing the IDs of its children. I hope this is of any help.



Related Topics



Leave a reply



Submit