Hierarchical data in Linq - options and performance
I would set up a view and an associated table-based function based on the CTE. My reasoning for this is that, while you could implement the logic on the application side, this would involve sending the intermediate data over the wire for computation in the application. Using the DBML designer, the view translates into a Table entity. You can then associate the function with the Table entity and invoke the method created on the DataContext to derive objects of the type defined by the view. Using the table-based function allows the query engine to take your parameters into account while constructing the result set rather than applying a condition on the result set defined by the view after the fact.
CREATE TABLE [dbo].[hierarchical_table](
[id] [int] IDENTITY(1,1) NOT NULL,
[parent_id] [int] NULL,
[data] [varchar](255) NOT NULL,
CONSTRAINT [PK_hierarchical_table] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE VIEW [dbo].[vw_recursive_view]
AS
WITH hierarchy_cte(id, parent_id, data, lvl) AS
(SELECT id, parent_id, data, 0 AS lvl
FROM dbo.hierarchical_table
WHERE (parent_id IS NULL)
UNION ALL
SELECT t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
FROM dbo.hierarchical_table AS t1 INNER JOIN
hierarchy_cte AS h ON t1.parent_id = h.id)
SELECT id, parent_id, data, lvl
FROM hierarchy_cte AS result
CREATE FUNCTION [dbo].[fn_tree_for_parent]
(
@parent int
)
RETURNS
@result TABLE
(
id int not null,
parent_id int,
data varchar(255) not null,
lvl int not null
)
AS
BEGIN
WITH hierarchy_cte(id, parent_id, data, lvl) AS
(SELECT id, parent_id, data, 0 AS lvl
FROM dbo.hierarchical_table
WHERE (id = @parent OR (parent_id IS NULL AND @parent IS NULL))
UNION ALL
SELECT t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
FROM dbo.hierarchical_table AS t1 INNER JOIN
hierarchy_cte AS h ON t1.parent_id = h.id)
INSERT INTO @result
SELECT id, parent_id, data, lvl
FROM hierarchy_cte AS result
RETURN
END
ALTER TABLE [dbo].[hierarchical_table] WITH CHECK ADD CONSTRAINT [FK_hierarchical_table_hierarchical_table] FOREIGN KEY([parent_id])
REFERENCES [dbo].[hierarchical_table] ([id])
ALTER TABLE [dbo].[hierarchical_table] CHECK CONSTRAINT [FK_hierarchical_table_hierarchical_table]
To use it you would do something like -- assuming some reasonable naming scheme:
using (DataContext dc = new HierarchicalDataContext())
{
HierarchicalTableEntity h = (from e in dc.HierarchicalTableEntities
select e).First();
var query = dc.FnTreeForParent( h.ID );
foreach (HierarchicalTableViewEntity entity in query) {
...process the tree node...
}
}
Recursive/Hierarchical Query in LINQ
Try this:
var geo_hierarchy = new []
{
new { cat_id = "Root", geo_id = 1, parent_id = (int?)null, geo_name = "Pakistan", },
new { cat_id = "Province", geo_id = 2, parent_id = (int?)1, geo_name = "Punjab", },
new { cat_id = "District", geo_id = 3, parent_id = (int?)2, geo_name = "Attock", },
new { cat_id = "City", geo_id = 4, parent_id = (int?)3, geo_name = "Attock", },
new { cat_id = "City", geo_id = 5, parent_id = (int?)3, geo_name = "Fateh Jang", },
new { cat_id = "City", geo_id = 6, parent_id = (int?)3, geo_name = "Hasan Abdal", },
};
var map = geo_hierarchy.ToDictionary(x => x.geo_id);
Func<int, string, string> up = null;
up = (geo_id, cat_id) =>
{
var record = map[geo_id];
return
record.cat_id == cat_id
? record.geo_name
: (record.parent_id.HasValue
? up(record.parent_id.Value, cat_id)
: null);
};
var result = up(5, "Province"); // -> "Punjab"
How to retrieve hierarchical data using Linq-to-entities?
I'm afraid you'll have to recurse over the LINQ results in your code. Depending on table size and structure you may want to pull down the whole table into memory (as it seems you are doing) and do the hierarchy walk there. If you have a very large table, you may want to make repeated trips to the DB.
For more information read this great article: Storing Hierarchical Data in a Database
Your code is already doing the flattening in memory. I would suggest using Skip(pSize * pIndex).Take(pSize)
on your list2
object (which contains the flattened) data. Be aware that this may lead to page breaks deep in the hierarchy.
Multi-level hierarchical data sorting C#
So, given:
public class Document
{
public int id;
public int? ParentId;
public string name;
}
And:
var documents = new[]
{
new Document() { id = 1, ParentId = null, name = "fruits" },
new Document() { id = 2, ParentId = null, name = "vegetables" },
new Document() { id = 3, ParentId = 2, name = "tomatoes" },
new Document() { id = 4, ParentId = 1, name = "apples" },
new Document() { id = 5, ParentId = 4, name = "golden apples" },
new Document() { id = 6, ParentId = 4, name = "red apples" },
};
Then this works:
var lookup = documents.ToLookup(x => x.ParentId);
Func<int?, IEnumerable<Document>> heirarchySort = null;
heirarchySort = pid =>
lookup[pid].SelectMany(x => new[] { x }.Concat(heirarchySort(x.id)));
IEnumerable<Document> sorted = heirarchySort(null);
It gives you:
Local functions now make this a little nicer on the eye.
IEnumerable<Document> heirarchySort(int? pid) =>
lookup[pid].SelectMany(x => new[] { x }.Concat(heirarchySort(x.id)));
How do I find the root of a hierarchy using LINQ and EF?
Is this the kind of thing you're looking for
CREATE TABLE [dbo].[hierarchical_table](
[id] INT,
[parent_id] INT,
[data] VARCHAR(255)
)
GO
INSERT [dbo].[hierarchical_table]
SELECT 1,NULL,'/1' UNION ALL
SELECT 2,1,'/1/2' UNION ALL
SELECT 3,2,'/1/2/3' UNION ALL
SELECT 4,2,'/1/2/4' UNION ALL
SELECT 5,NULL, '/5' UNION ALL
SELECT 6,5,'/5/6'
GO
CREATE FUNCTION [dbo].[fn_root_for_node]
(
@id int
)
RETURNS TABLE AS
RETURN (
WITH [hierarchy_cte](id, parent_id, data, [Level]) AS
(
SELECT
id,
parent_id,
data,
0 AS [Level]
FROM dbo.hierarchical_table
WHERE id = @id
UNION ALL
SELECT t1.id, t1.parent_id, t1.data, h.[Level] + 1 AS [Level]
FROM dbo.hierarchical_table AS t1
INNER JOIN hierarchy_cte AS h
ON t1.id = h.parent_id
)
SELECT TOP 1
id, parent_id, data, [Level]
FROM [hierarchy_cte]
ORDER BY [Level] DESC
)
GO
SELECT * FROM [dbo].[fn_root_for_node] (6)
GO
Rebuild Parent- Child- GrandChild hierarchy with LINQ performance
Why is the lambda so much slower?
projectSections.Select(section => new SectionViewModel(project, section))
from section in projectSections
select new SectionViewModel(
project,
section,
bidItemCollection.Where(...).ToList()
These two call the different constructor, hence the difference in the execution time.
As long as the logic and method is the same, both way of writing it should give the same result and at the same time. Because, the compiler generate the same IL.
How can I optimize it?
Since I cannot do the benchmark on your machine, I'll just use general assumption.
- Usually performing the pull all the required data is better than pulling it multiple times.
So I would avoid calling toSectionViewModel.ctor(Project,Section)
andBidItemViewModel.ctor(Project,BidItem)
as they will perform more queries in the database.
With that said, I would write my lambda as following : //actually this is just your 3rd piece of code cleaned up
sectionViewModels = new List<SectionViewModel>(
projectSections.Select(
s => new SectionViewModel(project, s, bidItemCollection.Where(b => b.SectionId == s.SectionId).Select(
b => new BidItemViewModel(project, b, subItemCollection.Where(si => si.BidItemId == b.BidItemId))))));
Also, for prettiness, I've changed the following contructors to avoid having ToList
in middle of Lambda :
public class SectionViewModel
{
private readonly Section section;
private readonly List<BidItemViewModel> bidItems;
public SectionViewModel(Project project, Section section, IEnumerable<BidItemViewModel> bidItemsForSection)
{
this.section = section;
this.bidItems = bidItemsForSection.ToList();
}
}
public class BidItemViewModel
{
private BidItem bidItem;
private List<SubItem> subItems;
public BidItemViewModel(Project project, BidItem bidItem, IEnumerable<SubItem> subItems)
{
this.bidItem = bidItem;
this.subItems = subItems.ToList();
}
}
Sort hierarchy with path and depth fields using Linq
Ok, first you should really add the following constraint
interface ITree<T>
where T : class, ITree<T>
{
// ...
}
so we can safely navigate the hierarchy using Parent
and Children
properties w/o casting.
Second, instead of reinventing the wheel, I'll reuse the general tree in order traversal helper method from my answer to How to flatten tree via LINQ? (and couple others):
public static partial class TreeUtils
{
public static IEnumerable<T> Expand<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
{
var stack = new Stack<IEnumerator<T>>();
var e = source.GetEnumerator();
try
{
while (true)
{
while (e.MoveNext())
{
var item = e.Current;
yield return item;
var elements = elementSelector(item);
if (elements == null) continue;
stack.Push(e);
e = elements.GetEnumerator();
}
if (stack.Count == 0) break;
e.Dispose();
e = stack.Pop();
}
}
finally
{
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}
}
With that helper in the pocket, the method in question is simple as that:
partial class TreeUtils
{
public static IEnumerable<T> Ordered<T>(this IEnumerable<T> source, Func<IEnumerable<T>, IEnumerable<T>> order = null)
where T : class, ITree<T>
{
if (order == null) order = items => items.OrderBy(item => item.Name);
return order(source.Where(item => item.Parent == null))
.Expand(item => item.Children != null && item.Children.Any() ? order(item.Children) : null);
}
}
Sample usage:
List<TreeItem> flatList = ...;
var orderedList = flatList.Ordered().ToList();
UPDATE: Here is the same by using only Path
and Id
properties:
public static partial class TreeUtils
{
public static IEnumerable<T> Ordered<T>(this IEnumerable<T> source, Func<IEnumerable<T>, IEnumerable<T>> order = null)
where T : class, ITree<T>
{
if (order == null) order = items => items != null && items.Any() ? items.OrderBy(item => item.Name) : null;
var chldrenByParentId = source
.Select(item => new { item, path = item.Path.Split('.') })
.ToLookup(e => e.path.Length >= 2 ? int.Parse(e.path[e.path.Length - 2]) : (int?)null, e => e.item);
return (order(chldrenByParentId[null]) ?? Enumerable.Empty<T>())
.Expand(item => order(chldrenByParentId[item.Id]));
}
}
Related Topics
Why (And How) to Split Column Using Master..Spt_Values
Oracle Joins - Comparison Between Conventional Syntax VS Ansi Syntax
How to Set Auto Increment Primary Key in Postgresql
SQL Multiple Columns in In Clause
Insert Command :: Error: Column "Value" Does Not Exist
What's the Proper Index for Querying Structures in Arrays in Postgres JSONb
Get the Default Values of Table Columns in Postgres
How to Find Column Names for All Tables in All Databases in SQL Server
Sql:In Clause in Stored Procedure:How to Pass Values
Not Null Constraint Over a Set of Columns
Preventing Adjacent/Overlapping Entries with Exclude in Postgresql
Postgresql Sequence Based on Another Column
List Columns with Indexes in Postgresql
Why Are Relational Set-Based Queries Better Than Cursors
Reference Alias in Where Clause
Varbinary to String on SQL Server