Self-Referencing Many-To-Many Recursive Relationship Code First Entity Framework

By convention, Code First treats uni-directional associations as one-to-many. You therefore need to use the fluent API to tell Code First that you want a many-to-many self-referencing association:

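protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    // WithMany() without an argument: Member has no inverse navigation property.
    modelBuilder.Entity<Member>()
        .HasMany(m => m.Friends)
        .WithMany()
        .Map(m =>
        {
            m.MapLeftKey("MemberId");    // column referencing the owning Member
            m.MapRightKey("FriendId");   // column referencing the friend
            m.ToTable("MembersFriends"); // name of the join table
        });
}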
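
For context, the mapping above assumes an entity shaped roughly like the one below. This is only a sketch: the question's actual Member class is not shown, so the Id and Name properties are assumptions, and only the Friends collection is implied by the fluent configuration.

```csharp
using System.Collections.Generic;

// Hypothetical entity; only Friends is implied by the mapping, the rest is assumed.
public class Member
{
    public int Id { get; set; }       // assumed key property
    public string Name { get; set; }  // assumed, for illustration only

    // Uni-directional self-referencing collection. It is virtual so that
    // EF can lazy-load it, and initialized so that a new Member never exposes null.
    public virtual ICollection<Member> Friends { get; set; } = new HashSet<Member>();
}
```

Because there is no inverse navigation property, the association is directional: adding bob to alice.Friends does not make alice appear in bob.Friends. If friendship should be mutual, the second direction has to be added explicitly.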

One to many recursive relationship with Code First

Cause of the exception

The exception is caused by a Select or SelectMany on a null collection, in your case the result of

b => b.Children

For each branch in the hierarchy, the Children collection is accessed when execution reaches the part

selector(node)

The selector is the lambda expression b => b.Children, which is equivalent to the method

IEnumerable<Branch> anonymousMethod(Branch b)
{
    return b.Children;
}

So what actually happens is b.Children.SelectMany(...), that is, null.SelectMany(...), which raises the exception you see.

Preventing it

But why are these Children collections null?

This is because lazy loading does not happen. To enable lazy loading the collection must be virtual:

public virtual ICollection<Branch> Children { get; set; }

When EF fetches a Branch object from the database it creates a proxy object, an object derived from Branch, that overrides virtual properties with code that is capable of lazy loading. Now when b.Children is accessed, EF will execute a query that populates the collection. If there are no children, the collection will be empty, not null.
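
As a sketch of what such an entity could look like: the question's Branch class is not shown here, so every member except Children is an assumption. Initializing the collection in the constructor is an extra guard that is not part of the answer above; it keeps Children non-null for objects you create yourself with new.

```csharp
using System.Collections.Generic;

// Hypothetical Branch entity; only the Children property comes from the text above.
public class Branch
{
    public Branch()
    {
        // Guard for entities created in code: Children is never null.
        Children = new HashSet<Branch>();
    }

    public int Id { get; set; }          // assumed key
    public string Name { get; set; }     // assumed
    public int? ParentId { get; set; }   // assumed foreign key, null for the root

    // virtual so that an EF proxy can override these and lazy-load them.
    public virtual Branch Parent { get; set; }
    public virtual ICollection<Branch> Children { get; set; }
}
```

One caveat with this guard: if lazy loading is disabled, an empty collection can also mean "not loaded yet", so it no longer signals a missing load the way null does.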

Flattening explained

So what happens in the Flatten method is that first the children of the branch are fetched (selector(node)), and then the Flatten method is called again on each of these children (SelectMany), this time as an ordinary static method call, Flatten(x, selector), rather than with extension-method syntax.

In the Flatten method each node is added to the collection of its children (.Concat(new[] { node })), so in the end all nodes in the hierarchy are returned (because Flatten returns the node that enters it).

Some remarks

I would prefer to have the parent node at the top of the collection, so I would change the Flatten method to

public static IEnumerable<T> Flatten<T>(this T node, Func<T,IEnumerable<T>> selector)
{
    return new[] { node }
        .Concat(selector(node).SelectMany(x => Flatten(x, selector)));
}
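
To see the resulting order, here is a small self-contained sketch that runs this parent-first Flatten over an in-memory tree. The simplified Branch class (a plain List instead of an EF navigation property) and the FlattenDemo helper are stand-ins invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified in-memory stand-in for the EF entity (assumption: no database involved).
public class Branch
{
    public string Name { get; set; }
    public List<Branch> Children { get; set; } = new List<Branch>();
}

public static class TreeExtensions
{
    // Parent-first variant from the answer: the node itself, then all descendants.
    public static IEnumerable<T> Flatten<T>(this T node, Func<T, IEnumerable<T>> selector)
    {
        return new[] { node }
            .Concat(selector(node).SelectMany(x => Flatten(x, selector)));
    }
}

public static class FlattenDemo
{
    // Builds root -> (a -> a1, b) and returns the names in flattened order.
    public static string FlattenedNames()
    {
        var root = new Branch
        {
            Name = "root",
            Children =
            {
                new Branch { Name = "a", Children = { new Branch { Name = "a1" } } },
                new Branch { Name = "b" }
            }
        };

        return string.Join(", ", root.Flatten(b => b.Children).Select(b => b.Name));
    }
}
```

FlattenDemo.FlattenedNames() returns "root, a, a1, b": each parent comes before its own subtree, and siblings keep their original order.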

Fetching a hierarchy by lazy loading is quite inefficient. In fact, LINQ is not the most suitable tool for querying hierarchies. Doing this efficiently would require a view in the database that uses a CTE (common table expression). But that's a different story...

Self-referencing many-to-many relationship EF code first

You must declare a foreign key in Person. Breeze requires the FK to correctly resolve associations.

Edit:

I just realized you are asking about a many-to-many relationship. (yeah, I should have read the post title...)
Breeze does not support many-to-many associations.
However, you could use two one-to-many relationships that together work as a many-to-many (i.e. many-to-one-to-many). In this case, you will need to define the linking table/entity and the foreign key as mentioned earlier (see http://www.breezejs.com/documentation/navigation-properties).

Self Referencing Many-to-Many relations

It's not possible to hold all the relations in a single collection. You need two: one with the relations in which the ticket is TicketFrom, and a second with the relations in which the ticket is TicketTo.

Something like this:

Model:

public class Ticket
{ 
    public int Id { get; set; }
    public string Title { get; set; }

    public virtual ICollection<Relation> RelatedTo { get; set; }
    public virtual ICollection<Relation> RelatedFrom { get; set; }
}

public class Relation
{
    public int FromId { get; set; }
    public int ToId { get; set; }

    public virtual Ticket TicketFrom { get; set; }
    public virtual Ticket TicketTo { get; set; }
}

Configuration:

modelBuilder.Entity<Relation>()
    .HasKey(e => new { e.FromId, e.ToId });

modelBuilder.Entity<Relation>()
    .HasOne(e => e.TicketFrom)
    .WithMany(e => e.RelatedTo)
    .HasForeignKey(e => e.FromId);

modelBuilder.Entity<Relation>()
    .HasOne(e => e.TicketTo)
    .WithMany(e => e.RelatedFrom)
    .HasForeignKey(e => e.ToId);

Note that a solution using Parent is not equivalent, because it would create a one-to-many association, whereas, if I understand correctly, you are looking for many-to-many.
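
Since the relations are split over two collections, getting "all tickets related to this one" means combining both sides. The following is a minimal in-memory sketch of that idea. The AllRelated and RelateTo helpers are hypothetical names, not part of the model above, and the collections are initialized here only so the example runs without a database.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Ticket
{
    public int Id { get; set; }
    public string Title { get; set; }

    // Relations in which this ticket is TicketFrom / TicketTo respectively.
    public virtual ICollection<Relation> RelatedTo { get; set; } = new List<Relation>();
    public virtual ICollection<Relation> RelatedFrom { get; set; } = new List<Relation>();
}

public class Relation
{
    public int FromId { get; set; }
    public int ToId { get; set; }

    public virtual Ticket TicketFrom { get; set; }
    public virtual Ticket TicketTo { get; set; }
}

public static class TicketExtensions
{
    // Hypothetical helper: wires up both ends by hand, as the example has no context to do it.
    public static Relation RelateTo(this Ticket source, Ticket target)
    {
        var relation = new Relation
        {
            FromId = source.Id, ToId = target.Id,
            TicketFrom = source, TicketTo = target
        };
        source.RelatedTo.Add(relation);
        target.RelatedFrom.Add(relation);
        return relation;
    }

    // Hypothetical helper: tickets on the other end of either kind of relation.
    public static IEnumerable<Ticket> AllRelated(this Ticket ticket)
    {
        return ticket.RelatedTo.Select(r => r.TicketTo)
            .Concat(ticket.RelatedFrom.Select(r => r.TicketFrom))
            .Distinct();
    }
}
```

With tickets a, b and c, after a.RelateTo(b) and c.RelateTo(a), a.AllRelated() yields b and c, while b.AllRelated() yields only a.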

EF Code First improving performance for self referencing, one to many relationships

CTE Approach

There are two ways to increase the speed of queries against tree data types. The first (and likely easiest) is to use a stored procedure and the execute-SQL functionality of EF to load the tree. The sproc's execution plan is cached, so repeated executions run faster. My recommendation for the query in the sproc would be a recursive CTE.

http://msdn.microsoft.com/en-us/library/ms186243(v=sql.105).aspx

WITH <CTEName> AS
(
     SELECT
         <Root Query>
     FROM <TABLE>

     UNION ALL

     SELECT
         <Child Query>
     FROM <TABLE>
     INNER JOIN <CTEName>
         ON <CTEJoinCondition>
     WHERE
          <TERMINATION CONDITION>
)
SELECT * FROM <CTEName>

Edit

Execute your sproc or CTE inline with:

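DbContext ctx = new SampleContext();
// Deferred query: nothing is sent to the database until the result is enumerated (here by ToList).
var rows = ctx.Database.SqlQuery<YourEntityType>(@"SQL OR SPROC COMMAND HERE", new[] { "Param1", "Param2", "Etc" }).ToList();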
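
Putting the CTE template and the SqlQuery call together, a more concrete sketch could look like this. It assumes EF6 on SQL Server and a hypothetical Branches table with Id, ParentId and Name columns. BranchRow, TreeQueries and LoadSubtree are names made up for this example.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.SqlClient;
using System.Linq;

// Hypothetical result type; property names must match the selected column names.
public class BranchRow
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string Name { get; set; }
}

public static class TreeQueries
{
    // Assumed schema: Branches(Id, ParentId, Name), ParentId is NULL for roots.
    private const string SubtreeSql = @"
WITH Tree AS
(
    SELECT Id, ParentId, Name
    FROM Branches
    WHERE Id = @rootId

    UNION ALL

    SELECT b.Id, b.ParentId, b.Name
    FROM Branches b
    INNER JOIN Tree t ON b.ParentId = t.Id
)
SELECT Id, ParentId, Name FROM Tree";

    // Loads the root and all of its descendants in one round trip.
    public static List<BranchRow> LoadSubtree(DbContext ctx, int rootId)
    {
        return ctx.Database
            .SqlQuery<BranchRow>(SubtreeSql, new SqlParameter("@rootId", rootId))
            .ToList(); // enumerating the result executes the query
    }
}
```

Here the recursion stops on its own once the join finds no further children, so no extra termination condition is needed. The result is a flat list, and the hierarchy can be rebuilt in memory from ParentId if required.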

Flatten Your Tree Structure

The second approach is to build a flat representation of your tree. You can flatten a tree into a flat structure for quick querying and then use a linkage between the flat structure and the actual tree node to cut out the self referencing entity. You can build the flat structure using the above recursive CTE query.

This is just one approach but there are many papers on the subject:

http://www.governor.co.uk/news-plus-views/2010/5/17/depth-first-tree-flattening-with-the-yield-keyword-in-c-sharp/

EDIT: Adding additional clarification
Just a note: the recursive CTE caches the symbols for the query before iterating over the structure. This is the fastest and simplest way to write a query to solve your problem. However, this HAS to be a SQL query. You can execute the SQL directly or you can execute a sproc. Sprocs cache the execution plan after being run, so they perform better than ad hoc queries that have to build an execution plan before running. Which one you use is entirely up to you.

The issue with a flat representation of your tree is that you have to routinely rebuild or constantly maintain the flat structure. Which flattening algorithm you should use depends on your query path, but the end result remains the same. The flat structure is the only way to "accomplish" what you want to do inside EF without having to cheat and execute raw SQL through the DbConnection.
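
One possible shape for such a flat structure is a set of ancestor/descendant rows (often called a closure table). The answer does not prescribe this particular layout, so treat the sketch below as one option among several. Node, AncestorLink and TreeFlattener are hypothetical names, and the code works on an in-memory tree rather than on EF entities.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory tree node.
public class Node
{
    public int Id { get; set; }
    public List<Node> Children { get; set; } = new List<Node>();
}

// One row per (ancestor, descendant) pair; Depth 0 links a node to itself.
public class AncestorLink
{
    public int AncestorId { get; set; }
    public int DescendantId { get; set; }
    public int Depth { get; set; }
}

public static class TreeFlattener
{
    public static List<AncestorLink> BuildLinks(Node root)
    {
        var links = new List<AncestorLink>();
        Visit(root, new List<Node>(), links);
        return links;
    }

    // Depth-first walk; 'ancestors' holds the path from the root to the current node's parent.
    private static void Visit(Node node, List<Node> ancestors, List<AncestorLink> links)
    {
        links.Add(new AncestorLink { AncestorId = node.Id, DescendantId = node.Id, Depth = 0 });

        for (int i = 0; i < ancestors.Count; i++)
        {
            links.Add(new AncestorLink
            {
                AncestorId = ancestors[i].Id,
                DescendantId = node.Id,
                Depth = ancestors.Count - i // the direct parent has depth 1
            });
        }

        ancestors.Add(node);
        foreach (var child in node.Children)
        {
            Visit(child, ancestors, links);
        }
        ancestors.RemoveAt(ancestors.Count - 1);
    }
}
```

With rows like these stored in their own table, "all descendants of X" becomes a plain filter on AncestorId with no recursion. The cost is the one described above: the rows have to be rebuilt or kept in sync whenever the tree changes.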
