Why No Many-To-Many Relationships

Why no many-to-many relationships?

Think about a simple relationship like the one between Authors and Books. An author can write many books. A book could have many authors. Now, without a bridge table to resolve the many-to-many relationship, what would the alternative be? You'd have to add multiple Author_ID columns to the Books table, one for each author. But how many do you add? 2? 3? 10? However many you choose, you'll probably end up with a lot of sparse rows where many of the Author_ID values are NULL and there's a good chance that you'll run across a case where you need "just one more." So then you're either constantly modifying the schema to try to accommodate or you're imposing some artificial restriction ("no book can have more than 3 authors") to force things to fit.

What is the problem with many-to-many relationships?

Probably most (all??) RDMS implement a M:N relationship by creating a table containing two columns with the FKs.
So there is no advantage to explicitely model the bridge table.

But in most realistic cases you want to store additional information (besides the fact of its existence) about the relationship instance, e.g. timestamp and user from the creation. That means that you need to model the bridge table anyway.

Is a many to many relationship bad? What about this example?

The relationships can be diagrammed as follows:

VIDEO >-- TITLE --< TITLE_ACTOR >-- ACTOR

Where ">--" is a many-to-one relationship and "--<" is one-to-many.

You're right that a query that joins VIDEO to TITLE_ACTOR, even indirectly via TITLE, is going to match N rows from VIDEO to M rows from TITLE_ACTOR, and the result set will have N*M rows for a given TITLE. That's a Cartesian product between VIDEO and TITLE_ACTOR, if there are no direct join restrictions between those two tables.


Re your comments:

The diagram notation I show can illustrate the difference between a logical many-to-many relationship, and the physical implementation of that relationship, using an intersection table containing two many-to-one relationships. For example, the logical relationship we're trying to represent is the following:

       TITLE >---< ACTOR

You might draw your logical data model this way while you're designing the relationships.

But SQL does not support a way to store a many-to-many relationship. To store it physically in the database, we must use an intersection table:

TITLE --< TITLE_ACTOR >-- ACTOR

For any logical many-to-many relationship, your physical model adds an intersection table and reverses the direction of the arrows. This physical model achieves the same relationship as the logical many-to-many relationship.

Does this mean that on a diagram, that I shouldn't show a direct relationship from video to title_actor?

Yes, I would consider there to be the following relationships:

  • VIDEO references TITLE (M:1)
  • TITLE_ACTOR references TITLE (M:1)
  • TITLE_ACTOR references ACTOR (M:1)

But there is no direct relationship from VIDEO to TITLE_ACTOR. Only an indirect relationship via TITLE.

The references correspond to foreign key constraints. For example, VIDEO contains a foreign key referencing TITLE, but VIDEO does not contain (and has no need for) a foreign key referencing TITLE_ACTOR.

FWIW, the diagram above is my attempt to make a simple ASCII format of an Entity-Relationship Model.

Difference between one-to-many and many-to-one relationship

Yes, it a vice versa. It depends on which side of the relationship the entity is present on.

For example, if one department can employ for several employees then, department to employee is a one to many relationship (1 department employs many employees), while employee to department relationship is many to one (many employees work in one department).

More info on the relationship types:

Database Relationships - IBM DB2 documentation

Many to Many Relationships not saving

It is expected that nothing happens.

What you want to change or add is a relationship between the entities Service and ServiceItem. But you cannot manipulate relationships by setting the state of an entity to Modified. This only updates scalar and complex properties but no navigation properties (= relationships). (For example setting the state of a Service entity to Modified will mark Service.Title and Service.Description, etc. as modified and ensure that those properties are saved to the database. But it doesn't care about the content of Service.ServiceItems.)

The only exception where you can change a relationship by setting the state to Modified are Foreign Key Associations. These are associations that have foreign key properties exposed in your model entity and they can only occur for one-to-many or one-to-one associations. Many-to-many relationships are always Independent Associations which means they can never have a foreign key property in an entity. (Because the FKs are in the join table, but the join table is not an entity and "hidden" from your model classes.)

There is a way to directly manipulate relationships for a many-to-many association but it requires to go down to the ObjectContext and its RelationshipManager which is - in my opinion - pretty advanced and tricky.

The usual and straight-forward way to add and remove relationship entries to/from a many-to-many association is by just adding items to and removing items from the collections while the entities are attached to the context. EF's change tracking mechanism will recognize the changes you have done and generate the appropriate INSERT, UPDATE and DELETE statements when you call SaveChanges.

The exact procedure depends on if you also want to save Service and/or ServiceItem as new entities or if you only want to add relationships between existing entities. Here are a few examples:

  • service should be INSERTed, all serviceItems should be INSERTed and the relationships between the entities should be INSERTed into the join table as well:

    using (var context = new MyContext())
    {
    var service = new Service();
    var serviceItem1 = new ServiceItem();
    var serviceItem2 = new ServiceItem();
    service.ServiceItems.Add(serviceItem1);
    service.ServiceItems.Add(serviceItem2);

    context.Services.Add(service);

    context.SaveChanges();
    }

    Adding the "root" service of the object graph is enough because EF will recognize that all other entities in the graph are not attached to the context and assume that they have to be INSERTed into the database.

  • service already exists and should NOT be INSERTed, all serviceItems should be INSERTed and the relationships between the entities should be INSERTed into the join table as well:

    using (var context = new MyContext())
    {
    var service = new Service { ServiceID = 15 };
    context.Services.Attach(service);

    var serviceItem1 = new ServiceItem();
    var serviceItem2 = new ServiceItem();
    service.ServiceItems.Add(serviceItem1);
    service.ServiceItems.Add(serviceItem2);

    context.SaveChanges();
    }

    EF recognizes here (when SaveChanges is called) that service is attached but the other entities are not. No INSERT for service happens but the serviceItem1/2 will be INSERTed together with the relationship entries.

  • service already exists and should NOT be INSERTed, all serviceItems already exist and should NOT be INSERTed, but the relationships between the entities should be INSERTed into the join table:

    using (var context = new MyContext())
    {
    var service = new Service { ServiceID = 15 };
    context.Services.Attach(service);

    var serviceItem1 = new ServiceItem { ServiceItemID = 23 };
    context.ServiceItems.Attach(serviceItem1);

    var serviceItem2 = new ServiceItem { ServiceItemID = 37 };
    context.ServiceItems.Attach(serviceItem2);

    service.ServiceItems.Add(serviceItem1);
    service.ServiceItems.Add(serviceItem2);

    context.SaveChanges();
    }
  • For completeness: How to remove relationships between existing entities?

    using (var context = new MyContext())
    {
    var service = context.Services
    .Include(s => s.ServiceItems) // load the existing Items
    .Single(s => s.ServiceID == 15);

    var serviceItem1 = service.ServiceItems
    .Single(s => s.ServiceItemID == 23); // query in memory, no DB query
    var serviceItem2 = service.ServiceItems
    .Single(s => s.ServiceItemID == 37); // query in memory, no DB query

    service.ServiceItems.Remove(serviceItem1);
    service.ServiceItems.Remove(serviceItem2);

    context.SaveChanges();
    }

    The two relationship rows in the join table that link service 15 with serviceItem 23 and 37 will be deleted.

Alternativly instead of calling Attach you can load the existing entities from the database. It will work as well:

var service = context.Services.Single(s => s.ServiceID == 15);

And the same for existing ServiceItems.

Many to many relationship?

Standard practice would be this, as an example...

You have a "tblEngineer" table...

 tblEngineer
-----------
(PK) EngineerId
EngineerName

And a "tblTicket" table...

 tblTicket
---------
(PK) TicketId
TicketDetails

You now add a link table called "tblEngineerTickets" (or similar) which references the Ids of both the Engineer and their tickets...

 tblEngineerTickets
------------------
(PK) EngineerTicketId
(FK) EngineerId
(FK) TicketId

So that way, you keep all the Ticket Details and the Engineer details separately, and link them using ONLY the Ids... the link table would look something like this...

  EngineerId | TicketId
------------+----------
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3

This way, you can have multiple engineers assigned to one ticket, and/or multiple tickets assigned to an engineer.

This is best practice and it gives you the most opportunity for expansion. If you were to just add fields to your existing Engineer tables saying "Ticket1", "Ticket2", "Ticket3" etc... you would be effectively be placing a limit on the code, and potentially you'd have to keep going in to the code to add columns.

Many to many relationship with itself

What you have done is reasonable and straightforward.

PS There are further candidate keys (UNIQUE NOT NULL column sets not containing smaller ones) than the PRIMARY KEYs that you have indicated (via #):

Users (username) -- likely -- though not on Stack Overflow
User_friends (user_id, friend_id)
Friend_request (user_id, friend_Id)

PPS Re title "Many to many relationship with itself": I guess you mean that in User_friends and in Friend_request entity class Users has a "many to many relationship with itself". Tables represent (application) relationships. Values in rows are thus related, and entities they designate are thus related. There are the FOREIGN KEYs from the user_id and friend_id columns to Users id. These are often called "relationships" although they are facts/constraints. (If you queried for the user names of a user and requested or accepted friend then you would JOIN Users with itself.)

How to avoid many to many relationships

Actually, if you follow the entity-relationship model strictly as originally described by Peter Chen, you would have 5 tables - 3 entity relations and 2 relationship relations.

Many-to-one-to-many ER

Many-to-one-to-many tables

There is a transitive many-to-many relationship between Milkshake and Coronary Care via the two one-to-many relationships. In this case, the Person table need not be involved - you could join Owns to Attends on the Person key they have in common:

SELECT milkshake_id, coronary_care_id
FROM Owns
INNER JOIN Attends ON Owns.person_id = Attends.person_id

Now, if you denormalize tables with the same determinant, you get the 3 tables you had in mind in your question:

Denormalized tables

In this case, the relationship between Milkshake and Person is recorded in the Milkshake table, and the relationship between Person and CoronaryCare is recorded in the CoronaryCare table. (Remember that foreign key constraints are not relationships.)

You can still get the many-to-many relation:

SELECT milkshake_id, coronary_care_id
FROM Milkshake
INNER JOIN CoronaryCare ON Milkshake.person_id = CoronaryCare.person_id

As you can see, the Person table is not the link. Even when denormalized, you still have two one-to-many relationships, except that they're now recorded in the Milkshake and CoronaryCare tables.

The sources you refer to that recommend a link table refer to direct many-to-many relationships, not transitive many-to-many relationships. We don't need to record transitive relationships since we can generate them by combining two or more direct relationships.

Many-to-many relationships are a natural part of relational and entity-relationship data modeling and don't cause any issues in queries, as long as you understand them.



Related Topics



Leave a reply



Submit