How to Deep Copy a Set of Data, and Change FK References to Point to All the Copies

How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

Here is an example with three tables that can probably get you started.

DB schema

CREATE TABLE users
  (user_id int auto_increment PRIMARY KEY,
   user_name varchar(32));
CREATE TABLE agenda
  (agenda_id int auto_increment PRIMARY KEY,
   user_id int,
   agenda_name varchar(7));
CREATE TABLE events
  (event_id int auto_increment PRIMARY KEY,
   agenda_id int,
   event_name varchar(8));

A stored procedure to clone a user along with their agenda and event records

DELIMITER $$
CREATE PROCEDURE clone_user(IN uid INT)
BEGIN
  DECLARE last_user_id INT DEFAULT 0;

  -- 1. Copy the user row and remember the new primary key.
  INSERT INTO users (user_name)
  SELECT user_name
    FROM users
   WHERE user_id = uid;

  SET last_user_id = LAST_INSERT_ID();

  -- 2. Copy the user's agendas, pointing them at the new user.
  INSERT INTO agenda (user_id, agenda_name)
  SELECT last_user_id, agenda_name
    FROM agenda
   WHERE user_id = uid;

  -- 3. Copy the events. Old and new agendas are paired by row number
  --    (both ordered by agenda_id) to map each old agenda_id to its copy.
  INSERT INTO events (agenda_id, event_name)
  SELECT a3.agenda_id_new, e.event_name
    FROM events e JOIN
  (SELECT a1.agenda_id agenda_id_old,
          a2.agenda_id agenda_id_new
     FROM
    (SELECT agenda_id, @n := @n + 1 n
       FROM agenda, (SELECT @n := 0) n
      WHERE user_id = uid
      ORDER BY agenda_id) a1 JOIN
    (SELECT agenda_id, @m := @m + 1 m
       FROM agenda, (SELECT @m := 0) m
      WHERE user_id = last_user_id
      ORDER BY agenda_id) a2 ON a1.n = a2.m) a3
      ON e.agenda_id = a3.agenda_id_old;
END$$
DELIMITER ;

To clone a user

CALL clone_user(3);

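The same idea can also be driven from application code by keeping an explicit map from each old key to its copy, instead of pairing rows by position. The following is only a minimal sketch in Python using the standard sqlite3 module, assuming the three-table schema above translated to SQLite syntax; the sample rows and the Python function clone_user are made up for illustration.

```python
import sqlite3


def clone_user(conn, uid):
    """Copy a user with their agendas and events; return the new user id."""
    row = conn.execute(
        "SELECT user_name FROM users WHERE user_id = ?", (uid,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no user with id {uid}")
    new_uid = conn.execute(
        "INSERT INTO users (user_name) VALUES (?)", row
    ).lastrowid

    # Copy agendas one by one so each old id can be mapped to its new id.
    agenda_map = {}
    old_agendas = conn.execute(
        "SELECT agenda_id, agenda_name FROM agenda "
        "WHERE user_id = ? ORDER BY agenda_id",
        (uid,),
    ).fetchall()
    for old_id, name in old_agendas:
        agenda_map[old_id] = conn.execute(
            "INSERT INTO agenda (user_id, agenda_name) VALUES (?, ?)",
            (new_uid, name),
        ).lastrowid

    # Copy events, rewriting the foreign key through the map.
    for old_id, new_id in agenda_map.items():
        conn.execute(
            "INSERT INTO events (agenda_id, event_name) "
            "SELECT ?, event_name FROM events WHERE agenda_id = ?",
            (new_id, old_id),
        )
    return new_uid


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (user_id INTEGER PRIMARY KEY AUTOINCREMENT, user_name TEXT);
CREATE TABLE agenda (agenda_id INTEGER PRIMARY KEY AUTOINCREMENT,
                     user_id INTEGER, agenda_name TEXT);
CREATE TABLE events (event_id INTEGER PRIMARY KEY AUTOINCREMENT,
                     agenda_id INTEGER, event_name TEXT);
INSERT INTO users (user_name) VALUES ('alice');
INSERT INTO agenda (user_id, agenda_name) VALUES (1, 'work'), (1, 'home');
INSERT INTO events (agenda_id, event_name)
VALUES (1, 'standup'), (1, 'review'), (2, 'dinner');
""")
new_uid = clone_user(conn, 1)
```

The explicit map costs one INSERT per parent row, but it does not depend on the copies being generated in the same order as the originals.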

Deep Copy/Duplicating Object with Virtual Navigation Properties

It depends on the relationship. References are important in EF, so you need to consider whether you want the new clone to reference the same UserData or a new and distinct UserData with the same data. Typically in a many-to-one relationship you want to use the same reference, or update the reference to match. If the original was modified by "John Smith" ID #201, a clone would be modified by "John Smith" ID #201, or changed to the current user "Jane Doe" ID #405, which would be the same "Jane Doe" reference as any other record that user modified. You likely would not want EF to create a new "John Smith", which would end up with an ID #545 because EF was given a brand new reference to a UserData that is a copy of "John Smith".

So in your case, I would assume that you would want to refer to the same, existing user instance, so your approach is correct. Where you would need to be careful is when using a shortcut like serialization/deserialization to make clones. In that case serializing the Project and any loaded UpdatedBy reference would create a new instance of a UserData with the same fields and even the same PK value. However, when you go to save this new Project with its new UserData reference, you're either going to end up with a duplicate PK exception, an "Object with same key already tracked" exception, or find yourself with a new "John Smith" record with an ID of #545 if that entity is set up to expect an Identity column for its PK.
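The difference between sharing a reference and duplicating it is easy to see outside EF. The following is a plain-Python sketch, not EF code: the UserData and Project classes are stand-ins modeled on the entities discussed here, and copy.deepcopy plays the role of a serialize/deserialize round trip.

```python
import copy
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class UserData:
    id: int
    name: str


@dataclass
class Project:
    id: Optional[int]
    project_name: str
    updated_by: UserData


def clone_sharing_user(project):
    # replace() builds a new Project but copies field values as they are,
    # so updated_by is the very same UserData object as on the original.
    return replace(project, id=None)


def clone_like_serialization(project):
    # deepcopy also copies the referenced UserData, which is what a
    # serialize/deserialize round trip would hand back.
    clone = copy.deepcopy(project)
    clone.id = None
    return clone


john = UserData(id=201, name="John Smith")
original = Project(id=1, project_name="Website", updated_by=john)

shared = clone_sharing_user(original)
detached = clone_like_serialization(original)

print(shared.updated_by is john)    # same reference
print(detached.updated_by is john)  # distinct object carrying the same key
```

The second clone carries a separate UserData object with the same key value, which is the situation that leads to the duplicate or tracking problems described above.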

Regarding the typical advice on the use of navigation properties vs. FK fields: my advice is to use one or the other, not both. The reason for this is that when you use both you have two sources of truth for the relationship and, depending on the state of the entity, when you change one, the other does not necessarily reflect the change automatically. For instance, some code may look at the relationship by going: project.UpdatedByFk, while other code might use project.UpdatedByFkNavigation.Id. Your naming convention is a bit odd when it comes to the navigation property. For your example I would have expected:

public virtual UserData UpdatedBy { get; set; }

In general I would use the navigation property solely and rely on a shadow property in EF for the FK. This would look like:

public partial class Project
{
    [Key]
    public int Id { get; set; }
    [Required]
    [StringLength(150)]
    public string ProjectName { get; set; }

    [ForeignKey("UpdatedBy_Fk")] // EF Core. For EF6 this needs to be done via configuration using .Map(MapKey()).
    public virtual UserData UpdatedBy { get; set; }
}

Here we define the navigation property and by nominating the FK column name, EF will create a field behind the scenes for that FK which isn't directly accessible. Our code exposes one source of truth for the relationship.
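As a language-neutral illustration of the "one source of truth" idea, here is a small plain-Python analogy. It is not EF and does not model shadow properties exactly; the classes are hypothetical stand-ins. The relationship is stored only as the object reference, and the key is derived from it on demand, so the two can never disagree.

```python
from dataclasses import dataclass


@dataclass
class UserData:
    id: int
    name: str


@dataclass
class Project:
    id: int
    project_name: str
    updated_by: UserData  # the only stored form of the relationship

    @property
    def updated_by_id(self) -> int:
        # Derived from the reference each time, so it cannot drift.
        return self.updated_by.id


project = Project(1, "Website", UserData(201, "John Smith"))
project.updated_by = UserData(405, "Jane Doe")
print(project.updated_by_id)
```

Because updated_by_id has no setter, code cannot change the key without changing the reference.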

In certain cases where speed is important and I have little to no need for the related data, I will declare the FK property and no navigation property.

In reference to this:

[InverseProperty(nameof(UserData.ProjectUpdatedByFkNavigations))]

I would also recommend avoiding bi-directional references unless they are absolutely necessary for the same reason. If I want all projects last modified by a given user, I don't really stand to gain anything by:

var projects = context.Users
    .Where(x => x.Id == userId)
    .SelectMany(x => x.UpdatedProjects)
    .ToList();

I would just use:

var projects = context.Projects
    .Where(x => x.UpdatedBy.Id == userId)
    .ToList();

In general you should look to organize your domain and the relationships within it by aggregate roots: Essentially entities that are of top-level importance within the application. Bidirectional references have similar issues of having two sources of truth that don't necessarily match at a given point of time when modifying those relationships from one side. It depends largely on whether all relationships are eager loaded or not.

Where both entities are aggregate roots and the relationship is important enough, then this can afford a bi-directional reference and the extra attention it deserves. A good example of that might be many-to-many relationships like the relationship between a CourseClass (i.e. Math Class A) and Students, where a CourseClass has many Students, while a Student has many CourseClasses, and it makes sense from a CourseClass perspective to list its Students, and from a Student perspective to list their CourseClasses.

Duplicate Django Model Instance and All Foreign Keys Pointing to It

You can create a new instance and save it like this:

def duplicate(self):
    # Copy every concrete field value except the primary key.
    kwargs = {}
    for field in self._meta.fields:
        kwargs[field.name] = getattr(self, field.name)
        # or self.__dict__[field.name]
    kwargs.pop('id')
    new_instance = self.__class__(**kwargs)
    new_instance.save()
    # now you have id for the new instance so you can
    # create related models in similar fashion
    fkeys_qs = self.fkeys.all()
    new_fkeys = []
    for fkey in fkeys_qs:
        fkey_kwargs = {}
        for field in fkey._meta.fields:
            fkey_kwargs[field.name] = getattr(fkey, field.name)
        fkey_kwargs.pop('id')
        # Point the copy at the new parent. Django expects a model
        # instance here, not a raw id, when using the field name.
        fkey_kwargs['foreign_key_field'] = new_instance
        new_fkeys.append(fkeys_qs.model(**fkey_kwargs))
    fkeys_qs.model.objects.bulk_create(new_fkeys)
    return new_instance

I'm not sure how it'll behave with ManyToMany fields. But for simple fields it works. And you can always pop the fields you are not interested in for your new instance.

The bits where I'm iterating over _meta.fields could be done with copy, but the important thing is to point foreign_key_field at the new instance.

I'm sure it's programmatically possible to detect which fields are foreign keys to self.__class__ (foreign_key_field), but since you can have more than one of them it'll be better to name the one (or more) explicitly.
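To show why explicit naming is safer, here is a plain-Python analogy using dataclasses rather than Django models. The Author and Book classes and the reference_fields helper are hypothetical; in Django itself the equivalent information would come from inspecting the model's _meta fields. Detection finds every field that points at the parent type, and it cannot tell which of them should be re-pointed at the copy.

```python
from dataclasses import dataclass, fields


@dataclass
class Author:
    id: int
    name: str


@dataclass
class Book:
    id: int
    title: str
    author: Author
    editor: Author  # a second reference to the same parent type


def reference_fields(child_cls, parent_cls):
    """Return the names of child_cls fields annotated with parent_cls."""
    return [f.name for f in fields(child_cls) if f.type is parent_cls]


found = reference_fields(Book, Author)
print(found)
```

With two matching fields, automatic detection alone cannot decide which relationship the duplicate should follow, so the caller still has to choose.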

How to deep copy a Hibernate entity while using a newly generated entity identifier

I am also working with Hibernate and had the same requirement. My approach was to implement Cloneable. Below is a code example of how to do it.

class Person implements Cloneable {

    private String firstName;
    private String lastName;

    @Override
    public Object clone() {

        Person obj = new Person();
        obj.setFirstName(this.firstName);
        obj.setLastName(this.lastName);

        return obj;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
}

Alternatively, you could use a reflection-based solution, but I wouldn't recommend that.


