How to find duplicates in a List to merge them
You can collect them to a Map based on the id and merge the children using the mergeFunction. Then map them back to final objects as:
private Collection<Foo> mergeDuplicates(Collection<Foo> fooCollection) {
    return fooCollection.stream()
            .collect(Collectors.toMap(Foo::getId, Foo::getChildren, this::mergeChildren))
            .entrySet().stream()
            .map(e -> new Foo(e.getKey(), e.getValue()))
            .collect(Collectors.toCollection(ArrayList::new)); // collect accordingly
}
with the updated mergeChildren method implemented in the same class as:
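private Collection<String> mergeChildren(Collection<String> foo1Children, Collection<String> foo2Children) {
    // Copy first so the children of the original Foo are not modified
    // (and so an unmodifiable input collection does not throw)
    Collection<String> merged = new ArrayList<>(foo1Children);
    merged.addAll(foo2Children);
    return merged;
}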
Note: The mergeFunction ((a, b) -> {...}) is executed only when id-based duplicates are identified.
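For readers comparing languages, the same collect-by-id-then-merge idea can be sketched in Python with a plain dict. This is only an illustration, not the answer's code: the Foo dataclass below is a hypothetical stand-in for the Foo class in the question, and merge_duplicates is a name made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Foo:
    # Hypothetical stand-in for the Foo class used in the answer above.
    id: int
    children: list = field(default_factory=list)


def merge_duplicates(foos):
    merged = {}  # id -> combined children, kept in first-seen order
    for foo in foos:
        if foo.id in merged:
            # Only reached for a repeated id, like the merge function in toMap.
            merged[foo.id].extend(foo.children)
        else:
            # Copy so the caller's lists are not mutated.
            merged[foo.id] = list(foo.children)
    return [Foo(id_, children) for id_, children in merged.items()]


foos = [Foo(1, ["a"]), Foo(2, ["b"]), Foo(1, ["c"])]
print(merge_duplicates(foos))
# [Foo(id=1, children=['a', 'c']), Foo(id=2, children=['b'])]
```

The merge branch runs once per repeated id, which mirrors the note about when the merge function is called.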
How to find duplicate values in a list and merge them
Sort the list, then use itertools.groupby:
>>> from itertools import groupby
>>> l = ['a','b','a','b','c','c']
>>> [list(g) for _, g in groupby(sorted(l))]
[['a', 'a'], ['b', 'b'], ['c', 'c']]
EDIT: this is probably not the fastest approach. Sorting takes O(n log n) time in the average case, and it is not required for every solution (see the comments).
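One way to avoid the sort, assuming the items are hashable, is to group them in a dict in a single pass. The helper name group_duplicates is made up for this sketch. Groups come out in first-seen order rather than sorted order.

```python
from collections import defaultdict


def group_duplicates(items):
    # One pass over the input; each item is appended to the group for its value.
    groups = defaultdict(list)
    for item in items:
        groups[item].append(item)
    # Dicts preserve insertion order (Python 3.7+), so groups appear
    # in the order their value was first seen.
    return list(groups.values())


l = ['a', 'b', 'a', 'b', 'c', 'c']
print(group_duplicates(l))
# [['a', 'a'], ['b', 'b'], ['c', 'c']]
```

This runs in O(n) time on average, at the cost of requiring hashable items.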
Find duplicate object values in an array and merge them - JAVASCRIPT
I'm not sure if you're looking for pure JavaScript, but if you are, here's one solution. It's a bit heavy on nesting, but it gets the job done.
// Loop through all objects in the array
for (var i = 0; i < jsonData.length; i++) {
    // Loop through all of the objects beyond i
    // Don't increment automatically; we will do this later
    for (var j = i + 1; j < jsonData.length; ) {
        // Check if our x values are a match
        if (jsonData[i].x == jsonData[j].x) {
            // Loop through all of the keys in our matching object
            for (var key in jsonData[j]) {
                // Ensure the key actually belongs to the object
                // This is to avoid any prototype inheritance problems
                if (jsonData[j].hasOwnProperty(key)) {
                    // Copy over the values to the first object
                    // Note this will overwrite any values if the key already exists!
                    jsonData[i][key] = jsonData[j][key];
                }
            }
            // After copying the matching object, delete it from the array
            // By deleting this object, the "next" object in the array moves back one
            // Therefore it will be what j is prior to being incremented
            // This is why we don't automatically increment
            jsonData.splice(j, 1);
        } else {
            // If there's no match, increment to the next object to check
            j++;
        }
    }
}
Note there is no defensive code in this sample; you probably want to add a few checks to make sure the data you have is formatted correctly before passing it along.
Also keep in mind that you might have to decide how to handle instances where two keys overlap but do not match (e.g. two objects both having machine1, but one with the value of 5 and the other with the value of 9). As is, whatever object comes later in the array will take precedence.
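To make the precedence choice explicit, here is a minimal sketch in Python (used for the added examples on this page) rather than JavaScript. The function name merge_by_key, the prefer parameter, and the sample records are all assumptions for illustration; only the x and machine1 names come from the answer.

```python
def merge_by_key(records, key="x", prefer="last"):
    """Merge dicts sharing the same value under `key`.

    prefer="last" lets later records overwrite earlier values (the behavior
    of the loop above); prefer="first" keeps the value seen first.
    """
    merged = {}  # key value -> merged record, in first-seen order
    for record in records:
        target = merged.setdefault(record[key], {})
        for name, value in record.items():
            if prefer == "first" and name in target:
                continue  # keep the earlier value on conflict
            target[name] = value
    return list(merged.values())


data = [
    {"x": 1, "machine1": 5},
    {"x": 2, "machine1": 7},
    {"x": 1, "machine1": 9, "machine2": 3},
]
print(merge_by_key(data))                  # machine1 for x=1 ends up as 9
print(merge_by_key(data, prefer="first"))  # machine1 for x=1 stays 5
```

Unlike the in-place loop, this sketch builds new dicts and leaves the input list untouched.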
Checking Duplicate values and merge them php mysql
Here is a snippet that will work with the data format you posted.
$initialData = $data = [
    [
        'max_start' => '2020-07-02 05:30:00',
        'max_end' => '2020-07-02 06:30:00',
    ],
    [
        'max_start' => '2020-07-02 07:00:00',
        'max_end' => '2020-07-02 07:30:00',
    ],
    [
        'max_start' => '2020-07-02 06:30:00',
        'max_end' => '2020-07-02 07:00:00',
    ],
    [
        'max_start' => '2020-07-02 06:30:00',
        'max_end' => '2020-07-02 07:30:00',
    ]
];
// Order the list chronologically by the "max_start" value, to make comparison easier later
usort($data, function($a, $b){
    return $a['max_start'] <=> $b['max_start'];
});
// Final result will be collected here
$result = [];
// Work with the first list value as long as there is one
while ($currentInterval = array_shift($data)) {
    // Compare with each other value in the list
    foreach ($data as $index => $interval) {
        // Check if intervals start at the same time
        if ($interval['max_start'] == $currentInterval['max_start']) {
            // Merge when needed
            $currentInterval['max_end'] = max($currentInterval['max_end'], $interval['max_end']);
            // Remove the merged interval
            unset($data[$index]);
        }
    }
    // Add to result
    $result[] = $currentInterval;
}
echo 'Initial list: ', PHP_EOL, print_r($initialData, true);
echo 'Merged list: ', PHP_EOL, print_r($result, true);
This snippet has the following output:
Initial list:
Array
(
[0] => Array
(
[max_start] => 2020-07-02 05:30:00
[max_end] => 2020-07-02 06:30:00
)
[1] => Array
(
[max_start] => 2020-07-02 07:00:00
[max_end] => 2020-07-02 07:30:00
)
[2] => Array
(
[max_start] => 2020-07-02 06:30:00
[max_end] => 2020-07-02 07:00:00
)
[3] => Array
(
[max_start] => 2020-07-02 06:30:00
[max_end] => 2020-07-02 07:30:00
)
)
Merged list:
Array
(
[0] => Array
(
[max_start] => 2020-07-02 05:30:00
[max_end] => 2020-07-02 06:30:00
)
[1] => Array
(
[max_start] => 2020-07-02 06:30:00
[max_end] => 2020-07-02 07:30:00
)
[2] => Array
(
[max_start] => 2020-07-02 07:00:00
[max_end] => 2020-07-02 07:30:00
)
)
Let me know if it fits your needs or if further tweaking is required.
For PHP versions prior to 7.0, replace the usort code with this one:
usort($data, function($a, $b){
    if ($a['max_start'] == $b['max_start']) {
        return 0;
    }
    // Ascending order, matching the <=> version above
    return $a['max_start'] > $b['max_start'] ? 1 : -1;
});
Note that PHP 5.6 reached its end-of-life status on 31 December 2018, so using it is no longer recommended.
How to merge 2 List<T> and removing duplicate values from it in C#
Have you had a look at Enumerable.Union?
This method excludes duplicates from the return set. This is different behavior to the Concat method, which returns all the elements in the input sequences including duplicates.
List<int> list1 = new List<int> { 1, 12, 12, 5};
List<int> list2 = new List<int> { 12, 5, 7, 9, 1 };
List<int> ulist = list1.Union(list2).ToList();
// ulist output : 1, 12, 5, 7, 9
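For comparison only, the difference between concatenating and taking a union can be sketched in Python with the same numbers. This is not part of the C# answer; dict.fromkeys is used here as one order-preserving way to drop duplicates, and the variable names are made up.

```python
list1 = [1, 12, 12, 5]
list2 = [12, 5, 7, 9, 1]

# Concatenation keeps every element, duplicates included.
concatenated = list1 + list2

# dict.fromkeys keeps the first occurrence of each value, in order.
union = list(dict.fromkeys(concatenated))

print(concatenated)  # [1, 12, 12, 5, 12, 5, 7, 9, 1]
print(union)         # [1, 12, 5, 7, 9]
```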
Merge duplicate values in a dictionary
Similar to the established solutions you found, you can store a representation of the lists (and lists of lists) as strings, which makes using them as dictionary keys straightforward.
def timesheetMerge(timesheet):
    output = []
    unique_shifts = {}
    for key, val in timesheet.items():
        if str(val) not in unique_shifts.keys():
            unique_shifts[str(val)] = len(output)
            output.append({"weekdays": [int(key)], "time_spans": val})
        else:
            output[unique_shifts[str(val)]]["weekdays"].append(int(key))
    return output
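A variant of the same idea, sketched with json.dumps instead of str to build the lookup key, might look like the following. The name timesheet_merge and the sample timesheet are assumptions for illustration; the original data format is not shown here, so the shape of the input (weekday strings mapped to lists of time spans) is a guess based on the function above.

```python
import json


def timesheet_merge(timesheet):
    output = []
    positions = {}  # serialized time spans -> index of their entry in output
    for day, spans in timesheet.items():
        # sort_keys makes the key stable if the spans contain dicts.
        signature = json.dumps(spans, sort_keys=True)
        if signature not in positions:
            positions[signature] = len(output)
            output.append({"weekdays": [int(day)], "time_spans": spans})
        else:
            output[positions[signature]]["weekdays"].append(int(day))
    return output


# Hypothetical input: weekday number (as a string) -> list of time spans.
timesheet = {
    "1": [["09:00", "17:00"]],
    "2": [["09:00", "17:00"]],
    "3": [["10:00", "14:00"]],
}
print(timesheet_merge(timesheet))
# [{'weekdays': [1, 2], 'time_spans': [['09:00', '17:00']]},
#  {'weekdays': [3], 'time_spans': [['10:00', '14:00']]}]
```

Days 1 and 2 share identical spans, so they end up in one entry; day 3 gets its own.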