Difference Between Two Lists with Duplicates in Python
You didn't specify whether the order matters. If it does not, you can do this in Python 2.7 or later:
l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
diff = c1-c2
print(list(diff.elements()))  # ['c']
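If the order does matter, one possible approach is to walk the first list and use a Counter of the second list as a "removal budget". This is only a sketch: the helper name ordered_multiset_difference is made up for illustration and is not part of collections.

```python
from collections import Counter

def ordered_multiset_difference(first, second):
    """Return the items of `first` that remain after cancelling one
    occurrence for every item in `second`, keeping the order of `first`."""
    to_remove = Counter(second)  # how many occurrences of each item to drop
    result = []
    for item in first:
        if to_remove[item] > 0:
            to_remove[item] -= 1  # cancel this occurrence
        else:
            result.append(item)   # nothing left to cancel, keep it
    return result

l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']
print(ordered_multiset_difference(l1, l2))  # ['c']
```

Note that this removes the earliest occurrences first; removing the latest ones instead would need a different loop.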
Get difference between two lists with Unique Entries
To get elements which are in temp1 but not in temp2 (assuming uniqueness of the elements in each list):
In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']
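Since a set has no defined order, the result above can come back in any order. If you want to keep the order of temp1, a membership test against a set is one option. The values of temp1 and temp2 below are assumed for illustration, since the original lists are not shown here.

```python
# Assumed inputs; the original lists are not shown in the answer.
temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

exclude = set(temp2)  # set lookup is O(1) on average
difference = [x for x in temp1 if x not in exclude]
print(difference)  # ['Three', 'Four']
```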
Beware that it is asymmetric:
In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1])
where you might expect/want it to equal set([1, 3]). If you do want set([1, 3]) as your answer, you can use set([1, 2]).symmetric_difference(set([2, 3])).
How can I compare two lists in Python and return matches
Not the most efficient one, but by far the most obvious way to do it is:
>>> a = [1, 2, 3, 4, 5]
>>> b = [9, 8, 7, 6, 5]
>>> set(a) & set(b)
{5}
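The set result is unordered and drops repeats. If you want the common values in the order they appear in the first list, something like the following should work. This is a sketch with made-up sample data; the names first and second are illustrative.

```python
# Illustrative sample data, not taken from the question.
first = [1, 2, 3, 4, 5]
second = [5, 3, 9, 1]

second_values = set(second)  # build once for fast membership tests
matches = [x for x in first if x in second_values]
print(matches)  # [1, 3, 5]
```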
If order is significant, you can do it with a list comprehension like this:
>>> [i for i, j in zip(a, b) if i == j]
[5]
(This only works for equal-sized lists, which order-significance implies.)
Intersection of two lists including duplicates?
You can use collections.Counter for this, which will provide the lowest count found in either list for each element when you take the intersection.
from collections import Counter
c = list((Counter(a) & Counter(b)).elements())
Outputs: [1, 1, 2, 3, 4]
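The input lists are not shown above, so here is a self-contained run with assumed values for a and b that happens to produce the same output. It shows how each element ends up with the smaller of its two counts.

```python
from collections import Counter

# Assumed inputs; the lists from the question are not shown in the answer.
a = [1, 1, 2, 3, 4, 4]
b = [1, 1, 1, 2, 3, 4, 5]

common = Counter(a) & Counter(b)  # per element: min(count in a, count in b)
c = list(common.elements())
print(c)  # [1, 1, 2, 3, 4]
```

Here 1 appears twice in a and three times in b, so it is kept twice; 4 appears twice in a but once in b, so it is kept once; 5 is only in b, so it is dropped.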
How to compare 2 lists and remove duplicates from 1 efficiently?
Convert to set, remove elements, then convert back to list.
s1 = set(array1)
s2 = set(array2)
array2 = list(s2.difference(s1))
Edit: To keep track of duplicates, you can use collections.Counter and reconstruct the list.
from collections import Counter
s1 = set(array1)
array2 = [x for x in array2 if x not in s1]
# d2 = Counter(array2)
# array2 = [z for k, v in d2.items() if k not in s1 for z in [k] * v]
EDIT2: I thought using Counter would be faster, but the secondary list construction in the comprehension seems to nullify any gains. You are better off just making the first set, then using that for existence checks.
Test: Counter and double comprehension
%%timeit
array1 = [random.randint(0, 10000) for _ in range(200000)]
array2 = [random.randint(0, 20000) for _ in range(200000)]
s1 = set(array1)
d2 = Counter(array2)
[z for k, v in d2.items() if k not in s1 for z in [k]*v]
# returns:
525 ms ± 19.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Test: single comprehension with existence check
%%timeit
array1 = [random.randint(0, 10000) for _ in range(200000)]
array2 = [random.randint(0, 20000) for _ in range(200000)]
s1 = set(array1)
#d2 = Counter(array2)
[x for x in array2 if x not in s1]
# returns:
510 ms ± 17.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
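Both cells above also time the random list generation, which may hide the difference between the two approaches. A possible way to time only the filtering step outside of IPython is the standard timeit module. This is a sketch: the function names are made up, the lists are smaller than in the tests above so it runs quickly, and the numbers will vary by machine.

```python
import random
import timeit
from collections import Counter

random.seed(0)  # fixed seed so repeated runs use the same data
array1 = [random.randint(0, 10000) for _ in range(20000)]
array2 = [random.randint(0, 20000) for _ in range(20000)]
s1 = set(array1)  # built once, outside the timed functions

def with_counter():
    # Count array2, then rebuild runs of each value that is not in array1.
    d2 = Counter(array2)
    return [z for k, v in d2.items() if k not in s1 for z in [k] * v]

def with_membership_check():
    # Single pass over array2 with a set lookup per element.
    return [x for x in array2 if x not in s1]

# Both keep the same elements, though not necessarily in the same order.
assert sorted(with_counter()) == sorted(with_membership_check())

runs = 20
for fn in (with_counter, with_membership_check):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1000:.2f} ms per call")
```

Only the membership-check version keeps the original order of array2; the Counter version groups equal values together.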
Compare two lists of floats where order and duplicates matter in Python
If I understand correctly, this should work:
sum(a != b for a, b in zip(listA, listB))
Gives the expected output of 2.
Note that because your problem description states that order is important, sets will be of no use here, as they are not ordered.
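Because these are floats, exact != can report a mismatch for values that differ only by rounding error. If that matters for your data, a tolerance-based count is one option. This is a sketch: count_mismatches is a hypothetical helper, the sample lists are made up, and the tolerances are simply the defaults of math.isclose.

```python
import math

def count_mismatches(list_a, list_b, rel_tol=1e-9, abs_tol=0.0):
    """Count positions where the two lists differ by more than the tolerance."""
    if len(list_a) != len(list_b):
        raise ValueError("lists must have the same length")
    return sum(
        not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(list_a, list_b)
    )

# Made-up sample data: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
list_a = [0.1 + 0.2, 1.0, 2.5]
list_b = [0.3, 1.0, 2.0]

print(sum(x != y for x, y in zip(list_a, list_b)))  # 2 (exact comparison)
print(count_mismatches(list_a, list_b))             # 1 (within tolerance)
```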
How do I subtract two lists with non-unique elements in Python?
If the order is not important, you can make Counters from the lists and subtract them.
from collections import Counter
list1 = ['a', 'c', 'a', 'b']
list2 = ['a', 'a', 'a', 'a', 'b', 'c', 'c', 'd', 'e', 'f']
final = Counter(list2) - Counter(list1)
print(list(final.elements())) # -> ['a', 'a', 'c', 'd', 'e', 'f']
It's being used as a multiset.
There are some caveats to "order is not important", like the fact that dicts in Python 3.7+ preserve insertion order, which is why the output here is ordered.
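One more thing to watch for: the - operator on Counters keeps only positive counts, so subtracting in the other direction silently gives an empty result here. If you need to see the deficits, Counter.subtract() keeps zero and negative counts. A short sketch using the same two lists:

```python
from collections import Counter

list1 = ['a', 'c', 'a', 'b']
list2 = ['a', 'a', 'a', 'a', 'b', 'c', 'c', 'd', 'e', 'f']

# The - operator drops every count that is zero or negative.
reverse = Counter(list1) - Counter(list2)
print(list(reverse.elements()))  # []

# subtract() works in place and keeps zero and negative counts.
signed = Counter(list1)
signed.subtract(Counter(list2))
print(signed['a'], signed['b'], signed['d'])  # -2 0 -1
```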