How to Make Lists Contain Only Distinct Elements in Python

How do you make a list contain only distinct elements in Python?

The simplest approach is to convert the list to a set and then back to a list:

my_list = list(set(my_list))

One disadvantage of this approach is that it does not preserve the original order. You may also want to consider whether a set would be a better data structure to use in the first place, instead of a list.

Add only unique values to a list in Python

To eliminate duplicates from a list, you can maintain an auxiliary list and check new items against it.

myList = ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and', 
'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light',
'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the',
'through', 'what', 'window', 'with', 'yonder']

auxiliaryList = []
for word in myList:
    if word not in auxiliaryList:
        auxiliaryList.append(word)

output:

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 
'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick',
'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

This is very simple to understand, and the code is self-explanatory. However, that simplicity comes at the expense of efficiency: the linear scan over a growing auxiliary list makes the overall algorithm quadratic rather than linear.
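If you want to keep the simple loop but make it linear, one option (a sketch, not part of the original answer) is to do the membership checks against an auxiliary set while still appending to the result list, which preserves the first-seen order:

seen = set()
auxiliaryList = []
for word in myList:
    if word not in seen:          # O(1) average-case membership test
        seen.add(word)
        auxiliaryList.append(word)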


If the order is not important, you could use set()

A set object is an unordered collection of distinct hashable objects.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

Since the average case for membership checking in a hash-table is O(1), using a set is more efficient.

auxiliaryList = list(set(myList))

output:

['and', 'envious', 'already', 'fair', 'is', 'through', 'pale', 'yonder', 
'what', 'sun', 'Who', 'But', 'moon', 'window', 'sick', 'east', 'breaks',
'grief', 'with', 'light', 'It', 'Arise', 'kill', 'the', 'soft', 'Juliet']
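To see the efficiency difference in practice, here is a rough timing sketch (the helper names and test data are made up for illustration, and exact numbers will vary by machine):

import random
import timeit

data = [random.randrange(1000) for _ in range(10_000)]

def list_scan_dedup(items):
    seen = []
    for item in items:
        if item not in seen:      # linear scan over a growing list
            seen.append(item)
    return seen

def set_dedup(items):
    return list(set(items))       # hash-based membership, order not preserved

print(timeit.timeit(lambda: list_scan_dedup(data), number=10))
print(timeit.timeit(lambda: set_dedup(data), number=10))

On input of this size, the set-based version should be faster by a couple of orders of magnitude.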

How to get only distinct values from a list?

This is what you need, using set():

>>> lst1 = ['A','A','A','B','C','C','D','D','D','B','B']
>>> list(set(lst1))
['A', 'B', 'D', 'C']

Another solution is OrderedDict, which keeps the keys in insertion order.

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(lst1))
['A', 'B', 'C', 'D']
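On Python 3.7 and later a plain dict also preserves insertion order, so OrderedDict is not strictly required (a small variation on the answer above):

>>> list(dict.fromkeys(lst1))
['A', 'B', 'C', 'D']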

If you have the liberty to use pandas, try the following:

>>> import pandas as pd
>>> drop_dups = pd.Series(lst1).drop_duplicates().tolist()
>>> drop_dups
['A', 'B', 'C', 'D']
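pandas also provides pd.unique, which keeps the order of first appearance; a short sketch using the same lst1 as above:

>>> pd.unique(pd.Series(lst1)).tolist()
['A', 'B', 'C', 'D']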

If you are looking for the common values between two files:

$ cat getcomn_vals.py
#!/python/v3.6.1/bin/python3
def print_common_members(a, b):
    """
    Given two sets, print the intersection, or "No common elements".
    The file objects are passed straight to set(), so each line
    (newline included) becomes a set member; strip('\n') cleans that
    up when printing.
    """
    print('\n'.join(s.strip('\n') for s in a & b) or "No common elements")

with open('file1.txt') as file1, open('file2.txt') as file2:
    dataset1 = set(file1)
    dataset2 = set(file2)
    print_common_members(dataset1, dataset2)
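As a usage sketch (the file contents below are made up for illustration), suppose file1.txt and file2.txt contain:

$ cat file1.txt
apple
banana
cherry
$ cat file2.txt
banana
cherry
date

$ python3 getcomn_vals.py
banana
cherry

The two common lines are printed, though in arbitrary order, since sets are unordered.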

A way to make each list within a numpy array contain only unique elements

import numpy as np

arr = np.array([list(['nan', 'nan']),
                list(['nan', 'nan', 'apple', 'apple', 'banana', 'nan', 'nan']),
                list(['red', 'red']), ...,
                list(['nan', 'festival'])], dtype=object)

Try it via a list comprehension:

out=[np.unique(x).tolist() for x in arr]

OR

out=[list(np.unique(x)) for x in arr]

output of out:

[['nan'], ['apple', 'banana', 'nan'], ['red'], [Ellipsis], ['festival', 'nan']]

(The [Ellipsis] entry comes from the literal ... placeholder in the example input, which Python parses as the Ellipsis object.)

Dividing a Python list into lists containing only unique values

It sounds like you want to generate random samples from your IDs, without replacement, for each day in your schedule.

To do this you can use numpy.random.choice. You will see from the docs that it takes a keyword argument, size, which is the number of samples to take, and another keyword argument, replace, whose default value is True.

So something like:

numpy.random.choice(IDs, size=numberOfTasks, replace=False)

will generate one day's worth of scheduling for you.

A more complete, but simple example is as follows:

import numpy

ndays = 7
njobs = 10
people = range(17)

days = [numpy.random.choice(people, size=njobs, replace=False) for d in range(ndays)]
schedule = numpy.array(days)

which gives the example schedule:

array([[11, 12, 14,  2,  0,  3, 10,  1,  6, 13],
[ 8, 15, 7, 0, 12, 3, 1, 6, 10, 13],
[ 2, 9, 16, 4, 5, 15, 0, 8, 7, 11],
[ 1, 4, 10, 16, 6, 12, 2, 15, 13, 9],
[ 8, 1, 7, 13, 12, 0, 3, 15, 4, 9],
[ 2, 5, 7, 3, 9, 10, 13, 15, 0, 8],
[ 7, 13, 14, 6, 8, 16, 3, 11, 1, 9]])
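As a quick sanity check (a sketch, not part of the original answer), you can confirm that nobody is scheduled twice on the same day, since replace=False forbids repeats within a single draw:

assert all(len(numpy.unique(day)) == njobs for day in schedule)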

A 'fair' schedule

Your requirement for some sort of fairness is more difficult to enforce. I'm not totally convinced that your strategy of using a kind of worker pool works in general, though it may work reasonably well most of the time. Here is a short example which uses a pool. (Note that the extra work of computing the remainder and topping the pool up with randomly sampled workers is, in my opinion, not necessary, since you will be randomly sampling from the pool anyway.)

import numpy

ndays = 7
njobs = 10
people = list(range(17))
pool = people * ndays

schedule = numpy.zeros((ndays, njobs), dtype=int)

for day in range(ndays):
    schedule[day, :] = numpy.random.choice(numpy.unique(pool), size=njobs, replace=False)
    for person in schedule[day, :]:
        pool.remove(person)

which gives the example schedule:

array([[12, 13,  0,  1,  2, 11,  6,  8, 16,  9],
[15, 8, 3, 10, 5, 12, 7, 0, 11, 4],
[12, 7, 13, 4, 0, 3, 15, 9, 14, 10],
[14, 6, 16, 9, 4, 15, 11, 5, 10, 3],
[ 0, 13, 6, 1, 12, 5, 15, 4, 7, 9],
[13, 15, 16, 3, 5, 2, 8, 4, 6, 7],
[ 2, 3, 15, 5, 4, 10, 0, 8, 9, 1]])

(You can get a (10, 7)-shaped schedule with schedule.T.)
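To get a rough sense of how fair the result is (again, just a sketch), count how many shifts each person ends up with across the week; with 70 slots and 17 people, the counts should come out at roughly 70 / 17 ≈ 4 each:

from collections import Counter

print(Counter(schedule.ravel().tolist()))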

With regard to your original example, the line pool.remove(pickedPerson) looks suspicious to me and was more likely intended as plannablePeople.remove(pickedPerson). There is also a small indexing mistake: daySchedule[i] = pickedPerson should probably be daySchedule[j] = pickedPerson. After correcting these, the example code in your question works well for me.

Notice also that your problem is almost identical to the problem of generating Latin Squares (actually a Latin Rectangle in your case, which you could obtain from any sufficiently large Latin Square). Although generating a single Latin Square is easy enough, sampling uniformly (i.e. fairly) from all Latin Squares is, as far as I know, an extremely hard (NP-complete) problem. This hints (though it is certainly not a proof) that it might be very hard to enforce the fairness requirements in your problem too.

How to find unique elements in a list in Python? (Without using set)

This is the simplest way to do it:

a = [1, 2, 2, 3]
b = []
for i in a:
    if i not in b:
        b.append(i)

print(b)
[1, 2, 3]
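A one-line variant that still avoids set() (a sketch, not from the original answer) keeps each element only if it has not already appeared earlier in the list; like the loop above, it is quadratic:

a = [1, 2, 2, 3]
b = [x for i, x in enumerate(a) if x not in a[:i]]
print(b)
[1, 2, 3]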

Get unique values in List of Lists

array = [['a','b'], ['a', 'b','c'], ['a']]
result = {x for l in array for x in l}
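result here is a set; if you need a list (a hypothetical follow-up, not part of the original answer), wrap it in list() or sorted():

print(sorted(result))
['a', 'b', 'c']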

