Python Equivalent of Filter() Getting Two Output Lists (I.E. Partition of a List)

python equivalent of filter() getting two output lists (i.e. partition of a list)

Try this:

def partition(pred, iterable):
    trues = []
    falses = []
    for item in iterable:
        if pred(item):
            trues.append(item)
        else:
            falses.append(item)
    return trues, falses

Usage:

>>> trues, falses = partition(lambda x: x > 10, [1,4,12,7,42])
>>> trues
[12, 42]
>>> falses
[1, 4, 7]

There is also an implementation suggestion in itertools recipes:

from itertools import filterfalse, tee

def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

The recipe comes from the Python 3.x documentation. In Python 2.x filterfalse is called ifilterfalse.

python equivalent of scala partition

Using filter (require two iteration):

>>> items = [1,2,3,4,5]
>>> inGroup = filter(is_even, items) # list(filter(is_even, items)) in Python 3.x
>>> outGroup = filter(lambda n: not is_even(n), items)
>>> inGroup
[2, 4]
>>> outGroup

Simple loop:

def partition(item, filter_):
    inGroup, outGroup = [], []
    for n in items:
        if filter_(n):
            inGroup.append(n)
        else:
            outGroup.append(n)
    return inGroup, outGroup

Example:

>>> items = [1,2,3,4,5]
>>> inGroup, outGroup = partition(items, is_even)
>>> inGroup
[2, 4]
>>> outGroup
[1, 3, 5]

List splitting by predicate

Maybe

for r in results:
    (okays if success_condition(r) else errors).append(r)

But that doesn't look/feel very Pythonic.

Not directly relevant, but if one is looking for efficiency, caching the method look-ups would be better:

okays_append = okays.append
errors_append = errors.append

for r in results:
    (okays_append if success_condition(r) else errors_append)(r)

Which is even less Pythonic.

I'm getting a list of lists in my reducer output rather than a paired value and I am unsure of what to change in my code

From the documentation for Counter.most_common explains why you get a list of lists.

most_common(n=None) method of collections.Counter instance
    List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.
    
    >>> Counter('abracadabra').most_common(3)
    [('a', 5), ('b', 2), ('r', 2)]

Because sorting from highest to lowest frequency is like sorting in descending order, but sorting alphabetically is in ascending order, you can use a custom tuple where you take the negative of the frequency and sort everything in ascending order.

from collections import Counter

words = Counter(['coronavirus'] * 4 + ['economy'] * 2 + ['china'] * 2 + ['whatever'])
x = Counter(words)
most_common = x.most_common(3)
# After sorting you need to discard the freqency from each (word, freq) tuple
result = ' '.join(word for word, _ in sorted(most_common, key=lambda x: (-x[1], x[0])))

How to split a list based on a condition?

good = [x for x in mylist if x in goodvals]
bad  = [x for x in mylist if x not in goodvals]
is there a more elegant way to do this?

That code is perfectly readable, and extremely clear!

# files looks like: [ ('file1.jpg', 33L, '.jpg'), ('file2.avi', 999L, '.avi'), ... ]
IMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')
images = [f for f in files if f[2].lower() in IMAGE_TYPES]
anims  = [f for f in files if f[2].lower() not in IMAGE_TYPES]

Again, this is fine!

There might be slight performance improvements using sets, but it's a trivial difference, and I find the list comprehension far easier to read, and you don't have to worry about the order being messed up, duplicates being removed as so on.

In fact, I may go another step "backward", and just use a simple for loop:

images, anims = [], []

for f in files:
    if f.lower() in IMAGE_TYPES:
        images.append(f)
    else:
        anims.append(f)

The a list-comprehension or using set() is fine until you need to add some other check or another bit of logic - say you want to remove all 0-byte jpeg's, you just add something like..

if f[1] == 0:
    continue

Filter a list into two parts by a predicate

I don't think there is a built-in and your version is sub-optimal because it traverses the list twice and calls the predicate on each list element twice.

(defun filter-list-into-two-parts (predicate list)
  (loop for x in list
    if (funcall predicate x) collect x into yes
    else collect x into no
    finally (return (values yes no))))

I return two values instead of the list thereof; this is more idiomatic (you will be using multiple-value-bind to extract yes and no from the multiple values returned, instead of using destructuring-bind to parse the list, it conses less and is faster, see also values function in Common Lisp).

A more general version would be

(defun split-list (key list &key (test 'eql))
  (let ((ht (make-hash-table :test test)))
    (dolist (x list ht)
      (push x (gethash (funcall key x) ht '())))))
(split-list (lambda (x) (mod x 3)) (loop for i from 0 to 9 collect i))
==> #S(HASH-TABLE :TEST FASTHASH-EQL (2 . (8 5 2)) (1 . (7 4 1)) (0 . (9 6 3 0)))

List comprehension vs. lambda + filter

It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.

There are two things that may slow down your use of filter.

The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.

The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value through a closure and this difference won't apply.

The other option to consider is to use a generator instead of a list comprehension:

def filterbyvalue(seq, value):
   for el in seq:
       if el.attribute==value: yield el

Then in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.

Is it possible to compile results into unique lists from inside of a comprehension?

Not without using side effects and throwing away the result. You can do this though:

even = []
odd = []
for i in my_list:
    (odd if i % 2 else even).append(i)

This problem in general is called partitioning the list, you can find some solutions by searching SO, but none are much cleaner (in Python).

if-else in python list comprehensions

You can only construct one list at a time with list comprehension. You'll want something like:

nums = [foo for foo in mixed_list if foo.isdigit()]
strings = [foo for foo in mixed_list if not foo.isdigit()]

Python Equivalent of Filter() Getting Two Output Lists (I.E. Partition of a List)