List comprehension vs. lambda + filter
It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter + lambda, but use whichever you find easier.
There are two things that may slow down your use of filter.
The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.
The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable, and in Python 2.x the list comprehension only accesses local variables. In Python 3.x up to 3.11 the list comprehension runs in a separate function, so it will also be accessing value through a closure and this difference won't apply. (CPython 3.12 and later inline comprehensions again, so there the comprehension reads value directly.)
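To check the overhead on your own interpreter, a minimal timing sketch along these lines may help. The Item record, the sample data, and the two function names are hypothetical and only there to make the comparison runnable; timings will vary by Python version and machine.

```python
import timeit
from collections import namedtuple

# Hypothetical record type and data, for illustration only.
Item = namedtuple("Item", ["attribute"])
items = [Item(i % 10) for i in range(1000)]


def with_comprehension(seq, value):
    # The condition is evaluated inline, with no per-element function call.
    return [el for el in seq if el.attribute == value]


def with_filter(seq, value):
    # The lambda is called once per element and reads `value`
    # from the enclosing function's scope.
    return list(filter(lambda el: el.attribute == value, seq))


if __name__ == "__main__":
    for fn in (with_comprehension, with_filter):
        seconds = timeit.timeit(lambda: fn(items, 3), number=200)
        print(f"{fn.__name__}: {seconds:.3f}s for 200 runs")
```

Both functions return the same list, so any gap in the printed numbers comes from how the filtering is done rather than from what is computed.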
The other option to consider is to use a generator instead of a list comprehension:
def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute == value:
            yield el
Then in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.
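As a sketch of how the calling code might then read, assuming a hypothetical Item record with name and attribute fields (neither is from the original question):

```python
from collections import namedtuple

# Hypothetical record type, for illustration only.
Item = namedtuple("Item", ["name", "attribute"])


def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute == value:
            yield el


inventory = [Item("a", 1), Item("b", 2), Item("c", 1)]

# Calling the generator function does no work yet; elements are
# produced only as the result is iterated.
matches = filterbyvalue(inventory, 1)
names = [item.name for item in matches]
print(names)
```

Keep in mind that the generator can be consumed only once; wrap the call in list() if the results are needed more than once.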
When would using the filter function be used instead of a list comprehension?
There's no harm in using either. A similar comment can be made about map.
I tend to use whichever one feels easier to read. In your case I would avoid using the lambda as it is a bit verbose, and instead use the comprehension.
I would use filter or map if I already had an existing function that I could just pass in, which would be more terse than the comprehension.
For example, say I write a program for finding the length of the longest name:
# Using map
longest = max(map(len, names))
# Using generator expression
longest = max(len(name) for name in names)
In the above example I would choose map over the generator expression, but it's entirely personal preference.
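The same reasoning applies to filter when a suitable function already exists. A small sketch, with made-up sample data, showing both pairs side by side:

```python
# Sample data is invented for illustration.
names = ["Ann", "Bartholomew", "Cy"]
tokens = ["12", "abc", "7", "x9"]

# map with an existing function versus the equivalent generator expression.
longest = max(map(len, names))
longest_genexp = max(len(name) for name in names)

# filter with an existing function (the unbound method str.isdigit)
# versus the equivalent list comprehension.
digits = list(filter(str.isdigit, tokens))
digits_listcomp = [t for t in tokens if t.isdigit()]

print(longest, digits)
```

In both pairs no lambda is needed, which is the case where map and filter tend to read most naturally.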
lambda versus list comprehension performance
Your tests are doing very different things. With S being 1M elements and T being 300:
[x for x in S for y in T if x==y]= 54.875
This option does 300M equality comparisons.
filter(lambda x:x in S,T)= 0.391000032425
This option does 300 linear searches through S.
[val for val in S if val in T]= 12.6089999676
This option does 1M linear searches through T.
list(set(S) & set(T))= 0.125
This option does two set constructions and one set intersection.
The differences in performance between these options are much more related to the algorithm each one uses than to any difference between list comprehensions and lambda.
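To make the algorithmic difference concrete, here is a scaled-down sketch of the four options. The sizes and contents of S and T are chosen for illustration and are not the original test data. The last variant is an addition, not one of the options timed above: it keeps the comprehension but swaps the linear search for a set lookup.

```python
# Scaled-down, illustrative inputs (unique elements in each list).
S = list(range(2000))
T = list(range(1990, 2010))

# len(S) * len(T) equality comparisons.
nested = [x for x in S for y in T if x == y]

# len(T) linear searches through S.
via_filter = list(filter(lambda x: x in S, T))

# len(S) linear searches through T.
via_membership = [val for val in S if val in T]

# Two set constructions and one intersection; result order is not guaranteed.
via_sets = list(set(S) & set(T))

# Comprehension with constant-time membership tests; preserves the order of S.
T_set = set(T)
ordered = [val for val in S if val in T_set]

print(sorted(via_sets))
```

With unique elements all five produce the same values; if the inputs contain duplicates, the variants can differ in how many times a value appears.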
Python: list comprehensions vs. lambda
When the list is so small there is no significant difference between the two. If the input list can grow large then there is a worse problem: you're iterating over the whole list, while you could stop at the first element. You could accomplish this with a for loop, but if you want to use a comprehension-like statement, here come generator expressions:
# like list comprehensions but with () instead of []
gen = (b for a, b in foo if a == 'b')
my_element = next(gen)
or simply:
my_element = next(b for a, b in foo if a == 'b')
If you want to learn more about generator expressions, take a look at PEP 289.
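One detail worth knowing: next() raises StopIteration when nothing matches, unless it is given a default as its second argument. A short sketch, with foo assumed to be a list of pairs as in the snippets above:

```python
# Assumed shape of `foo`: a list of (key, value) pairs.
foo = [('a', 1), ('b', 2), ('c', 3), ('b', 4)]

# Stops at the first matching pair; later pairs are never examined.
first_b = next(b for a, b in foo if a == 'b')

# With a default, a missing key yields None instead of raising StopIteration.
# The generator expression needs its own parentheses when it is not
# the only argument.
missing = next((b for a, b in foo if a == 'z'), None)

print(first_b, missing)
```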
Note that even with generators and iterators you have more than one choice.
# Python 3:
my_element = next(filter(lambda x: x[0] == 'b', foo))
# Python 2:
from itertools import ifilter
my_element = next(ifilter(lambda (x, y): x == 'b', foo))
I personally don't like this and don't recommend it because it is much less readable. It also turns out to be slower than my first snippet, although more generally using filter() instead of a generator expression might be faster in some special cases.
In any case, if you need to benchmark your code, I recommend using the timeit module.
List comprehension vs map
map may be microscopically faster in some cases (when you're NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.
An example of the tiny speed advantage of map when using exactly the same function:
$ python -m timeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop
$ python -m timeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop
An example of how performance comparison gets completely reversed when map needs a lambda:
$ python -m timeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop
$ python -m timeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop
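The timings above come from Python 2, where map returns a list. In Python 3 map returns a lazy iterator, so a fair comparison has to wrap it in list(). A rough sketch for rerunning the same four statements from Python 3 follows; the bench helper is a hypothetical name, and the numbers will depend on your interpreter and machine.

```python
import timeit

xs = range(10)

statements = [
    "list(map(hex, xs))",
    "[hex(x) for x in xs]",
    "list(map(lambda x: x + 2, xs))",
    "[x + 2 for x in xs]",
]


def bench(stmt, number=20_000, repeat=5):
    # Hypothetical helper: best total time over `repeat` runs of `number` loops.
    return min(timeit.repeat(stmt, globals={"xs": xs}, number=number, repeat=repeat))


if __name__ == "__main__":
    for stmt in statements:
        print(f"{stmt:32} {bench(stmt):.4f}s")
```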
Is list comprehension implemented via map and lambda function?
No, list comprehensions are not implemented by map and lambda under the hood, not in CPython and not in PyPy3 either.
CPython (3.9.13 here) compiles the list comprehension into a special code object that outputs a list and calls it as a function:
~ $ echo 'x = [a + 1 for a in [1, 2, 3, 4]]' | python3 -m dis
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x107446f50, file "<stdin>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_CONST 2 ((1, 2, 3, 4))
8 GET_ITER
10 CALL_FUNCTION 1
12 STORE_NAME 0 (x)
14 LOAD_CONST 3 (None)
16 RETURN_VALUE
Disassembly of <code object <listcomp> at 0x107446f50, file "<stdin>", line 1>:
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 12 (to 18)
6 STORE_FAST 1 (a)
8 LOAD_FAST 1 (a)
10 LOAD_CONST 0 (1)
12 BINARY_ADD
14 LIST_APPEND 2
16 JUMP_ABSOLUTE 4
>> 18 RETURN_VALUE
Whereas the equivalent list(map(lambda: ...)) thing is just function calls:
~ $ echo 'x = list(map(lambda a: a + 1, [1, 2, 3, 4]))' | python3 -m dis
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (map)
4 LOAD_CONST 0 (<code object <lambda> at 0x102701f50, file "<stdin>", line 1>)
6 LOAD_CONST 1 ('<lambda>')
8 MAKE_FUNCTION 0
10 BUILD_LIST 0
12 LOAD_CONST 2 ((1, 2, 3, 4))
14 LIST_EXTEND 1
16 CALL_FUNCTION 2
18 CALL_FUNCTION 1
20 STORE_NAME 2 (x)
22 LOAD_CONST 3 (None)
24 RETURN_VALUE
Disassembly of <code object <lambda> at 0x102701f50, file "<stdin>", line 1>:
1 0 LOAD_FAST 0 (a)
2 LOAD_CONST 1 (1)
4 BINARY_ADD
6 RETURN_VALUE
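The same check can be done from inside Python with compile() and the dis module. This is only a sketch: the exact bytecode depends on the CPython version, and from 3.12 on comprehensions are inlined, so the output will not match the 3.9 listings above. What stays the same is that the comprehension never looks up map, while the other version does.

```python
import dis

comp_code = compile("x = [a + 1 for a in [1, 2, 3, 4]]", "<demo>", "exec")
map_code = compile("x = list(map(lambda a: a + 1, [1, 2, 3, 4]))", "<demo>", "exec")

# co_names lists the global/builtin names the module-level code refers to.
print("comprehension names:", comp_code.co_names)
print("map/lambda names:   ", map_code.co_names)

# Full disassembly, including any nested code objects.
dis.dis(comp_code)
dis.dis(map_code)
```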
List comprehension instead of lambda in DataFrame.apply()?
What about a list comprehension?
G['year'] = ["'{:02d}".format(x % 100) for x in G.year]
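To see what that expression produces, here is a sketch that uses a plain list in place of the G.year column so it runs without pandas. It assumes the column holds integer years; the sample values are invented.

```python
# Stand-in for the G.year column; assumed to contain integer years.
years = [1999, 2005, 2021]

# x % 100 keeps the last two digits; {:02d} pads single digits with a zero;
# the leading apostrophe is part of the output string.
short_years = ["'{:02d}".format(x % 100) for x in years]

print(short_years)
```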
Filter Lambda Function
There are two issues with your code:
- The list variable name shadows the list() builtin; pick a different name for your original list instead.
- Your lambda function isn't correct. Instead of lambda x: x == k, it should be lambda x: 'k' in x.
data = ["rabbit", "chuck", "Joe", "war", "rock", "docker"]
listfilter = list(filter(lambda x: ('k' in x), data))
# Prints ["chuck", "rock", "docker"]
print(listfilter)
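For comparison, the same filter can be written as a list comprehension, which avoids the lambda altogether. The variable name with_k is just an illustrative choice:

```python
data = ["rabbit", "chuck", "Joe", "war", "rock", "docker"]

# Keep only the words that contain the letter 'k'.
with_k = [word for word in data if 'k' in word]

print(with_k)
```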