How Do Python'S Any and All Functions Work

How do Python's any and all functions work?

You can roughly think of any and all as series of logical or and and operators, respectively.

any

any will return True when at least one of the elements is Truthy. Read about Truth Value Testing.

all

all will return True only when all the elements are Truthy.

Truth table

+-----------------------------------------+---------+---------+
|                                         |   any   |   all   |
+-----------------------------------------+---------+---------+
| All Truthy values                       |  True   |  True   |
+-----------------------------------------+---------+---------+
| All Falsy values                        |  False  |  False  |
+-----------------------------------------+---------+---------+
| One Truthy value (all others are Falsy) |  True   |  False  |
+-----------------------------------------+---------+---------+
| One Falsy value (all others are Truthy) |  True   |  False  |
+-----------------------------------------+---------+---------+
| Empty Iterable                          |  False  |  True   |
+-----------------------------------------+---------+---------+

Note 1: The empty iterable case is explained in the official documentation, like this

any

Return True if any element of the iterable is true. If the iterable is empty, return False

Since none of the elements are true, it returns False in this case.

all

Return True if all elements of the iterable are true (or if the iterable is empty).

Since none of the elements are false, it returns True in this case.

Note 2:

Another important thing to know about any and all is, it will short-circuit the execution, the moment they know the result. The advantage is, entire iterable need not be consumed. For example,

>>> multiples_of_6 = (not (i % 6) for i in range(1, 10))
>>> any(multiples_of_6)
True
>>> list(multiples_of_6)
[False, False, False]

Here, (not (i % 6) for i in range(1, 10)) is a generator expression which returns True if the current number within 1 and 9 is a multiple of 6. any iterates the multiples_of_6 and when it meets 6, it finds a Truthy value, so it immediately returns True, and rest of the multiples_of_6 is not iterated. That is what we see when we print list(multiples_of_6), the result of 7, 8 and 9.

This excellent thing is used very cleverly in this answer.

With this basic understanding, if we look at your code, you do

any(x) and not all(x)

which makes sure that, atleast one of the values is Truthy but not all of them. That is why it is returning [False, False, False]. If you really wanted to check if both the numbers are not the same,

print [x[0] != x[1] for x in zip(*d['Drd2'])]

How does this input work with the Python 'any' function?

If you use any(lst) you see that lst is the iterable, which is a list of some items. If it contained [0, False, '', 0.0, [], {}, None] (which all have boolean values of False) then any(lst) would be False. If lst also contained any of the following [-1, True, "X", 0.00001] (all of which evaluate to True) then any(lst) would be True.

In the code you posted, x > 0 for x in lst, this is a different kind of iterable, called a generator expression. Before generator expressions were added to Python, you would have created a list comprehension, which looks very similar, but with surrounding []'s: [x > 0 for x in lst]. From the lst containing [-1, -2, 10, -4, 20], you would get this comprehended list: [False, False, True, False, True]. This internal value would then get passed to the any function, which would return True, since there is at least one True value.

But with generator expressions, Python no longer has to create that internal list of True(s) and False(s), the values will be generated as the any function iterates through the values generated one at a time by the generator expression. And, since any short-circuits, it will stop iterating as soon as it sees the first True value. This would be especially handy if you created lst using something like lst = range(-1,int(1e9)) (or xrange if you are using Python2.x). Even though this expression will generate over a billion entries, any only has to go as far as the third entry when it gets to 1, which evaluates True for x>0, and so any can return True.

If you had created a list comprehension, Python would first have had to create the billion-element list in memory, and then pass that to any. But by using a generator expression, you can have Python's builtin functions like any and all break out early, as soon as a True or False value is seen.

How exactly does the .any() Python method work?

The any method of numpy arrays returns a boolean value, so when you write:

if popul_num.any() < 0:

popul_num.any() will be either True (=1) or False (=0) so it will never be less than zero. Thus, you will never enter this if-statement.

What any() does is evaluate each element of the array as a boolean and return whether any of them are truthy. For example:

>>> np.array([0.0]).any()
False

>>> np.array([1.0]).any()
True

>>> np.array([0.0, 0.35]).any()
True

As you can see, Python/numpy considers 0 to be falsy and all other numbers to be truthy. So calling any on an array of numbers tells us whether any number in the array is nonzero. But you want to know whether any number is negative, so we have to transfrom the array first. Let's introduce a negative number into your array to demonstrate.

>>> popul_num = np.array([200, 100, 0, -1])
>>> popul_num < 0  # Test is applied to all elements in the array
np.ndarray([False, False, False, True])
>>> (popul_num < 0).any()
True

You asked about any on lists versus arrays. Python's builtin list has no any method:

>>> [].any()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'any'

There is a builtin function (not method since it doesn't belong to a class) called any that serves the same purpose as the numpy .any method. These two expressions are logically equivalent:

any(popul_num < 0)

(popul_num < 0).any()

We would generally expect the second one to be faster since numpy is implemented in C. However only the first one will work with non-numpy types such as list and set.

Using any() and all() to check if a list contains one set of values or another

Generally speaking:

all and any are functions that take some iterable and return True, if

in the case of all(), no values in the iterable are falsy;
in the case of any(), at least one value is truthy.

A value x is falsy iff bool(x) == False.
A value x is truthy iff bool(x) == True.

Any non-booleans in the iterable will be fine — bool(x) will map (or coerce, if you prefer) any x according to these rules: 0, 0.0, None, [], (), [], set(), and other empty collections get mapped to False, anything else to True. The docstring for bool uses the terms 'true'/'false' for 'truthy'/'falsy', and True/False for the concrete boolean values.

In your specific code samples:

You misunderstood a little bit how these functions work. Hence, the following does something completely not what you thought:

if any(foobars) == big_foobar:

...because any(foobars) would first be evaluated to either True or False, and then that boolean value would be compared to big_foobar, which generally always gives you False (unless big_foobar coincidentally happened to be the same boolean value).

Note: the iterable can be a list, but it can also be a generator/generator expression (≈ lazily evaluated/generated list) or any other iterator.

What you want instead is:

if any(x == big_foobar for x in foobars):

which basically first constructs an iterable that yields a sequence of booleans—for each item in foobars, it compares the item to big_foobar and emits the resulting boolean into the resulting sequence:

tmp = (x == big_foobar for x in foobars)

then any walks over all items in tmp and returns True as soon as it finds the first truthy element. It's as if you did the following:

In [1]: foobars = ['big', 'small', 'medium', 'nice', 'ugly']                                        

In [2]: big_foobar = 'big'                                                                          

In [3]: any(['big' == big_foobar, 'small' == big_foobar, 'medium' == big_foobar, 'nice' == big_foobar, 'ugly' == big_foobar])        
Out[3]: True

Note: As DSM pointed out, any(x == y for x in xs) is equivalent to y in xs but the latter is more readable, quicker to write and runs faster.

Some examples:

In [1]: any(x > 5 for x in range(4))
Out[1]: False

In [2]: all(isinstance(x, int) for x in range(10))
Out[2]: True

In [3]: any(x == 'Erik' for x in ['Erik', 'John', 'Jane', 'Jim'])
Out[3]: True

In [4]: all([True, True, True, False, True])
Out[4]: False

See also: http://docs.python.org/2/library/functions.html#all

any() function in Python with a callback

How about:

>>> any(isinstance(e, int) and e > 0 for e in [1,2,'joe'])
True

It also works with all() of course:

>>> all(isinstance(e, int) and e > 0 for e in [1,2,'joe'])
False

Clojure equivalent to Python's any and all functions?

(every? f data) ^[docs] is the same as all(f(x) for x in data).

(some f data) ^[docs] is like any(f(x) for x in data) except that it returns the value of f(x) (which must be truthy), instead of just true.

If you want the exact same behaviour as in Python, you can use the identity function, which will just return its argument (equivalent to (fn [x] x)).

user=> (every? identity [1, true, "non-empty string"])
true
user=> (some identity [1, true "non-empty string"])
1
user=> (some true? [1, true "non-empty string"])
true

Functions as objects in Python: what exactly is stored in memory?

A function object's data is divided into two primary parts. The parts that would be the same for all functions created by the same function definition are stored in the function's code object, while the parts that can change even between functions created from the same function definition are stored in the function object.

The most interesting part of a function is probably its bytecode. This is the core data structure that says what to actually do to execute a function. It's stored as a bytestring in the function's code object, and you can examine it directly:

>>> def fib(i):
...     x, y = 0, 1
...     for _ in range(i):
...         x, y = y, x+y
...     return x
... 
>>> fib.__code__.co_code
b'd\x03\\\x02}\x01}\x02x\x1et\x00|\x00\x83\x01D\x00]\x12}\x03|\x02|\x01|\x02\x17\x00\x02\x00}\x01}\x02q\x1
2W\x00|\x01S\x00'

...but it's not designed to be human-readable.

With enough knowledge of the implementation details of Python bytecode, you could parse that yourself, but describing all that would take way too long. Instead, we'll use the dis module to disassemble the bytecode for us:

>>> import dis
>>> dis.dis(fib)
  2           0 LOAD_CONST               3 ((0, 1))
              2 UNPACK_SEQUENCE          2
              4 STORE_FAST               1 (x)
              6 STORE_FAST               2 (y)

  3           8 SETUP_LOOP              30 (to 40)
             10 LOAD_GLOBAL              0 (range)
             12 LOAD_FAST                0 (i)
             14 CALL_FUNCTION            1
             16 GET_ITER
        >>   18 FOR_ITER                18 (to 38)
             20 STORE_FAST               3 (_)
  4          22 LOAD_FAST                2 (y)
             24 LOAD_FAST                1 (x)
             26 LOAD_FAST                2 (y)
             28 BINARY_ADD
             30 ROT_TWO
             32 STORE_FAST               1 (x)
             34 STORE_FAST               2 (y)
             36 JUMP_ABSOLUTE           18
        >>   38 POP_BLOCK
  5     >>   40 LOAD_FAST                1 (x)
             42 RETURN_VALUE

There are a number of columns in the output here, but we're mostly interested in the one with the ALL_CAPS and the columns to the right of that.

The ALL_CAPS column shows the function's bytecode instructions. For example, LOAD_CONST loads a constant value, and BINARY_ADD is the instruction to add two objects with +. The next column, with the numbers, is for bytecode arguments. For example, LOAD_CONST 3 says to load the constant at index 3 in the code object's constants. These are always integers, and they're packed into the bytecode string along with the bytecode instructions. The last column mostly provides human-readable explanations of the bytecode arguments, for example, saying that the 3 in LOAD_CONST 3 corresponds to the constant (0, 1), or that the 1 in STORE_FAST 1 corresponds to local variable x. The information in this column doesn't actually come from the bytecode string; it's resolved by examining other parts of the code object.

The rest of a function object's data is primarily stuff needed to resolve bytecode arguments, like the function's closure or its global variable dict, and stuff that just exists because it's handy for introspection, like the function's __name__.

If we take a look at the Python 3.6 function object struct definition at C level:

typedef struct {
    PyObject_HEAD
    PyObject *func_code;    /* A code object, the __code__ attribute */
    PyObject *func_globals; /* A dictionary (other mappings won't do) */
    PyObject *func_defaults;    /* NULL or a tuple */
    PyObject *func_kwdefaults;  /* NULL or a dict */
    PyObject *func_closure; /* NULL or a tuple of cell objects */
    PyObject *func_doc;     /* The __doc__ attribute, can be anything */
    PyObject *func_name;    /* The __name__ attribute, a string object */
    PyObject *func_dict;    /* The __dict__ attribute, a dict or NULL */
    PyObject *func_weakreflist; /* List of weak references */
    PyObject *func_module;  /* The __module__ attribute, can be anything */
    PyObject *func_annotations; /* Annotations, a dict or NULL */
    PyObject *func_qualname;    /* The qualified name */

    /* Invariant:
     *     func_closure contains the bindings for func_code->co_freevars, so
     *     PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
     *     (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
     */
} PyFunctionObject;

we can see that there's the code object, and then

the global variable dict,
the default argument values,
the keyword-only argument default values,
the function's closure cells,
the docstring,
the name,
the __dict__,
the list of weak references to the function,
the __module__,
the annotations, and
the __qualname__, the fully qualified name

Inside the PyObject_HEAD macro, there's also the type pointer and the refcount (and some other metadata in a debug build). The GC also places some GC metadata right before each PyFunctionObject struct in memory.

We didn't have to go straight to C to examine most of that - we could have looked at the dir and filtered out non-instance attributes, since most of that data is available at Python level - but the struct definition provides a nice, commented, uncluttered list.

You can examine the code object struct definition too, but the contents aren't as clear if you're not already familiar with code objects, so I'm not going to embed it in the post. I'll just explain code objects.

The core component of a code object is a bytestring of Python bytecode instructions and arguments. We examined one of those earlier. In addition, the code object contains things like a tuple of the constants the function refers to, and a lot of other internal metadata required to figure out how to actually execute each instruction. Not all the metadata - some of it comes from the function object - but a lot of it. Some of it, like that tuple of constants, is fairly easily understandable, and some of it, like co_flags (a bunch of internal flags) or co_stacksize (the size of the stack used for temporary values) is more esoteric.

Opposite of any() function

There is also the all function which does the opposite of what you want, it returns True if all are True and False if any are False. Therefore you can just do:

not all(l)

Reason for all and any result on empty lists

How about some analogies...

You have a sock drawer, but it is currently empty. Does it contain any black sock? No - you don't have any socks at all so you certainly don't have a black one. Clearly any([]) must return false - if it returned true this would be counter-intuitive.

The case for all([]) is slightly more difficult. See the Wikipedia article on vacuous truth. Another analogy: If there are no people in a room then everyone in that room can speak French.

Mathematically all([]) can be written:

where the set A is empty.

There is considerable debate about whether vacuous statements should be considered true or not, but from a logical viewpoint it makes the most sense:

The main argument that all vacuously true statements are true is as follows: As explained in the article on logical conditionals, the axioms of propositional logic entail that if P is false, then P => Q is true. That is, if we accept those axioms, we must accept that vacuously true statements are indeed true.

Also from the article:

There seems to be no direct reason to pick true; it’s just that things blow up in our face if we don’t.

Defining a "vacuously true" statement to return false in Python would violate the principle of least astonishment.

What does -> mean in Python function definitions?

It's a function annotation.

In more detail, Python 2.x has docstrings, which allow you to attach a metadata string to various types of object. This is amazingly handy, so Python 3 extends the feature by allowing you to attach metadata to functions describing their parameters and return values.

There's no preconceived use case, but the PEP suggests several. One very handy one is to allow you to annotate parameters with their expected types; it would then be easy to write a decorator that verifies the annotations or coerces the arguments to the right type. Another is to allow parameter-specific documentation instead of encoding it into the docstring.

How Do Python'S Any and All Functions Work