NumPy: Function for Simultaneous max() and min()

Is there a function in the numpy API that finds both max and min with only a single pass through the data?

No. At the time of this writing, there is no such function. (And yes, if there were such a function, its performance would be significantly better than calling numpy.amin() and numpy.amax() successively on a large array.)
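Until such a function exists, the usual approach is simply to call both reductions. A minimal sketch follows; the helper name min_max is hypothetical and not part of NumPy, and it still makes two passes over the data:

```python
import numpy as np

def min_max(a):
    """Return (min, max) of an array-like; a hypothetical helper, not a NumPy API."""
    a = np.asarray(a)
    # Two separate reductions, so the data is traversed twice.
    return a.min(), a.max()

lo, hi = min_max([3, -1, 7, 4])
print(lo, hi)
```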

Python min/max of numpy array column within a dictionary

You can concatenate the individual arrays into a single NumPy array and then use min and max along the desired axis:

import numpy as np

total = {}
total['test1'] = np.array([[1,1.5,2],[14,20,8],[5,9,2]])
total['book'] = np.array([[4,8,12],[44,2,81],[3,8,3]])
total['panda'] = np.array([[1,3,8],[104,4,51]])

stacked = np.concatenate(list(total.values()))

stacked.min(axis=0)
# array([1. , 1.5, 2. ])
stacked.max(axis=0)
# array([104., 20., 81.])

Is there a numpy max minus min function?

Indeed there is such a function -- it's called numpy.ptp() for "peak to peak".
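A short sketch of how np.ptp() can be used, both over a whole array and along one axis; the sample array is made up for illustration:

```python
import numpy as np

a = np.array([[1, 5, 2],
              [8, 3, 7]])

total_range = np.ptp(a)           # max minus min over the whole array
column_range = np.ptp(a, axis=0)  # max minus min of each column

# The same per-column result computed by hand, for comparison.
by_hand = a.max(axis=0) - a.min(axis=0)

print(total_range, column_range)
```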

Python : numpy.arange equivalent function with min, max, and 'number of bins' instead of 'steps'?

Use np.linspace instead; its third argument is the number of samples rather than a step size:

np.linspace(0, 1, 1000)
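One detail worth keeping in mind when the goal is a number of bins: n bins are delimited by n + 1 edges. A small sketch, with variable names chosen only for illustration:

```python
import numpy as np

lo, hi, n_bins = 0.0, 1.0, 4

# n_bins bins are delimited by n_bins + 1 edges.
edges = np.linspace(lo, hi, n_bins + 1)

# Passing n_bins directly yields n_bins points, hence only n_bins - 1 intervals.
points = np.linspace(lo, hi, n_bins)

print(edges)
print(points)
```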

Local min and max difference in Numpy python

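If I understood the question correctly:

import numpy as np
a = np.array([0, 2, 5, 44, -12, 3, -5])
index_min = np.argmin(a)  # index of the first occurrence of the minimum
_min = a[index_min]
# largest value before the minimum; raises ValueError if the minimum is the first element
_max = a[:index_min].max()
print(_min - _max)  # -56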

Replace max and min, second max and second min and so on in numpy

Assuming you want to swap the unique values, you can use np.unique to get both the sorted unique values and, for each element, the index of its value in that sorted array. Then you can reverse the array of unique values to swap the min and max values (and the second min and second max, and so on). After that, the indices reference the swapped values, so you can extract them with indirect indexing. Here is the resulting fully vectorized code:

import numpy as np

arr = np.array([1, 2, 1, 3, 2, 4, 1, 4])
uniqueItems, ids = np.unique(arr, return_inverse=True)
out = uniqueItems[::-1][ids]
# out: [4, 3, 4, 2, 3, 1, 4, 1]

Is this function to simultaneously retrieve min and max values any faster than using min and max separately?

The expensive part of finding the minimum or maximum value in a list is not the comparison. Comparing values is pretty fast and won’t be a problem here. Instead, what impacts the run time is the loop.

When you are using min() or max(), then each of those will have to iterate over the iterable once. They do that separately, so when you need both the minimum and the maximum value, by using the built-in functions, you are iterating twice.

Your function iterates over it just once, so its theoretical run time is shorter. But as chepter mentioned in the comments, min and max are implemented in native code, so they are most certainly faster than an equivalent loop written in Python.

Whether the two native loops will be faster than your Python function depends a lot on your iterable. For longer lists, where iterating is already expensive, iterating once will be better, but for shorter ones you probably get better results with the native code. I can’t tell where the exact threshold is, but you can easily test what’s faster for your actual data. In most cases, though, it won’t matter, since min/max is rarely the bottleneck of an application, so you shouldn’t worry about it until it becomes a problem.
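One way to run such a test is with the standard timeit module. The sketch below is only illustrative: min_max_once is a hypothetical stand-in for the single-pass function, the data is random, and the timings will differ from machine to machine.

```python
import random
import timeit

def min_max_once(values):
    """Single-pass min/max in pure Python (hypothetical helper for the benchmark)."""
    it = iter(values)
    lo = hi = next(it)  # assumes at least one item
    for v in it:
        if v > hi:
            hi = v
        elif v < lo:
            lo = v
    return lo, hi

data = [random.random() for _ in range(100_000)]

# Both approaches must agree before their timings are worth comparing.
assert min_max_once(data) == (min(data), max(data))

t_builtin = timeit.timeit(lambda: (min(data), max(data)), number=10)
t_single = timeit.timeit(lambda: min_max_once(data), number=10)
print(f"min() + max(): {t_builtin:.4f} s")
print(f"single pass:   {t_single:.4f} s")
```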


By the way, your implementation currently has a few problems, which you should fix if you want to use it:

  • It requires iterable to be a sequence, and not an iterable (as you use indexes on it)
  • You also require it to have at least one item—which technically isn’t required either. While you do check for not iterable, that won’t necessarily tell you something about the length of the sequence/iterable. Custom types can easily provide their own boolean value and/or sequence behavior.
  • Finally, you initialize your _min and _max with the keyed values of the iterable item, but later you (correctly) just assign the original item from the iterable.

So I would suggest using an iterator instead and fixing the key issue. You can also store the key results to save some computation for more complex key functions:

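# The signature is assumed; the identity default mirrors the original key handling.
def min_max(iterable, key=lambda x: x):
    it = iter(iterable)
    try:
        min_ = max_ = next(it)
        minv = maxv = key(min_)
    except StopIteration:
        # empty iterable: there is nothing to compare
        return None, None

    for i in it:
        k = key(i)  # compute the key only once per item
        if k > maxv:
            max_, maxv = i, k
        elif k < minv:
            min_, minv = i, k
    return min_, max_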

I did some testing on this, and it turns out that, without a custom key function, the built-in max/min are nearly impossible to beat. Even for very large lists, the pure C implementation is just too fast. However, as soon as you add a key function (which is written in Python), the situation is completely reversed. With a key function, a single min or max call takes pretty much the same time as the full function doing both. So when you need both values, the solution written in Python is a lot faster.

This led to the idea that maybe the implementation in Python wasn’t the actual problem, but rather the key function. And indeed, the key function is what makes the Python implementation expensive, which makes a lot of sense. Even with an identity lambda, you still have the overhead of function calls: len(iterable) of them (with my optimized variant above). And function calls are quite expensive.

In my tests, with support for the key function taken out, the expected result appeared: iterating just once is faster than iterating twice. But for iterables that are not very large, the difference is really small. So unless iterating the iterable is very expensive (although you could then use itertools.tee and still iterate twice) or you want to loop over it anyway (in which case you would combine that with the min/max check), using the built-in max() and min() functions separately will be faster and also a lot easier to use. Both also come with the internal optimization that they skip the key function call if you don’t specify one.
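The tee approach mentioned above can be sketched as follows. The generator squares is a made-up stand-in for an iterable that can only be consumed once. Note that tee has to buffer every item the first consumer has seen, so this trades memory for not having to produce the items a second time.

```python
from itertools import tee

def squares(n):
    """Stand-in for an iterable that can only be consumed once."""
    for i in range(n):
        yield (i - 3) ** 2

it_min, it_max = tee(squares(8))

# min() exhausts it_min; tee buffers the items so it_max can replay them.
smallest = min(it_min)
largest = max(it_max)
print(smallest, largest)
```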

Finally, how could you add that key function optimization to your code? Unfortunately, the only way to do this involves duplicating code. You essentially have to check whether a key function was specified and skip the function call when it wasn’t. So, something like this:

def min_max(iterable, key=None):
    if key is not None:
        ...  # do it with a key function
    else:
        ...  # do it without
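Filled in, the skeleton might look like the sketch below. This is one possible way to combine the iterator-based loop with the key check, not necessarily what the original function looked like; returning (None, None) for an empty iterable follows the earlier snippet.

```python
def min_max(iterable, key=None):
    """Return (min_item, max_item) in one pass, or (None, None) if empty."""
    it = iter(iterable)
    try:
        min_ = max_ = next(it)
    except StopIteration:
        return None, None

    if key is None:
        # Fast path: compare the items directly, with no function calls.
        for i in it:
            if i > max_:
                max_ = i
            elif i < min_:
                min_ = i
    else:
        # Cache the keys of the current extremes so key() runs once per item.
        minv = maxv = key(min_)
        for i in it:
            k = key(i)
            if k > maxv:
                max_, maxv = i, k
            elif k < minv:
                min_, minv = i, k
    return min_, max_

print(min_max([3, 1, 2]))
print(min_max(["pear", "fig", "banana"], key=len))
```

Because it only relies on iter() and next(), the function also accepts generators and other one-shot iterables.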

min, max and mean over large NumPy arrays in Python

According to the NumPy indexing documentation, all arrays generated by basic slicing are always views of the original array:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

So, yes, just use slices.
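A small sketch showing that a basic slice shares memory with the original array, so reductions over it do not copy the data first. The array size and slice bounds are arbitrary choices for illustration:

```python
import numpy as np

big = np.arange(1_000_000, dtype=np.float64)

window = big[1000:2000]  # basic slice: a view, no data is copied

print(np.shares_memory(big, window))
print(window.min(), window.max(), window.mean())
```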


