NumPy: function for simultaneous max() and min()
Is there a function in the numpy API that finds both max and min with only a single pass through the data?
No. At the time of this writing, there is no such function. (And yes, if there were such a function, its performance would be significantly better than calling numpy.amin() and numpy.amax() successively on a large array.)
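If the two full passes are a concern, one possible workaround is to process the array in blocks and call min() and max() on each block, so that a block is likely still in cache when the second call reads it. This is only a sketch, not a NumPy function: the name minmax_chunked and the chunk parameter are made up for illustration, and whether it actually helps is something to measure on your own data.

```python
import numpy as np

def minmax_chunked(a, chunk=65536):
    """Return (min, max) of `a`, scanning it block by block.

    Hypothetical helper, not part of NumPy. `chunk` is the number of
    elements per block and is an assumed tuning knob.
    """
    flat = np.ravel(a)  # a view for contiguous input, a copy otherwise
    if flat.size == 0:
        raise ValueError("minmax_chunked() arg is an empty array")
    lo = hi = flat[0]
    for start in range(0, flat.size, chunk):
        block = flat[start:start + chunk]
        # Two calls per block, but the block is small enough to stay cached.
        lo = min(lo, block.min())
        hi = max(hi, block.max())
    return lo, hi

values = np.array([3, 1, 4, 1, 5, 9, 2, 6])
lo, hi = minmax_chunked(values, chunk=3)
print(lo, hi)
```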
Python min/max of numpy array column within a dictionary
You can concatenate your individual arrays into a single NumPy array and then just use min and max along the desired axis:
import numpy as np

total = {}
total['test1'] = np.array([[1,1.5,2],[14,20,8],[5,9,2]])
total['book'] = np.array([[4,8,12],[44,2,81],[3,8,3]])
total['panda'] = np.array([[1,3,8],[104,4,51]])
# Stack all rows from every dictionary entry into one 2-D array
stacked = np.concatenate(list(total.values()))
stacked.min(axis=0)
# array([1. , 1.5, 2. ])
stacked.max(axis=0)
# array([104., 20., 81.])
Is there a numpy max minus min function?
Indeed there is such a function -- it's called numpy.ptp() for "peak to peak".
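A short illustration of numpy.ptp(), with and without an axis argument. The array values are made up for the example.

```python
import numpy as np

a = np.array([[1, 5, 2],
              [8, 3, 7]])

# Range over the whole array: max - min = 8 - 1
total_range = np.ptp(a)

# Range of each column: [8 - 1, 5 - 3, 7 - 2]
col_range = np.ptp(a, axis=0)

print(total_range)  # 7
print(col_range)    # [7 2 5]
```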
Python : numpy.arange equivalent function with min, max, and 'number of bins' instead of 'steps'?
Just switch to np.linspace, which takes the start, the stop, and the number of points instead of a step :)
np.linspace(0, 1, 1000)
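One detail worth keeping in mind: the third argument of np.linspace is the number of points, not the number of intervals, so n bins need n + 1 edges. A small sketch, with n_bins as an illustrative name:

```python
import numpy as np

n_bins = 4

# n_bins intervals between 0 and 1 require n_bins + 1 edge values
edges = np.linspace(0.0, 1.0, n_bins + 1)
print(edges)  # [0.   0.25 0.5  0.75 1.  ]

# Every bin has the same width
widths = np.diff(edges)
print(widths)
```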
Local min and max difference in Numpy python
If I understood the question correctly:
import numpy as np
a = np.array([0,2,5,44,-12,3,-5])
index_min = a.argmin()  # index of the first occurrence of the minimum
_min = a[index_min]
_max = a[:index_min].max()  # largest value before the minimum (needs index_min > 0)
print(_min - _max)  # -56
Replace max and min, second max and second min and so on in numpy
Assuming you want to swap the unique values, you can use np.unique to get both the sorted unique values and, for each element, the index of its value in that sorted array. Then you can reverse the array of unique values, which swaps the min and max, the second min and second max, and so on. The indices then refer to the swapped values, so you can extract them with indirect indexing. Here is the resulting fully vectorized code:
arr = np.array([1, 2, 1, 3, 2, 4, 1, 4])
uniqueItems, ids = np.unique(arr, return_inverse=True)
out = uniqueItems[::-1][ids]
# out: [4, 3, 4, 2, 3, 1, 4, 1]
Is this function to simultaneously retrieve min and max values any faster than using min and max separately?
The expensive part of finding the minimum or maximum value in a list is not the comparison. Comparing values is pretty fast and won’t be a problem here. Instead, what impacts the run time is the loop.
When you are using min() or max(), then each of those will have to iterate over the iterable once. They do that separately, so when you need both the minimum and the maximum value, by using the built-in functions, you are iterating twice.
Your function just iterates once over it, so its theoretical run time is shorter. Now as chepter mentioned in the comments, min and max are implemented in native code, so they are almost certainly faster than an equivalent loop written in Python yourself.
Now, it depends a lot on your iterable whether the two native loops will be faster than your Python function. For longer lists, where iterating it is already expensive, iterating it once will definitely be better, but for shorter ones, you probably get better results with the native code. I can’t tell where the exact threshold is, but you can easily test out for your actual data what’s faster. In most cases though, it rarely matters as a min/max won’t be the bottleneck of your application, so you just shouldn’t worry about it until it becomes a problem.
Btw. your implementation has a few problems right now, which you should fix if you want to use it:
- It requires iterable to be a sequence, and not an iterable (as you use indexes on it).
- You also require it to have at least one item, which technically isn’t required either. While you do check for not iterable, that won’t necessarily tell you something about the length of the sequence/iterable. Custom types can easily provide their own boolean value and/or sequence behavior.
- Finally, you initialize your _min and _max with the keyed values of the iterable item, but later you (correctly) just assign the original item from the iterable.
So I would suggest using iterators instead and fixing that key issue. You can also store the key results to save some computation for more complex key functions:
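def min_max(iterable, key):
    it = iter(iterable)
    try:
        # Works for any iterable, including one-shot iterators
        min_ = max_ = next(it)
    except StopIteration:
        # Empty input: there is no minimum or maximum
        return None, None
    # Cache the key results so key() runs once per item
    minv = maxv = key(min_)
    for i in it:
        k = key(i)
        if k > maxv:
            max_, maxv = i, k
        elif k < minv:
            min_, minv = i, k
    return min_, max_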
I did some testing on this, and it turns out that, without a custom key function, using the built-in max/min is kind of impossible to beat. Even for very large lists, the pure C implementation is just way too fast. However, as soon as you add in a key function (which is written in Python code), the situation is completely reversed. With a key function, you get pretty much the same timing result for a single min or max call as for the full function doing both. So in that case, using the solution written in Python is a lot faster than calling both built-ins.
So this led to the idea that, maybe, the implementation in Python wasn’t the actual problem, but instead the key function that is used. And indeed, the actual key function is what makes the Python implementation expensive. And it makes a lot of sense too. Even with an identity lambda, you still have the overhead of function calls: len(iterable) many function calls (with my optimized variant above). And function calls are quite expensive.
In my tests, with support for the key function taken out, the expected results appeared: iterating just once is faster than iterating twice. But for iterables that are not very large, the difference is really small. So unless iterating the iterable is very expensive (although you could then use itertools.tee and still iterate twice) or you want to loop over it anyway (in which case you would combine that with the min/max check), using the built-in max() and min() functions separately will be faster and also a lot easier to use. And they both come with the internal optimization that they skip key functions if you don’t specify one.
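To check this on your own data, a rough timing harness along these lines may help. It is only a sketch: one_pass and two_pass are illustrative names, the test data is arbitrary, and the numbers will vary with the machine, the Python version, and the size of the input.

```python
import timeit

def one_pass(values):
    """Pure-Python single pass without a key function (non-empty list assumed)."""
    lo = hi = values[0]
    for v in values:
        if v > hi:
            hi = v
        elif v < lo:
            lo = v
    return lo, hi

def two_pass(values):
    """Two native passes using the built-ins."""
    return min(values), max(values)

# Arbitrary, scrambled-looking integers as sample input
data = [(i * 7919) % 100003 for i in range(100_000)]

# Both approaches must agree before timing them
assert one_pass(data) == two_pass(data)

t_one = timeit.timeit(lambda: one_pass(data), number=5)
t_two = timeit.timeit(lambda: two_pass(data), number=5)
print(f"one pass: {t_one:.4f}s, two passes: {t_two:.4f}s")
```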
Finally though, how could you add that key function optimization into your code? Well, unfortunately, there’s only one way to do this and that involves duplicating code. You essentially have to check whether or not a key function is specified and skip the function call when it wasn’t. So, something like this:
def min_max(iterable, key=None):
    if key is not None:
        ...  # do it with a key function
    else:
        ...  # do it without
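Filled in, the two branches could look roughly like this. It is a sketch under the assumptions made earlier in this answer (an empty input yields (None, None), and key results are cached), not the only way to write it.

```python
def min_max(iterable, key=None):
    """Return (min_item, max_item) in a single pass over any iterable."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        return None, None  # assumed convention for empty input
    min_ = max_ = first

    if key is None:
        # Fast path: compare the items directly, no function call per item
        for item in it:
            if item > max_:
                max_ = item
            elif item < min_:
                min_ = item
    else:
        # Keyed path: call key() once per item and remember the results
        minv = maxv = key(first)
        for item in it:
            k = key(item)
            if k > maxv:
                max_, maxv = item, k
            elif k < minv:
                min_, minv = item, k
    return min_, max_

print(min_max([3, 1, 4, 1, 5]))
print(min_max(["aaa", "b", "cc"], key=len))
```

Because the function only relies on iter() and next(), it also accepts generators and other one-shot iterators.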
min, max and mean over large NumPy arrays in Python
All arrays generated by basic slicing are always views of the original array.
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
So, yes, just use slices.
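A small sketch of what that means in practice: the slice below shares memory with the original array, so computing min, max and mean on it does not copy the data first. The array contents and the names big and window are illustrative only.

```python
import numpy as np

big = np.arange(12, dtype=float).reshape(3, 4)

# Basic slicing returns a view, not a copy
window = big[1:, :2]
print(np.shares_memory(big, window))  # True

# The reductions read directly from the original buffer
w_min, w_max, w_mean = window.min(), window.max(), window.mean()
print(w_min, w_max, w_mean)  # 4.0 9.0 6.5
```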