Numpy Max VS Amax VS Maximum

numpy max vs amax vs maximum

np.max and np.amax are the same function; one is simply an alias for the other. This function works on a single input array and finds the value of the maximum element in that entire array (returning a scalar). Alternatively, it takes an axis argument and finds the maximum value along that axis of the input array (returning a new array).

>>> a = np.array([[0, 1, 6],
                  [2, 4, 1]])
>>> np.max(a)
6
>>> np.max(a, axis=0) # max of each column
array([2, 4, 6])

The default behaviour of np.maximum is to take two compatible arrays and compute their element-wise maximum. Here, 'compatible' means that one array can be broadcast to the other. For example:

>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])
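
To make the broadcasting point concrete, here is a minimal sketch. The arrays `a` and `row` are illustrative values chosen for this example, not taken from the original question.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])
row = np.array([1, 3, 2])

# The 1-D array is broadcast against every row of the 2-D array.
elementwise = np.maximum(a, row)
print(elementwise)
# [[1 3 6]
#  [2 4 2]]

# A scalar broadcasts too: every value below 3 is raised to 3.
floored = np.maximum(a, 3)
print(floored)
# [[3 3 6]
#  [3 4 3]]
```

The second call is a common way to apply a lower bound to an array without writing a loop.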

But np.maximum is also a universal function which means that it has other features and methods which come in useful when working with multidimensional arrays. For example you can compute the cumulative maximum over an array (or a particular axis of the array):

>>> d = np.array([2, 0, 3, -4, -2, 7, 9])
>>> np.maximum.accumulate(d)
array([2, 2, 3, 3, 3, 7, 9])

This is not possible with np.max.
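
As a rough illustration of the "particular axis" case, the sketch below runs the cumulative maximum over a small 2-D array. The array `m` is made up for this example.

```python
import numpy as np

m = np.array([[2, 0, 3],
              [1, 5, 2],
              [4, 1, 9]])

# Running maximum down each column (axis=0).
down = np.maximum.accumulate(m, axis=0)
print(down)
# [[2 0 3]
#  [2 5 3]
#  [4 5 9]]

# Running maximum along each row (axis=1).
across = np.maximum.accumulate(m, axis=1)
print(across)
# [[2 2 3]
#  [1 5 5]
#  [4 4 9]]
```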

You can make np.maximum imitate np.max to a certain extent when using np.maximum.reduce:

>>> np.maximum.reduce(d)
9
>>> np.max(d)
9

Basic testing suggests the two approaches are comparable in performance; and they should be, as np.max() actually calls np.maximum.reduce to do the computation.
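
One place where the imitation is only partial is the default axis. The sketch below, using the 2-D array from the first example, suggests that np.maximum.reduce reduces along axis 0 unless told otherwise, while np.max reduces over the whole array.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])

whole = np.max(a)                             # axis=None by default
by_column = np.maximum.reduce(a)              # axis=0 by default
reduced_all = np.maximum.reduce(a, axis=None) # matches np.max(a)

print(whole)        # 6
print(by_column)    # [2 4 6]
print(reduced_all)  # 6
```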

numpy.max or max ? Which one is faster?

From my timings, if you already have a NumPy array a, you should use a.max (the source shows that np.max delegates to a.max when it is available). If you have a built-in list, most of the time is spent converting it into an np.ndarray, which is why the built-in max comes out ahead in your timings.

In essence: for an np.ndarray use a.max; for a list, when you don't need the rest of the np.ndarray machinery, use the standard max.
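
A minimal way to reproduce this kind of comparison is sketched below with the standard timeit module. The data size and repeat count are arbitrary choices for illustration, and the absolute numbers will vary by machine and NumPy version, so no expected timings are shown.

```python
import timeit
import numpy as np

values = list(range(100_000))  # illustrative test data
arr = np.array(values)

# Time each combination of container and max function.
timings = {
    "max(list)": timeit.timeit(lambda: max(values), number=20),
    "np.max(list)": timeit.timeit(lambda: np.max(values), number=20),
    "arr.max()": timeit.timeit(lambda: arr.max(), number=20),
    "max(arr)": timeit.timeit(lambda: max(arr), number=20),
}

for label, seconds in timings.items():
    print(f"{label:>14}: {seconds:.4f} s")
```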

NumPy: function for simultaneous max() and min()

Is there a function in the numpy API that finds both max and min with only a single pass through the data?

No. At the time of this writing, there is no such function. (And yes, if there were such a function, its performance would be significantly better than calling numpy.amin() and numpy.amax() successively on a large array.)
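
In the absence of a single-pass function, the usual fallback is simply two reductions. The helper name `min_max` below is hypothetical (it is not part of NumPy), and it still makes two passes over the data.

```python
import numpy as np

def min_max(a):
    """Return (min, max) of an array using two separate reductions."""
    a = np.asarray(a)
    return a.min(), a.max()

data = np.array([3, -1, 7, 0, 5])
lo, hi = min_max(data)
print(lo, hi)  # -1 7
```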

What does numpy.max function do?

You can't replicate the behavior of np.max very easily in pure Python, simply because multi-dimensional arrays aren't standard in Python. If the A and B in your code are such arrays, it would be best to keep the NumPy function.

For flat (one-dimensional) arrays, the Python max and np.max do the same thing and could be exchanged:

>>> a = np.arange(27)
>>> max(a)
26
>>> np.max(a)
26

For arrays with more than one dimension, max won't work:

>>> a = a.reshape(3, 3, 3)
>>> max(a)
ValueError: The truth value of an array with more than one element is ambiguous [...]
>>> np.max(a)
26

By default, np.max flattens the 3D array and returns the maximum. (You can also find the maximum along particular axes, and so on.) The Python max cannot do this.
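
For instance, with the same 3x3x3 array, the axis argument can be a single axis or a tuple of axes. This is a small sketch of both forms:

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)

flat_max = np.max(a)                # over the whole array
along_first = np.max(a, axis=0)     # shape (3, 3): max across the first axis
per_block = np.max(a, axis=(1, 2))  # shape (3,): max of each 3x3 block

print(flat_max)
# 26
print(along_first)
# [[18 19 20]
#  [21 22 23]
#  [24 25 26]]
print(per_block)
# [ 8 17 26]
```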

To replace np.max, you'd need to write nested loops over the axes of the array; effectively trying to find the maximum in a list of nested lists. This is certainly possible, but is likely to be very slow:

>>> max(max(y) for x in a for y in x)
26

How can I index each occurrence of a max value along a given axis of a numpy array?

Try with np.where:

np.where(Q == Q.max(axis=1)[:,None])

Output:

(array([0, 0, 1, 1, 2]), array([1, 2, 0, 2, 1]))

Not quite the output you want, but contains equivalent information.

You can also use np.argwhere, which gives you the same indices zipped into (row, column) pairs:

np.argwhere(Q==Q.max(axis=1)[:,None])

Output:

array([[0, 1],
       [0, 2],
       [1, 0],
       [1, 2],
       [2, 1]])
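
The original Q is not shown, so the sketch below uses a made-up 3x3 array that happens to produce the same indices. It also spells out the intermediate steps: take the per-row maximum, add a trailing axis so it broadcasts against each row, then look up where the comparison is True.

```python
import numpy as np

# Hypothetical input; the array from the question is not available.
Q = np.array([[1, 5, 5],
              [7, 2, 7],
              [0, 9, 3]])

row_max = Q.max(axis=1)        # [5 7 9]
mask = Q == row_max[:, None]   # compare each row with its own maximum

rows, cols = np.where(mask)
print(rows, cols)              # [0 0 1 1 2] [1 2 0 2 1]

pairs = np.argwhere(mask)      # same indices as (row, column) pairs
print(pairs.tolist())          # [[0, 1], [0, 2], [1, 0], [1, 2], [2, 1]]
```

Passing keepdims=True to Q.max(axis=1) should give the same column-shaped result as the [:, None] indexing.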

Python: numpy.amax vs Python's max : 'int' object is not iterable

At some point your code is trying to apply max to an integer, as opposed to a list or other iterable.

To illustrate:

In [354]: max(123)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-354-8de2de84b04d> in <module>()
----> 1 max(123)

TypeError: 'int' object is not iterable
In [355]: np.max(123)
Out[355]: 123
In [356]: np.max(np.array(123))
Out[356]: 123

np.max works because it first turns the argument into an array.

 tempAvgMetric = [[] for dmx in range(numBins)]
 ....
       tempAvgMetric[xxx].extend(binnedMetric[expDir][idx][nameNum][xxx])
 ....
        tempAvgMetric[idx] = 0

With this code, some tempAvgMetric elements will be lists (they all start as []), but for the idx case they are the integer 0.

Changing that assignment to a one-element list avoids the error:

tempAvgMetric[idx] = [0]

In [357]: max([0])
Out[357]: 0

Be aware that max([]) and np.max([]) both produce an error.
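
If an empty input is possible, both functions accept a fallback value. A minimal sketch, assuming 0 is an acceptable result for an empty list in your case:

```python
import numpy as np

empty = []

# Built-in max: `default` is returned when the iterable is empty.
py_result = max(empty, default=0)

# np.max: `initial` is the starting value of the reduction,
# so it is what comes back for an empty array.
np_result = np.max(empty, initial=0)

print(py_result)  # 0
print(np_result)  # 0.0
```

Note that the two keywords behave differently on non-empty input: `default` is ignored, whereas `initial` takes part in the comparison and so acts as a lower bound on the result.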

The test

if tempAvgMetric

doesn't make much sense. When would it be False? Only if the list were empty, i.e. if numBins == 0.

Why does the numpy max function (np.max) return the wrong output?

There are two problems here:

1. It looks like the column you are trying to find the maximum of has the data type object. That is not recommended if the column should contain numerical data, since it can cause unpredictable behaviour, and not only in this particular case. Check the data types of your dataframe (df.dtypes) and change them so that they match the data you expect (in this case, df[column_name].astype(np.float64)). This is also the reason np.nanmax does not work properly.
2. You don't want to use np.max on arrays containing NaNs: on a float array it returns nan as soon as any element is NaN.

Solution

1. If you are sure the column should keep the object data type:
1.1. You can use the max method of Series, which should cast the data to float automatically.
df.iloc[:, 3].max()
1.2. You can cast the data to the proper type just for the nanmax call.
np.nanmax(df.values[:, 3].astype(np.float64))
1.3. You can drop all NaNs from the column and then find the max [not recommended]:
```
np.max(df[column_name].dropna().values)
```

2. If your data is really float64 and shouldn't have the object data type [recommended]:
```
df[column_name] = df[column_name].astype(np.float64)

np.nanmax(df[column_name].values)
```

Code to illustrate problem

#python
import pandas as pd
import numpy as np 

test_data = pd.DataFrame({
                   'objects_column': np.array([0.7,0.5,1.0,1.64,np.nan,0.07]).astype(object),
                   'floats_column': np.array([0.7,0.5,1.0,1.64,np.nan,0.07]).astype(np.float64)})

print("********Using np.max function********")
print("Max of objects array:", np.max(test_data['objects_column'].values))
print("Max of floats array:", np.max(test_data['floats_column'].values))

print("\n********Using max method of series function********")
print("Max of objects array:", test_data["objects_column"].max()) 
print("Max of floats array:", test_data["floats_column"].max())

Returns:

********Using np.max function********
Max of objects array: 0.07
Max of floats array: nan

********Using max method of series function********
Max of objects array: 1.64
Max of floats array: 1.64
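
The NaN handling can also be checked without pandas. The sketch below reuses the same values as a plain NumPy array and casts the object version back to float64 before calling np.nanmax, as suggested above.

```python
import numpy as np

values = np.array([0.7, 0.5, 1.0, 1.64, np.nan, 0.07])
as_objects = values.astype(object)

plain_max = np.max(values)      # NaN propagates through np.max
nan_aware = np.nanmax(values)   # NaN is ignored
from_objects = np.nanmax(as_objects.astype(np.float64))

print(plain_max)     # nan
print(nan_aware)     # 1.64
print(from_objects)  # 1.64
```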

Determining min and max of a numpy array using a loop

Min and max are quite easy: iterate through the items, setting the min and max to the new value when required. For the median, sort each column (at least to just over halfway) and return the middle item (if the length is odd) or the average of the two items closest to the middle (if the length is even).

import numpy as np

arr = np.array([[  5., 162.,  60.], [  2., 110.,  60.], [ 12., 101., 101.], 
    [ 12., 105.,  37.], [ 13., 155.,  58.], [  4., 101.,  42.], 
    [  8., 101.,  38.], [  6., 125.,  40.], [ 15., 200.,  40.], 
    [ 17., 251., 250.], [ 17., 120.,  38.], [ 13., 210., 115.], 
    [ 14., 215., 105.], [  1.,  50.,  50.], [  6.,  70.,  31.], 
    [ 12., 210., 120.], [  4.,  60.,  25.], [ 11., 230.,  80.], 
    [ 15., 225.,  73.], [  2., 110.,  43.]] )

def minmax( arr ):
    """  Return min & max arrays of a 2d array. """
    mn = arr[0].copy()
    mx = arr[0].copy()
    for row in arr[1:]:
        mn = np.minimum(mn, row) # Item by item minimum
        mx = np.maximum(mx, row) # item by item maximum
    return mn, mx    

def median( arr ):
    data = arr.copy()  # data will be modified. 
    # Sort lowest 'half'+1 of data.  Once the middle two items are known
    # the median can be calculated, so there is no need to sort everything.
    size = len(data)
    for ix, d in enumerate( data[:size // 2 + 1 ] ):
        mn = d      # Set mn to the next item in the array
        mnix = ix   # Set mnix to the next index
        # Find min in the rest of the array
        for jx, s in enumerate( data[ ix+1: ] ):
            if s < mn:             # If a new mn 
                mn = s             # Set mn to s
                mnix = jx + ix+1   # Set mnix to the index
        # Swap contents of data[ix] and data[mnix], the minimum found.
        # If mnix == ix it still works.
        data[ix], data[mnix] = mn, data[ix]
    key0 = (size - 1) // 2
    key1 = size - 1 - key0
    # Return the average of the two middle keys
    # (the keys are the same if there is an odd number of items in arr).
    return 0.5 * ( data[key0] + data[key1] )

def medians( arr ):
    res = np.zeros_like( arr[0] )
    # Iterate through arr transposed. i.e. column by column
    for ix, col in enumerate( arr.T ):
        res[ix] = median( col )
    return res

print( minmax( arr ), medians( arr ) )
# (array([ 1., 50., 25.]), array([ 17., 251., 250.])) [ 11.5 122.5  54. ]
# Numpy versions
print( arr.min( axis = 0 ), arr.max( axis = 0 ), np.median( arr, axis = 0 ))
# [ 1. 50. 25.] [ 17. 251. 250.] [ 11.5 122.5  54. ]

It shows how much effort numpy saves you and it runs faster too.

Replacing numpy array with max value

You should use

np.max(a, axis=1)
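
The array from the question is not shown, so here is a small sketch with a made-up `a`. It shows what the call returns, plus the keepdims variant, which keeps the reduced axis so the result can be broadcast back against the original array.

```python
import numpy as np

# Hypothetical input for illustration.
a = np.array([[0, 1, 6],
              [2, 4, 1]])

row_max = np.max(a, axis=1)                 # one maximum per row
row_max_col = np.max(a, axis=1, keepdims=True)

print(row_max)               # [6 4]
print(row_max_col.tolist())  # [[6], [4]]

# Broadcasting the column of maxima back to the original shape.
filled = np.broadcast_to(row_max_col, a.shape)
print(filled.tolist())       # [[6, 6, 6], [4, 4, 4]]
```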
