numpy max vs amax vs maximum
np.max and np.amax are the same function (one is simply an alias for the other). This function only works on a single input array and finds the value of the maximum element in that entire array (returning a scalar). Alternatively, it takes an axis argument and will find the maximum value along an axis of the input array (returning a new array).
>>> a = np.array([[0, 1, 6],
...               [2, 4, 1]])
>>> np.max(a)
6
>>> np.max(a, axis=0) # max of each column
array([2, 4, 6])
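As a small complement to the column-wise example, here is a sketch of the same call along the other axis, plus the keepdims option, which keeps the reduced axis as a length-one dimension. The array is the one used above.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])

# Maximum of each row: the reduction runs along axis 1.
row_max = np.max(a, axis=1)                      # array([6, 4])

# keepdims=True keeps the reduced axis, giving shape (2, 1) instead of (2,).
row_max_2d = np.max(a, axis=1, keepdims=True)    # array([[6], [4]])
```

The (2, 1) shape is convenient when the result has to be broadcast back against the original array.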
The default behaviour of np.maximum is to take two compatible arrays and compute their element-wise maximum. Here, 'compatible' means that one array can be broadcast to the other. For example:
>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])
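The example above uses two arrays of the same shape, so it does not show broadcasting itself. A minimal sketch with a scalar and with a column vector (the names clipped, col and grid are only illustrative):

```python
import numpy as np

b = np.array([3, 6, 1])

# A scalar is broadcast against every element, which clips values from below.
clipped = np.maximum(b, 4)      # array([4, 6, 4])

# A (2, 1) column is broadcast against a (3,) row, giving a (2, 3) result.
col = np.array([[2], [5]])
grid = np.maximum(col, b)       # array([[3, 6, 2], [5, 6, 5]])
```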
But np.maximum is also a universal function, which means that it has other features and methods which come in useful when working with multidimensional arrays. For example, you can compute the cumulative maximum over an array (or a particular axis of the array):
>>> d = np.array([2, 0, 3, -4, -2, 7, 9])
>>> np.maximum.accumulate(d)
array([2, 2, 3, 3, 3, 7, 9])
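For the "particular axis" case mentioned above, a short sketch on a two-dimensional array (the array m is made up for illustration):

```python
import numpy as np

m = np.array([[1, 5, 2],
              [4, 3, 6]])

# Running maximum within each row (left to right).
along_rows = np.maximum.accumulate(m, axis=1)   # [[1, 5, 5], [4, 4, 6]]

# Running maximum down each column (top to bottom).
down_cols = np.maximum.accumulate(m, axis=0)    # [[1, 5, 2], [4, 5, 6]]
```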
This is not possible with np.max.
You can make np.maximum imitate np.max to a certain extent when using np.maximum.reduce:
>>> np.maximum.reduce(d)
9
>>> np.max(d)
9
Basic testing suggests the two approaches are comparable in performance; and they should be, as np.max() actually calls np.maximum.reduce to do the computation.
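One place where the imitation is only "to a certain extent" is the default axis. As a sketch: np.maximum.reduce reduces along axis 0 unless told otherwise, while np.max reduces over the whole array by default.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])

whole = np.max(a)                               # 6, all elements considered
default_reduce = np.maximum.reduce(a)           # array([2, 4, 6]), axis=0 by default
full_reduce = np.maximum.reduce(a, axis=None)   # 6, matches np.max(a)
```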
numpy.max or max ? Which one is faster?
Well, from my timings it follows that if you already have a numpy array a, you should use a.max (the source says it's the same as np.max if a.max is available). But if you have a built-in list, most of the time is spent converting it into an np.ndarray, and that's why max is better in your timings.
In essence: if you have an np.ndarray, use a.max; if you have a list and don't need all the machinery of np.ndarray, use the standard max.
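A rough way to check this on your own machine is a small timeit comparison. This is only a sketch: the data size and repeat count are arbitrary choices, and the numbers will vary between machines and NumPy versions.

```python
import timeit
import numpy as np

values = list(range(100_000))   # hypothetical sample data
arr = np.array(values)          # converted once, outside the timed code

cases = {
    "max(list)": lambda: max(values),
    "np.max(list)": lambda: np.max(values),   # pays for list -> ndarray conversion
    "arr.max()": lambda: arr.max(),
}

# Time each variant over the same number of calls.
timings = {name: timeit.timeit(fn, number=20) for name, fn in cases.items()}
for name, seconds in timings.items():
    print(f"{name:>14}: {seconds:.4f} s")
```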
NumPy: function for simultaneous max() and min()
Is there a function in the numpy API that finds both max and min with only a single pass through the data?
No. At the time of this writing, there is no such function. (And yes, if there were such a function, its performance would be significantly better than calling numpy.amin() and numpy.amax() successively on a large array.)
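To make the single-pass idea concrete, here is a sketch in plain Python. The function name minmax_single_pass is hypothetical, not part of NumPy. A Python-level loop like this is expected to be much slower than two vectorized NumPy passes, so it only illustrates the algorithm; a compiled loop would be needed to see a real benefit.

```python
import numpy as np

def minmax_single_pass(values):
    """Return (min, max) of a 1-D iterable, visiting each item once."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax_single_pass() arg is an empty sequence") from None
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi

data = np.array([2, 0, 3, -4, -2, 7, 9])
lo, hi = minmax_single_pass(data)

# Same answer as the usual two-pass approach.
assert (lo, hi) == (data.min(), data.max())
```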
What does numpy.max function do?
You can't replicate the behavior of np.max very easily in pure Python, simply because multi-dimensional arrays aren't standard in Python. If the A and B in your code are such arrays, it would be best to keep the NumPy function.
For flat (one-dimensional) arrays, the Python max and np.max do the same thing and could be exchanged:
>>> a = np.arange(27)
>>> max(a)
26
>>> np.max(a)
26
For arrays with more than one dimension, max won't work:
>>> a = a.reshape(3, 3, 3)
>>> max(a)
ValueError: The truth value of an array with more than one element is ambiguous [...]
>>> np.max(a)
26
By default, np.max flattens the 3D array and returns the maximum. (You can also find the maximum along particular axes, and so on.) The Python max cannot do this.
To replace np.max, you'd need to write nested loops over the axes of the array, effectively trying to find the maximum in a list of nested lists. This is certainly possible, but is likely to be very slow:
>>> max([max(y) for x in a for y in x])
26
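For the "maximum along particular axes" mentioned above, a brief sketch on the same 3x3x3 array; the axis argument also accepts a tuple of axes.

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)

# Reduce over the first axis only: the result has shape (3, 3).
over_first = np.max(a, axis=0)          # [[18, 19, 20], [21, 22, 23], [24, 25, 26]]

# Reduce over the last two axes at once: one maximum per 3x3 block.
per_block = np.max(a, axis=(1, 2))      # array([ 8, 17, 26])
```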
How can I index each occurrence of a max value along a given axis of a numpy array?
Try with np.where:
np.where(Q == Q.max(axis=1)[:,None])
Output:
(array([0, 0, 1, 1, 2]), array([1, 2, 0, 2, 1]))
Not quite the output you want, but contains equivalent information.
You can also use np.argwhere, which gives you the same indices as (row, column) pairs:
np.argwhere(Q==Q.max(axis=1)[:,None])
Output:
array([[0, 1],
[0, 2],
[1, 0],
[1, 2],
[2, 1]])
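The array Q is not shown in the answer. As a self-contained sketch, the hypothetical Q below is made up so that it reproduces the indices above, and the last step groups the column indices by row, which may be closer to the output the question asked for.

```python
import numpy as np

# Hypothetical input: an assumption, chosen to match the indices shown above.
Q = np.array([[1, 5, 5],
              [7, 2, 7],
              [0, 3, 1]])

# Row maxima as a column vector so they broadcast against each row of Q.
row_max = Q.max(axis=1)[:, None]

rows, cols = np.where(Q == row_max)
pairs = np.argwhere(Q == row_max)

# Column indices of the maxima, grouped per row.
per_row = [cols[rows == r].tolist() for r in range(Q.shape[0])]
```

Here per_row is [[1, 2], [0, 2], [1]].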
Python: numpy.amax vs Python's max : 'int' object is not iterable
At some point your code is trying to apply max to an integer, as opposed to a list or other iterable.
To illustrate:
In [354]: max(123)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-354-8de2de84b04d> in <module>()
----> 1 max(123)
TypeError: 'int' object is not iterable
In [355]: np.max(123)
Out[355]: 123
In [356]: np.max(np.array(123))
Out[356]: 123
np.max works because it first turns the argument into an array.
tempAvgMetric = [[] for dmx in range(numBins)]
....
tempAvgMetric[xxx].extend(binnedMetric[expDir][idx][nameNum][xxx])
....
tempAvgMetric[idx] = 0
With this code, some tempAvgMetric elements will be lists (they all start as []), but in the idx case the element is the integer 0.
Changing that assignment to a one-element list keeps every element iterable, so max works on it:
tempAvgMetric[idx] = [0]
In [357]: max([0])
Out[357]: 0
Be aware that max([]) and np.max([]) both produce an error.
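If empty inputs can occur, both functions accept a fallback value. A minimal sketch, assuming 0 is an acceptable result for an empty list (that choice is an assumption, not something from the question):

```python
import numpy as np

empty = []

# Built-in max: default= is returned only when the iterable is empty.
builtin_result = max(empty, default=0)

# np.max: initial= seeds the reduction, so it also works on empty input.
numpy_result = np.max(empty, initial=0)

# Caveat: initial= takes part in the comparison for non-empty input too.
surprise = np.max([-5, -2], initial=0)    # 0, not -2
```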
The if tempAvgMetric test doesn't make much sense. When would this be False? Only if the list was empty, i.e. if numBins==0.
why numpy max function(np.max) return wrong output?
There are two problems here:
1. It looks like the column you are trying to find the maximum for has the data type object. That is not recommended if you are sure the column should contain numerical data, since it may cause unpredictable behaviour, and not only in this particular case. Check the data types of your dataframe (you can do this by typing df.dtypes) and change them so that they correspond to the data you expect (in this case, df[column_name].astype(np.float64)). This is also the reason for np.nanmax not working properly.
2. You don't want to use np.max on arrays containing NaNs.
Solution
1. If you are sure the column should keep the object data type:
1.1. You can use the max method of Series; it should cast the data to float automatically.
df.iloc[:, 3].max()
1.2. You can cast the data to the proper type only for the nanmax function.
np.nanmax(df.values[:,3].astype(np.float64))
1.3. You can drop all NaNs from the column and find the max [not recommended]:
np.max(test_data[column_name].dropna().values)
2. If your data is float64 and shouldn't have the object data type [recommended]:
df[column_name] = df[column_name].astype(np.float64)
np.nanmax(df.values[:,3])
Code to illustrate problem
#python
import pandas as pd
import numpy as np
test_data = pd.DataFrame({
'objects_column': np.array([0.7,0.5,1.0,1.64,np.nan,0.07]).astype(object),
'floats_column': np.array([0.7,0.5,1.0,1.64,np.nan,0.07]).astype(np.float64)})
print("********Using np.max function********")
print("Max of objects array:", np.max(test_data['objects_column'].values))
print("Max of floats array:", np.max(test_data['floats_column'].values))
print("\n********Using max method of series function********")
print("Max of objects array:", test_data["objects_column"].max())
print("Max of floats array:", test_data["floats_column"].max())
Returns:
********Using np.max function********
Max of objects array: 0.07
Max of floats array: nan
********Using max method of series function********
Max of objects array: 1.64
Max of floats array: 1.64
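The recommended route (cast to float64, then use np.nanmax) can be sketched with plain NumPy on the same values. The variable names here are only illustrative.

```python
import numpy as np

# Same values as above, stored with the problematic object dtype.
raw = np.array([0.7, 0.5, 1.0, 1.64, np.nan, 0.07], dtype=object)

# Cast to a real float dtype first.
as_float = raw.astype(np.float64)

plain_max = np.max(as_float)       # nan: np.max propagates NaN
nan_aware = np.nanmax(as_float)    # 1.64: np.nanmax ignores NaN
```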
Determining min and max of a numpy array using a loop
Min and max are quite easy: iterate through the items, setting the min and max to the new value if required. For the median, sort the columns (at least to just over halfway) and return the middle item (if the length is odd) or the average of the two items closest to the middle (if the length is even).
import numpy as np
arr = np.array([[ 5., 162., 60.], [ 2., 110., 60.], [ 12., 101., 101.],
[ 12., 105., 37.], [ 13., 155., 58.], [ 4., 101., 42.],
[ 8., 101., 38.], [ 6., 125., 40.], [ 15., 200., 40.],
[ 17., 251., 250.], [ 17., 120., 38.], [ 13., 210., 115.],
[ 14., 215., 105.], [ 1., 50., 50.], [ 6., 70., 31.],
[ 12., 210., 120.], [ 4., 60., 25.], [ 11., 230., 80.],
[ 15., 225., 73.], [ 2., 110., 43.]] )
def minmax( arr ):
    """ Return min & max arrays of a 2d array. """
    mn = arr[0].copy()
    mx = arr[0].copy()
    for row in arr[1:]:
        mn = np.minimum(mn, row)  # Item by item minimum
        mx = np.maximum(mx, row)  # Item by item maximum
    return mn, mx

def median( arr ):
    data = arr.copy()  # data will be modified.
    # Sort lowest 'half'+1 of data. Once the middle two items are known
    # the median can be calculated so no need to sort all.
    size = len(data)
    for ix, d in enumerate( data[:size // 2 + 1 ] ):
        mn = d     # Set mn to the next item in the array
        mnix = ix  # Set mnix to the next index
        # Find min in the rest of the array
        for jx, s in enumerate( data[ ix+1: ] ):
            if s < mn:             # If a new mn
                mn = s             # Set mn to s
                mnix = jx + ix+1   # Set mnix to the index
        # Swap contents of data[ix] and data[mnix], the minimum found.
        # If mnix == ix it still works.
        data[ix], data[mnix] = mn, data[ix]
    key0 = (size - 1) // 2
    key1 = size - 1 - key0
    # Return average of the two middle keys
    # ( the keys are the same if an odd number of items in arr)
    return 0.5 * ( data[key0] + data[key1] )

def medians( arr ):
    res = np.zeros_like( arr[0] )
    # Iterate through arr transposed. i.e. column by column
    for ix, col in enumerate( arr.T ):
        res[ix] = median( col )
    return res

print( minmax( arr ), medians( arr ) )
# (array([ 1., 50., 25.]), array([ 17., 251., 250.])) [ 11.5 122.5 54. ]
# Numpy versions
print( arr.min( axis = 0 ), arr.max( axis = 0 ), np.median( arr, axis = 0 ))
# [ 1. 50. 25.] [ 17. 251. 250.] [ 11.5 122.5 54. ]
It shows how much effort numpy saves you and it runs faster too.
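The row-by-row loop in minmax is the same reduction that the ufunc reduce methods perform along axis 0. A short sketch on the first three rows of the array above:

```python
import numpy as np

small = np.array([[ 5., 162.,  60.],
                  [ 2., 110.,  60.],
                  [12., 101., 101.]])

# Column-wise minimum and maximum, one ufunc reduction each.
col_min = np.minimum.reduce(small, axis=0)   # [  2., 101.,  60.]
col_max = np.maximum.reduce(small, axis=0)   # [ 12., 162., 101.]

# These match the array methods used in the "Numpy versions" line.
assert np.array_equal(col_min, small.min(axis=0))
assert np.array_equal(col_max, small.max(axis=0))
```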
Replacing numpy array with max value
You should use
np.max(a, axis=1)
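The question itself is not reproduced here, so the following is only a sketch under an assumption: that the goal is to replace every element of a 2D array with the maximum of its row. The array a is made up for illustration.

```python
import numpy as np

a = np.array([[1, 9, 3],
              [4, 2, 8]])

# One maximum per row.
row_max = np.max(a, axis=1)                       # array([9, 8])

# keepdims=True gives shape (2, 1), which broadcasts back to a's shape.
row_max_col = np.max(a, axis=1, keepdims=True)
filled = np.broadcast_to(row_max_col, a.shape).copy()   # [[9, 9, 9], [8, 8, 8]]
```

The copy() call makes the result a normal writable array, since np.broadcast_to returns a read-only view.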