Calculate mean of each 2d array in a numpy array
I think this gives you the expected output: pass a tuple of axes to the axis parameter (see the NumPy docs for more about axis).
b.mean(axis=(1,2))
array([3.5, 2. ])
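As a minimal sketch of how a tuple-valued axis behaves, the snippet below builds a small 3D array and averages over the last two axes, so each 2D slice collapses to a single number. The name `stack` and its values are made up for illustration; they are not the data from the question.

```python
import numpy as np

# Hypothetical data: two 2x2 arrays stacked along axis 0 -> shape (2, 2, 2)
stack = np.array([[[1, 2], [3, 4]],
                  [[10, 20], [30, 40]]])

# Reduce over axes 1 and 2 together; axis 0 (one entry per 2D array) survives
per_slice_mean = stack.mean(axis=(1, 2))

# Equivalent formulation: flatten each 2D slice, then average along the last axis
per_slice_mean_alt = stack.reshape(stack.shape[0], -1).mean(axis=1)

print(per_slice_mean)  # one mean per 2D slice: 2.5 and 25.0
```

Both formulations return an array of shape (2,), one mean per 2D slice.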
How to get mean value with 1D array and 2D array?
There is an error in the definition of summary: instead of list, you want x as the parameter.
With this change it works fine for me:
import numpy as np
def summary(x):
    mean1 = np.mean(x)
    Dict = {"mean": mean1}
    return Dict
a = summary([1, 2, 2, 3, 4])
print(a)
b = summary([[1, 2], [3, 4]])
print(b)
Result is:
{'mean': 2.4}
{'mean': 2.5}
[Update]
If you want the mean along a specific axis, you can do it as follows. You have to check the array's shape, because you want the mean along axis 1, which is not present for a 1D array.
import numpy as np
def summary(x):
    arr = np.array(x)
    if len(arr.shape) == 2:
        mean1 = np.mean(arr, axis=1)
    else:
        mean1 = np.mean(arr)
    Dict = {"mean": mean1}
    return Dict
a = summary([1, 2, 2, 3, 4])
print(a)
b = summary([[1, 2], [3, 4]])
print(b)
which returns
{'mean': 2.4}
{'mean': array([1.5, 3.5])}
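One possible way to avoid the shape check is to always average over the last axis, which exists for both 1D and 2D input. The sketch below is an alternative under that assumption, not the answer's method; the function name `summary_last_axis` is hypothetical.

```python
import numpy as np

def summary_last_axis(x):
    """Hypothetical variant of summary(): always average over the last axis."""
    arr = np.asarray(x, dtype=float)
    # axis=-1 exists for every array with at least one dimension, so 1D and
    # 2D inputs go through the same code path
    return {"mean": arr.mean(axis=-1)}

a = summary_last_axis([1, 2, 2, 3, 4])    # scalar mean of the whole 1D array
b = summary_last_axis([[1, 2], [3, 4]])   # one mean per row
print(a)
print(b)
```

For 1D input the last axis is the only axis, so the result is a scalar; for 2D input it is axis 1, so the result has one entry per row.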
Numpy calculate mean for sub range both rows and columns in 2D array
You can take advantage of NumPy's stride_tricks module to view your array as a grid of blocks.
One-liner solution:
from numpy.lib.stride_tricks import as_strided
as_strided(a, shape=(4, 4, 2, 2), strides=(128, 16, 64, 8)).mean(axis=(2,3))
output:
array([[2. , 3.5 , 5.5 , 6. ],
[4.25, 6.5 , 5.75, 5.75],
[5.5 , 4. , 6.5 , 4.75],
[4.25, 6.75, 4.5 , 4. ]])
Note that I define the shape and the strides explicitly (the values above correspond to an 8*8 array of 8-byte elements), but they can easily be inferred for the generic case of 2*2 average pooling:
strides = tuple(map(lambda x: x*2, a.strides)) + a.strides
shape = tuple(map(lambda x: int(x / 2), a.shape)) + (2, 2)
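The inference above can be wrapped in a small helper. This is a sketch under stated assumptions: the function name `block_mean_strided`, the block size parameter `k`, and the sample 8*8 array are illustrative, and the helper assumes a 2D array whose dimensions are divisible by `k`. Passing wrong strides to `as_strided` can read memory outside the array, so the view is created read-only and the shape is checked first.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def block_mean_strided(a, k=2):
    """Mean over non-overlapping k x k blocks of a 2D array (hypothetical helper)."""
    rows, cols = a.shape
    if rows % k or cols % k:
        raise ValueError("both dimensions must be divisible by k")
    # Outer two axes step from block to block, inner two step within a block
    shape = (rows // k, cols // k, k, k)
    strides = (a.strides[0] * k, a.strides[1] * k) + a.strides
    blocks = as_strided(a, shape=shape, strides=strides, writeable=False)
    return blocks.mean(axis=(2, 3))

# Illustrative input: 8x8 float64 array, so a.strides == (64, 8)
a = np.arange(64, dtype=np.float64).reshape(8, 8)
pooled = block_mean_strided(a)

# Cross-check against a plain reshape-based block mean
reference = a.reshape(4, 2, 4, 2).mean(axis=(1, 3))
print(np.allclose(pooled, reference))
```

For this input the computed strides are (128, 16, 64, 8), the same values that are hard-coded in the one-liner.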
This trick is about 10 times faster than a naive loop:
%%timeit
as_strided(a, shape=(4, 4, 2, 2), strides=(128, 16, 64, 8)).mean(axis=(2,3))
11.5 µs ± 44.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
ni = nj = 2
dim_i = a.shape[0]
dim_j = a.shape[1]
b = np.empty((int(a.shape[0]/ni), int(a.shape[1]/nj)))
for ii, i in enumerate(range(0, dim_i, ni)):
    for jj, j in enumerate(range(0, dim_j, nj)):
        flat = np.array([a[i][j:j+nj], a[i+1][j:j+nj]]).flatten()
        b[ii,jj] = np.mean(flat)
128 µs ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
The gain is more significant for larger arrays (about 200x for a 1000*1000 array).
Compute mean of values for each index across multiple arrays
This is possible in a single line with NumPy:
import numpy as np
arr=[np.array([2.4, 3.5, 2.9]),
np.array([4.5, 1.8, 1.4])]
np.mean(arr, axis = 0)
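To make the intermediate step visible, the sketch below stacks the same two arrays into one 2D array before averaging. `np.mean` does this conversion implicitly when given a list of equal-length arrays; the names `stacked` and `elementwise_mean` are only illustrative.

```python
import numpy as np

arr = [np.array([2.4, 3.5, 2.9]),
       np.array([4.5, 1.8, 1.4])]

# One row per input array -> shape (2, 3)
stacked = np.stack(arr)

# Averaging over axis 0 combines the rows index by index
elementwise_mean = stacked.mean(axis=0)
print(elementwise_mean)  # approximately 3.45, 2.65, 2.15
```

The result has the same length as each input array.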
Get element wise average of multiple arrays
To obtain the average of each list (row) in the matrix:
averaged_array = np.array(array_1).mean(axis=1)
To obtain the average of each column (element-wise across the lists):
averaged_array = np.array(array_1).mean(axis=0)
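A short sketch of the two axis choices side by side, using a made-up `array_1` (the original data is not shown, so the values here are an assumption):

```python
import numpy as np

# Hypothetical input: two lists of equal length
array_1 = [[1, 2, 3],
           [4, 5, 6]]
matrix = np.array(array_1)          # shape (2, 3)

row_means = matrix.mean(axis=1)     # one value per list: 2.0 and 5.0
col_means = matrix.mean(axis=0)     # one value per position: 2.5, 3.5, 4.5

print(row_means.shape, col_means.shape)
```

The axis passed to `mean` is the one that disappears from the result's shape.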
block mean of 2D numpy array (in both dimensions)
You can do:
# sample data
a=np.arange(24).reshape((6,4))
rows, cols = a.shape
a.reshape(rows//2, 2, cols//2, 2).mean(axis=(1,-1))
Output:
array([[ 2.5, 4.5],
[10.5, 12.5],
[18.5, 20.5]])
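The reshape approach generalizes to other block sizes. The sketch below wraps it in a hypothetical helper, `block_mean`, and assumes the array shape is an exact multiple of the block shape; it raises an error otherwise rather than guessing how to handle the remainder.

```python
import numpy as np

def block_mean(a, block_rows, block_cols):
    """Mean over non-overlapping blocks of a 2D array (hypothetical helper)."""
    rows, cols = a.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("array shape must be divisible by the block shape")
    # Split each axis into (number of blocks, block size), then average over
    # the two block-size axes
    return (a.reshape(rows // block_rows, block_rows,
                      cols // block_cols, block_cols)
             .mean(axis=(1, 3)))

a = np.arange(24).reshape((6, 4))
result_2x2 = block_mean(a, 2, 2)   # shape (3, 2), same as the answer above
result_3x2 = block_mean(a, 3, 2)   # shape (2, 2)
print(result_2x2)
print(result_3x2)
```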
Calculate mean across a row of an array using a row of a masking array
If you want to use masked arrays, here is a streamlined way of doing that:
import numpy as np
# create some mock data
R_mat = np.arange(16).reshape(4, 4)
# Y_mat is random, so the outputs shown below come from one particular draw
Y_mat = np.random.randint(0, 2, (4, 4))
R_mat
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11],
# [12, 13, 14, 15]])
Y_mat
# array([[0, 1, 0, 1],
# [0, 1, 1, 0],
# [0, 1, 0, 1],
# [0, 0, 1, 0]])
# compute all row means or all column means at once
# use Y_mat==0 to invert and convert to bool in one go
row_means = np.ma.MaskedArray(R_mat, Y_mat==0).mean(axis=1)
col_means = np.ma.MaskedArray(R_mat, Y_mat==0).mean(axis=0)
row_means
# masked_array(data=[2.0, 5.5, 10.0, 14.0],
# mask=[False, False, False, False],
# fill_value=1e+20)
col_means
# masked_array(data=[--, 5.0, 10.0, 7.0],
# mask=[ True, False, False, False],
# fill_value=1e+20)
# or take just one row or column and get the mean
np.ma.MaskedArray(R_mat, Y_mat==0)[2].mean()
# 10.0
np.ma.MaskedArray(R_mat, Y_mat==0)[:, 0].mean()
# masked
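If downstream code expects a plain ndarray, the masked result can be converted with `filled`. The sketch below hard-codes the `Y_mat` draw shown above so that it is reproducible; using NaN as the fill value is a choice made for this example.

```python
import numpy as np

R_mat = np.arange(16).reshape(4, 4)
# Fixed copy of the random draw shown above, so the result is reproducible
Y_mat = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])

col_means = np.ma.MaskedArray(R_mat, Y_mat == 0).mean(axis=0)

# Replace masked entries (columns with no selected values) with NaN
col_means_plain = col_means.filled(np.nan)
print(col_means_plain)
```

Column 0 has no selected entries, so its mean is masked and becomes NaN after filling.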
If for some reason you want to avoid masked arrays:
nrow, ncol = R_mat.shape
I, J = np.where(Y_mat)
row_means = np.bincount(I, R_mat[I, J], nrow) / np.bincount(I, None, nrow)
J, I = np.where(Y_mat.T)
col_means = np.bincount(J, R_mat[I, J], ncol) / np.bincount(J, None, ncol)
# __main__:1: RuntimeWarning: invalid value encountered in true_divide
row_means
# array([ 2. , 5.5, 10. , 14. ])
col_means
# array([nan, 5., 10., 7.])
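A different way to get the same numbers without the RuntimeWarning is to divide sums by counts only where the count is positive. This is a sketch of an alternative, not the bincount method above; the helper name `selected_mean` is hypothetical, and `Y_mat` is the fixed draw shown earlier.

```python
import numpy as np

R_mat = np.arange(16).reshape(4, 4)
Y_mat = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])

def selected_mean(values, selector, axis):
    """Mean of values where selector is nonzero, NaN where nothing is selected."""
    selected = selector.astype(bool)
    counts = selected.sum(axis=axis)
    sums = np.where(selected, values, 0).sum(axis=axis)
    # Start from NaN and only divide where at least one value was selected
    out = np.full(counts.shape, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out

row_means = selected_mean(R_mat, Y_mat, axis=1)
col_means = selected_mean(R_mat, Y_mat, axis=0)
print(row_means)
print(col_means)
```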