Calculate Mean Across Dimension in a 2D Array

Calculate mean of each 2d array in a numpy array

I think this gives the output you expect: pass a tuple of axes to the axis parameter so that the mean is taken over several dimensions at once. See the NumPy documentation for more about the axis parameter.

b.mean(axis=(1,2))
array([3.5, 2. ])
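
Since the original array b is not shown here, the following is a minimal sketch with an assumed 3D array, chosen so that it reproduces the output above. It illustrates that a tuple passed to axis averages over all listed axes at once, leaving one mean per 2D sub-array.

```python
import numpy as np

# Assumed sample data (the array from the question is not shown):
# two 2x2 sub-arrays stacked along axis 0.
b = np.array([[[1, 2], [5, 6]],
              [[1, 2], [2, 3]]])

# Average over axes 1 and 2, leaving one value per 2D sub-array.
means = b.mean(axis=(1, 2))
print(means)  # [3.5 2. ]

# Equivalent formulation: flatten each sub-array, then average along axis 1.
same = b.reshape(b.shape[0], -1).mean(axis=1)
```

Both formulations should agree; the tuple form just avoids the intermediate reshape.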

How to get mean value with 1D array and 2D array?

There is an error in the definition of summary: the function parameter should be x, not list.

With this it works fine for me:

import numpy as np

def summary(x):
    mean1 = np.mean(x)

    Dict = {"mean": mean1}
    return Dict

a = summary([1, 2, 2, 3, 4])
print(a)
b = summary([[1, 2], [3, 4]])
print(b)

Result is:

{'mean': 2.4}
{'mean': 2.5}

[Update]

If you want the mean along a specific axis, you can do it as follows. You have to check the shape of the array first, because you want the mean along axis 1, which does not exist for a 1D array.

import numpy as np

def summary(x):
    arr = np.array(x)
    if len(arr.shape) == 2:
        mean1 = np.mean(arr, axis=1)
    else:
        mean1 = np.mean(arr)

    Dict = {"mean": mean1}
    return Dict

a = summary([1, 2, 2, 3, 4])
print(a)
b = summary([[1, 2], [3, 4]])
print(b)

which returns

{'mean': 2.4}
{'mean': array([1.5, 3.5])}
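
As a possible alternative to the explicit shape check, averaging along the last axis (axis=-1) appears to cover both cases: for a 1D input the last axis is the only axis, and for a 2D input it is axis 1. This is only a sketch; summary_last_axis is a hypothetical name and not part of the original answer.

```python
import numpy as np

def summary_last_axis(x):
    """Return the mean along the last axis of x (hypothetical helper)."""
    arr = np.asarray(x)
    # axis=-1 is the only axis of a 1D array and axis 1 of a 2D array.
    return {"mean": np.mean(arr, axis=-1)}

a = summary_last_axis([1, 2, 2, 3, 4])
b = summary_last_axis([[1, 2], [3, 4]])
print(a)
print(b)
```

Note that for inputs with three or more dimensions this averages along the last axis rather than axis 1, so the explicit check may still be preferable if such inputs can occur.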

Numpy calculate mean for sub range both rows and columns in 2D array

You can take advantage of NumPy's stride_tricks module to view your array as a grid of blocks.
One-liner solution:

from numpy.lib.stride_tricks import as_strided
as_strided(a, shape=(4, 4, 2, 2), strides=(128, 16, 64, 8)).mean(axis=(2,3))

output:

array([[2.  , 3.5 , 5.5 , 6.  ],
       [4.25, 6.5 , 5.75, 5.75],
       [5.5 , 4.  , 6.5 , 4.75],
       [4.25, 6.75, 4.5 , 4.  ]])

Note that I define the shape and the strides explicitly, but they can easily be inferred from the array (for the generic case of 2*2 average pooling):

strides = tuple(map(lambda x: x*2, a.strides)) + a.strides
shape = tuple(map(lambda x: int(x / 2), a.shape)) + (2, 2)
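
The inference above can be wrapped into a small function. The sketch below assumes an 8x8 float64 input with even dimensions (the original array a is not shown), and block_mean_2x2 is a hypothetical name. It passes writeable=False because as_strided views share memory with the original array and are easy to misuse.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def block_mean_2x2(a):
    """Mean over non-overlapping 2x2 blocks of a 2D array (hypothetical helper)."""
    rows, cols = a.shape
    # Outer two axes step from block to block, inner two step inside a block.
    shape = (rows // 2, cols // 2, 2, 2)
    strides = (2 * a.strides[0], 2 * a.strides[1]) + a.strides
    blocks = as_strided(a, shape=shape, strides=strides, writeable=False)
    return blocks.mean(axis=(2, 3))

# Assumed sample data: 8x8 array of float64, so a.strides == (64, 8).
a = np.arange(64, dtype=np.float64).reshape(8, 8)
pooled = block_mean_2x2(a)
print(pooled.shape)  # (4, 4)
```

For this input the inferred strides are (128, 16, 64, 8), matching the explicit values in the one-liner.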

This trick is about 10 times faster than a naive loop:

%%timeit
as_strided(a, shape=(4, 4, 2, 2), strides=(128, 16, 64, 8)).mean(axis=(2,3))
11.5 µs ± 44.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
ni = nj = 2
dim_i = a.shape[0]
dim_j = a.shape[1]
b = np.empty((int(a.shape[0]/ni), int(a.shape[1]/nj)))

for ii, i in enumerate(range(0, dim_i, ni)):
    for jj, j in enumerate(range(0, dim_j, nj)):
        flat = np.array([a[i][j:j+ni], a[i+1][j:j+ni]]).flatten()
        b[ii,jj] = np.mean(flat)
128 µs ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The gain is more significant for larger arrays (about 200 times faster for a 1000*1000 array).

Compute mean of values for each index across multiple arrays

This is possible with a single NumPy call:

import numpy as np

arr = [np.array([2.4, 3.5, 2.9]),
       np.array([4.5, 1.8, 1.4])]
np.mean(arr, axis=0)
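
To make the result concrete, here is a sketch of the same computation with the value stored. np.mean converts the list of arrays into a 2D array first, so axis=0 averages across the arrays at each index; the explicit np.stack is only there to make that step visible.

```python
import numpy as np

arr = [np.array([2.4, 3.5, 2.9]),
       np.array([4.5, 1.8, 1.4])]

# Stack the 1D arrays into shape (2, 3), then average down each column.
stacked = np.stack(arr)
index_means = stacked.mean(axis=0)
print(index_means)  # approximately [3.45, 2.65, 2.15]
```

This only works when all arrays have the same length; otherwise np.stack raises an error.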

Get element wise average of multiple arrays

To obtain the average of each list (row) in the matrix:

averaged_array = np.array(array_1).mean(axis=1)

To obtain the average of each column:

averaged_array = np.array(array_1).mean(axis=0)
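
A short sketch showing both calls side by side. The contents of array_1 are an assumption, since the original data is not shown.

```python
import numpy as np

# Hypothetical sample data; array_1 is not defined in the original answer.
array_1 = [[1, 2, 3],
           [4, 5, 6]]

row_means = np.array(array_1).mean(axis=1)  # one mean per inner list
col_means = np.array(array_1).mean(axis=0)  # one mean per column

print(row_means)  # [2. 5.]
print(col_means)  # [2.5 3.5 4.5]
```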

block mean of 2D numpy array (in both dimensions)

You can do:

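import numpy as np

# sample data
a = np.arange(24).reshape((6, 4))

rows, cols = a.shape
# split each axis into (number of blocks, 2), then average over the block axes
a.reshape(rows//2, 2, cols//2, 2).mean(axis=(1,-1))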

Output:

array([[ 2.5,  4.5],
       [10.5, 12.5],
       [18.5, 20.5]])
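
The same reshape idea can be generalized to other block sizes. The following is a sketch under the assumption that both dimensions are exact multiples of the block size; block_mean is a hypothetical helper name.

```python
import numpy as np

def block_mean(a, block_rows, block_cols):
    """Mean of non-overlapping blocks of a 2D array (hypothetical helper)."""
    rows, cols = a.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("array shape must be divisible by the block size")
    # Split each axis into (number of blocks, block size), then average
    # over the two block-size axes.
    blocks = a.reshape(rows // block_rows, block_rows,
                       cols // block_cols, block_cols)
    return blocks.mean(axis=(1, 3))

a = np.arange(24).reshape((6, 4))
result_2x2 = block_mean(a, 2, 2)  # same result as the snippet above
result_3x2 = block_mean(a, 3, 2)  # 3-row by 2-column blocks
```

The explicit divisibility check is a design choice: without it, reshape would fail with a less descriptive error for shapes that do not fit.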

Calculate mean across a row of an array using a row of a masking array

If you want to use masked arrays, here is a streamlined way of doing that:

import numpy as np

# create some mock data
R_mat = np.arange(16).reshape(4, 4)
# random 0/1 selector; the values shown below are one possible draw
Y_mat = np.random.randint(0, 2, (4, 4))

R_mat
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15]])
Y_mat
# array([[0, 1, 0, 1],
#        [0, 1, 1, 0],
#        [0, 1, 0, 1],
#        [0, 0, 1, 0]])

# compute all row means or all column means at once
# use Y_mat==0 to invert and convert to bool in one go
row_means = np.ma.MaskedArray(R_mat, Y_mat==0).mean(axis=1)
col_means = np.ma.MaskedArray(R_mat, Y_mat==0).mean(axis=0)

row_means
# masked_array(data=[2.0, 5.5, 10.0, 14.0],
#              mask=[False, False, False, False],
#        fill_value=1e+20)
col_means
# masked_array(data=[--, 5.0, 10.0, 7.0],
#              mask=[ True, False, False, False],
#        fill_value=1e+20)

# or take just one row or column and get the mean
np.ma.MaskedArray(R_mat, Y_mat==0)[2].mean()
# 10.0
np.ma.MaskedArray(R_mat, Y_mat==0)[:, 0].mean()
# masked

If for some reason you want to avoid masked arrays:

nrow, ncol = R_mat.shape

I, J = np.where(Y_mat)
row_means = np.bincount(I, R_mat[I, J], nrow) / np.bincount(I, None, nrow)

J, I = np.where(Y_mat.T)
col_means = np.bincount(J, R_mat[I, J], ncol) / np.bincount(J, None, ncol)
# __main__:1: RuntimeWarning: invalid value encountered in true_divide

row_means
# array([ 2. , 5.5, 10. , 14. ])
col_means
# array([nan, 5., 10., 7.])
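
Another way to avoid masked arrays, sketched here as an alternative rather than as part of the original answer, is to divide selected sums by selected counts. It assumes Y_mat contains only 0 and 1, uses a fixed copy of the Y_mat printed above so the result is reproducible, and selected_mean is a hypothetical helper name. Using np.divide with where leaves NaN in empty rows or columns without triggering the division warning.

```python
import numpy as np

R_mat = np.arange(16).reshape(4, 4)
# Fixed copy of the Y_mat printed above, so the result is reproducible.
Y_mat = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])

def selected_mean(values, selector, axis):
    """Mean of values where selector is 1; NaN where nothing is selected."""
    sums = (values * selector).sum(axis=axis)
    counts = selector.sum(axis=axis)
    out = np.full(sums.shape, np.nan)
    # Divide only where at least one element is selected; the rest stays NaN.
    return np.divide(sums, counts, out=out, where=counts > 0)

row_means = selected_mean(R_mat, Y_mat, axis=1)
col_means = selected_mean(R_mat, Y_mat, axis=0)
print(row_means)
print(col_means)
```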

