Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error

curve_fit expects the model function to return a 1D array, so the output of twoD_Gaussian needs to be flattened. You can do that by adding a .ravel() onto the end of the last line, like this:

def twoD_Gaussian(xy, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    x, y = xy
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp( - (a*((x-xo)**2) + 2*b*(x-xo)*(y-yo)
                                      + c*((y-yo)**2)))
    return g.ravel()

You'll obviously need to reshape the output back to 2D for plotting, e.g.:

import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt

# Create x and y indices
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
x, y = np.meshgrid(x, y)

# create data
data = twoD_Gaussian((x, y), 3, 100, 100, 20, 40, 0, 10)

# plot twoD_Gaussian data generated above
plt.figure()
plt.imshow(data.reshape(201, 201))
plt.colorbar()

Do the fitting as before:

# add some noise to the data and try to fit the data generated beforehand
initial_guess = (3,100,100,20,40,0,10)

data_noisy = data + 0.2*np.random.normal(size=data.shape)

popt, pcov = opt.curve_fit(twoD_Gaussian, (x, y), data_noisy, p0=initial_guess)

And plot the results:

data_fitted = twoD_Gaussian((x, y), *popt)

fig, ax = plt.subplots(1, 1)
# ax.hold(True)  # only needed for very old Matplotlib versions; hold() was deprecated and later removed
ax.imshow(data_noisy.reshape(201, 201), cmap=plt.cm.jet, origin='lower',
          extent=(x.min(), x.max(), y.min(), y.max()))
ax.contour(x, y, data_fitted.reshape(201, 201), 8, colors='w')
plt.show()

(Image: white contours of the fitted Gaussian overlaid on the noisy data)
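If you also want rough one-sigma uncertainties on the fitted parameters, the usual trick of taking the square root of the diagonal of pcov works here too (a small addition, not part of the original answer):

# one-sigma parameter uncertainties, assuming curve_fit returned a valid
# (finite) covariance matrix; the order matches the model's parameter list
perr = np.sqrt(np.diag(pcov))
print(perr)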

Determining initial parameters for a 2D Gaussian fit

If I use the runnable code from the linked question and substitute your definition of initial_guess:

mean_gauss_x = sum(x * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_x = np.sqrt(sum(data.reshape(201,201) * (x - mean_gauss_x)**2) / sum(data.reshape(201,201)))

mean_gauss_y = sum(y * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_y = np.sqrt(sum(data.reshape(201,201) * (y - mean_gauss_y)**2) / sum(data.reshape(201,201)))

initial_guess = (np.max(data), mean_gauss_x, mean_gauss_y, sigma_gauss_x, sigma_gauss_y,0,10)

Then

print(initial_guess)

yields

(13.0, array([...]), array([...]), array([...]), array([...]), 0, 10)

Notice that some of the values in initial_guess are arrays. The optimize.curve_fit function expects initial_guess to be a tuple of scalars. This is the source of the problem.
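A quick way to spot the offending entries (an illustrative check, not from the original post) is to test each element of the guess:

# True for plain scalars (and 0-d arrays), False for array-valued entries
print([np.isscalar(v) or np.ndim(v) == 0 for v in initial_guess])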


The error message

ValueError: setting an array element with a sequence

often arises when an array-like is supplied where a single scalar value is expected, for example when a 1D array is passed to a function (or assigned to an array slot) that expects a scalar. It is a hint that some array in your code has the wrong number of dimensions.
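Here is a minimal, self-contained reproduction of that error (illustrative only, not taken from the question):

import numpy as np

a = np.zeros(3)
try:
    # trying to stuff a 2-element array into a single scalar slot
    a[0] = np.array([1.0, 2.0])
except ValueError as exc:
    print(exc)  # setting an array element with a sequence. ...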


Let's look at this piece of code taken from the linked question:

x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)

x and y are 1D arrays, while X and Y are 2D arrays. (I've capitalized all 2D arrays to help distinguish them from 1D arrays).

Now notice that Python sum and NumPy's sum method behave differently when applied to 2D arrays:

In [146]: sum(X)
Out[146]:
array([ 0., 201., 402., 603., 804., 1005., 1206., 1407.,
1608., 1809., 2010., 2211., 2412., 2613., 2814., 3015.,
...
38592., 38793., 38994., 39195., 39396., 39597., 39798., 39999.,
40200.])

In [147]: X.sum()
Out[147]: 4040100.0

The Python sum function is equivalent to

total = 0
for item in X:
    total += item

Since X is a 2D array, the loop for item in X is iterating over the rows of X. Each item is therefore a 1D array representing a row of X. Thus, total ends up being a 1D array.

In contrast, X.sum() sums all the elements in X and returns a scalar.
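To make the difference concrete, here is a small check (an illustrative sketch): the builtin sum of a 2D array equals NumPy's column-wise sum, while the .sum() method collapses everything to one scalar.

import numpy as np

X = np.arange(12).reshape(3, 4)
print(np.array_equal(sum(X), X.sum(axis=0)))  # True: builtin sum adds the rows together
print(X.sum())                                # 66: every element summed to one scalar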

Since initial_guess should be a tuple of scalars, replace every use of the builtin sum with the NumPy sum method. For example, replace

mean_gauss_x = sum(x * data) / sum(data)

with

mean_gauss_x = (X * DATA).sum() / (DATA.sum())

import numpy as np
import scipy.optimize as optimize
import matplotlib.pyplot as plt

# define model function and pass independent variables x and y as a list
def twoD_Gaussian(data, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    X, Y = data
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta) ** 2) / (2 * sigma_x ** 2) + (np.sin(theta) ** 2) / (2 * sigma_y ** 2)
    b = -(np.sin(2 * theta)) / (4 * sigma_x ** 2) + (np.sin(2 * theta)) / (4 * sigma_y ** 2)
    c = (np.sin(theta) ** 2) / (2 * sigma_x ** 2) + (np.cos(theta) ** 2) / (2 * sigma_y ** 2)
    g = offset + amplitude * np.exp(
        -(a * ((X - xo) ** 2) + 2 * b * (X - xo) * (Y - yo) + c * ((Y - yo) ** 2))
    )
    return g.ravel()

# Create x and y indices
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)

# create data
data = twoD_Gaussian((X, Y), 3, 100, 100, 20, 40, 0, 10)
data_noisy = data + 0.2 * np.random.normal(size=data.shape)
DATA = data.reshape(201, 201)

# add some noise to the data and try to fit the data generated beforehand
mean_gauss_x = (X * DATA).sum() / (DATA.sum())
sigma_gauss_x = np.sqrt((DATA * (X - mean_gauss_x) ** 2).sum() / (DATA.sum()))

mean_gauss_y = (Y * DATA).sum() / (DATA.sum())
sigma_gauss_y = np.sqrt((DATA * (Y - mean_gauss_y) ** 2).sum() / (DATA.sum()))

initial_guess = (
    np.max(data),
    mean_gauss_x,
    mean_gauss_y,
    sigma_gauss_x,
    sigma_gauss_y,
    0,
    10,
)
print(initial_guess)
# (13.0, 100.00000000000001, 100.00000000000001, 57.106515650488404, 57.43620227324201, 0, 10)
# initial_guess = (3,100,100,20,40,0,10)

popt, pcov = optimize.curve_fit(twoD_Gaussian, (X, Y), data_noisy, p0=initial_guess)

data_fitted = twoD_Gaussian((X, Y), *popt)

fig, ax = plt.subplots(1, 1)
ax.imshow(
    data_noisy.reshape(201, 201),
    cmap=plt.cm.jet,
    origin="lower",
    extent=(X.min(), X.max(), Y.min(), Y.max()),
)
ax.contour(X, Y, data_fitted.reshape(201, 201), 8, colors="w")
plt.show()

Scipy Curve Fit: Result from function call is not a proper array of floats.

data_array is (2, 2400, 2400) float64 (from added print)

testmap is (2400, 2400) float64 (again a diagnostic print)

The curve_fit docs specify M-length or (k, M) arrays for the independent data.

You are providing arrays of shape (2, N, N) and (N, N).

Let's try flattening the N, N dimensions:

In the objective function:

def twoD_Gaussian(data_list, amplitude, xo, yo, sigma_x, sigma_y, offset):
    x = data_list[0]
    y = data_list[1]
    x = x.reshape(2400, 2400)
    y = y.reshape(2400, 2400)
    theta = 0  # don't care about theta for the moment but want to leave the option in
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp( - (a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
    return g.ravel()

and in the calls:

testmap = twoD_Gaussian(data_array.reshape(2,-1), initial_guess[0], initial_guess[1], initial_guess[2], initial_guess[3], initial_guess[4], initial_guess[5])
# shape (5760000,) float64
print(type(testmap),testmap.shape, testmap.dtype)
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array.reshape(2,-1), testmap, p0=initial_guess)

And it runs:

1624:~/mypy$ python3 stack65587542.py 
(2, 2400, 2400) float64
<class 'numpy.ndarray'> (5760000,) float64

popt and pcov:

[-3.0e+00  1.2e+03  1.2e+03  1.0e+02  1.0e+02 -1.0e+00] 
[[ 0. -0. -0. 0. 0. -0.]
[-0. 0. -0. -0. -0. -0.]
[-0. -0. 0. -0. -0. -0.]
[ 0. -0. -0. 0. 0. 0.]
[ 0. -0. -0. 0. 0. 0.]
[-0. -0. -0. 0. 0. 0.]]

The popt values are the same as initial_guess as expected with the exact testmap.
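As a further sanity check (a sketch reusing the names above, with an assumed noise level), you can perturb testmap with noise and refit; popt should then come back close to the generating parameters rather than exactly equal to them:

# add noise to the exact model output and refit
rng = np.random.default_rng(0)
testmap_noisy = testmap + 0.05 * rng.normal(size=testmap.shape)
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array.reshape(2, -1),
                           testmap_noisy, p0=initial_guess)
print(popt)  # should stay near initial_guess, which generated testmap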

So the basic issue is that you did not take the documented specifications seriously. That

ValueError: object too deep for desired array

error message is a bit obscure, though I vaguely recall seeing it before. Sometimes we get errors like this when inputs are ragged arrays and the resulting array has object dtype. But here it's simply a matter of shape.
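A lightweight guard against this class of error (a sketch using the names from the snippet above) is to assert the documented shapes right before calling curve_fit:

# xdata must be (k, M) for k independent variables; ydata must be (M,)
xdata = data_array.reshape(2, -1)
ydata = testmap.ravel()
assert xdata.ndim == 2 and ydata.ndim == 1
assert xdata.shape[1] == ydata.shape[0]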

Past SO questions with a similar problem and fix:

Scipy curve_fit for Two Dimensions Not Working - Object Too Deep?

ValueError When Performing scipy.stats test on Pandas Column Selection by Row

Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error

This is just a subset of the SO questions with the same error message; other scipy functions produce it too, and often the problem is with shapes like (m, 1) instead of (N, N). I'd be tempted to close this as a duplicate, but my long answer with the debugging details may be instructive.


