Scipy Curve_Fit Doesn't Like Math Module

scipy curve_fit doesn't like math module

Be careful with numpy-arrays, operations working on arrays and operations working on scalars!

Scipy optimize assumes the input (initial-point) to be a 1d-array and often things go wrong in other cases (a list for example becomes an array and if you assumed to work on lists, things go havoc; those kind of problems are common here on StackOverflow and debugging is not that easy to do by the eye; code-interaction helps!).

import numpy as np
import math

x = np.ones(1)

np.sin(x)
> array([0.84147098])

math.sin(x)
> 0.8414709848078965 # this only works as numpy has dedicated support
# as indicated by the error-msg below!
x = np.ones(2)

np.sin(x)
> array([0.84147098, 0.84147098])

math.sin(x)
> TypeError: only size-1 arrays can be converted to Python scalars

To be honest: this is part of a very basic understanding of numpy and should be understood when using scipy's somewhat sensitive functions.

scipy curve_fit doesn't work well

First of all, curve fitting is not a magical device that creates a good curve for any given data set. You can't fit an exponential curve well to a logarithmic data set. If you look at your data, does it look like it is well described by the function you define? Doesn't it rather look like an overlay of a linear and a sine function?

Then curve fitting is an iterative process, that is highly dependent on start values. From the scipy manual:

p0 : None, scalar, or N-length sequence, optional
Initial guess for the parameters. If None, then the initial values will all be 1

Why not provide a better guess for p0?

Last but not least, you get back two arrays. I would read out both, even if you only need one. It simplifies your code.
Try

p0 = (10, 20, 20, 1.5)
res, _popcv = scipy.optimize.curve_fit(fseries, x, y, p0, maxfev=10000)
xt = np.linspace(0, 10, 100)
yt = fseries(xt, *res)

and you get already a better fit. Sample Image
You can improve the fit further, when you define a better fit function with

def fseries(x, a0, a1, b1, w):
f = a0 * x + (a1 * np.cos(w * x)) + (b1 * np.sin(w * x))
return f

Sample Image

Whether this function is useful, you have to decide. Just because it fits better the data set, doesn't mean it is the right descriptor in your situation.

scipy curve_fit fails on easy linear fit?

NaN values will produce meaningless results - you need to exclude them from your data before doing any fitting. You use boolean indexing to do this:

valid = ~(np.isnan(y1) | np.isnan(y2))
popt, pcov = scop.curve_fit(f, y2[valid], y1[valid])

As mentioned in the comments, in versions of scipy newer than 0.15.0 curve_fit will automatically check for NaNs and Infs in your input arrays and will raise a ValueError if they are found. This behavior can be optionally disabled using the check_finite parameter.

Based on your question and comments, I'm assuming you must be using an older version - you should probably consider upgrading.

scipy curve_fit not working correctly

Using one of the ideas in the comments I got it to work:

from scipy.optimize import curve_fit
import matplotlib.pyplot as pyplot
import numpy as np

data = np.loadtxt(open("scipycurve.csv", "rb"), delimiter=",", skiprows=1)
xdata = data[:,0]
ydata = data[:,1]

def func(x, k, s, u):
x=np.array(x)
return k * (1 / (x * s * np.sqrt(2*np.pi))) * np.exp( - np.power((np.log(x)-u),2) / (2*np.power(s , 2)))

p0 = [1000,1,10]
popt, pcov = curve_fit(func, xdata, ydata, p0)

pyplot.figure()
pyplot.plot(xdata, ydata, label='Data', marker='o')
pyplot.plot(xdata, func(xdata, popt[0], popt[1], popt[2]), 'g--')
pyplot.show()

print (popt)

plot of data and fitted curve

[ 6.84279941e+07 5.09882839e-01 1.05414859e+01]

Hope it helps. Just looks like the algorithm needs some help in this case by giving it parameters.

Fit experimental data and obtain 2 parameters

The problem you experience is that you use the functions of the math module (they work on scalars) instead of the numpy functions that are designed to work on arrays. A simplified version of your problem (I am too lazy to go through all the functions you declared) would be:

from math import cos
import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
import pandas as pd

#works perfectly fine with the numpy arrays from the pandas data frame
def fun1(x, a, b, c, d):
return a * np.cos(b * x / 1000 + c) + d

#this version doesn't work because math works only on scalars
#def fun1(x, a, b, c, d):
# return a * cos(b * x / 1000 + c) + d

#read data from file you provided
mydata = pd.read_csv("test.txt", names = ["spin", "energy"])

#curve fit
params, extras = curve_fit(fun1,mydata.spin,mydata.energy)
print(params)

#generation of the fitting curve
x_fit = np.linspace(np.min(mydata.spin), np.max(mydata.spin), 1000)
y_fit = fun1(x_fit, *params)

#plotting of raw data and fit function
plt.plot(mydata.spin, mydata.energy, "ro", label = "data")
plt.plot(x_fit, y_fit, "b", label = "fit")
plt.legend()
plt.show()

So, the solution is to find all math functions in your script like sin, cos, sqrt and substitute them with their numpy equivalents. Luckily, they are usually simply np.sin etc.

scipy.integrate error:only size-1 arrays can be converted to Python scalars

Two problems in your code:

  • As mentioned, you should use numpy instead of math module.
  • Your f function's argument order is wrong. Scipy assumes that the first variable is x followed by other parameters. So it should be f(x, sigma), not f(sigma, x).

Solution:

import numpy as np
import scipy.integrate as integrate

def f(x, sigma):
return np.exp(-1*(x**2)/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))

def prob_at_nsigma(sigma):
value = 0.
ans, anserr = integrate.quadrature(f, -1, 1, args=(sigma,), maxiter=21)
return ans

print(prob_at_nsigma(1))
# 0.6826894922280757


Related Topics



Leave a reply



Submit