Question

I am fitting a set of experimental data (sample) that covers two different experimental regions, each of which can be expressed with its own mathematical function:

1st region:

y = m*x + c (the slope can be constrained to zero)

2nd region:

y = d*exp(-k*x)

The experimental data is shown below, and I coded the fit in Python as follows:

import numpy as np
from scipy.optimize import curve_fit

def func(x, m, c, d, k):
    return m*x + c + d*np.exp(-k*x)
popt, pcov = curve_fit(func, t, y)

Unfortunately, the fit does not follow my data properly, and the fitted (returned) parameters do not make sense (see picture below).

Any assistance will be appreciated.

Answer 1:

Very interesting question. As said by a_guest, you will have to fit to the two regions separately. However, I think you probably also want the two regions to connect smoothly at the point t0, the point where we switch from one model to the other. In order to do this, we need to add the constraint that y1 == y2 at the point t0.

In order to do this with scipy, look at scipy.optimize.minimize with the SLSQP method. However, I wrote a scipy wrapper to make this kind of thing easier, called symfit. I will show you how to do this with symfit, because I think it's better suited to the task, but with this example you should also be able to implement it with pure scipy if you prefer.

from symfit import parameters, variables, Fit, Piecewise, exp, Eq
import numpy as np
import matplotlib.pyplot as plt

t, y = variables('t, y')
m, c, d, k, t0 = parameters('m, c, d, k, t0')

# Help the fit by bounding the switchpoint between the models
t0.min = 0.6
t0.max = 0.9

# Make a piecewise model
y1 = m * t + c
y2 = d * exp(- k * t)
model = {y: Piecewise((y1, t <= t0), (y2, t > t0))}

# As a constraint, we demand equality between the two models at the point t0
# to do this, we substitute t -> t0 and demand equality using `Eq`
constraints = [Eq(y1.subs({t: t0}), y2.subs({t: t0}))]

# Read the data
tdata, ydata = np.genfromtxt('Experimental Data.csv', delimiter=',', skip_header=1).T

fit = Fit(model, t=tdata, y=ydata, constraints=constraints)
fit_result = fit.execute()
print(fit_result)

plt.scatter(tdata, ydata)
plt.plot(tdata, fit.model(t=tdata, **fit_result.params).y)
plt.show()
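For comparison, the pure-scipy route mentioned above might look roughly like the sketch below. It is not the symfit implementation, only an illustration: the helper names `piecewise`, `sum_of_squares` and `continuity` are made up here, and the data is synthetic and rescaled rather than the measurements from the question. One caveat: the sum of squares is piecewise constant in t0 for discrete data, so a gradient-based method such as SLSQP tends to move t0 only as far as the continuity constraint requires. Scanning a few fixed values of t0 may be more reliable.

```python
import numpy as np
from scipy.optimize import minimize

def piecewise(params, t):
    """Linear branch for t <= t0, exponential branch for t > t0."""
    m, c, d, k, t0 = params
    return np.where(t <= t0, m * t + c, d * np.exp(-k * t))

def sum_of_squares(params, t, y):
    """Least-squares objective for the piecewise model."""
    return np.sum((piecewise(params, t) - y) ** 2)

def continuity(params):
    """Equals zero when y1(t0) == y2(t0)."""
    m, c, d, k, t0 = params
    return (m * t0 + c) - d * np.exp(-k * t0)

# Synthetic stand-in data (made up; replace with the measured tdata, ydata).
rng = np.random.default_rng(0)
tdata = np.linspace(0.0, 3.0, 120)
true_params = (0.0, 3.0, 3.0 * np.exp(2.0 * 0.8), 2.0, 0.8)
ydata = piecewise(true_params, tdata) + rng.normal(0.0, 0.02, tdata.size)

x0 = (0.0, 2.8, 12.0, 1.8, 0.75)  # rough initial guess (m, c, d, k, t0)
# Same idea as t0.min / t0.max above: keep the switch point in [0.6, 0.9].
bounds = [(None, None), (None, None), (0.0, None), (0.0, None), (0.6, 0.9)]
result = minimize(sum_of_squares, x0, args=(tdata, ydata), method='SLSQP',
                  bounds=bounds,
                  constraints=[{'type': 'eq', 'fun': continuity}])
m, c, d, k, t0 = result.x
print(result.success, result.x)
```

Check `result.success` and `result.message` before trusting the parameters, since SLSQP can stop early on badly scaled problems.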

Answer 2:

Since your data shows different behavior in different regions, you also need to fit the data separately on these regions. That is, instead of fitting a sum of the two models (functions), you should fit the left region with y = m*x + c and, separately, the right region with y = d*exp(-k*x). If you have trouble finding the boundary between the two regions, you could assess it by comparing the goodness of fit for different candidate boundaries.

popt_1, pcov_1 = curve_fit(lambda x, m, c: m*x + c, t[t < 0.8], y[t < 0.8], p0=(1, 0))
popt_2, pcov_2 = curve_fit(lambda x, d, k: d*np.exp(-k*x), t[t >= 0.8], y[t >= 0.8], p0=(400, 1))
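One possible way to compare the goodness of fit across candidate boundaries is sketched below. The helper `total_sse`, the candidate grid and the synthetic data are all assumptions made for illustration. The score used is the combined sum of squared residuals of the two separate fits, which is only one of several reasonable criteria.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear(x, m, c):
    return m * x + c

def decay(x, d, k):
    return d * np.exp(-k * x)

def total_sse(t, y, boundary):
    """Fit both regions separately and return the combined sum of squared residuals."""
    left, right = t < boundary, t >= boundary
    p1, _ = curve_fit(linear, t[left], y[left], p0=(0.0, y[left].mean()))
    p2, _ = curve_fit(decay, t[right], y[right], p0=(y[right][0], 1.0))
    r1 = y[left] - linear(t[left], *p1)
    r2 = y[right] - decay(t[right], *p2)
    return np.sum(r1 ** 2) + np.sum(r2 ** 2)

# Synthetic stand-in data (made up; replace with the measured t, y).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 3.0, 150)
y = np.where(t < 0.8, 300.0, 300.0 * np.exp(-2.0 * (t - 0.8)))
y = y + rng.normal(0.0, 2.0, t.size)

# Try a grid of candidate boundaries and keep the one with the smallest error.
candidates = np.arange(0.5, 1.21, 0.05)
scores = [total_sse(t, y, b) for b in candidates]
best_boundary = candidates[int(np.argmin(scores))]
print(best_boundary)
```

Because every candidate is scored on the same data points, the raw sums are directly comparable; no normalization by region size is needed for this purpose.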

Edit

Example code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


df = pd.read_csv('test.csv', index_col=None)
t = df.t.values
y = df.Y.values

boundary = t[y.argmax()]
t1 = t[t < boundary]
y1 = y[t < boundary]
t2 = t[t >= boundary]
y2 = y[t >= boundary]

f1 = lambda x, m, c: m*x + c
f2 = lambda x, d, k: d*np.exp(-k*x)
popt_1, pcov_1 = curve_fit(f1, t1, y1, p0=((y1[-1] - y1[0]) / (t1[-1] - t1[0]), y1[0]))
popt_2, pcov_2 = curve_fit(f2, t2, y2, p0=(y2[0], 1))

plt.title('Fitted data on two different domains')
plt.xlabel('t [a.u.]')
plt.ylabel('y [a.u.]')
plt.plot(t, y, '-o', label='Data')
plt.plot(t1, f1(t1, *popt_1), '--', color='#ff7f0e', lw=3, label='Fit')
plt.plot(t2, f2(t2, *popt_2), '--', color='#ff7f0e', lw=3, label='_nolegend_')
plt.grid()
plt.legend()
plt.show()

Which produces the following plot:

Note that the resulting "compound" function is not continuous at the boundary. If that is undesired, you can resolve it by fixing one of the fit parameters (e.g. k) before fitting the other domain (one way or the other). Alternatively, you could fit both regions separately, then determine the value at the boundary as the average of the two separate functions (i.e. y_b = (f1(t1[-1], *popt_1) + f2(t2[0], *popt_2)) / 2), and then repeat the fitting while constraining the parameters such that this boundary condition is fulfilled.

For example, fitting the linear function first and then fixing the d parameter of the exponential gives a continuous transition at the boundary (note that the linear function f1 is extrapolated outside its domain to t2[0] in order to ensure the continuity):

f1 = lambda x, m, c: m*x + c
popt_1, pcov_1 = curve_fit(f1, t1, y1, p0=((y1[-1] - y1[0]) / (t1[-1] - t1[0]), y1[0]))

d = f1(t2[0], *popt_1)
f2 = lambda x, k: d*np.exp(-k*(x - boundary))
popt_2, pcov_2 = curve_fit(f2, t2, y2, p0=(1,))

Which produces the following plot:
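The other alternative mentioned earlier, averaging the two independent fits at the boundary and then refitting with that value pinned, could be sketched as follows. This is only an illustration under assumptions: the data is synthetic, the boundary is taken as known, and both functions are evaluated at `boundary` itself rather than at t1[-1] and t2[0]. The names `g1` and `g2` are made up for the reparameterized models.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic stand-in data (made up; replace with the measured t, y).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 3.0, 150)
y = np.where(t < 0.8, 300.0, 300.0 * np.exp(-2.0 * (t - 0.8)))
y = y + rng.normal(0.0, 2.0, t.size)

boundary = 0.8  # assumed known for this sketch
t1, y1 = t[t < boundary], y[t < boundary]
t2, y2 = t[t >= boundary], y[t >= boundary]

# Step 1: fit both regions independently.
f1 = lambda x, m, c: m*x + c
f2 = lambda x, d, k: d*np.exp(-k*x)
popt_1, _ = curve_fit(f1, t1, y1, p0=(0.0, y1[0]))
popt_2, _ = curve_fit(f2, t2, y2, p0=(y2[0], 1.0))

# Step 2: average the two predictions at the boundary.
y_b = (f1(boundary, *popt_1) + f2(boundary, *popt_2)) / 2

# Step 3: refit with both models forced through (boundary, y_b),
# which leaves one free parameter per region.
g1 = lambda x, m: y_b + m*(x - boundary)
g2 = lambda x, k: y_b*np.exp(-k*(x - boundary))
(m_fit,), _ = curve_fit(g1, t1, y1, p0=(popt_1[0],))
(k_fit,), _ = curve_fit(g2, t2, y2, p0=(popt_2[1],))
print(y_b, m_fit, k_fit)
```

In the original parameterization, the refitted values correspond to c = y_b - m_fit*boundary and d = y_b*np.exp(k_fit*boundary).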

Answer 3:

If you would prefer to use a single equation, I found that the Hockett-Sherby equation "y = b - (b-a) * exp(-c * (x**d))" seems like an OK fit to your data, yielding an R-squared of 0.99 and an RMSE of 11.2 with parameters a = 1.1262189756312683E+01, b = 3.2040596733114870E+02, c = 3.9385197507261771E-01, and d = -4.7723382040098095E+00.
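To see how this single equation behaves, it can be evaluated with the quoted parameters, as in the minimal sketch below. The function name `hockett_sherby` and the sample x values are assumptions for illustration. Because d is negative, x must be strictly positive; the curve then starts near b for small x and decays toward a for large x.

```python
import numpy as np

def hockett_sherby(x, a, b, c, d):
    """y = b - (b - a) * exp(-c * x**d), for x > 0."""
    return b - (b - a) * np.exp(-c * x ** d)

# Parameters as quoted in the answer above.
params = dict(a=1.1262189756312683e+01,
              b=3.2040596733114870e+02,
              c=3.9385197507261771e-01,
              d=-4.7723382040098095e+00)

x = np.linspace(0.2, 3.0, 8)  # arbitrary sample points, all > 0
ys = hockett_sherby(x, **params)
print(ys)
```

The same function can be passed to `scipy.optimize.curve_fit` with these values as `p0` to refit it against the measured data.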

Source: https://stackoverflow.com/questions/52030866/fitting-of-experimental-data-within-two-different-regions

Tags

python

scipy

curve-fitting

Fitting of experimental data within two different regions
