Gaussian fit with a gap in the data

Question

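I want to fit a Gaussian function to an absorption band in a reflectance spectrum. The problem is that, in this part of the spectrum, there is a gap in my data (the blank region in the figure), roughly between 0.85 and 1.3 µm. I really need this Gaussian, because afterwards I want to compute spectral parameters such as the full width at half maximum (FWHM) and the area occupied by the Gaussian.

This is the function I use to perform the Gaussian fit (credit to this post):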

def gauss(x, H, A, x0, sigma):
    return H + A * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

def gauss_fit(x, y):
    mean = sum(x * y) / sum(y)
    sigma = np.sqrt(sum(y * (x - mean) ** 2) / sum(y))
    popt, pcov = scipy.optimize.curve_fit(gauss, x, y, p0=[min(y), max(y), mean, sigma])
    return popt

How I apply it:

H, A, x0, sigma = gauss_fit(x, y)

And how I plot it:

plt.plot(x, y, '.k', label='data')
plt.plot(x, gauss(x, *gauss_fit(x, y)), '-r', label='fit')
plt.xlabel('Wavelength (µm)')
plt.ylabel('Reflectance')
plt.legend()

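This is what I get: [image: Gaussian fit with a gap in the data]

As you can see, the fit seems to work fine until it reaches the gap.

It would really help me to get a clean Gaussian fit, with bonus points if I can then easily compute the FWHM and the area of the Gaussian.

Please note that the solution should not be too case-specific, because I have a large number of datasets to process. It should therefore be usable inside a loop.

The only post I could find that discusses this problem is this one, but it did not give me a satisfactory solution.

This is my first post on Stack Overflow, so feel free to ask for any additional information if needed.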

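Edit 1

I think the answers from Tino D and lastchance solved the main problem in my code, which was that I defined a regular Gaussian fit instead of a subtracted one. I obtained this new fit.

However, as you can see, it is still not perfect, and strangely it goes above y=1 on the right side of the fit. I think the problem now lies in my data itself, so here it is, as requested.

I use pandas to manage my data, so the file is in pickle format, because I store entire NumPy arrays in single DataFrame cells. For this fit we only need the "Wavelength" and "Reflectance Normalized" columns. We also only need part of the spectrum, so this is the code I use to select the part I need: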

#df_merged is the name of the DataFrame

R0700=FindNearest(df_merged.at[0,'Wavelength'], df_merged.at[0,'Reflectance Normalized'], 0.7, show=True)
R1800=FindNearest(df_merged.at[0,'Wavelength'], df_merged.at[0,'Reflectance Normalized'], 1.8, show=True)

x=df_merged.at[0,'Wavelength'][R0700[1]:R1800[1]]
y=df_merged.at[0,'Reflectance Normalized'][R0700[1]:R1800[1]]

FindNearest is one of my own functions; I use it to find a specific value in an array. It is defined as follows:

def FindNearest(wl, rf, x, show=False):      # wl = wavelength array ; rf = reflectance array ; x = value to find
    
    wl_dif=np.abs(wl-x)
    idx=np.argmin(wl_dif)
    
# Get reflectance and wavelength values
    
    wl_x=wl[idx]
    rf_x=rf[idx]
    
    if show:
        print("Nearest wavelength :", wl_x, ", Index :", idx, ", Corresponding reflectance :", rf_x)

    
# Values : [0] = nearest wavelength value, [1] = position of nearest value in array 
# (both reflectance and wavelength since same size), [2] = nearest reflectance value
    
    return(wl_x, idx, rf_x)

We are almost there, many thanks to you two!

Answer 1

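As lastchance said and as my comment hinted, you are fitting the wrong curve (or your initial guess is incompatible). The function you wrote is a regular Gaussian, not a subtracted one. So you can use the following: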

import numpy as np
# Jupyter magic; remove the next line when running as a plain script
%matplotlib notebook
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def gauss(x, H, A, x0, sigma):
    return H - A * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))
def gauss_fit(x, y):
    mean = sum(x * y) / sum(y)
    sigma = np.sqrt(sum(y * (x - mean) ** 2) / sum(y))
    popt, pcov = curve_fit(gauss, x, y, p0=[min(y), max(y), mean, sigma])
    return popt
# generate toy dataset
x = np.linspace(0.5, 2, 100)
H, A, x0, sigma = 1, 0.5, 1.2, 0.3
y = gauss(x, H, A, x0, sigma) + 0.005 * np.random.normal(size=x.size)
# generate gap in data
gapStart = int(0.1 * len(x))
gapEnd = int(0.4 * len(x))
xGap = np.concatenate([x[:gapStart], x[gapEnd:]])
yGap = np.concatenate([y[:gapStart], y[gapEnd:]])
popt = gauss_fit(xGap, yGap)
plt.figure()
plt.scatter(xGap, yGap, label='Gap in data', color='blue')
plt.plot(x, gauss(x, *popt), label='Fit', color='red', linestyle='--')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title("30% missing")
plt.grid()

This produces the following result:

Now, I did generate toy data, but the idea stays the same. As for the calculations:

H_fit, A_fit, x0_fit, sigma_fit = popt
'''
full width at half maximum taken from here and area calculation here:
https://www.physicsforums.com/threads/area-under-gaussian-peak-by-easy-measurements.419285/
'''
FWHM = 2.35 * sigma_fit 
Area = H_fit*FWHM/(2.35*0.3989)
print(f"FWHM = {FWHM}")
print(f"Area = {Area}")

Result:

FWHM = 0.7030608784583746
Area = 0.7486638847495126
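
One caveat on the area line: it multiplies by the fitted baseline `H_fit`, whereas in the model `H - A * exp(...)` the depth of the absorption band is `A_fit`. If the quantity of interest is the area of the dip itself, it would be A·σ·√(2π). Below is a minimal sketch under that reading, using the exact constants rather than the rounded 2.35 and 0.3989. The helper name `band_metrics` is hypothetical and not part of the code above.

```python
import numpy as np

def band_metrics(A, sigma):
    """FWHM and area of the dip A * exp(-(x - x0)**2 / (2 * sigma**2))."""
    # curve_fit can return a negative sigma, since only sigma**2 enters the model
    sigma = abs(sigma)
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma  # about 2.3548 * sigma
    area = A * sigma * np.sqrt(2.0 * np.pi)          # integral of the dip over all x
    return fwhm, area

# Evaluated on the toy parameters used to generate the data (A = 0.5, sigma = 0.3)
fwhm, area = band_metrics(A=0.5, sigma=0.3)
print(f"FWHM = {fwhm:.4f}")
print(f"Area = {area:.4f}")
```

With fitted values, the call would be `band_metrics(A_fit, sigma_fit)`.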

Hope this helps!
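
Since the question asks for something that runs in a loop and mentions the fit rising above y=1, here is a rough sketch of how the same subtracted Gaussian could be wrapped for batch use. It is not the original answer's method: the function name `fit_band`, the optional `h_max` cap on the baseline, the data-driven starting values, and the bounds are all assumptions for illustration. Gap samples are assumed to be either absent or stored as NaN.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, H, A, x0, sigma):
    # Baseline H minus a Gaussian dip of depth A centred on x0
    return H - A * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

def fit_band(x, y, h_max=np.inf):
    """Fit one absorption band; non-finite samples (e.g. a NaN gap) are dropped."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = np.isfinite(x) & np.isfinite(y)
    x, y = x[ok], y[ok]
    span = x.max() - x.min()
    # Starting values read off the data: baseline near the highest sample,
    # dip centred on the lowest sample, width a quarter of the window
    p0 = [min(y.max(), h_max), y.max() - y.min(), x[np.argmin(y)], span / 4]
    # Assumed bounds: depth >= 0, centre inside the window, 0 < sigma <= span
    lower = [-np.inf, 0.0, x.min(), 1e-3 * span]
    upper = [h_max, np.inf, x.max(), span]
    popt, _ = curve_fit(gauss, x, y, p0=p0, bounds=(lower, upper))
    return popt

# Toy batch: three noisy spectra with a gap between 0.85 and 1.3
rng = np.random.default_rng(0)
x = np.linspace(0.5, 2, 100)
spectra = []
for _ in range(3):
    y = gauss(x, 1, 0.5, 1.2, 0.3) + 0.005 * rng.normal(size=x.size)
    y[(x > 0.85) & (x < 1.3)] = np.nan
    spectra.append((x, y))

results = [fit_band(xi, yi) for xi, yi in spectra]
for H, A, x0, sigma in results:
    print(f"H={H:.3f}  A={A:.3f}  x0={x0:.3f}  sigma={sigma:.3f}")
```

For normalized reflectance, passing `h_max=1.0` would keep the fitted baseline from exceeding 1. Whether that constraint is physically appropriate depends on how the spectra were normalized.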
