Scipy curve_fit 给出幂律错误的结果

Question

我有网络中节点度数列表，我想将幂律分布拟合到该经验度分布。我可以使用 powerlaw 库来做到这一点，但由于某种原因，在尝试使用 scipy curve_fit 函数时会得到不好的结果。

我有一个名为

degree

的网络的度列表。我能够通过幂律库将幂律分布拟合到这些度数。我这样做：

import powerlaw as pwl

fit_function = pwl.Fit(degree,xmin=1)

alpha = fit_function.power_law.alpha
sigma = fit_function.power_law.sigma
xmin = fit_function.power_law.xmin

print("alpha = ", alpha)

我最终得到 alpha = 2.5，这恰好在社交网络的预期范围内。然后，我可以根据这个拟合分布在对数对数图上绘制数据，并获得漂亮的结果：

def powerlaw(x,alpha):
    y = (alpha - 1) *(x)**(-alpha)
    return y

x = np.linspace(min(unique_degrees), max(unique_degrees), 1000)
y = powerlaw(x, alpha)

plt.loglog(unique_degrees, counts_pdf, 'o', label='Degree Distribution')
plt.loglog(x, y, c='r', label='Fitted Power Law')

plt.legend()
plt.show()

现在，当我使用 scipy

curve_fit

问题尝试时：

def powerlaw(x,alpha):
    y = (alpha - 1) *(x)**(-alpha)
    return y

bounds = ([2, 3])
p0 = [2.5]
params, _ = curve_fit(powerlaw,unique_degrees, counts_pdf,p0=p0,bounds=bounds)

print("alpha = ", params[0]

当我无限制地执行此操作时，我得到的值为 alpha = 1.46。当我包含界限时，假设下限始终大于 1.46，则我的 alpha 始终等于该下限。我尝试在对数图上绘制数据，但结果看起来不太好：

x = np.linspace(min(unique_degrees), max(unique_degrees), 1000)
y1 = powerlaw(x,\*params)

plt.loglog(unique_degrees, counts_pdf, 'o', label='Degree Distribution')
plt.loglog(x, y1, c='g', label='Fitted Power Law 1')

plt.legend()
plt.show()

我做错了什么吗？

Answer 1

当您以非线性方式制定回归问题时，您可能会在误差函数上形成多个最小值以使其最小化。

以下是此类情况的示例：

MCVE

让我们创建一个合理的数据集：

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize

def model(x, alpha):
    return (alpha - 1.) * np.power(x[:, 0], -alpha)

# Generate synthetic data with proportional noise
np.random.seed(12345)
X = np.linspace(1, 50, 30).reshape(-1, 1)
y = model(X, 2.5)
s = 0.05 * np.abs(y)
yn = y + s * np.random.normal(size=y.shape)

如果我们不加预防地适应，我们会得到：

# Optimization with default initial guess (unitary)
popt, pcov = optimize.curve_fit(model, X, yn, sigma=s, absolute_sigma=True)
# (array([1.00692318]), array([[6.30590102e-09]]))

如果我们绘制误差函数，我们会看到 NLLS 公式中存在两个最小值：

为了达到最佳效果，将初始参数移至峰值右侧就足够了：

# Optimization with educated guess:
popt, pcov = optimize.curve_fit(model, X, yn, sigma=s, p0=(3.,), absolute_sigma=True)
# (array([2.49840598]), array([[1.36086962e-05]]))

Scipy curve_fit 给出幂律错误的结果

问题描述投票：0回答：1

1个回答

MCVE

最新问题

Scipy curve_fit 给出幂律错误的结果

问题描述 投票：0回答：1

1个回答

MCVE

最新问题

问题描述投票：0回答：1