Why does this gap appear in my plot when fitting a Weibull distribution with a 3-parameter pdf function?


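I want to fit data to a 3-parameter Weibull distribution, but in my plot there is always a gap between the original data and the fitted values. Am I doing something wrong? And why do I get such a poor fit when I use initial_params = [1,1,1]?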

Here is my code:

import numpy as np  
from scipy.optimize import minimize  
from scipy.special import gamma  
import matplotlib.pyplot as plt  
from scipy.stats import weibull_min

def weibull_log_likelihood(params, data):
    shape, scale, loc = params
    log_likelihood = -np.sum(weibull_min.logpdf(data, shape, loc=loc, scale=scale))
    return -log_likelihood  

def estimate_weibull_params(data):
    initial_params = [shape, scale, loc] 
    bounds=[(1, None), (1, None), (1, None)]
    result = minimize(weibull_log_likelihood, initial_params, args=(data,), method='nelder-mead', bounds=bounds) 
    return result.x  

def weibull_pdf(x, shape, scale, loc):
    return (shape / scale) * ((x - loc) / scale) ** (shape - 1) * np.exp(-((x - loc) / scale) ** shape)

shape = 7.5
scale = 150
loc = 350
size = 100
data = weibull_min.rvs(shape, loc=loc, scale=scale, size=size)
   
estimated_params = estimate_weibull_params(data)
shape, scale, loc = estimated_params

print(f"Estimated Parameters: Shape = {shape}, Scale = {scale}, Location = {loc}")

x = np.arange(1000)
pdf = weibull_pdf(x, shape, scale, loc)  
plt.hist(data, bins=20, density=True, alpha=0.6, color='g')  
plt.plot(x, pdf, 'r-', lw=2)  
plt.xlim(0, 1000)  
plt.show()  


python weibull
1 Answer

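The Weibull distribution is a rather extreme distribution. It involves powers of powers, so its values can get out of hand very quickly.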

If you change your log_likelihood function so that it is effectively log(-log(pdf)), you may get better results:

def weibull_log_likelihood(params, data):
    shape, scale, loc = params
    log_minus_log_likelihood = np.sum(np.log( -weibull_min.logpdf(data, shape, loc=loc, scale=scale)))
    return log_minus_log_likelihood
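
For comparison, here is a minimal sketch of the conventional objective. `minimize` needs the negative log-likelihood, that is, the summed log-pdf negated once. The function posted in the question negates twice, so it returns the log-likelihood itself and the optimizer is driven toward a poor fit rather than a good one. The names `question_objective` and `neg_log_likelihood` and the sample values are illustrative only and are not taken from the original post.

```python
import numpy as np
from scipy.stats import weibull_min

def question_objective(params, data):
    # Same arithmetic as the function in the question: the two negations cancel,
    # so this returns the log-likelihood rather than its negative.
    shape, scale, loc = params
    log_likelihood = -np.sum(weibull_min.logpdf(data, shape, loc=loc, scale=scale))
    return -log_likelihood

def neg_log_likelihood(params, data):
    # Conventional objective for a minimizer: negate the summed log-pdf once.
    shape, scale, loc = params
    return -np.sum(weibull_min.logpdf(data, shape, loc=loc, scale=scale))

# Hypothetical sample values, chosen only for illustration.
sample = np.array([420.0, 470.0, 500.0, 530.0])
true_params = (7.5, 150.0, 350.0)  # parameters used to generate data in the question
far_params = (7.5, 150.0, 100.0)   # deliberately poor location

print(question_objective(true_params, sample), neg_log_likelihood(true_params, sample))
print(neg_log_likelihood(far_params, sample))
```

A smaller negative log-likelihood means a better fit, so the first value printed by `neg_log_likelihood` should be the smaller of the two.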

I suggest that you also improve your initial guess to something more in keeping with the data:

initial_params = [ 1, np.max( data ) - np.min( data ), np.min( data ) ]
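
A data-driven starting point of this kind could be wired into the fit roughly as follows. This is a sketch under stated assumptions, not the answer's exact method: it uses the plain negative log-likelihood rather than the log(-log(pdf)) objective, it nudges the starting `loc` slightly below the smallest observation, and it bounds `loc` from above so every data point stays inside the support. The helper names, the 1% offset, and the small margins are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def neg_log_likelihood(params, data):
    # Plain negative log-likelihood (an assumption; not the answer's objective).
    shape, scale, loc = params
    return -np.sum(weibull_min.logpdf(data, shape, loc=loc, scale=scale))

def initial_guess(data):
    # Shape 1 and scale = data range follow the answer's suggestion;
    # loc is placed 1% of the range below the minimum (arbitrary offset).
    spread = data.max() - data.min()
    return [1.0, spread, data.min() - 0.01 * spread]

# Synthetic data with the question's parameters; the seed is arbitrary.
data = weibull_min.rvs(7.5, loc=350, scale=150, size=100, random_state=0)

x0 = initial_guess(data)
# Keep shape and scale positive and keep loc strictly below the smallest observation.
bounds = [(1e-3, None), (1e-3, None), (None, data.min() - 1e-6)]
result = minimize(neg_log_likelihood, x0, args=(data,),
                  method="Nelder-Mead", bounds=bounds)

shape_hat, scale_hat, loc_hat = result.x
print(f"shape={shape_hat:.3f}, scale={scale_hat:.3f}, loc={loc_hat:.3f}")
```

Bounding `loc` below `data.min()` is the design choice that matters here: it prevents the optimizer from wandering into a region where some observations fall outside the support and the log-pdf becomes `-inf`.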

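Note that I don't think this is a particularly good way of finding the parameter loc, because the Weibull distribution involves a term ((x-loc)/scale)^shape, and a free guess at loc can have you computing non-integer powers of negative numbers.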
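
The point about negative bases can be checked directly. In the sketch below, a hand-written formula of the same form as `weibull_pdf` in the question yields NaN wherever x < loc, whereas `weibull_min.pdf` returns 0 outside the support. The question evaluates its formula on `np.arange(1000)`, which includes such points, and matplotlib does not draw NaN values, so this may account for part of the missing curve. The two test points are chosen for illustration.

```python
import numpy as np
from scipy.stats import weibull_min

shape, scale, loc = 7.5, 150.0, 350.0
x = np.array([300.0, 400.0])  # one point below loc, one above

# Hand-written pdf: a negative base raised to a non-integer power gives NaN.
with np.errstate(invalid="ignore"):
    z = (x - loc) / scale
    naive = (shape / scale) * z ** (shape - 1) * np.exp(-(z ** shape))

# SciPy's pdf handles the support itself and returns 0 for x < loc.
safe = weibull_min.pdf(x, shape, loc=loc, scale=scale)

print(naive)
print(safe)
```

Plotting with `weibull_min.pdf`, or restricting x to values above `loc`, avoids the NaN values.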

Estimated Parameters: Shape = 5.383509295480495, Scale = 98.8928064025068, Location = 395.6676730395027

(Obviously, given your random data and relatively small sample size, you will get different numbers each time.)
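
If repeatable numbers are wanted while experimenting, the draws can be seeded. A small sketch, assuming the `random_state` argument of `rvs` is used for this; the seed value 42 is arbitrary.

```python
import numpy as np
from scipy.stats import weibull_min

# Same generating parameters as in the question.
shape, scale, loc, size = 7.5, 150, 350, 100

# Two draws with the same seed produce identical samples.
first = weibull_min.rvs(shape, loc=loc, scale=scale, size=size, random_state=42)
second = weibull_min.rvs(shape, loc=loc, scale=scale, size=size, random_state=42)

print(np.array_equal(first, second))
```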

