How can I improve a regression model to match noisy data to the true curve?

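I have a regression problem. As shown in the figure, the orange line is the ground truth and the blue scatter points are my data. The overall trend of the data follows the ground truth, but there is a lot of noise. How can I use regression to get a curve from my data that roughly agrees with the ground truth? I tried taking the logarithm of the data and then fitting a linear regression. The result is very different from the ground truth, and even from my blue scatter points. Here is one of my data sets: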

array([173.4098  , 175.45181 , 174.13388 , 173.40126 , 168.39598 ,
       170.03275 , 174.5293  , 165.87642 , 159.7338  , 161.6138  ,
       162.9032  , 163.47513 , 154.69208 , 158.92336 , 154.58157 ,
       150.18645 , 152.07083 , 150.4797  , 151.29477 , 148.55183 ,
       146.464   , 143.84012 , 145.89365 , 142.02222 , 144.87402 ,
       139.96799 , 142.3692  , 137.01068 , 142.52402 , 132.98807 ,
       137.32303 , 132.90698 , 133.54742 , 130.06049 , 130.66891 ,
       130.84998 , 129.88948 , 123.85362 , 125.86339 , 129.53204 ,
       128.61484 , 128.87492 , 126.015274, 123.903114, 120.53897 ,
       120.38696 , 129.81078 , 120.591125, 119.701645, 119.92349 ,
       120.14763 , 119.11101 , 120.00702 , 117.86407 , 116.26706 ,
       115.46516 , 112.11573 , 113.74388 , 111.47281 , 114.65326 ,
       109.15923 , 111.74715 , 110.95357 , 111.46296 , 109.9637  ,
       109.58853 , 108.28537 , 111.840836, 107.205475, 111.05708 ,
       107.724075, 109.72452 , 106.84272 , 105.18547 , 103.491745,
       107.05888 , 103.77411 ,  99.423706, 102.43909 , 101.53308 ,
       102.69588 , 108.018585, 103.53029 ,  99.62952 , 104.83856 ,
       104.23057 , 101.18348 , 102.52391 , 104.558334,  98.90404 ,
       101.16083 ,  97.97317 ,  95.47827 , 100.85654 , 102.93936 ,
        98.681854,  97.37257 ,  97.05141 ,  92.266624,  98.8342  ,
        94.678894,  92.807495,  92.73536 ,  95.94114 ,  95.84711 ,
        94.92062 ,  97.35336 ,  92.18617 ,  87.458984,  92.3882  ,
        95.487915,  97.04467 ,  93.190155,  91.25882 ,  96.17412 ,
        92.43962 ,  94.880844,  91.82003 ,  95.34397 ,  92.99954 ],
      dtype=float32)

Thanks in advance.

python regression data-science exponential
1 Answer

You can use curve_fit from scipy.optimize. I tried an exponential decay curve of the form y = a * exp(-b * x^n) and fitted the parameters a, b and n.

The fit works better if you can give some bounds for the parameter search.
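As a rough sketch of that point: curve_fit accepts a starting guess through p0 as well as a (lower, upper) pair through bounds. The data below are synthetic stand-ins generated from parameters rounded from the fit reported in this answer, and the p0 values are a hypothetical guess, not something taken from the question.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(X, A, B, n):
    # Same stretched-exponential form as the fit function in this answer
    return A * np.exp(-B * X ** n)

# Synthetic stand-in data (an assumption, not the asker's measurements)
rng = np.random.default_rng(0)
x = np.linspace(1.0, 120.0, 120)
y = func(x, 192.0, 0.047, 0.58) + rng.normal(0.0, 2.0, x.size)

lower = (100.0, 0.0, 0.0)    # lower bounds for (A, B, n)
upper = (300.0, 1.0, 2.0)    # upper bounds for (A, B, n)
p0 = (170.0, 0.01, 1.0)      # hypothetical starting guess inside the bounds

params, cov = curve_fit(func, x, y, p0=p0, bounds=(lower, upper))

# Root-mean-square residual, to compare against the noise level
rms = np.sqrt(np.mean((y - func(x, *params)) ** 2))
print("fitted [A, B, n] =", params)
print("RMS residual =", rms)
```

A guess read off the data (for example, A close to the first few y values) is usually a reasonable place to start.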

Feel free to customise the fitting function as you like.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func( X, A, B, n ):
    # Stretched exponential decay: A * exp( -B * X^n )
    return A * np.exp( - B * X ** n )


x = np.linspace( 1.0, 120.0, 120 )     # The question does not give the x values, so this is a guess
y = np.array( [173.4098  , 175.45181 , 174.13388 , 173.40126 , 168.39598 ,
               170.03275 , 174.5293  , 165.87642 , 159.7338  , 161.6138  ,
               162.9032  , 163.47513 , 154.69208 , 158.92336 , 154.58157 ,
               150.18645 , 152.07083 , 150.4797  , 151.29477 , 148.55183 ,
               146.464   , 143.84012 , 145.89365 , 142.02222 , 144.87402 ,
               139.96799 , 142.3692  , 137.01068 , 142.52402 , 132.98807 ,
               137.32303 , 132.90698 , 133.54742 , 130.06049 , 130.66891 ,
               130.84998 , 129.88948 , 123.85362 , 125.86339 , 129.53204 ,
               128.61484 , 128.87492 , 126.015274, 123.903114, 120.53897 ,
               120.38696 , 129.81078 , 120.591125, 119.701645, 119.92349 ,
               120.14763 , 119.11101 , 120.00702 , 117.86407 , 116.26706 ,
               115.46516 , 112.11573 , 113.74388 , 111.47281 , 114.65326 ,
               109.15923 , 111.74715 , 110.95357 , 111.46296 , 109.9637  ,
               109.58853 , 108.28537 , 111.840836, 107.205475, 111.05708 ,
               107.724075, 109.72452 , 106.84272 , 105.18547 , 103.491745,
               107.05888 , 103.77411 ,  99.423706, 102.43909 , 101.53308 ,
               102.69588 , 108.018585, 103.53029 ,  99.62952 , 104.83856 ,
               104.23057 , 101.18348 , 102.52391 , 104.558334,  98.90404 ,
               101.16083 ,  97.97317 ,  95.47827 , 100.85654 , 102.93936 ,
                98.681854,  97.37257 ,  97.05141 ,  92.266624,  98.8342  ,
                94.678894,  92.807495,  92.73536 ,  95.94114 ,  95.84711 ,
                94.92062 ,  97.35336 ,  92.18617 ,  87.458984,  92.3882  ,
                95.487915,  97.04467 ,  93.190155,  91.25882 ,  96.17412 ,
                92.43962 ,  94.880844,  91.82003 ,  95.34397 ,  92.99954 ] )

# bounds = ( lower limits, upper limits ) for ( A, B, n )
params, cv = curve_fit( func, x, y, bounds = ( ( 100.0, 0.0, 0.0 ), ( 300.0, 1.0, 2.0 ) ) )

print( "Fit: y = a.exp( -b x^n ) where [a, b, n] =", *params )

# plot the results
xfit = np.linspace( min( x ), max( x ), 100 )
yfit = func( xfit, *params )
plt.plot( x, y, 'bo', label="data" )
plt.plot( xfit, yfit, 'k-', label="fitted" )
plt.legend()     # needed for the labels above to be displayed
plt.show()

Output:

Fit: y = a.exp( -b x^n ) where [a, b, n] = 192.12413638168618 0.047421045318959514 0.5818668562501786
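The second value returned by curve_fit is the estimated covariance matrix of the parameters, and the square roots of its diagonal give one-standard-deviation error estimates. Here is a minimal sketch of how that could be used to judge the fit. It again runs on synthetic stand-in data (an assumption) rather than the question's array, and the starting guess p0 is hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(X, A, B, n):
    # Same model as in the answer: A * exp(-B * X^n)
    return A * np.exp(-B * X ** n)

# Synthetic stand-in data (an assumption, not the asker's measurements)
rng = np.random.default_rng(1)
x = np.linspace(1.0, 120.0, 120)
y = func(x, 192.0, 0.047, 0.58) + rng.normal(0.0, 2.5, x.size)

params, cov = curve_fit(
    func, x, y,
    p0=(170.0, 0.01, 1.0),                           # hypothetical starting guess
    bounds=((100.0, 0.0, 0.0), (300.0, 1.0, 2.0)),   # same bounds as the answer
)

# One-standard-deviation uncertainty of each fitted parameter
perr = np.sqrt(np.diag(cov))

# Root-mean-square residual between the data and the fitted curve
rms = np.sqrt(np.mean((y - func(x, *params)) ** 2))

for name, value, err in zip(("a", "b", "n"), params, perr):
    print(f"{name} = {value:.4g} +/- {err:.2g}")
print(f"RMS residual = {rms:.3g}")
```

If an uncertainty is large compared with the parameter itself, the data probably do not pin that parameter down well, and a simpler model may be worth trying.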
