How can I use polyfit to get a best-fit curve when the x values are datetimes?

Question  Votes: 1  Answers: 2

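Overview: I have a Django website where I plot data. I need to generate a best-fit polynomial curve, but the x values of the graph are dates or datetimes. I am using polyfit on numpy arrays. When I try converting the datetimes to integers or floats with datetime_object.timestamp(), I get very strange coefficient values and the resulting curve does not match the data at all. Is there a way to use polyfit with datetimes (or dates) and get more reasonable coefficients that actually fit the data?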

from datetime import datetime
from numpy import array, polyfit

milliseconds = []
for i in pricingDateTimes:
    dt_obj = datetime.strptime(i, '%Y-%m-%d %H:%M:%S')
    milliseconds.append(dt_obj.timestamp() * 1000)  # I have also tried division by powers of 10 to get more reasonable coefficients

# build the arrays and fit once, after the loop has collected every timestamp
x = array(milliseconds)
y = array(pricingMetricData)
quadratic = polyfit(x, y, 2)

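So, is there a way to use polyfit with dates or datetimes as the x values and get valid coefficients for the data? Or is there another way to get the coefficients of a best-fit curve when x is a datetime or date?

Thanks!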

python numpy datetime scipy curve-fitting
2 Answers
1
vote

You can compute the time difference between consecutive time entries and take the cumulative sum, then use that as the x values.

>>> timeval
[datetime.datetime(2019, 11, 29, 18, 23, 25, 123830), datetime.datetime(2019, 11, 29, 18, 23, 34, 123830), datetime.datetime(2019, 11, 29, 18, 23, 40, 123830), datetime.datetime(2019, 11, 29, 18, 23, 49, 123830), datetime.datetime(2019, 11, 29, 18, 23, 53, 123830), datetime.datetime(2019, 11, 29, 18, 23, 58, 123830), datetime.datetime(2019, 11, 29, 18, 23, 58, 123830), datetime.datetime(2019, 11, 29, 18, 24, 6, 123830), datetime.datetime(2019, 11, 29, 18, 24, 11, 123830), datetime.datetime(2019, 11, 29, 18, 24, 12, 123830), datetime.datetime(2019, 11, 29, 18, 24, 21, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 38, 123830), datetime.datetime(2019, 11, 29, 18, 24, 41, 123830), datetime.datetime(2019, 11, 29, 18, 24, 41, 123830), datetime.datetime(2019, 11, 29, 18, 24, 49, 123830), datetime.datetime(2019, 11, 29, 18, 24, 58, 123830), datetime.datetime(2019, 11, 29, 18, 24, 59, 123830)]
>>> x = np.array([x.seconds for x in np.diff(np.array(timeval))]).cumsum()
>>> x
array([ 9, 15, 24, 28, 33, 33, 41, 46, 47, 56, 64, 64, 64, 73, 76, 76, 84,
       93, 94], dtype=int32)
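
Two details are worth watching with the snippet above: np.diff returns one element fewer than its input, so the cumulative sum has 19 entries for 20 timestamps, and timedelta.seconds ignores the days part of a difference. A minimal sketch of a variant that avoids both, assuming the goal is one x value per timestamp, measures elapsed seconds from the first entry with total_seconds(). The sample data here (start, timeval, y) is hypothetical and only serves to make the example runnable.

```python
from datetime import datetime, timedelta

import numpy as np

# Hypothetical sample data: 20 readings taken 5 seconds apart,
# with y following an exact quadratic in elapsed seconds.
start = datetime(2019, 11, 29, 18, 23, 25)
timeval = [start + timedelta(seconds=5 * i) for i in range(20)]
y = np.array([0.5 * (5 * i) ** 2 + 2.0 * (5 * i) + 1.0 for i in range(20)])

# Elapsed seconds since the first entry. The first value is 0.0,
# so x has exactly one entry per timestamp and matches y in length.
x = np.array([(t - timeval[0]).total_seconds() for t in timeval])

# Coefficients are returned highest power first.
coeffs = np.polyfit(x, y, 2)
```

Because x starts at zero and stays small, the fitted coefficients are expressed in seconds since the first sample rather than in seconds since the Unix epoch.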

0
votes

One option is to subtract the smallest timestamp, so that polyfit works with better numerical stability:

# setup
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np

times = ['2019-01-01 01:{:02d}:{:02d}'.format(mm,ss) for mm in range(24) for ss in range(0,60,10)]
l = len(times)

data = 3 * np.arange(l)**2 + 2 * np.arange(l) + 2.5 + np.random.rand(l)

# timestamps in seconds, shifted so the earliest sample is at 0
timestamps = np.array([datetime.strptime(t,'%Y-%m-%d %H:%M:%S').timestamp()
                         for t in times])
timestamps -= timestamps.min()

quadratics = np.polyfit(timestamps, data, 2)
y_preds = quadratics[0] * timestamps**2 + quadratics[1] * timestamps + quadratics[2]

# data in blue
plt.plot(timestamps, data, linewidth=5)

# prediction in white
plt.plot(timestamps, y_preds, color='w')

Output:

[Image not available: plot of the data in blue with the fitted quadratic drawn over it in white]
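
A related option, not used in the answer above, is numpy.polynomial.Polynomial.fit, which rescales x onto the window [-1, 1] internally before solving. The sketch below assumes raw POSIX timestamps in seconds as input. The names times, values, and query are hypothetical and only there to make it runnable.

```python
from datetime import datetime

import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical input: timestamp strings one minute apart, with values
# that follow an exact quadratic in the minute index.
times = ['2019-01-01 01:{:02d}:00'.format(mm) for mm in range(30)]
values = np.array([0.02 * mm ** 2 - 0.3 * mm + 5.0 for mm in range(30)])

# Raw POSIX timestamps in seconds (on the order of 1.5e9), not shifted by hand.
x = np.array([datetime.strptime(t, '%Y-%m-%d %H:%M:%S').timestamp()
              for t in times])

# Polynomial.fit maps x from its own range onto [-1, 1] before the
# least-squares solve, then remembers that mapping for evaluation.
fitted = Polynomial.fit(x, values, 2)

# Evaluate at a new datetime by converting it the same way as the inputs.
query = datetime.strptime('2019-01-01 01:10:30', '%Y-%m-%d %H:%M:%S').timestamp()
prediction = fitted(query)
```

The returned object is called directly on timestamps, so the scaled coefficients never have to be handled by hand. fitted.convert().coef gives coefficients in the original units (lowest power first), but for epoch-sized x those are again very large and very small numbers, which is the effect described in the question.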
