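Overview: I have a Django site where I plot data. I need to generate a best-fit polynomial curve, but the x values of the graph are dates or datetimes. I am using polyfit on numpy arrays. When I convert the datetimes to integers or floats with datetime_object.timestamp(), I get very strange coefficient values, and the resulting curve does not match the data at all. Is there a way to use polyfit with datetimes (or dates) and get more reasonable coefficients that fit the data better?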
from datetime import datetime
from numpy import array, polyfit

milliseconds = []
for i in pricingDateTimes:
    dt_obj = datetime.strptime(i, '%Y-%m-%d %H:%M:%S')
    milliseconds.append(dt_obj.timestamp() * 1000)  # I have also tried division by powers of 10 to get more reasonable coefficients
x = array(milliseconds)
y = array(pricingMetricData)
quadratic = polyfit(x, y, 2)
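So, is there a way to use polyfit with dates or datetimes as the x values and get valid coefficients for the data? Or is there another way to get the coefficients of the best-fit curve when x is a date or datetime?
Thanks!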
You can compute the time difference between consecutive entries and take the cumulative sum, then use that as the x values.
>>> timeval
[datetime.datetime(2019, 11, 29, 18, 23, 25, 123830), datetime.datetime(2019, 11, 29, 18, 23, 34, 123830), datetime.datetime(2019, 11, 29, 18, 23, 40, 123830), datetime.datetime(2019, 11, 29, 18, 23, 49, 123830), datetime.datetime(2019, 11, 29, 18, 23, 53, 123830), datetime.datetime(2019, 11, 29, 18, 23, 58, 123830), datetime.datetime(2019, 11, 29, 18, 23, 58, 123830), datetime.datetime(2019, 11, 29, 18, 24, 6, 123830), datetime.datetime(2019, 11, 29, 18, 24, 11, 123830), datetime.datetime(2019, 11, 29, 18, 24, 12, 123830), datetime.datetime(2019, 11, 29, 18, 24, 21, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 29, 123830), datetime.datetime(2019, 11, 29, 18, 24, 38, 123830), datetime.datetime(2019, 11, 29, 18, 24, 41, 123830), datetime.datetime(2019, 11, 29, 18, 24, 41, 123830), datetime.datetime(2019, 11, 29, 18, 24, 49, 123830), datetime.datetime(2019, 11, 29, 18, 24, 58, 123830), datetime.datetime(2019, 11, 29, 18, 24, 59, 123830)]
>>> x = np.array([x.seconds for x in np.diff(np.array(timeval))]).cumsum()
>>> x
array([ 9, 15, 24, 28, 33, 33, 41, 46, 47, 56, 64, 64, 64, 73, 76, 76, 84,
93, 94], dtype=int32)
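Two caveats on the snippet above: `np.diff` returns one element fewer than its input, so the first sample ends up without an x value, and `timedelta.seconds` ignores the days component, so gaps of a day or more are undercounted. Below is a minimal sketch of a variant that subtracts the first timestamp directly and uses `total_seconds()`. The helper name `elapsed_seconds` and the sample timestamps are illustrative and not part of the original answer.

```python
from datetime import datetime

import numpy as np


def elapsed_seconds(timeval):
    """Return seconds elapsed since the first entry, one value per entry."""
    start = timeval[0]
    # total_seconds() includes whole days, unlike the .seconds attribute
    return np.array([(t - start).total_seconds() for t in timeval])


# Illustrative timestamps (hypothetical); the last one is over a day later
timeval = [
    datetime(2019, 11, 29, 18, 23, 25),
    datetime(2019, 11, 29, 18, 23, 34),
    datetime(2019, 11, 29, 18, 23, 40),
    datetime(2019, 11, 30, 18, 23, 49),
]
x = elapsed_seconds(timeval)
print(x)
```

The resulting array has the same length as the input, so it can be passed to `polyfit` alongside the y values without dropping a sample.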
One option is to subtract the smallest timestamp, so that polyfit works with small x values and is numerically more stable:
# imports
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt

# setup
times = ['2019-01-01 01:{:02d}:{:02d}'.format(mm, ss) for mm in range(24) for ss in range(0, 60, 10)]
l = len(times)
data = 3 * np.arange(l)**2 + 2 * np.arange(l) + 2.5 + np.random.rand(l)
# timestamps (seconds since the epoch)
timestamps = np.array([datetime.strptime(t, '%Y-%m-%d %H:%M:%S').timestamp()
                       for t in times])
# shift so that x starts at 0
timestamps -= timestamps.min()
quadratics = np.polyfit(timestamps, data, 2)
y_preds = quadratics[0] * timestamps**2 + quadratics[1] * timestamps + quadratics[2]
# data in blue
plt.plot(timestamps, data, linewidth=5)
# prediction in white
plt.plot(timestamps, y_preds, color='w')
plt.show()
Output: [plot of the data in blue with the fitted curve overlaid in white]
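As an alternative to shifting the timestamps by hand, `numpy.polynomial.Polynomial.fit` maps the x values onto the window [-1, 1] internally before fitting, which should keep the least-squares problem well conditioned even with raw epoch seconds. Here is a minimal sketch that reuses the synthetic setup from this answer, with the random noise left out so the result is reproducible. Note that its coefficients are ordered from lowest to highest degree, the reverse of `np.polyfit`.

```python
from datetime import datetime

import numpy as np
from numpy.polynomial import Polynomial

# Same synthetic setup as in the answer, minus the random noise
times = ['2019-01-01 01:{:02d}:{:02d}'.format(mm, ss)
         for mm in range(24) for ss in range(0, 60, 10)]
n = len(times)
idx = np.arange(n)
data = 3 * idx**2 + 2 * idx + 2.5

# Raw epoch seconds, deliberately not shifted
timestamps = np.array([datetime.strptime(t, '%Y-%m-%d %H:%M:%S').timestamp()
                       for t in times])

# Polynomial.fit maps the x range onto [-1, 1] before solving
fitted = Polynomial.fit(timestamps, data, deg=2)

# Calling the fitted object applies the same mapping, so raw timestamps work
y_preds = fitted(timestamps)
print(np.abs(y_preds - data).max())
```

Evaluating through the fitted object is usually preferable to extracting coefficients in the unscaled variable with `fitted.convert()`, since for raw epoch seconds those coefficients are again extreme in magnitude.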