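I'm currently trying to get deeper into ML using TensorFlow. My current project is to predict the S&P 500 price for the next 10 days. I have the following code, but the results are unusable: they are extremely erratic and even drop below zero. Is there a mistake I'm overlooking?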
import math
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt
plt.style.use("fivethirtyeight")
file_path = "gspc.csv"
df = pd.read_csv(file_path)
df["Date"] = pd.to_datetime(df["Date"])
df.set_index("Date", inplace=True)
data = df.filter(["Close"])
dataset = data.values
training_data_len = math.ceil(len(dataset) * 0.8)
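scaler = MinMaxScaler(feature_range=(0, 1))
# Fit the scaler on the training portion only so test-set statistics do not leak into training.
scaler.fit(dataset[:training_data_len])
scaled_data = scaler.transform(dataset)
train_data = scaled_data[0:training_data_len, :]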
x_train = []
y_train = []
# Each sample is a 60-day window; the target is the close on the following day.
for i in range(60, len(train_data)):
    x_train.append(train_data[i - 60:i, 0])
    y_train.append(train_data[i, 0])
x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(50, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(x_train, y_train, batch_size=1, epochs=50)
test_data = scaled_data[training_data_len - 60:, :]
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i - 60:i, 0])
x_test = np.array(x_test)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
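# Square the errors before averaging; squaring the mean error lets positive and negative errors cancel.
rmse = np.sqrt(np.mean((predictions - y_test) ** 2))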
train = data[:training_data_len]
valid = data[training_data_len:].copy()  # copy so adding a column does not trigger SettingWithCopyWarning
valid["Predictions"] = predictions
last_60_days = data[-60:].values
last_60_days_scaled = scaler.transform(last_60_days)
prediction_list = []
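for i in range(10):
    x_input = last_60_days_scaled.reshape((1, 60, 1))
    # The model outputs a value in scaled (0-1) space.
    next_price = model.predict(x_input)[0, 0]
    prediction_list.append(next_price)
    # Slide the window in scaled space. Appending the scaled prediction to the
    # unscaled prices and re-scaling would feed the model inputs far below zero.
    last_60_days_scaled = np.append(last_60_days_scaled[1:], [[next_price]], axis=0)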
prediction_list = scaler.inverse_transform(np.array(prediction_list).reshape(-1, 1))
print(prediction_list)
I tried training for more epochs, but the results are still unsatisfactory.
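You are trying to predict stock prices by assuming there is a fixed pattern in how valuations rise and fall. That is a flawed assumption. The stock market is notoriously hard to predict, especially in this era of algorithmic trading, where such predictions are already priced into valuations. It gets even harder when the model's only input is the prices from the previous days.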
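To make that concrete, one check worth running is a comparison against a naive "tomorrow's close equals today's close" forecast. The sketch below is only an illustration under my own assumptions: `rmse` and `persistence_rmse` are hypothetical helper names, and the price series is a synthetic random walk rather than real S&P 500 data. If the LSTM's RMSE on the test set is not clearly lower than this baseline on the same data, the model has probably not learned anything beyond "repeat the last price".

```python
import math
import random


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def persistence_rmse(prices):
    """RMSE of the naive forecast 'next close equals the current close'."""
    return rmse(prices[:-1], prices[1:])


if __name__ == "__main__":
    # Synthetic random walk standing in for closing prices (assumption, not market data).
    rng = random.Random(0)
    prices = [100.0]
    for _ in range(500):
        prices.append(prices[-1] + rng.gauss(0.0, 1.0))
    print(f"persistence RMSE: {persistence_rmse(prices):.3f}")
```

With the code from the question, the equivalent comparison would be between `rmse` and the persistence error computed on the unscaled test closes.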
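That said, it might be easier to experiment with historical data from the early 1900s, if you can find it. Then you could see whether your model can find any consistent patterns in a period when the market might (I'm not an expert on this) have had more predictable rises and falls. As I mentioned, it is really hard when the only input is the prices from the previous days.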
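One rough way to check whether a period shows any such pattern before training on it is to look at the lag-1 autocorrelation of daily returns, year by year. This is a sketch under my own assumptions, not an established test of predictability: the helper names are hypothetical, and the input is assumed to be a list of `(year, close)` pairs in chronological order. Values near zero suggest that yesterday's move says little about today's.

```python
import math


def daily_returns(prices):
    """Simple day-over-day returns: (p[t] - p[t-1]) / p[t-1]."""
    return [(b - a) / a for a, b in zip(prices[:-1], prices[1:])]


def lag1_autocorrelation(values):
    """Pearson correlation between a series and itself shifted by one step."""
    x, y = values[:-1], values[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no measurable correlation
    return cov / math.sqrt(var_x * var_y)


def autocorrelation_by_year(dated_prices):
    """Group (year, close) pairs by year and score each year separately."""
    by_year = {}
    for year, close in dated_prices:
        by_year.setdefault(year, []).append(close)
    return {
        year: lag1_autocorrelation(daily_returns(closes))
        for year, closes in sorted(by_year.items())
        if len(closes) >= 4  # need at least three returns per year
    }
```

With the DataFrame from the question, the pairs could be built with something like `list(zip(df.index.year, df["Close"]))`.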