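I'm new to the world of artificial neural networks, so please forgive any mistakes and correct me where I'm wrong. I want to use an LSTM model to predict the market price of Bitcoin. I'm aware of the practical limitations of such a model; I'm building it for educational purposes.
I'm not sure whether to call what I want a multilayer model or a multivariate model (I'd be grateful if someone could explain the difference). Basically, I have a model trained on the closing price, the column named "close", which looks at the previous 60 days to predict the next day's closing price.
I have no trouble building the model described so far. The problem is that I'd like to train it on additional information, such as the volume or the day's high. The important thing is to be able to choose which two kinds of information go into the model. I found a detailed explanation on a website, Multivariate Time Series Forecasting with LSTMs in Keras, but I couldn't apply it to my specific case. Could you help me integrate the 'Volume' variable into the model, so I can see whether the prediction of the future 'close' price improves or gets worse?
The data is of this type and can be downloaded from Kaggle here.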
import pandas as pd
import math
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt
#Assumes df has already been loaded, e.g. with pd.read_csv
#Create a new dataframe with only the 'close' column
data = df.filter(['close'])
#Convert the dataframe to a numpy array
dataset = data.values
#Get the number of rows to train the model on
training_data_len = math.ceil( len(dataset) * .8 )
#scale data
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)
#Create the scaled training data set
train_data = scaled_data[0:training_data_len , :]
#Split the data into x_train and y_train data sets
x_train = []
y_train = []
for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, 0])
    y_train.append(train_data[i, 0])
    # if i <= 61:
    #     print(x_train)
    #     print(y_train)
    #     print()
#Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)
#Reshape the data
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
print (x_train.shape)
#Build the LSTM model
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape= (x_train.shape[1], 1)))
model.add(LSTM(50, return_sequences= False))
model.add(Dense(25))
model.add(Dense(1))
#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)
#Create the testing data set
#Create a new array containing scaled values
test_data = scaled_data[training_data_len - 60: , :]
#Create the data sets x_test and y_test
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, 0])
#Convert the data to a numpy array
x_test = np.array(x_test)
#Reshape the data
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1 ))
# print (len(x_test))
#Get the model's predicted price values
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
print(predictions)
#Get the root mean squared error (RMSE)
rmse=np.sqrt(np.mean(((predictions- y_test)**2)))
print (rmse)
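From the code and comments, I understand that you are doing time series forecasting on univariate data (the only column is close) and now want to do it on multivariate data (the columns close and Volume).
The important part of the code for you will be the function multivariate_data.
It returns features and labels based on a history of the past 60 days and a target of 1 day ahead.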
past_history = 60
future_target = 1
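To make the resulting shapes concrete, here is a minimal sketch on made-up numbers. The toy array and the helper name make_windows are assumptions for illustration only, and it uses a 3-step history instead of 60 so the result is easy to inspect by hand:

```python
import numpy as np

def make_windows(features, target, history_size, target_size=1):
    """Slice a (rows, n_features) array into overlapping windows.

    Hypothetical helper for illustration only.
    """
    x, y = [], []
    for i in range(history_size, len(features) - target_size + 1):
        x.append(features[i - history_size:i])  # past `history_size` rows, all features
        y.append(target[i:i + target_size])     # next `target_size` target values
    return np.array(x), np.array(y)

# Toy data: 6 days, 2 features per day (think: close, volume).
toy = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0],
                [4.0, 40.0],
                [5.0, 50.0],
                [6.0, 60.0]])

x, y = make_windows(toy, toy[:, 0], history_size=3)
print(x.shape)  # (samples, timesteps, features) -> (3, 3, 2)
print(y.shape)  # (samples, target_size) -> (3, 1)
```

The last axis of x is the number of features, which is what distinguishes the multivariate input from the univariate one (where it is 1).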
The complete working code for the multivariate data (up to training) is shown below.
import pandas as pd
import math
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt
df = pd.read_csv('datasets_101543_240726.csv')
#Create a new dataframe with only the 'close' and 'Volume' columns
data = df.filter(['close', 'Volume'])
#Convert the dataframe to a numpy array
dataset = data.values
#Get the number of rows to train the model on
training_data_len = math.ceil( len(dataset) * .8 )
# Standardize the data using the mean and std of the training rows only
data_mean = dataset[:training_data_len].mean(axis=0)
data_std = dataset[:training_data_len].std(axis=0)
dataset = (dataset-data_mean)/data_std
def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size):
    data = []
    labels = []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i-history_size, i)
        data.append(dataset[indices])
        labels.append(target[i:i+target_size])
    return np.array(data), np.array(labels)
past_history = 60
future_target = 1
x_train, y_train = multivariate_data(dataset, dataset[:, 0], 0,
training_data_len, past_history,
future_target)
x_val, y_val = multivariate_data(dataset, dataset[:, 0],
training_data_len, None, past_history,
future_target)
#Reshape the data
#x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
#print (x_train.shape)
#Build the LSTM model
model = Sequential()
#Input shape is (timesteps, features): 60 days x 2 features (close, Volume)
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(LSTM(50, return_sequences= False))
model.add(Dense(25))
model.add(Dense(1))
#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)
The test data x_test and y_test in your code can be replaced with x_val and y_val, and you can run predictions on that data.
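Because the multivariate code standardizes the data with data_mean and data_std, predictions on x_val come out in standardized units rather than prices. A minimal sketch of converting them back and computing the RMSE, assuming column 0 is the closing price as in the code above. The helper names and the toy numbers are made up for illustration; in practice scaled_pred would come from model.predict(x_val) and scaled_true would be y_val:

```python
import numpy as np

def unscale_close(scaled, mean, std, col=0):
    """Undo (value - mean) / std for one column (hypothetical helper)."""
    return np.asarray(scaled) * std[col] + mean[col]

def rmse(pred, actual):
    """Root mean squared error between two arrays of equal shape."""
    pred, actual = np.asarray(pred), np.asarray(actual)
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Toy stand-ins for the statistics computed on the training rows.
data_mean = np.array([100.0, 5000.0])  # [close, volume]
data_std = np.array([10.0, 1000.0])

scaled_pred = np.array([[0.5], [1.0]])  # in practice: model.predict(x_val)
scaled_true = np.array([[0.7], [1.0]])  # in practice: y_val

pred_price = unscale_close(scaled_pred, data_mean, data_std)
true_price = unscale_close(scaled_true, data_mean, data_std)
print(pred_price.ravel())
print(rmse(pred_price, true_price))
```

An RMSE computed this way is in price units, like the one from the univariate model, so the two can be compared as long as both are evaluated on the same rows. Note that the validation windows above start 60 rows later than the original test set, because multivariate_data adds history_size to start_index.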
Please refer to the TensorFlow tutorial on time series forecasting for complete code covering multivariate data.
Hope this helps. Happy learning!