Converting univariate time series forecasting to multivariate time series forecasting with an LSTM

Question

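I am new to the world of artificial neural networks, so please forgive any mistakes and feel free to correct me. I want to use an LSTM model to predict the market price of Bitcoin. I am aware of the practical limitations of such a model, but I am building it for educational purposes.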

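I am not sure whether to call it a multi-layer model or a multivariate model (I would be grateful if someone could explain the difference). Basically, a model trained on the closing price, the column named "close", can predict the next day's closing price by looking at the previous 60 days.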

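I have no problem building the model up to this point; the code is shown below. The problem is that I want to train the model on additional information, such as the volume or the day's high. It is important to be able to choose which two kinds of information are fed into the model. I found a site with a detailed explanation, "Multivariate Time Series Forecasting with LSTMs in Keras", but I was unable to apply it to my specific case. Could you help me integrate the 'Volume' variable into the model, so I can see whether the prediction of the future 'close' price improves or gets worse?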

The data is of this type and can be downloaded from Kaggle.

import pandas as pd
import math
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt


# df is assumed to already hold the Kaggle CSV, loaded with pd.read_csv
# Create a new dataframe with only the 'close' column
data = df.filter(['close'])
#Convert the dataframe to a numpy array
dataset = data.values
#Get the number of rows to train the model on
training_data_len = math.ceil( len(dataset) * .8 )

#scale data
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)

#Create the scaled training data set
train_data = scaled_data[0:training_data_len , :]
#Split the data into x_train and y_train data sets
x_train = []
y_train = []

for i in range(60, len(train_data)):
  x_train.append(train_data[i-60:i, 0])
  y_train.append(train_data[i, 0])
  # if i<= 61:
    # print(x_train)
    # print(y_train)
    # print()

#Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)

#Reshape the data
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
print (x_train.shape)

#Build the LSTM model
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape= (x_train.shape[1], 1)))
model.add(LSTM(50, return_sequences= False))
model.add(Dense(25))
model.add(Dense(1))

#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)

#Create the testing data set
#Create a new array containing scaled values 
test_data = scaled_data[training_data_len - 60: , :]
#Create the data sets x_test and y_test
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
  x_test.append(test_data[i-60:i, 0])

#Convert the data to a numpy array
x_test = np.array(x_test)
#Reshape the data

x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1 ))

# print (len(x_test))
# Get the model's predicted price values
predictions = model.predict(x_test)

predictions = scaler.inverse_transform(predictions)
print(predictions)

#Get the root mean squared error (RMSE)
rmse=np.sqrt(np.mean(((predictions- y_test)**2)))
print (rmse)

Answer 1

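From the code and comments, I understand that you are doing time series forecasting on univariate data (the only column is close) and now want to do time series forecasting on multivariate data (with the columns close and Volume).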
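On the terminology asked about in the question: "multivariate" refers to how many input features each time step carries, while "multi-layer" refers to stacking layers, which the question's model already does with its two LSTM layers. The difference between the univariate and multivariate cases shows up only in the last dimension of the input array. The following is a minimal sketch on made-up random data (the array `prices` and its 100 rows are assumptions for illustration, not the Kaggle file):

```python
import numpy as np

# Toy data (made up): 100 days, columns = [close, volume]
rng = np.random.default_rng(0)
prices = rng.random((100, 2))

window = 60  # number of past days per sample

# Univariate: each sample holds 60 past values of the first column only
uni = np.stack([prices[i - window:i, :1] for i in range(window, len(prices))])

# Multivariate: each sample holds 60 past rows of both columns
multi = np.stack([prices[i - window:i, :] for i in range(window, len(prices))])

print(uni.shape)    # (40, 60, 1) -> (samples, time steps, 1 feature)
print(multi.shape)  # (40, 60, 2) -> (samples, time steps, 2 features)
```

An LSTM layer's input shape is (time steps, features), so moving from one case to the other means changing the feature count from 1 to 2 while the layer stack can stay the same.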

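For you, the important part of the code is the function multivariate_data. It returns the features and labels based on the past 60 days of history, with a target of 1 day ahead.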

past_history = 60
future_target = 1
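To make the pairing of history windows and labels concrete, here is a small sketch on a toy array. The helper name `make_windows` and the toy values are hypothetical, chosen only so the output is easy to read; it uses a history of 3 rows instead of 60. Unlike multivariate_data, it keeps the final window, so the sample count may differ by one.

```python
import numpy as np

def make_windows(dataset, target_col, history_size, target_size):
    """Hypothetical helper: pair each window of past rows with the next target value(s)."""
    data, labels = [], []
    for i in range(history_size, len(dataset) - target_size + 1):
        data.append(dataset[i - history_size:i])               # all features, past rows
        labels.append(dataset[i:i + target_size, target_col])  # target column, future rows
    return np.array(data), np.array(labels)

# Toy dataset: 10 rows, 2 columns (column 0 plays the role of 'close')
toy = np.arange(20).reshape(10, 2)
x, y = make_windows(toy, target_col=0, history_size=3, target_size=1)

print(x.shape, y.shape)  # (7, 3, 2) (7, 1)
print(x[0])              # rows 0..2 of toy, both columns
print(y[0])              # [6], column 0 of row 3
```

Each label is the value of the target column immediately after its window, which is what a next-day forecast needs.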

The complete working code for the multivariate data (up to training) is shown below.

import pandas as pd
import math
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt

df = pd.read_csv('datasets_101543_240726.csv')

# Create a new dataframe with the 'close' and 'Volume' columns
data = df.filter(['close', 'Volume'])
#Convert the dataframe to a numpy array
dataset = data.values
#Get the number of rows to train the model on
training_data_len = math.ceil( len(dataset) * .8 )

# Standardize each column using the mean and std of the training rows only
data_mean = dataset[:training_data_len].mean(axis=0)
data_std = dataset[:training_data_len].std(axis=0)
dataset = (dataset-data_mean)/data_std

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size):
  data = []
  labels = []

  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size

  for i in range(start_index, end_index):
    indices = range(i-history_size, i)
    data.append(dataset[indices])

    labels.append(target[i:i+target_size])

  return np.array(data), np.array(labels)

past_history = 60
future_target = 1

x_train, y_train = multivariate_data(dataset, dataset[:, 0], 0,
                                                   training_data_len, past_history,
                                                   future_target)
x_val, y_val = multivariate_data(dataset, dataset[:, 0],
                                               training_data_len, None, past_history,
                                               future_target)


# No reshape is needed: x_train already has shape (samples, past_history, features)
print(x_train.shape)

# Build the LSTM model
model = Sequential()
# The last entry of input_shape must equal the number of features (2 here)
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(LSTM(50, return_sequences= False))
model.add(Dense(25))
model.add(Dense(1))

#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)

The test data in your code, x_test and y_test, can be replaced by x_val and y_val, and you can run predictions on that data.
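Because the multivariate code standardizes the data by hand instead of using MinMaxScaler, the predictions come out in standardized units and scaler.inverse_transform no longer applies. A minimal sketch of undoing the scaling for the close column and computing the RMSE follows; the numbers below are made-up stand-ins for data_mean, data_std, model.predict(x_val) and y_val, not real results.

```python
import numpy as np

# Stand-ins (made up) for values the training script would produce
data_mean = np.array([9000.0, 2.5e9])           # per-column mean: [close, Volume]
data_std = np.array([1500.0, 1.0e9])            # per-column std:  [close, Volume]
predictions = np.array([[0.2], [-0.4], [1.0]])  # stands in for model.predict(x_val)
y_val = np.array([[0.0], [-0.5], [1.2]])        # standardized true close values

# Undo the standardization using the statistics of column 0 (close) only
pred_price = predictions * data_std[0] + data_mean[0]
true_price = y_val * data_std[0] + data_mean[0]

# Root mean squared error in the original price units
rmse = np.sqrt(np.mean((pred_price - true_price) ** 2))
print(rmse)
```

Computing the same RMSE for the univariate and the multivariate model on the same validation rows gives a like-for-like way to check whether adding Volume helps.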

For the complete code, please refer to the TensorFlow tutorial on time series forecasting with multivariate data.

Hope this helps. Happy learning!
