Is my LSTM model overfitting or underfitting?

Question

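I am working on an LSTM model to predict the Bitcoin price, using time_steps = 20, epochs = 100, batch_size = 256.

I got the attached model loss plot.

I have also attached the actual vs. predicted BTC price.

Is this model overfitting or underfitting? Thanks!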

import numpy as np
import pandas as pd
import tensorflow as tf
import plotly.graph_objects as go
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# Ensure the same results are produced each time
np.random.seed(42)
tf.random.set_seed(42)

# Load the normalized data from a CSV file
df = pd.read_csv('normalized_dataBTC.csv', parse_dates=['Date'], index_col='Date')

# Split the data into training and testing sets
train_size = int(len(df) * 0.8)  # 80% of data for training
train_data = df.iloc[:train_size].values
test_data = df.iloc[train_size:].values

# Define the number of time steps and features for the LSTM model
time_steps = 20  # number of time steps to use for each input sequence
num_features = 6  # number of features in the input data

# Create training sequences for the LSTM model
X_train = []
y_train = []
for i in range(time_steps, train_size):
    X_train.append(train_data[i-time_steps:i, :])
    y_train.append(train_data[i, 4])  # use the "Close" price as the target

# Convert the training data to numpy arrays
X_train = np.array(X_train)
y_train = np.array(y_train)

# Reshape the training data to fit the LSTM model input shape
X_train = np.reshape(X_train, (X_train.shape[0], time_steps, num_features))

# Create testing sequences for the LSTM model
X_test = []
y_test = []
for i in range(time_steps, len(test_data)):
    X_test.append(test_data[i-time_steps:i, :])
    y_test.append(test_data[i, 4])  # use the "Close" price as the target

# Convert the testing data to numpy arrays
X_test = np.array(X_test)
y_test = np.array(y_test)

# Reshape the testing data to fit the LSTM model input shape
X_test = np.reshape(X_test, (X_test.shape[0], time_steps, num_features))

# Create the LSTM model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(units=64, input_shape=(time_steps, num_features)))
model.add(tf.keras.layers.Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Fit the model to the training data
history = model.fit(X_train, y_train, epochs=100, batch_size=256, validation_data=(X_test, y_test))

# Make predictions
predictions = model.predict(X_test)

# Load the original data from a CSV file
df_orig = pd.read_csv('BTC-USD 2014 2023.csv', parse_dates=['Date'], index_col='Date')
# Define the scale factor for the "Close" price
close_scale = df_orig.iloc[train_size:, 4].values.max()
# Un-normalize the predictions
predictions_unscaled = predictions * close_scale

# Plot the actual vs predicted BTC price using Plotly
fig = go.Figure()
fig.add_trace(go.Scatter(x=df_orig.index[train_size+time_steps:], y=y_test*close_scale, name='Actual'))
fig.add_trace(go.Scatter(x=df_orig.index[train_size+time_steps:], y=predictions_unscaled[:,0], name='Predicted'))
fig.update_layout(title='Actual vs Predicted BTC Price', xaxis_title='Date', yaxis_title='Price ($)')
fig.update_layout(title_x=0.5, title_font_size=24, xaxis_title_font_size=18, yaxis_title_font_size=18)
fig.update_xaxes(tickformat='%d/%m/%Y')  # Format x-axis as dates
fig.show()

# Plot the training and validation loss over the epochs
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Calculate the mean squared error
mse = mean_squared_error(y_test, predictions)
print('Mean squared error:', mse)

# Calculate the root mean squared error
rmse = np.sqrt(mse)
print('Root mean squared error:', rmse)

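# Calculate the mean absolute error
# predictions has shape (n, 1) and y_test has shape (n,); take column 0 so the
# subtraction is element-wise instead of broadcasting to an (n, n) matrix
mae = np.mean(np.abs(y_test - predictions[:, 0]))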
print('Mean absolute error:', mae)

Results when trying 30 time steps, 100 epochs, and a batch size of 256.

Answer 1

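Your model appears to be neither overfitting nor underfitting. In both plots, the training loss closely tracks the validation loss, and the predicted price is close to the actual price. Judging from your plots, the model with 30 time steps seems to fit better than the first one.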

You could try other hyperparameters, for example 40 time steps, and see whether the model fits better.
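
To make it easier to try several window lengths, the two sequence-building loops from the question could be folded into one helper. This is only a sketch: `make_windows` is a hypothetical name, and the random array stands in for the normalized CSV, which is not available here.

```python
import numpy as np

def make_windows(data, time_steps, target_col):
    """Build LSTM inputs: X[k] holds rows k..k+time_steps-1 and
    y[k] is the target column of the row that follows that window."""
    data = np.asarray(data, dtype=float)
    X = np.stack([data[i - time_steps:i] for i in range(time_steps, len(data))])
    y = data[time_steps:, target_col]
    return X, y

# Synthetic stand-in for the normalized data: 200 rows, 6 features (assumed values).
rng = np.random.default_rng(42)
demo = rng.random((200, 6))

# Rebuild the sequences for each candidate window length.
for steps in (20, 30, 40):
    X, y = make_windows(demo, steps, target_col=4)
    print(steps, X.shape, y.shape)
```

Each window length would still need its own training run, and the runs are only comparable if they are scored on the same held-out dates.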

Answer 2

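Based on the plots, it looks like you have built a good model; neither overfitting nor underfitting is visible. You could improve the model by adding more layers and dropout and by increasing the number of epochs.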

Answer 3

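Your model is basically predicting the previous day's price. Whether it is overfitting is not really the right question, because it is essentially only making a very naive prediction.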
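
One way to check this claim is to score the predictions against a persistence baseline that simply repeats the last known price. The sketch below uses a synthetic random walk and a made-up "model" that echoes yesterday's value; `compare_with_persistence` is a hypothetical helper, and with the code from the question you would pass `y_test` and `predictions` instead.

```python
import numpy as np

def rmse(a, b):
    """Root mean squared error between two 1-D series."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.sqrt(np.mean((a - b) ** 2)))

def compare_with_persistence(y_true, y_pred):
    """Score predictions against a 'tomorrow equals today' baseline."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    actual = y_true[1:]   # targets from the second step onward
    naive = y_true[:-1]   # baseline: the previous actual value
    model = y_pred[1:]    # model predictions for the same steps
    return {
        "model_rmse": rmse(actual, model),
        "naive_rmse": rmse(actual, naive),
        "model_vs_naive_rmse": rmse(model, naive),
    }

# Synthetic random walk standing in for the price series (assumed data).
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(size=500))

# Made-up "model" that repeats yesterday's price plus a little noise.
lagged_pred = np.concatenate(([prices[0]], prices[:-1])) + rng.normal(scale=0.01, size=500)

scores = compare_with_persistence(prices, lagged_pred)
print(scores)
```

If `model_rmse` is no better than `naive_rmse` and `model_vs_naive_rmse` is close to zero, the network has most likely learned to copy the last input value, which also explains why the predicted curve looks like the actual curve shifted by one day.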

You should focus on how to reformulate the problem, for example by predicting the next day's price difference, and plot that.
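
A minimal sketch of that reformulation, assuming a single close-price series as input (the question uses six features, so this is a simplification) and a hypothetical helper named `make_diff_windows`:

```python
import numpy as np

def make_diff_windows(close, time_steps):
    """Inputs are windows of past daily changes; the target is the next daily change."""
    close = np.asarray(close, dtype=float)
    diffs = np.diff(close)  # diffs[k] = close[k + 1] - close[k]
    X = np.stack([diffs[i - time_steps:i] for i in range(time_steps, len(diffs))])
    y = diffs[time_steps:]
    last_close = close[time_steps:-1]  # price on the day before each target change
    return X[..., np.newaxis], y, last_close

# Tiny illustrative series (made-up numbers).
close = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])
X, y, last_close = make_diff_windows(close, time_steps=2)

print(X.shape)         # (samples, time_steps, 1), the layout an LSTM layer expects
print(y)               # next-day changes to predict
print(last_close + y)  # adding the change back recovers the next day's price
```

With this target, predicting zero everywhere is equivalent to repeating yesterday's price, so a plot of predicted vs. actual changes shows directly whether the model adds anything beyond that.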
