What is wrong with my LSTM neural network model for a regression problem? It does not return the model output.

Question

So the question is: what did I do wrong when defining the neural network architecture? See the "Define the neural network model" and "Define the learning rate scheduler and train the model" sections.

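Details:

I have written code in which revenue_data has shape (1749, 2) and weather_data has shape (86990, 10). X_train has shape ([69010, 14]), y_train has shape ([69010]), X_val has shape ([17253, 14]), and y_val has shape ([17253]). I have done the preprocessing, scaling, outlier removal, and data splitting as follows: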

Convert the date and time columns to datetime format

revenue_data['Date'] = pd.to_datetime(revenue_data['Date'], format='%Y%m%d')
weather_data['dt'] = pd.to_datetime(weather_data['dt'], format='%Y%m%d')
weather_data['time'] = pd.to_datetime(weather_data['time'], format='%H:%M:%S')

Convert the wind and condition columns to embeddings

wind_embeddings = nn.Embedding(len(weather_data['wind'].unique()), 5)
weather_data['wind_code'] = weather_data['wind'].astype('category').cat.codes
wind_vectors = wind_embeddings(torch.tensor(weather_data['wind_code'].values, dtype=torch.long))
weather_data['wind_x'] = wind_vectors[:, 0].detach().numpy()
weather_data['wind_y'] = wind_vectors[:, 1].detach().numpy()
weather_data['wind_z'] = wind_vectors[:, 2].detach().numpy()
weather_data['wind_t'] = wind_vectors[:, 3].detach().numpy()
weather_data['wind_u'] = wind_vectors[:, 4].detach().numpy()

condition_embeddings = nn.Embedding(len(weather_data['condition'].unique()), 3)
weather_data['condition_code'] = weather_data['condition'].astype('category').cat.codes
condition_vectors = condition_embeddings(torch.tensor(weather_data['condition_code'].values, dtype=torch.long))
weather_data['condition_x'] = condition_vectors[:, 0].detach().numpy()
weather_data['condition_y'] = condition_vectors[:, 1].detach().numpy()
weather_data['condition_z'] = condition_vectors[:, 2].detach().numpy()

Group the weather data by date and hour, and compute the mean for each date and hour

weather_data = weather_data.groupby(['dt', 'time']).mean()
weather_data = weather_data.reset_index()
weather_data['Date'] = weather_data['dt']
weather_data.drop(['dt', 'time', 'wind_code', 'condition_code'], axis=1, inplace=True)

Merge the revenue and weather data on the "Date" column, then drop "Date"

merged_data = pd.merge(revenue_data, weather_data, on='Date')
merged_data.drop('Date', axis=1, inplace=True)
merged_data.head()

Scale the data

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(merged_data)

Split the data into input and target sets

X = scaled_data[:, 1:]
y = scaled_data[:, 0]

from scipy.stats import zscore

Compute the z-score of each feature and remove outliers with a z-score greater than 3

z_scores = zscore(X)

Identify rows where any feature has a z-score > 3

mask = (z_scores > 3).any(axis=1)

Remove rows with high z-scores from X and y

features = X[~mask, :]
target = y[~mask]
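As a side note on the outlier step, the mask above only flags z-scores above +3, so large negative deviations would pass through, and the filtered arrays only take effect if they (rather than the unfiltered X and y) are passed to the later split. A minimal sketch of a two-sided filter, assuming synthetic data in place of the real scaled matrix and using plain NumPy in place of scipy.stats.zscore:

```python
import numpy as np

# Synthetic stand-in for the scaled feature matrix X and target y (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.normal(size=200)
X[0, 0] = 25.0    # a large positive outlier
X[1, 1] = -25.0   # a large negative outlier

# Column-wise z-scores (population standard deviation, computed per feature).
z_scores = (X - X.mean(axis=0)) / X.std(axis=0)

# Taking the absolute value flags outliers on both tails.
mask = (np.abs(z_scores) > 3).any(axis=1)

# Keep only the rows without any flagged feature.
features = X[~mask]
target = y[~mask]
print(X.shape, features.shape, target.shape)
```

With this variant, `features` and `target` would be the arrays to hand to the train/validation split.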

Split the data into training and validation sets

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

Convert the data to PyTorch tensors

X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.float32)

But I am struggling to work out what is wrong with the neural network architecture I defined:

Define the neural network model

class RevenuePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=14, hidden_size=32, num_layers=1, batch_first=True)
        self.fc1 = nn.Linear(32, 16)
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x, lengths):
        print('x shape:', x.shape)
        # Get the lengths of the input sequences
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        lengths = lengths.to(device)
        lengths = lengths.cpu()
        print('lengths shape:', lengths.shape)

        # Sort the input sequences by length
        sorted_lengths, sorted_idx = torch.sort(lengths, descending=True)
        sorted_x = x[sorted_idx]

        # Pack the sorted input sequences
        packed_x = nn.utils.rnn.pack_padded_sequence(sorted_x, sorted_lengths, batch_first=True)

        # Convert the packed sequence to a tensor with two dimensions
        x_data, batch_sizes = nn.utils.rnn.pad_packed_sequence(packed_x, batch_first=True)

        # Convert the packed sequence to a tensor with two dimensions
        x_data, batch_sizes = x.data, x.batch_sizes
        seq_len = batch_sizes[0]
        batch_size = len(batch_sizes)
        x = x_data.new_zeros((batch_size, seq_len, 14))
        s = 0
        for i, l in enumerate(batch_sizes):
            x[i, :l] = x_data[s:(s+l)]
            s += l

        # Pass the packed input sequences through the LSTM
        lstm_output, (h, c) = self.lstm(packed_x)

        # Unpack the LSTM output sequences
        unpacked_output, _ = nn.utils.rnn.pad_packed_sequence(lstm_output, batch_first=True)

        # Re-sort the output sequences to their original order
        unsorted_idx = sorted_idx.sort(0)
        output = unpacked_output[unsorted_idx]

        # Pass the output sequences through the fully connected layers
        output = nn.functional.relu(self.fc1(output[:, -1, :]))
        output = self.fc2(output)

        return output
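For comparison, here is a minimal sketch of a simpler forward pass, assuming every row is treated as a sequence of length 1 so that no packing or sorting is needed. The class name `RevenuePredictorSketch` and its constructor arguments are hypothetical and not part of the original code; only the layer sizes are taken from the model above.

```python
import torch
from torch import nn


class RevenuePredictorSketch(nn.Module):
    """Hypothetical simplified variant; not the original model."""

    def __init__(self, input_size=14, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=1, batch_first=True)
        self.fc1 = nn.Linear(hidden_size, 16)
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x):
        # Treat a 2-D batch (batch, features) as sequences of length 1.
        if x.dim() == 2:
            x = x.unsqueeze(1)
        lstm_output, _ = self.lstm(x)            # (batch, seq_len, hidden_size)
        last_step = lstm_output[:, -1, :]        # hidden state at the last time step
        hidden = torch.relu(self.fc1(last_step))
        return self.fc2(hidden).squeeze(-1)      # (batch,) to match a 1-D target


model = RevenuePredictorSketch()
batch = torch.randn(8, 14)       # synthetic batch with 14 features per row
prediction = model(batch)
print(prediction.shape)
```

With a sequence length of 1 the LSTM has no history to work with, so this mainly illustrates the expected tensor shapes; building real sequences (for example, windows of consecutive hours) would be a separate design decision.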

Then create the model

model = RevenuePredictor()

Then the loss and metrics

loss_fn = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

metrics = {
    'mse': MeanSquaredError(),
    'mae': MeanAbsoluteError(),
    'r2': R2Score(),
}

Define the learning rate scheduler and train the model

scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=True)

best_val_loss = np.inf
for epoch in range(num_epochs):
    # Set the model to training mode
    model.train()

    train_loss = 0.0
    num_batches = 0
    for X_train, y_train in train_loader:
        lengths = torch.ones(X_train.shape[0], dtype=torch.long)
        optimizer.zero_grad()
        output = model(X_train, lengths)
        loss = loss_fn(output, y_train)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        num_batches += 1

    val_loss = 0.0
    for X_val, y_val in val_loader:
        lengths = torch.ones(X_val.shape[0], dtype=torch.long)
        output = model(X_val, lengths)
        loss = loss_fn(output, y_val)
        val_loss += loss.item()

    scheduler.step(val_loss)

    val_loss /= len(val_loader)
    val_mse = metrics['mse'].compute()
    val_mae = metrics['mae'].compute()
    val_r2 = metrics['r2'].compute()
    for metric in metrics.values():
        metric.reset()

    if (epoch+1) % 100 == 0:
        print('Epoch [{}/{}], Train Loss: {:.4f}, Val Loss: {:.4f}, MSE: {:.4f}, MAE: {:.4f}, R2: {:.4f}'
              .format(epoch+1, num_epochs, train_loss/num_batches, val_loss, val_mse, val_mae, val_r2))
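A few details in the loop above may be worth a second look: the loop variables reuse the names X_train and y_train, the validation pass runs without model.eval() or torch.no_grad(), the metric objects are never given predictions before compute() is called, and an output of shape (batch, 1) is compared with a target of shape (batch,). The following is a minimal sketch of one way to arrange the loop, assuming synthetic data, a placeholder feed-forward model, and hand-computed metrics in place of the metric classes (whose library is not shown in the post). The names `xb`, `yb`, and `history` are hypothetical.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins for the real tensors and loaders (assumption).
X_train, y_train = torch.randn(256, 14), torch.rand(256)
X_val, y_val = torch.randn(64, 14), torch.rand(64)
train_loader = DataLoader(TensorDataset(X_train, y_train), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(X_val, y_val), batch_size=32)

# Placeholder model that returns one value per row (assumption).
model = nn.Sequential(nn.Linear(14, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10)

num_epochs = 3
history = []
for epoch in range(num_epochs):
    model.train()
    train_loss = 0.0
    for xb, yb in train_loader:              # distinct names, so X_train is not overwritten
        optimizer.zero_grad()
        output = model(xb).squeeze(-1)       # (batch, 1) -> (batch,) to match yb
        loss = loss_fn(output, yb)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    train_loss /= len(train_loader)

    model.eval()
    preds, targets = [], []
    with torch.no_grad():                    # no gradients needed for validation
        for xb, yb in val_loader:
            preds.append(model(xb).squeeze(-1))
            targets.append(yb)
    preds, targets = torch.cat(preds), torch.cat(targets)

    # Metrics computed by hand over the whole validation set.
    val_mse = torch.mean((preds - targets) ** 2).item()
    val_mae = torch.mean(torch.abs(preds - targets)).item()
    ss_res = torch.sum((targets - preds) ** 2)
    ss_tot = torch.sum((targets - targets.mean()) ** 2)
    val_r2 = (1 - ss_res / ss_tot).item()

    scheduler.step(val_mse)                  # step on the averaged validation loss
    history.append((train_loss, val_mse, val_mae, val_r2))
    print('Epoch [{}/{}], Train Loss: {:.4f}, Val MSE: {:.4f}, MAE: {:.4f}, R2: {:.4f}'
          .format(epoch + 1, num_epochs, train_loss, val_mse, val_mae, val_r2))
```

If metric objects with an update/compute interface are preferred, the same structure should apply: update them with each validation batch, then call compute() once after the loop.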

I get this error, which I think is caused by a mistake in how I defined the neural network model:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-164-e20b93c25048>", line 3, in <module>
    output = model(X_train, lengths)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "<ipython-input-163-43b2ef5c15db>", line 38, in forward
    lstm_output, (h, c) = self.lstm(packed_x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 772, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 697, in check_forward_args
    self.check_input(input, batch_sizes)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 206, in check_input
    raise RuntimeError(
RuntimeError: input must have 2 dimensions, got 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2040, in showtraceback
    stb = value._render_traceback_()
AttributeError: 'RuntimeError' object has no attribute '_render_traceback_'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 1101, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 319, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 353, in _fixed_getinnerframes
    records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
  File "/usr/lib/python3.8/inspect.py", line 1515, in getinnerframes
    frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
  File "/usr/lib/python3.8/inspect.py", line 1473, in getframeinfo
    filename = getsourcefile(frame) or getfile(frame)
  File "/usr/lib/python3.8/inspect.py", line 708, in getsourcefile
    if getattr(getmodule(object, filename), '__loader__', None) is not None:
  File "/usr/lib/python3.8/inspect.py", line 737, in getmodule
    file = getabsfile(object, _filename)
  File "/usr/lib/python3.8/inspect.py", line 721, in getabsfile
    return os.path.normcase(os.path.abspath(_filename))
  File "/usr/lib/python3.8/posixpath.py", line 379, in abspath
    cwd = os.getcwd()
FileNotFoundError: [Errno 2] No such file or directory
---------------------------------------------------------------------------
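One plausible reading of the RuntimeError is that the batches are 2-D, with shape (batch, 14). With batch_first=True, pack_padded_sequence reads the second axis as time, so with lengths of 1 the packed data ends up 1-D, while the LSTM expects 2-D packed data of shape (total_steps, input_size). A minimal sketch that reproduces this with synthetic data, and shows the effect of adding an explicit time axis:

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

lstm = nn.LSTM(input_size=14, hidden_size=32, num_layers=1, batch_first=True)
x = torch.randn(4, 14)                       # (batch, features), synthetic stand-in
lengths = torch.ones(4, dtype=torch.long)    # every "sequence" has length 1

# The second axis is interpreted as time, so no feature axis is left after packing.
packed_2d = pack_padded_sequence(x, lengths, batch_first=True)
print(packed_2d.data.shape)

# Feeding the 1-D packed data to the LSTM raises a RuntimeError.
try:
    lstm(packed_2d)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Adding a time axis gives (batch, seq_len=1, features), which packs to 2-D data.
packed_3d = pack_padded_sequence(x.unsqueeze(1), lengths, batch_first=True)
output, _ = lstm(packed_3d)
print(packed_3d.data.shape, output.data.shape)
```

Because the exception is raised inside forward, reshaping the value returned by the model cannot help; the input would need its extra axis before it is packed.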

I tried a different way of converting the packed sequence to a tensor with two dimensions:

x_data, batch_sizes = x.data, x.batch_sizes
seq_len = batch_sizes[0]
batch_size = len(batch_sizes)
x = x_data.new_zeros((batch_size, seq_len, 14))
s = 0
for i, l in enumerate(batch_sizes):
    x[i, :l] = x_data[s:(s+l)]
    s += l

That did not work.

Then I tried reshaping x to have three dimensions, for example:

batch_size, seq_len, input_size = x.shape

That did not work either. Finally, I tried calling unsqueeze(-1) on the output after defining the model:


model = RevenuePredictor()
output = model(X_train, lengths).unsqueeze(-1)
