Implementing binary classification with LSTM and linear layer outputs

Question

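I am currently developing a wake-word model for my AI Assistant project. My pipeline converts audio data into MFCC features, passes them through an LSTM layer, and then uses a linear layer for binary classification. The LSTM layer returns a hidden-state tensor (hn) of shape (4, 32, 32), corresponding to directions * num_layers, batch, hidden_size. The linear layer then produces an output tensor of shape (4, 32, 1).

In my binary classification task I have two classes: 0 ("do not wake up") and 1 ("wake the AI"). However, I am struggling to understand how to interpret the linear layer's output. I expected an output like (32, 1), corresponding to batch size and prediction. Instead the shape is (4, 32, 1), and I believe I may be missing something fundamental here.

Could someone clarify how to handle this (4, 32, 1) output from the linear layer? My model code is below for reference: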

import torch
import torch.nn as nn

class LSTMWakeWord(nn.Module):
    def __init__(self,input_size,hidden_size,num_layers,dropout,bidirectional,num_of_classes, device='cpu'):
        super(LSTMWakeWord, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.device = device
        self.bidirectional = bidirectional
        self.directions = 2 if bidirectional else 1

        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size = hidden_size,
                            num_layers = num_layers,
                            dropout=dropout,
                            bidirectional=bidirectional,
                            batch_first=True)
        self.layernorm = nn.LayerNorm(input_size)

        self.classifier = nn.Linear(hidden_size , num_of_classes)

    def _init_hidden(self,batch_size):
        n, d, hs = self.num_layers, self.directions, self.hidden_size
        return (torch.zeros(n * d, batch_size, hs).to(self.device),
                torch.zeros(n * d, batch_size, hs).to(self.device))

    def forward(self,x):
        # LayerNorm normalizes over the feature dimension, which removes the very large (e+xxx) values
        x = self.layernorm(x)
        # with batch_first=True, x is expected as (batch, seq_len, n_mfcc)
        hidden = self._init_hidden(x.size(0))
        out, (hn, cn) = self.lstm(x, hidden)
        print("hn " + str(hn.shape))  # directions * num_layers, batch, hidden_size
        # print("out " + str(out.shape))  # batch, seq_len, directions * hidden_size
        out = self.classifier(hn)
        print("out2 " + str(out.shape))

        return out

I would greatly appreciate any insight or guidance on how to handle the linear layer's output for binary classification.

Answer 1

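You can try this. Take only the last slice of hn along its first dimension before the classifier, so the linear layer receives a (batch, hidden_size) tensor and returns (batch, num_of_classes), i.e. (32, 1) here: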

hn = hn[-1, :, :]
out = self.classifier(hn)
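
To make the shapes concrete, here is a minimal sketch that reproduces the (4, 32, 32) hidden state from the question and applies the slicing above. The values of n_mfcc and seq_len are made up for illustration, and "Option B" (concatenating both directions of the top layer) is an alternative that is not part of the answer; it assumes the classifier is resized to 2 * hidden_size inputs.

```python
import torch
import torch.nn as nn

# Sizes from the question: 2 layers x 2 directions = 4, batch 32, hidden 32.
# n_mfcc and seq_len are hypothetical values chosen only for this demo.
batch, seq_len, n_mfcc, hidden_size, num_layers = 32, 50, 40, 32, 2

lstm = nn.LSTM(input_size=n_mfcc, hidden_size=hidden_size,
               num_layers=num_layers, bidirectional=True, batch_first=True)

x = torch.randn(batch, seq_len, n_mfcc)      # (batch, seq_len, n_mfcc)
out, (hn, cn) = lstm(x)                      # hn: (num_layers * directions, batch, hidden_size)

# Option A (the answer above): keep only the last slice of hn.
# For a bidirectional LSTM this is the top layer's backward direction.
last = hn[-1]                                # (batch, hidden_size)
clf_a = nn.Linear(hidden_size, 1)
logits_a = clf_a(last)                       # (batch, 1)

# Option B (an assumption, not from the answer): use both directions of the top layer.
both = torch.cat([hn[-2], hn[-1]], dim=1)    # (batch, 2 * hidden_size)
clf_b = nn.Linear(2 * hidden_size, 1)
logits_b = clf_b(both)                       # (batch, 1)

# Treat the single output as a logit: sigmoid gives P(class 1), threshold at 0.5.
probs = torch.sigmoid(logits_a).squeeze(1)   # (batch,)
preds = (probs > 0.5).long()                 # 0 = do not wake up, 1 = wake the AI

print(hn.shape, logits_a.shape, logits_b.shape, preds.shape)
```

With a single raw output per sample like this, one common choice for training is nn.BCEWithLogitsLoss on the logits with float targets of the same shape; the 0.5 threshold is only a starting point and may need tuning for a wake-word detector.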
