Building a custom Lasagne layer whose output is a matrix of elementwise products (input x weights) rather than a dot product

I have an input sequence of shape (seq_length (19) x features (21)), which I feed as input to a neural network.

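I need a layer that performs elementwise multiplication of the input with the weights (not a dot product), so the output shape should be (#units, input_shape). Since input_shape is (19 x 21) in my case, the output of the operation performed in this layer is also (19 x 21) for each unit. If the number of units is 8, the output should be (8, 19, 21).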


class ElementwiseMulLayer(lasagne.layers.Layer):
    def __init__(self, incoming, num_units, W=lasagne.init.Normal(0.01), **kwargs):
        super(ElementwiseMulLayer, self).__init__(incoming, **kwargs)
        self.num_inputs = self.input_shape[1]
        self.num_units = num_units
        self.W = self.add_param(W, (self.num_inputs, num_units), name='W')

    def get_output_for(self, input, **kwargs):
        #return T.dot(input, self.W)
        result = input * self.W  # elementwise product (the mul node in the traceback below)
        return result

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], self.num_units, self.num_inputs)


l_in_2 = lasagne.layers.InputLayer(shape=(None, 9*19*21))
l_reshape_l_in_2 = lasagne.layers.ReshapeLayer(l_in_2, (-1, 9,19,21))
l_reshape_l_in_2_EL = lasagne.layers.ExpressionLayer(l_reshape_l_in_2, lambda X: X[:,0,:,:], output_shape='auto') 
l_reshape_l_in_2_EL = lasagne.layers.ReshapeLayer(l_reshape_l_in_2_EL, (-1, 19*21))
l_out1 = ElementwiseMulLayer(l_reshape_l_in_2_EL, num_units=8, name='my_EW_layer')
l_out1 = lasagne.layers.ReshapeLayer(l_out1, (-1, 8*399))
l_out = lasagne.layers.DenseLayer(l_out1,
                                num_units = 19*21,
                                W = lasagne.init.Normal(),
                                nonlinearity = lasagne.nonlinearities.rectify)   

It is worth noting that the batch size is 64. Network summary:

| Layer | Layer_name                | output_shape         |  # parameters  |
|   0   | InputLayer                | (None, 3591)         |          0     |
|   1   | ReshapeLayer              | (None, 9, 19, 21)    |          0     |
|   2   | ExpressionLayer           | (None, 19, 21)       |          0     |
|   3   | ReshapeLayer              | (None, 399)          |          0     |
|   4   | ElementwiseMulLayer       | (None, 8, 399)       |       3192     |
|   5   | ReshapeLayer              | (None, 3192)         |       3192     |
|   6   | DenseLayer                | (None, 399)          |    1277199     |
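
The parameter column in this summary reads as a running total rather than a per-layer count: the second ReshapeLayer row repeats 3192 although a reshape has no parameters of its own. A quick arithmetic check, assuming the DenseLayer keeps its default bias vector, reproduces the figures:

```python
# Parameter counts implied by the shapes in the summary table.
num_inputs = 19 * 21   # 399 features after flattening
num_units = 8

# ElementwiseMulLayer: W declared with shape (num_inputs, num_units).
elementwise_params = num_inputs * num_units

# DenseLayer: 8 * 399 = 3192 inputs, 399 outputs, plus a bias (assumed default).
dense_inputs = num_units * num_inputs
dense_params = dense_inputs * num_inputs + num_inputs

running_total = elementwise_params + dense_params
print(elementwise_params, dense_params, running_total)
```

Under that assumption the last row of the table is the total for the whole network, not the size of the DenseLayer alone.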


ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[0] == 399, but the output's size on that axis is 64.
Apply node that caused the error: GpuElemwise{mul,no_inplace}(GpuReshape{2}.0, my_dot_layer.W)
Toposort index: 23
Inputs types: [GpuArrayType<None>(float32, matrix), GpuArrayType<None>(float32, matrix)]
Inputs shapes: [(64, 399), (399, 8)]
Inputs strides: [(14364, 4), (32, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuReshape{2}(GpuElemwise{mul,no_inplace}.0, TensorConstant{[  -1 3192]})]]


self.W = self.add_param(W, (self.num_inputs,num_units, self.num_inputs), name='W')


ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 8, but the output's size on that axis is 64.
Apply node that caused the error: GpuElemwise{mul,no_inplace}(InplaceGpuDimShuffle{x,0,1}.0, my_EW_layer.W)
Toposort index: 26
Inputs types: [GpuArrayType<None>(float32, (True, False, False)), GpuArrayType<None>(float32, 3D)]
Inputs shapes: [(1, 64, 399), (399, 8, 399)]
Inputs strides: [(919296, 14364, 4), (12768, 1596, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuReshape{2}(GpuElemwise{mul,no_inplace}.0, TensorConstant{[  -1 3192]})]]


l_out1 = ElementwiseMulLayer(l_reshape_l_in_2_EL, num_units=8, name='my_EW_layer')

In this line you apply ElementwiseMulLayer to l_reshape_l_in_2_EL, which is a reshaped version of the input tensor. You then reshape l_out1 to shape (None, 8*399).

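The error message you are seeing indicates a dimension mismatch in the elementwise multiplication between l_reshape_l_in_2_EL and the weight matrix my_EW_layer.W. The input has shape (64, 399) while W has shape (399, 8), and an elementwise product needs each pair of axes to match or be broadcastable, which neither pair here is.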
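
The mismatch can be reproduced outside Theano. The sketch below uses NumPy with the shapes reported in the traceback; NumPy's broadcasting is similar in spirit to Theano's elementwise ops, though not identical, and the array names are illustrative only.

```python
import numpy as np

batch_size, num_inputs, num_units = 64, 399, 8

x = np.ones((batch_size, num_inputs), dtype=np.float32)  # layer input, (64, 399)
W = np.ones((num_inputs, num_units), dtype=np.float32)   # weights as declared, (399, 8)

# Shapes are compared from the trailing axis backwards:
# 399 vs 8 on the last axis and 64 vs 399 on the first, so neither pair lines up.
try:
    x * W
    mismatch = False
except ValueError as err:
    mismatch = True
    print("elementwise product failed:", err)
```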


Here is an updated version of ElementwiseMulLayer that handles the reshaped input correctly:

import lasagne


class ElementwiseMulLayer(lasagne.layers.Layer):
    def __init__(self, incoming, num_units, W=lasagne.init.Normal(0.01), **kwargs):
        super(ElementwiseMulLayer, self).__init__(incoming, **kwargs)
        self.num_inputs = self.input_shape[1]
        self.num_units = num_units
        # One weight vector of length num_inputs per unit
        self.W = self.add_param(W, (self.num_units, self.num_inputs), name='W')

    def get_output_for(self, input, **kwargs):
        # Insert a broadcastable unit axis: (batch_size, 1, num_inputs)
        input_reshaped = input.dimshuffle(0, 'x', 1)
        # W is (num_units, num_inputs), so the product broadcasts to
        # (batch_size, num_units, num_inputs). No further dimshuffle is needed:
        # swapping the last two axes would contradict get_output_shape_for.
        return input_reshaped * self.W

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], self.num_units, self.num_inputs)


This change should resolve the dimension mismatch you are facing. Update your ElementwiseMulLayer class with this code and it should work as expected.
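
As a sanity check of the shapes, here is a NumPy sketch of the same broadcast, assuming W has shape (num_units, num_inputs) as in the updated layer. It is not Lasagne code: `x[:, None, :]` plays the role of `dimshuffle(0, 'x', 1)`, and the random data is only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, num_units, num_inputs = 64, 8, 19 * 21

x = rng.normal(size=(batch_size, num_inputs)).astype(np.float32)
W = rng.normal(size=(num_units, num_inputs)).astype(np.float32)

# (64, 1, 399) * (8, 399) broadcasts to (64, 8, 399): one weighted copy per unit.
out = x[:, None, :] * W

# Reference result computed with explicit loops over samples and units.
expected = np.stack([np.stack([x[b] * W[u] for u in range(num_units)])
                     for b in range(batch_size)])

print(out.shape)  # (64, 8, 399)
print(np.allclose(out, expected))

# Each unit's slice can be viewed as a (19, 21) map, and the whole output
# flattens to (64, 8 * 399) for the ReshapeLayer that feeds the DenseLayer.
per_unit_maps = out.reshape(batch_size, num_units, 19, 21)
flat = out.reshape(batch_size, num_units * num_inputs)
```

Per sample this gives the (8, 19, 21) output asked for in the question once each 399-long row is viewed as a 19 x 21 grid.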

