Clip or threshold a tensor using a condition and zero-pad the result in PyTorch

Question

Suppose I have a tensor like this:

w = [[0.1, 0.7, 0.7, 0.8, 0.3],
    [0.3, 0.2, 0.9, 0.1, 0.5],
    [0.1, 0.4, 0.8, 0.3, 0.4]]

Now I want to drop certain values based on some condition (for example, values greater than or less than 0.5):

w = [[0.1, 0.3],
     [0.3, 0.2, 0.1],
     [0.1, 0.4, 0.3, 0.4]]

and then pad the rows with zeros to equal length:

w = [[0.1, 0.3, 0, 0],
     [0.3, 0.2, 0.1, 0],
     [0.1, 0.4, 0.3, 0.4]]

This is how I implemented it in PyTorch:

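import torch

w = torch.rand(3, 5)
condition = w <= 0.5
w = [w[i][condition[i]] for i in range(3)]
# batch_first=True keeps one row per sequence, matching the layout shown above
w = torch.nn.utils.rnn.pad_sequence(w, batch_first=True)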

But obviously this is going to be slow, mainly because of the list comprehension. Is there a better way to do it?

Answer 1

Here is one straightforward way to do it: use boolean masking, split the masked tensor, and then pad the resulting pieces with torch.nn.utils.rnn.pad_sequence(...).

# input tensor to work with
In [213]: w
Out[213]:
tensor([[0.1000, 0.7000, 0.7000, 0.8000, 0.3000],
        [0.3000, 0.2000, 0.9000, 0.1000, 0.5000],
        [0.1000, 0.4000, 0.8000, 0.3000, 0.4000]])

# values above this should be clipped from the input tensor
In [214]: clip_value = 0.5

# generate a boolean mask that satisfies the condition
In [215]: boolean_mask = (w <= clip_value)

# count the kept values in each row (needed for splitting)
In [216]: summed_mask = boolean_mask.sum(dim=1)

# a sequence of split tensors, one per row
In [217]: splitted_tensors = torch.split(w[boolean_mask], summed_mask.tolist())

# finally pad them to equal length, one row per sequence
In [219]: torch.nn.utils.rnn.pad_sequence(splitted_tensors, batch_first=True)
Out[219]:
tensor([[0.1000, 0.3000, 0.0000, 0.0000],
        [0.3000, 0.2000, 0.1000, 0.5000],
        [0.1000, 0.4000, 0.3000, 0.4000]])

A short note on efficiency: torch.split() is very efficient because it returns the split tensors as views of the tensor it is given (i.e. no copy is made by the split).
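For convenience, the steps above can be collected into one function. This is only a minimal sketch: the name clip_and_pad is hypothetical (not part of PyTorch), it assumes a 2-D input, and it keeps values less than or equal to the threshold, as in the session above. The last few lines illustrate the view behaviour of torch.split() on a small throwaway tensor.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def clip_and_pad(w: torch.Tensor, clip_value: float) -> torch.Tensor:
    """Keep entries <= clip_value in each row, then right-pad rows with zeros.

    Hypothetical helper for illustration; assumes `w` is 2-D.
    """
    mask = w <= clip_value               # boolean mask, same shape as w
    counts = mask.sum(dim=1).tolist()    # number of kept values per row
    kept = w[mask]                       # 1-D tensor of kept values, row by row
    chunks = torch.split(kept, counts)   # one chunk per row
    return pad_sequence(chunks, batch_first=True)


w = torch.tensor([[0.1, 0.7, 0.7, 0.8, 0.3],
                  [0.3, 0.2, 0.9, 0.1, 0.5],
                  [0.1, 0.4, 0.8, 0.3, 0.4]])
padded = clip_and_pad(w, 0.5)
print(padded)

# Cross-check against the list-comprehension version from the question.
reference = pad_sequence([row[row <= 0.5] for row in w], batch_first=True)
assert torch.equal(padded, reference)

# torch.split returns views: writing through a chunk changes the source tensor.
base = torch.arange(6.0)
parts = torch.split(base, [2, 4])
parts[0][0] = 100.0
assert base[0].item() == 100.0
```

Note that the boolean indexing step (w[mask]) does allocate a new tensor; only the split itself avoids copying.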
