I'm trying to work through this tutorial: https://d2l.ai/chapter_attention-mechanisms/attention.html, but in PyTorch I'm stuck on this function:
npx.sequence_mask()
I tried torch.masked_fill and masked_scatter with no success. That is, given:
a = torch.randn(2, 2, 4)
b = torch.randn(2, 3)
I want to get a result like npx.sequence_mask() produces (see the sequence_mask documentation):
([[[0.488994 , 0.511006 , 0. , 0. ],
[0.43654838, 0.56345165, 0. , 0. ]],
[[0.28817102, 0.3519408 , 0.3598882 , 0. ],
[0.29034293, 0.25239873, 0.45725834, 0. ]]])
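For context, the behavior I'm after can be sketched with a broadcasted comparison against torch.arange (the helper name and signature below are my own, not from the book):

```python
import torch

# Rough sketch of an npx.sequence_mask equivalent: columns at or beyond
# each batch item's valid length are overwritten with `value`.
def sequence_mask(X, valid_len, value=0.0):
    # X: (batch, rows, cols); valid_len: (batch,) valid column counts
    maxlen = X.size(-1)
    # mask[b, j] is True where column j is within the valid length of batch b
    mask = torch.arange(maxlen, device=X.device)[None, :] < valid_len[:, None]
    out = X.clone()
    out[~mask.unsqueeze(1).expand_as(X)] = value  # overwrite padded positions
    return out

a = torch.randn(2, 2, 4)
masked = sequence_mask(a, torch.tensor([2, 3]))
```
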
Can anyone suggest an approach?
Maybe the following works, but is there a better way?
import torch
import torch.nn.functional as F

def mask_softmax(vec, mask):
    leafs = vec.shape[0]
    rows = vec.shape[1]
    cols = vec.shape[2]
    # zero out every column at or beyond the valid length of each batch item
    for k in range(leafs):
        stop = int(mask[k])
        for j in reversed(range(stop, cols)):
            vec[k, :, j] = torch.zeros(rows)  # all rows of col j <-- 0
    # replace the zeros with -inf so softmax assigns them zero weight
    vec = vec - torch.where(vec > 0,
                            torch.zeros_like(vec),
                            torch.ones_like(vec) * float('inf'))
    # softmax over a row of all -inf yields nan; cleaned up below
    for k in range(leafs):
        for i in range(rows):
            vec[k, i] = F.softmax(vec[k, i], dim=0)
    vec[vec != vec] = 0  # nan -> 0
    return vec
# testing
a = torch.rand((2, 2, 4))
mask = torch.tensor((2, 3))  # valid lengths per batch item
mask_softmax(a, mask)
>>> tensor([[[0.5027, 0.4973, 0.0000, 0.0000],
[0.6494, 0.3506, 0.0000, 0.0000]],
[[0.3412, 0.3614, 0.2975, 0.0000],
[0.2699, 0.3978, 0.3323, 0.0000]]])
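The loop-free version I'd sketch (using masked_fill and broadcasting; function and variable names here are mine, not from any library) would be:

```python
import torch

# A vectorized masked softmax sketch: invalid positions are filled with
# -inf before the softmax, so they receive exactly zero weight and no
# nan cleanup pass is needed.
def masked_softmax(vec, valid_len):
    # vec: (batch, rows, cols); valid_len: (batch,) valid column counts
    maxlen = vec.size(-1)
    mask = torch.arange(maxlen, device=vec.device)[None, :] < valid_len[:, None]
    # mask shape (batch, 1, cols) broadcasts over the rows dimension
    vec = vec.masked_fill(~mask.unsqueeze(1), float('-inf'))
    return torch.softmax(vec, dim=-1)

a = torch.rand(2, 2, 4)
out = masked_softmax(a, torch.tensor([2, 3]))
```

This avoids the Python loops entirely and also sidesteps the `vec > 0` check, which would misbehave if the input happened to contain negative or zero entries in valid positions.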