图像的注意力图

Question

我是 pytorch 新手。我想使用 imagenet 图像来了解每个像素对梯度的贡献有多大。为此，我正在尝试为我的图像构建注意力图。但是，在这样做时，我遇到了以下错误：

<ipython-input-64-08560ac86bab>:2: UserWarning: To copy  construct from a tensor, it is recommended to use    sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than   torch.tensor(sourceTensor).
  images_tensor = torch.tensor(images, requires_grad=True)
  <ipython-input-64-08560ac86bab>:3: UserWarning: To copy     construct from a tensor, it is recommended to use     sourceTensor.clone().detach() or   sourceTensor.clone().detach().requires_grad_(True), rather than    torch.tensor(sourceTensor).
  labels_tensor = torch.tensor(labels)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most   recent call last)
<ipython-input-65-49bfbb2b28f0> in <cell line: 20>()
 18     plt.show()
 19 
---> 20 show_attention_maps(X, y)

9 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py  in batch_norm(input, running_mean, running_var, weight, bias,  training, momentum, eps)
   2480         _verify_batch_size(input.size())
   2481 
-> 2482     return torch.batch_norm(
   2483         input, weight, bias, running_mean,     running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2484     )

RuntimeError: running_mean should contain 1 elements not 64

我尝试过在预处理中更改图像大小并将模型更改为resnet152而不是resnet18。根据我所做的研究，我的理解是第一层中的批标准化期望输入大小为 1，但我有 64。我不知道如何改变它。

我的代码在这里：

model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
import torch.nn as nn
new_conv1 = nn.Conv2d(15, 1, kernel_size=1, stride=1, padding=112)     
nn.init.constant_(new_conv1.weight, 1)
model.conv1 = new_conv1
model.eval()

for param in model.parameters():
    param.requires_grad = False

def show_attention_maps(X, y):
X_tensor = torch.cat([preprocess(Image.fromarray(x)) for x in X], dim=0)
y_tensor = torch.LongTensor(y)
attention = compute_attention_maps(X_tensor, y_tensor, model)
attention = attention.numpy()

N = X.shape[0]
for i in range(N):
    plt.subplot(2, N, i + 1)
    plt.imshow(X[i])
    plt.axis('off')
    plt.title(class_names[y[i]])
    plt.subplot(2, N, N + i + 1)
    plt.imshow(attention[i], cmap=plt.cm.gray)
    plt.axis('off')
    plt.gcf().set_size_inches(12, 5)
plt.suptitle('Attention maps')
plt.show()

show_attention_maps(X, y)

def compute_attention_maps(images, labels, model):
    images_tensor = torch.tensor(images, requires_grad=True)
    labels_tensor = torch.tensor(labels)
    predictions = model(images_tensor.unsqueeze(0))
    criterion = torch.nn.CrossEntropyLoss()
    loss = criterion(predictions, labels_tensor)
    model.zero_grad()
    loss.backward()
    gradients = images_tensor.grad
    attention_maps = torch.mean(gradients.abs(), dim=1)
    return attention_maps

提前非常感谢您。

编辑：我改变了我的问题，因为我能够通过更改 resnet 的 conv1 （在我提供的代码的第 3 行中）解决我之前的问题，并且我仍在尝试计算注意力图。

Answer 1

您定义了卷积层来输出单个层，而在原始实现中，它输出

，here。这就是错误的来源，后续的批量归一化层需要

，而不是

。

图像的注意力图

问题描述投票：0回答：1

1个回答

最新问题

图像的注意力图

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1