Memory leak in a loop in PyTorch

Problem description

The following loop does not release the tensors produced on each pass, resulting in a memory leak. The leak is caused by the use of grad_loss.backward() in the code below. Am I missing something, or is this a problem with PyTorch?

    for (images, one_hot_labels) in tqdm(batched_train_data):
        # I collect batch size here because the last batch may have a smaller batch_size
        images = images.to(device)
        one_hot_labels = one_hot_labels.to(device)

        batch_size = images.shape[0]

        images.requires_grad = True
        optimizer.zero_grad()
        # as images is not a parameters optimizer.zero_grad() won't reset it's gradient
        if images.grad is not None:
            images.grad.data.zero_()

        probabilities = model.forward(images)

        # I want to use .backward() twice rather than autograd because I want to accumulate the gradients
        loss = loss_func(probabilities, one_hot_labels)
        loss.backward(create_graph=True)
        grad_loss = grad_loss_func(images.grad)
        grad_loss.backward()

        optimizer.step()

        labels = one_hot_labels.detach().argmax(dim=1)
        predictions = probabilities.detach().argmax(dim=1)
        num_correct = int(predictions.eq(labels).sum())

        train_data_length += batch_size
        train_correct += num_correct
        train_loss += float(loss.detach()) * batch_size

        writer.add_graph(model, images)
        writer.close()

        # To stop memory leaks
        del images
        del one_hot_labels
        del probabilities
        del loss
        del grad_loss
        del labels
        del predictions
        del num_correct
Tags: python, deep-learning, neural-network, pytorch
2 Answers

0 votes

To fix it, you need to replace

    images.grad.data.zero_()

with

    images.grad = None

I believe this is because executing images.grad.data.zero_() does not free the computation graph associated with images, which allows that graph to keep growing as the loop iterates.

I have also been told that you should avoid operating on .data, as doing so is unsafe.
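
For illustration, here is a minimal sketch of how that fix would sit in the loop from the question. It reuses the names defined there (batched_train_data, device, model, optimizer, loss_func, grad_loss_func), so it assumes those objects exist as in the original code:

    for images, one_hot_labels in batched_train_data:
        images = images.to(device)
        one_hot_labels = one_hot_labels.to(device)

        images.requires_grad = True
        optimizer.zero_grad()

        # Drop the old gradient tensor entirely instead of zeroing it in place,
        # so nothing from the previous iteration's graph stays reachable.
        images.grad = None

        probabilities = model(images)
        loss = loss_func(probabilities, one_hot_labels)
        # keep the graph so images.grad can itself be differentiated
        loss.backward(create_graph=True)
        grad_loss = grad_loss_func(images.grad)
        grad_loss.backward()

        optimizer.step()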


0 votes

If there is a section of your code where you do not want a graph to be built for the backward pass, use:

with torch.no_grad():
  # your code goes here
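
Applied to the loop in the question, the metric bookkeeping is one natural place for this, since it needs no gradients at all. A small sketch, reusing the question's variable names (probabilities, one_hot_labels, loss, batch_size, train_loss are assumed to exist as in the original loop):

    import torch

    with torch.no_grad():
        # No graph is built for any of these operations, so nothing here
        # can keep the forward graph alive between iterations.
        labels = one_hot_labels.argmax(dim=1)
        predictions = probabilities.argmax(dim=1)
        num_correct = int(predictions.eq(labels).sum())
        train_loss += float(loss) * batch_size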
