How can I use more GPU RAM in Google Colab?

Question · votes: 0 · answers: 1

I'm working on a deep learning project in PyTorch where I have two fully connected neural networks that I need to train and then test. But when I run the code in Google Colab, it is not much faster than running on my PC's CPU. By the way, I have Colab Pro. It also only uses about 0.6 GB of the A100's 40 GB of GPU RAM.
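(For reference, a minimal sketch of how the allocated GPU memory can be checked from inside the notebook, assuming a CUDA runtime is available; this is not part of the original code:)

import torch

if torch.cuda.is_available():
    device = torch.device("cuda:0")
    # memory currently allocated by tensors vs. total memory on the card
    allocated_gb = torch.cuda.memory_allocated(device) / 1024**3
    total_gb = torch.cuda.get_device_properties(device).total_memory / 1024**3
    print(f"GPU: {torch.cuda.get_device_name(device)}")
    print(f"Allocated: {allocated_gb:.2f} GB / {total_gb:.2f} GB")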

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim


device = torch.device("cuda:0")
# Define transform
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load FashionMNIST dataset
trainset = torchvision.datasets.FashionMNIST('./data', download=True, train=True, transform=transform)
testset = torchvision.datasets.FashionMNIST('./data', download=True, train=False, transform=transform)

# Create data loaders
trainloader = torch.utils.data.DataLoader(trainset, batch_size=1, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(testset, batch_size=1, shuffle=False, num_workers=2)

# Define constant for classes
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')




# Define the fully connected neural network
class FCNN(nn.Module):
    def __init__(self, num_layers=1):
        super(FCNN, self).__init__()
        self.num_layers = num_layers
        self.fc_layers = nn.ModuleList()
        if self.num_layers == 1:
            self.fc_layers.append(nn.Linear(28 * 28, 1024))
        elif self.num_layers == 2:
            self.fc_layers.append(nn.Linear(28 * 28, 1024))
            self.fc_layers.append(nn.Linear(1024, 1024))
        self.output_layer = nn.Linear(1024, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        for layer in self.fc_layers:
            x = nn.functional.relu(layer(x))
        x = self.output_layer(x)
        return x

# Modify the train function to move inputs and labels to the GPU
def train(net, criterion, optimizer, epochs=15):
    for epoch in range(epochs):
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()

            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            if i % 2000 == 1999:
                print('[%d, %5d] loss: %.2f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

# Define function to test accuracy
def test(net):
    correct = 0
    total = 0
    with torch.no_grad():
        for data in testloader:
            # move the test batch to the same device as the model
            images, labels = data[0].to(device), data[1].to(device)
            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print('Accuracy: %d %%' % (
            100 * correct / total))

# Main function
if __name__ == "__main__":
    # Define the network
    net1 = FCNN(num_layers=1)
    net2 = FCNN(num_layers=2)
    net2.to(device)

    # Define loss function and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer1 = optim.SGD(net1.parameters(), lr=0.001, momentum=0.0)
    optimizer2 = optim.SGD(net2.parameters(), lr=0.001, momentum=0.0)

    # Train and test network with 1 FC layer
    #print("Training network with 1 layer...")
    #train(net1, criterion, optimizer1)
    #test(net1)

    # Train and test network with 2 FC layers
    print("Training network with 2 layers...")
    train(net2, criterion, optimizer2)
    test(net2)
I tried using different GPUs in Google Colab, and I tried adding this line so the CUDA device is always used:

device = torch.device("cuda:0")

and had the network use that device (net2.to(device) in the code above).
machine-learning deep-learning pytorch google-colaboratory
1 Answer

0 votes

As the comments you received already say, the simplest way to get higher utilization out of the A100 GPU is to increase the batch size.
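A minimal sketch of that change, assuming the same FashionMNIST datasets from the question (256 is just an example value; for a model this small, much larger batches still fit comfortably in 40 GB of GPU RAM):

# larger batches keep the A100 busy instead of feeding it one image at a time
trainloader = torch.utils.data.DataLoader(trainset, batch_size=256, shuffle=True, num_workers=2)
testloader = torch.utils.data.DataLoader(testset, batch_size=256, shuffle=False, num_workers=2)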

It's important to keep in mind that the underlying issue here is that you are testing a very simple model, made up of only a couple of layers, and training it on very simple data that is not far from a toy dataset. Fashion-MNIST was created as a drop-in replacement for the original MNIST dataset; it is more complex than MNIST, but its images are still grayscale and only 28x28 pixels. To get higher GPU utilization, and therefore a larger training-speed gap compared to running locally, you should try more complex models and datasets. Increasing the complexity of the model and the data will make the difference obvious.
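As an illustration only (the dataset and architecture below are assumptions, not something the answer prescribes), a slightly heavier setup such as CIFAR-10 with a small convolutional network already gives the GPU noticeably more work per batch:

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# CIFAR-10: 32x32 RGB images, a step up in complexity from Fashion-MNIST
trainset = torchvision.datasets.CIFAR10('./data', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=256, shuffle=True, num_workers=2)

# a small CNN; more channels/layers increase GPU utilization further
net = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(128 * 16 * 16, 256), nn.ReLU(),
    nn.Linear(256, 10),
)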
