Validation accuracy above 100% in PyTorch?


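I am working on a project that classifies images of breast tumors to predict whether they are benign (normal) or malignant (dangerous), using PyTorch. I have set up a network that extends the ResNet18 model with additional layers. I originally set the hyperparameters by hand, but I would like to use a method such as random search to find the best ones. However, after running the validation loop to measure accuracy, I get values above 100%?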

I load the data like this:


import torch
import torch.utils.data as data
from torchvision import transforms

transformed_data = transforms.Compose([ # using Compose() to chain data transformations
    transforms.ToTensor()
])



training_dataset = DataSetClass(split="train", transform=transformed_data, download=download)
# testing_dataset = DataSetClass(split="test", transform=transformed_data, download=download)

train_size = int(training_ratio * len(training_dataset))
validation_size = int(validation_ratio * len(training_dataset))
testing_size = len(training_dataset) - train_size - validation_size

train_set, validation_set, testing_dataset = torch.utils.data.random_split(
    training_dataset, [train_size, validation_size, testing_size]
)

# convert data into dataloader form
train_loader = data.DataLoader(dataset=train_set, batch_size=BATCH_SIZE, shuffle=True)
validation_loader = data.DataLoader(dataset=validation_set, batch_size=2*BATCH_SIZE, shuffle=False)
test_loader = data.DataLoader(dataset=testing_dataset, batch_size=BATCH_SIZE, shuffle=False)


print(training_dataset)
print("=======================")
print(testing_dataset)

I tried creating a list of candidate values for each hyperparameter, as shown below, and running the validation loop at the same time:

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# Define your hyperparameters space
lr_space = [0.01, 0.02, 0.03, 0.04, 0.05]
epochs_space = [10, 20, 30, 40, 50]
batch_size_space = [32, 64, 128, 256]

# Initialize your network
network = ExtendedNetwork(resnet18)
network.to(device=device)

# Define your loss function
loss_function = nn.BCEWithLogitsLoss()

network.eval()

best_accuracy = 0
best_hyperparameters = None
validation_accuracy = 0


# Perform random search
for _ in range(100): 
    lr = np.random.choice(lr_space)
    epochs = np.random.choice(epochs_space)
    batch_size = np.random.choice(batch_size_space)

    # Use the selected hyperparameters to train your model
    optimizer = optim.Adam(network.parameters(), lr=lr)
    
    for epoch in range(epochs):
        with torch.no_grad():
            for inputs, targets in validation_loader:
                inputs, targets = inputs.to(device), targets.to(device)

                # Forward
                output = network(inputs)

                # Calculate and accumulate accuracy
                validation_accuracy += accuracy(output, targets)

        # Calculate average accuracy over all validation batches
        validation_accuracy /= len(validation_loader)

        print('Validation accuracy:', validation_accuracy)

        # If the current model is better than all previous models, update the best accuracy and best hyperparameters
        if validation_accuracy > best_accuracy:
            best_accuracy = validation_accuracy
            best_hyperparameters = {'lr': lr, 'epochs': epochs, 'batch_size': batch_size}

print('Best accuracy:', best_accuracy)
print('Best hyperparameters:', best_hyperparameters)

This produces:

Validation accuracy: 0.045871559633027525
Validation accuracy: 0.09174311926605505
...
Validation accuracy: 135.73394495412717
Validation accuracy: 135.77981651376018

Best accuracy: 135.77981651376018
Best hyperparameters: {'lr': 0.03, 'epochs': 40, 'batch_size': 256}

But accuracy should be capped at 100%? I can't tell whether my mistake comes from how I load the data or from the validation loop.

python machine-learning pytorch resnet
1 Answer

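You should reset validation_accuracy to 0 at the start of each epoch. The loop adds to it with += for every batch and afterwards only divides it by len(validation_loader), so each epoch starts from the previous epoch's result rather than from zero. That is why the printed value climbs by about 0.0459 per epoch and ends up far above 1.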
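A minimal sketch of that fix, assuming a binary classifier that emits one logit per sample (consistent with the BCEWithLogitsLoss in the question) and targets encoded as 0/1. The evaluate helper, toy_network, and toy_loader are hypothetical names used only for illustration; they are not part of the original code. Instead of averaging per-batch accuracies, it counts correct predictions and samples, and both counters start from zero on every call, so the result stays between 0 and 1.

```python
import torch
from torch import nn


def evaluate(network, loader, device="cpu"):
    """Return accuracy in [0, 1], assuming one logit per sample and 0/1 targets."""
    network.eval()
    correct = 0  # fresh counters on every call: nothing carries over between epochs
    total = 0
    with torch.no_grad():
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            logits = network(inputs).view(-1)
            # A logit above 0 is equivalent to sigmoid(logit) > 0.5.
            predictions = (logits > 0).long()
            correct += (predictions == targets.view(-1).long()).sum().item()
            total += targets.numel()
    return correct / total


# Toy check with a fixed linear "network" (logit == input) and two hand-made batches.
toy_network = nn.Linear(1, 1)
with torch.no_grad():
    toy_network.weight.fill_(1.0)
    toy_network.bias.fill_(0.0)

toy_loader = [
    (torch.tensor([[-1.0], [2.0]]), torch.tensor([[0.0], [1.0]])),
    (torch.tensor([[3.0], [-4.0]]), torch.tensor([[0.0], [0.0]])),
]

for epoch in range(3):
    # 3 of the 4 toy samples are classified correctly, so this prints 0.75 every epoch.
    print(epoch, evaluate(toy_network, toy_loader))
```

In the question's loop, the inner validation code could then become a single call per epoch, validation_accuracy = evaluate(network, validation_loader, device). Note also that the loop as posted never trains the network (there is no loss.backward() or optimizer.step()), and the sampled batch_size is never passed to a DataLoader, so every trial evaluates the same weights on the same batches.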
