Finding the standard deviation and mean for the torchvision Normalize function


I would like to know how to find the mean and standard deviation of 3 images. These values will be used as the inputs to the Normalize function in PyTorch (from torchvision.transforms import Normalize).

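In the particular dataset I am working with, the 3 color channels are stored in separate tif files. Since the computation is simply repeated for each channel, I will only show it for the red band.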

Method 1

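I load each image as a 1x120x120 tensor, compute the mean of the red channel, and append it to a list to keep track of the means of the 3 images (the mean across pixels). Finally, to get the dataset mean for the red channel, I simply take the mean of the list (the mean of the per-image means). The standard deviation is computed by the same process.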

import os

import numpy as np
from PIL import Image
from torchvision.transforms import ToTensor

def get_mean_std(root: str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs 
        - root : Path to Root directory of dataset
    """
    rb_list = []
    gb_list = []
    bb_list = []

    mean = 0
    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Path to RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])

        # Open each Image and convert to tensor
        rb = ToTensor()(Image.open(rb_pth)).float() #(1,120,120)
        gb = ToTensor()(Image.open(gb_pth)).float() #(1,120,120)
        bb = ToTensor()(Image.open(bb_pth)).float() #(1,120,120)

        # Find the mean of all pixels 
        rb_list.append(rb.mean().item())
    

    mean_of_3_images = np.array(rb_list).mean()
    print(f"rb_list : {rb_list}")
    print(f"mean of red channel : {mean_of_3_images}")

# output
>>> rb_list : [281.01361083984375, 266.2029113769531, 1977.7083740234375]
>>> mean of red channel : 841.6416320800781

Method 2

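This follows the article at https://saturncloud.io/blog/how-to-normalize-image-dataset-using-pytorch/#step-2-calculate-the-mean-and-standard-deviation-of-the-dataset, modified to work with this dataset. There, the author keeps track of the total pixel count and a running mean over all pixels, then divides the mean by the number of pixels.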

However, the two methods give different results.

def get_mean_std(root:str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs 
        - root : Path to Root directory of dataset
    """
    mean = 0
    num_pixels = 0

    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Path to RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])

        # Open each Image and convert to tensor
        rb = ToTensor()(Image.open(rb_pth)).float() #(1,120,120)
        gb = ToTensor()(Image.open(gb_pth)).float() #(1,120,120)
        bb = ToTensor()(Image.open(bb_pth)).float() #(1,120,120)

        batch, height, width = rb.shape #(1,120,120)
        num_pixels += batch * height * width
        mean += rb.mean().sum()
       
    print(mean)
    print(mean / num_pixels)

# Output
>>> tensor(2524.9248)
>>> tensor(0.0584)

I would like to know why the values are so different. Any idea why my approach is incorrect?

Just to give an idea of the red-band values within the 3 images:

tensor([[[322., 275., 262.,  ..., 260., 225., 268.],
         [283., 271., 259.,  ..., 277., 269., 278.],
         [302., 303., 276.,  ..., 305., 279., 283.],
         ...,
         [398., 341., 374.,  ..., 246., 273., 227.],
         [383., 351., 375.,  ..., 266., 277., 260.],
         [353., 347., 359.,  ..., 280., 260., 227.]]])

tensor([[[153., 214., 242.,  ..., 825., 575., 399.],
         [206., 223., 198.,  ..., 766., 507., 477.],
         [219., 256., 189.,  ..., 593., 365., 384.],
         ...,
         [138., 255., 329.,  ..., 227., 289., 334.],
         [174., 215., 276.,  ..., 402., 395., 350.],
         [216., 212., 214.,  ..., 354., 362., 312.]]])

tensor([[[1727., 1852., 1184.,  ..., 3494., 3539., 3374.],
         [1882., 1868., 1307.,  ..., 3523., 3443., 3278.],
         [1716., 1975., 1919.,  ..., 3280., 3319., 3121.],
         ...,
         [2199., 2214., 2269.,  ..., 2563., 2284., 2147.],
         [2181., 2213., 2312.,  ..., 2686., 2668., 2737.],
         [2208., 2297., 2351.,  ..., 2647., 2904., 3008.]]])
python pytorch computer-vision normalization
1 Answer

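The link you mention performs the mean and standard deviation calculation incorrectly. The first implementation is the right approach.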
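The numbers in the question are consistent with this. In Method 2 as posted, mean += rb.mean().sum() adds each image's mean, so the accumulated value 2524.9248 is the sum of the three per-image means (about 3 x 841.64). Dividing that by the pixel count 3 x 120 x 120 = 43200 gives 0.0584. Dividing by the number of images, or accumulating rb.sum() instead of rb.mean(), would recover 841.64. The sketch below illustrates this with NumPy on synthetic arrays, which are stand-ins for the real tif data. The helper name pooled_mean_std is hypothetical. It also shows one caveat: for equally sized images the mean of the per-image means matches the pooled mean, but averaging per-image standard deviations generally does not give the pooled standard deviation.

```python
import numpy as np


def pooled_mean_std(images):
    """Mean and population std of one channel, pooled over every pixel."""
    total = 0.0       # running sum of pixel values
    total_sq = 0.0    # running sum of squared pixel values
    num_pixels = 0
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        total += img.sum()
        total_sq += (img ** 2).sum()
        num_pixels += img.size
    mean = total / num_pixels
    # Population variance: E[x^2] - (E[x])^2, clipped at 0 for safety
    var = max(total_sq / num_pixels - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Synthetic stand-ins for three (1, 120, 120) red-band images (assumed ranges)
rng = np.random.default_rng(0)
ranges = [(200, 400), (100, 800), (1000, 3500)]
images = [rng.uniform(lo, hi, size=(1, 120, 120)) for lo, hi in ranges]

mean, std = pooled_mean_std(images)

# Method 1: mean of the per-image means (valid here because sizes are equal)
mean_of_means = np.mean([img.mean() for img in images])

# Method 2 as posted: sum of per-image means divided by the total pixel count
buggy_mean = sum(img.mean() for img in images) / sum(img.size for img in images)

# Averaging per-image stds ignores the spread between images
mean_of_stds = np.mean([img.std() for img in images])

print(f"pooled mean   : {mean:.4f}")
print(f"mean of means : {mean_of_means:.4f}")
print(f"buggy mean    : {buggy_mean:.6f}")
print(f"pooled std    : {std:.4f}")
print(f"mean of stds  : {mean_of_stds:.4f}")
```

Once the per-channel values have been computed this way for the red, green and blue bands, they could be passed as the mean and std sequences of Normalize.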
