I would like to know how to find the mean and standard deviation of 3 images. These values will be used as inputs to the Normalize function in PyTorch (from torchvision.transforms import Normalize).
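In the particular dataset I am working with, the 3 color channels are stored in separate tif files. Because the computation is the same for every channel, I will only show it for the red band.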
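Method 1
I load each image as a 1x120x120 tensor, take the mean of the red channel over all of its pixels, and append it to a list that tracks the per-image means of the 3 images. Finally, to get the red channel mean for the dataset, I take the mean of that list (the mean over images). Computing the standard deviation follows the same process.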
import os
import numpy as np
from PIL import Image
from torchvision.transforms import ToTensor

def get_mean_std(root: str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs
    - root : Path to Root directory of dataset
    """
    rb_list = []
    gb_list = []
    bb_list = []
    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Paths to the RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])
        # Open each image and convert it to a tensor
        rb = ToTensor()(Image.open(rb_pth)).float()  # (1,120,120)
        gb = ToTensor()(Image.open(gb_pth)).float()  # (1,120,120)
        bb = ToTensor()(Image.open(bb_pth)).float()  # (1,120,120)
        # Mean over all pixels of this image
        rb_list.append(rb.mean().item())
    # Mean of the per-image means
    mean_of_3_images = np.array(rb_list).mean()
    print(f"rb_list : {rb_list}")
    print(f"mean of red channel : {mean_of_3_images}")

# Output
>>> rb_list : [281.01361083984375, 266.2029113769531, 1977.7083740234375]
>>> mean of red channel : 841.6416320800781
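Method 2
This follows this article (https://saturncloud.io/blog/how-to-normalize-image-dataset-using-pytorch/#step-2-calculate-the-mean-and-standard-deviation-of-the-dataset), modified to work with this dataset. There, the author keeps track of the pixel count and an accumulated mean over all pixels, then divides the accumulated mean by the number of pixels.
However, the two methods give different results.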
def get_mean_std(root: str):
    """
    Finds the mean and standard deviation of channels in a dataset
    Inputs
    - root : Path to Root directory of dataset
    """
    mean = 0
    num_pixels = 0
    for data_folder in os.listdir(root)[:3]:
        # Path to the folder containing 12 tif files and a json file
        data_folder_pth = os.path.join(root, data_folder)
        # Paths to the RGB channels | rb refers to red band ...
        rb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B04.tif")][0])
        gb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B03.tif")][0])
        bb_pth = os.path.join(data_folder_pth, [f for f in os.listdir(data_folder_pth) if f.endswith("B02.tif")][0])
        # Open each image and convert it to a tensor
        rb = ToTensor()(Image.open(rb_pth)).float()  # (1,120,120)
        gb = ToTensor()(Image.open(gb_pth)).float()  # (1,120,120)
        bb = ToTensor()(Image.open(bb_pth)).float()  # (1,120,120)
        batch, height, width = rb.shape  # (1,120,120)
        num_pixels += batch * height * width
        mean += rb.mean().sum()
    print(mean)
    print(mean / num_pixels)

# Output
>>> tensor(2524.9248)
>>> tensor(0.0584)
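As a quick sanity check on how these two printed numbers relate to the Method 1 output, the arithmetic can be reproduced in plain Python. This is only an illustrative sketch: the variable names are made up here, and the values are copied from the outputs shown above.

```python
# Per-image red-band means, copied from the Method 1 output
rb_means = [281.01361083984375, 266.2029113769531, 1977.7083740234375]

# Three tensors of shape (1, 120, 120)
num_pixels = 3 * 1 * 120 * 120

sum_of_means = sum(rb_means)

# Method 2 accumulates the per-image means, then divides by the pixel count
print(sum_of_means)                  # close to tensor(2524.9248)
print(sum_of_means / num_pixels)     # close to tensor(0.0584)

# Method 1 divides the same sum by the number of images instead
print(sum_of_means / len(rb_means))  # close to 841.6416
```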
I would like to know why the values are so different. Any idea why my approach is incorrect?
Just to give an idea of the red band values in the 3 images:
tensor([[[322., 275., 262., ..., 260., 225., 268.],
         [283., 271., 259., ..., 277., 269., 278.],
         [302., 303., 276., ..., 305., 279., 283.],
         ...,
         [398., 341., 374., ..., 246., 273., 227.],
         [383., 351., 375., ..., 266., 277., 260.],
         [353., 347., 359., ..., 280., 260., 227.]]])
tensor([[[153., 214., 242., ..., 825., 575., 399.],
         [206., 223., 198., ..., 766., 507., 477.],
         [219., 256., 189., ..., 593., 365., 384.],
         ...,
         [138., 255., 329., ..., 227., 289., 334.],
         [174., 215., 276., ..., 402., 395., 350.],
         [216., 212., 214., ..., 354., 362., 312.]]])
tensor([[[1727., 1852., 1184., ..., 3494., 3539., 3374.],
         [1882., 1868., 1307., ..., 3523., 3443., 3278.],
         [1716., 1975., 1919., ..., 3280., 3319., 3121.],
         ...,
         [2199., 2214., 2269., ..., 2563., 2284., 2147.],
         [2181., 2213., 2312., ..., 2686., 2668., 2737.],
         [2208., 2297., 2351., ..., 2647., 2904., 3008.]]])
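The link you mentioned computes the mean and standard deviation incorrectly. The first implementation is the correct approach. In your second function, rb.mean().sum() adds each image's mean, which is already averaged over that image's pixels, and the total is then divided by the pixel count again, so the result comes out far too small.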
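For reference, here is a minimal sketch of a pixel-weighted computation for a single channel. It is an illustration under stated assumptions, not the asker's or the article's code: the helper name channel_mean_std is hypothetical, and the three tensors are synthetic stand-ins for the (1, 120, 120) red-band images. It accumulates the pixel count, the sum, and the sum of squares, so the standard deviation is taken over all pixels (averaging per-image standard deviations would generally give a different number). When all images have the same size, the mean agrees with the mean of per-image means from Method 1.

```python
import torch

def channel_mean_std(images):
    """Mean and population std over all pixels of an iterable of tensors."""
    num_pixels = 0
    total = 0.0
    total_sq = 0.0
    for img in images:
        # float64 keeps the squared sum accurate for values in the thousands
        img = img.double()
        num_pixels += img.numel()
        total += img.sum().item()
        total_sq += (img ** 2).sum().item()
    mean = total / num_pixels
    # Var[x] = E[x^2] - E[x]^2, clamped at 0 to guard against rounding
    var = max(total_sq / num_pixels - mean ** 2, 0.0)
    return mean, var ** 0.5

# Synthetic stand-ins for three red-band tensors (hypothetical data)
torch.manual_seed(0)
red_bands = [torch.rand(1, 120, 120) * scale for scale in (300.0, 300.0, 3000.0)]

mean, std = channel_mean_std(red_bands)
print(f"mean of red channel : {mean}")
print(f"std of red channel  : {std}")
```

For a single-channel tensor, the two values could then be passed on as Normalize(mean=[mean], std=[std]); with three channels, each band would contribute one entry to both lists.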