Is there any way to write a down-/up-scaling function that preserves gradient flow? I ask because backpropagation has to train both the scale-factor generator and the Faster R-CNN.
I am designing a computer-vision deep-learning pipeline in PyTorch.
Background:
The pipeline consists of two parts: preprocessing and the vision task. For the preprocessing part, I built a scale-factor generator model and an image down-/up-scaling module. In the vision-task part, I forward the preprocessed image through an object detector.
The training pipeline looks like this: img input -> scale-factor generation (forward pass on img) -> downscale img by the scale factor -> upscale img by 1/scale factor -> vision task (Faster R-CNN on the upscaled img) -> backward.
Problem:
While designing this pipeline, I got stuck on the down-/up-scaling module. I tried PyTorch's F.interpolate(img, scale_factor, ...) function, but it blocks the gradient flow: I pass in the original image together with the scale factor (which comes from a model and carries a gradient), but the grad_fn disappears.
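A minimal, self-contained sketch of the failure mode (the 0-dim scale tensor below is just a stand-in for the generator's output):

import torch
import torch.nn.functional as F

img = torch.randn(1, 3, 32, 32, requires_grad=True)
scale = torch.tensor(0.5, requires_grad=True)  # stand-in for the predicted scale factor

# F.interpolate expects a plain Python number for scale_factor, so the
# tensor has to be detached (e.g. via .item()) -- and that is exactly
# where the gradient path back to the scale-factor generator is cut:
out = F.interpolate(img, scale_factor=scale.item(), mode='bilinear', align_corners=False)

print(out.grad_fn)    # UpsampleBilinear2DBackward0: gradients still flow w.r.t. img
out.sum().backward()
print(scale.grad)     # None: no gradient reaches the scale factor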
I tried writing a custom interpolation function like this:
def bilinear_interpolate(self, img, scale_factor):
    print('img, scale_factor :', img, scale_factor)
    n, c, h, w = img.size()
    # int() here already discards scale_factor's grad history
    new_h, new_w = int(h * scale_factor), int(w * scale_factor)
    device = img.device
    # coordinates of every output pixel in the input image
    h_scale = torch.linspace(0, h - 1, new_h, device=device)
    w_scale = torch.linspace(0, w - 1, new_w, device=device)
    grid_h, grid_w = torch.meshgrid(h_scale, w_scale, indexing='ij')
    # integer neighbours of each coordinate
    h_floor = grid_h.floor().long()
    h_ceil = (h_floor + 1).clamp(max=h - 1)
    w_floor = grid_w.floor().long()
    w_ceil = (w_floor + 1).clamp(max=w - 1)
    print('h_floor, h_ceil, w_floor, w_ceil :', h_floor, h_ceil, w_floor, w_ceil)
    # the four corner pixels (advanced indexing keeps the n, c dims)
    tl = img[:, :, h_floor, w_floor]
    tr = img[:, :, h_floor, w_ceil]
    bl = img[:, :, h_ceil, w_floor]
    br = img[:, :, h_ceil, w_ceil]
    # fractional offsets act as the interpolation weights
    h_frac = grid_h - h_floor
    w_frac = grid_w - w_floor
    # bilinear interpolation
    top = tl + (tr - tl) * w_frac
    bottom = bl + (br - bl) * w_frac
    interpolated_img = top + (bottom - top) * h_frac
    return interpolated_img
But it doesn't work: the int/float conversions and the integer index assignments break the autograd graph, so no gradient ever reaches scale_factor.
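The break happens at new_h = int(h * scale_factor): the cast throws away the grad history. One possible workaround (a sketch, assuming a 0-dim scale tensor; resize_keep_grad is a hypothetical helper) is to confine the int() cast to the output size and let the sampling grid of F.grid_sample carry the differentiable dependence on the scale:

import torch
import torch.nn.functional as F

def resize_keep_grad(img, scale):
    n, c, h, w = img.shape
    # The int() cast is confined to the output *size*; the grid below
    # still depends on `scale` differentiably.
    new_h, new_w = int(h * scale.item()), int(w * scale.item())
    ys = torch.arange(new_h, device=img.device, dtype=img.dtype)
    xs = torch.arange(new_w, device=img.device, dtype=img.dtype)
    # Map output pixels back to input coordinates through `scale`
    # (keeps `scale` in the autograd graph), then normalize to [-1, 1].
    gy = (ys / scale) / (h - 1) * 2 - 1
    gx = (xs / scale) / (w - 1) * 2 - 1
    grid_y, grid_x = torch.meshgrid(gy, gx, indexing='ij')
    grid = torch.stack((grid_x, grid_y), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    return F.grid_sample(img, grid, mode='bilinear', align_corners=True)

img = torch.randn(1, 3, 32, 32)
scale = torch.tensor(0.5, requires_grad=True)
out = resize_keep_grad(img, scale)
out.sum().backward()
print(scale.grad)   # a finite number: the gradient reaches the scale factor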
More generally: is there any way to build a down-/up-scaling function that preserves gradient flow? I need backpropagation to train both the scale-factor generator and the Faster R-CNN.
The BilinearInterpolation layer below performs the scaling while preserving gradient flow. It simply wraps F.interpolate, and the gradient function at the output is <UpsampleBilinear2DBackward0>. Output:
"grad_fn" of z_scaled is: <UpsampleBilinear2DBackward0 object...>
The interpolation layer:
class BilinearInterpolation(nn.Module):
    def __init__(self, scale_factor):
        super(BilinearInterpolation, self).__init__()
        self.scale_factor = scale_factor

    def forward(self, x):
        batch_size, channels, height, width = x.size()
        new_height = int(height * self.scale_factor)
        new_width = int(width * self.scale_factor)
        # Perform bilinear interpolation
        interpolated = F.interpolate(x, size=(new_height, new_width), mode='bilinear', align_corners=True)
        return interpolated
Full example:

import torch
from torch import nn
import torch.nn.functional as F

class BilinearInterpolation(nn.Module):
    def __init__(self, scale_factor):
        super(BilinearInterpolation, self).__init__()
        self.scale_factor = scale_factor

    def forward(self, x):
        batch_size, channels, height, width = x.size()
        new_height = int(height * self.scale_factor)
        new_width = int(width * self.scale_factor)
        # Perform bilinear interpolation
        interpolated = F.interpolate(x, size=(new_height, new_width), mode='bilinear')
        return interpolated
#
# Test data
#
import numpy as np
xx, yy = np.meshgrid(*[np.linspace(-1, 1)] * 2)
z = np.sin(xx)**2 + np.cos(yy)**2
z = torch.tensor(z).float()
z = z[None, None, ...]
z.requires_grad = True
# View original data
import matplotlib.pyplot as plt
plt.contourf(z[0, 0, :, :].detach(), cmap='YlGnBu')
plt.text(x=9, y=42, s=f'original tensor\n{list(z.shape)}', fontweight='bold')

# Scale the tensor
scale_factor = 2
z_scaled = BilinearInterpolation(scale_factor=scale_factor)(z)

# View scaled data
plt.contourf(z_scaled[0, 0, ...].detach(), cmap='YlGnBu', zorder=0)
plt.text(x=28, y=91, s=f'{scale_factor}x interpolated tensor\ndims={list(z_scaled.shape)}', fontweight='bold')
plt.xlabel('x')
plt.ylabel('y')
plt.gcf().set_size_inches(5, 5)
print('"grad_fn" of z_scaled is:', z_scaled.grad_fn)