Image bicubic interpolation does not match the OpenCV and scikit-image implementations

Problem description (votes: 0, answers: 1)

I am trying to implement bicubic convolution interpolation of images in Python, following the paper "Cubic Convolution Interpolation for Digital Image Processing". However, although my implementation looks like a reasonable scale, it still differs from the reference implementations, and I do not understand why. This is especially noticeable on smaller images, as shown in the picture below:
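For reference, the kernel I am implementing (the cubic convolution kernel from Keys' paper, with the free parameter a, where a = -0.5 gives the Catmull-Rom spline) is, as I understand it:

$$
u(s) =
\begin{cases}
(a+2)\,|s|^{3} - (a+3)\,|s|^{2} + 1 & 0 \le |s| < 1 \\
a\,|s|^{3} - 5a\,|s|^{2} + 8a\,|s| - 4a & 1 \le |s| < 2 \\
0 & \text{otherwise}
\end{cases}
$$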

This is the image produced by the MWE, showing the original unscaled image, my (incorrect) bicubic scale, the OpenCV/skimage bicubic scales, and their absolute differences from my scaled image.

Here is the code I have converted into an MWE so far, with the multiprocessing removed:

import math
import time
from functools import cache

import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
import skimage


def u(s: float):
    # bicubic convolution kernel aka catmull-rom spline
    # the value of a here is -0.5 as that was used in Keys' version
    a: float = -0.5
    s = abs(s)
    if 0 <= s < 1:
        return (a + 2) * s**3 - (a + 3) * s**2 + 1
    elif 1 <= s < 2:
        return a * s**3 - 5 * a * s**2 + 8 * a * s - 4 * a
    return 0


in_file = "test_sharpen.png"
ratio = 2.0

im_data = cv.imread(str(in_file))

# convert from OpenCV's BGR ordering to RGB, because plt expects RGB
im_data = cv.cvtColor(im_data, cv.COLOR_BGR2RGB)

start = time.perf_counter()
print("Scaling image...")

H, W, C = im_data.shape

# pad by 2 px
image = cv.copyMakeBorder(im_data, 2, 2, 2, 2, cv.BORDER_REFLECT)

image = image.astype(np.float64) / 255

# create new image
new_H = math.floor(H * ratio)
new_W = math.floor(W * ratio)
big_image = np.zeros((new_H, new_W, C))
for c in range(C):
    for j in range(new_H):
        # scale new image's coordinate to be in old image
        y = j * (1 / ratio) + 2
        # we separate x and y to integer and fractional parts
        iy = int(y)
        # ix and iy are essentially the closest original pixels
        # as all the old pixels are in integer positions
        # decx and decy as the fractional parts are then the distances
        # to the original pixels on the left and above
        decy = iy - y
        for i in range(new_W):
            x = i * (1 / ratio) + 2
            ix = int(x)
            decx = ix - x

            pix = sum(
                sum(
                    image[iy + M, ix + L, c] * u(decx + L) * u(decy + M)
                    for L in range(-1, 2 + 1)
                )
                for M in range(-1, 2 + 1)
            )

            # we limit results to [0, 1] because bicubic interpolation
            # can produce pixel values outside the original range
            big_image[j, i, c] = max(min(1, pix), 0)

big_image = (big_image * 255).astype(np.uint8)

print(f"Finished scaling in {time.perf_counter() - start} seconds")


# generate proper bicubic scales with opencv and skimage
# and compare them to my scale with plt
proper_cv = cv.resize(im_data, None, None, ratio, ratio, cv.INTER_CUBIC)
proper_skimage = skimage.util.img_as_ubyte(
    skimage.transform.rescale(im_data, ratio, channel_axis=-1, order=3)
)


fig, ax = plt.subplots(nrows=4, ncols=2)
ax[0, 0].imshow(im_data)
ax[0, 0].set_title("Original")
ax[0, 1].imshow(big_image)
ax[0, 1].set_title("My scale")

ax[1, 0].set_title("Proper OpenCV")
ax[1, 0].imshow(proper_cv)
ax[1, 1].set_title("Proper Skimage")
ax[1, 1].imshow(proper_skimage)

print("my scale vs proper_cv psnr:", cv.PSNR(big_image, proper_cv))

ax[2, 0].set_title("Absdiff OpenCV vs My")
diffy_cv = cv.absdiff(big_image, proper_cv)
ax[2, 0].imshow(diffy_cv)
ax[2, 1].set_title("Absdiff Skimage vs My")
diffy_skimage = cv.absdiff(big_image, proper_skimage)
ax[2, 1].imshow(diffy_skimage)

ax[3, 1].set_title("Absdiff CV vs Skimage")
ax[3, 1].imshow(cv.absdiff(proper_cv, proper_skimage))
ax[3, 0].set_title("Absdiff CV vs Skimage")
ax[3, 0].imshow(cv.absdiff(proper_cv, proper_skimage))

print("diffy_cv", diffy_cv.min(), diffy_cv.max(), diffy_cv.dtype, diffy_cv.shape)
print(
    "diffy_skimage",
    diffy_skimage.min(),
    diffy_skimage.max(),
    diffy_skimage.dtype,
    diffy_skimage.shape,
)
print(
    "proper_skimage vs proper_opencv psnr:",
    cv.PSNR(proper_skimage, proper_cv),
    cv.absdiff(proper_cv, proper_skimage).max(),
)
plt.show()

It can be run with, for example:

python scaling.py

which scales test_sharpen.png by 2x (the input file and ratio are hard-coded at the top of the script).

This is what I have implemented so far; it looks like it works, but the result is still different. I also tried changing the value of a, but that was not the problem.

Tags: python, image-processing, interpolation, bicubic
1 Answer (0 votes)

It turns out that the way I was scaling the coordinates was wrong. For example, with a ratio of 2, the new points on the y axis land at 2.0, 2.5, 3.0, and so on.

That is wrong, because the coordinates should fall between the old points rather than directly on top of them. I changed the coordinate mapping to:

# scale new image's coordinate to be in old image based on its midpoint
y = ((j + 0.5) / ratio) - 0.5 + 2
x = ((i + 0.5) / ratio) - 0.5 + 2

Now the new point coordinates are 1.75, 2.25, 2.75, and so on. My intuition says the points should start at 2.25, but in that case the image appears slightly shifted compared to the references.
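A quick check (a minimal sketch, not part of the original MWE) that prints both mappings for ratio = 2 makes the shift obvious:

ratio = 2.0
for j in range(4):
    old_y = j * (1 / ratio) + 2           # original mapping: lands directly on old pixels
    new_y = (j + 0.5) / ratio - 0.5 + 2   # half-pixel-centre mapping
    print(j, old_y, new_y)
# prints 2.0 vs 1.75, 2.5 vs 2.25, 3.0 vs 2.75, 3.5 vs 3.25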

Now, with a = -0.75, this implementation matches cv2's implementation (and the others) almost perfectly; the only remaining difference is my reflected border, where the other implementations appear to use replication instead.
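To check that while ignoring the border difference, an interior-only comparison like the following can be used (a minimal sketch; the 4-pixel crop is just my guess at how far the differing border handling reaches):

# compare only the interiors, ignoring a few border pixels where my
# reflection padding differs from the other implementations
pad = 4  # assumed crop width, not measured precisely
inner_mine = big_image[pad:-pad, pad:-pad]
inner_cv = proper_cv[pad:-pad, pad:-pad]
print("interior PSNR vs cv2:", cv.PSNR(inner_mine, inner_cv))
print("interior max abs diff:", cv.absdiff(inner_mine, inner_cv).max())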

I have put the final code on GitHub together with a Rust version that is roughly 200x faster and can be tested on larger images.
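For anyone who wants to stay in Python, here is a rough sketch of how the same per-pixel loop can be vectorised with NumPy (it assumes the same setup as the MWE above: reflect padding of 2, upscaling ratios, and a = -0.5 by default; it is not the exact code from the repository):

import cv2 as cv
import numpy as np


def u_vec(s, a=-0.5):
    # vectorised cubic convolution kernel
    s = np.abs(s)
    w = np.zeros_like(s)
    near = s < 1
    far = (s >= 1) & (s < 2)
    w[near] = (a + 2) * s[near] ** 3 - (a + 3) * s[near] ** 2 + 1
    w[far] = a * s[far] ** 3 - 5 * a * s[far] ** 2 + 8 * a * s[far] - 4 * a
    return w


def bicubic_resize(img, ratio, a=-0.5):
    H, W, C = img.shape
    padded = cv.copyMakeBorder(img, 2, 2, 2, 2, cv.BORDER_REFLECT)
    padded = padded.astype(np.float64) / 255
    new_H, new_W = int(H * ratio), int(W * ratio)
    # half-pixel-centre mapping into the padded image, as in the answer above
    ys = (np.arange(new_H) + 0.5) / ratio - 0.5 + 2
    xs = (np.arange(new_W) + 0.5) / ratio - 0.5 + 2
    iy, ix = ys.astype(int), xs.astype(int)
    dy, dx = iy - ys, ix - xs
    out = np.zeros((new_H, new_W, C))
    # accumulate the 4x4 neighbourhood with separable row/column weights
    for M in range(-1, 3):
        wy = u_vec(dy + M, a)[:, None, None]
        rows = padded[iy + M]              # shape (new_H, W + 4, C)
        for L in range(-1, 3):
            wx = u_vec(dx + L, a)[None, :, None]
            out += rows[:, ix + L] * wy * wx
    return (np.clip(out, 0, 1) * 255).astype(np.uint8)

With a = -0.75 the result should come out close to cv.INTER_CUBIC, as described above.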
