Save RGBD as a single image

Question

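I am using this code https://www.programmersought.com/article/8773686326/ to create an RGBD tensor by combining an RGB image and a depth image. Now I would like to know whether the RGBD data can be saved as a single image (jpeg, png, ...). I tried imageio.imwrite(), plt.imsave(), and cv2.imwrite(), but none of them worked, possibly because of the shape [4, 64, 1216]. Is there a way to do this?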

import numpy as np
import torch
from PIL import Image
from torchvision import transforms

scale = (64, 1216)
 
resize_img = transforms.Resize(scale, Image.BILINEAR)
resize_depth = transforms.Resize(scale, Image.NEAREST)
to_tensor = transforms.ToTensor()
 
img_id = 0
 
# load image and resize
img = Image.open('RGB_image.jpg')
img = resize_img(img)
img = np.array(img)
 
# load depth and resize
depth = Image.open('depth_image.png')
depth = resize_depth(depth)
depth = np.array(depth)
depth = depth[:, :, np.newaxis]
 
# tensor shape and value, normalization
img = Image.fromarray(img).convert('RGB')
img = to_tensor(img).float()
 
depth = depth / 65535
depth = to_tensor(depth).float()

rgbd = torch.cat((img, depth), 0)
print("\n\nRGBD shape")
print(rgbd.shape)

Answer 1

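We can save the depth as the alpha channel of an image in RGBA pixel format.

The alpha channel normally carries transparency, but we can use it as a fourth channel so that RGB and depth are stored together.

Since depth may require high precision (possibly float32), I recommend using the OpenEXR image format.
For compatibility with the OpenEXR format, we can convert all channels to float32 in the range [0, 1].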
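To illustrate why float32 in the range [0, 1] is precise enough here, the following minimal sketch checks that 16-bit depth values survive the normalization round trip. It assumes the depth map is stored as uint16 (the code further down makes the same assumption); the array `depth16` is synthetic, not data from the question.

```python
import numpy as np

# Synthetic depth data covering every possible uint16 value (an assumption for illustration).
depth16 = np.arange(65536, dtype=np.uint16)

# Normalize to float32 in [0, 1], as suggested for the EXR format.
depth_f32 = depth16.astype(np.float32) / 65535

# Convert back; rounding (not truncation) is needed to undo the floating-point error.
restored = np.round(depth_f32 * 65535).astype(np.uint16)

print(depth_f32.dtype, depth_f32.min(), depth_f32.max())
print(np.array_equal(restored, depth16))
```

Note that truncating with a plain `astype(np.uint16)` instead of rounding first may shift some values down by one.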

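Notes:

I realize that Open3D supports RGBD images, but it does not appear to support reading and writing RGB and depth to a single file.

The following code samples use OpenCV instead of Pillow.
I expected OpenCV to support the EXR file format, but my OpenCV Python build has the EXR codec disabled, so I used the ImageIO package instead.

Stages for converting RGB and depth and writing them to an EXR file:

Load the RGB image, resize it, and convert it to float: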

 img = cv2.imread('RGB_image.jpg')  # The channels order is BGR due to OpenCV conventions.
 img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)
 img = img.astype(np.float32) / 255  # Convert to float in range [0, 1]

Load the depth image, resize it, and convert it to float:

 depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED)  # Assume depth_image.png is 16 bits grayscale.
 depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)
 depth = depth.astype(np.float32) / 65535  # Convert to float in range [0, 1]

Merge `img` (3 channels) and `depth` (1 channel) into 4 channels. The shape will be `(1216, 64, 4)` (following the OpenCV BGRA channel convention):
```
bgrd = np.dstack((img, depth))
```
Write `bgrd` to an EXR file. If OpenCV was built with OpenEXR, we can use `cv2.imwrite('rgbd.exr', bgrd)`.
If we use ImageIO, it is better to convert from BGRA to RGBA before saving:
```
rgbd = cv2.cvtColor(bgrd, cv2.COLOR_BGRA2RGBA)
imageio.imwrite('rgbd.exr', rgbd)
```
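The stacking and channel reordering can be checked without any image files. The sketch below uses random arrays as stand-ins for `img` and `depth` (hypothetical data), and reorders BGRA to RGBA with NumPy indexing, which should give the same channel order as `cv2.COLOR_BGRA2RGBA`.

```python
import numpy as np

height, width = 1216, 64  # Matches the shape reported above.
rng = np.random.default_rng(0)

img = rng.random((height, width, 3), dtype=np.float32)  # Stand-in for the BGR image in [0, 1].
depth = rng.random((height, width), dtype=np.float32)   # Stand-in for the depth map in [0, 1].

# np.dstack promotes the 2-D depth map to (H, W, 1) and appends it as the 4th channel.
bgrd = np.dstack((img, depth))

# Swap the B and R channels; G and the depth (alpha) channel stay in place.
rgbd = bgrd[:, :, [2, 1, 0, 3]]

print(bgrd.shape, bgrd.dtype)
```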

Code sample (converts RGB and depth to an RGBA EXR file, then reads the file and converts it back):

import numpy as np
import cv2
import imageio

scale = (64, 1216)
 
# load image and resize
img = cv2.imread('RGB_image.jpg')  # The channels order is BGR due to OpenCV conventions.
img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)
img = img.astype(np.float32) / 255  # Convert to float in range [0, 1]
 
# load depth and resize
depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED)  # Assume depth_image.png is 16 bits grayscale.
depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)

if depth.ndim == 3:
    depth = depth[:, :, 0]  # Keep only the first channel if the depth image was loaded with 3 channels.
 
depth = depth.astype(np.float32) / 65535  # Convert to float in range [0, 1]

# Use the depth channel as alpha channel (the channel order is BGRA - applies OpenCV conventions).
bgrd = np.dstack((img, depth))

print("\n\nRGBD shape")
print(bgrd.shape)

# Save the data to exr file (the color format of the exr file is RGBA).
# Error: cv::initOpenEXR imgcodecs: OpenEXR codec is disabled.
#cv2.imwrite('rgbd.exr', bgrd)

# https://stackoverflow.com/questions/45482307/save-float-array-to-image-with-exr-format
rgbd = cv2.cvtColor(bgrd, cv2.COLOR_BGRA2RGBA)
imageio.imwrite('rgbd.exr', rgbd)

################################################################################
# Reading the data:

#bgrd = cv2.imread('rgbd.exr', cv2.IMREAD_UNCHANGED)  # Error: cv::initOpenEXR imgcodecs: OpenEXR codec is disabled.
rgbd = imageio.imread('rgbd.exr')

img = rgbd[:, :, 0:3]  # First 3 channels are the image (RGB order, as written by ImageIO).
depth = rgbd[:, :, 3]  # Last channel is the depth

img = np.round(img*255).astype(np.uint8)  # Convert back to uint8
#depth = np.round(depth*65535).astype(np.uint16)  # Convert back to uint16 (if required).

# Show images for testing:
cv2.imshow('img', cv2.cvtColor(img, cv2.COLOR_RGB2BGR))  # cv2.imshow expects BGR channel order.
cv2.imshow('depth', depth)
cv2.waitKey()
cv2.destroyAllWindows()

Notes:

You may need to make some modifications. I am not sure about the dimensions (`64x1216` or `1216x64`), and I am not sure about the line `depth = depth[:, :, np.newaxis]`.
I may also be wrong about the format of `depth_image.png`.
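On the dimension question: the tensor in the question has shape [4, 64, 1216] (channels first), while image writers such as `cv2.imwrite` expect height x width x channels. Below is a minimal sketch of the conversion, using a NumPy array as a stand-in for the tensor (with PyTorch, the array would presumably come from `rgbd.numpy()`).

```python
import numpy as np

# Stand-in for the question's tensor: 4 channels (R, G, B, depth), 64 rows, 1216 columns.
rgbd_chw = np.random.default_rng(0).random((4, 64, 1216), dtype=np.float32)

# Move the channel axis to the end: (C, H, W) -> (H, W, C).
rgbd_hwc = np.transpose(rgbd_chw, (1, 2, 0))

# Scale [0, 1] floats to 16-bit integers so the result fits a 16-bit RGBA PNG.
rgbd_u16 = np.round(rgbd_hwc * 65535).astype(np.uint16)

# OpenCV expects BGRA channel order, so swap R and B before calling cv2.imwrite.
bgrd_u16 = np.ascontiguousarray(rgbd_u16[:, :, [2, 1, 0, 3]])

print(bgrd_u16.shape, bgrd_u16.dtype)
```

Note also that `cv2.resize` takes its size argument as (width, height), so `scale = (64, 1216)` produces an image with 1216 rows and 64 columns. To get 64 rows and 1216 columns, as in the question, pass `(1216, 64)`.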

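Update:

Saving 16-bit RGBA to a PNG file:

Instead of using an EXR file with float32 pixel format, we can use a PNG file with uint16 pixel format.

The pixel format of the PNG file will be RGBA (RGB plus alpha, the transparency channel).
Each color channel will be 16 bits (2 bytes).
The alpha channel stores the depth map (in uint16 format).

Convert `img` to `uint16` (we may choose not to scale by 256):
```
img = img.astype(np.uint16)*256
```
Merge `img` (3 channels) and `depth` (1 channel) into 4 channels:
```
bgrd = np.dstack((img, depth))
```
Save the merged image as a PNG file:
```
cv2.imwrite('rgbd.png', bgrd)
```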
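The 8-bit to 16-bit scaling is reversible with an integer division, as the following sketch with synthetic pixel values shows. The cast to uint16 has to come before the multiplication, because the scaled values do not fit into uint8.

```python
import numpy as np

# Synthetic 8-bit pixel values (hypothetical data): every value from 0 to 255.
img8 = np.arange(256, dtype=np.uint8).reshape(16, 16)

# Widen first, then scale: 255 * 256 = 65280 still fits in uint16.
img16 = img8.astype(np.uint16) * 256

# Recover the original 8-bit values.
restored = (img16 // 256).astype(np.uint8)

print(img16.dtype, img16.max())
print(np.array_equal(restored, img8))
```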

Code sample (the second part reads the file back and displays it for testing):

import numpy as np
import cv2

scale = (64, 1216)

# load image and resize
img = cv2.imread('RGB_image.jpg')  # The channels order is BGR due to OpenCV conventions.
img = cv2.resize(img, scale, interpolation=cv2.INTER_LINEAR)

# Convert the image from 8 bits per color channel to 16 bits per color channel
# Notes:
# 1. We may choose not to scale by 256; the scaling is only for viewers that expect the [0, 65535] range.
# 2. Most image viewers interpret the alpha channel as transparency, so the image is going to look strange.
img = img.astype(np.uint16)*256

# load depth and resize
depth = cv2.imread('depth_image.png', cv2.IMREAD_UNCHANGED)  # Assume depth_image.png is 16 bits grayscale.
depth = cv2.resize(depth, scale, interpolation=cv2.INTER_NEAREST)

if depth.ndim == 3:
    depth = depth[:, :, 0]  # Keep only the first channel if the depth image was loaded with 3 channels.

if depth.dtype != np.uint16:
    depth = depth.astype(np.uint16)  # The depth is supposed to be uint16, so the code should not reach here.

# Use the depth channel as alpha channel (the channel order is BGRA - applies OpenCV conventions).
bgrd = np.dstack((img, depth))

print("\n\nRGBD shape")
print(bgrd.shape)  # (1216, 64, 4)

# Save the data to PNG file (the pixel format of the PNG file is 16 bits RGBA).
cv2.imwrite('rgbd.png', bgrd)


# Testing:
################################################################################
# Reading the data:
bgrd = cv2.imread('rgbd.png', cv2.IMREAD_UNCHANGED)

img = bgrd[:, :, 0:3]  # First 3 channels are the image.
depth = bgrd[:, :, 3]  # Last channel is the depth

#img = (img // 256).astype(np.uint8)  # Convert back to uint8

# Show images for testing:
cv2.imshow('img', img)
cv2.imshow('depth', depth)
cv2.waitKey()
cv2.destroyAllWindows()
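
If the saved PNG looks black or empty in a viewer, one possible reason (an assumption, not verified against any particular viewer) is that the depth values are being interpreted as transparency, or that the raw 16-bit values are small relative to 65535. The sketch below shows one way to build 8-bit previews from the 4-channel array; `make_previews` is a hypothetical helper and the input is synthetic.

```python
import numpy as np

def make_previews(bgrd16):
    """Split a 16-bit 4-channel array into an 8-bit color preview and an 8-bit depth preview."""
    color8 = (bgrd16[:, :, 0:3] // 256).astype(np.uint8)  # Undo the *256 scaling.
    depth = bgrd16[:, :, 3].astype(np.float32)
    span = depth.max() - depth.min()
    if span == 0:
        depth8 = np.zeros(depth.shape, dtype=np.uint8)  # Constant depth: nothing to stretch.
    else:
        # Stretch the depth range to [0, 255] so small raw values become visible.
        depth8 = np.round((depth - depth.min()) / span * 255).astype(np.uint8)
    return color8, depth8

# Synthetic example: mid-gray color, depth values far below 65535.
bgrd16 = np.zeros((4, 6, 4), dtype=np.uint16)
bgrd16[:, :, 0:3] = 128 * 256
bgrd16[:, :, 3] = np.arange(24, dtype=np.uint16).reshape(4, 6) * 10

color8, depth8 = make_previews(bgrd16)
print(color8[0, 0], depth8.min(), depth8.max())
```

The previews are for inspection only; the 16-bit file remains the data to keep, since the stretched depth no longer holds the original values.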

Answer 2

I followed this, but the image turns black, and loading it into Global Mapper shows nothing, although it still reports 4 bands.
