Unsupervised autoencoder output dimensions - number of batches for datasets of different sizes

Problem description

I have 2 datasets with different shapes.

HS = np.random.rand(128, 128, 172)
MS = np.random.rand(512, 512, 9)

I want to generate an image with the shape (512, 512, 172).

I do not have this target image, so it is an unsupervised model.

I am generating patches and using a Keras generator to create the batches.

Full code:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, Conv2D, Concatenate, MaxPooling2D, UpSampling2D, \
     Conv2DTranspose, Dropout
from tensorflow.keras.models import Model
from patchify import patchify, unpatchify

class DataGenerator(tf.keras.utils.Sequence):
    def __init__(self,
                 hs,
                 ms,
                 batch_size,
                 shuffle):
        self.hs = hs
        self.ms = ms
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.indices_ms = self.on_epoch_end_ms()
        self.indices_hs = self.on_epoch_end_hs()
        self.min_length = min(len(self.ms), len(self.hs))

    def on_epoch_end_ms(self):
        indices_ms = np.arange(len(self.ms))
        if self.shuffle:
            np.random.shuffle(indices_ms)
        return indices_ms

    def on_epoch_end_hs(self):
        indices_hs = np.arange(len(self.hs))
        if self.shuffle:
            np.random.shuffle(indices_hs)
        return indices_hs
     
    def __len__(self):
        return int(np.ceil(self.min_length / self.batch_size))
            
    def __getitem__(self, i):
        
        start_ms = i * self.batch_size
        end_ms = min((i + 1) * self.batch_size, self.min_length)
        indices_batch_ms = self.indices_ms[start_ms:end_ms]
        
        start_hs = i * self.batch_size
        end_hs = min((i + 1) * self.batch_size,  self.min_length)
        indices_batch_hs = self.indices_hs[start_hs:end_hs]
        
        batch_x1 = np.asarray([self.hs[k] for k in indices_batch_hs])
        batch_x2 = np.asarray([self.ms[k] for k in indices_batch_ms])
       
        return [batch_x1, batch_x2] , [batch_x1, batch_x2]
    
STRIDE = 16
PATCH_SIZE = 32
BATCH_SIZE = 16

HS = np.random.rand(128, 128, 172)
MS = np.random.rand(512, 512, 9)

patches_MS = patchify(MS, (PATCH_SIZE, PATCH_SIZE, 9), step=STRIDE) 
patches_HS = patchify(HS, (PATCH_SIZE, PATCH_SIZE, 172), step=STRIDE) 

patches_MS = patches_MS.reshape(-1, PATCH_SIZE, PATCH_SIZE, MS.shape[2])
patches_HS = patches_HS.reshape(-1, PATCH_SIZE, PATCH_SIZE, HS.shape[2])

# split data
idx = int(0.8 * len(patches_HS))
indices = np.arange(len(patches_HS))

train_idx = indices[:idx]
val_idx = indices[idx:]


train_gen = DataGenerator(patches_HS[train_idx],
                          patches_MS[train_idx],
                          BATCH_SIZE,
                          shuffle=True)

val_gen = DataGenerator(patches_HS[val_idx],
                        patches_MS[val_idx],
                        BATCH_SIZE,
                        shuffle=False)



# Define input shapes
input_shape1 = (PATCH_SIZE, PATCH_SIZE, 172)
input_shape2 = (PATCH_SIZE, PATCH_SIZE, 9)

# Define input layers
input1 = Input(shape=input_shape1, name='input1')
input2 = Input(shape=input_shape2, name='input2')

#Down-sampling path for input1
x1 = Conv2D(16, (3, 3), activation='relu', padding='same')(input1)
x1 = MaxPooling2D((2, 2), padding='same')(x1)
x1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x1)
x1 = Dropout(0.2)(x1)
encoded1 = MaxPooling2D((2, 2), padding='same')(x1)

# Down-sampling path for input2
x2 = Conv2D(16, (3, 3), activation='relu', padding='same')(input2)
#x2 = MaxPooling2D((2, 2), padding='same')(x2)
x2 = Conv2D(16, (3, 3), activation='relu', padding='same')(x2)
encoded2 = MaxPooling2D((2, 2), padding='same')(x2)

# Up-sample tensor from the first path to match the shape of the tensor from the second path
x1_upsampled = UpSampling2D(size=(2, 2))(encoded1)

# Concatenate the up-sampled tensor with the tensor from the second path
concatenated = Concatenate()([x1_upsampled, encoded2])

# Up-sampling path
x = Conv2D(16, (3, 3), activation='relu', padding='same')(concatenated)
x = UpSampling2D((2, 2))(x)
x = Dropout(0.2)(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
decoded = Conv2D(172, (3, 3), activation='sigmoid', padding='same')(x)

# Define the model
autoencoder = Model(inputs=[input1, input2], outputs=decoded)
# Compile the model
autoencoder.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4), loss='binary_crossentropy', metrics=['mae'])

autoencoder.fit(train_gen, 
                epochs=16, 
                validation_data=val_gen,
                batch_size=BATCH_SIZE)
                

# predictions
pred_gen = DataGenerator(patches_HS,
                         patches_MS,
                         BATCH_SIZE,
                         shuffle=True)

preds = autoencoder.predict(pred_gen)

# Use patchify just for getting the shape 
dummy_img = np.zeros((32, 32, 172))
patches = patchify(dummy_img, (PATCH_SIZE, PATCH_SIZE, 172), step=STRIDE)

for i in range(patches.shape[0]):
    for j in range(patches.shape[1]):
        patches[i, j, 0, :, :, :] = preds[i]

reconstructed_image = unpatchify(patches, dummy_img.shape)                
plt.imshow(reconstructed_image[:, :, 0])

The model currently has (None, 32, 32, 172) as its output layer. I want to create an image of shape (512, 512, 172).

(For prediction I simply reuse the same patches.)

The line dummy_img = np.zeros((32, 32, 172)) should be changed to dummy_img = np.zeros((512, 512, 172)) to be able to recreate the full-size image.
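
As a quick sanity check (separate from the pipeline above), this sketch shows the patch grid that patchify would produce for a (512, 512, 172) canvas with the same PATCH_SIZE and STRIDE, i.e. how many patch positions would then have to be filled with predictions:

import numpy as np
from patchify import patchify

PATCH_SIZE, STRIDE = 32, 16
dummy_img = np.zeros((512, 512, 172))
patches = patchify(dummy_img, (PATCH_SIZE, PATCH_SIZE, 172), step=STRIDE)
# (31, 31, 1, 32, 32, 172): 31 * 31 = 961 patch positions to fill
print(patches.shape)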

Update

I think the problem is that, because I have 2 datasets of different sizes, I take the length of the smaller one to compute the number of batches:

def __len__(self):
    return int(np.ceil(self.min_length / self.batch_size))

Therefore the predictions (preds) have the shape (225, 16, 16, 172). But to be able to write the predictions into a (512, 512, 172) image, they would need to have the shape (1024, 16, 16, 172), because 512 / 16 = 32 and 32 * 32 = 1024, where 16 = PATCH_SIZE.
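
This count can be checked directly (a quick sanity check, assuming square images, square patches and no padding):

image_size = 512
patch_size = 16          # PATCH_SIZE in this configuration
stride = 16
patches_per_axis = (image_size - patch_size) // stride + 1   # 32
total_patches = patches_per_axis ** 2                        # 1024
print(total_patches)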

tensorflow machine-learning deep-learning conv-neural-network tensorflow2.0
1 Answer

To generate an image of size (512, 512, 172) from your datasets with an unsupervised autoencoder, you should indeed change dummy_img = np.zeros((32, 32, 172)) to dummy_img = np.zeros((512, 512, 172)) so that the reconstruction works on the correct canvas. This adjustment lets the reconstructed image match your target size. You also need to make sure that the predicted patches are distributed correctly over the larger image canvas, respecting the total number of patches and their positions, to avoid overlaps or gaps. This means adapting the iteration in the unpatchify step so that every patch is placed back at its proper location in the larger image frame.
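
As a rough sketch of that placement step (not your exact pipeline): it assumes the predicted patches tile the 512x512 canvas without overlap, i.e. they were generated with step equal to PATCH_SIZE, and that preds holds one (PATCH_SIZE, PATCH_SIZE, 172) patch per grid cell in row-major order. With overlapping patches you would additionally have to average the contributions to each pixel.

import numpy as np

PATCH_SIZE = 32
TARGET_H, TARGET_W, BANDS = 512, 512, 172

n_rows = TARGET_H // PATCH_SIZE            # 16 patch rows
n_cols = TARGET_W // PATCH_SIZE            # 16 patch columns
n_patches = n_rows * n_cols                # 256 patches in total

# placeholder predictions; in practice this would be the output of autoencoder.predict
preds = np.random.rand(n_patches, PATCH_SIZE, PATCH_SIZE, BANDS)

reconstructed = np.zeros((TARGET_H, TARGET_W, BANDS), dtype=preds.dtype)
for idx in range(n_patches):
    i, j = divmod(idx, n_cols)             # grid row/column for this patch
    reconstructed[i * PATCH_SIZE:(i + 1) * PATCH_SIZE,
                  j * PATCH_SIZE:(j + 1) * PATCH_SIZE, :] = preds[idx]

print(reconstructed.shape)                 # (512, 512, 172)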
