How to create our own image dataset for a CNN


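I have two separate folders, one containing my images and the other containing random images. How do I split them into training and test sets for a CNN model?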

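I tried to set them up as training data and test data, but I don't know how to do it. I need to know the process for building our own dataset from scratch.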

deep-learning conv-neural-network training-data image-classification
1 Answer

Given that you are only asking how to split your images into training and test sets, here is one way to do it:


import os
import shutil
import random

# Define paths for source and destination folders
source_folder = 'path/to/source_folder'  # Replace with your source folder path
train_folder = 'path/to/train_folder'    # Replace with your train folder path
test_folder = 'path/to/test_folder'      # Replace with your test folder path

# Define the ratio for splitting (e.g., 80% train, 20% test)
train_ratio = 0.8

# Create destination folders if they don't exist
os.makedirs(train_folder, exist_ok=True)
os.makedirs(test_folder, exist_ok=True)

# List all image files in the source folder
image_files = [file for file in os.listdir(source_folder) if file.lower().endswith(('.jpeg', '.png', '.jpg'))]

# Shuffle the list to randomize the selection
random.shuffle(image_files)

# Calculate the number of files for training
train_count = int(len(image_files) * train_ratio)

# Split files into train and test sets
train_images = image_files[:train_count]
test_images = image_files[train_count:]

# Copy images to the respective folders (the originals stay in source_folder)
for image in train_images:
    src_path = os.path.join(source_folder, image)
    dst_path = os.path.join(train_folder, image)
    shutil.copyfile(src_path, dst_path)

for image in test_images:
    src_path = os.path.join(source_folder, image)
    dst_path = os.path.join(test_folder, image)
    shutil.copyfile(src_path, dst_path)
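
Since the question mentions two folders (one per class), the same split can be applied to each folder separately so that both classes appear in train and test. The following is a minimal sketch under some assumptions that are not in the original answer: the function name `split_class_folder`, the class names `me` and `other`, the `train/<class>/` and `test/<class>/` output layout, and the fixed seed are all illustrative choices.

```python
import random
import shutil
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def split_class_folder(source_dir, output_root, class_name, train_ratio=0.8, seed=42):
    """Copy images from source_dir into output_root/train/<class_name>
    and output_root/test/<class_name>. Returns (train_count, test_count)."""
    source_dir = Path(source_dir)
    # Sort before shuffling so a given seed yields the same split on every run
    images = sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    random.Random(seed).shuffle(images)

    train_count = int(len(images) * train_ratio)
    splits = {"train": images[:train_count], "test": images[train_count:]}

    for split_name, files in splits.items():
        target_dir = Path(output_root) / split_name / class_name
        target_dir.mkdir(parents=True, exist_ok=True)
        for src in files:
            shutil.copy2(src, target_dir / src.name)

    return len(splits["train"]), len(splits["test"])


# Example usage with hypothetical paths (replace with your own folders):
# split_class_folder("path/to/my_images", "path/to/dataset", "me")
# split_class_folder("path/to/random_images", "path/to/dataset", "other")
```

Splitting each class on its own keeps roughly the same train/test ratio per class, which a single shuffle over a mixed folder would not guarantee.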

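P.S. You still need to label the data for your different classes, using whatever annotation format your CNN model expects. Also, this only splits your data into train and test directories; it does not read the images to feed them to your model.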
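
For plain image classification, one common labeling convention is to let the directory name act as the label, with one subfolder per class. The sketch below assumes such a layout (for example `train/me/` and `train/other/`, which are hypothetical names) and only collects file paths with integer labels; it does not decode the images. The helper name `list_labeled_images` is illustrative. Loaders such as `torchvision.datasets.ImageFolder` and `tf.keras.utils.image_dataset_from_directory` follow a similar one-folder-per-class convention and also handle reading the images.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_labeled_images(split_dir):
    """Return (class_names, samples), where samples is a list of
    (image_path, label_index) pairs and label_index points into class_names."""
    split_dir = Path(split_dir)
    # Each subdirectory is treated as one class; sorting keeps label indices stable
    class_names = sorted(d.name for d in split_dir.iterdir() if d.is_dir())
    samples = []
    for label_index, class_name in enumerate(class_names):
        for path in sorted((split_dir / class_name).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label_index))
    return class_names, samples


# Example usage with a hypothetical path:
# class_names, train_samples = list_labeled_images("path/to/dataset/train")
```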
