PersonPath22 dataset: PathTrack annotation format


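I am currently working with the Amazon PersonPath22 dataset. In

dataset/personpath22/raw_data/pathtrack/pathtrack_release/train

there are a number of folders containing images and their respective annotations. The annotations are located (relative to the path above) under ./SOME_VIDEO_NAME/det, in a file named det_rcnn.txt. I am confused by the annotation data format.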

The GitHub repository says they use the gluon-cv annotation format, but I believe that applies to a different image folder.

A line describing one object looks like

1.0,-1.0,1053.8172607421875,198.0821990966797,106.62109375,275.42665100097656,0.9953873753547668,-1.0,-1.0,-1.0

My assumption is

<frame> <idk> <bbox min x> <bbox min y> <bbox width> <bbox height> <confidence> <X> <Y> <Z>

However, when I use these fields to convert the annotations to COCO JSON, the bounding boxes do not line up with the images.
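As a sanity check on the assumed layout, the sample line can be parsed by hand and turned into a COCO-style record. The sketch below assumes the file follows the MOTChallenge detection layout (frame, id, left, top, width, height, confidence, x, y, z, with -1 marking unused fields); that is inferred from the sample line only and is not confirmed by the dataset documentation. The names `DET_FIELDS`, `parse_det_line`, and `to_coco_annotation` are hypothetical helpers, and using the frame number as `image_id` is likewise an assumption.

```python
# Assumed field order (MOTChallenge-style detection line); not confirmed for PersonPath22.
DET_FIELDS = ["frame", "id", "x", "y", "w", "h", "confidence", "X", "Y", "Z"]


def parse_det_line(line):
    """Split one comma-separated detection line into a dict of floats."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != len(DET_FIELDS):
        raise ValueError(f"expected {len(DET_FIELDS)} fields, got {len(values)}")
    return dict(zip(DET_FIELDS, values))


def to_coco_annotation(det, annotation_id, image_id, category_id=1):
    """Build a COCO-style annotation; COCO boxes are [x_min, y_min, width, height]."""
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [det["x"], det["y"], det["w"], det["h"]],
        "area": det["w"] * det["h"],
        "iscrowd": 0,
        # "score" belongs to the COCO results format; kept here only for inspection.
        "score": det["confidence"],
    }


sample = ("1.0,-1.0,1053.8172607421875,198.0821990966797,106.62109375,"
          "275.42665100097656,0.9953873753547668,-1.0,-1.0,-1.0")
det = parse_det_line(sample)
ann = to_coco_annotation(det, annotation_id=1, image_id=int(det["frame"]))
print(ann["bbox"])
```

If the boxes still look misplaced after a conversion like this, it may be worth checking whether each `image_id` really points at the image for that frame number, and whether the images were resized relative to the resolution the detections were computed on.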

neural-network annotations dataset
1 Answer

Below is a simple script for parsing and visualizing a PersonPath22 video together with its annotations:

import cv2
import pandas as pd

# File paths
annotation_file = '/home/juma/Downloads/person_path_22_data/person_path_22/person_path_22-test/uid_vid_00008.mp4/gt/gt.txt'
video_file = '/media/juma/data/mot-data/tracking-dataset/dataset/personpath22/raw_data/uid_vid_00008.mp4'

# Read annotations
annotations = pd.read_csv(annotation_file, header=None)
annotations.columns = ['frame', 'id', 'x', 'y', 'w',
                       'h', 'confidence', 'class', 'visibility', 'misc']
print("annotations", annotations['class'].value_counts())
# Open video
cap = cv2.VideoCapture(video_file)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
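# Use the source frame rate when available; fall back to 20 FPS otherwise
fps = cap.get(cv2.CAP_PROP_FPS) or 20.0
out = cv2.VideoWriter('output.avi', fourcc, fps,
                      (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                       int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))))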

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

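    # After read(), POS_FRAMES is the index of the next frame, i.e. the
    # 1-based number of the frame just read (assumes gt.txt frames start at 1)
    current_frame = int(cap.get(cv2.CAP_PROP_POS_FRAMES))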

    # Get annotations for the current frame
    frame_annotations = annotations[annotations['frame'] == current_frame]

    # Draw bounding boxes
    for _, row in frame_annotations.iterrows():
        x, y, w, h = int(row['x']), int(row['y']), int(row['w']), int(row['h'])
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Label each box with its class value and track id
        cv2.putText(frame, f"class:{row['class']} id:{int(row['id'])}", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame
    out.write(frame)

    # Optional: Display the frame
    cv2.imshow('Frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release everything
cap.release()
out.release()
cv2.destroyAllWindows()
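The script above filters the whole DataFrame once per frame, which can get slow on long videos. A possible alternative, sketched below, is to group the annotations by frame once and look them up from a dictionary. `boxes_by_frame` is a hypothetical helper, the column names are the ones used in the script, and the three demo rows are made up purely for illustration.

```python
import io

import pandas as pd

COLUMNS = ['frame', 'id', 'x', 'y', 'w', 'h',
           'confidence', 'class', 'visibility', 'misc']


def boxes_by_frame(annotations):
    """Map frame number -> list of (x, y, w, h, id) integer tuples."""
    lookup = {}
    for frame, group in annotations.groupby('frame'):
        rows = group[['x', 'y', 'w', 'h', 'id']].itertuples(index=False, name=None)
        lookup[int(frame)] = [tuple(int(v) for v in row) for row in rows]
    return lookup


# Made-up demo rows in the same column order as the script above
demo_csv = io.StringIO(
    "1,7,10,20,30,40,1,1,1.0,-1\n"
    "1,8,50,60,70,80,1,1,1.0,-1\n"
    "2,7,12,22,30,40,1,1,1.0,-1\n"
)
demo = pd.read_csv(demo_csv, header=None, names=COLUMNS)
lookup = boxes_by_frame(demo)
print(lookup.get(1, []))
```

Inside the video loop, `lookup.get(current_frame, [])` would then replace the per-frame DataFrame filter, returning an empty list for frames without annotations.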