How do I apply an object detector to every frame of a given video?


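I posted this code on the site before, but learned that I can't post all of it. So I will only post the important parts.

What I want to do is take an object detector (built for still images) and apply it to every frame of a given video.

The only problem is that I don't know how to go about it. Once detection has run on the first frame, what do I do with the result? Do I need to store it somewhere? How do I process the remaining frames? And once all the frames are processed, how do I reassemble them into a video, i.e. the output video?

Here is the code: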

import numpy as np
import cv2
from numpy import expand_dims
from keras.models import load_model
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from matplotlib import pyplot
from matplotlib.patches import Rectangle

model = load_model('model.h5')

# define the expected input shape for the model
input_w, input_h = 416, 416

# define the anchors
anchors = [[116,90, 156,198, 373,326], [30,61, 62,45, 59,119], [10,13, 16,30, 33,23]]

# define the labels
labels = ["person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck",
    "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench",
    "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
    "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard",
    "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
    "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana",
    "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake",
    "chair", "sofa", "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse",
    "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator",
    "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"]

# the helper functions used below (load_image_pixels, decode_netout,
# correct_yolo_boxes, do_nms, get_boxes, draw_boxes) are not shown in this post

vs = cv2.VideoCapture('video.mp4')

class_threshold = 0.6
# frame size, filled in from the first frame
W, H = None, None

while True:
    (grabbed, frame) = vs.read()

    # stop at the end of the video
    if not grabbed:
        break

    if W is None or H is None:
        (H, W) = frame.shape[:2]

    image, image_w, image_h = load_image_pixels(frame, (input_w, input_h))
    yhat = model.predict(image)

    # start from an empty list on every frame so boxes do not accumulate
    boxes = list()
    for i in range(len(yhat)):
        # decode the output of the network
        boxes += decode_netout(yhat[i][0], anchors[i], class_threshold, input_h, input_w)
    # correct the sizes of the bounding boxes for the shape of the image
    correct_yolo_boxes(boxes, image_h, image_w, input_h, input_w)
    # suppress non-maximal boxes
    do_nms(boxes, 0.5)

    # get the details of the detected objects
    v_boxes, v_labels, v_scores = get_boxes(boxes, labels, class_threshold)

    # draw what we found
    draw_boxes(frame, v_boxes, v_labels, v_scores)

# release the video file when done
vs.release()

opencv keras deep-learning computer-vision object-detection
1 Answer

You can use OpenCV's VideoWriter to write the processed frames back out as a video.

Some example code showing how to use it:

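fourcc = cv2.VideoWriter_fourcc(*'XVID')
# the size is (width, height) and must match the frames passed to write();
# here it is read from the capture object (vs in the question)
frame_w = int(vs.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(vs.get(cv2.CAP_PROP_FRAME_HEIGHT))
video_writer = cv2.VideoWriter('test.avi', fourcc, 30, (frame_w, frame_h))
...
while True:
    ....
    # write the annotated frame (after the boxes have been drawn on it)
    video_writer.write(frame)
    ....
....
video_writer.release()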
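
To address the "do I need to store the frames?" part of the question: usually not. Each frame can be annotated and written out immediately. Below is a minimal sketch of that read, annotate, write loop. The names `process_frames`, `process_video_file` and the `annotate` callback are hypothetical and not from the original post; `annotate` stands in for whatever runs the detector on a frame, draws the boxes and returns the frame. The 30 fps fallback and the default `XVID` codec are assumptions carried over from the snippet above.

```python
def process_frames(read, write, annotate):
    """Pull frames from read(), annotate each one, and push it to write().

    read     -- callable returning (grabbed, frame), like cv2.VideoCapture.read
    write    -- callable taking one frame, like cv2.VideoWriter.write
    annotate -- hypothetical callback: runs the detector on a frame, draws
                the boxes, and returns the annotated frame
    Returns the number of frames written.
    """
    count = 0
    while True:
        grabbed, frame = read()
        if not grabbed:          # end of the stream
            break
        write(annotate(frame))   # nothing is stored; the frame goes straight out
        count += 1
    return count


def process_video_file(src, dst, annotate, codec="XVID"):
    """Wire process_frames to OpenCV's VideoCapture and VideoWriter."""
    import cv2  # imported here so process_frames stays usable without OpenCV

    vs = cv2.VideoCapture(src)
    if not vs.isOpened():
        raise IOError("cannot open " + src)
    # The writer's frame size must match the frames passed to write().
    width = int(vs.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(vs.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = vs.get(cv2.CAP_PROP_FPS) or 30.0   # assumed fallback if fps is unknown
    writer = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*codec), fps, (width, height))
    try:
        return process_frames(vs.read, writer.write, annotate)
    finally:
        vs.release()
        writer.release()
```

With the question's code, `annotate` would wrap the body of the loop (predict, decode, NMS, draw) and return `frame`, and the call would look like `process_video_file('video.mp4', 'test.avi', annotate)`.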

For reference: OpenCV video saving in Python
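
One more thing to watch for: `vs.read()` returns each frame as a NumPy array in BGR channel order, not as a file name. If the question's `load_image_pixels` helper was written around `load_img` (which loads from a path), it would need a variant that accepts the array directly. Here is a rough sketch of such a variant. The function name `frame_to_model_input` is hypothetical, and it assumes the model expects RGB floats in [0, 1] with shape (1, 416, 416, 3), which the original post does not confirm. The resize uses plain NumPy nearest-neighbour sampling to keep the sketch dependency-light; `cv2.resize` would be the more usual choice.

```python
import numpy as np


def frame_to_model_input(frame, input_w=416, input_h=416):
    """Convert one OpenCV frame (H x W x 3, BGR, uint8) into a network input batch.

    Assumptions (not confirmed by the original post): the model wants RGB
    floats in [0, 1] with shape (1, input_h, input_w, 3).
    Returns (batch, original_width, original_height).
    """
    image_h, image_w = frame.shape[:2]
    # Nearest-neighbour resize: pick one source row/column per target pixel.
    rows = np.arange(input_h) * image_h // input_h
    cols = np.arange(input_w) * image_w // input_w
    resized = frame[rows[:, None], cols[None, :]]
    rgb = resized[..., ::-1]                    # BGR -> RGB
    scaled = rgb.astype("float32") / 255.0      # scale pixel values to [0, 1]
    return np.expand_dims(scaled, 0), image_w, image_h
```

The return order mirrors the `image, image_w, image_h = ...` unpacking in the question, so the original frame size is still available for `correct_yolo_boxes`.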
