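I am currently developing an object detection application that detects whether a tire is damaged. For this I used Google's AutoML Edge, which exports a TFLite model. Now I want to use this model in my code, but the coordinates it predicts are apparently normalized, so I have to de-normalize them.
Here is my code: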
    import tensorflow as tf
    import numpy as np
    import cv2

    MODEL_PATH = 'Resources/model_v1_OD.tflite'
    LABEL_PATH = 'Resources/model_v1_OD.txt'


    class TFTireModel():
        labels = []
        interpreter = None
        input_details = []
        output_details = []
        height = 0
        width = 0

        def __init__(self):
            with open(LABEL_PATH, 'r') as f:
                self.labels = [line.strip() for line in f.readlines()]
            # Init TFlite interpreter
            self.interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
            self.interpreter.allocate_tensors()
            # Get input and output tensors.
            self.input_details = self.interpreter.get_input_details()
            self.output_details = self.interpreter.get_output_details()
            # Get input dimensions
            self.height = self.input_details[0]['shape'][1]
            self.width = self.input_details[0]['shape'][2]

        def predict(self, img, threshold=0.3):
            # Resize image to input dimensions
            img = cv2.resize(img, (self.width, self.height))
            img = np.expand_dims(img, axis=0)
            img = (2.0 / 255.0) * img - 1.0
            img = img.astype('uint8')
            # Predict image
            self.interpreter.set_tensor(self.input_details[0]['index'], img)
            self.interpreter.invoke()
            # get results
            boxes = self.interpreter.get_tensor(
                self.output_details[0]['index'])
            print(f"boxes: {boxes}")
            classes = self.interpreter.get_tensor(
                self.output_details[1]['index'])
            scores = self.interpreter.get_tensor(
                self.output_details[2]['index'])
            num = self.interpreter.get_tensor(
                self.output_details[3]['index'])
            # Get output
            output = self._boxes_coordinates(boxes=np.squeeze(boxes[0]),
                                             classes=np.squeeze(classes[0]+1).astype(np.int32),
                                             scores=np.squeeze(scores[0]),
                                             im_width=self.width,
                                             im_height=self.height,
                                             min_score_thresh=threshold)
            print(f"output: {output}")
            # Format output
            return output

        def _boxes_coordinates(self,
                               boxes,
                               classes,
                               scores,
                               im_width,
                               im_height,
                               max_boxes_to_draw=4,
                               min_score_thresh=0.4):
            print(f"width: {im_width}, height {im_height}")
            if not max_boxes_to_draw:
                max_boxes_to_draw = boxes.shape[0]
            number_boxes = min(max_boxes_to_draw, boxes.shape[0])
            tire_boxes = []
            # person_labels = []
            for i in range(number_boxes):
                if scores is None or scores[i] > min_score_thresh:
                    box = tuple(boxes[i].tolist())
                    ymin, xmin, ymax, xmax = box
                    xmin, ymin, xmax, ymax = (int(xmin * im_width), int(xmax * im_width),
                                              int(ymin * im_height), int(ymax * im_height))
                    #TODO: DO A LOOP
                    #tire_boxes.append([(ymin, xmin, ymax, xmax), scores[i], self.labels[classes[i]]]) #More complete
                    tire_boxes.append((xmin, ymin, xmax, ymax))
            return tire_boxes
This is where the problem occurs:
    boxes = self.interpreter.get_tensor(
        self.output_details[0]['index'])
    print(f"boxes: {boxes}")

    boxes: [[[ 0.00263482  0.50020593  0.3734043   0.83953816]
      [ 0.12580797  0.14952084  0.65327024  0.61710536]
      [ 0.13584864  0.38896233  0.6485662   0.85324436]
      [ 0.31914377  0.3945622   0.87147605  0.8458656 ]
      [ 0.01334581  0.03666234  0.46443292  0.55461186]
      [ 0.1018104  -0.08279537  0.6541427   0.37984413]
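Because the output here is normalized, I don't know how to de-normalize it. The desired output is expressed as percentages of the width and height, as shown in the _boxes_coordinates function.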
The output format of a TFLite object detection model is:
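The question's own code unpacks each box as (ymin, xmin, ymax, xmax), and the printed values lie roughly in the 0 to 1 range. Assuming that layout holds for this model, de-normalizing is a matter of multiplying the x values by the image width and the y values by the image height. Below is a minimal sketch of that step, not a definitive implementation: the function name denormalize_boxes and its parameters are hypothetical, and it assumes the boxes should be scaled to the size of the original image (not the model's input size) and clipped to the image bounds, since the model can return slightly negative values.

```python
import numpy as np


def denormalize_boxes(boxes, scores, img_width, img_height, min_score=0.3):
    """Convert normalized boxes to pixel coordinates.

    Assumes each box is [ymin, xmin, ymax, xmax] with values around 0..1,
    as unpacked in the question. Returns (xmin, ymin, xmax, ymax) tuples
    in pixels for every box whose score is at least min_score.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)

    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        # Predictions can fall slightly outside the image (e.g. -0.08),
        # so clip them to [0, 1] before scaling.
        ymin, xmin, ymax, xmax = np.clip([ymin, xmin, ymax, xmax], 0.0, 1.0)
        # x values scale with the width, y values with the height.
        results.append((int(xmin * img_width), int(ymin * img_height),
                        int(xmax * img_width), int(ymax * img_height)))
    return results


# Two of the boxes printed in the question, with made-up scores and a
# made-up 640x480 original image (both are assumptions for illustration).
example_boxes = [[0.00263482, 0.50020593, 0.3734043, 0.83953816],
                 [0.1018104, -0.08279537, 0.6541427, 0.37984413]]
example_scores = [0.9, 0.8]
print(denormalize_boxes(example_boxes, example_scores, 640, 480))
```

Two things in the question's code differ from this sketch. First, its tuple assignment pairs xmax * im_width with ymin and ymin * im_height with xmax, so the values end up in the wrong slots. Second, predict rescales the image to [-1, 1] and then casts it to uint8, which collapses most pixel values; if the exported model expects uint8 input (worth checking via input_details[0]['dtype']), passing the resized image without that rescaling may be what is needed.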