I am new to Swift and TensorFlow Lite (though I have enough Python and ML experience). I built a TensorFlow model in Python for an image classification task and converted it to TensorFlow Lite with the converter.
When I call the model from Swift I do get a prediction back, but the problem is that no matter which image I pass in, I get exactly the same prediction. I would like to know what could be causing this. You can find my Swift code snippet below. The input images are already resized to 224 * 224, so I do not do any resizing here.
My guess is that the root cause of this behavior is the method "convertUIImageToData" below. Perhaps the Data produced from the UIImage is not suitable, but I am having a hard time figuring out why... (The TensorFlow Lite model works perfectly from Python code; I can include that later if needed.)
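Before the snippet itself: one quick check that might narrow this down is to print what the input tensor actually expects. This is only a sketch on my side, relying on the standard TensorFlowLiteSwift Tensor properties (shape, dataType, data); I have not wired it into the code below yet:

// Sketch: inspect the input tensor (after allocateTensors()) to see whether the model
// wants Float32 or UInt8 data and how many bytes it expects.
func printInputTensorInfo(interpreter: Interpreter) throws {
    let inputTensor = try interpreter.input(at: 0)
    print("Input shape:", inputTensor.shape.dimensions)    // e.g. [1, 224, 224, 3]
    print("Input data type:", inputTensor.dataType)        // .float32 for a float model, .uInt8 for a quantized one
    print("Expected byte count:", inputTensor.data.count)  // must match the Data passed to copy(_:toInputAt:)
}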
func predict() {
    // Path to the TF Lite model
    let path = "./Resources/sl_model.tflite"
    // Path to the image for classification. The result is the same no matter what the input image is...
    let image = loadImageFromPath(path: "./Resources/bell.png")
    var options = Interpreter.Options()
    options.threadCount = 1
    let interpreter = try? Interpreter(modelPath: path, options: options)
    try? interpreter!.allocateTensors()
    // Read the TF Lite model input dimensions
    let inputShape = try? interpreter!.input(at: 0).shape
    let inputImageWidth = inputShape!.dimensions[1]
    let inputImageHeight = inputShape!.dimensions[2]
    // Here I try to convert the UIImage to Data in order to feed the input Tensor.
    // Guess this method is not working right...
    let odata = convertUIImageToData(image: image!)
    // Copy the RGB data to the input `Tensor`.
    try? interpreter!.copy(odata!, toInputAt: 0)
    // Run inference
    try? interpreter!.invoke()
    // Get the output tensor
    let outputTensor = try? interpreter!.output(at: 0)
    let results = dataToFloat32Array(data: outputTensor!.data)
    print(results) // This prediction result is always the same no matter what the image is...
}
func loadImageFromPath(path: String) -> UIImage? {
    return UIImage(contentsOfFile: path)
}
func convertUIImageToData(image: UIImage) -> Data? {
    // Get the CGImage of the UIImage
    guard let cgImage = image.cgImage else {
        return nil
    }
    // Get the width and height of the image
    let width = cgImage.width
    let height = cgImage.height
    // Create a bitmap context using the RGB color space and no alpha channel
    let colorSpace = CGColorSpaceCreateDeviceRGB()
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 8
    var rawData = [UInt8](repeating: 0, count: width * height * bytesPerPixel)
    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue) else {
        return nil
    }
    // Draw the image in the context
    let rect = CGRect(x: 0, y: 0, width: width, height: height)
    context.draw(cgImage, in: rect)
    // Convert the raw pixel data to Data
    let data = Data(bytes: &rawData, count: rawData.count)
    return data
}
func dataToFloat32Array(data: Data) -> [Float32] {
    let count = data.count / MemoryLayout<Float32>.stride
    return data.withUnsafeBytes { (rawBufferPointer) -> [Float32] in
        let floatBufferPointer = rawBufferPointer.bindMemory(to: Float32.self)
        return Array(floatBufferPointer)
    }
}
// Result: always the same no matter what the input image is...
[3.545041e-08, 0.12145647, 0.008242114, 1.1497581e-09, 0.85731614, 0.012985148, 6.4414756e-08]
The purpose of dataToFloat32Array seems to be to take the four bytes of each ARGB pixel and reinterpret that memory as a single Float32 value. Is that really your intention?
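If it is not, and the model was exported as a float model, then the interpreter usually needs Float32 RGB values normalized to [0, 1] as input, rather than the raw RGBA bytes that convertUIImageToData produces. Below is only a minimal sketch of that conversion; it assumes a [1, 224, 224, 3] Float32 input with [0, 1] normalization (the helper name rgbaBytesToFloat32RGBData is made up for illustration, and the input tensor's dataType and the preprocessing used during training should be checked before relying on it):

// Sketch only: assumes the model expects Float32 RGB normalized to [0, 1].
// `rgbaData` is the RGBA byte buffer returned by convertUIImageToData.
func rgbaBytesToFloat32RGBData(rgbaData: Data) -> Data {
    var floats = [Float32]()
    floats.reserveCapacity((rgbaData.count / 4) * 3)
    // Walk the buffer four bytes (one pixel) at a time: keep R, G, B and skip the padding byte.
    for pixelStart in stride(from: 0, to: rgbaData.count, by: 4) {
        floats.append(Float32(rgbaData[pixelStart])     / 255.0) // R
        floats.append(Float32(rgbaData[pixelStart + 1]) / 255.0) // G
        floats.append(Float32(rgbaData[pixelStart + 2]) / 255.0) // B
    }
    // Reinterpret the Float32 array as raw bytes for Interpreter.copy(_:toInputAt:)
    return floats.withUnsafeBufferPointer { Data(buffer: $0) }
}

With something like that in place, the call in predict() would become try? interpreter!.copy(rgbaBytesToFloat32RGBData(rgbaData: odata!), toInputAt: 0). If the model is a quantized UInt8 model instead, raw bytes may be what it wants, but they would still need to be three channels per pixel, not four.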