I am trying to run my Metal program on an iPhone SE.
I have tried many values for the threadsPerThreadGroup and threadsPerGrid sizes, and all of them give me this error: TLValidateFeatureSupport:3539: failed assertion `Dispatch Threads with Non-Uniform Threadgroup Size is only supported on MTLGPUFamilyApple4 and later.'
Here is my code:
var threadsPerThreadGroup: MTLSize
var threadsPerGrid: MTLSize
computeCommandEncoder.setComputePipelineState(updateShader)
let w = updateShader.threadExecutionWidth
threadsPerThreadGroup = MTLSize(width: w, height: 1, depth: 1)
threadsPerGrid = MTLSize(width: Int(constants.bufferLength), height: 1, depth: 1)
if frames % 2 == 0 {
    computeCommandEncoder.setBuffer(buffer1, offset: 0, index: 0)
    computeCommandEncoder.setBuffer(buffer2, offset: 0, index: 1)
} else {
    computeCommandEncoder.setBuffer(buffer2, offset: 0, index: 0)
    computeCommandEncoder.setBuffer(buffer1, offset: 0, index: 1)
}
computeCommandEncoder.setBytes(&constants, length: MemoryLayout<MyConstants>.stride, index: 2)
computeCommandEncoder.dispatchThreads(threadsPerGrid, threadsPerThreadgroup: threadsPerThreadGroup)
frames += 1
I am using iOS 13.4 and Xcode 11.4.
threadExecutionWidth evaluates to 32, and constants.bufferLength is 512.
Use [dispatchThreads] only if the device supports non-uniform threadgroup sizes.
The wording is not very clear. What it means is that dispatchThreads does not work on GPUs older than the A11.
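If you want a solution that works on all devices, you have to calculate yourself how many threadgroups make up the grid, and use dispatchThreadgroups instead.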
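As a minimal sketch of that calculation, assuming a one-dimensional grid like the one in the question: the helper names `threadgroupCount` and `dispatchUniform` below are made up for illustration and are not part of Metal. The count is rounded up so that a length that is not a multiple of the threadgroup width is still fully covered.

```swift
#if canImport(Metal)
import Metal
#endif

/// Number of threadgroups needed to cover `totalThreads`, rounded up.
/// (Hypothetical helper, not a Metal API.)
func threadgroupCount(totalThreads: Int, threadsPerThreadgroup: Int) -> Int {
    precondition(threadsPerThreadgroup > 0, "threadgroup width must be positive")
    return (totalThreads + threadsPerThreadgroup - 1) / threadsPerThreadgroup
}

#if canImport(Metal)
/// Encodes a 1D dispatch using uniform threadgroups only.
/// (Hypothetical helper; assumes the pipeline state is already set on the encoder.)
func dispatchUniform(encoder: MTLComputeCommandEncoder,
                     pipeline: MTLComputePipelineState,
                     elementCount: Int) {
    let w = pipeline.threadExecutionWidth
    let threadsPerThreadgroup = MTLSize(width: w, height: 1, depth: 1)
    let groups = threadgroupCount(totalThreads: elementCount, threadsPerThreadgroup: w)
    let threadgroupsPerGrid = MTLSize(width: groups, height: 1, depth: 1)
    encoder.dispatchThreadgroups(threadgroupsPerGrid,
                                 threadsPerThreadgroup: threadsPerThreadgroup)
}
#endif
```

With the numbers from the question (512 elements, width 32) this gives exactly 16 threadgroups. When the length is not a multiple of the width, the last threadgroup contains extra threads, so the kernel would need to return early for any thread index at or beyond the buffer length.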
If you want to use both methods in your code, you can detect the device's feature set at runtime.
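One way this could look on iOS 13 or later is to ask the device whether it belongs to the GPU family named in the assertion message, using `supportsFamily(.apple4)`, and pick the dispatch call accordingly. This is only a sketch: `DispatchPlan`, `makeDispatchPlan`, and `encodeDispatch` are hypothetical names introduced here, and the check assumes an iOS-style Apple GPU (other platforms may need a different family check).

```swift
#if canImport(Metal)
import Metal
#endif

/// Which dispatch call to make, and with what 1D sizes.
/// (Hypothetical type for illustration, not a Metal API.)
enum DispatchPlan: Equatable {
    case threads(total: Int, perGroup: Int)
    case threadgroups(count: Int, perGroup: Int)
}

/// Chooses between the non-uniform path and the rounded-up uniform path.
func makeDispatchPlan(elementCount: Int, width: Int,
                      supportsNonUniform: Bool) -> DispatchPlan {
    if supportsNonUniform {
        return .threads(total: elementCount, perGroup: width)
    }
    // Round up so every element is covered by some threadgroup.
    return .threadgroups(count: (elementCount + width - 1) / width, perGroup: width)
}

#if canImport(Metal)
/// Assumes the pipeline state and buffers are already set on the encoder.
@available(iOS 13.0, macOS 10.15, tvOS 13.0, *)
func encodeDispatch(on encoder: MTLComputeCommandEncoder,
                    device: MTLDevice,
                    pipeline: MTLComputePipelineState,
                    elementCount: Int) {
    let plan = makeDispatchPlan(elementCount: elementCount,
                                width: pipeline.threadExecutionWidth,
                                supportsNonUniform: device.supportsFamily(.apple4))
    switch plan {
    case let .threads(total, perGroup):
        encoder.dispatchThreads(MTLSize(width: total, height: 1, depth: 1),
                                threadsPerThreadgroup: MTLSize(width: perGroup, height: 1, depth: 1))
    case let .threadgroups(count, perGroup):
        encoder.dispatchThreadgroups(MTLSize(width: count, height: 1, depth: 1),
                                     threadsPerThreadgroup: MTLSize(width: perGroup, height: 1, depth: 1))
    }
}
#endif
```

Keeping the size arithmetic in a plain function separate from the Metal calls makes it easy to check without a GPU. The fallback path still needs a bounds check in the kernel whenever the element count is not a multiple of the threadgroup width.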