I have successfully used Metal to render into a simple view, and update at full FPS, by following and modifying the code at https://github.com/jim-ec/metal-hello-triangle
I am now trying to repeat the process by drawing into multiple views in an NSCollectionView.
My render code is as follows, where each view in the NSCollectionView has a global Renderer object as its delegate:
    class Renderer {
        /// Initialisation as per the linked tutorial

        public func draw(in view: MTKView)
        {
            let dataSize = VERTEX_DATA.count * MemoryLayout.size(ofValue: VERTEX_DATA[0])
            VERTEX_DATA[1][1] = Float(arc4random_uniform(100)) / 100
            let vertexBuffer = view.device?.makeBuffer(bytes: VERTEX_DATA,
                                                       length: dataSize,
                                                       options: [])
            let commandBuffer = mCommandQueue.makeCommandBuffer()!
            let commandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: view.currentRenderPassDescriptor!)
            commandEncoder?.setRenderPipelineState(mPipeline)
            commandEncoder?.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
            commandEncoder?.drawPrimitives(type: .triangle,
                                           vertexStart: 0,
                                           vertexCount: 3,
                                           instanceCount: 1)
            commandEncoder?.endEncoding()
            commandBuffer.present(view.currentDrawable!)
            commandBuffer.commit()
            commandBuffer.waitUntilCompleted()
        }
    }
The arc4random_uniform call is to make the triangles change colour, so I can confirm that they are redrawing.
Unfortunately, the views only update around 3 times a second, and not at all in sync. Looking at Instruments, the GPU is barely breaking a sweat, while the CPU overhead of both WindowServer and my app is pretty huge. The draw function is now very heavy, with makeRenderCommandEncoder becoming a real bottleneck, and there are obviously some synchronisation issues given the sporadic updating.
I expected this to be more-or-less similar performance as I’m still doing a trivial amount of drawing, but it really isn’t. What am I doing wrong?
Working example available at https://github.com/SirWhiteHat/MetalPerformanceProblem
Thank you!
There is not enough code here to deduce anything tbh.
@Spo1ler is there anything you’d recommend I add to make it clearer? Apart from this, I have the initialisation code as detailed in the link and an AppDelegate, nothing else.
in the example you linked, there’s waitUntilCompleted, do you have waitUntilCompleted somewhere?
@Spo1ler I do now 😉 good spot! But unfortunately to no avail, same issue. I will add the initialisation code for completeness.
More code would be insightful, or better yet a minimal reproducible project. I would suggest looking at what is using up the CPU. arc4random is pretty expensive IIRC, and if you are running 30 of them 30-60 times per second, that could be part of the problem.
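One rough way to check the arc4random suspicion before reaching for Instruments is to time the calls on their own. The snippet below is only a sketch: measureSeconds is a hypothetical helper (not from the linked project), the figure of 30 views at 60 FPS is an assumption taken from the comment above, and it assumes macOS, where arc4random_uniform is available through Foundation.

```swift
import Foundation

/// Hypothetical helper: runs `body` `iterations` times and returns
/// the elapsed wall-clock time in seconds.
func measureSeconds(iterations: Int, _ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations {
        body()
    }
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / 1_000_000_000
}

// Assumed workload: 30 views, each redrawn 60 times per second.
let callsPerSecond = 30 * 60

// Accumulate the results so the compiler cannot discard the calls.
var sink: UInt32 = 0
let elapsed = measureSeconds(iterations: callsPerSecond) {
    sink &+= arc4random_uniform(100)
}
print("\(callsPerSecond) calls took \(elapsed * 1000) ms (sink: \(sink))")
```

If the printed time is a tiny fraction of a second, the random call is unlikely to explain the slowdown, and the time is being spent elsewhere in draw.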