SLI multicasting with multisampled FBOs


#1

Hi there, this is probably a very “niche” question. I am trying to use the NVIDIA SLI multicast extension (GL_NV_gpu_multicast) with Cinder, to get better performance in VR. It seems to be working fine, and I’m definitely seeing performance improvements (so far around 1.7x frame rate improvement in my particular scenario, with two GTX 1080s). However, I’ve run into a mysterious and frustrating thing…

I’m rendering (for VR) to two Cinder FBOs, normally with 4 samples. When SLI multicast is available, I render the left eye into GPU 0’s left FBO, and simultaneously the right eye into GPU 1’s “left” FBO. Then, when the rendering is done (after making a single set of draw calls to draw both eyes!), I need to blit the contents of GPU 1’s “left” FBO back to GPU 0 as its “right” FBO. This is required before submitting the two eye textures to OpenVR (or even just to draw them on screen, say). The transfer is done with a call to glMulticastBlitFramebufferNV(), which is like the regular glBlitFramebuffer() call but lets you explicitly transfer between GPUs.

What I’m seeing is that the right eye (after transferring back to GPU 0) is not multisampled. It is all aliased, as if the resolve didn’t happen properly (or more as if rendering was done without multisampling, since you can’t visualize an MS buffer directly). I have called fbo->resolveTextures(), and before that bindFramebuffer(), which marks the FBO as “dirty” so that the resolve actually runs.

Strangely, my exact same code works glitch-free if I use 8 or 16 samples in the FBOs, and also works if I turn on CSAA sampling. But it fails (the right eye is aliased) when I have 2 or 4 samples. Of course, it also works fine (both eyes aliased) when no multisampling is enabled.

I know it’s a long shot, but has anyone any thoughts on what might be wrong? I’ve tried all kinds of combinations of things, but keep seeing the same result. Maybe it’s even a driver bug, but somehow I doubt that.

Thanks,
Glen.


#2

Just for the record, I finally managed to sort this out (after first writing a pure OpenGL program to try to repro it, and then finding that the pure OpenGL program didn’t exhibit the bug! ;) ). So it was time to start stepping into the details of what Cinder is actually doing.

To do the multi-GPU stuff I want, I need to “wrangle” a Cinder FBO. Cinder protects you from the underlying details by keeping both the multisampled and the resolved versions in a single Fbo instance (normally a good thing!), but I need to transfer the resolved version from one GPU to the other.
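For anyone trying the same thing, here is a rough sketch of what transferring a resolved framebuffer from GPU 1 to GPU 0 might look like. It is not the exact code from my app. The `MulticastGl` struct and `copyRightEyeToGpu0()` are hypothetical names; the callbacks are meant to be pointed at glBindFramebuffer, glMulticastBlitFramebufferNV and glMulticastWaitSyncNV, and are injected here only so the sketch builds without GL headers or a context. The wait at the end reflects my reading of the extension's synchronization rules, so check it against the GL_NV_gpu_multicast spec.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Standard OpenGL enum values, redefined so the sketch builds without GL headers.
constexpr std::uint32_t kReadFramebuffer = 0x8CA8;     // GL_READ_FRAMEBUFFER
constexpr std::uint32_t kDrawFramebuffer = 0x8CA9;     // GL_DRAW_FRAMEBUFFER
constexpr std::uint32_t kColorBufferBit  = 0x00004000; // GL_COLOR_BUFFER_BIT
constexpr std::uint32_t kNearest         = 0x2600;     // GL_NEAREST

// Hypothetical table of entry points. In a real program these would wrap
// glBindFramebuffer, glMulticastBlitFramebufferNV and glMulticastWaitSyncNV.
struct MulticastGl {
    std::function<void(std::uint32_t target, std::uint32_t framebuffer)> bindFramebuffer;
    std::function<void(std::uint32_t srcGpu, std::uint32_t dstGpu,
                       int srcX0, int srcY0, int srcX1, int srcY1,
                       int dstX0, int dstY0, int dstX1, int dstY1,
                       std::uint32_t mask, std::uint32_t filter)> multicastBlitFramebuffer;
    std::function<void(std::uint32_t signalGpu, std::uint32_t waitGpuMask)> multicastWaitSync;
};

// Copies the colour buffer of srcResolveFbo as it exists on GPU 1 into
// dstResolveFbo on GPU 0. Both names are assumed to be single-sample
// (already resolved) framebuffers of the same size.
inline void copyRightEyeToGpu0(const MulticastGl& gl,
                               std::uint32_t srcResolveFbo,
                               std::uint32_t dstResolveFbo,
                               int width, int height) {
    assert(width > 0 && height > 0);
    const std::uint32_t srcGpu = 1;
    const std::uint32_t dstGpu = 0;

    // The blit reads from the read framebuffer and writes to the draw framebuffer.
    gl.bindFramebuffer(kReadFramebuffer, srcResolveFbo);
    gl.bindFramebuffer(kDrawFramebuffer, dstResolveFbo);

    gl.multicastBlitFramebuffer(srcGpu, dstGpu,
                                0, 0, width, height,
                                0, 0, width, height,
                                kColorBufferBit, kNearest);

    // Make GPU 0 wait until GPU 1 has finished the copy before it reads the result.
    gl.multicastWaitSync(srcGpu, 1u << dstGpu);
}
```

The point of passing raw framebuffer names is that the caller decides which of the two framebuffers inside the Fbo gets bound, instead of leaving that choice to the wrapper.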

So in the end it wasn’t really complicated, but I was confused by the various “intricacies”, such as that simply calling Fbo::bindFramebuffer() also marks the framebuffer as dirty (needing multisample resolve). It does that even if you just bind it for reading with GL_READ_FRAMEBUFFER. (Devs: maybe that should be changed, only flagging it as dirty for GL_DRAW_FRAMEBUFFER or GL_FRAMEBUFFER bindings?)
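To make the dirty-flag behaviour concrete, here is a toy model of it. This is not Cinder's actual implementation: `ToyFbo` and its members are hypothetical, and the `dirtyOnReadBind` switch exists only to compare the behaviour I ran into with the change suggested above.

```cpp
#include <cassert>
#include <cstdint>

// Standard OpenGL enum values, redefined so the sketch builds without GL headers.
constexpr std::uint32_t kReadFramebuffer = 0x8CA8; // GL_READ_FRAMEBUFFER
constexpr std::uint32_t kDrawFramebuffer = 0x8CA9; // GL_DRAW_FRAMEBUFFER
constexpr std::uint32_t kFramebuffer     = 0x8D40; // GL_FRAMEBUFFER

// Hypothetical stand-in for a multisampled FBO wrapper (not Cinder's class).
class ToyFbo {
public:
    // dirtyOnReadBind == true  models the behaviour described above;
    // dirtyOnReadBind == false models the suggested change.
    explicit ToyFbo(bool dirtyOnReadBind) : mDirtyOnReadBind(dirtyOnReadBind) {}

    void bindFramebuffer(std::uint32_t target = kFramebuffer) {
        assert(target == kReadFramebuffer || target == kDrawFramebuffer ||
               target == kFramebuffer);
        // Only draw-capable bindings can change the multisample contents.
        const bool writable = (target != kReadFramebuffer);
        if (writable || mDirtyOnReadBind)
            mNeedsResolve = true;
    }

    // Performs the (pretend) resolve only when flagged; returns true if it ran.
    bool resolveTextures() {
        if (!mNeedsResolve)
            return false;
        mNeedsResolve = false;
        ++mResolveCount;
        return true;
    }

    bool needsResolve() const { return mNeedsResolve; }
    int  resolveCount() const { return mResolveCount; }

private:
    bool mDirtyOnReadBind;
    bool mNeedsResolve = false;
    int  mResolveCount = 0;
};
```

With `ToyFbo current(true)`, a read-only bind leaves `needsResolve()` true, so the next `resolveTextures()` call does a resolve even though nothing was drawn. With `ToyFbo proposed(false)`, the same bind leaves the flag untouched.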

Also, I needed more granular control of the multisample vs. resolved “sub” framebuffers within one Cinder Fbo object, in particular to explicitly bind the resolved version. Luckily there is an Fbo::getResolveId() that lets you do everything directly, with raw OGL calls. For a while I was also confused because I thought Fbo::getId() was returning the resolve framebuffer’s ID; in the multisample case it actually returns the same value as Fbo::getMultisampleId(). (Yes, this is mentioned in the header… I just missed it the first time! ;) )
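The ID mix-up is easier to see in a small model. This is only a sketch: `ToyFboIds` and `transferSourceId()` are hypothetical, and the non-multisample branch of `getId()` (falling back to the resolve ID) is my assumption, so check Cinder's Fbo header for the real behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one wrapper object that owns two GL framebuffer names.
struct ToyFboIds {
    std::uint32_t resolveId;     // single-sample framebuffer that textures are read from
    std::uint32_t multisampleId; // multisample framebuffer, 0 when multisampling is off

    std::uint32_t getResolveId() const { return resolveId; }
    std::uint32_t getMultisampleId() const { return multisampleId; }

    // In the multisample case this matches getMultisampleId(), as described above.
    // The fallback to resolveId otherwise is an assumption of this sketch.
    std::uint32_t getId() const {
        return multisampleId != 0 ? multisampleId : resolveId;
    }
};

// The framebuffer name to bind as the read source for a cross-GPU transfer:
// always the resolved one, never whatever getId() happens to return.
inline std::uint32_t transferSourceId(const ToyFboIds& fbo) {
    assert(fbo.resolveId != 0);
    return fbo.getResolveId();
}
```

For `ToyFboIds{7, 9}`, `getId()` gives 9 (the multisample framebuffer) while `transferSourceId()` gives 7, which is the distinction I had missed.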

Anyhow – it’s very cool, now everything works, double GPU action now, even when multisampling is active.

Thanks,
Glen.


#3

Thanks for sharing your insights, Glen!


#4

Hi there,

I wanted to mention that there’s been a bit of thought around how ci::gl::Fbo could be made more flexible and intuitive, as you can see on this list. I’m not sure if his work in progress addresses your needs, but when the time comes I’m sure it’d be great to get feedback or suggestions on how to make things like what you’re doing easier.

cheers,
Rich


#5

Thanks for pointing me to that. Just in case it’s useful to him, I added a comment there.