Single-pass cubemap (SOLVED!)


#1

I’m trying to convert an FBO texture that holds a cubemap as a 3x2 grid of faces into an equirectangular projection. All the examples I find use a cubemap cross texture or gl::TextureCubeMap, and those rely on a three-dimensional texture coordinate lookup using world-space normals for the conversion. How would I use a 2D texture containing a 3x2 grid of all 6 faces without having to draw my FBO into a cubemap texture of layered FBOs with coordinate offsets?
Basically, a cubemap lookup in the frag shader looks something like this…

uniform samplerCube uTex0;
//------------------------
vec2 thetaphi = texCoord * vec2(3.1415926535897932384626433832795, 1.5707963267948966192313216916398);
vec3 rayDirection = vec3(cos(thetaphi.y) * cos(thetaphi.x), sin(thetaphi.y), cos(thetaphi.y) * sin(thetaphi.x));
fragColor = texture(uTex0, rayDirection);

rayDirection is a vec3 built from the world-space direction, but my custom 3x2 texture would need a vec2 for the texcoord, not a vec3.
I’d like to avoid having to redraw my texture into a cubemap.
Since my 6 faces are already on a single texture, I’d like to do something along the lines of…

uniform sampler2D uTex0;
int faceID; // pseudocode: which of the 6 grid cells this fragment belongs to
//----------------
vec2 thetaphi = texCoord * vec2(3.1415926535897932384626433832795, 1.5707963267948966192313216916398);
vec3 rayDirection = vec3(cos(thetaphi.y) * cos(thetaphi.x), sin(thetaphi.y), cos(thetaphi.y) * sin(thetaphi.x));
vec2 newCoord;
if (faceID == 0 || faceID == 1) newCoord = rayDirection.xy;
if (faceID == 2 || faceID == 3) newCoord = rayDirection.yz;
// etc...
fragColor = texture(uTex0, newCoord);

I was thinking, if I drew my 3x2 texture onto a plane instanced 6 times with a texture coordinate offset, I could assign an id to each square of my grid and assign custom texcoordinates to each grid cell.
If I understand correctly, rayDirection will always be zero on one of the three axes? If I had the grid with custom ids, maybe this would allow me to do a proper swizzle on rayDirection depending on which grid cell the fragment is looking at? Is there maybe a cleaner way of going about this? I have all the data on one texture. It seems silly and redundant to draw it again into a cubemap’s separate layers.

Also, if I were to draw it into a cubemap, which I’d like to avoid, I was looking at making my own class that inherits from TextureCubeMap. TextureCubeMap can take an image source, determine whether it is a 4x3 (cross) or 1x6 image, and fill its faces accordingly. I could add another statement to determine if the source is a 3x2 grid, but TextureCubeMap only supports an ImageSource using its Surface, not a TextureRef. How would I be able to use a TextureRef instead of an ImageSource when I use TextureCubeMap::create, instead of binding my cube face framebuffer and drawing the texture into it manually? Perhaps I’m trying to do something more like void TextureCubeMap::replace( const TextureData &textureData ) but with a single texture instead of 6…
But all this can be avoided if I could just parse proper coordinates from my texture in the first place…


#2

Hi,

I think you’re going the wrong way about this. You’re trying to render a dynamic cubemap to a single 2D texture with a 3x2 layout, but why would you want to do that? It’s not an ideal internal representation for OpenGL to use. Sampling it is not straightforward and you will run into issues when sampling near the edges of the faces.

Note that when loading one of the supported image configurations (6x1, 1x6, 3x4 or 4x3), the 6 faces of the map are extracted to 6 separate textures (see the relevant code here). They share the same texture ID, so are treated as a single texture with 6 faces and can be used directly as a cubemap, just like you described.

When rendering a dynamic cubemap, you also want to keep the 6 faces separate (because of the way OpenGL samples the faces). Cinder has an FboCubeMap class just for this, see the DynamicCubeMapping sample.

If you’re looking for a convenient way to store your texture after rendering it, try the built-in 1x6 or 6x1 configuration. The order of the faces is right (+X), left (-X), top (+Y), bottom (-Y), back (+Z) and front (-Z). That way, you can more easily load the texture later.

If you’re trying to create an equirectangular texture (which has a 2x1 aspect ratio, by the way), you also need to render to a cubemap first. In a second step, you sample this cubemap using the exact code of your post (with the thetaphi stuff) to find the color for each texel.
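The texel-to-direction step of that second pass can be sketched on the CPU side as follows. This is only a minimal sketch: it assumes the equirectangular coordinate lies in [-1, 1] on both axes (which the pi and pi/2 scale factors in the original shader suggest), and the names Vec3 and equirectToDirection are made up for illustration.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Maps an equirectangular coordinate (u, v), each assumed to lie in [-1, 1],
// to a unit direction vector, mirroring the thetaphi math from the shader.
inline Vec3 equirectToDirection( double u, double v )
{
    const double kPi = 3.14159265358979323846;
    const double theta = u * kPi;        // longitude in [-pi, pi]
    const double phi   = v * kPi * 0.5;  // latitude in [-pi/2, pi/2]
    return Vec3{ std::cos( phi ) * std::cos( theta ),
                 std::sin( phi ),
                 std::cos( phi ) * std::sin( theta ) };
}
```

The resulting vector is what would be passed to texture() on a samplerCube; the center of the image (0, 0) looks down +X and the top edge (v = 1) looks straight up +Y.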

Finally, if you’re trying to load a 3x2 texture into a cubemap, see if you can extract the 6 faces to a Surface yourself, then use this constructor.
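For the extraction itself, the only real work is computing where each face lives in the 3x2 image. A rough sketch, assuming the faces are stored row-major in the usual +X, -X, +Y, -Y, +Z, -Z order (the actual layout of the 3x2 texture is not stated in this thread, so treat that as an assumption); FaceRect and faceRectIn3x2 are hypothetical names:

```cpp
struct FaceRect { int x, y, w, h; };

// Returns the pixel rectangle of cube face `face` (0..5) inside a 3x2 atlas.
// ASSUMPTION: faces are stored row-major (+X, -X, +Y on the first row,
// -Y, +Z, -Z on the second row) and the atlas size divides evenly.
inline FaceRect faceRectIn3x2( int atlasWidth, int atlasHeight, int face )
{
    const int w = atlasWidth / 3;   // three faces per row
    const int h = atlasHeight / 2;  // two rows
    return FaceRect{ ( face % 3 ) * w, ( face / 3 ) * h, w, h };
}
```

Each rectangle can then be copied out of the source Surface into its own face image before building the cubemap.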

Let me know if I didn’t answer your question or if you have additional questions.

-Paul


#3

Hi Paul,
I guess I’m a little confused about the 6 separate textures with one ID… is that a single draw call for all 6 faces or is it 6 separate render passes? What I’m trying to do is draw the 6 faces in a single draw call so I’m implementing a multi viewport array.
This way I can draw all 6 faces onto a single texture in one pass without wasted texture space. I just duplicate the geometry in a geometry shader and feed it the camera matrices in an array. A 4x3 cross texture has a bunch of wasted blank space, so I’m using a 3x2. I have it working without any edge problems: I’m drawing my heavy rendering into the 3x2 texture first, then drawing that (lightweight) texture into an FboCubeMap. But if the FboCubeMap draws into its 6 faces in a single call, then I don’t need to do any of that…


#4

Here’s my code so far


#5

A cubemap texture as defined in OpenGL is a single texture with 6 layers. Each face of the cube has the same resolution, but is otherwise completely separate from the other faces/layers. When sampling from the texture, you’ll never run into problems when the texture coordinate exceeds the [0…1] range. By comparison: if you have a single 3x2 texture, you would have to make sure you never sample the wrong face, not even when interpolating between samples. This is harder than it seems.
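To make the difficulty concrete, here is a rough CPU-side sketch of what a manual lookup into a 3x2 atlas would have to do. The face selection follows the major-axis rule OpenGL uses for cubemaps; the row-major face layout, the face orientation inside the atlas, and the names AtlasCoord and directionToAtlas3x2 are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>

struct AtlasCoord { int face; double u, v; };

// Maps a non-zero direction to a face index (0..5 = +X,-X,+Y,-Y,+Z,-Z) and a
// UV inside a 3x2 atlas. ASSUMPTION: faces are laid out row-major and oriented
// as OpenGL orients cubemap faces.
inline AtlasCoord directionToAtlas3x2( double rx, double ry, double rz, int faceSize )
{
    const double ax = std::fabs( rx ), ay = std::fabs( ry ), az = std::fabs( rz );
    int face; double sc, tc, ma;
    if( ax >= ay && ax >= az ) { face = rx > 0 ? 0 : 1; sc = rx > 0 ? -rz : rz; tc = -ry; ma = ax; }
    else if( ay >= az )        { face = ry > 0 ? 2 : 3; sc = rx; tc = ry > 0 ? rz : -rz; ma = ay; }
    else                       { face = rz > 0 ? 4 : 5; sc = rz > 0 ? rx : -rx; tc = -ry; ma = az; }

    // Per-face coordinates in [0, 1].
    double s = 0.5 * ( sc / ma + 1.0 );
    double t = 0.5 * ( tc / ma + 1.0 );

    // Keep the sample at least half a texel away from the face border so that
    // bilinear filtering never blends in texels from the neighbouring cell.
    const double halfTexel = 0.5 / faceSize;
    s = std::min( std::max( s, halfTexel ), 1.0 - halfTexel );
    t = std::min( std::max( t, halfTexel ), 1.0 - halfTexel );

    return AtlasCoord{ face, ( face % 3 + s ) / 3.0, ( face / 3 + t ) / 2.0 };
}
```

Note what the clamp costs: samples near a cube edge no longer blend with the adjacent face, and mipmapping would need wider guard bands still. A real cubemap texture avoids both issues because the faces are separate images.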

Single-pass rendering of cubemaps is something I haven’t tried myself, but seems to be possible if I understand this extension correctly:

Geometry may be rendered to one of several different layers of cube map
textures, three-dimensional textures, or one- or two-dimensional texture
arrays. This functionality allows an application to bind an entire complex
texture to a framebuffer object, and render primitives to arbitrary layers
computed at run time. For example, it can be used render a scene into
multiple layers of an array texture in one pass, or to select a particular
layer to render to in shader code. The layer to render to is specified by
writing to the built-in output variable gl_Layer. Layered rendering
requires the use of framebuffer objects (see section 9.8).

So you should be able to write a geometry shader that outputs your scene to the 6 faces of a cubemap and then render to the correct face in your fragment shader. Pretty neat, but advanced. Not sure how the depth buffer fits into all of this… there is no such thing as a depth cubemap, as far as I know. But I could be wrong.

I haven’t had a chance to look at your code yet, so you may already be doing exactly that. If so, kudos to you, sir. As I said, I have no experience with this myself, but you piqued my interest. If only I had time.

-Paul

PS: for further reading, I suggest:
http://on-demand.gputechconf.com/siggraph/2016/presentation/sig1609-kilgard-jeffrey-keil-nvidia-opengl-in-2016.pdf


#6

oooh, that looks optimal… I like it. Definitely have to read more into this. I’m all about single-pass… My technique is more of a glviewportarray trick but I still had to draw it into a cubemap afterwards so this extension will hopefully help cut both corners. Thanks Paul!


#7

Single-pass rendering is definitely interesting, but I’d recommend you test it thoroughly for performance. Geometry shaders are notoriously slow when generating too many output vertices per input vertex. This fact alone was the reason tessellation shaders were created, which are more efficient in that department. The latter cannot be used for layered rendering, though (I think), otherwise I would have expected that 2016 presentation to mention them instead of geometry shaders. But yeah, very interesting stuff. Let us know how you fare.

I’d also like to mention Simon Geilfus’s work on this, see his amazing work here:


#8

I’ll be testing it for sure… I was referencing a bunch from Simon’s ViewportArray actually. Kinda what gave me the idea in the first place. I used the technique for single-pass stereo-rendering last year and I’ve been meaning to share that code… it works twice as fast as the cinder stereo example. But each draw call requires a custom geometry shader which supports either points, triangles, or lines.

From what I’ve read, generating vertices in the geometry shader really only bogs down when generating a whole bunch (hundreds) of vertices, in which case it’s usually suggested to use instancing, but I think 6 copies should be practical. However, if I understand the NV_viewport_array2 extension correctly, it kinda does what I was originally wondering about, but it’s also similar to the viewport array concept. They both use gl_ViewportIndex in the geometry shader. However, I’m not too familiar with gl_ViewportMask[] and gl_Layer, so that will be fun to learn.

My code has a few stages. First, draw the scene into a 3x2 grid using glviewportarray, second, draw that into a cubemap, and third, I draw the cubemap in equirectangular form into another texture so I can spout/syphon it out of the app. All in independent resolutions. So the NV_viewport_array2 extension should combine steps one and two.
My biggest issue was the communication load between the app and my shaders. So a single-pass instead of 6 passes is a tremendous improvement, especially for big scenes.


#9

Looks like I’m using AMD so I can’t use the NV extension… bummer. Perhaps my workaround will have to do.


#10

Wait, but my GPU is NVIDIA, so… am I missing something? I can’t find the NV_viewport_array2 extension…


#11

It’s a 2015 extension, only available on Maxwell (GTX9xx) and Pascal (GTX10xx) architectures.

If you don’t have access to it, the GL_ARB_shader_viewport_layer_array extension might be an alternative?

I got this information from this presentation (video).


#12

About the 6 layers / 1 texture bit, I think a lot of the confusion comes from OpenGL’s concept of “images” vs “textures”. Basically, a texture is made up of one or more images: the texture is the ID you use to reference that group of images, and the images are the actual data stored in memory, each with its own allocation but all accessible through the same texture ID. The most common use case is mipmapping: in practice you have a single texture, but effectively N images of lower and lower resolution are stored in memory to help solve different issues related to resolution/scale/performance… Layered textures (3D textures, cubemaps, etc…) add to the confusion by being one texture representing several sets of images… for example, a cubemap texture has 6 layers/faces, each of which can have several images…
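As a small worked example of the texture/image distinction, the number of images behind a single mipmapped cubemap texture ID can be counted like this. It is just arithmetic, assuming square faces and a full mip chain down to 1x1; the function names are made up for illustration.

```cpp
// Number of mip levels in a full chain for a square face of edge `size` texels.
inline int mipLevelCount( int size )
{
    int levels = 1;
    while( size > 1 ) { size /= 2; ++levels; }
    return levels;
}

// A cubemap has 6 faces, and each face stores one image per mip level.
inline int cubeMapImageCount( int faceSize )
{
    return 6 * mipLevelCount( faceSize );
}
```

For a 512x512 cubemap that is 10 levels per face, so 60 images in total, all reached through one texture ID.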

Re: gl_Layer. Those extensions while using gl_Layer are sort of unrelated and expose other functionalities you’re probably not going to need. Also gl_Layer should be more widely available.

https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/gl_Layer.xhtml

I would start with a passthrough geometry shader and try to write to gl_Layer. Maybe start with a hardcoded gl_Layer = 2; and see if that gets rendered to the right face. From there it should be fairly easy to wrap the whole thing into a for( int i = 0; i < 6... and read the relevant transforms from a uniform array.
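The loop version described above might look roughly like the following, stored here as a C++ string so it can be handed to a shader program. This is an untested sketch, not code from this thread: the uniform name uFaceViewProj is hypothetical, and it assumes the vertex shader writes a world-space position to gl_Position so that the per-face view-projection matrix can be applied here.

```cpp
#include <string>

// Geometry shader source: emits each input triangle once per cube face and
// routes every copy to its layer through gl_Layer.
// ASSUMPTION: the vertex shader leaves gl_Position in world space.
static const std::string kLayeredCubeGeom = R"glsl(
#version 150
layout(triangles) in;
layout(triangle_strip, max_vertices = 18) out; // 6 faces x 3 vertices

uniform mat4 uFaceViewProj[6]; // hypothetical: one view-projection per face

void main()
{
    for (int face = 0; face < 6; ++face) {
        for (int i = 0; i < 3; ++i) {
            gl_Layer = face; // has to be written for every emitted vertex
            gl_Position = uFaceViewProj[face] * gl_in[i].gl_Position;
            EmitVertex();
        }
        EndPrimitive();
    }
}
)glsl";
```

Replacing the outer loop with a fixed gl_Layer = 2; gives the hardcoded first test mentioned above.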

You might also need to set up your gl::Fbo in a specific way to make it work properly with gl_Layer. If I remember correctly, one requirement for “fbo completeness” is that all the attachments are layered. Something like this:

auto textureCubeMap = gl::TextureCubeMap::create( faceWidth, faceHeight, gl::TextureCubeMap::Format().immutableStorage() );
auto layeredFbo = gl::Fbo::create( width, height, gl::Fbo::Format().disableDepth().attachment( GL_COLOR_ATTACHMENT0, textureCubeMap ) );

Again, I haven’t done that for a while, so I’m not sure I remember correctly, but I believe that if you want both depth testing and writing to gl_Layer, you might need a depth cubemap attached to the depth attachment… something like this:

auto textureCubeMap = gl::TextureCubeMap::create( width, height, gl::TextureCubeMap::Format().immutableStorage() );
auto textureDepthCubeMap = gl::TextureCubeMap::create( width, height, gl::TextureCubeMap::Format().immutableStorage().internalFormat( GL_DEPTH_COMPONENT24 ) );
auto layeredFbo = gl::Fbo::create( width, height, gl::Fbo::Format().attachment( GL_COLOR_ATTACHMENT0, textureCubeMap ).attachment( GL_DEPTH_ATTACHMENT, textureDepthCubeMap ) );

Hope that helps.


#13

Thanks Simon. That totally helps with understanding fbos and gl_Layer, and I will absolutely want depth. Unfortunately, I haven’t gotten that far yet.
As for NV_viewport_array2… I was using OpenGL Extension Viewer to get a summary of my graphics renderer, but for some reason I couldn’t find NV_viewport_array2 anywhere. I’m using a GTX 1080 Ti, so I knew it should be supported, and I made a function in my Cinder example based on this OpenGL tutorial.
Here’s the function in my Cinder app:

void multiViewApp::setup_gl_extensions()
{
	console() << "GL_RENDERER " << glGetString( GL_RENDERER ) << endl;
	console() << "GL_VERSION " << glGetString( GL_VERSION ) << endl;

	// Query the extension list one entry at a time.
	GLint numExtensions = 0;
	glGetIntegerv( GL_NUM_EXTENSIONS, &numExtensions );
	for( GLint i = 0; i < numExtensions; i++ ) {
		const char *ext = (const char *)glGetStringi( GL_EXTENSIONS, i );
		console() << "GL_EXTENSIONS " << ext << endl;

		if( strcmp( ext, "GL_NV_viewport_array2" ) == 0 ) {
			// The extension is supported by our hardware and driver.
			// The tutorial loads an entry point at this spot, e.g. for GL_ARB_debug_output:
			// glDebugMessageCallbackARB = (PFNGLDEBUGMESSAGECALLBACKARBPROC)wglGetProcAddress( "glDebugMessageCallbackARB" );
		}
	}
}

Indeed I have

GL_EXTENSIONS GL_NV_viewport_array2
GL_EXTENSIONS GL_NV_viewport_swizzle

and I have the geometry passthrough extension as well. This is great but not sure why I couldn’t find them in the opengl viewer.
So the question is how do I implement the extension?
GL_ARB_debug_output, for example, uses

glDebugMessageCallbackARB = (PFNGLDEBUGMESSAGECALLBACKARBPROC) wglGetProcAddress("glDebugMessageCallbackARB");

which I get, because glDebugMessageCallbackARB is defined in Cinder\include\glload_int_gl_exts.h

How would I define these extensions and use them in Cinder? Let’s assume I’ve never used GL extensions, because I haven’t.


#14

Awesome summary and info, Simon!

Also, take another look at that PDF I linked to, it has a few slides (page 27 and on) that complement Simon’s explanation.

Finally, to use an extension in your shader, add this line at the top:

#extension GL_NV_viewport_array2 : require

If an extension also introduces new C functions, then you may be in trouble, because those have to be handled by the loader. The current Cinder version uses glLoad, which hasn’t been updated since OpenGL 4.5. There is a pull request on GitHub to replace it with GLAD. Using GLAD, it’s much easier to add support for missing extensions. But I think the extensions you’re after are shader-only.

-Paul


#15

That’s perfect! I thought maybe I was looking in the wrong place to put it. That makes sense. Also, I needed any excuse to use the reverse-z build you made on top of GLAD.


#16

Simon’s explanation ended up being really helpful in understanding the gl_Layer method for cubemap rendering. I have now worked out quite a few different techniques for approaching my single-pass cubemap…

Method 1: My original method
A) render to a single fbo using a viewport array to a 3x2 grid
B) duplicate and emit gl_Positions in the geometry shader using invocations and a matrix array of 6 camera positions
C) draw the fbo into a Cinder cubemap
D) draw the cubemap in equirectangular format

This method seems to work without edge artifacts, even though they should be a risk. It also only emits a maximum of 3 vertices per triangle strip in the geom shader, but it requires an extra step of drawing the original fbo into a cubemap in post. I’m also having a hard time changing my app window size without viewport problems.

Method 2: Simon’s gl_Layer suggestion
A) render into an fbo using multiple layers (gl_Layer)
B) duplicate and emit gl_Positions in the geometry shader, using invocations to determine gl_Layer and a matrix array of 6 camera positions
C) draw the separate layers into a cubemap
D) draw the cubemap in equirect format
It took me a bit to realize how close this method is to my original one. It should eliminate edge artifacts, but if I’m not mistaken, it duplicates and emits a max of 18 vertices per triangle, which may cause a bottleneck in the geometry shader. Both methods require redrawing the fbo into a cubemap in post.
Method 3: Paul’s GL_NV_viewport_array2 suggestion
A) set up a viewport swizzle for correct cubemap texture coordinates
B) draw into a multi-layered fbo using the GL extensions GL_NV_viewport_array2 and GL_NV_geometry_shader_passthrough
C) draw directly into equirectangular format via the fragment shader

This method would be ideal… It not only draws in a single pass, but it doesn’t require drawing into a cubemap in post, and it passes most data straight through the geometry shader, significantly reducing geom shader bottlenecks. It seems to be the most efficient and overall cleanest design, with less memory and fewer drawing steps. However, Cinder’s bundled loader (glLoad) doesn’t seem to be recent enough to expose a viewport swizzle function, even though the GL_NV_viewport_swizzle extension is recognized. So I have no idea how to implement this method without the viewport swizzle function.

As for the first two methods, there is a way to apply a custom swizzle in the geometry shader to avoid a post cubemap drawing step, but it has its drawbacks.
According to Nvidia’s cubemap example it looks as though Simon’s method works best using instanced rendering, otherwise it emits a total of 18 vertices instead of 3 and instancing allows us to avoid a switch statement in the shader for custom swizzling. Switches would cause branching issues and are not efficient. I’m also not sure how to implement the instancing if my design itself is instanced…
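If "instanced" here refers to geometry shader instancing (the invocations layout qualifier), which is my reading and not something the thread confirms, the vertex-count difference works out as follows: a looping shader must declare max_vertices = 6 x 3 = 18, while an instanced one runs 6 invocations that each emit only 3. A hedged, untested sketch, again with a hypothetical uniform name and a world-space gl_Position coming from the vertex shader:

```cpp
#include <string>

// Geometry shader source using 6 invocations, one per cube face.
// Requires GLSL 4.00 (or ARB_gpu_shader5) for `invocations`.
// ASSUMPTION: the vertex shader leaves gl_Position in world space.
static const std::string kInstancedCubeGeom = R"glsl(
#version 400
layout(triangles, invocations = 6) in;
layout(triangle_strip, max_vertices = 3) out; // per invocation, not per triangle

uniform mat4 uFaceViewProj[6]; // hypothetical: one view-projection per face

void main()
{
    for (int i = 0; i < 3; ++i) {
        gl_Layer = gl_InvocationID; // invocation index doubles as face index
        gl_Position = uFaceViewProj[gl_InvocationID] * gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
)glsl";
```

Indexing by gl_InvocationID also avoids a switch over the face, though any per-face swizzle would still need its own lookup.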

I’d love to use Paul’s GL_NV_viewport_array2 suggestion, but I have no idea how to implement the GL_NV_viewport_swizzle function even though the extension is recognized as supported… I think the NVIDIA PDF also mentions that depth comparison uses reverse Z, which I love. Perhaps this method is possible using Paul’s custom reverse-Z Cinder branch that implements GLAD instead of glLoad. Paul, you also mentioned that it is easier to add custom GL functions with GLAD… I’ll try to set up an example using your Cinder branch, but I may need your help with getting the viewport swizzle function in there…


#17

I cloned Paul’s reverse-z branch of Cinder…
Generated a glad.zip file from this website using every available extension…
Replaced Cinder’s glad.h and glad.c…
The function glViewportSwizzleNV() seems to be recognized!!!
I’ll let you know if I get something working.


#18

I’m really close… right now it’s rendering all of the faces onto one face/layer and it won’t let me change it manually.
For some reason, I get a memory issue when I try immutableStorage() on the textureCubeMap… wondering if it’s related. GL_DEPTH_COMPONENT32 is also not working and I could really use it…
I uploaded my code so far to GitHub.
The example is using Paul’s reverse-z branch and a custom glad generated file that includes the gl extensions.
Any help or ideas is greatly appreciated.


#19

Perhaps this NVIDIA sample provides a nice base for comparison. I have been comparing it to your code, but could not find any major differences. It must be in the details.


#20

Thanks Paul for checking it out… I’ve been comparing it too… kinda stumped at this point…