Sunday, February 20, 2011

Implementing histogram pyramid

Just recently, I wanted to implement the histogram pyramid datastructure as detailed here and here however I could not figure it out immediately how to go about it. Fortunately, I was able to find this website. Even though this website was a good starting point however it also left the gaps to be filled in by the readers themselves. Then, I looked into a GPL library HPMC) which made it easier for me to understnad how the histogram pyramid is actually implemented although the code is well written the shader stuff was a bit difficult for me to get hold of. So I went forward to implement the original technique from scratch on my own so here it is. I hope that the discussion here is useful for others. Here is a snapshot of what you would get in the end.
Histogram Pyramid

If you are able to produce something interesting out of this, do let me know I would love to see what you come up with. So here are the missing details.

The histogram pyramid starts with first generating a 2D texture out of your geometry. The first thing needed is generating a series of FBOs with multiple attacments that would act as storage for our histogram pyramid levels. We do so by first generating a series of FBOs with different attachments as follows,

 
vector fboIDs;
GLuint histopyramid_texID;
int size = 2048;
int max_level = (int)(log(size)/log(2));
void InitFBOs() {
glGenTextures(1, &histopyramid_texID);
glBindTexture( GL_TEXTURE_2D, histopyramid_texID);
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST_MIPMAP_NEAREST );
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_BASE_LEVEL, 0 );
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, max_level);
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA32F, size, size, 0, GL_RGBA, GL_FLOAT, NULL );
glGenerateMipmap( GL_TEXTURE_2D );

fboIDs.resize(max_level+1 );
glGenFramebuffers( static_cast( fboIDs.size() ), &fboIDs[0] );
for( GLuint m=0; m< fboIDs.size(); m++) {
glBindFramebuffer( GL_DRAW_FRAMEBUFFER,fboIDs[m] );
glFramebufferTexture2D( GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, histopyramid_texID, m );
if( !checkFBOStatus( __FILE__, __LINE__ ) ) {
cerr << "OpenGL error: Framebuffer for HP level " << m << " incomplete." << endl;
exit(EXIT_FAILURE);
}
}
glBindFramebuffer( GL_DRAW_FRAMEBUFFER,0 );
}

After setting up the fbo, we attach it to allow offscreen rendering and clear it.
Note that we need to setup the MAX mipmap level for directing the output to the correct mipmap level.

 
glBindTexture( GL_TEXTURE_2D, histopyramid_texID );
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 0);
glBindFramebuffer( GL_DRAW_FRAMEBUFFER, fboIDs[0] );
glClear(GL_COLOR_BUFFER_BIT);


Next is the slicing of the 3D geometry. We can do so by using the glOrtho function as follows.

 
int tileX = 16, tileY=16;
int w = size/tileX;
int h = size/tileY;
float teapot_size = 0.5;
float t_size_2 = teapot_size*2;
float clip_d = -t_size_2;
int x=0, y=0;
float dZ = (2*abs(clip_d))/(tileX*tileY);
for(int j=0;j< tileY;j++) {
x=0;
for(int i=0;i< tileX;i++) {
glLoadIdentity();
glOrtho(-t_size_2,t_size_2,-t_size_2,t_size_2,clip_d, clip_d+dZ);
glViewport( x, y, w, h);
glutSolidTeapot(teapot_size);
clip_d+=dZ;
x+=w;
}
y+=h;
}


This gives us something like this image


So now we have our geometry rendered to the base (0) level of the histogram pyramid. The original paper on building point clouds using histogram pyramid used descriminator to determine the active cells. We do this by using alpha test on the output during build process of the histogram pyramid (see the next code snippet). Now we need to build the histogram pyramid. We do so by using a reduction fragment shader and invoke it on a fullscreen quad. For each histogram pyramid level, we bind a new fbo and render the fullscreen quad. Thus, each successive level is a reduced version of its predecessor. This is achieved as follows,

 
base_shader.Use();
glBindFramebuffer( GL_DRAW_FRAMEBUFFER, fboIDs[0] );
glViewport( 0, 0, size, size );
glEnable(GL_ALPHA_TEST);
glAlphaFunc(GL_GREATER, 0.1f);
RenderFullscreenQuad();
glDisable(GL_ALPHA_TEST);
if( max_level < 1 ) {
return true;
}
reduction_shader.Use();
glBindTexture( GL_TEXTURE_2D, histopyramid_texID );
glUniform2f( reduction_shader_floor("delta"), -0.5f/size,0.5f/size );
glBindFramebuffer( GL_DRAW_FRAMEBUFFER, fboIDs[1] );
glViewport( 0, 0, size/2, size/2 );
RenderFullscreenQuad();
if( max_level < 2 ) {
return true;
}
for(GLsizei m=2; m<=max_level; m++) {
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_BASE_LEVEL, m-1 );
glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, m-1 );

glUniform2f(reduction_shader("delta"),-0.5f/(1<<(max_level+1-m)),0.5f/(1<<(max_level+1-m)) );

glBindFramebuffer( GL_DRAW_FRAMEBUFFER, fboIDs[m] );
glViewport( 0, 0, 1<<(max_level-m), 1<<(max_level-m) );
RenderFullscreenQuad();
}

reduction_shader.UnUse();



The base level shader is simply outputting the color from the histogram pyramid. The alpha test rejects all the zero entries and we are only left with the non-zero data.
The reduction shader I use is this,

 
uniform sampler2D histopyramid;
uniform vec2 delta;
void
main()
{
gl_FragColor = vec4(
dot( vec4(1.0), ( texture2D( histopyramid, gl_TexCoord[0].xy+delta.xx ) ) ),
dot( vec4(1.0), ( texture2D( histopyramid, gl_TexCoord[0].xy+delta.yx ) ) ),
dot( vec4(1.0), ( texture2D( histopyramid, gl_TexCoord[0].xy+delta.xy ) ) ),
dot( vec4(1.0), ( texture2D( histopyramid, gl_TexCoord[0].xy+delta.yy ) ) )
);
}



Obtaining total vertices


To see how many data items (points in our case) we have, we lookup the top element of the histogram pyramid. We do so by using a PBO, attach it to the histogram top level and then call glGetTexImage function as follows,

 
glBindBuffer( GL_PIXEL_PACK_BUFFER,histoPBOID );
glBindTexture( GL_TEXTURE_2D, histopyramid_texID );
glTexParameteri( GL_TEXTURE_2D,GL_TEXTURE_MAX_LEVEL, max_level);
glGetTexImage( GL_TEXTURE_2D, max_level,GL_RGBA, GL_FLOAT, NULL );


Next, we call the glBufferSubData function to extract the total vertices as follows,

 
float data[4];
glGetBufferSubData( GL_PIXEL_PACK_BUFFER,0, sizeof(GLfloat)*4, &data[0] );
totalVertices = static_cast( floorf(data[0]) + floorf(data[1]) + floorf(data[2]) + data(mem[3]) );
glBindBuffer( GL_PIXEL_PACK_BUFFER, 0 );


Rendering point cloud
For rendering of points, we first bind a temp VBO containing indices incremented linearly. The temp VBO is setup at the initialization as follows,

 

enumerate_vbo_n = 3*1000;
glGenBuffers( 1, &enumVBOID);
glBindBuffer( GL_ARRAY_BUFFER, enumVBOID);
glBufferData( GL_ARRAY_BUFFER, 3*sizeof(GLfloat)* enumerate_vbo_n,NULL,GL_STATIC_DRAW );
GLfloat* ptr = reinterpret_cast( glMapBuffer( GL_ARRAY_BUFFER, GL_WRITE_ONLY ) );
for(int i=0; i< enumerate_vbo_n; i++) {
*ptr++ = static_cast( i );
*ptr++ = 0.0f;
*ptr++ = 0.0f;
}
glUnmapBuffer( GL_ARRAY_BUFFER );


We could also have used the gl_VertexID builtin attribute to accesss the vertex ID which is used to index into the histogram pyramid. We bind this VBO and then call the glDrawArrays function as follows,

 

glBindBuffer( GL_ARRAY_BUFFER, enumVBOID );
glVertexPointer( 3, GL_FLOAT, 0, NULL );
glEnableClientState( GL_VERTEX_ARRAY );

user_shader.Use();

GLsizei N = totalVertices;
for(GLsizei i=0; i< N; i+= enumerate_vbo_n) {
glUniform1f( user_shader("key_offset"), static_cast( i ) );
glDrawArrays( GL_POINTS, 0, min( N-i, enumerate_vbo_n ) );
}
user_shader.UnUse();


The user shader accepts the incoming vertex stream and converts it into a 3d position. This vertex and fragment shaders are given below.

 

uniform sampler2D histopyramid;
uniform float key_offset;
uniform int max_level;
uniform float TILE_X,
TILE_Y,
TOTAL;

#define FUNC_X TILE_X/2.0
#define FUNC_Y TILE_Y/2.0
void
getVertexPos( out vec3 p )
{
float key_ix = gl_Vertex.x + key_offset;
vec2 texpos = vec2(0.5);
vec4 delta_x = vec4( -0.5, 0.5, -0.5, 0.25 );
vec4 delta_y = vec4( 0.0, -0.5, 0.0, 0.25 );
vec4 sums,hist,mask;
for(int i=max_level; i>0; i--) {
vec4 sums = texture2DLod( histopyramid, texpos, float(i) );
vec4 hist = sums;
hist.w += hist.z;
hist.zw += hist.yy;
hist.yzw += hist.xxx;
vec4 mask = vec4( lessThan( vec4(key_ix), hist ) );
texpos += vec2( dot( mask, delta_x ), dot( mask, delta_y ) );
key_ix -= dot( sums.xyz, vec3(1.0)-mask.xyz );
delta_x *= 0.5;
delta_y *= 0.5;
}
vec4 raw = texture2DLod( histopyramid, texpos, 0.0 );
sums = floor(raw);
hist = sums;
hist.w += hist.z;
hist.zw += hist.yy;
hist.yzw += hist.xxx;
mask = vec4( lessThan( vec4(key_ix), hist ) );
float nib = dot(vec4(mask), vec4(-1.0,-1.0,-1.0, 3.0));
texpos += vec2( dot( mask, delta_x ), dot( mask, delta_y ) );
key_ix -= dot( sums.xyz, vec3(1.0)-mask.xyz );
float val = fract( dot( raw, vec4(equal(vec4(nib),vec4(0,1,2,3))) ) );

vec2 foo = vec2(TILE_X,TILE_Y)*texpos;
vec2 tp = vec2( (2.0*TILE_X)/FUNC_X, (2.0*TILE_Y)/FUNC_Y ) * fract(foo);
float slice = dot( vec2(1.0, TILE_Y), floor(foo));

p = vec3(tp, slice)/vec3(1.0,1.0,TOTAL);

}

void main()
{
vec3 p;
getVertexPos( p);
gl_Position = gl_ModelViewProjectionMatrix * vec4( p, 1.0 );
//debug
gl_FrontColor = vec4(p,1);
//gl_FrontColor = gl_Color ;
}


The interesting bit is happening in the getVertexPos function. First the delta values for moving to the next cell are calculated.

 

float key_ix = gl_Vertex.x + key_offset;
vec2 texpos = vec2(0.5);
vec4 delta_x = vec4( -0.5, 0.5, -0.5, 0.25 );
vec4 delta_y = vec4( 0.0, -0.5, 0.0, 0.25 );


Then the traversal loop is initiated. This loop runs log_2(size) that is the total levels in the histogram pyramid. Since we are using 4 channels for each entry of the histogram pyramid, we do a partial sum by first looking up the value at the current histogram pyramid level. The swizzle part shown below calcualtes the end ranges for each child node.

 

for(int i=max_level; i>0; i--) {
vec4 sums = texture2DLod( histopyramid, texpos, float(i) );
vec4 hist = sums;
hist.w += hist.z;
hist.zw += hist.yy;
hist.yzw += hist.xxx;


Next, based on the current key_index, the mask is calculated to see whether the current key_index is less than the current sum at the current histogram pyramid level. This mask is used to calculate the texture lookup position and the new key_index is upated accordingly. After updating the key_index and texture position, the delta values are updated for the next level of the histogram pyramid.

 

vec4 mask = vec4( lessThan( vec4(key_ix), hist ) );
texpos += vec2( dot( mask, delta_x ), dot( mask, delta_y ) );
key_ix -= dot( sums.xyz, vec3(1.0)-mask.xyz );
delta_x *= 0.5;
delta_y *= 0.5;
}


After the traversal loop ends, we are already at the base level of the histogram pyramid. We find the raw value at the texture lookup position again calculating the sum and the end ranges as earlier. Then the mask is calculated and the texture lookup position and the key index are updated. The actual value is then calculated.

 

vec4 raw = texture2DLod( histopyramid, texpos, 0.0 );
sums = floor(raw);
hist = sums;
hist.w += hist.z;
hist.zw += hist.yy;
hist.yzw += hist.xxx;
mask = vec4( lessThan( vec4(key_ix), hist ) );
float nib = dot(vec4(mask), vec4(-1.0,-1.0,-1.0, 3.0));
texpos += vec2( dot( mask, delta_x ), dot( mask, delta_y ) );
key_ix -= dot( sums.xyz, vec3(1.0)-mask.xyz );
float val = fract( dot( raw, vec4(equal(vec4(nib),vec4(0,1,2,3))) ) );


Finally, the 3D position of the vertex is calculated by multiplying the texture position with the tile size. The Z value is approximated using the Y tiling values.

 

vec2 foo = vec2(TILE_X,TILE_Y)*texpos;
vec2 tp = vec2( (2.0*TILE_X)/FUNC_X, (2.0*TILE_Y)/FUNC_Y ) * fract(foo);
float slice = dot( vec2(1.0, TILE_Y), floor(foo));


Lastly, the 3D coordinate is calculated as follows,

 
p = vec3(tp, slice)/vec3(1.0,1.0,TOTAL);


This position is then rendered, giving us the following output.
Histogram Pyramid
The performance of the histogram pyramid is pretty fast. If u find anything interesting, do drop me an email.

Happy OpenGL coding.

4 comments:

  1. This is very interesting and a great explanation.

    Could you post your complete source code, either as a .zip file, or maybe on GitHub?


    Thanks!

    ReplyDelete
  2. Hi,
    i will try to find the sources and post asap.

    ReplyDelete
  3. Awesome, looking forward to it. (I keep checking back here every day ... :)

    ReplyDelete
  4. Hi Derek,
    Just letting you know. I have found the source code. However, there are some issues with the code so it is not outputting anything. As soon as it is sorted, I will post the code on github.

    Thanks,
    Mobeen

    ReplyDelete