Advice on particles & geom subdivision

hi all
I am practicing Graphics / Cinder / Shaders by focusing on particles engines as a way to learn.
I started with the amazing particles GPU (transformation feedback) example and got to a point where I map every pixel from a texture to particle. Now I would like the 1.2M particles to position themselves along the surface of a 3d model. The model that I am starting with is a very simple shape, with like 10 vertices or so. Can I use subdivision to increase the number of vertices then use that geometry to set the position of every particle in the vbo buffer (CPU side)? Or is there a way to interpolate the position of each particle in the update shader?
Apologies if the question isn’t clear, I’d be happy to supply more details

Hopefully I’m understanding you correctly…

You can get the vertex positions from a geom::Source instance like:

TriMesh mesh = TriMesh(geom::Sphere()); vec3* verts = mesh.getPositions<3>();

And then use the data that verts points at to initialize your VBO:

gl::VboRef vbo = gl::Vbo::create(GL_ARRAY_BUFFER, sizeof(vec3) * mesh.getNumVertices(), verts, GL_STATIC_DRAW);

yes thats the idea. I already have 1 million particles. I don’t think the mesh has that many vertices.
Would it be a good idea to subdivide the geometry?

Depends on what you’re trying to do - you could subdivide the geometry a bunch before passing the vertex positions to your new VBO. Or you could cycle through the same set of positions to fill an array with the desired number of vec3s. You can add a bit of randomness to the positions before transferring them to the GPU. It’s really up to you - idk if that answers your question…

With the strategy I posted above, the TriMesh will be deleted once it falls out of scope, so you’re only going to be generating that mesh on the CPU once. Therefore, increasing the subdivisions isn’t really going to have an effect on performance (other than at the start of your application, obviously). The assumption here is that all future transformations to the particles will occur on the GPU inside of a shader.

that’s exactly right. a one time subdivision routine at the start of the app, generating a VBO with 1M vec3. once bound to the GPU, transformation feedback is used and everything runs in shaders.
the idea: make a million particles align to any shape at runtime.

1 Like

It happened that I’ve done the exact same thing you described. Subdividing is definitely a possible option, but you’ll likely get much more particles at some places on the model than other places, usually at where vertices are dense. You might also start to see patterns caused by subdivision algorithm when you subdivide many times. Something like this:

Poisson disk sampling is an algorithm you can use for this purpose, it’s really good at reducing patterns while achieving an uniform sampling on a mesh. Simon Geilfus has implemented this algorithm in cinder: here.


wow that’s pretty incredible! thank you for sharing. I am trying to push the mesh sampling example and have it generate 1 million particles.

@phil : the pocket link is broken :confused:

Sorry about that, I just fixed it. I haven’t tried sampling that much points so I’m curious how it’ll work :slight_smile:

size_t k = 100; // the higher k is, the tighter the packing will be but the slower the algorithm float separation = 0.001f; // this is the min separation between samples

Took 149329ms to generate 660332 samples.

and it didnt cover the entire surface of the teapot

This sample is more an example on how to use the space partitioning block, than any sort of robust poisson disk mesh distribution algorithm. It is a pretty tricky subject and the code is definitely not made for that kind of mesh density.

Raising k to 100 while lowering separation to 0.001 is a bit crazy. I don’t think this exact code can be used as it to achieve what you’re after. You might want to check this sample first (and have a look at the article Phil posted above) to get an idea of how it works:

And maybe try something along those lines (not tested):

// Poisson disk parameters
size_t k = 300; // the higher k is, the tighter the packing will be but the slower the algorithm
float separation = 0.001f;    // this is the min separation between samples

//! IMPORTANT: tweak this part depending on the min separation and the scale of your model.
// If you don't, there's no point in using a grid
// Grid Parameters ( tweak this in relation to the mesh size )
float gridScale = 300.0f; // if the object is too small, the sp::Grid will not improve anything

// create a grid of the size of the mesh for checking distance between samples
auto boundingBox = mModelMesh->calcBoundingBox();
auto grid = sp::Grid3<int>( gridScale * boundingBox.getMin(), gridScale * boundingBox.getMax() );

vector<vec3> samplesList;
// go through all the triangles of the mesh
for( size_t i = 0; i < numTriangles; ++i ) {
     // get the current triangle vertices
    vec3 a, b, c;
    mModelMesh->getTriangleVertices( i, &a, &b, &c );
    vec3 ab = b - a, ac = c - a;

    // and use them to spawn k potential sample on this triangle
    for( size_t j = 0; j < k; ++j ) {
        // random vec3 on triangle
        vec2 uv = vec2( randFloat(), randFloat() );
        if( uv.x + uv.y > 1.0f ) uv = vec2( 1.0f ) - uv;
        vec3 newPoint = a + ab * uv.x  + ac * uv.y;

        // check if the new point has no neighbors that are too close to it
        // = no neighbor in the "separation" range
        // if it's far enough from every other sample we can accept the new sample.
        if( !grid.rangeSearch( gridScale * newPoint, gridScale * separation ).size() ){
            // if the point is selected add it to the output list and grid
            samplesList.push_back( newPoint );
            grid.insert( gridScale * newPoint );

This is a simplified version of the sample you used. Instead of using random positions on random triangles, it will try to spawn k samples on every triangles of the mesh. I’ve not tested it, but it should allow a higher density. (EDIT: We can’t really talk about poisson disk sampling anymore)

The key thing with that kind of method is that every time you add a new sample you need to check every other sample to see if they are not too close to each other. That’s how you achieve that kind of uniform look.

This obviously can very quickly become a huge and time consuming task. That’s where a grid or a kd-tree become handy. Because of how my Grid class is implemented you need to be careful with the scale of the data. It uses integer positions to fit every point in a bin of the grid. Which means that if all your position are between vec3(-1.0f ) and vec3(1.0f) they will all end up in the same bin. Hence all the positions being scaled by 300.0f in my example. Make sure to scale your model or data properly or use a KdTree instead!

Let me know if it works any better!

1 Like

Just tried the code above and it give a pretty dense distribution in slightly more than one second :


fantastic! indeed this isn’t poisson disk sampling anymore. your sample works great! indeed with the k=300 and separation=0.01f, it was a long running process. now this is super fast, but I am thinking of doing data dumps of the generated data and have a few canned models that i can switch between at runtime. although now I managed to generate a pretty nice distribution with k=600, sep=0.005 :smiley:
Took 12400.3ms to generate 1105120 samples. (on an iMac with an i7 and an nvidia)

Now, i am going to plug in my models and give it a shot and will keep in mind your point about scale. If I understand correctly, a kd-tree will always find the right bin irrespective of the grid size?

On a side note, is there a compact data format for vertex data that is recommended for Cinder? Or is it common to use JSON ?

sorry for all the noob questions and your help is much appreciated!!

size_t k = 600; float separation = 0.0005f; float gridScale = 600.0f;

Took 3892.2ms to generate 1105120 samples


regarding compact data format to store the point cloud: use binary. Store each float as 4 bytes in a binary file. That way, file size will be smallest and no conversion is needed. Reading the file will be lightning fast.

DataTargetRef target = ...;

auto stream = target->getStream();
stream->writeData( (void*)&myFloat, sizeof( myFloat ) );
stream->writeData( (void*)&myVec3, sizeof( myVec3 ) );
stream->write( myString );
stream->writeLittle( myUint16_t );

std::vector<vec3> myData;
stream->writeLittle( (uint32_t) myData.size() );
stream->write( (void*), myData.size() * sizeof( vec3 ) );
DataSourceRef src = ...;

auto stream = src->getStream();
auo filesize = stream->size();
stream->readData( (void*) &myFloat, sizeof( myFloat ) );
stream->readData( (void*) &myVec3, sizeof( myVec3 ) );
stream->read( &myString );
stream->readLittle( &myUint16_t );

std::vector<vec3> myData;
uint32_t numVertices = 0;
stream->readLittle( &numVertices );
myData.resize( numVertices );
stream->read( (void*), numVertices * sizeof( vec3 ) );

Warning: untested code, wrote this without compiling it myself.

Also note that calling resize() on a std::vector will cause it to construct a bunch of vec3's (*) that you are going to replace immediately, which is a waste of CPU cycles. Sadly there is no other way I know of, because reserve() does not set the size of the vector. If performance is critical, consider using a unique_ptr to a simple array instead.


  • vec3's default constructor is called, which sadly initializes its components to zero.

thanks Paul, that worked really well! It took 20.053ms to read a 13mb file with 1.1 million vec3 points.

for accuracy’s sake here is how I wrote and read the file


std::vector<vec3> samplesList;  
DataTargetRef target = DataTargetPath::createRef(  getAppPath() / "model.txt" );
auto writeStream = target->getStream();
writeStream->writeLittle( (uint32_t) samplesList.size() );
writeStream->writeData( (void*), samplesList.size() * sizeof( vec3 ) );


std::vector<vec3> samplesList;
DataSourceRef src = DataSourcePath::create(getAppPath() / "model.txt");
auto stream = src->createStream();
uint32_t numVertices = 0;
stream->readLittle( &numVertices );
samplesList.resize( numVertices );
stream->readData( (void*), numVertices * sizeof( vec3 ) );

I like to use Cereal when reading/writing data. That way you can pack extra data in as needed without needed to do much to maintain your data format.

thanks for the tip! For point cloud data, Paul’s suggestions worked great. Ill check it out though, thanks