Particle Z-sorting
category: code [glöplog]
Hey,
Off topic:
Sorry for my temper the other day (if you were a victim of it), I was in a regretfully bad mood at the time.
On topic:
A good algorithm for particle Z-sorting
(for slow-moving semi-transparent occluding particles like fog etc),
the best I can think of is
Qsort which seems not the best solution...
Off topic:
Sorry for my temper the other day (if you were a victim of it), I was in a regretfully bad mood at the time.
On topic:
A good algorithm for particle Z-sorting
(for slow-moving semi-transparent occluding particles like fog etc),
the best I can think of is
Qsort which seems not the best solution...
jaw: for fog-ish effects etc you don't need exact sorting, only rough, so something simple like radix sort could be good enough.
never heard that anyone implemented it (speed reasons?) but perhaps depthpeeling or reversed depthpeeling could do the job (its per pixel "sorting")..
'Bitonic Merge Sort' on the GPU. whatever that is
http://research.microsoft.com/apps/pubs/default.aspx?id=70307
Accelerated depth peeling at least.. definetly worth looking into. From a quick glance of the PDF - doesn't say wether it runs well in realtime or not but I'll try it. Thanks!
Accelerated depth peeling at least.. definetly worth looking into. From a quick glance of the PDF - doesn't say wether it runs well in realtime or not but I'll try it. Thanks!
depends on the number of particles and how fast they are moving.
1000 particles - radix sort. static particles - presorted indexbuffers from the 6 major view axes. 1million fast moving particles - something on gpu like bitonic sort or a gpu bucket sort (which is how i did it).
1000 particles - radix sort. static particles - presorted indexbuffers from the 6 major view axes. 1million fast moving particles - something on gpu like bitonic sort or a gpu bucket sort (which is how i did it).
cuda sdk comes with a nice demo and sourcecode called "particles" or something that does radix along the way (in the gpu, obviously).
Don't know if that's applicable in your case but if the particles are moving slowly and the camera is relatively stable, surely the list of particles could be pre sorted and when updating a particle, you could check if it just passed ahead of it's immediate sibbling and swap them. That type of sort has a bad big O complexity but it seems like a very good fit for a flock of particles moving together with little shuffling.
The ancient glue-method does 5-pass radix sort on CPU (first make 4 histograms, then sort by each byte), but if you have _lots_ of stuff you'll probably want to do it with GPU.
GPU sorting is a complex topic. There are lot of approaches and all seem to suck in one way or another but the immense processing power makes up for that.
Method 1: Sorting network. Downside: Requires a lot of passes. Not asymptotically perfect either. (I don't see why people prefer bitonic over odd-even, but there's probably some tradeoff involved)
Method 2: Vertex shader. Downside: Using triangles just to scatter some few pointers just seems silly. Potential bottleneck on old hardware too.
Method 3: Something more advanced. Downside: Works on maybe 10% of cards.
GPU sorting is a complex topic. There are lot of approaches and all seem to suck in one way or another but the immense processing power makes up for that.
Method 1: Sorting network. Downside: Requires a lot of passes. Not asymptotically perfect either. (I don't see why people prefer bitonic over odd-even, but there's probably some tradeoff involved)
Method 2: Vertex shader. Downside: Using triangles just to scatter some few pointers just seems silly. Potential bottleneck on old hardware too.
Method 3: Something more advanced. Downside: Works on maybe 10% of cards.
Method 4: Fake it, no one will notice.
Ah yeah, then there's e.g. weighted blending. That's of course even more advanced effect than just sorting :)
Method 5: Use only additive or multiplicative particles and don't sort. :P
Yeah, there's been a shortage of additive flares in demos
never ever seen multiplicative ones till now :(, but there must be atleast one demo ftw..
for static particles (up to few millions), what smash said
iq: BINARY OR IT DIDNT HAPPEN!
what kb said.
supah: What 216 said.
binary would be nice :)
this nvidia paper to the cuda example that Hyde mentioned also covers/references iq's approach.
Method 6: Fuck it, after all it looks good enough without sorting. ;)
thanks for all the info, surely a lot to consider. I want to do a mix of different things, so it has to be sorted. Otherwise kb, your're quite right and that's what I've done until now :)
But it would be cool to mix static and moving particles so I'll try to make use of several of these techniques. For the moving particles I think it's easiest to do a radix sort in a thread since the GPU will be loaded with rendering the scene anyway, might be a bit much to also do sorting (talking about maybe 1000-5000 particles multicore systems nowadays), and for the rest I'll try the volume sort you suggested, iq..
One thing about that technique though, when you rotate the camera, don't you get "jumps" when it changes the sorting set? Or maybe like linear-interpolate between 2 sets on the gpu..?
But it would be cool to mix static and moving particles so I'll try to make use of several of these techniques. For the moving particles I think it's easiest to do a radix sort in a thread since the GPU will be loaded with rendering the scene anyway, might be a bit much to also do sorting (talking about maybe 1000-5000 particles multicore systems nowadays), and for the rest I'll try the volume sort you suggested, iq..
One thing about that technique though, when you rotate the camera, don't you get "jumps" when it changes the sorting set? Or maybe like linear-interpolate between 2 sets on the gpu..?
btw. iq's method was already described in 1995/6 by Zappy (Pierre Terdiman) here..
jaw: If particles are "intersecting", you're bound to get pops no matter what.
kusma: that's true.