So, what do distance field equations look like? And how do we solve them?
category: code [glöplog]
iq: it's around 10 fps here - RV770@700MHz... after some 4-5 minutes of compilation time. It's just strange that when I enter the shaders in the ShaderAnalyzer I get the assembly result in around 10s (it's using Cat 9.4 while I'm running 9.8) - it produces a program about the size of a 4k's assembly (around 3500 ALU instructions), but the vec5 utilization is a bit worse (from a quick glance at it).
Texel: Nice idea, I should try that.. I've got some (unreleased) raytracers doing similar things, but this should work out nicely. It wouldn't be a problem with perspective projection either; it's just a matter of calculating the cone/pyramid width at the given distance.
Regarding the GPU difficulties, I expect another boost for these things when we get compute shaders (with LDS) soon, meaning that we can start working on groups of pixels, or at least share work between them. But this particular thing would work out nicely as multipass, saving some sort of rough, conservative z-buffer in the first pass.
it does sound a bit "hmmh" compared to the already tried and tested spatial structure optimizations used on the CPU
Ok, forget about the non-perspective thing.
Here is my new algorithm:
1) Make RAYS a list of all rays to raymarch
2) Pick a ray R from RAYS
3) Calculate the distance field sphere for R
4) For all rays in RAYS, intersect the sphere for R. If there is an intersection, move the ray origin to the farthest intersection point
5) If the sphere radius is lower than epsilon, take ray R out of RAYS
6) If there are rays in RAYS, goto 2)
This *should* work, shouldn't it?
Oops, in 4) I forgot:
4) For all rays in RAYS, intersect the sphere for R. If there is an intersection, move the ray origin to the farthest intersection point, but only if it is farther than its current origin
So well, the problems here are:
- In 2), what is the best way to pick a ray R?
and
- In 4), find a way to do as few intersection tests as possible. This would require maintaining two data structures for the rays
Oops, again, in 4) I forgot:
4) For all rays in RAYS, intersect the sphere for R. If there is an intersection, *and the origin point is inside the sphere*, move the ray origin to the farthest intersection point.
Now I think it is correct...
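In untested C++-ish code it would be something like this (sceneDistance is just a placeholder field, picking the last ray is the naive choice, and I added a tMax so escaping rays get retired):
Code:
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

struct Ray {
    Vec3  o;           // current origin, moved forward as we march
    Vec3  d;           // normalized direction
    float t   = 0.0f;  // distance marched so far
    bool  hit = false;
};

// example field: unit sphere at the origin (plug in any distance field)
float sceneDistance(Vec3 p) { return std::sqrt(dot(p, p)) - 1.0f; }

void marchPacket(std::vector<Ray*> rays, float epsilon, float tMax)
{
    while (!rays.empty()) {
        Ray* R = rays.back();              // 2) pick a ray (naive choice: the last one)
        float r = sceneDistance(R->o);     // 3) the distance field sphere around R's origin

        if (r < epsilon || R->t > tMax) {  // 5) converged (or escaped): retire R
            R->hit = (r < epsilon);
            rays.pop_back();
            continue;
        }

        // 4) advance every ray whose origin is inside the sphere (center R->o,
        //    radius r) to its far intersection with that sphere; for R itself
        //    this degenerates to the usual sphere-tracing step of length r
        for (Ray* q : rays) {
            Vec3 oc = { q->o.x - R->o.x, q->o.y - R->o.y, q->o.z - R->o.z };
            float c2 = dot(oc, oc);
            if (c2 >= r*r) continue;                    // origin not inside the sphere
            float b  = dot(oc, q->d);
            float tf = -b + std::sqrt(b*b - c2 + r*r);  // far root, always > 0 here
            q->o = { q->o.x + tf*q->d.x, q->o.y + tf*q->d.y, q->o.z + tf*q->d.z };
            q->t += tf;
        }
    }
}
How to pick R in 2) and how to avoid testing every ray in 4) are exactly the open problems above.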
Are you talking about moving all the rays' origins initially so the first ray test could be avoided?
Ferris, not only the first, but the next ones too, if possible
Well, given that the vertex shader has the intersection code, this could be implemented quite easily using pretty much the exact same methods used in prods like Heaven Seven :) But that would bloat the shader program size very quickly (yeah yeah psycho, I get it :P), so it probably wouldn't be a good idea.
AH!! thanks for the pic :) I see more of what you're saying...hmmm...
If used for the first one, the practicality is obvious and I think it would be a great speed optimization... however, the rest might quickly become overhead, it seems :/ I'll have to ponder this one a bit more..
Precalculate more!
ah -- picture clears it up a bit more. well, go try and see if it's feasible :)
Bookkeeping issues aside, the algorithm as described wouldn't work (apart from my z-prepass): as soon as a ray passes close to an occluder and the sphere grows behind it, you will extend a lot of rays right through the occluder..
I agree with psycho. I had another thought about this kind of algorithm that I shared with Ferris a few days ago. It's for sphere marching (tracing) and for sphere marching only. Simply put, we keep one sample in a "history" buffer and the following consecutive sample in the "current" buffer. These values are distances to our geometry (well, right now I think only convex geometry would be covered by this logic). Then we calculate the value "history" - "current". If the value is positive (which it should be while approaching an object), it means we're getting closer to the geometry and the usual behavior of the algorithm should follow; in other words, "just perform the classic adaptive ray-marching using the distance field". But, the trick is, if the result of "history" - "current" is negative, then it means we passed near an object with the ray but didn't hit it. At this point, we can eliminate this object from our future distance samplings and continue with the next closest object. That means reducing the "close-to-geometry" advances on the ray by nearly 50%; and as you might know, those tiny steps are the main cause of a lot of time spent while marching.
I hope I could explain my idea :). I will try to demonstrate it in a week's time with an actual application.
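Roughly, in untested C++ (the small fixed object count and the per-object distance functions are assumptions for the sketch; and remember, convex objects only):
Code:
#include <cfloat>
#include <cmath>

struct Vec3 { float x, y, z; };

const int NUM_OBJECTS = 4;   // assumption: a handful of convex objects

// example per-object field: unit spheres spaced along the x axis
float objectDistance(int i, Vec3 p)
{
    float qx = p.x - 3.0f * i;
    return std::sqrt(qx*qx + p.y*p.y + p.z*p.z) - 1.0f;
}

float marchWithHistory(Vec3 o, Vec3 d, float epsilon, int maxSteps)
{
    bool  skipped[NUM_OBJECTS] = {};  // objects eliminated after a near miss
    int   lastClosest = -1;
    float history = FLT_MAX;          // previous sample of the closest object
    float t = 0.0f;

    for (int i = 0; i < maxSteps; ++i) {
        Vec3 p = { o.x + t*d.x, o.y + t*d.y, o.z + t*d.z };

        // "current": distance to the nearest not-yet-eliminated object
        int   closest = -1;
        float current = FLT_MAX;
        for (int j = 0; j < NUM_OBJECTS; ++j) {
            if (skipped[j]) continue;
            float dist = objectDistance(j, p);
            if (dist < current) { current = dist; closest = j; }
        }
        if (closest < 0) break;              // every object eliminated: miss
        if (current < epsilon) return t;     // hit

        // "history" - "current" negative: we are moving away again, so the
        // closest object was a near miss -- eliminate it from future samplings
        if (closest == lastClosest && history - current < 0.0f)
            skipped[closest] = true;

        history = current;
        lastClosest = closest;
        t += current;
    }
    return -1.0f;                            // miss
}
(The lastClosest check just makes sure "history" and "current" are only compared when they refer to the same object.)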
explanation is fine -- sounds like it should definitely work actually..
For Paradistance I just used an adaptive minimum stepsize that I increase whenever I pass close to another object.. that turned out to speed up rays passing close to an object (or, even worse, through a gap between two objects) a lot... and it doesn't need much shader code, so it's perfectly usable for 4ks.. of course the rendering quality decreases when looking through a gap, but that isn't a big problem since it's only a few rays anyway, and overall quality is improved :)
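Something in this spirit, as an untested sketch (the trigger threshold and the growth factor here are just guesses, not the Paradistance values):
Code:
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// example field: unit sphere at the origin
float sceneDistance(Vec3 p) { return std::sqrt(p.x*p.x + p.y*p.y + p.z*p.z) - 1.0f; }

float marchAdaptive(Vec3 o, Vec3 d, float epsilon, int maxSteps)
{
    float t = 0.0f;
    float minStep = epsilon;                 // floor on the step size
    for (int i = 0; i < maxSteps; ++i) {
        float dist = sceneDistance({ o.x + t*d.x, o.y + t*d.y, o.z + t*d.z });
        if (dist < epsilon) return t;        // hit (or stepped through a thin gap)
        if (dist < 4.0f * minStep)           // passing close to something:
            minStep *= 1.5f;                 // raise the floor
        t += std::max(dist, minStep);        // never step less than minStep
    }
    return -1.0f;                            // miss
}
The std::max is the whole trick: near-miss regions can no longer force thousands of tiny steps, at the cost of occasionally stepping through a thin gap.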
This is an excellent discussion, very informative. But I still have one basic noob question: what's the difference between ray-marching and ray-tracing? Just the hardware-accelerated, shader-based implementation?
Secondly, I'm reading about these approaching spheres and stuff, but wouldn't it be faster to just start with a test for camera-aligned bounding boxes, or is it so simple that it's implicit here?
For the rest, please continue your discussion, it's interesting :)
Raytracing gives you an exact answer, because you solve the equations of the ray and the surface analytically. Raymarching is an approximation: you test points along the ray for whether they have passed the surface or not.
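To make the difference concrete, here are both for a unit sphere at the origin (assuming a normalized direction d and a start point o outside the sphere):
Code:
#include <cmath>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Raytracing: solve |o + t*d|^2 = 1 for t in closed form.
float traceSphere(Vec3 o, Vec3 d)
{
    float b = dot(o, d);
    float h = b*b - dot(o, o) + 1.0f;                // discriminant
    return (h < 0.0f) ? -1.0f : -b - std::sqrt(h);   // exact hit distance, or miss
}

// Raymarching: step along the ray by the field value until it is ~zero.
float marchSphere(Vec3 o, Vec3 d)
{
    float t = 0.0f;
    for (int i = 0; i < 128; ++i) {
        Vec3 p = { o.x + t*d.x, o.y + t*d.y, o.z + t*d.z };
        float dist = std::sqrt(dot(p, p)) - 1.0f;    // distance field of the sphere
        if (dist < 1e-4f) return t;                  // "close enough" = approximate hit
        t += dist;
    }
    return -1.0f;                                    // gave up: miss
}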
Decipher:
I'm not sure if I've completely understood your algorithm, but...
Well, suppose you have an object defined with... let's say, Perlin noise.
While you raymarch towards the object, you get nearer and nearer to it. Then you get farther and farther from it. In this case, I think you delete it, right?
But with something like Perlin noise, you might get near it again (spikes), right? If you have already deleted it, you will not intersect it that way...
Quote:
(well right now I think only convex geometry would be covered by this logic)
Oops, sorry... forget what I wrote
Quote:
4) For all rays in RAYS, intersect the sphere for R. If there is an intersection, *and the origin point is inside the sphere*, move the ray origin to the farthest intersection point.
Quote:
Bookkeeping issues aside, the algorithm as described wouldn't work (apart from my z-prepass): as soon as a ray passes close to an occluder and the sphere grows behind it, you will extend a lot of rays right through the occluder..
Psycho, I'm not completely sure, but I think the problem you describe would not be a problem, because of the emphasized part above
Psycho: wow, 10 fps!!! That means we should soon (±1 year) start to see very interesting intros! I love hardware evolution.
Texel, and the others: as I understand it, all these optimizations you speak about (tracing/marching rays in groups) are only applicable to primary rays? What about reflection, shadow and AO rays? Also, I think that even if the trick worked for all types of rays, it would probably only be worthwhile for the first 2 or 3 marching steps, and then as soon as rays start to hit surfaces and get incoherent the complete thing would reduce to "monoray" plus the overhead of breaking the groups ("packets"). This is just an intuition, but I would bet it will happen exactly as with modern packet raytracing: groups/packets bring a lot of hassle for only a 2x speed-up on primary rays (so, <5% speed-up for an interesting image with AO, reflections and shadows). Anyway, this is just an intuition; any optimization is always worth a try of course.
The simplest optimization I can imagine right now for current intros, without changing much code (say, adding +250 bytes), would be to prerender a very coarse version of the distance field to a small 3D texture like 96^3 and round the distances upwards before writing to the texture. At marching time you would sample that one until you cannot advance any more cells, and then you switch to the procedural function. The code for this would be minimal I suppose, and it should work for all sorts of secondary rays, and also as a nice coarse collision-detection acceleration for particles or the camera etc. The problem would be bandwidth, but even on my stupid card a static 128x128x128 volume with a few primitives in it gives me >200 fps (only the tex3D part), while in Slisesix I get 1 fps, in Sult I get 10 and in MuonBaryon I get 7. Of course the volume should be re-centered every frame on the camera position. Perhaps one could even have 3 or 4 such 3D LUTs at the same resolution but each covering more physical volume, and then I would call the trick "cascaded distance maps". Somebody should try an intro with this.
But well, as Psycho says, let's wait for the compute shaders and we will see if the packet tracing ideas are indeed good.
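Just to sketch the single-level version of that in CPU code (96^3 as above; the cell size is an arbitrary choice, and instead of rounding I shrink each stored sample by half the cell diagonal so the coarse value never exceeds the true distance anywhere in the cell; on the GPU, lut would of course be a 3D texture read with tex3D):
Code:
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

const int   N    = 96;       // 96^3 volume, as above
const float CELL = 0.25f;    // world size of one cell (arbitrary)
static float lut[N][N][N];

// example field: unit sphere at the origin
float sceneDistance(Vec3 p) { return std::sqrt(p.x*p.x + p.y*p.y + p.z*p.z) - 1.0f; }

// Bake a conservative coarse field: shrink the cell-center sample by half
// the cell diagonal so the stored value is a safe lower bound.
void bakeLUT(Vec3 corner)
{
    const float cellRadius = 0.5f * CELL * std::sqrt(3.0f);
    for (int z = 0; z < N; ++z)
    for (int y = 0; y < N; ++y)
    for (int x = 0; x < N; ++x)
        lut[z][y][x] = sceneDistance({ corner.x + (x + 0.5f) * CELL,
                                       corner.y + (y + 0.5f) * CELL,
                                       corner.z + (z + 0.5f) * CELL }) - cellRadius;
}

float coarseDistance(Vec3 p, Vec3 corner)
{
    int x = std::clamp((int)((p.x - corner.x) / CELL), 0, N - 1);
    int y = std::clamp((int)((p.y - corner.y) / CELL), 0, N - 1);
    int z = std::clamp((int)((p.z - corner.z) / CELL), 0, N - 1);
    return lut[z][y][x];     // nearest-cell lookup, no filtering
}

// March on the cheap LUT while it still allows steps of at least one cell,
// then switch to the full procedural field near the surface.
float marchWithLUT(Vec3 o, Vec3 d, Vec3 corner, float epsilon, int maxSteps)
{
    float t = 0.0f;
    for (int i = 0; i < maxSteps; ++i) {
        Vec3 p = { o.x + t*d.x, o.y + t*d.y, o.z + t*d.z };
        float dc = coarseDistance(p, corner);
        float dist = (dc > CELL) ? dc : sceneDistance(p);
        if (dist < epsilon) return t;
        t += dist;
    }
    return -1.0f;
}
The "cascaded" version would presumably just be the same bake repeated at 2x, 4x... the cell size, sampling the finest level whose volume contains the point.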
Yes iq, only for primary rays... but maybe it could also be used for GI... maybe, I don't know.
I think I will try it anyway. As you know, I'm a CPU-only coder... and I will be until I buy a good enough graphics card.
The thing is, I have no idea whether it will increase the speed or not... so I'm just going to try it...