Problem with RayMarching using DistanceFunctions when applying Twist-Distortion.
category: general [glöplog]
here's the soft-shadow thingie, nicestep suggested. looks pretty nice and has really low overhead. thanks for sharing that idea :)
the obvious way to do soft shadows is to get the closest point from the shadowray to any geometry. If the distance is less than zero (so we got a hit), then pure shadow. Otherwise, fade from shadow to no-shadow with a smoothstep or something, based on that distance. It gives pretty ok shadows. You can have a look at http://iquilezles.org/www/material/nvscene2008/rwwtt.pdf, page 55 for the softshadow and 47 for the AO.
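A minimal sketch of that closest-approach shadow, written as plain C++ rather than shader code. The unit-sphere `sceneDistance`, the `softness` radius, the step cap and the hit epsilon are all assumptions picked for illustration; none of them come from the thread or the linked slides.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Assumed scene for the sketch: a unit sphere at the origin.
float sceneDistance(Vec3 p) { return length(p) - 1.0f; }

float smoothstep(float e0, float e1, float x) {
    float t = std::min(std::max((x - e0) / (e1 - e0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// Returns 0 for full shadow and 1 for fully lit.
// origin: start of the shadow ray, dir: unit vector towards the light,
// maxT: distance to the light, softness: distance over which the shadow fades.
float softShadow(Vec3 origin, Vec3 dir, float maxT, float softness) {
    float closest = softness;  // smallest distance seen so far, capped at softness
    float t = 0.0f;
    for (int i = 0; i < 128 && t < maxT; ++i) {
        float d = sceneDistance(add(origin, mul(dir, t)));
        if (d < 1e-4f) return 0.0f;    // the ray hit geometry: pure shadow
        closest = std::min(closest, d);
        t += std::max(d, 0.01f);       // sphere-tracing step with a small minimum
    }
    // Near misses fade smoothly from shadow to no shadow.
    return smoothstep(0.0f, softness, closest);
}
```

A ray that passes the sphere at a quarter unit with `softness = 0.5f` comes out roughly half lit; a larger `softness` widens the penumbra.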
There are lots of variants you can apply here, that's the cool thing of having the distance function to the geometry!
Another nice trick (generic, not only for distance fields) is to use the distance to the shadow intersection point to fade out a shadow. Gives pretty natural lighting.
The last remaining trick we need is to fake SSS, and we are done!
SSS, ftw!
very nice demo, xortdsc :)
xortdsc, do you know Rodrigo? I'm not sure if he worked on Farcry or Farcry2... He is a 3d expert, have you ever heard that name?
About the soft-shadows, I think that is what I was trying to say could be possible. It looks good, but not very soft. Can you make it softer?
About the grid thing... Due to the nature of distance fields, I don't think it would be optimized by adaptive grids (kd-trees or whatever). Also, as you said, kd-trees don't look to be very GPU friendly...
thanks guys :)
yes, the shadows can be made softer. i'll render a pic with larger light-radius later this evening.
about the grid/tree-thing: what i planned to do is to create an on-the-fly octree (not very deep) and for each node i store just 1 bit which tells me if that node is EMPTY (no surface enters the node-cell).
so now if the ray passes through a node-cell which is empty i can completely skip to the end of the node. if it is not empty i can visit its subnodes recursively. only when arriving at the deepest subnodes which are still not empty do i need to evaluate the distance-function. that way, lots of pixels will not evaluate the distance-function at all. one further opt would be to split the distance-function for each cell. so instead of having one global distance-function i have multiple local ones. sounds promising i think and i expect quite some speedup, but i've to see if it really works out...
and no, i've never heard of rodrigo ;)
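The empty-cell skipping described in this post can be sketched on the CPU as follows. To keep it short, this uses a single flat 8x8x8 grid instead of the hierarchical octree, and the grid extent, the unit-sphere `sceneDistance` and every function name are assumptions made for illustration. The emptiness test also assumes the distance function returns true Euclidean distances.

```cpp
#include <algorithm>
#include <bitset>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical setup: 8x8x8 cells of size 1 covering [-4, 4)^3.
constexpr int kCells = 8;
constexpr float kCellSize = 1.0f;
constexpr float kGridMin = -4.0f;
using Occupancy = std::bitset<kCells * kCells * kCells>;

// Assumed scene: a unit sphere at the origin.
float sceneDistance(Vec3 p) { return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f; }

// Cell coordinate along one axis, or -1 when outside the grid.
int cellCoord(float v) {
    int c = static_cast<int>(std::floor((v - kGridMin) / kCellSize));
    return (c >= 0 && c < kCells) ? c : -1;
}

// One bit per cell. A cell is left empty when the distance at its centre
// exceeds the cell's half-diagonal, so no surface can reach into it.
Occupancy buildOccupancy() {
    Occupancy occ;
    const float halfDiag = 0.5f * kCellSize * std::sqrt(3.0f);
    for (int z = 0; z < kCells; ++z)
        for (int y = 0; y < kCells; ++y)
            for (int x = 0; x < kCells; ++x) {
                Vec3 c{kGridMin + (x + 0.5f) * kCellSize,
                       kGridMin + (y + 0.5f) * kCellSize,
                       kGridMin + (z + 0.5f) * kCellSize};
                if (std::fabs(sceneDistance(c)) <= halfDiag)
                    occ.set((z * kCells + y) * kCells + x);
            }
    return occ;
}

// Distance along the ray until it leaves the cell that contains p.
float distanceToCellExit(Vec3 p, Vec3 dir) {
    const float pos[3] = {p.x, p.y, p.z};
    const float d[3] = {dir.x, dir.y, dir.z};
    float tExit = 1e30f;
    for (int a = 0; a < 3; ++a) {
        if (d[a] == 0.0f) continue;
        float cellMin = kGridMin + std::floor((pos[a] - kGridMin) / kCellSize) * kCellSize;
        float boundary = d[a] > 0.0f ? cellMin + kCellSize : cellMin;
        tExit = std::min(tExit, (boundary - pos[a]) / d[a]);
    }
    return tExit;
}

// Returns true on a hit; sdfEvals counts distance-function evaluations.
bool march(Vec3 o, Vec3 dir, const Occupancy& occ, int& sdfEvals) {
    float t = 0.0f;
    for (int i = 0; i < 256; ++i) {
        Vec3 p{o.x + dir.x * t, o.y + dir.y * t, o.z + dir.z * t};
        int cx = cellCoord(p.x), cy = cellCoord(p.y), cz = cellCoord(p.z);
        if (cx < 0 || cy < 0 || cz < 0) return false;   // left the grid: miss
        if (!occ.test((cz * kCells + cy) * kCells + cx)) {
            t += distanceToCellExit(p, dir) + 1e-3f;    // skip the empty cell
            continue;
        }
        float d = sceneDistance(p);                     // only in non-empty cells
        ++sdfEvals;
        if (d < 1e-3f) return true;
        t += d;
    }
    return false;
}
```

A ray that only crosses empty cells leaves the grid without a single distance evaluation, which is where the speedup would have to come from. Whether the extra cell lookups pay off on a GPU is something this sketch does not show.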
xortdsc:
With the octree you will probably need a stack, or a complex structure for stackless trees, or to traverse from the root again every time you don't find an intersection and want to find the next intersection.
In the case of the stack, it seems that it is not good for GPUs (as far as I know).
In the case of the complex structure, it is ugly. In the case of starting from the root again, it looks slow.
Just by intuition I think the non-adaptive grid would be faster. If you have the maximum distance values inside a cell of the grid, you can traverse a lot of cells in just one step, so it looks really fast (not to mention easy to implement).
Well, it would be interesting to know your results :)
realtime (fake) SSS would be really nice. i'll try to come up with something. i think i got an idea, but i have to think about it further to not make a fool of myself ;) actually the idea was more about volumetric lighting, but it might be the same thing actually... or pretty closely related. let's see... ;)
texel: well, since my grid would be regular (hierarchical, but regular on each level), i don't really need a stack, i just keep adjusting indices to my "current node" whenever i hit a cell-boundary and keep looping until i hit something or the farplane is reached.
however it might be beneficial to have the min-distance per cell instead of just the "is-empty-bool" to step further than just that cell. on the other hand, this structure should stay in shared-mem (since it is constantly used) so it must be really small (16kB total and i've to put a sin-table in there as well). 1 vs 32 bits is quite a difference and the hierarchy will serve as "merger" of neighboring empty cells, so here as well i can step further than just a single (leaf)-cell.
i'll play around with it and let you guys know :)
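To put numbers on the 1-bit versus 32-bit trade-off against a 16 kB budget, here is a small compile-time calculation. The grid resolutions (32 and 16 cells per axis) are assumptions chosen for illustration; the thread does not say which resolution was planned.

```cpp
#include <cstddef>

// Shared-memory budget mentioned in the post.
constexpr std::size_t kSharedBytes = 16 * 1024;

// Bytes needed for a cubic grid with the given number of bits per cell.
constexpr std::size_t gridBytes(std::size_t cellsPerAxis, std::size_t bitsPerCell) {
    return (cellsPerAxis * cellsPerAxis * cellsPerAxis * bitsPerCell + 7) / 8;
}

// 32^3 cells at 1 bit each: 4 kB, a quarter of the budget.
static_assert(gridBytes(32, 1) == 4096, "1-bit 32^3 grid is 4 kB");
// 32^3 cells with a 32-bit distance each: 128 kB, far over the budget.
static_assert(gridBytes(32, 32) == 131072, "32-bit 32^3 grid is 128 kB");
// Even a 16^3 grid of 32-bit distances uses the whole 16 kB on its own,
// leaving no room for the sin-table.
static_assert(gridBytes(16, 32) == kSharedBytes, "32-bit 16^3 grid fills 16 kB");
```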
maybe you could just calculate the SSS component by following the ray from the intersection further and calculating when it leaves the object. basically the same thing as this one:
I've tried making SSS for Hydra, but the problem with raymarching SSS was that it requires good behavior of distance functions inside the objects, which may sometimes be a bit troublesome. In the end, I just went for approximating SSS with a tweaked fresnel term. ;)
texel, gimme a call when you are online, need to speak to you (nothing to do with raymarching tho)
actually i just realised a problem with nicestep's soft-shadows. they are nice as long as the cone-angle is below 90deg (i think). with a wider angle (due to a near light or a big light-area) i'll get artifacts, because when computing distances to nearby objects i'll also (inevitably) include the surface the light actually falls on (the point i wanna shade). so the light gets "occluded" even tho it is not. too bad, since otherwise it looks really good.
now, i'll try iq's suggestion... ;)
Quote:
Another nice trick (generic, not only for distance fields) is to use the distance to the shadow intersection point to fade out a shadow. Gives pretty natural lighting.
i don't get it. imagine a floor which is to be lit. a light is straight above and in between there is another flat rectangle. with your formula, the shadow edges will not be much different from the full-shadow area, will they?
wow, did you guys (who use cuda) know that
Code:
for (int i = 0; i < 4; i++)
{
    Vector vecPos;
    switch (i)
    {
        case 0: vecPos = Vector(getSin(fPhase * 1.357f), getSin(fPhase * 3.357f + 1.31f), getSin(fPhase * 2.357f + 0.8572f)); break;
        case 1: vecPos = Vector(getSin(fPhase * 2.357f + 0.23f), getSin(fPhase * 1.7f + 0.31f), getSin(fPhase * 6.37f + 3.14f)); break;
        case 2: vecPos = Vector(getSin(fPhase * 1.17f + 1.64f), getSin(fPhase * 4.37f + 2.31f), getSin(fPhase * 4.57f + 1.31f)); break;
        case 3: vecPos = Vector(getSin(fPhase * 4.17f + 0.4f), getSin(fPhase * 3.7f + 1.1f), getSin(fPhase * 1.757f + 0.31f)); break;
    }
    float fLenSq = vecLengthSq(vecSub(vecPos, vecOrigin));
    fPotential += 1.0f / fLenSq;
}
is about 4 times slower than the equivalent
Code:
float fLenSq;
fLenSq = vecLengthSq(vecSub(Vector(getSin(fPhase * 1.357f), getSin(fPhase * 3.357f + 1.31f), getSin(fPhase * 2.357f + 0.8572f)), vecOrigin));
fPotential += 1.0f / fLenSq;
fLenSq = vecLengthSq(vecSub(Vector(getSin(fPhase * 2.357f + 0.23f), getSin(fPhase * 1.7f + 0.31f), getSin(fPhase * 6.37f + 3.14f)), vecOrigin));
fPotential += 1.0f / fLenSq;
fLenSq = vecLengthSq(vecSub(Vector(getSin(fPhase * 1.17f + 1.64f), getSin(fPhase * 4.37f + 2.31f), getSin(fPhase * 4.57f + 1.31f)), vecOrigin));
fPotential += 1.0f / fLenSq;
fLenSq = vecLengthSq(vecSub(Vector(getSin(fPhase * 4.17f + 0.4f), getSin(fPhase * 3.7f + 1.1f), getSin(fPhase * 1.757f + 0.31f)), vecOrigin));
fPotential += 1.0f / fLenSq;
bad compiler or bad me ?
it really sucks having to deal with this kind of crap...
jumps and conditionals are always bad if you can (practically) do without..
xortdsc: welcome to the muon baryon mayhem. conditionals on the GPU = committing suicide.
xortdsc: Try using an "avoid control flow" or similar flag if your compiler has one (the HLSL shader compiler does). Also, in your case, try to use formulas or arrays of constants instead of the switch.
Decipher: True. Especially conditionals whose branch can vary per pixel, which sometimes require counter-intuitive transforms to optimize.
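As a rough illustration of the "arrays of constants" suggestion, the switch from the earlier snippet can be replaced by a table lookup. This is a host-side C++ sketch, not CUDA: `std::sin` stands in for the poster's `getSin`, and `Vec3`, `SinParam`, `kParams` and `potential` are made-up names. The numeric constants are copied from the post. Whether this form is actually faster on a given CUDA compiler would have to be measured.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Frequency and phase offset for one coordinate of one point.
struct SinParam { float freq, offset; };

// One row per point, one entry per axis (x, y, z).
// A term without an offset in the original code gets offset 0.
static const SinParam kParams[4][3] = {
    {{1.357f, 0.0f},  {3.357f, 1.31f}, {2.357f, 0.8572f}},
    {{2.357f, 0.23f}, {1.7f, 0.31f},   {6.37f, 3.14f}},
    {{1.17f, 1.64f},  {4.37f, 2.31f},  {4.57f, 1.31f}},
    {{4.17f, 0.4f},   {3.7f, 1.1f},    {1.757f, 0.31f}},
};

// Sums 1 / distance^2 from origin to the four animated points,
// with no branch inside the loop body.
float potential(float fPhase, Vec3 origin) {
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) {
        const SinParam* p = kParams[i];
        float dx = std::sin(fPhase * p[0].freq + p[0].offset) - origin.x;
        float dy = std::sin(fPhase * p[1].freq + p[1].offset) - origin.y;
        float dz = std::sin(fPhase * p[2].freq + p[2].offset) - origin.z;
        sum += 1.0f / (dx * dx + dy * dy + dz * dz);
    }
    return sum;
}
```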
well, i would have thought the compiler unrolls that anyway, since everything is constant. otherwise this would totally make sense to avoid that conditional stuff, but with constants, the compiler should evaluate the condition at compile-time. ...so i thought.
anyway, thanks for the hints... :)
Use watcom! Ehrm. N/M.
The compiler has a threshold for code expansion, so try this:
#pragma unroll 4
for (int i = 0; i < 4; i++)
xortdsc: conditionals cause a higher register usage. Pay close attention to the number of registers your program uses! If it's above 10 (bad) or 16 (worse), you'll get such a low framerate... Making independent code blocks, reusing variables and avoiding conditionals will all help make the code parallelizable and use fewer registers.
There's a nice XLS sheet that comes with the CUDA sdk to calculate the number of threads executing in parallel, you should use that every time!
Damn you all, I've been spending my entire afternoon and evening on this raymarching stuff!
Ok, so now I'm a little stuck. I'm pretty sure I got the meaning of it all, but something seems to be wrong.
I have a function, march_ray, which takes a position vector and a direction vector as arguments and returns white (Vec3f(1, 1, 1)) if the ray collided with an object and black (Vec3f(0, 0, 0)) if it did not. So far so good!
When I try raytracing a simple 1.0f radius sphere at the origin of the world it works and I get a pretty white circle in the middle of the screen:
Code:
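Vec3f Renderer::march_ray(Vec3f& p, Vec3f& dir) {
    // Analytic ray/sphere test: put p + t*dir into |x|^2 = 1 and check
    // the discriminant of A*t^2 + B*t + C = 0.
    // A = 1 assumes dir is normalized. The test covers the whole line,
    // so a sphere behind the start point also counts as a hit.
    float A = 1;
    float B = 2 * dot(p, dir);
    float C = dot(p, p) - 1.0f;
    if (B*B-4*A*C >= 0)
        return Vec3f(1, 1, 1);
    else
        return Vec3f(0, 0, 0);
}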
However, when I try to do raymarching, it does not work:
Code:
Vec3f Renderer::march_ray(Vec3f& p, Vec3f& dir) {
    float d;
    do {
        //very simple distance function for the 1.0f radius sphere in the origin
        d = p.length() - 1.0f;
        p += dir * d;
    } while(d > 0.05 && d < 5.f);
    if(d <= 0.05f) return Vec3f(1, 1, 1);
    else return Vec3f(0, 0, 0);
}
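Without the calling code it is hard to say what actually goes wrong, but two details of the loop above are worth checking. `p` is taken by non-const reference and advanced in place, so if the caller passes the same camera position for every pixel, that position gets overwritten by each march. The `d < 5.f` test also ends the loop as a miss right after the first step whenever the start point is more than 6 units from the origin. A minimal sketch that avoids both is below; `Vec3f` here is a stand-in struct, and `marchSphere`, `maxTravel`, the step cap and the hit epsilon are assumptions, not the poster's code. `dir` is assumed to be normalized.

```cpp
#include <cmath>

// Minimal stand-in for the poster's Vec3f class.
struct Vec3f { float x, y, z; };

float length(Vec3f v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Returns true when the ray hits the unit sphere at the origin.
// origin is passed by value, so the caller's start position stays untouched.
bool marchSphere(Vec3f origin, Vec3f dir, float maxTravel = 100.0f) {
    float travelled = 0.0f;
    for (int i = 0; i < 128; ++i) {
        Vec3f p{origin.x + dir.x * travelled,
                origin.y + dir.y * travelled,
                origin.z + dir.z * travelled};
        float d = length(p) - 1.0f;            // distance to the sphere surface
        if (d < 0.001f) return true;           // close enough: hit
        travelled += d;                        // safe step: no surface is nearer than d
        if (travelled > maxTravel) return false;  // bound the travelled length, not d
    }
    return false;
}
```

With this version a start point 10 units away still finds the sphere, because the loop is bounded by the distance travelled rather than by the current distance value.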