trick for 1k raymarchers :: lighting
category: code [glöplog]
still picture on 4870
Weyland: works here, ati 5750
still picture still 5650
do the 10.7 ATI drivers give any solace? (see maytz's comment)
just a random idea: most pixels are pretty close to each other when raymarching; wouldn't it be possible to use this fact to estimate most of the hits?
I doubt that interpolating instead of raymarching each pixel would be easier and/or faster, and it would definitely not be smaller :)
Works for about a second or two, then goes all white. HD4870, Win7 x64.
works without problems here, ATI 4870, Win7 x64, Catalyst 10.7
sweet! thanks, you guys. I'll pack up some screen sizes and keep it like this, since it seems to work on the majority of cards now, unless someone can shadernazi this for me
estimating doesn't necessarily mean interpolation. f.ex. instead of getting there in 20 steps you could use the neighbour pixel's distance to start from close by and get there in a few steps.
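A toy CPU sketch of that warm-start idea (the scene, the grazing ray, and the 0.1 safety margin are all made up for illustration; a real shader couldn't read the neighbour's result like this, as the posts below explain):

```python
import math

def scene_sdf(p):
    # Hypothetical scene: a ground plane at y = -1.
    return p[1] + 1.0

def sphere_trace(ray_dir, t_start=0.0, max_steps=200, eps=1e-4):
    """Standard sphere tracing from t_start; returns (hit distance, steps used)."""
    t = t_start
    for i in range(max_steps):
        p = (ray_dir[0] * t, ray_dir[1] * t, ray_dir[2] * t)
        d = scene_sdf(p)
        if d < eps:
            return t, i
        t += d  # safe step: nothing in the scene is closer than d
    return None, max_steps

# A grazing ray converges slowly, so warm-starting pays off.
n = math.sqrt(0.1 ** 2 + 1.0)
ray = (0.0, -0.1 / n, 1.0 / n)
t_cold, cold_steps = sphere_trace(ray)
# Reuse the "neighbour's" hit distance, pulled back by a small safety margin.
t_warm, warm_steps = sphere_trace(ray, t_start=max(0.0, t_cold - 0.1))
```

On this scene the warm start reaches the same surface in roughly half the iterations, which is the speedup Oswald is after.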
(btw never coded anything like this, just reading you here, and rambling now..)
Oswald: in a shader you don't know the neighbour pixels?
oswald: you don't know the distance of the neighbour pixel in a parallel GPU render. And even if you do (by some multipass rendering), most of the speedup will be eaten by lack of branch coherence (how badly depends on the evaluation function). And it takes up space, of course.
I don't think you can read neighbouring pixel values during rendering (there's no way to predict which pixels are rendered in what order). Unless that's changed?
You could read from the previous frame perhaps, if the values haven't changed too much it might help a little. Adding that to a 4k that's already at or near 4k is probably tough though ;)
upsampling from a lowres run might work maybe?
I see. Wasn't aware of this massively parallel nature of the GPU :) Isn't there a way to circumvent it? Pick a 'thread' and calculate a few neighbour pixels with it?
You could calculate the neighbouring pixels, but you'd be doing that for every pixel so it'd actually make it multiple times slower rather than faster :)
If you're calculating the distance for a few points to get the normal instead of using IQs trick it might help.
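For reference, the "distance for a few points to get the normal" approach is just a finite-difference gradient of the distance field. A minimal sketch (the sphere SDF and the step size h are assumptions for illustration):

```python
import math

def sdf(p):
    # Hypothetical scene: unit sphere at the origin.
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - 1.0

def normal_at(p, h=1e-4):
    # Central differences: six extra SDF evaluations per shaded point.
    nx = sdf((p[0] + h, p[1], p[2])) - sdf((p[0] - h, p[1], p[2]))
    ny = sdf((p[0], p[1] + h, p[2])) - sdf((p[0], p[1] - h, p[2]))
    nz = sdf((p[0], p[1], p[2] + h)) - sdf((p[0], p[1], p[2] - h))
    l = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / l, ny / l, nz / l)
```

On the sphere, `normal_at((1.0, 0.0, 0.0))` comes out as the radial direction (1, 0, 0), as expected.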
You could do it with multipass rendering, but your pixel shader wouldn't normally output the information you want to interpolate in the next pass (which surface is intersected and coordinates of the intersection). So it's extra processing just to produce data that you can interpolate. And then more processing still to work out the regions where interpolation is least likely to produce artifacts, and then that information has to be passed to the shader, which uses up bandwidth.
Maybe better to look at optimising the distance function by eliminating terms for objects that the ray won't get close to, based on bounding boxes and stuff.
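A sketch of that bounding-volume idea (the radii and the expensive stand-in function are made up): outside a bounding sphere that encloses the object, the bound's own distance is a valid conservative lower bound on the true distance, so the march stays correct while skipping the costly evaluation.

```python
import math

calls = {"expensive": 0}

def expensive_sdf(p):
    # Stand-in for a costly distance estimator (e.g. a fractal term).
    calls["expensive"] += 1
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - 0.5

def bounded_sdf(p):
    # Bounding sphere of radius 1 around the expensive object.
    d_bound = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - 1.0
    if d_bound > 0.0:
        # The object lies entirely inside the bound, so d_bound is a
        # safe lower bound on its distance: the step can't overshoot.
        return d_bound
    return expensive_sdf(p)

# A ray that never comes near the object only ever pays for the cheap bound.
for t in range(20):
    bounded_sdf((5.0, 5.0, 0.1 * t))
```

Steps get less tight far from the object, but for rays that miss it entirely the expensive term is never evaluated at all.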
it's not about interpolation; I'm saying that imho, by using the neighbour pixel's distance, you can eliminate most of the raymarching steps.
but you can't use the neighbouring pixel's data, because it possibly hasn't been calculated yet. You could use data from the previous frame, or a low-res pre-render pass. Both involve additional code and reduced speed. The only case where it might help is when you're calculating the normal, I think?
Oswald: That still requires you to pass information about the intersection from one pixel to another, whereas the shader would normally just output the rendered pixel. Also you have the same problem as with interpolation, that you can't assume a ray will hit the same surface as the ray next to it without looking for edges. So you have much the same problems as with interpolation.
Well..
You can raymarch for a cone(/frustum) of pixels (and with distance functions you can be sure to not overshoot for any of the pixels, by stopping when the distance is less than current cone width), and then you'll have fewer evaluation steps per pixel from that common distance. You can do it in pixel shaders in two passes (by writing this lower bound depth in first pass - it doesn't have to be very precise), or more directly in compute shaders (but for performance reasons it should still be in several passes).
It does limit the amount of evaluations (but not as much as you may think), but because of the wide SIMD nature of the hardware the performance improvement isn't that big after all.
Sure, if you're not size limited and need a fast raymarcher you should do it, but it's not like you can render several times more complicated functions this way..
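A rough CPU sketch of that shared-cone march (scene, ray, and cone width are made-up numbers): one conservative march stands in for a whole tile of pixels and stops as soon as the distance bound no longer covers the cone's radius at the current depth; the per-pixel rays then start from that depth instead of from the camera.

```python
import math

def scene_sdf(p):
    # Hypothetical scene: ground plane at y = -1.
    return p[1] + 1.0

def cone_march(ray_dir, cone_half_width, max_steps=200):
    """One conservative march for a tile of pixels; returns a safe start depth."""
    t = 0.0
    for _ in range(max_steps):
        p = (ray_dir[0] * t, ray_dir[1] * t, ray_dir[2] * t)
        d = scene_sdf(p)
        if d < t * cone_half_width:
            break  # bound no longer covers every pixel ray in the cone
        t += d  # still safe for the whole cone: step the shared ray
    return t

n = math.sqrt(0.1 ** 2 + 1.0)
ray = (0.0, -0.1 / n, 1.0 / n)
t_shared = cone_march(ray, cone_half_width=0.01)
# Per-pixel sphere tracing would continue from t_shared instead of from 0.
```

As the posts above note, this only trims the shared early steps; the per-pixel refinement from `t_shared` onward still dominates, which is why the win is smaller than it first looks.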
hmm, and is it not possible to write the shader so that each "thread" calculates a group of pixels?
"you can't assume a ray will hit the same surface as the ray next to it without looking for edges"
of course I can't, but my knowledge ends here. f.ex. how about: if the hit is farther than the last one, we assume the next pixel is on a different surface, and start over.
"you can't assume a ray will hit the same surface as the ray next to it without looking for edges"
ofcourse I cant, but my knowledge ends here. fex. how about, if the hit is farther than the last one we can assume the next pixel is on a different surface, and start over.
Quote:
hmm, and is it not possible to write the shader so that each "thread" calculates a group of pixels ?
compute shaders (dx11) can.
But because of the SIMD nature, the straightforward compute shader solution is a very bad idea, as the worst-case pixel times will propagate to much larger areas (and the per-pixel work for sphere tracing varies wildly already).
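A back-of-the-envelope illustration of that worst-case propagation (the iteration counts and warp width are entirely made-up numbers): lanes in a SIMD group execute in lockstep, so each group pays for its slowest pixel.

```python
import random

random.seed(0)
WARP = 32  # lanes that execute in lockstep

# Hypothetical per-pixel iteration counts: mostly cheap, a few edge pixels costly.
steps = [100 if random.random() < 0.05 else 10 for _ in range(8192)]

ideal = sum(steps)  # total cost with perfect per-pixel scheduling
# Lockstep execution: every lane in a group waits for the slowest one.
lockstep = sum(max(steps[i:i + WARP]) * WARP
               for i in range(0, len(steps), WARP))
print(lockstep / ideal)  # well above 1: one slow pixel stalls 31 fast ones
```

With 5% slow pixels, most 32-wide groups contain at least one, so nearly the whole screen runs at the worst-case iteration count; that's why assigning a group of pixels per thread doesn't buy what you'd hope.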