does a linear z buffer work?
category: code [glöplog]
Because a linear zbuffer is always handy when you're sampling it for screenspace effects, has anyone tried storing a linear zbuffer by changing the vertex shader? If so, is there a big precision loss at 24 bits? I can imagine it works great with a 32F buffer. I could try it myself, but it's always good to ask first.
It works just fine.
The main reason graphics HW prefers "conventional" Z-buffers is that post-projection Z varies linearly across a screen-space triangle, i.e. no perspective correction required for Z interpolation/testing. Division's a bitch in HW, always nice to delay it until further downstream :)
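(Sketch of the math behind that, added here for clarity: with clip-space coordinates (x, y, z, w), both 1/w and z/w are affine functions of screen position across a triangle, so the rasterizer can interpolate them with plain linear interpolation. View-space z itself is not affine in screen space; recovering it per pixel would need a divide by the interpolated 1/w, which is exactly the division conventional Z-buffering avoids.)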
Cool, thanks oh master of the gpu and everything binary.
I'm going to try it, since apparently it's just another multiplication in the vertex shader.
code / math here
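A minimal GLSL sketch of the "one extra multiplication" idea, assuming a standard GL setup (the attribute and uniform names are made up for illustration, they're not from the linked article):

Code:
// Write linear view-space depth into the depth buffer by pre-multiplying
// with w, so the hardware's perspective divide cancels it out again.
attribute vec4 a_position;        // assumed vertex attribute
uniform mat4 u_modelViewProj;     // assumed uniforms
uniform mat4 u_modelView;
uniform float u_near, u_far;

void main()
{
    gl_Position = u_modelViewProj * a_position;

    // Linear depth in [0,1], taken from view-space z (GL looks down -z).
    float viewZ   = -(u_modelView * a_position).z;
    float linearZ = (viewZ - u_near) / (u_far - u_near);

    // Remap to GL's [-1,1] NDC range and multiply by w so that z/w comes
    // out linear. As discussed further down in the thread, this is only
    // exact at the vertices, because Z interpolation is not perspective
    // corrected.
    gl_Position.z = (2.0 * linearZ - 1.0) * gl_Position.w;
}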
ryg: Non-linear z also has the nice property of having better precision near the viewer, which is where rendering-artifacts are most apparent.
the precision issue is cool with fixed point numbers (old 16 or 24 bit zbuffers).
with a linear 32 bit floating point buffer, since floating point numbers have more precision near 0 and less for bigger numbers (in an exponential manner), i wonder if there is some sort of similar precision distribution effect anyway :)
iq: guess why a certain console maps its float zbuffer from -1 to 0 - the two precision distributions kinda cancel each other out. :)
Code:
I'm going to try it, since apparently it's just another multiplication in the vertex shader.
No it isn't. As I said above, Z interpolation doesn't do perspective correction. Which is correct if Z=z/w but wrong if Z=z. If you follow that article, Z values inside triangles will be wrong (and "swim" as you move), triangle intersections will be in the wrong place, and anything that reads the Z-buffer values to estimate visible geometry (e.g. DOF, SSAO etc. post-filters) will be affected.
Argh I just realized that the OP was also asking if you could switch to linear Z just by changing the vertex shader. Well, yes, linear Z works - and no matter how it's interpolated, it's always correct at the vertices :) - but no, because you can't set the interpolation mode for Z, you won't get correct results in the middle of tris. Now of course whether you care is another matter :)
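(A common workaround, sketched here as an illustration rather than taken from the thread: keep the regular depth buffer for hidden-surface removal and write linear view-space depth from the fragment shader into a separate target, where varyings are interpolated with full perspective correction, so the values are valid inside triangles too.)

Code:
// Vertex shader: pass view-space depth along as a varying (assumed names).
attribute vec4 a_position;
uniform mat4 u_modelViewProj;
uniform mat4 u_modelView;
varying float v_viewZ;

void main()
{
    gl_Position = u_modelViewProj * a_position;
    v_viewZ     = -(u_modelView * a_position).z;
}

// Fragment shader: varyings are perspective-correct, so linear depth is
// right per pixel; write it to a color target for SSAO/DOF style passes.
varying float v_viewZ;
uniform float u_far;

void main()
{
    gl_FragColor = vec4(v_viewZ / u_far);   // linear depth, normalized to [0,1]
}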
Ok, it doesn't work. But it should! :D
Turn off filtering and use low-resolution textures and you have a very good simulation of demos from 1997 :)
Lastly, do some gl_Position.xy=floor(gl_Position.xy*resolution)/resolution or something to disable subpixel accuracy in the vertices. There, perfect reproduction of a demo from 1997. And don't forget the coder colors!
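(Rough sketch of that snapping, as I read it: do it in NDC, i.e. after dividing by w, so the snap survives the perspective divide. The names are assumed, u_resolution being e.g. vec2(320.0, 200.0).)

Code:
attribute vec4 a_position;        // assumed names
uniform mat4 u_modelViewProj;
uniform vec2 u_resolution;        // target resolution, e.g. vec2(320.0, 200.0)

void main()
{
    gl_Position = u_modelViewProj * a_position;

    // One pixel is 2.0/u_resolution wide in NDC, so snap to that grid,
    // then multiply w back in to return to clip space.
    vec2 ndc = gl_Position.xy / gl_Position.w;
    ndc = floor(ndc * u_resolution * 0.5) / (u_resolution * 0.5);
    gl_Position.xy = ndc * gl_Position.w;
}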
Of course I must be wrong; if it were just me we'd all be using fixed point for everything including the zbuffer, there would be no need for 1/z, and everything would be more elegant and robust.
http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%20matter%20of%20precision%5D%5D
I meant, fail.
some keys on my keyboard don't work like they used to.
Then again it would maybe be more tedious to code some things like proper physics engines and complex lighting.
I agree that for complex lighting and operations where the result is a color, it's arguable that one could not care less if after 10 transcendental ops you lost precision...
Although banding artifacts are an ugly pita.
Code:
Of course I must be wrong, if it were just me we'd all be using fixed point for everything including zbuffer
Erm, we are. Rasterization and half of all popular Z-buffer formats are fixed-point.
Since ryg is reading this: should I avoid interpolators when I can do the same on the fragment shader?
A good rule of thumb on current HW is that interpolating one attribute costs roughly as much as 2 MADs of the respective width in the shader.
There's usually some extra setup cost for different interpolation types. "Constant" interpolators don't need any setup worth mentioning, so they're basically free. The first "linear" attribute costs you extra for the setup; additional "linear" attribs don't have extra setup, just the cost of interpolation itself. Then, if you have a "linear_centroid", that'll cost extra setup again. Corresponding "noperspective" modes may or may not be free if the associated "linear" interpolation mode was already set up; if in doubt, assume it's not free.
As for the cost of operations, by now a good rule of thumb is to count the number of *scalar* ops (i.e. if you didn't have the float4 types etc.) rather than vector ops. For NV this is very close to what's actually going on and for AMD it's equivalent to assuming each VLIW lane is running one of your source instructions; in reality it's more constrained than that and you get worse-than-ideal utilization, use their shader analysis tools for details.
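(Quick worked example of that counting, my own: a MAD on a float4 counts as 4 scalar MADs, a dot(float3, float3) as 3 MULs plus 2 ADDs, and a normalize(float3) as roughly a dot, an rsqrt and 3 MULs, regardless of how it's written with vector types.)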
And then there's interpolant packing, which is something of a sore subject.
On some chips (e.g. NV GF 6x00/7x00 series) there's a fixed cost per interpolant that you pay for each 4-vector you start, no matter whether you use 4 lanes or just 1, so you want to pack (e.g. pack 2 sets of UVs into one float4, or one leftover scalar together with a float3). On some chips (e.g. PowerVR in embedded devices), it's the other way round, and using 2x float2 for 2 sets of UVs is often significantly better than one float4. And on some chips (e.g. more recent NV chips) it doesn't seem to matter much either way.
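(To make the packing point concrete, a small made-up fragment shader with two UV sets packed into one 4-wide interpolant; which variant wins depends on the chip, as described above.)

Code:
// Variant A: both UV sets in one vec4 interpolant
// (cheaper where the cost is per started 4-vector, e.g. GF 6x00/7x00).
varying vec4 v_uv01;                  // xy = UVs for tex0, zw = UVs for tex1
uniform sampler2D u_tex0, u_tex1;     // assumed samplers

void main()
{
    gl_FragColor = texture2D(u_tex0, v_uv01.xy) * texture2D(u_tex1, v_uv01.zw);
}

// Variant B would declare "varying vec2 v_uv0, v_uv1;" instead, which is
// reportedly the better layout on some embedded PowerVR parts.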
... and this is why i really don't want to know too much about the deep details of GPUs, in a worry free life. very interesting, though.
ryg: good thing, because on powervr the interpolant usage can be heavily biased by the stage of the pipeline at which you want to schedule the texture reads that use them..
.. and it's interesting to see the console coders in this thread :)
ryg is like the gpu version of bones:
ryg:
Oh...
I didn't know that.
Well, for rasterization I guess I knew.
But still, world coordinates are floating point.
And colors of course.
This is where some precision problems, physics inconsistencies, z-fighting, banding artifacts and plain bugs originate. Well, I guess, at least; I'm not a professional.
Of course with fixed point you could have all those, and worse, if you're careless, but you could also do things right.
But I guess when you have too many successive complex operations in your formulas, and you can't afford to keep intermediate values that weigh trillions of bits, fixed point shows its limits.
Precision problems originate from approximating real numbers with limited-range, limited-precision values. This is true for both fixed and floating point.
And in particular, z-fighting and banding artifacts result from insufficient precision. This has nothing to do with floating point vs. fixed point.
Well, ok, but precision problems are quite a bit more complicated to handle with floating point values, that's all (granted, floats allow easier implementation of more complex formulas). It's harder to keep track of what you lose and where.
A point far from the world's origin has poor absolute precision compared to a point near the origin; that's inherent to floating point and cannot be cured (and I personally don't find this right :) ). Period. When you rotate and project, when you move things around, when you test collisions, I think you're more likely to get things to screw up, and I think that's what Tom Forsyth's blog post was saying.
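(Rough numbers to back that up, my own back-of-the-envelope: a 32-bit float has a 23-bit mantissa, so the gap between adjacent representable values near x is roughly x * 2^-23, i.e. about x * 1.2e-7. A point 10 000 units from the origin therefore resolves in steps of roughly 1.2e-3 units, while a point 1 unit away resolves to about 1.2e-7 units; absolute precision really does fall off with distance from the origin.)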