what part in Starstruck is an animation?
category: general [glöplog]
Doom: Adaptive subdivision greatly increases polygon setup time and can easily lead to T-junctions. I'd much rather go for span subdivision or a grid-based rasterizer. With a grid-based method you can adaptively subdivide a grid until the error is less than one texel in the middle of the tile, so you'll always round the right way. That way you can actually get a truly perspective-correct result.
The point is to do it recursively, after triangle setup. You pass your transformed camera-space triangle with texture coordinates to the mapper, and it does perspective projection on the vertices. It then determines if subdivision would increase accuracy more than a given threshold (could simply compare z components of the vertices, for instance), and if so creates a temporary vertex on each edge and calls itself four times. You can reuse vertex projections, and all in all there's very little overhead, and you gain a lot from being able to use a simple affine mapper. You do get T-junctions every time neighbouring faces are not divided in the same way, but with a precise affine mapper that doesn't matter much. It's also a very compact method, so it's ideal for 4k's.
A grid-based approach doesn't guarantee that the textures of neighbouring faces will line up perfectly any more than adaptive subdivision does, and there's a lot of overhead in doing grid-based interpolation while staying within the shape of the triangle. Plus, while it's easy to divide a triangle in four (average each pair of vertices), dividing what's initially a triangle in a rectangular grid is a lot trickier.
Also, subdividing until the error is less than one texel isn't sufficient for a truly perspective-correct result. The error has to be less than the precision of your texture coordinates (i.e. typically 1/256 of a texel). That would reduce your grid to 1x1-pixel squares, and then you'd be way better off with one div per pixel.
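The error being argued about here is easy to measure directly: halfway between two projected vertices, affine mapping interpolates u linearly, while the correct result interpolates u/z and 1/z (which are linear in screen space) and divides. A small sketch for one texture coordinate (illustrative, not from any engine discussed here):

```c
#include <math.h>

/* Texel error introduced by affine interpolation halfway between two
   projected vertices. u0, u1 are texel coordinates at depths z0, z1. */
static float affine_midpoint_error(float u0, float z0, float u1, float z1) {
    float affine = 0.5f * (u0 + u1);
    /* perspective-correct: lerp u/z and 1/z in screen space, then divide */
    float correct = (0.5f * (u0 / z0 + u1 / z1))
                  / (0.5f * (1.0f / z0 + 1.0f / z1));
    return fabsf(affine - correct);
}
```

Comparing this against a half-texel (or 1/256-texel) threshold is exactly the kind of check both subdivision schemes would run at edge or tile midpoints.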
the part where all the brown sauce makes everything all brown and fruity. maybe i dreamed it?
Doom: If you do it after triangle setup, you'd obviously have to go back into triangle setup to calculate new edge gradients. You seem to be confusing triangle setup with primitive assembly. And no, I don't buy that T-junctions aren't an issue; judging by your previous demos you seem to believe so, but I disagree.
A grid interpolator can, if written correctly, guarantee that the textures line up relatively closely - no worse than any finite-precision barycentric interpolator. And again, the calculations need not be notably more expensive than in a normal scan converter, but getting decent performance is indeed more work than in the traditional approach.
About your last point, you seem once again to be incorrect - I said half a texel, and that's to make sure the truncation rounds in the correct direction for all texels. Yeah, there will be some differently picked texels on texel edges, but no sensible graphics standard defines what happens exactly there - which is once again back to my point that all practical implementations work at finite precision.
You're actually better off recursing after calculating your edge slopes, because they can be reused. Only the three inner edges in the subdivision need new slopes. As for texture gradients, they need to be recalculated a lot of course, as they do in any other method that approximates perspective-correct mapping with linear interpolation (e.g. 16-pixel span or grid interpolation). But it depends a lot on the pipeline. Most software engines don't fit the "standard" model for a 3D pipeline at all.
The T-junctions really aren't a big issue: if you use a sub-pixel correct affine mapper you can avoid gaps, and if you want textures to line up completely, you could, similar to what you'd do for a grid, check that the error is less than 1/2 texel (or whatever) halfway along every edge (where you have the most warping). Then the entire face would fit your definition of perspective-correct. (And of course you'd find you need to be way more realistic about CPU cycles.)
I still don't see how you would implement a variable-sized grid interpolator while also clipping the texture to the shape of the triangle you're supposed to be drawing. I did make an 8x8 grid-interpolating convex-polygon mapper once, but I failed to make it respect the edges of the polygon. Maybe I just suck, but there was no obvious way to do that without big complications. In the end it just rendered a bit more than it had to and used a flat filler to clip it afterwards. Massively slow, but almost ok for a skybox.
the loaderbar is!
Quote:
what part in Starstruck is an animation?
all of it! please don't ask me what parts are realtime tho, you never know for sure with swedish demos ;)
No matter what they animate, they will ALWAYS break shit!
"I still don't see how you would implement a variable-sized grid interpolator while also clipping the texture to the shape of the triangle you're supposed to be drawing."
You don't need to clip at all, all you need to do is enforce a proper fill convention, which boils down to a tiny amount of work during setup time. Nice tutorial by Nick: http://www.devmaster.net/forums/showthread.php?t=1884. The paragraph starting from "The second cause of the gaps is the fill convention" should solve your problem :)
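Along the lines of that tutorial, the half-space test plus a top-left fill convention can be sketched like this (integer pixel coordinates for simplicity; a real rasterizer would use sub-pixel fixed point, and the winding here is chosen so that the interior gives non-negative edge values):

```c
typedef struct { int x, y; } P;

/* Edge function: zero on the line through a and b, positive on the
   interior side for this winding. */
static long long edge_fn(P a, P b, int px, int py) {
    return (long long)(b.x - a.x) * (py - a.y)
         - (long long)(b.y - a.y) * (px - a.x);
}

/* Top-left rule (y grows downward): pixels exactly on a top or left edge
   belong to the triangle, pixels on a bottom or right edge don't - so an
   edge shared by two triangles is filled exactly once, with no gaps. */
static int is_top_left(P a, P b) {
    return (a.y == b.y && b.x > a.x)  /* top edge */
        || (b.y < a.y);               /* left edge */
}

static int px_inside(P a, P b, P c, int px, int py) {
    long long e0 = edge_fn(a, b, px, py);
    long long e1 = edge_fn(b, c, px, py);
    long long e2 = edge_fn(c, a, px, py);
    return (e0 > 0 || (e0 == 0 && is_top_left(a, b)))
        && (e1 > 0 || (e1 == 0 && is_top_left(b, c)))
        && (e2 > 0 || (e2 == 0 && is_top_left(c, a)));
}
```

With this in place there's indeed nothing to clip: you walk the bounding box (or tiles) and the test itself keeps you inside the triangle.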
ryg: Thanks, that is very interesting, and I can see how you'd expand on it to create a reasonable grid-interpolating texture mapper. But there'd still be a lot of overhead around the edges: 7+ instructions per pixel just to determine whether it's inside or outside the face, as opposed to skipping outside pixels altogether. I might implement it, just to see how it performs. My guess is it'll be slower than doing 16-pixel horizontal spans, though, because it more than doubles the size of the inner loops.
That specific paragraph seemed to be about sub-pixel precision, though, which isn't the problem. Just making straight edges is hard enough. ;)
doom: Valid points, but now you've complicated your model so much that it becomes quite unmanageable without losing performance. I'll still go with span subdivision and/or grid rasterization. And the link ryg is pointing to is a good reference for anyone implementing a grid tracer - but I've got my own bag of tricks that makes grid rasterization even more appealing. Getting completely rid of polygon clipping and the near plane is one of them. http://www.cs.unc.edu/~olano/papers/2dh-tri/2dh-tri.pdf has more details on that.
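The core of the trick in that paper is tiny: each edge function's coefficients are the cross product of the two homogeneous vertices (x, y, w), so vertices behind the eye never need to be clipped or perspective-divided. A minimal sketch, assuming one fixed winding and ignoring the orientation/sign handling the paper covers:

```c
typedef struct { float x, y, w; } HVert;  /* homogeneous screen coords */
typedef struct { float a, b, c; } HEdge;  /* E(px,py) = a*px + b*py + c */

/* Edge coefficients = cross product of the two homogeneous vertices;
   no division by w happens anywhere. */
static HEdge hedge(HVert p, HVert q) {
    HEdge e = { p.y * q.w - p.w * q.y,
                p.w * q.x - p.x * q.w,
                p.x * q.y - p.y * q.x };
    return e;
}

static int h_inside(HEdge e0, HEdge e1, HEdge e2, float px, float py) {
    return e0.a * px + e0.b * py + e0.c >= 0.0f
        && e1.a * px + e1.b * py + e1.c >= 0.0f
        && e2.a * px + e2.b * py + e2.c >= 0.0f;
}
```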
Doom: You seem to miss the point that the link is showing the technique, not an efficient implementation. You can quite easily detect how many lines cross through any tile, and go to optimized code-paths in those cases. And you can even do partial scan conversion inside the edges. Besides, a sky-box usually has very few visible edges onscreen. We were discussing sky-boxes, right?
eh, "inside the edges" = "inside the tiles"
doom: but that's exactly the point, you only have extra overhead around the edges - as long as you use relatively big triangles (pretty much a given considering the amount of tris you can reasonably use on amiga, and definitely the case for the perspective correct skybox cube that was the original subject) there's only a small number of blocks where you have to do accurate tracing; most blocks are either completely rejected or completely accepted. the latter is especially nice because the filler for such a block is the loop you'd use in a grid-based wobbler (simple+fast).

when you want decent quality perspective correction, there's regions where you can get by with *really* coarse interpolation (i.e. 16x16 blocks) and ones where you want higher precision (e.g. use 4x4 there). now the nice part is that you can easily insert a higher-precision interpolation loop into the existing rasterizer without any difficulties. similarly, if you don't want to spend all those instructions for testing per pixel, fine, just do a finer block-level trace (e.g. 4x4 again) to make sure you only do it where you need to. all fits perfectly into the algorithm and can reuse the same deltas (well, with some extra shifts). it's all really clean and nice.
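The trivial accept/reject falls straight out of the edge functions being linear: evaluating E(x,y) = a*x + b*y + c at a tile's four corners classifies the whole tile exactly. A hedged sketch of that classification (names illustrative):

```c
typedef enum { TILE_OUT, TILE_IN, TILE_PARTIAL } TileClass;

/* Classify a size x size tile at (x0, y0) against one edge function.
   Because E is linear, testing the corners is exact, not approximate. */
static TileClass classify_edge(float a, float b, float c,
                               int x0, int y0, int size) {
    int x1 = x0 + size, y1 = y0 + size;
    int in = 0;
    in += (a * x0 + b * y0 + c >= 0.0f);
    in += (a * x1 + b * y0 + c >= 0.0f);
    in += (a * x0 + b * y1 + c >= 0.0f);
    in += (a * x1 + b * y1 + c >= 0.0f);
    if (in == 4) return TILE_IN;   /* run the fast grid-wobbler-style loop */
    if (in == 0) return TILE_OUT;  /* skip the tile entirely */
    return TILE_PARTIAL;          /* edge crosses: trace finer, e.g. 4x4 */
}
```

In practice you only need to test one corner per edge for each verdict (the sign of a and b tells you which corner is extremal), but the four-corner form shows the idea most clearly.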
I didn't completely miss the point, and I could think of a whole bunch of ways to apply the technique, and a lot of optimizations based on the ability to test a point against the edges without having to do three complete dot products. And of course, you get clipping against the screen more or less for free. I rarely worry about the near plane in demos though, because you can just keep stuff out of the way of the camera.
On a side note, 16-pixel span scanline rasterization has the huge benefit of being cache-friendly, which probably matters more in the end. ;)
Thanks for the interesting discussion.
WRT skyboxes, isn't it actually faster to raycast a "skysphere" on an 8x8 grid?
Doom: The near plane is a problem for a scene where you have, say, a FLOOR. Either you have to subdivide the floor if it intersects the w=0 plane (or near plane), or just pre-verify your dataset against your camera movements (the official Shitfaced Clowns method).
As for the cache-friendliness of span subdivision: that only holds if you have a linear framebuffer, and writes usually matter way less to the cache than reads, since you can tolerate high latency on them. I guess the biggest point in favour of span subdivision is that it integrates easily into existing span-rendering engines and gives a good enough result.
p01: How do you plan on mapping your sphere coordinates to texture space with good precision in all directions without making huge textures? It's most likely possible, and for demos it can most likely be good enough, but I prefer using skyboxes since my content tools spit them out quite easily. Besides, ray-sphere intersections are kinda expensive compared to forward-differencing some linear distances.
Kusma: Avoid going too close to the floor then, and/or make sure it's pre-subdivided in a way that suits your camera paths. :)
About the cache, you're reading from textures, and you want to keep those reads cache-friendly too. An easy Amiga trick that pays off really well is to make a second copy of all your textures, rotated by 90 degrees. If you can live with a few artifacts, rotating by 45 and 135 degrees is a good idea too, but usually texture resolution wouldn't let you get away with it.
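The rotated-copy trick can be illustrated like this, for a hypothetical n x n 8-bit texture (which of the two possible 90° rotations you pick only changes how you remap the uv's):

```c
/* Build a 90-degree-rotated copy of an n x n 8-bit texture, so that
   vertical walks through src become horizontal (cache-friendly) walks
   through dst. Per-triangle you then pick whichever copy keeps the
   texture reads closest to row order. */
static void rotate90(const unsigned char *src, unsigned char *dst, int n) {
    for (int y = 0; y < n; y++)
        for (int x = 0; x < n; x++)
            dst[y * n + x] = src[x * n + (n - 1 - y)];
}
```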
Ray-sphere intersections aren't too expensive when your sphere is fixed to the camera and arbitrarily sized. And the intersections are even fixed in camera-space. But yeah, mapping the texture on the sphere is the problem. I guess you could sort of get away with a square texture on a half-sphere. But I've been criticized before for my low-res skyboxes :D
Doom: Yeah, that's what I said was the SFC-method. My point was that it wasn't robust.
About that caching of the texture - that's a really shitty solution, as it will only give good performance when the polygon is mapped directly to the screen, or rotated by a multiple of 90 degrees around the z-axis! Have you actually measured a performance increase with this?! A much better solution is to store the texture in some swizzled form. It might make the actual interpolation code slightly slower, but the savings in cache hits should compensate for that. Another "trick" that I've heard some people use on amiga is to simply disable the cache for textures ;)
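One common form of "swizzled" storage is a Morton (Z-order) layout: interleaving the x and y bits of the texel address keeps 2D-local texels close in memory, so rotated walks through the texture still hit the cache. A portable, deliberately unoptimized sketch (real inner loops would use lookup tables or incremental updates instead of recomputing this per texel):

```c
/* Interleave the low 16 bits of x and y into a Z-order (Morton) index.
   Texel fetch becomes texture[morton2(u, v)] instead of
   texture[v * width + u]. */
static unsigned morton2(unsigned x, unsigned y) {
    unsigned r = 0;
    for (int i = 0; i < 16; i++) {
        r |= ((x >> i) & 1u) << (2 * i);
        r |= ((y >> i) & 1u) << (2 * i + 1);
    }
    return r;
}
```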
as Doom said, 1 square per hemisphere should be good enough.
Disabling the cache can be wise, and the memory potentially gained from not having to store 90° pre-rotations of the textures could be used for bigger textures.
Jamie's engine, used on When We Ride On Our Enemies, made heavy use of the '060 FPU and delivered some pretty impressive results.
Animation/skybox/what-have-you or not, Starstruck was the best demo of the compo, and the people who voted thought so too.
Of course you can't expect miracles from a machine with a slow planar frame buffer and a 50MHz CPU. It's still better than a PeeCee running Windows, though.
p01: Then how do you draw the tiles that have pixels from both hemispheres in them?
Kusma: I know it's unreliable as general solution, but you don't need anything else for a demo.
As for rotating the texture, it's not a "shitty solution" at all. It gives a big performance boost with almost no effort. You can get a feel for how big a difference it makes by measuring the framerate as you slowly rotate a single big triangle. You can plot a nice sine curve that way. :) It may be that you can store the texture in even better ways, but I've never heard of anyone successfully doing that on an Amiga. And the CPU isn't that fast, compared to the memory bus. I've also heard of disabling the cache for textures, but I don't know enough about the 68k cache to say if it helps or not. Can you even do it selectively like that?
Flynn: Nobody says Starstruck is a bad demo. But animated skyboxes are no kind of technical achievement ;)
p01: What Kusma said. Much of what you'd gain from using a sphere rather than a box, you'd lose from (somehow) dealing with the seam. If your camera never sees the seam, though, that'd be fine.
What if the two texture squares are connected, making a 2x1 rectangular texture? The seam is then limited and can be dealt with like in a free-direction tunnel.