AMD-ATI vs NVIDIA, the OpenGL showdown
category: general [glöplog]
Expanding on the one-liner discussion about what to watch out for when coding in OpenGL for those two architectures:
My initial question was: "so what is it that we should watch out when coding on a nvidia card and that does not work with ati ?"
Answers:
- use direct3d
Won't, I like to develop under Linux and OS X even if the target is win32 (because I actually need to work with people whose main OS is OS X).
- Shaders:
-- iq said "GLSL compliance not followed by NVIDIA."
Which I interpret as NVIDIA being more lenient or providing proprietary features in fragment/vertex/geometry shaders.
-- pommak said "fuckings to ati for bad glsl"
- Textures
-- Navis said: "GL_TEXTURE_COMPRESSED_ARB: It is FUBARed on ATI, apparently"
Which I guess means we have to use an explicit compression method? (S3TC)
-- Non power of two textures
I've personally had glTexSubImage2D/glCopyTexSubImage2D become very slow on AMD-ATI hardware when I had (unintentionally) defined a non-power-of-two texture.
Unintentionally meaning: I had *not* set the texture target to GL_TEXTURE_RECTANGLE_ARB for glTexImage2D.
For some reason the NVIDIA drivers did not seem to mind. (See the sketch at the end of this post.)
- Render contexts / FBO:
-- ryg said "ati opengl issues: i once had problems with multiple render contexts (ATI really needs wglShareLists, NV doesn't)"
I guess this pertains to pbuffers and not FBOs.
-- iq said "the main drawback of ogl support by ATI for regular demomaking has been the lack of antialised frame buffer objects imho, but it's finaly implemented \o/"
- Z-buffer
-- ryg said "ati opengl issues: some z-buffer readback issues"
also shadow-mapping differences come to mind:
NV does automatic hardware-supported depth texture compare while you have to use an explicit compare on ATI (still true?)
NV has hardware PCF whereas on ATi there is FETCH4
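For the depth-compare difference, a hedged sketch of the ARB_shadow setup that requests the compare explicitly on the depth texture instead of relying on vendor defaults ('depth_tex' is a made-up handle to an already-created depth texture; sample it through sampler2DShadow / shadow2DProj in GLSL):

/* explicit depth compare setup on the depth texture */
glBindTexture(GL_TEXTURE_2D, depth_tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_R_TO_TEXTURE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL);
glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_TEXTURE_MODE, GL_INTENSITY);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);  /* LINEAR enables PCF where the hardware has it */
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);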
don't forget the CUDA vs. CTM showdown in this serious thread
line rendering is still not accelerated afaik on nvidia (draw 2 or 3 million triangles with glPolygonMode set to GL_FILL and then switch to GL_LINE, you will see).
ati does however have some problems with shaders writing to the z-buffer in some rare laptop drivers/cards, like gargaj's :(
nvidia's glsl correctness is intentionally bad, because of them wanting to push cg (imho)
ati is missing, afaik, something like nvidia's gpu_affinity extension to attach rendering contexts to physical gpus.
nvidia never supported hardware clipping planes, but you can simulate the functionality today with shaders (see the sketch after this post).
tessellation unit is finally implemented on nvidia cards?
ati cards are red, nvidia's green....
be happy
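On the clipping-plane point above: a minimal sketch of the shader-side emulation, written as a GLSL string the way you'd embed it in C. 'u_clip_plane' and 'v_world_pos' are made-up names, and both are assumed to be expressed in the same (e.g. world) space.

/* one user clip plane emulated with discard in the fragment shader */
static const char *clip_frag =
    "varying vec4 v_world_pos;\n"
    "uniform vec4 u_clip_plane;   /* (a, b, c, d) plane equation */\n"
    "void main()\n"
    "{\n"
    "    if (dot(v_world_pos, u_clip_plane) < 0.0)\n"
    "        discard;             /* same effect as glClipPlane */\n"
    "    gl_FragColor = vec4(1.0);\n"
    "}\n";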
i would love to have 1 pc, 2 cards, switch easily when i want to test nvidia or ati without uninstalling / installing drivers nor rebooting ...
oh wait, time to wake up, work doesn't wait
i think this thread gives a useful overview. didn't know about the GL_TEXTURE_COMPRESSED_ARB issue e.g.
But then again these kinds of issues were exactly the reason for me to switch to dx9..
I've had issues with sampler2D in vertex shaders on ATI (on OSX, but a quick search of the intarweb suggests the same issue is there on PC). That's aside from a horrible bug in Apple's GLSL code (relating to colorspace conversion, but only appearing when you use samplers in a vertex shader - if anyone needs a fix for that give me a shout).
Basically, it seems samplers in vertex shaders aren't supported in hardware (or more likely they ARE supported, but the driver doesn't implement it in hardware), so the vertex shader runs in software (cue slowdown, especially if you're using large and constantly changing textures as I am and the bus gets hammered).
I've not looked into this in any great depth, but I've heard that the geforce 8 series and later support this in hardware properly, maybe post-2k series radeons do too. I've seen reports of the same issue on windows, but seeing as windows drivers get updated more regularly maybe it's fixed by now.
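A hedged sketch of how you can at least detect the worst case up front: query how many vertex texture image units the driver exposes and skip vertex texture fetch when the answer is zero (note that a non-zero answer still doesn't guarantee it's fast; 'use_vtf' is a made-up application flag).

/* 0 vertex texture image units means VTF would be rejected or emulated in software */
GLint vtf_units = 0;
glGetIntegerv(GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS, &vtf_units);
if (vtf_units == 0)
    use_vtf = 0;   /* take the CPU / render-to-vertex-buffer fallback path */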
Regarding NVidia's GLSL compliance (or lack thereof).
From developer.nvidia.com/object/nvemulate.html
Quote:
Strict Shader Portability Warnings
In the past, NVIDIA has provided extensions to GLSL to make the language more compatible with other high-level shading languages. While this can be convenient, it also can cause portability problems. To aid developers in writing portable code, NVIDIA now flags the usage of these capabilities with warnings by default. This behavior can be strengthened by checking the “Generate Shader Portability Errors” checkbox in NVemulate. Additionally, with version 1.20, the GLSL spec has adopted many of these language extensions. As a result, NVIDIA has further tightened the syntax checking on the compiler if “#version 120” or “#version 110” are specified in the shader, such that the use of language extensions not included in version 1.20 will be flagged as an error
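To actually make use of that on the development machine, here's a minimal sketch (untested, plain C with GL 2.0 entry points; it assumes the shader body doesn't already start with its own #version directive) that forces "#version 120" onto every shader and always prints the info log, so the stricter checking catches non-portable GLSL before it ever hits ATI.

/* prepend #version 120 and always dump the info log */
#include <stdio.h>

static GLuint compile_shader(GLenum type, const char *body)
{
    const char *sources[2] = { "#version 120\n", body };
    GLuint shader = glCreateShader(type);
    GLint ok = 0;
    char log[4096] = { 0 };

    glShaderSource(shader, 2, sources, NULL);
    glCompileShader(shader);
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    glGetShaderInfoLog(shader, sizeof(log), NULL, log);
    if (log[0])
        fprintf(stderr, "shader log:\n%s\n", log);   /* warnings even on success */
    if (!ok) {
        glDeleteShader(shader);
        return 0;
    }
    return shader;
}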
from top:
Regarding d3d there's of course the thing about ogl exposing dx10-level features (/lack of limitations) on WinXP..
NPOT textures on ati are a hardware limitation before the 1k series (although you could do *some* of it before) and ati has never really supported the (geforce fx centric) texture rectangles.
nvidia still uses its cg compiler to compile glsl, meaning that it (by default) allows some non-conformance, gives next to unusable error messages, and invoking the compiler a lot of times is slow if you have many shaders. Some glsl 1.20 features (constant arrays) are not supported (last time i needed them at least).
in glsl I don't think there's a difference in depth texture compare (wrt shadow mapping), however, it's only since the 2k series that ati supports pcf in hardware (haven't tested though)
Same with vertex texture fetch.
On nvidia rendering a polygon (or batch?) may not take more than 1-2 seconds before you get a driver reset ;)
Ohh, and it's no problem to have both vendors' cards installed in the same machine and just switch the primary display.
interesting (about switching primary display)
Nvidia allows a shader program with a fragment shader and no vertex shader, ATI doesn't. They supposedly fixed this in recent drivers, I haven't tried it though.
parapete: Are you 100% sure? I've used fragment shaders with no vertex shaders all the time for ages, never seen that..
navis, perhaps in Cg. I think for GLSL he's right; I remember discussing this with Auld for 1k intro making, as a way to save the vertex shader, and it indeed didn't work (6 months ago or so)
well nearly all of our postprocessing stuff is only a GLSL fragment shader without a vertex shader and it runs well on my ati x1600 mobile. and that already worked at bcn party last october...
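If you don't want to depend on the driver's behaviour either way, a hedged sketch: always attach a trivial pass-through vertex shader to your post-processing programs. 'build_postprocess_program' is a made-up name and 'compile_shader' is the hypothetical helper from the earlier sketch.

/* never link a fragment-only program; attach a pass-through vertex shader */
static const char *passthrough_vs =
    "void main()\n"
    "{\n"
    "    gl_TexCoord[0] = gl_MultiTexCoord0;\n"
    "    gl_Position    = ftransform();\n"
    "}\n";

static GLuint build_postprocess_program(GLuint frag_shader)
{
    GLuint vs   = compile_shader(GL_VERTEX_SHADER, passthrough_vs);
    GLuint prog = glCreateProgram();
    GLint  ok   = 0;

    glAttachShader(prog, vs);
    glAttachShader(prog, frag_shader);
    glLinkProgram(prog);
    glGetProgramiv(prog, GL_LINK_STATUS, &ok);
    return ok ? prog : 0;
}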
I have to use OpenGL, since we're multi-platform. Breakin uses ati on windows and I'm using nvidia on linux.
The errors we get are usually one of these two:
* Breakin's code fails to compile on linux, but works on windows, NEVER the other way around, usually because ms allows certain bad code to pass through their compiler
* My shaders which work on nvidia fail to compile on ati, and in most cases due to stupid stuff like "if (int_val>float_val)" (different types) or such. Apparently the noise() method is b0rked on ati etc... (see the snippet after this post)
Sure there prob. are situations where there are some problems not related to these two examples, and I'm not saying vs/ati is the one to blame, but as a personal experience, I generally think that nvidia just works better.
Not that these errors are hard to fix tho....
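A hedged GLSL illustration of the kind of fix meant above, embedded as a C string (the variable names are made up): ATI follows the spec and rejects the implicit int/float comparison, so cast explicitly.

/* portable version of the int/float comparison */
static const char *portable_compare =
    "uniform int   int_val;\n"
    "uniform float float_val;\n"
    "void main()\n"
    "{\n"
    "    /* 'if (int_val > float_val)' only compiles on NVIDIA */\n"
    "    gl_FragColor = (float(int_val) > float_val) ? vec4(1.0) : vec4(0.0);\n"
    "}\n";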
IF you have access to the other vendor's hardware that is
Well if anyone's up for it we could make a set of tests to find incompatibilities? Does something like this already exist?
GLSL noise functions do not seem implemented on NVIDIA. (Or I don't know how to use them)
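Since the built-in noise*() functions can't be relied on across vendors (they seem to just return 0 on NVIDIA, afaik), the usual workaround is to ship your own. A tiny hash-based substitute, embedded as a C string like the other shader snippets here; it's a cheap pseudo-random hash in [0,1), not gradient noise.

/* hash-based replacement for the GLSL noise built-ins */
static const char *noise_glsl =
    "float hash_noise(vec2 p)\n"
    "{\n"
    "    return fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453);\n"
    "}\n";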
Quote:
-- Navis said: "GL_TEXTURE_COMPRESSED_ARB: It is FUBARed on ATI, apparently"
Which I guess means we have to use an explicit compression method? (S3TC)
I'd like to have a more detailed report of the problem, since I've never encountered any problems with ARB compressed textures.
Quote:
-- Non power of two textures
I've personally had issues with glTexSubImage2D/glCopyTexSubImage2D becoming very slow on AMD-ATI hardware when I had (not intentionally) defined a non power of two texture.
Non intentionally: I had *not* set the texture target as GL_TEXTURE_RECTANGLE_ARB for glTexImage2D)
For some reason the NVIDIA drivers did not seem to mind.
The texture rectangle extension also changes texture addressing semantics etc., so it's hardly a 1:1 replacement for normal textures. The "correct" extension is ARB_texture_non_power_of_two, and that doesn't introduce extra targets or anything else of the sort, its presence will simply cause non-power-of-2-sized arguments to glTex*Image*D calls not to fail (explaining the behavior of the NV drivers). I'm guessing the ATI implementation simply has a very slow software fallback path for texture upload of uncommon formats (or some other software side translation is taking place).
Quote:
- Render contexts / FBO:
-- ryg said "ati opengl issues: i once had problems with multiple render contexts (ATI really needs wglShareLists, NV doesn't)"
I guess this pertains to pbuffers and not FBOs.
Neither, this was in an application with several (Win32) Windows rendering using GL. NV apparently automatically (and, technically, incorrectly) shares texture and other handles between windows created by the same process, ATI doesn't. I never tested this with pbuffers, since I've never worked on a project that used them.
Quote:
- Z-buffer
-- ryg said "ati opengl issues: some z-buffer readback issues"
Note that was only on some driver versions. It only occurred when reading individual pixels of a 32-bit Z-buffer using glReadPixels, and only when that pixel was in a Z-compressed region (create a really large render target, preferably with AA, then pixels in the lower and right regions were unaffected). Which all smells like a driver bug, especially since it disappeared a few months later (though we never got any replies to the bug report). (In case anyone wonders: it was for quick&easy intersection checking code.)
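For reference, a hedged sketch of the kind of readback involved ('x' and 'y' are whatever pixel is being probed): read a single depth value and un-project it afterwards for a quick intersection test, which is exactly the path that hit the compressed-Z bug.

/* read one depth value back from the current read buffer */
GLfloat depth = 0.0f;
glReadPixels(x, y, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &depth);
/* 'depth' is in [0,1] window space; convert with gluUnProject (or the
   projection parameters) to get an eye-space position */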
Quote:
On nvidia rendering a polygon (or batch?) may not take more than 1-2 seconds before you get a driver reset ;)
On ATI too. This is Windows-related: if a driver doesn't react or just runs a spin loop for a certain amount of time (5 seconds I think) Windows will assume either the driver or the HW has crashed and automatically BSOD. So now drivers do the same check and take care to abort and execute an internal reset before that happens, since BSODs don't look too good :).
Just out of curiosity, why is it incorrect to share textures between windows of a process? Or is it just the 'automatic' part (which I can see being an issue)?
psonice: It's the automatic part. Each context should have its own texture name-space.
ryg: Do you know if the automatic sharing applies to the default texture (texture name 0) as well?
Ah, ok. Makes sense. Thanks.
Where's the problem in sharing between windows of the same process? I've been doing that for ages (not on ATI, though). Just use the same render context and switch between windows using wglMakeCurrent.
Afaik what you need wglShareLists for is when you deal with multiple processes, each with its own render context.
ithaqua: I don't think it's a 'big' issue as such, probably most times you want to share textures. But maybe if you're using the same patch of code to draw different stuff in different windows, you could end up with textures being shared that shouldn't be. As it happens, I may well be doing that soon for one of my intro tools, so this is pretty useful to know :)
What I mean is: sharing resources (display lists/textures...) between different windows is correct as long as they belong to the same process and use the same render context. Unless you explicitly set it up, it is not correct to share them between windows of different processes, and thus different render contexts.
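For the explicit route, a hedged Win32 sketch (it assumes <windows.h>, two device contexts dc1/dc2 with pixel formats already set, and that wglShareLists is called before the second context has created any objects):

/* share object name spaces between two contexts explicitly */
HGLRC rc1 = wglCreateContext(dc1);
HGLRC rc2 = wglCreateContext(dc2);
if (!wglShareLists(rc1, rc2)) {
    /* sharing failed: fall back to uploading resources once per context */
}
wglMakeCurrent(dc1, rc1);   /* render window 1 */
/* ... */
wglMakeCurrent(dc2, rc2);   /* window 2 now sees the same texture and list names */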