Ultra Lightweight DX10/11 Framework
category: code [glöplog]
Need. Now. Please help me.
Plus: does DX10/DX11 HLSL (with ps_4_0/ps_5_0) support indexed writes to small temporary constant-sized arrays (more than 2 elements) of small structs?
GLSL supports something pretty similar.
http://msdn.microsoft.com/en-us/library/bb232912(v=VS.85).aspx
Help me to crunch this stuff to dx9 level size! DX11 is a pain - but it can and has to be done.
Code:
// This will generate a screen aligned triangle
void VS2(uint i : SV_VertexID,
out float2 texcoord : TEXCOORD0,
out float4 position : SV_Position) {
float4 p[3] = {{-1,1,0,1},{3,1,0,1},{-1,-3,0,1}};
float2 t[3] = {{0,0},{2,0},{0,2}};
texcoord = t[i];
position = p[i];
}
float4 PS2(float2 texcoord : TEXCOORD0) : SV_Target {
return texcoord.xyyx;
}
Silly question, but does it make any difference to use the coords {{1,1,0,1},{1,-3,0,1},{-3,1,0,1}}; instead ?
Should work too, but you have to readjust the texcoords.
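For what it's worth, the values in t[] look like they follow from the positions via u = (x+1)/2, v = (1-y)/2, so readjusting for another triangle is mechanical. A rough C++ sketch of that mapping, just to check the numbers on the CPU; the helper name texcoordFromClip is made up for illustration, and the top-left texcoord origin is an assumption read off the first listing:

```cpp
struct Float2 { float x, y; };

// Hypothetical helper (not from the thread): maps a clip-space xy position
// to a texcoord with (0,0) at the top-left corner of the screen.
constexpr Float2 texcoordFromClip(float x, float y) {
    return Float2{ (x + 1.0f) * 0.5f, (1.0f - y) * 0.5f };
}

// Original triangle {-1,1},{3,1},{-1,-3}: reproduces t[3] = {{0,0},{2,0},{0,2}}.
static_assert(texcoordFromClip(-1, 1).x == 0 && texcoordFromClip(-1, 1).y == 0, "");
static_assert(texcoordFromClip(3, 1).x == 2 && texcoordFromClip(3, 1).y == 0, "");
static_assert(texcoordFromClip(-1, -3).x == 0 && texcoordFromClip(-1, -3).y == 2, "");

// Alternative triangle {1,1},{1,-3},{-3,1}: texcoords become {1,0},{1,2},{-1,0}.
static_assert(texcoordFromClip(1, 1).x == 1 && texcoordFromClip(1, 1).y == 0, "");
static_assert(texcoordFromClip(1, -3).x == 1 && texcoordFromClip(1, -3).y == 2, "");
static_assert(texcoordFromClip(-3, 1).x == -1 && texcoordFromClip(-3, 1).y == 0, "");
```

Whether the new constants compress better than the old ones is a separate question that only the packer can answer.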
I was wondering how such a tweak affects the compressed size?
Actually that could improve the compression ratio, depends :)
why have the t[] array at all? for most uses, [-1..1] is the better range anyway, just reuse the position.
would something like this work at all in HLSL?
Code:
void VS2(uint i : SV_VertexID,
out float2 texcoord : TEXCOORD0,
out float4 position : SV_Position) {
texcoord = float4( (i&1)*2-1, ((i/2)&1)*2-1, 0, 1 );
position = texcoord.xy;
}
oops, swap "position" with "texcoord", of course
y'all only writing raymarching or postprocessing shaders? do you?! lamers do geometry?!
yeh. a common thing. -_-
iq, it should, but may I suggest:
Code:
void VS2(uint i : SV_VertexID,
out float2 texcoord : TEXCOORD0,
out float4 position : SV_Position) {
position = float4((i&1)*2-1, (i&2)-1, 0, 1);
texcoord = position.xy*2+1; // for las' version, or just =position.xy for iq
}
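A quick CPU-side sanity check of the bit tricks, as a C++ sketch rather than anything authoritative about HLSL. With signed integers the two expressions give the four corners of a [-1,1] quad for i = 0..3. The second function shows what the same x expression does in 32-bit unsigned arithmetic; since SV_VertexID is a uint, the shader compiler might evaluate it that way, which I have not verified against fxc, so treat it as a caution (an int cast in the shader would sidestep it). The names cornerSigned and xUnsigned are made up for this example:

```cpp
#include <cstdint>

struct Corner { float x, y; };

// Signed arithmetic: (i&1)*2-1 and (i&2)-1 each evaluate to -1 or +1.
constexpr Corner cornerSigned(int i) {
    return Corner{ float((i & 1) * 2 - 1), float((i & 2) - 1) };
}

// The same x expression in 32-bit unsigned arithmetic: 0u - 1u wraps around.
constexpr std::uint32_t xUnsigned(std::uint32_t i) {
    return (i & 1u) * 2u - 1u;
}

// i = 0..3 walks the corners in triangle-strip order.
static_assert(cornerSigned(0).x == -1 && cornerSigned(0).y == -1, "");
static_assert(cornerSigned(1).x ==  1 && cornerSigned(1).y == -1, "");
static_assert(cornerSigned(2).x == -1 && cornerSigned(2).y ==  1, "");
static_assert(cornerSigned(3).x ==  1 && cornerSigned(3).y ==  1, "");

// Unsigned: even ids wrap to 0xFFFFFFFF instead of -1, odd ids still give 1.
static_assert(xUnsigned(0) == 0xFFFFFFFFu, "");
static_assert(xUnsigned(1) == 1u, "");
```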
IMHO the dx10/11 graphics pipeline takes too much setup to be worth it for <=4k (last time I tried, my conclusion was that using geometry shaders (+ a bit of texturing/postproc) would be smaller in opengl). The compute pipeline on the other hand...
And as it seems you just want a fullscreen quad anyway, the above babble will be replaced by (cs5.0 required to map the screen as a texture):
Code:
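// width and height are assumed to come from a constant buffer (not shown)
RWTexture2D<float4> o;
[numthreads(16, 16, 1)]
void _4(uint2 v : SV_DispatchThreadID){
// a float2 cannot be widened to float4 implicitly, so pad it explicitly
o[v] = float4(v/float2(width,height), 0, 1);
}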
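On the host side, the shader above needs one thread group per 16x16 tile, so the group counts passed to Dispatch are the screen size divided by 16, rounded up. A small C++ sketch of that arithmetic; groupsFor is a made-up helper name and the resolutions are only examples:

```cpp
// Hypothetical helper: number of thread groups needed to cover `pixels`
// along one axis when each group handles `threadsPerGroup` of them.
constexpr unsigned groupsFor(unsigned pixels, unsigned threadsPerGroup = 16) {
    return (pixels + threadsPerGroup - 1) / threadsPerGroup;
}

// 1280x720 divides evenly into 16x16 tiles: 80 x 45 groups.
static_assert(groupsFor(1280) == 80, "");
static_assert(groupsFor(720) == 45, "");

// 1920x1080 does not: 1080 / 16 = 67.5, so 68 rows of groups are needed.
static_assert(groupsFor(1920) == 120, "");
static_assert(groupsFor(1080) == 68, "");
```

When the height is not a multiple of 16 the last row of groups overshoots the texture. As far as I know out-of-bounds UAV writes are simply dropped in D3D11, but a bounds check in the shader is the safe option if the size budget allows.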
dx11 works just *best* for 4ks and doesn't need the heavy setup you're talking about; the only things required are these:
device
swapchain
backbuffer view
depthstencil view
effectptr
vbuffer
omset rendertarget
setviewport
renderstates
frame loop
clear
draw
present
It's the same whichever rendering API you choose.
And it's not a PITA at all, it's the best DX API so far, ask Carmack ;-)
If you want immediate mode like in the old OpenGL (inline typed primitives), it can be done with dx11. Create an ID3D11Buffer with D3D11_USAGE_DYNAMIC and allocate space for e.g. 0xffff vertices. Then each frame, Map() it on the device context to get a pointer to the buffer range, write the vertices into it (laid out per the currently bound input layout), Unmap() it and Draw (basically glEnd()).
You just have to write the immediate mode system yourself there, but everything else stays the same...
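The shape of such a home-made immediate mode layer could look roughly like this C++ sketch. It deliberately contains no D3D calls: a std::vector stands in for the dynamic vertex buffer, begin() marks where the real Map() would go and end() where Unmap() plus the draw call would go. The class name ImmediateBatch, the Vertex layout and all method names are hypothetical:

```cpp
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Stand-in for a dynamic vertex buffer. In a real D3D11 program begin()
// would map the buffer and end() would unmap it and issue the draw.
class ImmediateBatch {
public:
    explicit ImmediateBatch(std::size_t capacity)
        : storage_(capacity), count_(0), mapped_(nullptr) {}

    // glBegin() equivalent: obtain the write pointer and reset the counter.
    void begin() { mapped_ = storage_.data(); count_ = 0; }

    // glVertex() equivalent: append one vertex; returns false if the batch
    // is not open or the buffer is full.
    bool vertex(float x, float y, float z) {
        if (!mapped_ || count_ >= storage_.size()) return false;
        mapped_[count_++] = Vertex{x, y, z};
        return true;
    }

    // glEnd() equivalent: release the pointer and report how many vertices
    // the draw call would cover.
    std::size_t end() { mapped_ = nullptr; return count_; }

private:
    std::vector<Vertex> storage_;
    std::size_t count_;
    Vertex* mapped_;
};
```

Usage is begin(), a run of vertex() calls, then end(); the returned count is what you would hand to the draw call.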
funky: "only"? And with what you describe there you could just as well use dx9 (which makes things smaller with the access to fixed function)...
Shader compiling and especially textures (Texture, ShaderResourceView, Sampler...) take quite some space too.
DX11 is a nice and clean api but, like dx10, quite wordy.
For my dx11 1k computeshader setup I need:
swapchain
swapchain's texture (argh, 128bit uuid unless you're really dirty)
uav (creation+set)
constant buffer (same)
compute shader (compile+create)
gettickcount
updatesubresource
dispatch
present
Makes http://pouet.net/prod.php?which=32735 dx11 in 990 bytes from c code and regular crinkler.
Psycho, could you perhaps release the code with a very small (e.g. hello world) CS? That'd be great :)
xTr1m: better late than never: TinyDX11 :)
Except that it's a good deal smaller now - 821 bytes (but also smaller with the 2011 crinkler)