OpenGL framework for 1k intro

category: code [glöplog]

So tell me what the difference is. Why the whining about graphics APIs used in 1ks, but not about sound APIs? It must be obvious to you.

added on the 2014-11-03 20:39:56 by yzi

There is a certain lack of innovation when it comes to soundcards and their APIs; those have been good for 20 years. You can't say the same about graphics cards.

added on the 2014-11-03 21:41:05 by red

Maybe if you are in the world of "sound cards". At least on the Mac, there has been quite some development around sound and music related interfaces available to application programs, in particular Audio Units. Why no whining about the lack of the latest and hottest Audio Units technology in 1k intros? There's lots of new stuff in there, at least compared to the General MIDI sounds that are actually being used. I guess that should be some kind of a problem, just like the lack of OpenGL something point something is?

added on the 2014-11-03 22:05:51 by yzi

Quote:

Why no whining about the lack of the latest and hottest Audio Units technology in 1k intros?

Well, if you think you can gain an advantage (size- or quality-wise) by using Audio Units (or any "modern" audio technology for that matter), I suggest you exploit that (instead of whining about the absence of people doing so).

I still don't think there has been any notable development in the audio department, but then again I'm no OSX guy. Audio Units sounds a bit like a subset of DirectShow, although that onboard timestretch effect seems funky. Maybe that can be used to mangle some system sounds into a soundtrack.

IMHO audio only plays a minor role in 1k, because there is only so much you can do in 1024 bytes, and people tend to concentrate on graphics.

...And for anything bigger, the only API you should need for music is something that eats your audio buffer.

added on the 2014-11-04 01:29:05 by red

@minas <3
I was planning to write one, you definitely seem to have a big lead :D
Hope to see a release soon!

added on the 2014-11-04 07:39:38 by stfsux

Quote:

It depends how you define not that far behind. It's like years behind.

Yes, I'm aware the 4.1 spec is from 2010. But if we are talking about features, it is at about the same level as DX11 (loosely speaking, depending of course on which extensions are there and which DX11.x we are talking about). It could be better, but it is not crap.

When DX12 is widely implemented in Windows + GPUs but Mac still does not have GL 4.5 then I would call it behind :)

Quote:

Code being cleaned up for an initial release, coming real soon now :-)

Definitely an interesting concept you have there. I will certainly take a look (and benchmark) when you release it.

added on the 2014-11-04 07:57:29 by ts

One single question: Are you selling Apple devices? ;)

added on the 2014-11-04 09:30:13 by las

No, I just happen to like them ;)

added on the 2014-11-04 10:29:49 by ts

Source code released: https://github.com/google/elfling
Have fun!

added on the 2014-11-04 19:06:11 by minas

added on the 2014-11-04 19:28:26 by yzi

Any chance of a C version for the context-model decompressor?

added on the 2014-11-05 11:15:46 by mudlord

I have that as well, I'll clean it up and release it. The code for the decompressor is pretty similar to the compressor.

added on the 2014-11-05 12:24:36 by minas

\o/

added on the 2014-11-05 14:23:09 by stfsux

Hopefully for the benefit of the 1k intro category, here's a C prototype of the music routine that eventually became the music in the Stellar Driftwood intro for the Assembly 2014 1k intro compo. The final version was changed quite a lot, and of course rewritten in asm and hand-tuned/messed-up for best possible Crinkler compression ratio.

C source code of the routine. It is called once per frame, with timing done by Sleep():
http://www.kameli.net/~yzi/pulssi_v4.txt

This is what it's supposed to sound like, recorded on Windows 7
http://www.kameli.net/~yzi/pulssi_v4.mp3

The variables are perhaps slightly confusingly named. "Bass" is the note sequence for the pulsating/echoing bass note thing, and "sequence" is the note sequence for the string pad chords. midiout_handle must first be allocated with midiOutOpen() before starting to call the playroutine.

Generating music like this is by no means the only way to play music in 1k intros. Many others use a pre-made MIDI event track, where the data is very repetitive and compresses well. With my generative approach, because I am able to control notes, velocities and patch changes independently of each other, each with their own repeat cycle length, I can create nice musical variation. The combination of the parameters takes a long time to repeat exactly, even though every individual cycle repeats many times over. (I am not saying you couldn't do this with pre-written data, but your standard MIDI track player will need some help with it.)
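A minimal sketch of the independent-cycle idea, not the actual pulssi_v4 routine: the tables, their contents, the cycle lengths 3/4/5 and all function names are made up for illustration. Each parameter is looked up with its own modulo, so the combination only repeats after the least common multiple of the cycle lengths. The messages are packed in the layout that midiOutShortMsg() takes (status in the low byte, then the two data bytes), but nothing here calls the Windows API.

```c
#include <stdint.h>

#define COUNT(a) ((unsigned)(sizeof(a) / sizeof((a)[0])))

/* Hypothetical tables with arbitrary contents; lengths 3, 4 and 5 are coprime. */
static const uint8_t notes[]      = { 36, 43, 48 };
static const uint8_t velocities[] = { 127, 64, 96, 48 };
static const uint8_t patches[]    = { 38, 39, 81, 87, 90 };

/* Pack a MIDI channel message: status | data1 << 8 | data2 << 16. */
static uint32_t midi_short_msg(uint8_t status, uint8_t d1, uint8_t d2)
{
    return (uint32_t)status | ((uint32_t)d1 << 8) | ((uint32_t)d2 << 16);
}

/* Note-on (0x90, channel 0): note and velocity each follow their own cycle. */
static uint32_t note_on_for_step(unsigned step)
{
    return midi_short_msg(0x90, notes[step % COUNT(notes)],
                          velocities[step % COUNT(velocities)]);
}

/* Program change (0xC0, channel 0): the patch follows a third cycle. */
static uint32_t patch_for_step(unsigned step)
{
    return midi_short_msg(0xC0, patches[step % COUNT(patches)], 0);
}

static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) { unsigned t = a % b; a = b; b = t; }
    return a;
}

static unsigned lcm_u(unsigned a, unsigned b)
{
    return a / gcd_u(a, b) * b;
}

/* Steps until the whole combination repeats: lcm of all cycle lengths. */
static unsigned combined_period(void)
{
    return lcm_u(lcm_u(COUNT(notes), COUNT(velocities)), COUNT(patches));
}
```

With these made-up lengths the three tables hold 12 bytes in total, yet the combined pattern only repeats every 60 steps. In a real player the returned values would be handed to midiOutShortMsg() together with the opened midiout_handle.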

added on the 2014-11-05 19:31:10 by yzi

...that should have been: "notes, velocities, pitches and patch changes"

added on the 2014-11-05 19:32:50 by yzi

Ok, I made a simple comparison of my packer, elfling, paq8i and paq8l. Some explanation:

orig: OS X binary (Mavericks), unpacked. This is the source.
laturi: Packed with laturi's own LZW compressor (the best you can get with LZW), shelldropper removed
laturi3: my new compressor; the asm decompressor is unfortunately not ready yet. Expected size is around 250 bytes
elfling: Took pack.c from elfling and fed it the original data
paq8i: paq8i with -9 option
paq8l: paq8l with default settings

The inputs fed into the compressors are actual 1k/4k prods.

Code:


                orig.   laturi  laturi3 elfling paq8i   paq8l
kraken          6603    4057    3644    3803    3474    3402
tonot           6140    3976    3669    3739    3390    3390
megademo        11391   4004    3571    4395    3371    3281
embers          2062    949     883     949     858     842
remnant         3223    990     923     1044    911     882
superstructure  2137    995     918     1016    904     894

It seems that elfling can have problems with certain kinds of input (megademo). It is also not doing a stellar job in general; I believe you need more than 4 contexts. The fact that the contexts are sparse does not seem to be an issue, judging by what I tried here.
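To read the table as ratios rather than raw byte counts, here is a small sketch. The struct and function names are made up for illustration; the numbers are copied from the table.

```c
#include <stddef.h>

/* One row of the comparison table, sizes in bytes. */
struct row {
    const char *name;
    unsigned orig, laturi, laturi3, elfling, paq8i, paq8l;
};

static const struct row rows[] = {
    { "kraken",          6603, 4057, 3644, 3803, 3474, 3402 },
    { "tonot",           6140, 3976, 3669, 3739, 3390, 3390 },
    { "megademo",       11391, 4004, 3571, 4395, 3371, 3281 },
    { "embers",          2062,  949,  883,  949,  858,  842 },
    { "remnant",         3223,  990,  923, 1044,  911,  882 },
    { "superstructure",  2137,  995,  918, 1016,  904,  894 },
};

/* Packed size as a percentage of the unpacked size. */
static double percent_of_orig(unsigned packed, unsigned orig)
{
    return 100.0 * packed / orig;
}

/* How many bytes elfling ends up above paq8l for one row. */
static int elfling_minus_paq8l(const struct row *r)
{
    return (int)r->elfling - (int)r->paq8l;
}
```

For megademo this puts elfling at roughly 38.6% of the original against about 28.8% for paq8l, a gap of 1114 bytes. For the other five inputs the gap stays between 107 and 401 bytes.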

added on the 2014-11-05 20:01:17 by ts

minas

Quote:

Source code released: https://github.com/google/elfling
Have fun!

First of all, thanks a lot for sharing your efforts!

Second, when you un-dnloadified flow2, you could have just removed my comment from the top instead of adding your own ;) This would have restored it to the state Amand Tihon left it in.

Third, I really appreciate the shortcut of directly using the location where the dynamic linker writes the r_debug address. I had not figured that out myself.

Fourth, you seem to loop over the symbol table simply until the address matches the string table. Is it really guaranteed that DT_STRTAB will immediately follow DT_SYMTAB?
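To make the assumption explicit, a sketch in C. This is not elfling's code: the struct is a stand-in shaped like Elf32_Dyn, the function name is made up, and only the tag values (DT_NULL = 0, DT_STRTAB = 5, DT_SYMTAB = 6, DT_SYMENT = 11) come from the ELF specification. The symbol count it returns is only meaningful if the string table really does start right where the symbol table ends.

```c
#include <stdint.h>

/* ELF dynamic tags used here (values as in the ELF specification). */
enum { TAG_NULL = 0, TAG_STRTAB = 5, TAG_SYMTAB = 6, TAG_SYMENT = 11 };

/* Stand-in for Elf32_Dyn: a tag plus a value or address. */
struct dyn_entry {
    int32_t  tag;
    uint32_t val;
};

/* Guess the number of symbols as (strtab - symtab) / syment.
   ASSUMPTION: the string table immediately follows the symbol table.
   Returns 0 if a tag is missing or strtab does not lie after symtab. */
static uint32_t guess_symbol_count(const struct dyn_entry *dyn)
{
    uint32_t symtab = 0, strtab = 0, syment = 0;

    for (; dyn->tag != TAG_NULL; ++dyn) {
        if (dyn->tag == TAG_SYMTAB)
            symtab = dyn->val;
        else if (dyn->tag == TAG_STRTAB)
            strtab = dyn->val;
        else if (dyn->tag == TAG_SYMENT)
            syment = dyn->val;
    }
    if (symtab == 0 || syment == 0 || strtab <= symtab)
        return 0;
    return (strtab - symtab) / syment;
}
```

If the layout cannot be relied on, the nchain field of the SysV hash table (DT_HASH) gives the symbol count without depending on where the string table sits, at the cost of a few more bytes of code.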

Fifth:

Code:

.nextlib:
mov esi, [ebx + 4] ; Point esi to library name
or byte [esi], al ; If library name is empty, we skip it. Weird things live there.
jz .nameless
...
.nameless:
mov ebx, [ebx + 12] ; l_next
jmp .nextlib ; We don't know how to handle symbols that can't be found. Just continuing will cause us to crash.

You do not need to test for an empty lib - in my tests the linked libraries always seem to start from the second entry in the list. Just go to l_next at the beginning of the loop instead of at the end.

(On linux-amd64 the second entry is also weird, so there you need to advance one additional time before entering the loop.)
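The same loop shape in C, as a sketch only. The struct is a stand-in for the leading fields of the dynamic linker's link_map (on ia32 they sit at offsets 0, 4, 8 and 12, which matches [ebx + 4] and [ebx + 12] in the snippet above). The function name and the extra_skips parameter are made up for illustration.

```c
#include <stddef.h>

/* Stand-in for the first fields of struct link_map. */
struct lmap {
    size_t       l_addr;  /* load base */
    const char  *l_name;  /* library path */
    void        *l_ld;    /* dynamic section */
    struct lmap *l_next;  /* next loaded object */
};

/* Collect library names, stepping to l_next BEFORE using an entry,
   so the head of the list is never inspected and no empty-name test
   is needed. extra_skips is 0 for the ia32 case and 1 for the
   linux-amd64 case described in the post. Returns the number stored. */
static size_t collect_libs(const struct lmap *head, unsigned extra_skips,
                           const char **names, size_t max_names)
{
    const struct lmap *m = head;
    size_t count = 0;
    unsigned i;

    /* Optional extra advances before entering the loop. */
    for (i = 0; i < extra_skips && m != NULL; ++i)
        m = m->l_next;

    while (m != NULL && (m = m->l_next) != NULL) {
        if (count == max_names)
            break;
        names[count++] = m->l_name;
    }
    return count;
}
```

In assembler this corresponds to moving the mov ebx, [ebx + 12] to the top of .nextlib and dropping the or/jz pair.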

added on the 2014-11-05 21:56:42 by Trilkk

This thread is going very well, thanks to everyone that has contributed and shared their ideas and knowledge. This kind of discussion helps new sceners to learn and also some old sceners may learn a new trick!

In the spirit of sharing I released the full source to Molten Core.
The source is now included in the download file for the intro.

It's MASM assembly, but written and compiled in VS2013. I included the full project, including Crinkler, with all settings exactly as I made it. Also included as a bonus is a minimal version of the Flow2 shader, which compiles to around 750 bytes. I make no claim to being a great coder, so please be respectful of that. I commented every line as best I could.

I don't know anything about licensing. Use it however you want, but try to be creative and original, and use it as a learning tool, not just for copy-paste. Credit to Auld for the Flow2 shader and Blueberry/Mentor for Crinkler.

added on the 2014-11-06 03:08:36 by drift

ts, thanks for the benchmarking! Looks like I have some more work to do. That is not really unexpected considering the initial tests on my side, but more test cases are always good :-) I downloaded megademo and let Optimize run a little more freely to test more weighting/context settings (do you actually use Optimize, or just CompressSingle?), and got 3955 bytes, which is not that impressive, but much less of an outlier than your result.
Also, the C code for the decompressor is in github as well now. And stay tuned for some more improvements, since I'm definitely not done with this :-)

added on the 2014-11-06 10:46:15 by minas

I just used pack; I didn't look at the code :X

So probably I messed up. What is the proper way to do it? I will then redo the tests...

added on the 2014-11-06 13:40:13 by ts

Question about the C decompressor.

Is memsetting modelCounters at the end right? It seems that u8* base = modelCounters; would then just point to random values if compiled in release mode on MSVC.

added on the 2014-11-07 00:42:57 by mudlord

Seriously, thanks for the helpful posts, tools, source code and other snippets! You rock!

added on the 2014-11-07 01:14:50 by p01

Thought this was relevant to the topic.
http://code4k.blogspot.com.au/2010/12/crinkler-secrets-4k-intro-executable.html

added on the 2014-11-14 00:03:24 by mudlord

Clickable version:
http://code4k.blogspot.com.au/2010/12/crinkler-secrets-4k-intro-executable.html
And yes, it is a very good read indeed, although after reading it I decided to not look again at it during the development of elfling to avoid bias - so there are some differences in the implementation - I didn't want to do a straight ripoff of crinkler or PAQ.
BTW, there's more code out now. It's gotten quite a bit more efficient in the meantime, although I'm not really happy with it yet. And I stopped using hashtables for decompression, so now it will consume way less memory (it still allocates 161MB, but only needs 4MB/context now, and defaults to 8 contexts since in the current state this produces the best results).

added on the 2014-11-14 11:02:48 by minas

Shameless thread necromancy.

I checked out minas' elfling sometime in the beginning of December to try to combine it with my own project. That is, to combine a filedumping header with decompressing most of the actual program directly into memory. I also wanted to use minas' uncompressor source to do it in a platform-independent way.

This turned out to be a bit of work. The procedure:

Generate ELF headers as normal.
Extract relevant part(s) of elfling unpack.cpp.
Compile stand-alone unpacker into assembler.
Compile the program source into assembler as normal.
Generate a new assembler source, start with unpacker stub.
Append a 'safe' alignment directive to ensure unpacked source will remain in a constant location.
Incorporate actual program source, renaming all labels to not conflict with earlier ones.
Set program entry point to beginning of incorporated source (actual program payload).
Assemble.
Extract file from entry point forward.
Compress with elfling packer.
Repeat the earlier steps with the correct packed data, and set the file to end after the unpacker stub.

This generates a file that looks like this:

Code:

ELF headers
----
elfling parameters and compressed data (these can be slightly interleaved)
----
elfling uncompressor stub
----
JMP to _uncompressed
----
file 'end' label (this goes to filesize in headers, generated binary will be cut from here)
----
10 bytes of 0 just in case since elfling needs them
----
alignment directive with some arbitrary safe value
----
program payload
----
fake .bss section

FreeBSD's ld-elf.so is generous in that all memory is zeroed out when the program is loaded. This means we can be sure that both the working area and the uncompression area required by elfling will be completely filled with 0.

I tried this out on amd64 just to get an idea:

Code:

hello_world.cpp:
dnload: 425
elfling: 1089
dnload+elfling: 823

flow2:
dnload: 869
elfling: 1513
dnload+elfling: 1240

intro.cpp (opens OpenGL window and audio device, raycast and music from very short programs):
dnload: 1215
elfling: 1815
dnload+elfling: 1540

whisky_in_a_tube (ia32 4k intro from last year that already uses dnload.py):
dnload: 4321
elfling: 4758
dnload+elfling: 4505

It seems the uncompressor stub compiled from C++ source is relatively big, and the executable decompressing its payload into memory has entropy somewhat approaching 1, so LZMA filedropping has hardly any effect. Relatively disappointing.

In any case, if you want to try out this hack, it's available here: http://pastebin.com/PgeCngms

minas, if you don't mind, I'd like to integrate support for this kind of combination into my own project. Even if the results are less than stellar, perhaps there is something to gain in the future. If you'd like to talk about this, I can be contacted with this nick in google mail. Your own contact info seemed to be unavailable, or I did not know where to look.

BTW: Do I have to sell my soul to some scary Google daemons if I want to commit anything into https://github.com/google/elfling ?

added on the 2015-01-27 00:01:44 by Trilkk

pouët.net
