Current best tool for compressing Windows 64K Intros?
category: code [glöplog]
Hi all. I'm just wondering if kkrunchy is still the recommended tool for compressing Windows 64K intros? I've currently got "kkrunchy_023a4_asm07.exe" plugged into my tooling pipeline and wondered if there was something else I should be using instead? Many thanks in advance! :)
Thank you gasman. Good old Ferris :D
Here is Ferris' talk as shown during Revision 2020 online. Around 03:30 he compares squishy to kkrunchy.
Anyone ever made comparisons between Squishy and BeroExePacker (based on kkrunchy, but 10% more powerful) ?
I compared BEP vs an earlier squishy version on trashpanda in this twitter thread where squishy came out on top, but that's anecdotal at best. I can imagine cases where BEP might perform better with the right combination of command line arguments as it's certainly more configurable and has more features (eg. hash import which can be a sensible tradeoff, and TLS if you happen to use that). Also I have evidence that pheromone is larger with squishy than with kkrunchy (which I suspect is the case for most things below 30k or so). Which leads to a good point - you might want to try your options here; while squishy typically wins on compression (sometimes up to several kb) you're paying for that with decode speed which may or may not matter to you. In some cases (like with basically anything that uses gargaj's synth) squishy is actually far too slow and doesn't compress significantly better anyways.
While I'm obviously quite proud of my work and happy people want to use it, I'm even more happy with a wide variety of packers available and both kkrunchy and BEP are quite good, so be sure to try them too! :)
So, I've been rebuilding some old 64k intros of mine (1998 to 2002) for Windows, and then compressing them with kkrunchy, kkrunchy_v7 and squishy-x86. However, I cannot launch them in compressed mode because Windows thinks they are viruses, and deletes them right away. All three of them. Additionally, when it deletes one, it also deletes the other two compressed versions that sit next to it. Pretty clever.
I knew storing intros online is almost impossible (my website has been blacklisted by Google a few times already); but on top of online storage now I cannot even store them in my disk, not if I want to run them at least.
It seems the world is pretty hostile to the type of 64k packers we use in the demoscene. So I have two questions:
1. How do people watch 64k intros at all? Do you tweak Windows somehow to prevent the deletion of the executables? My test is in a vanilla, normal, minimal Windows computer.
2. Back in the 2000s when UPX and kkrunchy were hot, most of the content in 64k intros was generated in code. However, these days most content is generated in the GPU (through tessellation/geometry/mesh/compute shaders, or mere raymarching). Even sound can be generated in the GPU.
So.... is there a world where we give up on transforming and compressing CPU code, and focus only on compressing data? This I can do myself in my own intro, but what I'm trying to say I guess is that it'd be awesome if squishy had an option to compress .data and .bss segments only rather than .code.
I'd personally happily give up 16 kilobytes for making my executables executable again. I'm thinking that I can impress more people by getting them to see the content-wise equivalent of a 48k intro than by having nobody watch a full 64k intro....
Quote:
1. How do people watch 64k intros at all? Do you tweak Windows somehow to prevent the deletion of the executables? My test is in a vanilla, normal, minimal Windows computer.
I just put my "demo" directory as an exclusion to Windows Defender, and have had no problems since.
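For anyone who prefers to script the exclusion, here is a minimal sketch. It only builds the command line for the `Add-MpPreference -ExclusionPath` PowerShell cmdlet; the helper name `defender_exclusion_command` and the folder `C:\demos` are made up for illustration, and actually running the command needs an elevated (administrator) prompt.

```python
def defender_exclusion_command(folder):
    """Build the PowerShell command that adds `folder` to the
    Windows Defender exclusion list (hypothetical helper)."""
    # PowerShell single-quoted strings escape a quote by doubling it.
    quoted = "'" + folder.replace("'", "''") + "'"
    return [
        "powershell",
        "-NoProfile",
        "-Command",
        "Add-MpPreference -ExclusionPath " + quoted,
    ]


# Example folder; replace with wherever you keep your demos.
command = defender_exclusion_command(r"C:\demos")
print(" ".join(command))
```

The returned list can be handed to `subprocess.run(command, check=True)` on Windows; it is kept as a plain list here so nothing touches the system by accident.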
What Gargaj said, and I found that Windows Defender isn't all that aggressive compared to AV products from other vendors. Also with already-released demos you have the advantage that someone else most likely already submitted them as a false positive to your AV vendor, so the demo will just run. I regularly do this when I find one that's still mis-detected.
We do that to our own intros before the compo - highly recommended to do the same. Here's a list of contact addresses.
iq has a point tho - a lot of 64k intros don't have _that_ much code anymore, and also a lot of them end up at way less than 64k nowadays. It might be a good idea to optionally dial down on the header/section tricks and even compression (what gets compressed / how long does decompression take) to not trigger AVs' heuristics right away if you don't strictly need it. Ofc there are always productions where each byte counts, but doing something good in something that would land at 48k otherwise is absolutely possible, and shouldn't be subject to that vastly diminished accessibility we're suffering from today.
I like iq’s suggestion. How about abandoning the 64k limit and increasing it to 80k (or similar) in compos from now on and dropping usage of traditional exe packers.
Also, you could put some relocateable code into the data segment and decrunch and execute at run-time.
Or switching to some homebrew script bytecode/VM for the main code that can be compressed nicely and stored in the data segment (has been done before anyways). So the actual uncompressed exe code is kept as minimal as possible. Probably need to make some tradeoffs between vm complexity and bytecode then. Just thinking out loud, probably a lot of bs, considering i usually don't code intros..
I don't think any of that really helps with the antivirus problem. It's just a matter of time for AV vendors to detect this new packer, realize that they cannot properly analyze the unpacked code and just flag all executables using that packer. If you really want to fix the problem, I guess you need to be able to use an unpacker without having to run the executable itself, and give that unpacker to AV companies. But even then it's not clear if they would care enough to actually use that unpacker like they do with UPX, or if they'd just continue flagging those executables.
a few notes/thoughts off the top of my head:
- I think we're being a bit too pessimistic here by only considering x86. The situation for x86-64 squishy seems much better than x86 according to some virustotal tests I've done so far (though not perfect at this point)
- Nothing's stopping us from dynamically allocating a new section for decompressed code/data just before unpacking; I'd guess this costs <500 bytes for everything, and is obviously much more affordable in any sense than dropping compression. It's probably worth actually trying this and seeing what I find; maybe it solves most of the issues. Except a big one..
- Many false positives are based on an estimated 0th-order entropy of the executable image, and I suspect dialing back compression won't do enough here. I suspect this is an issue more or less regardless of what we attempt.
This is a serious suggestion: wasm/wgpu is looking more and more like a very viable 64k platform in my eyes. Distribution/platform compatibility is largely a non-issue, the interface to the GPU, even for more modern features, is simpler, and I suspect (though I haven't tested/confirmed) that compression would be easier/better as well, perhaps even having less overhead with headers/unpacking code.
side note for iq: I'd love to see some size numbers with the various packers you tried :)
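To make the 0th-order entropy point above concrete, here is a minimal sketch of how such an estimate can be computed over a byte buffer. It is not taken from any particular AV product; the function name `byte_entropy` and the sample buffers are illustrative only.

```python
import math
import os
from collections import Counter


def byte_entropy(data):
    """0th-order Shannon entropy of `data` in bits per byte (0.0 to 8.0).

    Only the byte histogram is used; ordering and context are ignored,
    which is what "0th-order" means here.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A run of identical bytes carries no information per byte.
low = byte_entropy(b"\x00" * 4096)
# Every byte value equally often gives the maximum of 8 bits per byte.
uniform = byte_entropy(bytes(range(256)) * 16)
# Well-compressed data looks statistically like random bytes.
random_like = byte_entropy(os.urandom(65536))

print(low, uniform, random_like)
```

A packed executable is mostly compressed payload, so its histogram ends up close to the `random_like` case regardless of which packer produced it, which fits the suspicion that dialing back compression alone may not move this number much.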
Quote:
This is a serious suggestion: wasm/wgpu is looking more and more like a very viable 64k platform in my eyes. Distribution/platform compatibility is largely a non-issue, the interface to the GPU, even for more modern features, is simpler, and I suspect (though I haven't tested/confirmed) that compression would be easier/better as well, perhaps even having less overhead with headers/unpacking code.
It is also very much contingent on how browser creators approach online security, and they've been both notoriously unpredictable with their deprecations / exclusions. I trust them less than Windows.
From my experience, getting rid of kkrunchy/squishy doesn't help that much with the AV situation. I've already seen AVs going nuclear on UPX-packed executables, and Google blacklisted my website once because of an uncompressed, straight-from-the-compiler executable. I guess that's a battle we can't win :(
*cc21 flashbacks* :/ had to copy exe thrice and eventually run straight from USB flash before freaking Defender kills it
nb my laptop wasn't infected back then, and virustotal showed only one positive (+ two suspicious) out of several dozens on my demo's exe
How about putting all the compressed stuff of the intro into a separate data file instead of cramming it into the exe itself? Would that help against AV issues?
What about using Microsoft's compression stuff like CAB/LZX or similar (for the data file)? Compression ratio probably isn't top notch, but still usable I would guess. Shouldn't that help make intros look less suspicious to AVs?
Using cab file support was exactly what 20to4 did almost 20 years ago :)
Nowadays there is the Compression API (see compressapi.h), or you could rely on PowerShell "Expand-Archive".
However, it won't help your AV issues directly as these are usually caused by other aspects which might still apply (or can be avoided with reduced settings in your favourite exe packer, too):
-obfuscated API calls (read: Import by hash or ordinal)
-uncommon PE header contents
Other main reasons like lack of code signing, unusual file location, unknown/low reputation source, etc. still persist.
Even worse, while monitoring API and IO usage was a good indicator for suspicious software, this no longer applies to newer approaches which just rely on execution timing for getting leaked data. Looks like AV is more and more switching from signature based blacklisting to a reputation based whitelisting, which is not only a problem for scene prods but punishes basically all producers and users of small-scale company and hobbyist software projects.
Instead of trying to sacrifice compression for AV compliance (you'll never win anyway) it is better to instruct users to configure their security setup as needed and set up exclusion folders like Gargaj mentioned above.
@KeyJ: Even better, my site got temporarily flagged as containing malware due to a sample executable which contained nothing other than a simple "return" instruction. Despite all research and efforts, AV in practice is still a totally dumb thing.
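For readers unfamiliar with the "import by hash" item in the list above, here is a rough sketch of the idea: the executable stores a small hash of each API name instead of the name string, and at startup walks a DLL's export names until a hash matches. The rotate-right-by-13 hash below is just one well-known choice and is an assumption for illustration, not necessarily what kkrunchy, BEP or squishy use; `EXPORTS` is a made-up stand-in for a real export table.

```python
def ror13_hash(name):
    """32-bit rotate-right-by-13 additive hash of an ASCII name."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF  # rotate right by 13 bits
        h = (h + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Return the first export whose hash matches, or None."""
    for name in export_names:
        if ror13_hash(name) == wanted_hash:
            return name
    return None


# Hypothetical stand-in for the export name table of a DLL.
EXPORTS = ["CreateFileA", "ExitProcess", "LoadLibraryA", "VirtualAlloc"]

# The packed file would contain only this 4-byte value, not the string.
wanted = ror13_hash("VirtualAlloc")
resolved = resolve_by_hash(EXPORTS, wanted)
print(hex(wanted), resolved)
```

Four bytes per import is a clear size win over full name strings, but an executable with no readable import names is also what a lot of malware looks like, which is why this shows up on the list of AV triggers.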
Ferris, I have rebuilt these MS-DOS intros for Windows. These are some results, with default settings for all three compressors (I haven't investigated each one's options yet)
Rare2 https://www.pouet.net/prod.php?which=8945
Life: https://www.pouet.net/prod.php
Code:
         Raw    Squishy   kkcrunchy2   kkcrunchy   MsDos+UPX
Rare2:   102k   30k       31k          34k         47k
Life:    101k   37k       39k          45k         61k
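As a rough reading of the table above, the packed sizes can be turned into percentages of the raw size. This is only a sketch over the rounded kilobyte figures quoted in the post; the `MsDos+UPX` column may well come from a different (DOS) build than the `Raw` column, so treat that percentage loosely.

```python
# Sizes in kilobytes, copied from the table above (rounded values).
SIZES_KB = {
    "Rare2": {"Raw": 102, "Squishy": 30, "kkcrunchy2": 31,
              "kkcrunchy": 34, "MsDos+UPX": 47},
    "Life": {"Raw": 101, "Squishy": 37, "kkcrunchy2": 39,
             "kkcrunchy": 45, "MsDos+UPX": 61},
}


def packed_percent(sizes):
    """Packed size of each tool as a percentage of the raw size."""
    raw = sizes["Raw"]
    return {tool: round(100 * size / raw, 1)
            for tool, size in sizes.items() if tool != "Raw"}


for intro, sizes in SIZES_KB.items():
    for tool, pct in packed_percent(sizes).items():
        print(f"{intro:6s} {tool:11s} {pct:5.1f}% of raw")
```

On these two samples the three Windows packers land within a few kilobytes of each other, with squishy smallest in both rows.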
These are DOS intros. I'm slowly going through all my past intros/demos. When I arrive to the Windows ones (Paradise and 195/95/256) the numbers will be more meaningful.
BTW, off topic, but I'm finding it awesome to look at code I wrote 20 years ago; I still understand it all. Also, I made them resolution independent (not so easily done with intros with tight hardcoded asm loops), and I can now run them in a glorious HD-but-not-perspective-correct-and-texture-unfiltered look. It's pretty awesome, there's something cool about it.
I'll bring more numbers when I have them.
(sorry for the grammar in the previous post, I'm starting to suspect I have some writing disorder)
On WebGL/WebGPU, I don't know. I mean, it simplifies distribution and probably has a lot to offer with compression. But Gargaj has a point: security is a major concern for browsers, and the fact that we haven't seen any major API being obsoleted because of it might just be because the platform is still young (say 10 years). For all we know WebGL could disappear soon after WebGPU is out, or when <model> takes over; just like Flash got wiped out (for whatever reasons).
But I am also in the camp that thinks Web playback solves a lot of problems right now.
A good thing is that since, again, most intros these days are more data/shader driven than code driven, converting a native intro to a JS intro is actually easier than ever. So.... it can get difficult to do this in reality because of Real Life time constraints, but one can probably build both native and web versions of an intro without too much extra effort.