Catharsis Theory by Cluster

Catharsis Theory

An intro for the Atari 2600, released at Revision 2017.
4k ROM, no compression, 128 bytes RAM, no framebuffer.

Code and direction: Kylearan (andre.wichmann@gmx.de)
Music: Glafouk (glafouk@hotmail.com)

"How do you get to Carnegie Hall? Practice, practice, practice. How do you
become a very angry person? The answer is the same. Practice, practice,
practice." - B.J. Bushman (2001)


After focusing on transitions and flow (TIM1T), hi-res graphics against all
limitations (Ascend), and sync (Derivative 2600), this time I wanted to do
a size-limited intro. For one, because many people in the 2600 scene think
it's mandatory to make a 4k production or otherwise you're not a "true"
2600 programmer, but also because I wanted to show that an oldschool 4k
demo doesn't mean you have to ignore concept and direction.

Whether you like the concept is for you to decide. What follows is a technical
write-up about each effect and the size-coding challenges involved.


### Lightning ###

I use players, missiles and the ball to display the lightning, to get
enough colors but also to be able to have branches. A pseudo random number
generator is used in the kernel to determine if the direction of the
lightning should be reversed, if a new branch should be started, and for
each branch in which direction it should go and how long it should be.
Since there are a lot of conditionals involved, this kernel is not a
strict n-line kernel. Instead, it simply takes as long as it needs to
decide and process all things before doing a WSYNC and HMOVE, resulting
in a variable 2-4 scanlines before objects are moved. This is no problem
as lightning should look chaotic anyway, and the kernel uses INTIM to
determine when to stop.
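
As a rough illustration of the idea (not the actual kernel, which is 6502
assembly), the sketch below lets a small pseudo random number generator
decide per step whether the bolt reverses direction and how far it moves.
The names lfsr_step and lightning_path, the tap mask 0xB8, the
probabilities and the step sizes are all assumptions made for this
example. Unlike the real kernel, the sketch tracks the absolute x position
so the result can be inspected.

```python
def lfsr_step(state: int) -> int:
    """Advance an 8-bit Galois LFSR by one step (state must be non-zero)."""
    carry = state & 1
    state >>= 1
    if carry:
        state ^= 0xB8  # assumed tap mask, a commonly cited choice for 8 bits
    return state


def lightning_path(seed: int, rows: int, start_x: int = 80) -> list:
    """Return one x position per step of a randomly wandering bolt."""
    state = (seed & 0xFF) or 1     # an LFSR must never start at zero
    x, direction = start_x, 1
    path = []
    for _ in range(rows):
        state = lfsr_step(state)
        if state & 0x03 == 0:      # roughly one step in four: reverse
            direction = -direction
        step = 1 + ((state >> 2) & 0x03)          # move 1..4 pixels
        x = max(0, min(159, x + direction * step))  # stay on a 160 pixel line
        path.append(x)
    return path


path = lightning_path(seed=0x5A, rows=40)
```

Every move is relative to the previous position, which is exactly why a
kernel that does not keep a running total cannot know where an object is.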

One interesting challenge was when a branch (using a missile) ends. Since
movement of all objects is determined randomly and happens relative to its
last position (as per usual on the Atari 2600), I have no idea where
everything is for a given line during the kernel. So how should I "recall"
a missile object back from a branch to the main lightning then if I don't
know its position? Computing the new position after each change would
be too expensive, not to mention wasting a scanline for
repositioning via HMOVE.

That was when it paid off that I have that strange habit of reading
specifications from front to back instead of using them only to look up
things when needed. I remembered these odd registers I had never used
before, RESMP0/1, which will force the missile objects back to their
corresponding player objects - exactly what I needed here. Originally
built into the console for supporting the Tank game, they came in handy
here as well! It was fun to make use of such an obscure feature.

For displaying the word below the lightning, there is a table of all
characters needed and the words are constructed on demand. There's a
small "framebuffer" where I scroll in each character from right to left
until centered, checking for kerning while doing so. This routine
combined with the tables for the words and the characters is actually
smaller than storing the playfield graphics for all words directly.

Since I wanted an aggressive, energetic look, I used the famous Cosmic
Ark star effect expanded to all five objects as an overlay. Credit for
how to do it with five objects goes to Thomas Jentzsch. When he discovered
how to do it, I immediately dismissed it as I couldn't imagine a use case
(using it as a star field looks very bad in my opinion, as the pattern
is way too regular). Turns out my initial reaction was misplaced, as I
ended up using it not only here, but in the white noise part as well. :-)
 

### Tunnel ###

Yes yes I know, a tunnel effect - the most overused effect of the demoscene
after the rotating cube. I can almost see your eyes rolling, but hear me
out first!

To support the concept of the demo, I wanted to have effects related to
the notions of descent and falling - and a tunnel is perfect for that. And
on a technical level, as far as I know no tunnel that big (16x16) with such
high precision and with multiple colors has been done on the Atari 2600
before. The reason is the lack of a framebuffer combined with the weird
playfield behavior that makes it so hard. For example, the tunnel in KK's
Ataventure demo is smaller and only single-colored, so I wanted to beat
that - and doing it in a multi-part 4k demo was an even more interesting
challenge.

The tunnel is displayed using a mirrored playfield where the right side
gets updated at the only cycle where this is possible (45), a quad-sized
player overlay and some AND masks applied via the SAX opcode for creative
dithering. The atan and distance tables (using 8.2 precision) as well as
the two textures are packed into 512 bytes only, and it took a lot of
optimization to use these packed tables to construct those weird playfield
registers at 25fps at least. Computation is spread over overscan, vblank
and the kernel, which is also the reason I have been unable to center the
tunnel vertically on the screen - I need the screen real estate for calling
the music routine and computing 1/8th of the next frame!
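
For readers unfamiliar with table-driven tunnels, here is a minimal sketch
of the general technique, with two one-bit textures stored in the unused
upper two bits of the depth table. It is not the demo's code: the table
scaling, the texture size TEX, the two sample textures and all function
names are assumptions, and the demo's real table layout and precision are
not reproduced here.

```python
import math

SIZE = 16   # the tunnel is 16x16 cells, as stated in the text
TEX = 16    # hypothetical texture size: 16x16 one-bit texels


def build_tables():
    """Precompute angle and depth per screen cell (scaling is an assumption)."""
    center = (SIZE - 1) / 2   # 7.5, so no cell sits exactly on the center
    angle, depth = [], []
    for y in range(SIZE):
        for x in range(SIZE):
            dx, dy = x - center, y - center
            a = (math.atan2(dy, dx) + math.pi) / (2 * math.pi)  # 0..1
            angle.append(int(a * TEX) % TEX)
            # Depth grows towards the center; clamp so it fits in 6 bits.
            depth.append(min(63, int(2 * TEX / math.hypot(dx, dy))))
    return angle, depth


def pack(depth, tex_a, tex_b):
    """Store texture A in bit 6 and texture B in bit 7 of each depth byte."""
    return [d | (a << 6) | (b << 7) for d, a, b in zip(depth, tex_a, tex_b)]


def render(packed, angle, rot, zoom, tex_bit):
    """Return one frame of 0/1 cells; rot and zoom animate the tunnel."""
    frame = []
    for i in range(SIZE * SIZE):
        u = (angle[i] + rot) % TEX
        v = ((packed[i] & 0x3F) + zoom) % TEX   # mask off the texture bits
        frame.append((packed[v * TEX + u] >> tex_bit) & 1)
    return frame


angle, depth = build_tables()
tex_a = [((u // 2) + (v // 2)) & 1 for v in range(TEX) for u in range(TEX)]
tex_b = [v & 1 for v in range(TEX) for u in range(TEX)]
packed = pack(depth, tex_a, tex_b)
frame = render(packed, angle, rot=3, zoom=5, tex_bit=6)
```

Turning such a frame into playfield register values is the expensive part
on the real machine and is left out of the sketch.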


### Triangle ###

Not much to say here. The original plan was to make a triangle "hole"
oscillating back and forth, but due to some brain fart the original concept
didn't work (or at least not in 4k). But since I discovered an acceptable
background effect by accident (applying a sine-moving color gradient over
a scrolling playfield background), I kept the part - it doesn't look very
good, but I think it fits the theme and thus works okay here.


### Parallax ###

The easy way to implement such a two-color-bar parallax effect would have
been to make each color 8 pixels wide and then simply use player and missile
objects for the scrolling, as that would have been exactly in the range that
HMOVE can handle. But I wanted to have wider colors (16 pixels), which made
some more complex computation necessary if I didn't want to have black lines
between the bars used for repositioning. So what I have to do now as well is
to determine if I have to swap colors when a new bar begins, as that allows
me to remain in the HMOVE range again for the objects. That took a lot of
RAM and some fiddling around.
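
The reasoning behind the color swap can be sketched as follows. With two
alternating colors and 16 pixel wide bars, the pattern repeats every 32
pixels, and shifting it by 16 pixels is the same as swapping the two
colors. Any shift can therefore be reduced to a signed 4-bit move plus an
optional swap. This is only an illustration of that arithmetic, not the
demo's code: the function names are made up, and the sign convention of
the real HMxx registers is ignored.

```python
BAR_WIDTH = 16
PERIOD = 2 * BAR_WIDTH   # two colors, so the pattern repeats every 32 pixels


def normalize_shift(delta: int):
    """Reduce any shift to (move, swap_colors) with move in -8..7."""
    d = delta % PERIOD                 # 0..31
    if d < 8:
        return d, False
    if d < 24:
        return d - BAR_WIDTH, True     # a 16 pixel shift equals a color swap
    return d - PERIOD, False


def color_at(x: int, offset: int) -> int:
    """Color index (0 or 1) at pixel x for a pattern shifted by offset."""
    return ((x - offset) // BAR_WIDTH) % 2


move, swap = normalize_shift(13)       # 13 is out of range, so swap instead
```

For a shift of 13 pixels the sketch yields a move of -3 with the colors
swapped, which shows the same picture as moving by 13 without a swap.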

There's actually a small bug in the code that causes some jitter from time
to time, but I decided to leave it in there as it fits the theme nicely. :-D


### Game ###

Trivial, even when making it as size-optimized as possible.


### Relief/Relive ###

The display kernel uses SkipDraw for showing the characters and a mirrored
playfield for the parallax rays. The challenge here was how many counters I
could deploy for determining when to set or clear playfield bits in a
two-line kernel in addition to the SkipDraw routine. Turns out it's six,
with zero cycles left. :-)

For the distortions increasing in duration over time, I thought about how
to use random numbers to do it - but in the end, a simple table with 
timestamps was smaller and easier to control.
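
A timestamp table of this kind could look like the sketch below. The
table contents, the frame numbers and the function name are invented for
the example; only the idea of walking a small table with a frame counter
comes from the text above.

```python
# Hypothetical table: (start frame, duration in frames), durations growing.
DISTORTIONS = [(100, 2), (220, 4), (330, 8), (420, 16)]


def distorted_frames(table, total_frames):
    """Walk the table with a single cursor, as a frame counter would."""
    index, remaining, result = 0, 0, []
    for frame in range(total_frames):
        # Start the next distortion when its timestamp is reached.
        if index < len(table) and frame == table[index][0]:
            remaining = table[index][1]
            index += 1
        if remaining:
            remaining -= 1
            result.append(frame)
    return result


frames = distorted_frames(DISTORTIONS, 500)
```

Changing the timing then only means editing table entries, which is the
control a random number generator would not give.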


### White Noise ###

With each playfield pixel being 4 pixels wide, it is surprisingly difficult
to implement a good-looking white noise. You only have 5 hi-res objects
which cannot be repositioned arbitrarily during the kernel, and besides you
wouldn't have time for that when you also want to constantly write new
values into the playfield registers in the kernel.

That was when I remembered the Cosmic Ark star effect, which not only
allows for more copies of one object per scanline, but which also does the
repositioning each scanline for you for free. So all I do is write random
values into the playfield registers using an inlined prng algorithm, and
the overlaying Cosmic Ark star effect with all five objects hides the low
resolution a bit. In the chaos the regular pattern of the effect is also
less visible, so it works nicely.
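
To show what writing random values into the playfield registers amounts
to, here is a small sketch. The generator, its tap mask and the function
names are assumptions and not taken from the demo. The bit order used in
expand follows the TIA's documented layout: PF0 shows bits 4 to 7, PF1
shows bits 7 down to 0, and PF2 shows bits 0 to 7, giving 20 playfield
pixels per half line.

```python
def lfsr_step(state: int) -> int:
    """Advance an 8-bit Galois LFSR by one step (state must be non-zero)."""
    carry = state & 1
    state >>= 1
    if carry:
        state ^= 0xB8  # assumed tap mask, a commonly cited choice for 8 bits
    return state


def noise_line(state: int):
    """Return (new_state, pf0, pf1, pf2) for one scanline of noise."""
    state = lfsr_step(state)
    pf0 = state & 0xF0          # only the upper 4 bits of PF0 are displayed
    state = lfsr_step(state)
    pf1 = state
    state = lfsr_step(state)
    pf2 = state
    return state, pf0, pf1, pf2


def expand(pf0: int, pf1: int, pf2: int) -> list:
    """Turn the three registers into 20 playfield bits, left to right."""
    bits = [(pf0 >> b) & 1 for b in (4, 5, 6, 7)]        # PF0: bit 4 first
    bits += [(pf1 >> b) & 1 for b in range(7, -1, -1)]   # PF1: bit 7 first
    bits += [(pf2 >> b) & 1 for b in range(8)]           # PF2: bit 0 first
    return bits


state, pf0, pf1, pf2 = noise_line(0x5A)
half_line = expand(pf0, pf1, pf2)
```

Each of the 20 bits covers 4 pixels on screen, which is the coarse
resolution the overlaid star effect helps to hide.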


### Music ###

When I had all parts roughly working (already programmed with size in
mind), I only had 236 bytes left - and that was without any scripting yet,
without the music routine and without music data! I had jokingly posted a
screenshot of my compiler's free ROM output on Facebook and was surprised
and happy to receive multiple offers from musicians to try and make music
even for this little space! <3 Glafouk had been the first one, and since
he has been using my TIATracker several times already, I happily accepted
his offer.

After adding scripting and several serious and painful optimization
passes, I figured Glafouk could have about 320-350 bytes for the music
(including the player routine!), which is ridiculously small - especially
considering that I needed *two* tunes, one short "happy" loop for the
Relief part, and a longer, dissonant, aggressive tune for the rest of
the demo. Poor Glafouk!

After Derivative 2600, I had spent a very long time developing TIATracker
which now paid off, as it allowed quick experiments and turn-around times.
Glafouk went totally crazy and sent me no less than 17(!) small "happy"
loops and 3 main tunes - you can find his experiments in my public
bitbucket repository for this demo, in the "whiteboard/music" folder.

Don't blame Glafouk if you find the music too repetitive and simple -
with these limitations forced upon him by me, I'm very happy with the
music he delivered. The problem was that I couldn't cut out a full part
of the demo to make more room for the music, as all parts were needed
for the underlying concept to work.


### Optimizing for Size ###

With so much content in a 4k demo, almost half my development time went
into optimizing for size. I don't know how often I read my code front to
back, each time scraping off another 5-10 bytes from somewhere... Here are
some things I did:

- First, the obvious and most painful measures: I had to cut out features.
  Originally, the tunnel had a decorative border, the game part had an ammo
  display and the crosshair moved for each shot, the parallax part showed
  different symbols each time, the colors of the lightnings and the
  parallax bars changed, etc.
- I searched for the subroutine I called the most (fine-position an object,
  called 13 times) and used BRK instructions instead of JSR to call it,
  saving 13 bytes.
- In a spreadsheet, I listed all data tables I used (around 40) and their
  beginning and ending bytes. Then I looked for tables that start and end
  with the same byte sequences, and moved them in the code so they could
  overlap, taking into account that some of them must not cross a page
  boundary.
- If possible, I would inline subroutines at least once.
- Where possible, tables were packed together. For example, the characters
  used for the lightning words are only 5 bits wide, so I used the upper
  3 bits for the duration values for each part (the 6 bits of a 9 bit value
  split into two 3 bit values). Using AND to mask out unwanted bits and
  combining the two 3 bit values is smaller than a stand-alone table of
  32 duration values.
- Similarly, the part indexes which specify which part to call when
  (3 bits) are encoded in another table with unused bits.
- Turns out the distance table for the tunnel effect also doesn't use the
  upper two bits, so the two textures went into those.
- The music only uses one percussion instrument, simplifying the player
  routine. In addition, all start indexes of the patterns are <128, so I
  could do away with having to construct a pointer to the current pattern
  and use a simple byte index instead.
- And lots of traditional stuff like manually mirroring the sin table;
  identifying actions that I do several times (like setting most graphics
  registers to 0 at the end of a kernel) and making a subroutine out of
  them; checking if SEC/CLC instructions are really needed, sometimes
  even accepting occasional errors in the operation as long as they are not
  really noticeable; re-using immediates (if a color stored into COLUPF
  has bit #1 set, use it immediately to enable a missile, for example);
  etc.
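
The table overlap search from the list above was done by hand in a
spreadsheet. A small helper along the following lines could do the same
check automatically. It is a sketch under assumptions: the function names
and the sample tables are made up, and a real tool would also have to try
different orderings of all tables.

```python
def overlap(a: bytes, b: bytes) -> int:
    """Longest k such that the last k bytes of a equal the first k of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def merge(a: bytes, b: bytes):
    """Place b right after a, sharing the overlapping bytes.

    Returns the merged data and the offset at which table b starts.
    """
    k = overlap(a, b)
    return a + b[k:], len(a) - k


def crosses_page(address: int, length: int) -> bool:
    """True if a table of this length at this address spans two 256-byte pages."""
    return (address >> 8) != ((address + length - 1) >> 8)


table_a = bytes([1, 2, 3, 0, 0])   # hypothetical table ending in two zeros
table_b = bytes([0, 0, 7, 8])      # hypothetical table starting with two zeros
merged, b_offset = merge(table_a, table_b)   # 7 bytes instead of 9
```

The page check matters because some tables must not cross a page
boundary, so a candidate placement has to be rejected if crosses_page
reports True for such a table.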
 
Several months of optimizing code for size nearly drove me insane, and now
I need a break and have a huge itch to unroll loops, waste ROM with look-up
tables and such. :-)