Close

GLXgears with the Budge 3D routines

A project log for GLXgears on a commodore 64

GLXgears on a commodore 64

lion-mclionheadlion mclionhead 08/14/2024 at 03:450 Comments

The next trick with glxgears is there's also a camera transformation.  The budge library doesn't support a camera transformation & it doesn't use a transformation matrix that can be multiplied.  The 3 gears would have to be translated after rotating Z, then translated again after rotating X & Y.  Despite a generic transform being available, there aren't enough clockcycles to really use it.

As discovered earlier, the rotation can't be baked into the models because the shaft polygons need all the steps.  Maybe the teeth could be baked in while the shafts were computationally rotated.

Through some clever programming, you can fit a single gear into 176 points & 264 lines by reusing the points.  The  budge library would need to be expanded to support 16 bit line pointers & to support different tables for each model.

Noted ca65 sometimes says "Didn't use zeropage addressing".  You have to put all the variables before the usages so the assembler knows to use zero page instructions.

Over a few days, most of the original Budge 3D library was removed.  Only the transformation & line drawing remaned.  The arguments were set manually instead of using tables.  The scaling, crunch, & code arrays were gone.  A few more self modify entry points allowed pointers to different models.

For the record, there was an attempt to color the gears by setting screen memory.

It was too ugly.  Where the gears overlap, there's a non deterministic order which would require real Z buffering to look right.  It could be mostly done by copying bitmap memory into 2 sprites but it would be complicated & lions survived without it.

For some reason, it gets real slow when the gears are edge on.  It has to draw all the vertical lines & the joiner lines in their full length, but none of the horizontal lines.  The same phenomenon happens when the top  or side of the gears is facing the camera.  Somehow the joiner lines really slow it down.  The line drawing could be impacted more by length than  number.

In its fastest orientation, Glxgears with the 1980 budge code ran at 2.22fps compared with 1.25fps for the 2023 edition.

In the slowest orientation, the budge code ran at 1.45fps while the 2023 edition  ran  at .91fps.

 Both were using orthogonal projection & limited to 256 columns.  Kind of enlightening how methods known in 2023 produced slower results than methods known in 1980 & later forgotten.

-----------------------------------------------------------------------------------------------------------------------------------------------------

There are theories about how line drawing could be faster, but smarter minds spent 40 years optimizing it.  Also, there's no way the budge line drawer could reach all 320 columns without becoming just as slow as the 2023 edition.  The commodore can advance 7 of 8 rows with inc instead of a table lookup.  It can advance 7 of 8 columns with bit shift instead of table lookup.  To avoid more branches, it would have to be unrolled.  If these loops were unrolled, it could probably reduce a 12 cycle address copy to a 5 cycle ror.  Starting & ending the loop is problematic because the line could start & end anywhere in the 8 iterations.  It could call a rolled version in certain conditions that occur for up to 14 iterations per line.  If the average line starts & ends in between multiples of 8, it would be a loss.  Most of glxgears is under 15 pixels with the start & end between multiples of 8 but not the cube demo.

Noted it's more expensive to advance rows than columns, vertical lines are slower than horizontal lines, because the row advance requires increasing a 16 bit address.  You couldn't increment 1 byte & check the result for a multiple of 8 without making it slower.

The Bresenham algorithm is basically a counter which wraps around at every width or height of the line, depending on if the slope is > or < .5.  The counter is incremented by the height or width of the line after every pixel.  If it wraps, we increment either a row or column.  The budge implementation only handles line dimensions up to 256.

1 thing that could work is offsetting all the rows by 32 bytes to center the graphics.

------------------------------------------------------------------------------------------------------------------------

Lions started wondering if a real C64 could have been overclocked with a faster crystal if the video was captured on custom hardware that could handle arbitrary scan rates.

It's kind of a shame to not make some kind of FPV game with the budge library.  It could make a nice wireframe flight simulator or driving game.  An optimized line drawer could draw a filled ground & sky.  Foreground objects could be wireframe.  Back in the day, games really were just graphics demos with no campaign beside shooting everything.

Discussions