Still room for improvement

After honing the precision of the Tiny GPS Clock, I'd like to improve the GPS Wall Clock in the same way: using the GPS module's timepulse (1PPS) signal to synchronise display updates precisely with the top of each second.

There are a couple of small problems, however:

There are exactly four bytes of program space left on the ATTiny13a's 1KB flash.
All of the microcontroller's pins are in use.

4 bytes free program space: challenge accepted

This seems like an interesting challenge. Let's see how far we can get.

Digging for more program space

Going into probably the fifth or sixth code size reduction pass on this project, I really wasn't sure how much more could be done. Surprisingly, in a few evenings chipping away at the problem I managed to free up 180 bytes (17.5% of the total program space) without removing any functionality. Most of this was achieved by giving the compiler as much as possible to work on at once, with a sprinkling of other small optimisations:

Compiled as a single unit to maximise the compiler's ability to optimise, i.e. including source files rather than headers in main.c. This is a little messy as it breaks the encapsulation allowed by separately compiled source files, but this alone saved 84 bytes of program space, so it was worth the trade-off for this project (commit).
Used -nostartfiles and provided a custom assembly entry point. This allows some redundant instructions GCC adds for software reset and "exit" to be removed, but most importantly lets us store code in the space normally reserved for the interrupt vector table, since interrupts aren't used at all. Freed up 24 bytes (commit).

The assembly portion (startup.S) is fairly straightforward, as all it needs to do is zero out memory and jump to main(). All variables in the C program are already initialised to zero to avoid the need for additional memory initialisation on reset.
Replaced eeprom_read_byte and eeprom_write_byte calls from avr/eeprom.h with C source implementations from the datasheet. Having these functions in-source allows inlining as they're only used once each, plus they could be slightly modified to strip out a wait loop that's irrelevant in this application. Freed up 36 bytes (commit).
Replaced two calls to spi_send with a single call where the address and data bytes to send are combined as a single 16-bit value. Combining these bytes takes the same number of instructions as making two function calls, but with only one function call the compiler can inline the function. Freed up 8 bytes (commit).
Replaced multiply by 10 with bit shifts equivalent to (x*8) + (x*2). Every other multiplication in the code uses a power-of-two factor, which already gets compiled to bit shifts, so there's no need for a generic multiply routine. Freed up 6 bytes (commit).
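
As a rough illustration of the last two items, the shift-based multiply and the combined 16-bit SPI word could look something like the sketch below. The function names mul10 and spi_word are made up for this example and are not taken from the project's source; the only assumption about the MAX7219 is that it takes the register address in the high byte of a 16-bit word, sent most significant bit first.

```c
#include <stdint.h>

/* Multiply by 10 using only shifts and an add: x*10 = (x*8) + (x*2).
 * The result is truncated to 8 bits, so x must be 25 or less. */
static inline uint8_t mul10(uint8_t x)
{
    return (uint8_t)((x << 3) + (x << 1));
}

/* Combine a display register address and a data byte into one 16-bit
 * word, address in the high byte so it is shifted out first. */
static inline uint16_t spi_word(uint8_t address, uint8_t data)
{
    return (uint16_t)(((uint16_t)address << 8) | data);
}
```

With helpers like these, a single call along the lines of spi_send(spi_word(address, data)) replaces two back-to-back byte sends, which is what gives the compiler the chance to inline the send routine.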

Some of these wins involved disassembling the firmware using avr-objdump to spot where space was being wasted:

I initially looked into using a custom linker script, but using -nostartfiles ended up being a much cleaner solution. Labels in the disassembly like __trampolines_end and __ctors_end were a bit misleading - marking the end of sections which were actually empty and not relevant at all.

I wanted to keep this as accessible and easy to work on as possible, so going to hand-written assembly wasn't really something I wanted to do outside of the short startup file. The compiler now optimises 90% of the code into a single function, which would be a painful level of optimisation to achieve and maintain by hand.

Avoiding interrupts

Performing an action on a rising edge seems like a great case for interrupts, but given the code space constraints on this chip, it's not really practical. An interrupt handler doing anything more than flipping bits in register memory will need to push and pop registers to the stack, which quickly eats up program space at 4 bytes per register used, plus the instructions to do the required work.

It ends up being more practical space-wise to avoid interrupts and just carefully structure the main loop to get the required behaviour and timing.
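
For readers unfamiliar with the polling alternative, here is a minimal sketch of detecting an edge from inside a main loop rather than an interrupt handler. It is a generic illustration with a hypothetical helper name (rising_edge), not the clock's actual loop, which blocks while waiting for its inputs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true exactly once per low-to-high transition of a polled
 * signal. `previous` keeps the last sampled level between calls. */
static bool rising_edge(bool *previous, bool current)
{
    bool edge = current && !*previous;
    *previous = current;
    return edge;
}
```

The main loop would sample the pin on every pass and act only when the helper returns true. The cost is that edge timing is only as precise as the loop is fast, which is why the loop structure matters so much here.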

Abusing unused register (I/O) memory

The lowest 32 I/O register addresses on the ATTiny13a are bit-addressable, which means single instructions can be used to set, clear and test bits instead of the multiple instructions that regular memory requires:

; Set bit 1 at I/O address 0x18 (PORTB on the ATtiny13A) - I/O memory
sbi    0x18, 1

; Set bit 1 at address 0x9F (last byte of SRAM)
lds    r24, 0x9F ; Load current memory value into register
ori    r24, 0x02 ; Set bit 1
sts    0x9F, r24 ; Store result back in memory

Most of the I/O memory space is used by important working and control registers, but with some careful analysis it's often possible to find individual bits that won't affect behaviour and can be repurposed. In this case I've repurposed:

PB5 bit of DDRB: with the fuses set to use the RESET pin as reset, the port and direction registers for PB5 have no effect.
AIN0D bit of the DIDR0 register: The "input buffer disable register" for PB0 has no effect as this pin is always an output in this application.

Saving one boolean in memory might not seem like much, but the saving on code space with fewer required instructions adds up once you write, read and test the value in a few places.
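
In C, a flag kept in one of these spare bits might be wrapped up roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the non-AVR branch exists only so the snippet can be compiled and tried on a PC. When built with avr-gcc with optimisation enabled, single-bit operations like these on a low I/O register normally compile down to sbi, cbi and sbic/sbis.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>              /* real DDRB and DDB5 definitions */
#else
static volatile uint8_t DDRB;    /* host stand-in so the sketch runs on a PC */
#define DDB5 5
#endif

/* Hypothetical flag stored in the otherwise unused PB5 direction bit. */
#define FLAG_BIT DDB5

static inline void flag_set(void)    { DDRB |= (uint8_t)(1u << FLAG_BIT); }
static inline void flag_clear(void)  { DDRB &= (uint8_t)~(1u << FLAG_BIT); }
static inline bool flag_is_set(void) { return (DDRB & (1u << FLAG_BIT)) != 0; }
```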

Double duty GPIO

So there's space to add more code, but we still need to figure out if it's possible to get the timepulse signal from the GPS into the existing circuit. I started by analysing the pin assignments on the ATTiny:

PB0: MOSI Data output to the MAX7219. Available when not sending data to the display.
PB1: SOFT_UART/MISO UART input from GPS, which is externally switched to MISO for in-circuit programming. This is the INT0 interrupt pin which would be potentially useful for the timepulse, but I don't really want to change pin assignments around so will leave this alone.
PB2: SCK Clock output to the MAX7219. Available when not sending data to the display.
PB3: LOAD_CS Chip select for the MAX7219. Idles high when not sending data to the display.
PB4: LIGHT_SENSE/BTN Analog input that's already serving multiple purposes. Mixing an additional digital signal in here would be messy, so I'll leave it alone.
PB5: RESET This pin can't be used as I/O without disabling in-circuit programming. Using high-voltage programming would be impractical for this device given the existing circuit, so this pin needs to remain as reset.

That leaves PB0, PB2 and PB3 as potential inputs. The timepulse signal can't be connected directly to any of these without affecting their output voltage, since the timepulse idles low - sinking to ground. However, if we use the timepulse signal to switch a pull-down resistor instead, it's possible to sneak our input signal in without affecting that pin's output capability. This pull-down resistor only needs to be strong enough to overcome the weak internal input pull-up in the microcontroller.

I opted to attach timepulse to the LOAD_CS pin, since its high output state can be switched to input with pull-up in a single instruction. A couple of resistors and an NPN transistor are used to create a pull-down from the timepulse signal:

Schematic modifications to combine the GPS timepulse with MAX7219 chip select

Conveniently, the 0805 LED and resistor on my GPS module's timepulse pin could be removed and replaced with some of these components. It doesn't all fit, but the existing pads made this easier than soldering everything as a floating bodge:

Macro photo of a SOT-23 transistor and two resistors soldered to the timepulse pin of the NEO-6M GPS module
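
On the firmware side, sharing the pin could be sketched along these lines. The helper names are invented for illustration, the pin number comes from the PB3 assignment listed earlier, and the non-AVR branch is only a stand-in so the code can be exercised on a PC. The sketch assumes the PORTB bit for LOAD_CS is left at 1 throughout, so clearing the direction bit is all it takes to go from a driven-high output to an input with pull-up.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
static volatile uint8_t DDRB, PORTB, PINB;  /* host stand-ins */
#endif

#define LOAD_CS_BIT 3   /* PB3 */

/* Release LOAD_CS: with its PORTB bit still 1, clearing the DDRB bit
 * leaves the pin as an input held high by the internal pull-up. */
static inline void load_cs_listen(void) { DDRB &= (uint8_t)~(1u << LOAD_CS_BIT); }

/* Drive LOAD_CS high again before talking to the display. */
static inline void load_cs_drive(void)  { DDRB |= (uint8_t)(1u << LOAD_CS_BIT); }

/* The NPN stage pulls the pin low while timepulse is high, so a low
 * reading means the pulse is active. Only meaningful while listening. */
static inline bool timepulse_active(void) { return (PINB & (1u << LOAD_CS_BIT)) == 0; }
```

Because the transistor inverts the signal, the firmware looks for a falling edge on the pin to find the rising edge of timepulse.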

Pull-down resistor sizing

The resistor between the transistor's collector and LOAD_CS is needed to prevent the microcontroller pin sourcing too much current when configured as an output while timepulse is active. This resistor could be avoided if there was coordination to ensure timepulse is never active when LOAD_CS is an output, but there's no reason to go to that effort here.

I initially pulled down through a 1.8 kΩ resistor which worked fine, but it did result in a very slow falling edge:

1.9 μs edge shown on an oscilloscope screenshot

Reducing the pull-down resistance to 470 Ω speeds the edge up significantly, though it means a current of around 10 mA flows while timepulse is active if LOAD_CS is an output:

Another oscilloscope screenshot showing a 200 nanosecond edge

10 mA from an I/O pin is well within the spec of the microcontroller, but it's a bit of a waste of power. The waste could be reduced by changing the timepulse length from its default of 10 milliseconds to something like 10 μs.
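
As a back-of-the-envelope check on those numbers, assume a 5 V supply (not stated above, but consistent with roughly 10 mA through 470 Ω), ignore the transistor's saturation voltage, and take the worst case where LOAD_CS is driven high for the whole pulse, once per second:

```latex
\begin{align*}
I_{\text{on}} &\approx \frac{V_{CC}}{R} = \frac{5\,\text{V}}{470\,\Omega} \approx 10.6\,\text{mA} \\
\bar{I}_{10\,\text{ms}} &\approx I_{\text{on}} \cdot \frac{10\,\text{ms}}{1\,\text{s}} \approx 106\,\mu\text{A} \\
\bar{I}_{10\,\mu\text{s}} &\approx I_{\text{on}} \cdot \frac{10\,\mu\text{s}}{1\,\text{s}} \approx 106\,\text{nA}
\end{align*}
```

Under those assumptions, shortening the pulse cuts the average wasted current by a factor of 1000, while the peak current through the pin stays the same.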

Buffering the time to display

With the timepulse signal finally coming into the microcontroller, the firmware needed some modifications to tick accurately. The gritty details of this are covered in the Ticking Accurately with the NEO-6M write-up, so I won't repeat them here, but the important changes were:

Added a 6-byte buffer to hold the display data ready for immediate output at the top of the second. This wasn't strictly necessary, but it aligns with the implementation in Tiny GPS Clock.
Added code to increment time for display at the next timepulse
Shuffled the main loop around with a blocking wait for timepulse or UART transmission (whichever comes first), and a check if the timepulse fired.
Added detection of the timepulse not firing, falling back to the original direct-from-UART time display method. The last decimal point is illuminated if this fallback is active.
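
To make the "increment time for display" step concrete, here is a simplified sketch of advancing a time of day by one second and expanding it into a 6-digit buffer. The struct and function names are hypothetical, leap seconds and dates are ignored, and the real firmware almost certainly does this differently to save space (division and modulo are expensive on a chip this small).

```c
#include <stdint.h>

/* Hypothetical time-of-day record; field names are illustrative only. */
struct clock_time {
    uint8_t hours, minutes, seconds;
};

/* Advance by one second, rolling over at 60 s, 60 min and 24 h. */
static void time_increment(struct clock_time *t)
{
    if (++t->seconds < 60) return;
    t->seconds = 0;
    if (++t->minutes < 60) return;
    t->minutes = 0;
    if (++t->hours < 24) return;
    t->hours = 0;
}

/* Fill a 6-byte display buffer with the decimal digits HHMMSS. */
static void time_to_digits(const struct clock_time *t, uint8_t out[6])
{
    out[0] = t->hours / 10;   out[1] = t->hours % 10;
    out[2] = t->minutes / 10; out[3] = t->minutes % 10;
    out[4] = t->seconds / 10; out[5] = t->seconds % 10;
}
```

The idea is that the buffer is prepared ahead of time from the time the GPS has just reported plus one second, so that when the next timepulse arrives the only work left is shifting the six bytes out to the display.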

After these changes, the program is back up to 1006 bytes (18 bytes free). Ready for the optimisation again next time I want to add a feature!

7-segment big GPS wall clock — The far-right decimal point is now lit when the displayed time isn't synced to timepulse. In this state the time continues to update using the original, less accurate method.

Calibrating the timepulse

With the timepulse coming into the microcontroller and code changes made, the last step is offsetting the timepulse to account for the display update time:

GPS timepulse followed by SPI transmission on scope

As there's not enough program space left to send configuration commands to the GPS module from the microcontroller, I used the u-blox software u-center to modify the configuration on the clock's GPS module and save it to its SPI flash, without writing any code. This software is only available for Windows, but it can be run in a virtual machine without issues.

Setting the User Delay option of the timepulse to 154 μs accounts for the time it takes to update the display's digit memory. With this set, the timepulse fires slightly early to account for the delay between the rising timepulse edge and the display completing its update.

Until next time...

I can finally rest easy knowing this source of time in our apartment is slightly closer to the arbitrary concept of time we've invented, even if I have no way to absolutely measure it. The difference is ultimately unnoticeable, but it is nice to see the various GPS clocks in our apartment tick in synchrony now: