One of the goals of this project is to make the code capable of being entered into the Hackaday 1K code competition. This isn't perhaps as easy as it sounds: 1K is not a lot of memory, and the Arduino framework (as generally friendly as it is) isn't all that helpful if your goal is to use the smallest amount of memory that you can. As an example, consider the smallest Arduino sketch imaginable:
void
setup()
{
}
void
loop()
{
}
How much space does this simple sketch take up?
AVR Memory Usage
----------------
Device: atmega328p
Program: 444 bytes (1.4% Full)
(.text + .data + .bootloader)
Data: 9 bytes (0.4% Full)
(.data + .bss + .noinit)
Yep, 444 bytes out of our 1024 bytes, and we haven't even started to do anything yet. What happens if we include just a few additional library calls? Here is a simple program that blinks the on-board LED once a second and writes out the word BLINK each time.
void
setup()
{
    Serial.begin(9600) ;
    pinMode(13, OUTPUT) ;
}
void
loop()
{
    Serial.println("BLINK") ;
    digitalWrite(13, HIGH) ;
    delay(500) ;
    digitalWrite(13, LOW) ;
    delay(500) ;
}
How much memory does this use?
AVR Memory Usage
----------------
Device: atmega328p
Program: 1916 bytes (5.8% Full)
(.text + .data + .bootloader)
Data: 192 bytes (9.4% Full)
(.data + .bss + .noinit)
Yep, 1916 bytes of flash! We are way over our 1K code, and we have barely done anything yet. Clearly if we are going to implement my 1Keyer in just 1K, we are going to have to be clever, and use some techniques which aren't part of the normal way of working with Arduino. As a hint to what's coming, I'll note that you don't have to use all the Arduino libraries to write code for Arduino microcontrollers. You can use avr-gcc directly (or via platformio) to compile the following simple C program which does nothing.
int
main()
{
    for (;;) ;
}
And how big is that?
AVR Memory Usage
----------------
Device: atmega328p
Program: 134 bytes (0.4% Full)
(.text + .data + .bootloader)
Data: 0 bytes (0.0% Full)
(.data + .bss + .noinit)
Just 134 bytes! Sure, we won't be able to use the convenient pinMode, digitalWrite and Serial commands if we do this, but we can recover space for all the functionality of those that we don't use. That will be the key to making this work. And, as it turns out, it's not particularly difficult to do simple things without this Arduino scaffolding.
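As a rough sketch of what doing without the scaffolding might look like, here is the LED half of the blink program written against the port registers directly. This is not the final 1Keyer code, just an illustration. The helper names pin_make_output and pin_write are made up for this example. It assumes an Uno-style board, where Arduino pin 13 is bit 5 of port B (PB5), and a 16 MHz clock if F_CPU isn't already set by the build. The helpers take register pointers so that the bit twiddling can also be tried on a desktop compiler; the main() is only compiled when building for AVR.

```c
#include <stdint.h>

/* Set one bit in a data-direction register to make that pin an output
   (the part of pinMode(pin, OUTPUT) that we actually need). */
static inline void pin_make_output(volatile uint8_t *ddr, uint8_t bit)
{
    *ddr |= (uint8_t)(1u << bit);
}

/* Drive one pin of a port high or low (the job digitalWrite() does). */
static inline void pin_write(volatile uint8_t *port, uint8_t bit, uint8_t high)
{
    if (high)
        *port |= (uint8_t)(1u << bit);
    else
        *port &= (uint8_t)~(1u << bit);
}

#ifdef __AVR__
#ifndef F_CPU
#define F_CPU 16000000UL    /* assumption: 16 MHz clock, as on an Uno */
#endif
#include <avr/io.h>
#include <util/delay.h>

int
main(void)
{
    pin_make_output(&DDRB, PB5);      /* assumption: LED on PB5 (pin 13) */
    for (;;) {
        pin_write(&PORTB, PB5, 1);
        _delay_ms(500);
        pin_write(&PORTB, PB5, 0);
        _delay_ms(500);
    }
}
#endif
```

There is no Serial output here; replacing that is a separate job, and how much space this version actually saves is something to measure rather than guess.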
Bonus: First of all, don't be terrified by this. We are going to dive a tiny bit into the actual machine code that is generated by the C compiler to see how we can eke out some more space. If this doesn't make any sense to you, it's entirely natural, and don't get frustrated or depressed. Try reading a machine language tutorial like this one and even if it doesn't make sense, try to follow along, and if you have any questions, feel free to leave comments.
We can figure out what all these 134 bytes are doing if we use the avr-objdump program which is part of avr-gcc. Again, just teasing, here's what happens when I tell avr-objdump to disassemble the firmware that we generated from that empty main() program above:
> ~/.platformio/packages/toolchain-atmelavr/bin/avr-objdump -z -d -C .pioenvs/uno/firmware.elf

.pioenvs/uno/firmware.elf:     file format elf32-avr

Disassembly of section .text:

00000000 <__vectors>:
   0:   0c 94 34 00     jmp     0x68    ; 0x68 <__ctors_end>
   4:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
   8:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
   c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  10:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  14:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  18:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  1c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  20:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  24:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  28:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  2c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  30:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  34:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  38:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  3c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  40:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  44:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  48:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  4c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  50:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  54:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  58:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  5c:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  60:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>
  64:   0c 94 3e 00     jmp     0x7c    ; 0x7c <__bad_interrupt>

00000068 <__ctors_end>:
  68:   11 24           eor     r1, r1
  6a:   1f be           out     0x3f, r1        ; 63
  6c:   cf ef           ldi     r28, 0xFF       ; 255
  6e:   d8 e0           ldi     r29, 0x08       ; 8
  70:   de bf           out     0x3e, r29       ; 62
  72:   cd bf           out     0x3d, r28       ; 61
  74:   0e 94 40 00     call    0x80    ; 0x80 <main>
  78:   0c 94 41 00     jmp     0x82    ; 0x82 <_exit>

0000007c <__bad_interrupt>:
  7c:   0c 94 00 00     jmp     0       ; 0x0 <__vectors>

00000080 <main>:
  80:   ff cf           rjmp    .-2             ; 0x80 <main>

00000082 <_exit>:
  82:   f8 94           cli

00000084 <__stop_program>:
  84:   ff cf           rjmp    .-2             ; 0x84 <__stop_program>
This is the raw assembly and machine code for the program we just produced. To really understand what is going on here, you'll probably have to read the datasheet for the Atmel ATmega328, which is the chip this is compiled for, but briefly here are the major parts:
- It all starts at address 0 with the interrupt vector table. When the processor receives an interrupt (which I'll perhaps cover a bit more in the future), it uses the interrupt number to jump to the appropriate handler. The ATmega328 has 26 separate interrupt vectors, each of which holds a jmp instruction telling it where to go when it receives that type of interrupt. The first of these is the reset vector: when the program starts, it starts at address 0, which contains a jmp to the location __ctors_end.
- __ctors_end is responsible for doing all the necessary setup. In this case, it doesn't do much. It clears the contents of register r1 (eor r1, r1 uses exclusive or to make sure r1 contains zero). It then writes that zero to SREG, the status register, at I/O address 0x3f.
- The four instructions starting at 0x6c set up the stack pointer to point to the end of RAM. For the ATmega328P, the end of RAM is at address 0x8FF, and the out instructions write that into the SPH and SPL special registers.
- It then issues a call to address 0x80, which is main. The instruction there is just a relative jump (rjmp .-2) to the address 2 bytes before the current PC location. When it executes, the PC has already advanced past it, so this resets the PC to execute the same instruction over and over. In other words, it implements the infinite loop that I coded in main().
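The arithmetic in the last two bullets can be checked on an ordinary PC with a few lines of C. This is only a sketch for following along, not something that goes into the firmware, and the function names (sp_high, sp_low, rjmp_target) are made up for this example. It relies on the AVR rjmp encoding, 1100 kkkk kkkk kkkk, where k is a signed 12-bit offset counted in 16-bit words and the jump lands at PC + 1 + k. The bytes "ff cf" in the listing are stored low byte first, so the opcode is 0xCFFF.

```c
#include <stdint.h>

/* High and low bytes of a 16-bit stack address, i.e. the values the
   startup code loads into r29 and r28 before writing SPH and SPL. */
static uint8_t sp_high(uint16_t addr) { return (uint8_t)(addr >> 8); }
static uint8_t sp_low(uint16_t addr)  { return (uint8_t)(addr & 0xFF); }

/* Byte address that an rjmp located at byte_addr jumps to. */
static uint16_t rjmp_target(uint16_t byte_addr, uint16_t opcode)
{
    int32_t k = opcode & 0x0FFF;      /* 12-bit offset field */
    if (k & 0x0800)
        k -= 0x1000;                  /* sign-extend to a negative value */
    int32_t word_pc = byte_addr / 2;  /* the PC counts 16-bit words */
    return (uint16_t)((word_pc + 1 + k) * 2);
}
```

Feeding it 0x08FF gives the 0x08 and 0xFF seen in the two ldi instructions, and rjmp_target(0x80, 0xCFFF) comes back as 0x80: the instruction jumps to itself.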
One thing that you can see from this is that most of the storage used by this program (104 out of the 134 bytes) is actually the interrupt vector table. If we don't need the vector table (we never enable most, or perhaps any, of the available interrupts), we could potentially recover all that space. We might have to resort to such extreme measures in the future. It depends on how far we get before we run out of space. But there are all sorts of games we can play.
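To make the accounting explicit, here is the same sum as a small host-side C sketch. The constants are read straight off the listing and the size report above; the names are made up for this example, and budget_permille simply expresses a byte count in thousandths of the 1024-byte limit.

```c
#include <stdint.h>

enum {
    VECTOR_COUNT  = 26,     /* jmp entries in the table, reset included */
    BYTES_PER_JMP = 4,      /* each jmp is a 4-byte instruction */
    PROGRAM_BYTES = 134,    /* size reported for the empty main() */
    BUDGET_BYTES  = 1024    /* the 1K limit */
};

/* Flash taken up by the vector table alone. */
static uint16_t vector_table_bytes(void)
{
    return VECTOR_COUNT * BYTES_PER_JMP;
}

/* What is left of the 134 bytes: startup code, main and the exit stubs. */
static uint16_t startup_and_main_bytes(void)
{
    return PROGRAM_BYTES - vector_table_bytes();
}

/* Share of the 1K budget, in thousandths, rounded down. */
static uint16_t budget_permille(uint16_t bytes)
{
    return (uint16_t)((uint32_t)bytes * 1000u / BUDGET_BYTES);
}
```

The table comes to 104 bytes, which matches the address 0x68 where __ctors_end begins, and leaves only 30 bytes for everything else. On its own the table eats about a tenth of the 1K budget.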
For more hints about the sort of thing we will be doing, you can look at this article by Nerd Ralph.