kiloboot

1kB TFTP Ethernet bootloader for ATmega328P and ENC28J60

For microcontroller projects that use ethernet, this bootloader will allow you to push firmware updates over the internet. It's written for the most popular ATmega chip and the cheapest ethernet breakout board. It uses the standard TFTP protocol. The IP addresses can either be hard-coded into the bootloader, or retrieved from the EEPROM. The whole thing fits into the 512-word boot section of the ATmega328P.

Original writeup: https://mitxela.com/projects/kiloboot

Source code: https://github.com/mitxela/kiloboot

You will need an ATmega328P and an ENC28J60, wired up as follows:

PB1 <--> INT
PB2 <--> CS
PB3 <--> MOSI
PB4 <--> MISO
PB5 <--> SCK

I've only tested this with the ATmega chip running at 16MHz. Running faster than that, you may need to alter the wait routine, and possibly the SPI prescaler. If anyone wants help porting this to another chip, let me know.

I've included a compiled hex file which, if you're planning to use the EEPROM settings, may be usable without changing the config.

By default, the bootloader will try to load the settings from the EEPROM, but if it's empty, it will use the hard-coded backup settings. The first few lines of the .asm file are the configuration. You need to give it a MAC address, an IP address, the addresses of the TFTP server and default gateway, and the subnet mask. If you want it to have hard-coded settings, and you're using this on a home router with NAT, it's easiest to check the router settings to see what the DHCP pool is, and give the device an IP outside of that range.

You can optionally change the filename that it requests, which can be up to 31 characters long. You can also set the number of reattempts if it can't immediately contact the server. Just after powering on, it may take a few seconds for the ethernet link to establish, so the first few packets may be lost. But too many reattempts means a longer delay before giving up and starting the old application if the server is actually unavailable.
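For reference, the request the bootloader sends is tiny. The sketch below builds a TFTP read request (RRQ) as laid out in RFC 1350: a two-byte opcode, the filename, a zero byte, the transfer mode, and a final zero byte. The function name `build_rrq`, the "octet" mode and the filename "program.bin" are assumptions for illustration, not taken from the kiloboot source.

```python
import struct

TFTP_OPCODE_RRQ = 1  # read request opcode from RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, 0, mode, 0."""
    name = filename.encode("ascii")
    # The 31-character limit mirrors the limit described above.
    if len(name) > 31:
        raise ValueError("filename must be at most 31 characters")
    return (
        struct.pack("!H", TFTP_OPCODE_RRQ)  # 16-bit opcode, network byte order
        + name + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


packet = build_rrq("program.bin")
print(len(packet), packet)
```

A request like this is sent over UDP to port 69 on the server, which is why a longer filename costs nothing but a few bytes of program memory.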

To use the EEPROM settings, simply fill in the first 16 bytes of the EEPROM with the device IP, server IP, default gateway IP and subnet mask, in that order. You could do this by creating a binary file in a hex editor and writing it to the EEPROM using avrdude, e.g. -U eeprom:w:"IPs.bin":r, with the 'r' type for raw binary. You could also add an .eseg section to the assembly file, which should produce an .eep file in hex format. But the main anticipated method of setting the EEPROM addresses is writing to them from the application itself, for instance if it's running a DHCP client.
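If you'd rather not use a hex editor, a short script can produce the 16-byte file. This is a minimal sketch: the function names and example addresses are made up, and it assumes each address is stored with its first octet first (so 192.168.1.50 becomes C0 A8 01 32), which is worth checking against the source before relying on it.

```python
import ipaddress


def build_eeprom_image(device_ip, server_ip, gateway_ip, subnet_mask) -> bytes:
    """Pack the four IPv4 settings into 16 bytes, in the order listed above."""
    fields = (device_ip, server_ip, gateway_ip, subnet_mask)
    # .packed gives the 4 bytes of each address, first octet first (assumed order).
    return b"".join(ipaddress.IPv4Address(field).packed for field in fields)


def write_eeprom_image(path, device_ip, server_ip, gateway_ip, subnet_mask):
    """Write the raw image to a file suitable for avrdude's 'r' format."""
    with open(path, "wb") as handle:
        handle.write(build_eeprom_image(device_ip, server_ip, gateway_ip, subnet_mask))


image = build_eeprom_image("192.168.1.50", "192.168.1.10", "192.168.1.1", "255.255.255.0")
print(image.hex(" "))
```

Calling write_eeprom_image("IPs.bin", ...) with your own addresses gives a file that can be passed to the avrdude option shown above.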

Beware: if your application uses the EEPROM for something else, the bootloader has no way of knowing whether the data there is valid; it only checks that it's not erased. If you don't want to use the EEPROM for these settings, I would recommend disabling that feature.

If you've never used AVR assembler before, I think the easiest way to build is to use AVR Studio (or Atmel Studio, as the later versions are called). You can simply select new project, 8-bit AVR assembly. The single .asm file is all you need; there are no dependencies. You can also assemble from the command line with something like

avrasm2 -fI -o "kiloboot.hex" kiloboot.asm

You then need to burn this hex file onto the chip and set the correct fuses. The important one is the high fuse, which should be set to 0xDC with the avrdude option -U hfuse:w:0xdc:m. This moves the reset vector to the boot section and sets the boot section size to 512 words.
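To see where 0xDC comes from, the sketch below decodes the high fuse byte using the bit layout and boot-size table from the ATmega328P datasheet. Remember that a fuse bit counts as "programmed" when it reads 0. The function and dictionary key names are just illustrative.

```python
# ATmega328P high fuse bit names, from bit 7 down to bit 0.
HFUSE_BITS = ["RSTDISBL", "DWEN", "SPIEN", "WDTON",
              "EESAVE", "BOOTSZ1", "BOOTSZ0", "BOOTRST"]

# Boot section size in words for each BOOTSZ1:BOOTSZ0 value on the ATmega328P.
BOOT_WORDS = {0b11: 256, 0b10: 512, 0b01: 1024, 0b00: 2048}

FLASH_WORDS = 16 * 1024  # 32kB of flash, addressed as 16-bit words


def decode_hfuse(value: int) -> dict:
    """Decode an ATmega328P high fuse byte into its settings."""
    programmed = {
        name: ((value >> (7 - index)) & 1) == 0  # 0 means programmed
        for index, name in enumerate(HFUSE_BITS)
    }
    boot_words = BOOT_WORDS[(value >> 1) & 0b11]
    return {
        "programmed": programmed,
        "boot_words": boot_words,
        "boot_start_word": FLASH_WORDS - boot_words,
        "reset_to_bootloader": programmed["BOOTRST"],
    }


info = decode_hfuse(0xDC)
print(info["boot_words"], hex(info["boot_start_word"]), info["reset_to_bootloader"])
```

For 0xDC this gives a 512-word boot section starting at word address 0x3E00, with the reset vector pointing at it.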

Next you need to set up a TFTP server. For Windows I recommend tftpd32; for Linux I used tftpd-hpa. It's straightforward if you only want to boot from the local network, but making the server publicly accessible is not trivial. See my description here.

The server needs to host a binary image of your application, not an Intel HEX file. Your toolchain will usually have an option to output a binary file, but if you've already produced a hex file, you can convert it using objcopy:

avr-objcopy.exe -I ihex program.hex -O binary program.bin
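If objcopy isn't to hand, the conversion is simple enough to sketch by hand. The function below is a rough illustration, not a replacement for objcopy: it parses Intel HEX data records, verifies each checksum, and lays the bytes out starting from the lowest address. Filling gaps with 0xFF (the erased-flash value) is a choice made for this sketch, and the function name and sample record are made up.

```python
def ihex_to_binary(text: str, fill: int = 0xFF) -> bytes:
    """Convert Intel HEX text to a flat binary image."""
    memory = {}
    base = 0  # upper address bits set by extended address records
    for line_no, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {line_no}: missing ':' start code")
        record = bytes.fromhex(line[1:])
        if len(record) < 5 or len(record) != record[0] + 5:
            raise ValueError(f"line {line_no}: bad record length")
        if sum(record) & 0xFF != 0:
            raise ValueError(f"line {line_no}: bad checksum")
        address = int.from_bytes(record[1:3], "big")
        record_type = record[3]
        data = record[4:-1]
        if record_type == 0x00:    # data record
            for offset, byte in enumerate(data):
                memory[base + address + offset] = byte
        elif record_type == 0x01:  # end of file
            break
        elif record_type == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif record_type == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        # start-address records (types 03 and 05) carry no data and are ignored
    if not memory:
        return b""
    return bytes(memory.get(a, fill) for a in range(min(memory), max(memory) + 1))


sample = ":040000000C94340028\n:00000001FF\n"
print(ihex_to_binary(sample).hex())  # prints 0c943400
```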

The last consideration is triggering the bootloader. This is up to your application to manage. The bootloader will run once on powerup, and after each reset, but not while the application is running. The two obvious methods would be to either run the bootloader on a timer, or have some user-driven aspect...

  • 1 × ATmega328P Microprocessors, Microcontrollers, DSPs / ARM, RISC-Based Microcontrollers
  • 1 × ENC28J60 Development Kits, Boards and Systems / Adapters, Adapter Boards and Sockets

  • 1024: a space-saving odyssey

    mitxela, 01/02/2017 at 16:23

    The writeup for this project is very long and detailed, and doesn't really fit the hackaday.io log format. I started out just wanting to learn about networking (it's fun!) and diverted into the bootloader project after a bit. So if you want to follow the journey starting from the very basics of internetworking, see the page on mitxela.com.

    The relevant part for this, as an entry to the 1kB contest, is the optimization. It's all written in AVR assembly, but even so I went to great lengths to squeeze everything into such a small space. I'm used to optimizing for speed, but optimizing for program memory requires a different mindset.

    Just as a comparison, the smallest TFTP bootloader for this platform I could find was 8kB. There are other, more expensive ethernet boards that implement a lot more of the IP stack and can make things a lot easier, but the ENC28J60 is only the PHY and some help with the data-link layer (it has a hardware CRC calculator). I deliberately chose this chip because it's cheap and popular, and because my original goal was to learn about networking, not have everything done for me. There was an interesting bootloader by kehribar that uses UDP broadcast messages, which greatly simplifies the process: you don't even need to give the device an IP. But that will only work on the local network, using a proprietary protocol. Using TFTP, the file can be anywhere on the internet, maybe the other side of the planet, which may or may not be important, but it has a certain charm to it.

    Many of the optimizations are quite general, even obvious. Others, not so. But I made sure that I wasn't optimizing to the point of obfuscation, and I hope the result is easy enough to understand (if you're familiar with assembly, anyway).

    Some highlights:


    The ethernet chip is controlled by writing to its registers over SPI. Many of these registers are 16-bit, so they have a "low" and a "high" byte that need accessing in separate operations. All of these 16-bit registers are contiguous in memory (the high byte's address is exactly one more than the low byte's), so we can write an automated subroutine for them.

    In assembly, the concept of a subroutine is far more relaxed than in something like C. The Program Counter is the address in memory the processor is reading from, our "current location" in the code. All that the call instruction does is push the return address (the Program Counter of the next instruction) onto the stack and jump to a new address. All that the ret (return) instruction does is pop two bytes from the stack into the Program Counter. When you understand that, you can be a lot more relaxed about what's going on. For instance, to do a read from SPI, we need to output a zero on the MOSI line. Rather than load zero right before every call, I placed a single instruction directly before the doSPI routine, which loads zero, and labelled it doSPIzero.

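    doSPIzero:
      clr r16           ; r16 = 0, then fall through into doSPI
    
    doSPI:
      out SPDR,r16      ; start the transfer by writing the byte to send
    SPIwait:
      in r16, SPSR      ; poll the SPI status register
      sbrs r16, SPIF    ; skip the jump once the transfer-complete flag is set
      rjmp SPIwait
      in r16, SPDR      ; return the received byte in r16
      ret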
    

    Most of the time we'll call doSPI, having set the data to be output, but if we want it to output zero, we call doSPIzero and the program counter falls through into the doSPI function, since there isn't a ret. This type of overlapping routine optimization is quite easy and saves a lot of program space. So, for the enc28j60writeWord routine, we do something a little more complex:

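    ; r16 = op | address (L register)
    ; r17 = dataL
    ; r18 = dataH
    enc28j60writeWord:
      push r16              ; doSPI overwrites r16, so save op | address
      rcall enc28j60write   ; write the low byte
      pop r16
      inc r16               ; the high register is at address + 1
      mov r17, r18          ; high byte becomes the data, then fall through
    
    ; r16 = op | address
    ; r17 = data
    enc28j60write:
      cbi PORTB,PB2         ; CS low: select the ENC28J60
      rcall doSPI           ; send op | address
      mov r16,r17
      rcall doSPI           ; send the data byte
      sbi PORTB,PB2         ; CS high: deselect
      ret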
    

    enc28j60write can be called to write one byte to a register. For enc28j60writeWord, we want to call enc28j60write twice, with different arguments. So the first time, we call it within the enc28j60writeWord routine; it does the deed, then its ret sends us back to enc28j60writeWord. We switch over the arguments, and then this time, fall through into the enc28j60write routine,...

Discussions

Mark Atherton wrote 03/23/2021 at 18:43:

This quite simply, is exactly what it is all about. Wonderful project, well done.

Hyr0n wrote 01/07/2017 at 03:25:

I think this is really innovative and cool, but why TFTP? It's a very insecure protocol, everything including the username and password is sent in the clear. Why not use OpenSSH and tunnel FTP securely?

mitxela wrote 01/07/2017 at 12:23:

Because TFTP was designed for network booting... Because many real-world applications will only be booting from the local network... But mostly because of the 1kB size limit.

I may have underplayed just how tight a squeeze it was. Even the driver for the ENC28J60, before we can start sending packets, is most of a kilobyte. While googling I found a few similar TFTP bootloaders that people were struggling to fit into 8kB (the biggest boot sector on the atmega chip). Writing the whole thing in assembly, and recursively optimizing for space, meant I could _just_ manage the 1kB limit.

TFTP is not FTP, and makes no pretence of security. There is no username or password, just a single request to read a file from a server.

OpenSSH? The keys alone are hundreds of bytes. It would be an impressive feat even to fit it into the full 32kB of application space.
