20241119 - Setting Up SDCC for the Z-80

As mentioned before, I am attempting to make a custom ROM for the Cefucom PCU to help in reversing the hardware. I have no idea whether the folks that have the actual unit in their possession would be game for such, but I'm continuing on as a mental exercise, and maybe, who knows, they might take me up on it.

I also mentioned that I am going to make my life harder by trying to bring up a C-based environment rather than simply cranking out some bespoke assembler like a professional would do in the real world. To do that, I am going to use SDCC [https://sdcc.sourceforge.net/]. This project has been around for a very long time and realizes a pretty rich C toolchain for old-school CPUs, including the Z-80. It is a labor of love, however, and as such there are some things that can be frustrating. E.g. there is pretty rich documentation, but it often does not explain what I am trying to find out or seems a little out of date. But hey that's what you get for 'free, and uncompensated work'. So we have to soldier on...

If you're familiar with Z80, you'll know that reset is always at address 0000h, that there are special locations for 'restarts' on 8-byte boundaries, that there is location 0066h which is the NMI handler, that IM1 uses rst38h, and that IM2 uses a vector table you put in memory somewhere on a 256-byte boundary. So your code has to honor that stuff. Also you have to do the usual things of informing the tools of your memory map.
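To keep those fixed locations straight, here is a small host-side C sketch of the addresses just listed. The enum and function names are made up for illustration; only the addresses themselves come from the Z80.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed entry points defined by the Z80 itself (not by SDCC). */
enum {
    Z80_RESET_VECTOR = 0x0000, /* execution starts here after reset      */
    Z80_IM1_VECTOR   = 0x0038, /* IM1 interrupts behave like RST 38h     */
    Z80_NMI_VECTOR   = 0x0066  /* non-maskable interrupt entry point     */
};

/* RST targets sit on 8-byte boundaries: restart n (0..7) lands at n * 8. */
static uint16_t rst_target(unsigned n)
{
    assert(n < 8u);
    return (uint16_t)(n * 8u);
}

/* Checks the 256-byte boundary rule for an IM2 vector table base. */
static int im2_table_base_ok(uint16_t base)
{
    return (base & 0x00FFu) == 0u;
}
```

For example, rst_target(1) gives 0008h, which is where the rst8 stub later in this log has to land, and rst_target(7) gives the same 0038h that IM1 uses.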

I found the process to be quirky, but long story short I did get something working correctly. As to whether this is the orthodox way, I do not know, because I did not see a lot of tribal wisdom during my searches. So I want to document my findings for posterity. (Which is possibly just me, and that's fine, too.)

The tools are apparently meant to be run by invoking 'sdcc', which seems to be a front end that will select the actual compiler and invoke the tools in some ways. This works, but there are caveats:
There is a default target processor of 8051 and if you forget to specify, say, -mz80, you will encounter the sadness of inscrutable error messages about pdata sections etc. I'd rather there be no default, but I'm sure that's legacy behaviour because this project started as an 8051 toolchain before it grew legs.
Another caveat is that sdcc will compile one source file only. So if you need more than one (what project doesn't) then you will need to use -c to 'compile only' and then a separate invocation to link the object files, which are named *.REL. And you will need to specify the target processor again lest you get the inscrutable error messages for a different cpu.

If you compile a single source file as described in the docs:
sdcc -mz80 main.c

the tool will compile that, generate a 'linker script' of sorts, and link with a pre-built crt0.rel and standard library. It will make all sorts of assumptions about your platform, like that you have 32 KiB RAM and 32 KiB ROM, have no interrupts, etc. Maybe you do, maybe you don't.

The tool does emit assembler, but not of the final linked binary. So I disassembled that myself as I went along to truly know what was going on, despite it being a chore. But I'm glad I did because it was clear that out-of-box the tool definitely did not place things where I needed them to be. I'm going to save you the story of the journey, and just provide my end findings.

You will almost surely want to customize crt0.s. So grab a local copy out of the installed toolchain and hack that.
The default crt0.s puts stub ISRs at all the RSTs. I find this vexing. So I commented all that out.
You will also need to set the stack pointer. The default crt0.s does this right away, with a hard-coded value of 0000h. Maybe you want it somewhere else. Maybe there's no RAM at top of memory. E.g. I am modelling an existing system with RAM at 8000h-9fffh. So you will need to fix that. It is not clear if there is a way to do that symbolically rather than hard-coded. Oh, btw, this particular assembler uses the syntax #0xNNNN for constants. That took me a while.
There's a 'clock' and an 'exit' label that load A and do a RST 8. I don't know what that is or why I would want it in my program by default. I suspect it has to do with the companion 'simulator' tool, but unsure. So I commented the rst's out (but left the terminal halt loop; maybe you'd choose to rst 0).
There is init code for the initialized and uninitialized data, but it uses symbols that I guess are magically set in sdcc somehow because the assembler complains about undefined symbols. So you will need to add at the top:
```
    .globl    ___sdcc_external_startup
    .globl    l__DATA
    .globl    s__DATA
    .globl    l__INITIALIZER
    .globl    s__INITIALIZED
    .globl    s__INITIALIZER
```

___sdcc_external_startup is a function that indicates that variable init should not be done, that it will be done 'externally'. The default implementation simply returns 'false', so init is always done.

The other symbols appear to be generated by the linker, and I infer that the 's__' means 'start' and 'l__' means length. You only have to declare these, you do not have to set them.
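To make the roles of those symbols concrete, here is a host-side C model of what I understand the init code in crt0.s to do with them: clear the uninitialized data, then copy the initializer image to where the initialized variables live. This is my reading rather than a statement of what your copy of crt0.s contains, so check it against the assembler. The function name and the flat 64 KiB array are assumptions for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side model of the startup data init. 'mem' stands in for the whole
   64 KiB address space. Parameter names mirror the linker symbols:
   s_ = start address, l_ = length in bytes. */
static void model_gsinit(uint8_t *mem,
                         uint16_t s_data, uint16_t l_data,
                         uint16_t s_initializer, uint16_t s_initialized,
                         uint16_t l_initializer)
{
    /* Everything must fit inside the 64 KiB address space. */
    assert((uint32_t)s_data + l_data <= 0x10000u);
    assert((uint32_t)s_initializer + l_initializer <= 0x10000u);
    assert((uint32_t)s_initialized + l_initializer <= 0x10000u);

    /* Uninitialized variables: zero l__DATA bytes starting at s__DATA. */
    memset(mem + s_data, 0, l_data);

    /* Initialized variables: copy l__INITIALIZER bytes from the image at
       s__INITIALIZER (in ROM) to their home at s__INITIALIZED (in RAM). */
    memcpy(mem + s_initialized, mem + s_initializer, l_initializer);
}
```

Note that only one length symbol is needed for the copy because the image and its destination are the same size.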

A quirk seems to be the absence of 'weak' functions, but in lieu of such you apparently can blithely over stamp code simply by placing some on top of it later. I think this is how you override the default ___sdcc_external_startup -- just declare your own later in the linking order. I don't need this, so I didn't explore much more.

That should get you a generally usable crt0.s (with the exception of having to customise the stack pointer per-project).

Next was about placing functions at specific addresses. There is not a mechanism like an 'attribute' that can be applied to a function to do this (there is for data, though; more on that in a moment). But there is a pragma:
#pragma codeseg

The thing is that pragma will affect the entire source file. Not just the things that follow. I know this because I tried to use it for my RST 8, just before the function's definition, and the effect was that all the code in that module started there. main() and everything. My actual rst was somewhere far later, lol. So if you need to place your function at a specific address, then you'll need to put that definition in its own source file, and have that definition be the first bit of code. At least as far as I know. So, I can do a RST 8 like this with a separate file:
rst8.c:

#pragma codeseg RST8
void rst8 (void) {
}

This will now give you linking problems because it doesn't know where _RST8 is. So you will need to define the resulting symbol "_RST8" to the linker via its 'linker script'. The linker script is really just a list of command line options. The pertinent one here is 'base':
-b _RST8 = 0x0008

Now if you check your disassembly you'll see your rst8 stub correctly located, and everything else where expected (namely, crt0 at 0x100, and your code (main) at 0x200). It's a bit of a hassle to require a separate source file just to get the base address applied correctly, but hey, the price is right, yes?

Now for interrupts. There is a little bit of quirkiness in the syntax which I'm sure is that way for legacy reasons of another CPU, but the form looks like this:

void isrCTCa0(void) __critical __interrupt(0);

The '__critical' keyword is optional, and in Z80 world means 'prohibit nesting of interrupts by not generating an EI at the start of the generated code'. In the Z-80 world you generally do not nest unless you are in IM2, so you probably need __critical in most cases. '__interrupt(n)' is required for maskable interrupts, and means "end the function with an 'ei, reti' pair instead of the usual 'ret'". The numeric parameter has no material effect for Z-80, but it is required and must be globally unique and can take on the value 0-255. I used 0-5 for mine and 255 for a do-nothing stub implementation.

Other than that the ISR is an ordinary function that can wind up anywhere in code space. This is OK for an IM2 handler. If you are doing IM1, you probably want to have something at 38h specifically, even if just a thunk, so you're going to have to use a separate file just as with the rst 8. Likewise, if you're going to have an NMI handler then you are required to be at 66h.

So I already described how to get the code to be at 66h by creating a separate source file nmi.c with its implementation:

#pragma codeseg NMI

and add into the linker 'script':
-b _NMI = 0x0066

But you also have to use a syntactic hack and omit the interrupt number from the signature:

void isrNMI (void) __critical __interrupt;

By omitting the interrupt number you are semantically telling the compiler that this is an NMI, and it will generate a final 'RETN' instead of the usual 'ei, reti' pair. Quirky!
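Here is a sketch that puts the two ISR forms side by side. The ISR_MASKABLE and ISR_NMI macros, the counter, and the flag are names I invented for illustration; the macros expand to nothing outside SDCC so the same file can be built with a host compiler for off-target checks. It relies on SDCC predefining __SDCC. On the target the NMI handler would still live in its own file with the codeseg pragma, as described above.

```c
/* Under SDCC these become the real qualifiers; elsewhere they vanish. */
#ifdef __SDCC
#define ISR_MASKABLE(n) __critical __interrupt(n)  /* returns via ei, reti */
#define ISR_NMI         __critical __interrupt     /* no number: returns via retn */
#else
#define ISR_MASKABLE(n)
#define ISR_NMI
#endif

static volatile unsigned char g_ctcTicks;  /* hypothetical tick counter  */
static volatile unsigned char g_nmiSeen;   /* hypothetical 'NMI hit' flag */

/* Maskable interrupt handler: the number is required but has no effect. */
void isrCTCa0(void) ISR_MASKABLE(0)
{
    ++g_ctcTicks;
}

/* NMI handler: same qualifiers, but with the interrupt number left off. */
void isrNMI(void) ISR_NMI
{
    g_nmiSeen = 1;
}
```

The only difference between the two on the target is the missing number, which is easy to overlook, so hiding it behind a named macro may be worth the extra lines.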

At this point it might be useful to show my linker 'script'. The file was generated when I first ran sdcc on a trivial source file with main(), and then I customised it. As you can see, it's less of a script and more of a list of command line options passed to the linker.

-mjwx
-i diagnosticrom001.ihx
-b _CODE = 0x0200
-b _DATA = 0x8000
-b _RST8 = 0x0008
-b _RST10 = 0x0010
-b _RST18 = 0x0018
-b _RST20 = 0x0020
-b _RST28 = 0x0028
-b _RST30 = 0x0030
-b _RST38 = 0x0038
-b _NMI = 0x0066
-k C:\Program Files (x86)\SDCC\bin\..\lib\z80
-l z80
crt0.rel
main.rel
nmi.rel
rst8.rel

-e

Notable are the various '-b' options and the list of object files near the bottom. It is documented that order is important, with crt0.rel coming first, and then whatever.

OK, that's enough for an IM0 or IM1 system, but this one is IM2 so we have to set up an interrupt vector table. This isn't as bad, because data does have a means of declaring placement. It uses the '__at' qualifier. E.g. in my case:

void* __at(0x9e00) g_im2Vectors[16] = {
    isrCTCa0,           //00 CTCa-0
    isrCTCa1,           //02 CTCa-1
...
};

So that wasn't as bad as with placing functions. (A pity __at() does not work there. You know I tried.) What is a little annoying is that there doesn't seem to be 'math' capabilities on these numbers, so like the setting up of the stack pointer in crt0.s, you will need to be conscientious about updating these spots if you change the memory map as your design evolves. So maybe try to do that up front and leave a little comment to your future self.
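One way to leave that note to your future self is a small header that records how the hard-coded numbers relate, with compile-time checks. This is only a sketch: the macro and function names are made up, it assumes the stack starts at the very top of RAM, and it does not remove the literals in crt0.s or in the __at(), since those still have to be edited by hand to match.

```c
#include <stdint.h>

/* Memory map for the system described here; names are illustrative only. */
#define RAM_BASE        0x8000u
#define RAM_SIZE        0x2000u                 /* 8000h-9FFFh              */
#define STACK_TOP       (RAM_BASE + RAM_SIZE)   /* value for 'ld sp' in crt0.s */
#define IM2_TABLE_ADDR  0x9E00u                 /* must match __at(0x9e00)  */
#define IM2_ENTRIES     16u                     /* two bytes per entry      */
#define IM2_I_VALUE     (IM2_TABLE_ADDR >> 8)   /* high byte, goes in I     */

_Static_assert((IM2_TABLE_ADDR & 0xFFu) == 0u,
               "IM2 table must sit on a 256-byte boundary");
_Static_assert(IM2_TABLE_ADDR >= RAM_BASE &&
               IM2_TABLE_ADDR + IM2_ENTRIES * 2u <= STACK_TOP,
               "IM2 table must lie inside RAM");

/* Address the CPU reads the handler pointer from in IM2: the I register
   supplies the high byte, the interrupting device supplies the low byte. */
static uint16_t im2_entry_address(uint8_t i_reg, uint8_t device_vector)
{
    return (uint16_t)(((uint16_t)i_reg << 8) | device_vector);
}
```

With these numbers STACK_TOP works out to A000h and IM2_I_VALUE to 9Eh, so a device vector byte of 02h selects the second table entry at 9E02h.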

Which gets us to the last one: loading the I register. In the Z-80 the high byte of the interrupt vector table's base address is held in the I register, and so would be 0x9e in this case. We'll need some inline assembler for that. There is an 'old' syntax and a 'new' syntax and I suggest using 'old' because it is block oriented. (The 'new' syntax is apparently for consistency with some other tools' conventions, but is not block oriented.) So in main(), after the hardware is set up, and we're ready to shift the system into 'drive', a code block:

    //need inline assembler to setup im2
     __asm
        ld      a, #0x9e    ; hibyte of address of IM2 vector table
        ld      i, a
        im      2
        ei                  ; away we go
     __endasm;

The last thing you'll want to know for Z-80 is about port I/O addressing. This is well-documented, so I'll be brief: you declare a location as a 'special function register'. E.g.:

__sfr __at(0x00) ioCTCa_0;

Now you can simply assign to and from that spot and the compiler will emit the requisite 'in' and 'out' instructions.
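A minimal usage sketch of that, with two hypothetical helper functions. Outside SDCC the port is replaced by an ordinary variable so the file still compiles on a host; that stand-in simply remembers the last value written, which is not how a real CTC channel behaves when read.

```c
#ifdef __SDCC
__sfr __at(0x00) ioCTCa_0;                 /* I/O port 00h on the target   */
#else
static volatile unsigned char ioCTCa_0;    /* host stand-in, plain memory  */
#endif

/* Hypothetical helper: on the target the assignment becomes an 'out'. */
void ctc0_write(unsigned char value)
{
    ioCTCa_0 = value;
}

/* Hypothetical helper: on the target the read becomes an 'in'. */
unsigned char ctc0_read(void)
{
    return ioCTCa_0;
}
```

Wrapping the port in small functions is optional; plain assignments to ioCTCa_0 work just as well.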

After doing all that you will have a baseline to start coding in C and be less exposed to Z-80 particulars. Bear in mind that you'll need to go through a similar process of customising crt0.s for other runtime scenarios, such as running in the context of an OS, as an extension, as an overlay, etc. Maybe they don't have interrupts, maybe they don't fiddle with the stack, maybe..., etc.

Some final comments:

build tools
I wound up invoking the assembler and linker directly so that I could control what they were doing. The compiler seems to be embedded in the sdcc front end application, so you still have to use that there with the -c option to just compile.
I did this via scripts 'assemble', 'compile', 'link' because I didn't want to add 'make' to my problems just now, but it can surely be used for greater sophistication.
calling conventions
This is C, and locals are on the stack, and there is a frame pointer via the index registers. The calling convention is documented, so you will want to take a peek at that. You can specify to omit the frame pointer if you have other designs on IX and IY. Also, the alternate register set is not used at all, so you have that at your disposal (and responsibility).
initialized variables
The crt0.s does the stuff expected by C for initializing storage. However what I found interesting in SDCC is that it will also initialize data that is not in the default data segment. That is unique in my experience -- on other toolchains you have to do that yourself. What is also distinctive is that it does this not by doing a 'memcpy()' from a chunk of constant, but by emitting code that loads registers and stores values. Know that this can get big, so you might choose to do that yourself.
You can download the SDCC project source to look into the stdlib implementation, and you might need to. (Or at least peruse it in the repo via web.) For example, there is a malloc() and free(), but I have no idea how they work or are placed in memory. So that's worth checking out because usually I am not happy with any given toolchain's choices for heap. If you need a heap.

The generated code quality is OK, doubtlessly not better because there is no whole-program optimisation during link. So you'll find fun things like loading a value of zero into a register and then immediately testing whether it is zero to make a conditional jump. It's like that because the constant that happened to be zero was not known at code generation time -- it was set at link time. And the same truth applies to the non-zero case, so really the test and conditional jump were never needed. If only it knew... So the result is probably going to be a bit fluffy and you may run into resource shortages sooner than if you hand-assembled.

But hey, coding in C certainly beats coding in assembler if you prioritize your development time over generated code quality (and your sanity over juggling registers). Plus you get multiplication, division, floating point, trig if you really need it, and sprintf, etc. So I'm glad I put forth the effort to scope it out and ramp up, and I am very grateful for the tool's existence.

Discussions

Ken Yap wrote 11/20/2024 at 21:51

Yep, you've discovered pretty much all there is to know about using SDCC for embedded development. If you think that's complicated, just wait until you see what happens in the backend scripts when building an embedded object using a gcc based toolchain for recent, more high-level-language-friendly microprocessors. The complexity is due to the wide range of demands embedded development creates, compared to a standardised environment for an executable program on a normal OS.

ziggurat29 wrote 11/21/2024 at 02:22

lol; yes, I've had to delve into the inscrutability of gcc link scripts from time to time. I do wish SDCC had the '__at()' capability for functions, but you get what you pay for, and I'm glad I've got anything at all.

Ken Yap wrote 11/21/2024 at 02:55

I can see why you might want that if you're calling routines in pre-existing ROM. You might be able to do it by declaring an extern symbol in an ABS segment in asm (and instantiating the function in C if you want to implement it). But I've never needed to do something like that.

The asz80 assembler in SDCC is a modified version of the multi-target asxxxx assembler by Alan Baldwin so documentation for that can be found by searching for that software's site.
