-
20241114 - Virtual Machine Summary
Yesterday I mentioned this apparent 'virtual machine' embedded in the Cefucom product, and I spent some time figuring out how it works. The machine's workings are pretty clear now, and I have done 66 of the opcodes (10 more to figure out). Here's the scoop so far:
Overview
* basic unit of execution is a 'block'
* block structure is:
* opcode
* parameters... Number of parameters is opcode-specific. So a block is between 1 and 7 bytes.
* a 'program' is a series of blocks
* rst 8 is the 'primitive block executive'. It is not typically used directly.
* rst 10 is the 'sequenced block executive', and is the primary way of executing 'programs'
* is BIG-ENDIAN
* has absolute addresses
* indices are 1-relative
* has various functional groups:
* load/store; 8 and 16 bit, references and constants
* memset/memcpy
* arithmetic; addition/subtraction, usually accumulator form (e.g. *param1 += *param2)
* bitwise; and, or, xor (note, no 'not', though I suppose you can synthesize that from xor)
* shift
* goto
* computed goto (well, 'indexed'; the computation would be done separately)
* if (and if-not)
* next
* 'usr' (call out to an assembly routine)
* 'run' (another program)
* some Cefucom-specific opcodes; probably added by the company
Other notable aspects
* The C register is used to store flags, where appropriate. 80h = carry, 1 = non-zero/no-carry, 0 = zero. The C register is actively preserved between block execution.
* A couple of instructions place the result in B or DE, but these are not preserved between blocks. (I need to look more into this when I find a program that invokes them; and maybe there are no instances of such.)
* the RST 10 implementation realizes a few more outside of the dispatch table:
* 7F - NOP
* 7E - exit on no-carry
* 7D - exit on carry
* 7C - exit on non-zero (or carry)
* 7B - exit on zero
* the implementation speculatively loads param1 into HL and param2 into DE, which is useful for most instructions, but some have fewer parameters (e.g. END). This is a potential out-of-bounds read, but will not cause problems on this hardware.
* there are two opcodes 62h and 63h which have little-endian parameters param1 and param2, while param3 is big-endian. I think this is a bug, though it might originate from a detail of their build process.
* opcode 43 is also little-endian, and runs a rst 10 program. This is interesting because opcode 3e already does the same thing with a big-endian program pointer, so someone must have felt a special need.
The remaining opcodes deal with Cefucom structures which I do not yet understand, so I might put those off for the moment in the interest of making progress.
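As a rough illustration of the RST 10 pseudo-opcodes (7B through 7F) and the flag values kept in the C register, here is a minimal Python sketch. The function name and the exact mapping of flag values to conditions are assumptions drawn from the list above, not something checked against the ROM.

```python
# Flag values kept in the C register, per the notes above.
FLAG_ZERO = 0x00
FLAG_NONZERO = 0x01   # also means "no carry"
FLAG_CARRY = 0x80

def pseudo_op_exits(opcode: int, flags: int) -> bool:
    """Return True if the sequencer would stop at this pseudo-opcode.

    Hypothetical helper: the conditions are one reading of the notes.
    """
    carry = bool(flags & FLAG_CARRY)
    if opcode == 0x7F:            # NOP
        return False
    if opcode == 0x7E:            # exit on no-carry
        return not carry
    if opcode == 0x7D:            # exit on carry
        return carry
    if opcode == 0x7C:            # exit on non-zero (or carry)
        return flags != FLAG_ZERO
    if opcode == 0x7B:            # exit on zero
        return flags == FLAG_ZERO
    raise ValueError("not one of the 7B..7F pseudo-opcodes")
```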
One vexing thing has been some functions that do some stack manipulation such that the execution is no longer linear. I figured out some of those, like the ones that have a dispatch table that follows a call. But there are some others that have more convoluted shenanigans. I marked those as 'witchcraft' so that I can find them more easily when I come back to them. Perhaps now is the time to look into this witchcraft.
-
20241113 - An Embedded P-Code-esque Virtual Machine?
As mentioned before there are a lot of tables, and even a couple of tables of tables. Some are lists of 'described text' (having a header indicating position), some are indexed dispatch tables, some are double-dispatch, and the others are presently unknown.
I had previously thought the table-of-tables form was a double-dispatch mechanism, but it's not. Rather it's a list
Often they are processed by RST 10, which delegates processing of the elements to RST 8. The elements are variably-sized blocks. They seem to have the structure:
    uint8_t fxn;
    uint16le_t param1;
    uint16le_t param2;
    ... payload;
(I am coining the term 'uint16le_t' because I have found many places that are big-endian! So elsewhere I use 'uint16be_t' for that.)
Often the blocks are short, like this:
    7817 62           byte_7817: db 62h
    7818 95 7A        dw unk_7A95
    781A 11 00        dw 11h
    781C 00           db 0
but others can be quite long. I originally thought param1 is a pointer, and param2 is a length, because the RST 8 code that processes them immediately loads param1 into HL and param2 into DE. Indeed sometimes those are used as pointers and lengths, but in other cases they are not. But for starters I looked at the ones at off_7803, which is a list of these things which have just one element in them, and seem to follow the (ptr,length) assumption. The trailing zero is interesting. All the ptrs and lengths worked out sanely when updating the disassembly.
These blocks are ultimately processed by RST 10. This is a block sequencer, feeding individual blocks to RST 8 (which expects the block pointer in IX). A null function code terminates processing, so that explains the trailing 'db 0' above. So there is no explicit payload length of a block; it depends on function code.
RST 8 dispatches servicing through dispatch_4002, which has 128(!) entries. I went through and labelled each of them like 'fxn00'. Many of them are apparently unimplemented as their slot directs to 'fxn00', which was found to be used as the block sequence terminator. It does have an implementation, though, which is to back up IX by 4. This is interesting because fxn00 will not be dispatched by RST 10, but it would be for the ostensibly unimplemented function codes. Turns out that RST 8 is optimistically loading param1 and param2 and incrementing IX just past them. So the IX-=4 is to undo that optimistic loading and continue at the byte following the unimplemented function code. So 'ignore unimplemented function code'.
In the end, there are 76 implemented and 52 unimplemented functions.
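The sequencing behaviour described above (null code terminates, optimistic load of two words, IX backed up by 4 for unimplemented codes) can be sketched in Python. This is a model under assumptions, not the ROM's logic: `run_blocks` and the `handlers` dictionary are hypothetical names, and the parameters are read little-endian as in the structure shown earlier.

```python
def run_blocks(mem: bytes, ix: int, handlers: dict) -> list:
    """Walk a block list starting at ix; return the function codes visited."""
    visited = []
    while True:
        fxn = mem[ix]
        if fxn == 0:                       # null function code ends the sequence
            break
        visited.append(fxn)
        # Optimistic load of param1/param2 (little-endian words)
        param1 = mem[ix + 1] | (mem[ix + 2] << 8)
        param2 = mem[ix + 3] | (mem[ix + 4] << 8)
        ix += 5                            # IX now points just past param2
        handler = handlers.get(fxn)
        if handler is None:
            ix -= 4                        # unimplemented: resume at the next byte
        else:
            ix = handler(mem, ix, param1, param2)   # handler returns the new IX
    return visited
```

A handler for a code with fewer parameters would return a smaller IX, in the same way fxn3f backs up by two.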
After labelling, I looked at fxn62h, since that was the code used in the blocks above. The implementation boggled my mind a bit. It did some sort of queueing into c200, which is structured as 32 8-byte entries. I gave up on that and scrolled through the nearby disassembly casually and found fxn63 just below it, that also did something similar with c200, and then invoked my buddy RST 10 -- the 'block list dispatcher'. So if fxn62 puts it in, and fxn63 takes it out and processes it, this seems evocative of a 'load' and 'run' functionality. Anyway, there were still too many unknowns so I started to look for smaller fish to fry.
Scrolling through I found some shorter ones that I could comprehend, and coincidentally these tended to be the smaller numbered function codes. The first one I found was:
    43FD ; XXX 3f: dispatch to param1 (no param2)
    43FD fxn3f_43FD:
    43FD DD 2B        dec ix  ; (no param2)
    43FF DD 2B        dec ix
    4401 E9           jp (hl)  ; thunk over
So this would run arbitrary external code.
Another is a conditional 'goto' of sorts, where the block processing is directed to another place (hopefully within the list!) if --*((uint8_t*)param2) != 0:
    43D3 ; XXX 3c: goto block @ param1
    43D3 fxn3c_43D3:
    43D3 EB           ex de, hl
    43D4 35           dec (hl)
    43D5 28 03        jr z, leave_43DA  ; leave if --*((uint8_t*)param2) == 0
    43D7 D5           push de
    43D8 DD E1        pop ix
    43DA leave_43DA:
    43DA C9           ret
So this seems to be implementing a form of 'next' using an implicit down counter. Similarly, there is an 'if':
    43B2 ; XXX 3a: if ( *((uint8_t*)param2) ), goto param1
    43B2 fxn3a_43B2:
    43B2 1A           ld a, (de)
    43B3 B7           or a
    43B4 28 03        jr z, leave_43B9
    43B6 E5           push hl
    43B7 DD E1        pop ix
    43B9 leave_43B9:
    43B9 C9           ret
So these 'blocks' seem to suggest being a 'byte code' of sorts for a virtual machine.
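For comparison with the two listings above, here is a small Python model of the 'next' (3c) and 'if' (3a) behaviour. The names `op_next` and `op_if` are made up for illustration; each function returns the new value of IX, assuming IX already points past the block on entry.

```python
def op_next(mem: bytearray, ix: int, param1: int, param2: int) -> int:
    """fxn3c: decrement the byte at param2; branch to param1 while it is non-zero."""
    mem[param2] = (mem[param2] - 1) & 0xFF     # dec (hl), 8-bit wraparound
    return param1 if mem[param2] != 0 else ix

def op_if(mem: bytearray, ix: int, param1: int, param2: int) -> int:
    """fxn3a: branch to param1 if the byte at param2 is non-zero."""
    return param1 if mem[param2] != 0 else ix
```

With a counter byte initialised to 2, `op_next` branches once and then falls through, which is the implicit down-counter loop described above.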
It's worth noting also that most values referenced within this machine are treated big-endian; e.g.:
    434D ; XXX 2d(4): *(uint16be_t*)param1 <<= param2.l
    434D fxn2d_434D:
    434D C5           push bc  ; save
    434E 46           ld b, (hl)  ; high byte first!
    434F 23           inc hl
    4350 4E           ld c, (hl)  ; then low byte!
    ...
However, param1 and param2 are host cpu native, i.e. little-endian. I add the notation 'param2.l' to make it clearer which byte in a neutral way.
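To make the mixed endianness concrete, here is a Python sketch of what the 2d annotation describes: a 16-bit value stored big-endian in memory, shifted left by the low byte of param2. Only the start of the routine is shown in the listing, so the handling of the shift count (especially a count of 0) is an assumption, and `shl16_be` is a hypothetical name.

```python
def shl16_be(mem: bytearray, param1: int, param2: int) -> None:
    """*(uint16be_t*)param1 <<= param2.l, modelled on a bytearray."""
    count = param2 & 0xFF                           # 'param2.l': low byte only
    value = (mem[param1] << 8) | mem[param1 + 1]    # high byte first
    value = (value << count) & 0xFFFF               # keep 16 bits
    mem[param1] = value >> 8                        # store high byte first again
    mem[param1 + 1] = value & 0xFF
```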
Along the way I also found and, or, xor. But then I needed to call it a night, with lots more to do in this area. Since the Cefucom seems to embed scripts for this thing, it may be useful for me to make a disassembler for it. This will be a little challenge since the 'compiled' scripts have absolute memory references, but I can probably manage. It will take some effort, but it might be worth it for more rapidly understanding what's going on.
It was a thing back then to use 'p-code' as a means of achieving code density. This is not p-code per se, because it's not (p)ortable at all. It has absolute address references and is mixed-endian. But it seems to be in that spirit.
A non-important observation is that all this code is located from 3f00-4dff (there are a couple of dead bytes at the beginning and many at the end). So it feels like a distinct 'component' of sorts. 3840 bytes for this 'virtual machine'.
-
20241111 - Catching Up and Publishing Work
Last week's PCU push put me further behind in posting, so I caught up those log entries.
Also, I don't know why I didn't already, but I put the listing in a github so others can check it out if they are curious. There's a link in the project's 'link' section, and also here if you happen to be reading this entry:
https://github.com/ziggurat29/cefucom-21.git
It's a pity I didn't start this at the beginning so that I could see the history of these past 3 weeks' work, but oh well. I was moving fast then. Still am.
Now that all the puzzle pieces are on the table, it's time to see what picture emerges when solved. There's no depiction on the packaging. ;)
-
20241110 -- PPI (8255) Catalogue
Today I went through all the references to the 3 Programmable Peripheral Interface (PPI; 8255) devices. To my relief they are configured once and stay that way. Also, they are all set to be garden-variety gpio -- no special modes.
8255 Summary:
8255a:
Port A: in
Port B: out
Port C: in
8255b:
Port A: out
Port B: out
Port C (upper): out
Port C (lower): in
8255c:
Port A: in
Port B: in
Port C: out
Mostly I don't know what these do, though, because I have no schematic. It's worth noting that my nomenclature of "8255a" and "CTCa" is purely made up. I don't know what actual device on the board these are associated with. Otherwise I'd use the parts marking on the board. If I ever find out, then I'll update the info. This did send me on a side activity of going through the screen shot and collecting all the parts. (Well, just the IC's.) So I made a BOM for future reference. The BOM occasionally gives me some ideas; e.g. there are two 7474's on board. Maybe clock dividers? And if someone has a physical board and is willing, I can be specific about requests to buzz out specific pins. (Especially the address decoders to the various chip selects would be handy!)
I can say that 8255b-a is unused, and that the unusual 8255b-c is used for the bit-bang serial, and possibly RTS/CTS.
Watchdog?
In the course of this, I found a bit which is polled periodically, and will reset a down counter. If the counter reaches zero, a request for reboot is flagged. This check is registered in the CTCb2 task list (entry 1).
    0D7D isrCTCb2task01_watchdog_D7D:
    0D7D DB 69        in a, (69h)  ; 8255c-b; checking b6
    0D7F 47           ld b, a
    0D80 DB 69        in a, (69h)  ; 8255c-b; checking b6
    0D82 B8           cp b
    0D83 20 14        jr nz, resetWatchdog_D99  ; transitioned during subsequent reads; lets try again
    0D85 CB 77        bit 6, a
    0D87 20 10        jr nz, resetWatchdog_D99  ; b6 = 1 == system ready?
    0D89 3A 69 C0     ld a, (byte_C069)  ; XXX a watchdog timer; initted to 5 on 69h b6 set (system ready)
    0D8C 3D           dec a
    0D8D 32 69 C0     ld (byte_C069), a  ; XXX a watchdog timer; initted to 5 on 69h b6 set (system ready)
    0D90 20 0C        jr nz, nullsub_10  ; leave
    0D92 3E FF        ld a, 0FFh
    0D94 32 68 C0     ld (byte_C068), a  ; XXX flag causing warm boot (or maybe sleep) in main task 00
    0D97 18 05        jr nullsub_10
    0D99 resetWatchdog_D99:
    0D99 3E 05        ld a, 5  ; give it 5 chances to come back
    0D9B 32 69 C0     ld (byte_C069), a  ; XXX a watchdog timer; initted to 5 on 69h b6 set (system ready)
    0D9E nullsub_10:
    0D9E C9           ret
I'm calling this a 'watchdog' for now, but it might actually be something else that isn't supposed to stay low for too long. Maybe a sensor of some sort.
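The control flow of that routine can be restated as a short Python model. The dictionary keys `counter` and `reboot` are hypothetical stand-ins for byte_C069 and byte_C068, and the two port reads are passed in as arguments rather than performed.

```python
def watchdog_tick(read1: int, read2: int, state: dict) -> None:
    """One pass of the periodic check, following the listing above."""
    # Two reads that disagree, or bit 6 set, reload the counter.
    if read1 != read2 or (read2 & 0x40):
        state["counter"] = 5
        return
    # Bit 6 is stably low: count down.
    state["counter"] = (state["counter"] - 1) & 0xFF
    if state["counter"] == 0:
        state["reboot"] = 0xFF     # request a warm boot (or whatever byte_C068 means)
```

Five consecutive passes with bit 6 low are needed before the flag is raised, matching the 'give it 5 chances' comment.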
Ring Counter?
Maybe the most interesting find was three bits that are configured in what appears to be a ring counter:
    5D44 sub_5D44:
    5D44 3A 02 81     ld a, (byte_8102)
    5D47 32 03 81     ld (byte_8103), a
    5D4A 21 B0 80     ld hl, byte_80B0
    5D4D CB 96        res 2, (hl)  ; clear
    5D4F CB 8E        res 1, (hl)
    5D51 CB 86        res 0, (hl)
    5D53 3D           dec a
    5D54 28 07        jr z, was0_5D5D
    5D56 3D           dec a
    5D57 28 07        jr z, was1_5D60
    5D59 3D           dec a
    5D5A 28 07        jr z, was2_5D63
    5D5C C9           ret
    5D5D was0_5D5D:
    5D5D CB CE        set 1, (hl)
    5D5F C9           ret
    5D60 was1_5D60:
    5D60 CB D6        set 2, (hl)
    5D62 C9           ret
    5D63 was2_5D63:
    5D63 CB C6        set 0, (hl)
    5D65 C9           ret
and later:
    047D F3           di  ; critical section
    047E 3A B0 80     ld a, (byte_80B0)  ; XXX a bitfield; also goes out 8255b-b
    0481 E6 07        and 7  ; just the lower 3 bits
    0483 32 B0 80     ld (byte_80B0), a  ; XXX a bitfield; also goes out 8255b-b
    0486 FB           ei  ; end critical section
    0487 D3 65        out (65h), a  ; 8255b-b
A ring counter is a bit peculiar and in this product maybe drives a 4-wire 3-phase stepper motor. Such a motor might be useful for advancing the paper roll display.
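A Python restatement of sub_5D44 may make the one-hot pattern easier to see. `phase_bits` is a hypothetical name; `selector` stands for the value read from byte_8102 and `port_shadow` for byte_80B0. The mapping follows the three `dec a` tests literally, so the matched values are 1, 2 and 3.

```python
def phase_bits(selector: int, port_shadow: int) -> int:
    """Return the new shadow byte after clearing b2..b0 and setting one phase bit."""
    bits = port_shadow & 0xF8      # res 2/1/0
    if selector == 1:              # first 'dec a' hits zero
        bits |= 0x02               # set 1, (hl)
    elif selector == 2:            # second 'dec a' hits zero
        bits |= 0x04               # set 2, (hl)
    elif selector == 3:            # third 'dec a' hits zero
        bits |= 0x01               # set 0, (hl)
    return bits                    # any other selector leaves all three bits clear
```

Stepping the selector through 1, 2, 3 yields the low bits 010, 100, 001, i.e. exactly one bit set at a time.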
-
20241109 -- Tables, and Tables of Tables
There's still a fair amount of unexplored data and code. I can't say much for data, but for code you can often figure out where functions demarcate just by looking for the C9 that is the RET that often punctuates a function end. I found several 'orphan' functions this way.
As a new tack, I searched the binary for the orphaned function's address. Ostensibly to find a call site, but many times I found it in a list of addresses that mapped to other orphaned functions. So I had found some dispatch tables. Oftentimes it was easy to work backwards from such to find the start of the function table because it would abut some code ending. Then I could do a similar address search to find the code that dispatches through the table.
It was a bit tedious, but 42 such dispatch tables have been found so far! And just for fun, it turns out that there are dispatch tables of dispatch tables in some cases! There is a huge table of 128 entries at 4002h, a double-dispatch table at 2DA6 (with an associated parameters table at 2DE0) and another double-dispatch table at 2200.
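The address search described above is simple to automate. Here is a minimal Python sketch; `find_word_refs` is a hypothetical helper, and it assumes the addresses appear in the ROM as little-endian 16-bit words (as in `dw` entries and call/jp operands).

```python
def find_word_refs(rom: bytes, address: int) -> list:
    """Return every offset in rom where address appears as a little-endian word."""
    needle = bytes([address & 0xFF, (address >> 8) & 0xFF])
    hits = []
    pos = rom.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = rom.find(needle, pos + 1)   # allow overlapping matches
    return hits
```

Several hits at a 2-byte stride are a good hint of a dispatch table; an isolated hit preceded by C3 or CD is more likely a jp or call operand. False positives are expected, since any two data bytes can match.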
With that many dispatch tables, this surely is some state machine design. To think I was daunted by the one in ROM 4! This is proportionately larger. But who knows, this might be a gift as it might make the intent clearer.
Bloopers
Some amusing treats were nullsubs that have a subsequent jump to the nullsub. (The code in the subsequent jump is not referenced anywhere.)
    3E9C nullsub_4:
    3E9C C9           ret
    3E9D C3 9C 3E     jp nullsub_4
or
    3EA8 nullsub_5:
    3EA8 C9           ret
    3EA9 C3 A8 3E     jp nullsub_5
And a dispatch table of jump addresses that ends with a ret; lol.
    ...
    40FC 9E 41        dw sub_419E  ; XXX IX -= 4
    40FE 9E 41        dw sub_419E  ; XXX IX -= 4
    4100 9E 41        dw sub_419E  ; XXX IX -= 4
    4102 C9           ret
My personal favorite:
    ...
    0475 18 00        jr loc_477  ; well let's jump right on that!
    0477 loc_477:
    ...
I think more symptoms of machine generated code.
It took a day, but it got me to my desired 100% code coverage milestone. Now that I have the puzzle pieces, it's time to see what picture emerges.
-
20241108 -- PCU; RSTs and Rendering and Tasks and Serial
Desultorily rummaging through the code...
RSTs
There are only a few RSTs implemented in this unit. Some are covered with FF and so don't do anything (well, they'll reboot since FF is coincidentally 'RST 0').
RST 0 starts the party with 'ld sp, C800h' and 'jp boot_100'. Seems OK, except boot_100 immediately changes its mind to 'ld sp, C7ffh'; lol. Incidentally c800 is ostensibly more sane since SP is pre-decrement. So C7ffh wastes the last byte. But this is a fun system.
RST 8, 10, and 18 do something. RST 20, 28, 30, and 38 are in FF padding. That padding continues out to the NMI at 66h, whose special-ness I already described. It is immediately followed by one of these:
    0082 nullsub_7:
    0082 C9           ret
"One of these" because there are nearly 40 such empty functions throughout the code. So surely there's one within a relative jump range if you need it! Joking aside, this is giving me the vibe that at least some of this code is machine-generated -- i.e. compiled or some other generation tool. It was the 80s. Throwing out redundant code is the linker's job, and there were businesses selling fancy linkers that had the smarts to do that. Phar Lap is one that comes to mind. I haven't discerned an obvious calling convention, though.
RST 18 is the easiest to understand. It simply vectors to an instruction indexed by A, 0-4.
Function 01 was the easiest of these to start with, as I was able to use my recent experience with the RTC to see that it was reading the clock and updating the values in RAM as BCD. It confused me for a moment at the end where it OR'd all the values on top of each other and then masked that with 0f and conditionally invoked sub_DF9, until I realized that it was just checking for midnight. So sub_DF9 does things at midnight. It's only checking hours and minutes, so this needs to be invoked at least once a minute or it will miss it. Also, it needs to be invoked not more than once a minute unless the midnight tasks are idempotent.
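The midnight test amounts to 'all hour and minute digits are zero'. A Python sketch of that check follows, under the assumption that each BCD digit sits in its own byte with only the low nibble meaningful; the function and parameter names are made up for illustration.

```python
def is_midnight(h10: int, h1: int, m10: int, m1: int) -> bool:
    """OR the four digit bytes together and test the low nibble, as described above."""
    combined = h10 | h1 | m10 | m1
    return (combined & 0x0F) == 0      # any non-zero digit leaves a bit set
```

Because seconds are not part of the test, the result stays true for the whole of minute 00:00, which is why the caller has to avoid running the midnight work more than once in that minute.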
RST 8 and 10 are more involved. They operate on blocks of data. The blocks have the form:
    00  code    8-bit
    01  param1  16-bit
    03  param2  16-bit
    05  ...     data
RST 8 expects the block to be referenced through IX
RST 10 does a little more, expecting a sequence of blocks in HL, and then dispatching them via RST 8.
What is amusing is that most of the code does not invoke these functions via the RST instruction. Rather, it makes the traditional 3-byte call to the underlying implementation. There are no invocations of RST 8 via the instruction, and only one of RST 10. All the invocations of rst 8 are direct calls (OK, there are only 3), and there are 24 direct invocations of rst 10. So why bother? Again, it seems more likely to be machine-generated code -- humans don't code this way.
Tasks
[Nigel] made a little headway getting past the memory test failure, but is stuck in a new loop. His instruction trace ended up looping around here:
    ...
    01FC 3E C0        ld a, 0C0h  ; IM2 vector base is C000h
    01FE ED 47        ld i, a
    0200 ED 5E        im 2
    0202 FB           ei
    0203 again_203:
    0203 AF           xor a
    0204 32 62 C0     ld (byte_C062), a  ; XXX index into function table @ C040
    0207 next_207:
    0207 3A 62 C0     ld a, (byte_C062)  ; XXX index into function table @ C040
    020A 07           rlca
    020B 06 00        ld b, 0
    020D 4F           ld c, a
    020E 21 40 C0     ld hl, word_C040  ; XXX an array of 16 fxnptrs indexed by C062
    ... stuff
    022E 3C           inc a
    022F FE 08        cp 8
    0231 28 D0        jr z, again_203
    0233 32 62 C0     ld (byte_C062), a  ; XXX index into function table @ C040
    0236 18 CF        jr next_207
    ...
Turns out this is exactly where it should be. This is the main() loop! What it does is whizz through a list of function pointers, invoking them, and then repeating that forever.
The first task is mainTask00_default_2C3, and that is the one that checks for the 'do warm boot' flag set by the NMI that we looked at yesterday!
A curiosity is that there is room for 16 tasks, but only three are ever used, and the first one doesn't change. So it's over-provisioned relative to something like:
    for (;;) {
        defaultTask();
        myTask1();
        myTask2();
    }
This oddity is not a code generation thing; it suggests that some sort of library was used that was intended to be more general purpose. I don't know if it was in-house or 3rd party.
It's also interesting because the ISRs for CTCb1 and CTCb2 have a similar 'task list' design (though they only execute one pass at a time; not an infinite loop). They are also over-provisioned, supporting up to 8 tasks though only 3 are used.
Text Renderer
There appear to be several 'screens' in memory, of dimension 32x16 (one is at E400, but there are a couple more). This board has no video, of course, so it presumably is shipping it over to the MCU2 board for actual display.
There is a rendering function I named 'renderDescText_11F9' which renders what I am calling 'described text'. (Yes, concocting names is not my talent.) The 'description' is a header of:
- row
- column
- length
- text...
So cross-referencing all the calls to 11F9h, I was able to find and annotate all the fixed strings in the application that are rendered via this method. That soaked up a bit of the unexplored area -- not a lot, but a little.
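A 'described text' record could be unpacked along these lines. This is only a sketch: it assumes row, column and length are one byte each, that coordinates are 0-based, and that the 32x16 screen is stored row-major with 32 columns. None of that is confirmed here, and `render_described_text` is a made-up name rather than the ROM routine.

```python
SCREEN_COLS = 32   # assumed: 32 columns
SCREEN_ROWS = 16   # assumed: 16 rows

def render_described_text(screen: bytearray, record: bytes) -> None:
    """Copy the text of one (row, column, length, text...) record into the screen buffer."""
    row, col, length = record[0], record[1], record[2]
    text = record[3:3 + length]
    start = row * SCREEN_COLS + col
    if len(text) != length or start + length > SCREEN_COLS * SCREEN_ROWS:
        raise ValueError("record is truncated or does not fit on the screen")
    screen[start:start + length] = text    # same length in and out, so the buffer size is unchanged
```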
I left a lot of the text as hex because I just don't know what those code points look like on the screen. I don't have the actual character generator rom dump, but [Nigel] suspected that it would be the same as some other Japanese models and he gave me one from a Seiko unit. I made a tool to render the binary into a shape. E.g.:
    char 0xa6 (166)
    00 00
    01 00
    02 fe  #######
    03 02        #
    04 02        #
    05 fe  #######
    06 02        #
    07 04       #
    08 08      #
    09 30    ##
    10 00
    11 00
etc. So perhaps I will make another tool to take the binary string and render it into a bitmap that I can feed into google goggles or something to translate the text.
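A glyph dump like the one shown above takes only a few lines of Python. This sketch assumes one byte per row with the most significant bit as the leftmost pixel, which is consistent with the hex values in that dump; `render_glyph` is a hypothetical name, not the actual tool.

```python
def render_glyph(rows: bytes) -> list:
    """Turn each row byte into a string of '#' and ' ' (MSB is the leftmost pixel)."""
    lines = []
    for value in rows:
        pixels = "".join("#" if value & (0x80 >> bit) else " " for bit in range(8))
        lines.append(pixels.rstrip())      # drop trailing blanks
    return lines
```

Feeding it the twelve row bytes of a character and printing the lines reproduces the shape; stacking several glyphs side by side would give the bitmap of a whole string.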
ISRs
There's a bunch of interrupt sources on this board, mostly from the CTCs. Moreover, the handlers are dynamically changed during program execution. I was looking at one (isrCTCa1_624F) which does some port I/O and I stumbled across a code sequence:
    6280 E6 7F        and 7Fh
    6282 EA 87 62     jp pe, loc_6287
Which was striking to me because that is a real parity check. The Z80 flag P/V is used for both parity and overflow (and also another function during block compare). Most of the time you're using it for the other functions, but contextually we know that we mean parity here because of the logical operation that precedes it. So is this part of the bit-banged serial routine? Surrounding context:
    ...
    627D 3A 0E C9     ld a, (byte_C90E)
    6280 E6 7F        and 7Fh
    6282 EA 87 62     jp pe, loc_6287
    6285 F6 80        or 80h
    6287 loc_6287:
    6287 32 13 C9     ld (byte_C913), a
    628A 3E 08        ld a, 8
    628C 32 14 C9     ld (byte_C914), a
    628F DB 66        in a, (66h)
    6291 E6 7F        and 7Fh
    6293 D3 66        out (66h), a
    6295 21 BB 62     ld hl, 62BBh
    6298 22 02 C0     ld (word_C002), hl
    629B loc_629B:
    629B E1           pop hl
    629C F1           pop af
    629D FB           ei
    629E ED 4D        reti
So we get a byte, mask off the most significant bit, and check for even parity. If it's not even, set the msb, which will make it even, otherwise skip that because it's already even. Then we stow that somewhere. Then we store 8 somewhere. Then we set one of the 8255 PPI port's bit 7 low. Then we load an address of a *new* isr to handle the *next* interrupt. And then we beat it.
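The parity step on its own, as a Python sketch (`add_even_parity` is a made-up name): mask to 7 data bits, and set bit 7 only when the number of 1 bits is odd, so that the 8-bit result always has even parity.

```python
def add_even_parity(byte: int) -> int:
    """Return the 7 data bits with bit 7 chosen so the total number of 1 bits is even."""
    data = byte & 0x7F                    # and 7Fh
    ones = bin(data).count("1")
    if ones % 2:                          # odd parity: the 'jp pe' is not taken
        return data | 0x80                # or 80h
    return data                           # already even: skip the 'or'
```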
So this does seem to be part of the machine for bit-banged serial out. Serial out is a little easier to interpret because you just clock the data out at the bit timing. (Serial in is more tricky because it's async.)
The line protocol is hard-coded as E-7-2. Not to my taste, but oh well, I didn't write this.
I did more rummaging and found the time constant for the timer, and knowing this one is one tick per bit, I worked out that this timer is clocked at the full system speed of 4 MHz. It supports two bit rate options of 300 bps and 200 bps. 200 is a new one on me.
OK, enough for the day.
-
20241107 -- Back to PCU Grindstone
Reflecting on yesterday's experience with ROM 4, I'm a little daunted by what lies ahead. That was a 2 KiB ROM. This is a 32 KiB ROM, so 16x as much to cover. At the same time, the understandings I got from ROM 4 might provide context that speeds things along.
The PIO A is configured much like on the MCU2, with differences in bit 7 and 6. On the MCU2 b7 is the interrupt in, and b6 is unused. On the PCU they are both outputs. B7 does something. B6 seems unused so far.
PIO A does not provide an interrupt source as it does on MCU2, but there are plenty of timers, and in particular CTCb0 is configured free running. It prescales by 256 and has a time constant of 3fh, so that would imply a total division of 16384. I don't know its clock source. If it was the 4 MHz system clock, then that would be 244 per second, and if that were externally prescaled by 4, then that would be 61 per second. Again, I don't know if any of that is the case.
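The tick-rate arithmetic above, worked in Python. The clock values are the same guesses as in the text, and `ctc_rate_hz` is a hypothetical helper. One thing to note: the 16384 figure corresponds to a count of 64 (40h); a count of exactly 3Fh (63) would give 256 * 63 = 16128 instead, which changes the rates only slightly.

```python
def ctc_rate_hz(clock_hz: float, prescale: int, count: int) -> float:
    """Interrupt rate of a timer that divides clock_hz by prescale * count."""
    return clock_hz / (prescale * count)

# 4 MHz system clock, prescale 256, count of 64 (total division 16384)
rate_full = ctc_rate_hz(4_000_000, 256, 0x40)       # about 244 per second
# same, with the clock externally prescaled by 4
rate_div4 = ctc_rate_hz(4_000_000 / 4, 256, 0x40)   # about 61 per second
# with a literal count of 3Fh the division is 16128
rate_3f = ctc_rate_hz(4_000_000, 256, 0x3F)         # about 248 per second
```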
What I did negatively confirm is that there seems to be no systick as with the other board. There is a counter that is incremented, but it is 8 bits and it saturates rather than rolling over.
I was eager to find such a tick, because my assumption was that PIO Port B would be serviced similarly here -- and it is, except for the timeout. Had it been the case I might have been able to work out what the clock rate likely is. Oh well.
The PIO B ISR is freaky. It starts off sane:
    3D48 isrPIOb_3D48:
    3D48 FB           ei  ; allow nesting
    3D49 F5           push af
    3D4A C5           push bc
    3D4B D5           push de
    3D4C E5           push hl
    3D4D CD 56 3D     call impl_isrPIOb_3D56  ; XXX PIO B isr implementation; invoked after saving all regs
    3D50 E1           pop hl
    3D51 D1           pop de
    3D52 C1           pop bc
    3D53 F1           pop af
    3D54 ED 4D        reti
and then things get weird:
    3D56 impl_isrPIOb_3D56:
    3D56 4F           ld c, a  ; XXX freaky; where was A set? non-isr code? 'expect'?
    3D57 DB 10        in a, (10h)  ; XXX PIO A data; ostensibly 'state' (though we aren't masking off high bits for some reason)
    3D59 B9           cp c
    3D5A 20 FA        jr nz, impl_isrPIOb_3D56  ; loop, there it is!
    3D5C E6 07        and 7  ; mask only b2,b1,b0 ('state')
    3D5E C2 68 3E     jp nz, sub_3E68
So, two things:
- a spin-wait in an ISR for anything tweaks my spidey-sense
- we are checking for a value that is specified in A, but we never explicitly set A ourselves. A will be whatever it was in the pre-interrupt environment. All the more curious because Port B I/O is not synchronous to this system.
And I guess a minor thing is that we didn't mask off bits 7 and 6.
This is a head-scratcher. I'll have to come back to it later. (maybe some weird coordination with non-isr code: "wait for this and let me know". I'll keep an eye out for that pattern.)
Another treat:
    7FA9 3E 4F        ld a, 4Fh
    7FAB D3 E3        out (0E3h), a  ; XXX set PIO B mode 1 (input)
    7FAD 3E 87        ld a, 87h
    7FAF D3 E2        out (0E2h), a  ; XXX wut? it's now an input port, so what is write doing?
Since [Nigel] was concerned about the port 30/38/39 stuff I took a little look at that. On cold boot, port 38h is read, and then the code jumps into the warm boot routine which immediately writes that value out to port 39h. Warm boot ("boot_100") may be entered in other ways, and in those cases a value left over from a previous write to both 30h and 38h is the one that is written to 39h.
And that is the end of the story for those ports. They don't affect code flow in a direct way (conceivably they might indirectly by causing some unknown hardware to do something different). So looking into what the 'leftover value' is that comes in on the warm boot path led me to some RTC stuff.
    02C3 sub_2C3:
    02C3 3A 6D C0     ld a, (byte_C06D)  ; a flag set in NMI
    02C6 B7           or a
    02C7 20 0D        jr nz, loc_2D6  ; horror
    02C9 3A 68 C0     ld a, (byte_C068)  ; another flag that could cause us to reboot
    02CC B7           or a
    02CD C8           ret z
    02CE F3           di
    02CF CD A2 0E     call sub_EA2  ; XXX RTC alarm stuff; leaves something in A
    02D2 D3 38        out (38h), a  ; hmm!
    02D4 D3 30        out (30h), a  ; hmm!
    02D6 loc_2D6:  ; horror; warm boot
    02D6 CD AC 02     call sub_2AC  ; a very long delay
    02D9 C3 00 01     jp boot_100  ; bootilicious
The sub_EA2 sets the RTC alarm. The oddity is that it is setting down to the seconds, but the RTC data sheet explicitly says that the seconds are ignored. So why does it bother? Also, we only are writing out the 10's digit, not the 1s digit. Anyway, that is the last value that is in A before exiting the routine, and so is the value that is written to 38h and 30h. I'm suspecting that this is a bug of sorts, and that the value in the ostensible 'alarm seconds' is either something that is not really alarm seconds, or that it doesn't matter. Have to dig in more...
Anyway, it makes me suspect that the port 30h,38h,39h are somehow related to sleeping or power or something. But they do not directly guide the flow of execution.
-
20241106 -- ROM 4
I spent yesterday disassembling ROM 4 of the MCU2 board. I got to 100% code coverage of that one. That doesn't mean 100% understanding, it just means all the jigsaw puzzle pieces are now on the table.
It was interesting. There are a lot of table-dispatched functions.
Dispatch Magic
I found some code which seems to be in a spin-wait for something to come into 8000h.
    7874 loc_7874:
    7874 E1           pop hl  ; discard the return address
    7875 ED 73 CE FF  ld (word_FFCE), sp  ; XXX stores SP during some Cefucom ROM4 stuff
    7879 loop_7879:
    7879 CD B9 78     call sub_78B9  ; XXX some stuff with keys (as in buttons)
    787C 2A C3 FF     ld hl, (word_FFC3)  ; XXX Cefu; a pointer into buffer @8000h
    787F 11 00 80     ld de, unk_8000
    7882 B7           or a
    7883 ED 52        sbc hl, de
    7885 28 F2        jr z, loop_7879  ; XXX nothing 'received'; spin
    7887 21 00 00     ld hl, 0
    788A 22 C5 FF     ld (word_FFC5), hl
    788D CD E3 78     call sub_78E3  ; XXX messes with DE, which will be a synthetic return address
    7890 CD 7F 79     call sub_797F
    7893 01 99 78     ld bc, sub_7899
    7896 C5           push bc  ; queue sub_7899 on the stack
    7897 D5           push de  ; queue the call sub_78E3 computed
    7898 C9           ret  ; (not really returning from here since we queued the above two)
And there is magicry at the end.
The code infers the availability of data by the difference between the start of buffer and end of buffer, so that end of buffer pointer must be atomically updated. Cross referencing word_FFC5 I find that is indeed happening:
    7899 sub_7899:
    7899 F3           di  ; critical section around these pointer updates
    789A 2A C5 FF     ld hl, (word_FFC5)  ; XXX Cefu; an OFFSET into buffer @8000 while building
    ... move block into position at 8000h and computes end in DE and other stuff
    78B2 ED 53 C3 FF  ld (word_FFC3), de  ; XXX Cefu; an end pointer into buffer @8000h
    78B6 FB           ei  ; end critical section
    ...
so word_FFC5 seems to be used while transferring the block, and when it is completed then word_FFC3 is atomically updated with the final value.
The magicry at the end depends on sub_78E3 leaving a return address in DE, which eventually gets pushed to the stack prior to the ret, effectively synthesizing 'jp (de)'.
    78E3 ; XXX lookup dispatch info
    78E3 sub_78E3:
    78E3 21 00 80     ld hl, unk_8000
    78E6 E5           push hl
    78E7 7E           ld a, (hl)  ; get the code from buffer
    78E8 ED 4B 1C 00  ld bc, (off_1B+1)  ; XXX freaky as it is in the middle of a constant; val c7e0. bug?
    78EC 21 0B 79     ld hl, dispatchByCode_790B  ; XXX dispatch 29 entries/116 by: (code, C, addr)
    78EF loop_78EF:
    78EF ED A1        cpi
    78F1 28 08        jr z, leave_78FB  ; found it
    78F3 E2 05 79     jp po, loc_7905  ; finished; but not found
    78F6 23           inc hl  ; (HL already +1, so we only need +3 to get to next)
    78F7 23           inc hl
    78F8 23           inc hl
    78F9 18 F4        jr loop_78EF
    78FB leave_78FB:
    78FB 4E           ld c, (hl)
    78FC 06 00        ld b, 0
    78FE 23           inc hl
    78FF 5E           ld e, (hl)
    7900 23           inc hl
    7901 56           ld d, (hl)
    7902 E1           pop hl  ; (which will be 8000h)
    7903 23           inc hl
    7904 C9           ret
    7905 loc_7905:
    7905 23           inc hl
    7906 23           inc hl
    7907 23           inc hl
    7908 23           inc hl
    7909 18 F0        jr leave_78FB
The sub_78E3 basically looks up the servicing address from the code that is at the start of the block, and returns an additional associated parameter in C. Here's the first entry:
    790B 21           dispatchByCode_790B: db 21h  ; code to match @8000h
    790C 05           db 5  ; XXX goes in C
    790D A6 79        dw sub_79A6  ; XXX goes in DE (and becomes a call address)
    ...
That table has 29 entries.
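The (code, C, addr) table lookup can be modelled in Python as below. This is a simplified sketch: `lookup_dispatch` is a hypothetical name, entries are taken to be 4 bytes with a little-endian address, and a miss returns None, whereas the ROM routine takes its own 'not found' path and still loads C and DE from the table.

```python
def lookup_dispatch(table: bytes, code: int):
    """Return (c_param, address) for the first entry whose code matches, else None."""
    for offset in range(0, len(table) - 3, 4):        # 4 bytes per entry
        if table[offset] == code:
            c_param = table[offset + 1]               # goes in C
            address = table[offset + 2] | (table[offset + 3] << 8)   # goes in DE
            return c_param, address
    return None
```

For the first entry shown above (21h, 5, sub_79A6) a lookup of code 21h yields C = 5 and the address 79A6h.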
"State"
Rummaging through the references to PIO A code, there were sections like this:
    7E96 sub_7E96:
    7E96 3E 02        ld a, 2
    7E98 32 E4 FF     ld (byte_FFE4), a
    7E9B 3E 10        ld a, 10h
    7E9D D3 E0        out (0E0h), a
    7E9F 3A EA FF     ld a, (byte_FFE8+2)
    7EA2 D3 E2        out (0E2h), a  ; PIO B data out
    7EA4 C9           ret
Knowing that PIO A b2,1,0 are inputs, and that b5,4,3 are outputs, it occurred to me that those might be bitfields of a 3-bit number. One expressed from MCU2 to PCU, and one expressed from PCU to MCU2. I re-annotated that code throughout:
7E96 sub_7E96: 7E96 3E 02 ld a, 2 ; transition state 2 7E98 32 E4 FF ld (byte_FFE4), a ; XXX PIO A data related; dispatch index 7E9B 3E 10 ld a, 10h 7E9D D3 E0 out (0E0h), a ; PIO A set b5 low, b4 high, b3 low (send 2) 7E9F 3A EA FF ld a, (byte_FFE8+2) 7EA2 D3 E2 out (0E2h), a ; PIO B data out 7EA4 C9 ret
Things start to make a little more sense in that context. So in sum PIO Port A is structured this way:
PIO A: b7 - /FS from VDG b6 - x (unused) b5 - \ b4 - +=> "MCU2 State (to ioboard)" b3 - / b2 - \ b1 - +=< "PCU State (to main board)" b0 - /
Port B has no bit structure, and seems to be used for bulk data transport between the boards.
Because there is no 'strobe' between the boards to notify of state change, I am suspect that happens as a consequence of data being available on B.
Interboard Communications
Reviewing the ISR for Port B:
7E1C isrPIOb_7E1C: 7E1C FB ei ; allow nesting this interrupt 7E1D F5 push af 7E1E C5 push bc 7E1F D5 push de 7E20 E5 push hl 7E21 CD 2F 7E call sub_7E2F 7E24 3E 0A ld a, 10 7E26 32 E3 FF ld (byte_FFE3), a 7E29 E1 pop hl 7E2A D1 pop de 7E2B C1 pop bc 7E2C F1 pop af 7E2D ED 4D reti
There is a constant whacking of byte_FFE3 to the value of 10. Cross referencing that, it can be found in the ISR for Port A:
7F95 isrPIOaHelper_7F95: 7F95 F5 push af 7F96 E5 push hl 7F97 21 E3 FF ld hl, byte_FFE3 7F9A 35 dec (hl) 7F9B 20 1A jr nz, leave_7FB7 ...
So Port B whacks it to 10 ever time a byte comes over, and Port A decrements it each time a systick comes in, doing different things based on whether is reaches zero or not. So I surmise this is a 'receive data timeout'. Since I know the systick is 60 Hz, this means a timeout of 167 ms.
I elided the code above, but the short story is that 'if it times out, the system is returned to state 0'. So state 0 seems to be the quiescent state.
I should also point out that in these state processing, that Port B is turned around several times. So in both systems it is configured in 'input mode' but during the operation is it changed to output as well.
So sub_7E2F is probably the 'stow a byte and maybe advance the state machine' function.
After I went though all the dispatch tables (there's something like 12 of them), I am now at 100% disassembly coverage of ROM4. I don't know what all these things do, but it's still a milestone because code and data are now separated, so cross referencing can happen. On the other hand, this state table design will possibly require my building of another document to keep track of the system.
I did take a peek at [Nigel]'s instruction trace. At this point it's not exciting news because it is simply stuck in the 'memory test failed' loop on the PCU board. He had configured the emulator for less than 32 KiB RAM and so it was failing there. Easily fixed, and he got to a prompt, but still not working. No big surprise, still so much more to do.
OK, with this understanding of PIO A and B, I'm going back to PCU board.
-
20241106 -- ROM 4
I spent yesterday disassembling ROM 4 of the MCU2 board. I got to 100% code coverage of that one. That doesn't mean 100% understanding, it just means all the jigsaw puzzle pieces are now on the table.
It was interesting. There are a lot of table-dispatched functions.
Dispatch Magic
I found some code which seems to be in a spin-wait for something to come into 8000h.
7874 loc_7874:
7874 E1          pop hl               ; discard the return address
7875 ED 73 CE FF ld (word_FFCE), sp   ; XXX stores SP during some Cefucom ROM4 stuff
7879 loop_7879:
7879 CD B9 78    call sub_78B9        ; XXX some stuff with keys (as in buttons)
787C 2A C3 FF    ld hl, (word_FFC3)   ; XXX Cefu; a pointer into buffer @8000h
787F 11 00 80    ld de, unk_8000
7882 B7          or a
7883 ED 52       sbc hl, de
7885 28 F2       jr z, loop_7879      ; XXX nothing 'received'; spin
7887 21 00 00    ld hl, 0
788A 22 C5 FF    ld (word_FFC5), hl
788D CD E3 78    call sub_78E3        ; XXX messes with DE, which will be a synthetic return address
7890 CD 7F 79    call sub_797F
7893 01 99 78    ld bc, sub_7899
7896 C5          push bc              ; queue sub_7899 on the stack
7897 D5          push de              ; queue the call sub_78E3 computed
7898 C9          ret                  ; (not really returning from here since we queued the above two)
And there is magicry at the end.
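To make the push/push/ret trick at the tail of loc_7874 concrete, here is a minimal Python sketch. It models the Z80 stack as a list and 'ret' as 'pop an address and jump to it'. The function names and the use of Python callables in place of addresses are illustrative assumptions, not anything taken from the ROM.

```python
def run_dispatch(handler, continuation):
    """Model of the tail of loc_7874: push bc / push de / ret.

    'handler' stands in for the address sub_78E3 left in DE, and
    'continuation' stands in for sub_7899 (loaded into BC).
    Both are plain Python callables here (an assumption for illustration).
    """
    stack = []
    stack.append(continuation)  # push bc : queue sub_7899
    stack.append(handler)       # push de : queue the looked-up handler
    trace = []
    # Each 'ret' pops the top of the stack and transfers control there.
    # The first ret is the one in loc_7874; the second is the handler's own ret.
    while stack:
        target = stack.pop()
        trace.append(target())
    return trace


if __name__ == "__main__":
    order = run_dispatch(lambda: "handler", lambda: "sub_7899")
    print(order)
```

The handler runs first because it was pushed last, and its own ret then falls into sub_7899, which is why the code can get away with a ret instead of a (nonexistent) 'jp (de)'.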
The code infers the availability of data by the difference between the start of buffer and end of buffer, so that end of buffer pointer must be atomically updated. Cross referencing word_FFC5 I find that is indeed happening:
7899 sub_7899:
7899 F3          di                   ; critical section around these pointer updates
789A 2A C5 FF    ld hl, (word_FFC5)   ; XXX Cefu; an OFFSET into buffer @8000 while building
... move block into position at 8000h and computes end in DE and other stuff
78B2 ED 53 C3 FF ld (word_FFC3), de   ; XXX Cefu; an end pointer into buffer @8000h
78B6 FB          ei                   ; end critical section
...
So word_FFC5 seems to be used while transferring the block, and when the transfer is completed, word_FFC3 is atomically updated with the final value.
The magicry at the end depends on sub_78E3 leaving a return address in DE, which eventually gets pushed to the stack prior to the ret, effectively synthesizing 'jp (de)'.
78E3 ; XXX lookup dispatch info
78E3 sub_78E3:
78E3 21 00 80    ld hl, unk_8000
78E6 E5          push hl
78E7 7E          ld a, (hl)                  ; get the code from buffer
78E8 ED 4B 1C 00 ld bc, (off_1B+1)           ; XXX freaky as it is in the middle of a constant; val c7e0. bug?
78EC 21 0B 79    ld hl, dispatchByCode_790B  ; XXX dispatch 29 entries/116 by: (code, C, addr)
78EF loop_78EF:
78EF ED A1       cpi
78F1 28 08       jr z, leave_78FB            ; found it
78F3 E2 05 79    jp po, loc_7905             ; finished; but not found
78F6 23          inc hl                      ; (HL already +1, so we only need +3 to get to next)
78F7 23          inc hl
78F8 23          inc hl
78F9 18 F4       jr loop_78EF
78FB leave_78FB:
78FB 4E          ld c, (hl)
78FC 06 00       ld b, 0
78FE 23          inc hl
78FF 5E          ld e, (hl)
7900 23          inc hl
7901 56          ld d, (hl)
7902 E1          pop hl                      ; (which will be 8000h)
7903 23          inc hl
7904 C9          ret
7905 loc_7905:
7905 23          inc hl
7906 23          inc hl
7907 23          inc hl
7908 23          inc hl
7909 18 F0       jr leave_78FB
sub_78E3 basically looks up the servicing address from the code that is at the start of the block, and returns an additional associated parameter in C. Here's the first entry:
790B 21          dispatchByCode_790B: db 21h  ; code to match @8000h
790C 05          db 5                         ; XXX goes in C
790D A6 79       dw sub_79A6                  ; XXX goes in DE (and becomes a call address)
...
That table has 29 entries.
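For my own notes, the lookup can be modeled in a few lines of Python. This is a sketch under assumptions: records are 4 bytes (code, C value, handler address low byte first), the scan is bounded by an entry count, and the 'not found' path lands on the record immediately following the last one scanned (that is what the four extra inc hl at loc_7905 would do). Only the first record below comes from the ROM; the other two, and the names Entry, parse_table and lookup, are made up for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    code: int     # byte compared against the first byte of the block
    c_param: int  # byte returned in C
    handler: int  # 16-bit address returned in DE


def parse_table(raw: bytes) -> list:
    """Split raw table bytes into 4-byte (code, C, addr) records."""
    entries = []
    for off in range(0, len(raw) - len(raw) % 4, 4):
        code, c_param, lo, hi = raw[off:off + 4]
        entries.append(Entry(code, c_param, lo | (hi << 8)))  # little-endian address
    return entries


def lookup(entries, count, code):
    """Scan the first 'count' records; fall back to the record after them."""
    for entry in entries[:count]:
        if entry.code == code:
            return entry.c_param, entry.handler
    fallback = entries[count]  # assumed catch-all record following the table
    return fallback.c_param, fallback.handler


if __name__ == "__main__":
    raw = bytes([
        0x21, 0x05, 0xA6, 0x79,  # real first entry: code 21h, C=5, sub_79A6
        0x22, 0x01, 0x00, 0x7A,  # hypothetical
        0x00, 0x00, 0x00, 0x7B,  # hypothetical catch-all
    ])
    table = parse_table(raw)
    print(lookup(table, 2, 0x21))
    print(lookup(table, 2, 0x99))
```

With the real table the count would be 29, but what BC actually holds at that point is still an open question (see the 'bug?' comment in the listing), so the bounded scan here is a guess at the intent rather than a statement of what the ROM does.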
"State"
Rummaging through the references to PIO A code, there were sections like this:
7E96 sub_7E96:
7E96 3E 02       ld a, 2
7E98 32 E4 FF    ld (byte_FFE4), a
7E9B 3E 10       ld a, 10h
7E9D D3 E0       out (0E0h), a
7E9F 3A EA FF    ld a, (byte_FFE8+2)
7EA2 D3 E2       out (0E2h), a        ; PIO B data out
7EA4 C9          ret
Knowing that PIO A b2,1,0 are inputs, and that b5,4,3 are outputs, it occurred to me that each group might be a bitfield holding a 3-bit number: one expressed from MCU2 to PCU, and one expressed from PCU to MCU2. I re-annotated that code throughout:
7E96 sub_7E96:
7E96 3E 02       ld a, 2              ; transition state 2
7E98 32 E4 FF    ld (byte_FFE4), a    ; XXX PIO A data related; dispatch index
7E9B 3E 10       ld a, 10h
7E9D D3 E0       out (0E0h), a        ; PIO A set b5 low, b4 high, b3 low (send 2)
7E9F 3A EA FF    ld a, (byte_FFE8+2)
7EA2 D3 E2       out (0E2h), a        ; PIO B data out
7EA4 C9          ret
Things start to make a little more sense in that context. So in sum PIO Port A is structured this way:
PIO A:
b7 - /FS from VDG
b6 - x (unused)
b5 - \
b4 -  +=> "MCU2 State (to ioboard)"
b3 - /
b2 - \
b1 -  +=< "PCU State (to main board)"
b0 - /
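As a sanity check on that layout, here is a small Python sketch that packs and unpacks the two 3-bit fields. The function and key names are mine, not from the ROM; the only hard data point is that writing 10h to port 0E0h corresponds to 'send 2'.

```python
STATE_MASK = 0b111
MCU2_STATE_SHIFT = 3  # b5..b3, driven by MCU2
PCU_STATE_SHIFT = 0   # b2..b0, driven by PCU
FS_BIT = 7            # /FS from VDG


def encode_mcu2_state(state: int) -> int:
    """Value to write to PIO A so that b5..b3 carry 'state'."""
    return (state & STATE_MASK) << MCU2_STATE_SHIFT


def decode_port_a(value: int) -> dict:
    """Split a PIO A data byte into the fields of the layout above."""
    return {
        "fs_n": (value >> FS_BIT) & 1,
        "mcu2_state": (value >> MCU2_STATE_SHIFT) & STATE_MASK,
        "pcu_state": (value >> PCU_STATE_SHIFT) & STATE_MASK,
    }


if __name__ == "__main__":
    print(hex(encode_mcu2_state(2)))  # matches the 'ld a, 10h' in sub_7E96
    print(decode_port_a(0x95))
```

State 2 shifted into b5..b3 is 10h, which agrees with the value sub_7E96 writes out, so the bitfield reading is at least self-consistent.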
Port B has no bit structure, and seems to be used for bulk data transport between the boards.
Because there is no 'strobe' between the boards to notify of a state change, I suspect that notification happens as a consequence of data being available on Port B.
Interboard Communications
Reviewing the ISR for Port B:
7E1C isrPIOb_7E1C:
7E1C FB          ei                   ; allow nesting this interrupt
7E1D F5          push af
7E1E C5          push bc
7E1F D5          push de
7E20 E5          push hl
7E21 CD 2F 7E    call sub_7E2F
7E24 3E 0A       ld a, 10
7E26 32 E3 FF    ld (byte_FFE3), a
7E29 E1          pop hl
7E2A D1          pop de
7E2B C1          pop bc
7E2C F1          pop af
7E2D ED 4D       reti
There is a constant whacking of byte_FFE3 to the value of 10. Cross referencing that, it can be found in the ISR for Port A:
7F95 isrPIOaHelper_7F95:
7F95 F5          push af
7F96 E5          push hl
7F97 21 E3 FF    ld hl, byte_FFE3
7F9A 35          dec (hl)
7F9B 20 1A       jr nz, leave_7FB7
...
So Port B whacks it to 10 every time a byte comes over, and Port A decrements it each time a systick comes in, doing different things based on whether it reaches zero or not. So I surmise this is a 'receive data timeout'. Since I know the systick is 60 Hz, this means a timeout of 10/60 s, or about 167 ms.
I elided the code above, but the short story is that 'if it times out, the system is returned to state 0'. So state 0 seems to be the quiescent state.
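The timeout mechanism as I understand it so far can be sketched in Python. The class and method names are mine; the reload value of 10, the 60 Hz systick, and the 'return to state 0 on timeout' behavior come from the listings and notes above. What else the elided ROM code does at zero is not modeled, and the initial counter value is an assumption.

```python
RELOAD_TICKS = 10  # value the Port B ISR writes to byte_FFE3
SYSTICK_HZ = 60    # systick rate


class ReceiveTimeout:
    """Model of byte_FFE3: reloaded per received byte, decremented per tick."""

    def __init__(self, state: int = 0):
        self.counter = RELOAD_TICKS  # assumption: starts fully loaded
        self.state = state

    def on_byte_received(self) -> None:
        # Port B ISR: ld a, 10 / ld (byte_FFE3), a
        self.counter = RELOAD_TICKS

    def on_systick(self) -> bool:
        # Port A ISR helper: dec (hl) / jr nz, leave
        self.counter = (self.counter - 1) & 0xFF  # 8-bit wraparound like dec (hl)
        if self.counter == 0:
            self.state = 0  # timed out: back to the quiescent state
            return True
        return False


def timeout_ms() -> float:
    return 1000.0 * RELOAD_TICKS / SYSTICK_HZ


if __name__ == "__main__":
    t = ReceiveTimeout(state=2)
    t.on_byte_received()
    fired = [t.on_systick() for _ in range(RELOAD_TICKS)]
    print(fired.index(True) + 1, t.state, round(timeout_ms()))
```

As long as bytes keep arriving less than ten ticks apart, the counter never reaches zero and the state machine is left alone.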
I should also point out that during this state processing, Port B is turned around several times. So on both systems it is configured in 'input mode', but during the operation it is changed to output as well.
So sub_7E2F is probably the 'stow a byte and maybe advance the state machine' function.
After I went through all the dispatch tables (there's something like 12 of them), I am now at 100% disassembly coverage of ROM4. I don't know what all these things do, but it's still a milestone because code and data are now separated, so cross referencing can happen. On the other hand, this state table design will possibly require me to build another document to keep track of the system.
I did take a peek at [Nigel]'s instruction trace. At this point it's not exciting news because it is simply stuck in the 'memory test failed' loop on the PCU board. He had configured the emulator for less than 32 KiB RAM and so it was failing there. Easily fixed, and he got to a prompt, but still not working. No big surprise, still so much more to do.
OK, with this understanding of PIO A and B, I'm going back to PCU board.
-
20241105 -- PCU Initial Results, Serial, and on to ROM 4
I delivered my initial results to [Nigel]. We are in agreement that the PIO added to the 'main' (MCU2) board is the thing that does the interboard communications.
On the MCU board I had observed that PIO Port A was in what the datasheet calls 'control' mode, which means 'general purpose I/O'. But PIO Port B is in 'handshake mode', which means it handles strobe and ready in hardware and you just have to service interrupts. After spending time on the PCU board, I see the same pattern: Port A is GPIO, and Port B is handshaking. So I believe Port B is cross connected between the boards, and this is what does interboard communications.
One curiosity is that in the board shot, I can see a serial jack next to the PIO. But there are no Z80-SIO or other UART chips on board. This unit was meant to have an online feature that apparently was never developed. So, no UART? Could this be bit-banged serial? That was a thing back in the day; even the TRS-80 Model I could manage 300 bps.
I also found an NMI handler. It's so weird that it might be junk code:
0066 ; XXX NMI
0066 nmi_66:
0066 F5          push af
0067 C5          push bc
0068 D5          push de
0069 E5          push hl
006A 21 82 60    ld hl, loc_6081+1     ; XXX wut? middle of an instruction; we're probably in the weeds
006D 22 00 C0    ld (word_C000), hl    ; XXX wut? we write here, but we immediately over write with the ldir
0070 11 00 C0    ld de, word_C000      ; XXX initted from 0238; has IM2 vectors and other stuff
0073 01 18 00    ld bc, 18h
0076 ED B0       ldir
0078 3E 80       ld a, 80h
007A 32 6D C0    ld (byte_C06D), a
007D E1          pop hl
007E D1          pop de
007F C1          pop bc
0080 F1          pop af                ; XXX what, no RETN?
0081 C9          ret
So it's blasting the IM2 vector table with bytes from an instruction stream. It doesn't even blast the whole table, just most of it. Definitely serious devastations. But it does this under disabled interrupts, and moreover the routine ends with a RET, which leaves the interrupts disabled. (Normally you end this routine with a RETN, which restores the interrupts to whatever state they were in before.) So the blasted table doesn't cause immediate destruction.
Cross referencing byte_C06D to other locations, notably I find:
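To double check my reading that the 16-bit store to word_C000 is pointless, here is a tiny Python model of just the memory effects of the NMI handler. The flat 64 KiB bytearray and the function names are assumptions for illustration, and the source bytes used in the demo are a made-up pattern, not the real ROM contents at 6082h.

```python
def ldir(mem: bytearray, src: int, dst: int, count: int) -> None:
    """Z80 LDIR for 0 < count < 65536: copy bytes one at a time, ascending."""
    for i in range(count):
        mem[(dst + i) & 0xFFFF] = mem[(src + i) & 0xFFFF]


def simulate_nmi(mem: bytearray) -> None:
    hl = 0x6082                    # ld hl, loc_6081+1
    mem[0xC000] = hl & 0xFF        # ld (word_C000), hl  (low byte first)
    mem[0xC001] = (hl >> 8) & 0xFF
    ldir(mem, hl, 0xC000, 0x18)    # ld de, word_C000 / ld bc, 18h / ldir
    mem[0xC06D] = 0x80             # ld a, 80h / ld (byte_C06D), a


if __name__ == "__main__":
    mem = bytearray(0x10000)
    pattern = bytes(range(0xA0, 0xA0 + 0x18))  # hypothetical bytes at 6082h
    mem[0x6082:0x6082 + 0x18] = pattern
    simulate_nmi(mem)
    print(mem[0xC000:0xC002].hex(), hex(mem[0xC06D]))
```

In the model, C000h and C001h end up holding the first two copied bytes rather than 82h 60h, so the earlier store has no lasting effect, and only the first 18h bytes of the table are touched.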
02C3 sub_2C3:
02C3 3A 6D C0    ld a, (byte_C06D)
02C6 B7          or a
02C7 20 0D       jr nz, loc_2D6
...
02D6 loc_2D6:
02D6 CD AC 02    call sub_2AC
02D9 C3 00 01    jp boot_100
So when set, it does cause a warm boot that reinitializes the system.
There is another reference:
55D6 3A 6D C0    ld a, (byte_C06D)
55D9 A7          and a
55DA C0          ret nz
which just exits a subroutine early if it's set. Maybe because that routine knows that it's all going downhill from here, so don't bother with whatever you were going to do.
So, it could be that a pushbutton is attached to /NMI and serves as a reset on this board. I don't know why they would do it this way other than that maybe it permits a more orderly shutdown. At the same time, the weird code blasting the interrupt vector table is puzzling.
[Nigel] inquired about port 30h, 38h, and 39h. I have no idea what those are for yet. All I can say is that they are written very early in the boot process, and the data does not involve the instruction stream at all.
I expressed my belief that I should go back to MCU2 and spend some time with ROM 4, which I hadn't looked at much before other than to verify that it does set the system into IM2. The rest is unexplored.
[Nigel] has a bit of urgency since his emulator is not running, so he sent me some instruction traces of where it is located.