20241030 - "Undocumented Instructions" and Puzzlin' Evidence

A project log for ROM Disassembly - Cefucom-21

Peering into the soul of this obscure machine

ziggurat29, 11/04/2024 at 18:59

While disassembling, I found a sequence the disassembler refused to interpret:

24F7     ; XXX maybe LET 87h
24F7     sub_24F7:
24F7 CD 34 21  call    sub_2134
24FA CD 14 1D  call    sub_1D14
24FD D2 D5 3A  jp      nc, loc_3AD5
2500 55        ld      d, l
2501 FD        db 0FDh        ; XXX ??? wut; undefined function. probably has no effect.
2502 F5        push    af
...

In the Z-80 there are a lot of 'undocumented instructions' (that have long since been documented).  The Z-80, being binary-compatible with the 8080, added IX and IY index registers, and uses a prefix byte (DDh for IX, FDh for IY) that means 'use IX/IY instead of HL in the following instruction'.  This is great, though the prefix can also be placed in front of a bunch of opcodes that have nothing to do with HL.  Some of the resulting combinations are interesting, but many are not, and this was one of those cases.  The FDh prefix normally means 'use IY instead of HL', but the next instruction (push af) involves only AF, so the FDh is effectively a NOP.  What a weird thing to have in code.  A bug?  Maybe; it would be benign.

Elsewhere was some other suspicious code:

095F CD 19 1D  call    sub_1D19
0962 C7        rst     0               ; XXX wut? restart the computer?
0963 2B        dec     hl              ; XXX wut? can't get here?

and

238B CD 19 1D  call    sub_1D19
238E D2 54 5D  jp      nc, unk_5D54    ; XXX wut? jumps into text at end of rom

So, it's time for a closer look.

1D19        sub_1D19:
1D19 7E        ld      a, (hl)
1D1A E3        ex      (sp), hl
1D1B BE        cp      (hl)
1D1C 23        inc     hl
1D1D E3        ex      (sp), hl
1D1E C2 BB 05  jp      nz, showSyntaxError_5BB ; show "Syntax Error"
...

So there's some legerdemain going on with the stack.  HL in these BASIC implementations points to the current program byte, usually text.  So we get that byte in A, then exchange the top-of-stack value (which is the return address) with HL.  Next we compare the program byte in A with the byte at the return address (which is usually code!).  Then we increment that return address and put it back onto the stack, which also restores HL.

If the values do not match, we go to the 'Syntax Error' routine.  Otherwise we continue on doing ... stuff.
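The stack trick is easier to follow as a tiny simulation. What follows is only a sketch of the first few instructions of sub_1D19, assuming memory is a Python bytearray and the stack is a list whose last element is the caller's return address. The names `syntax_check` and `BasicSyntaxError` are made up for illustration and do not come from the ROM.

```python
class BasicSyntaxError(Exception):
    """Stands in for the jump to the 'Syntax Error' routine (hypothetical name)."""


def syntax_check(mem, hl, stack):
    """Model of the opening instructions of sub_1D19.

    mem   -- bytes-like memory image
    hl    -- pointer to the current program byte
    stack -- list whose last element is the caller's return address
    Returns hl unchanged; the return address on the stack is advanced by one.
    """
    a = mem[hl]                      # ld a, (hl)    : fetch the program byte
    hl, stack[-1] = stack[-1], hl    # ex (sp), hl   : hl = return address
    matched = (a == mem[hl])         # cp (hl)       : compare with the inline byte
    hl = (hl + 1) & 0xFFFF           # inc hl        : step past the inline byte
    hl, stack[-1] = stack[-1], hl    # ex (sp), hl   : restore hl, fix return address
    if not matched:
        raise BasicSyntaxError()     # jp nz, showSyntaxError_5BB
    return hl
```

With the byte 0D2h stored at the return address and also at the program position, the call succeeds and the return address on the stack ends up one higher, so execution resumes after the inline byte. Neither `ex (sp), hl` nor the 16-bit `inc hl` touches the flags, which is why the `jp nz` can still act on the result of the `cp`.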

OK, that means I need to take a pass through all the call sites of sub_1D19 and mark the byte following each call as data, with normal code disassembly resuming after that byte.
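A pass like that could be roughed out as a script. This is a minimal sketch and not the tooling actually used here: it assumes the ROM is available as raw bytes, and the function name `find_inline_param_sites` and its parameters are hypothetical. It simply looks for the opcode CDh (unconditional CALL) followed by the little-endian address of one of the target routines.

```python
def find_inline_param_sites(rom, targets, base=0x0000):
    """Return the addresses of the data bytes that follow `call target`.

    rom     -- bytes of the ROM image
    targets -- set of 16-bit addresses of routines that consume one inline byte
    base    -- address at which rom[0] is mapped (assumed to be 0 by default)
    """
    sites = []
    # Stop early enough that the inline byte itself is still inside the image.
    for i in range(len(rom) - 3):
        if rom[i] != 0xCD:                       # CD = unconditional CALL nn
            continue
        dest = rom[i + 1] | (rom[i + 2] << 8)    # little-endian call operand
        if dest in targets:
            sites.append(base + i + 3)           # byte right after the call
    return sites
```

A raw byte scan like this can produce false matches, since CDh may also appear inside data or as an operand of another instruction, so the hits are only candidates to be confirmed in the disassembler.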

Doing so made things more sane; e.g. the above transformed to:

095F CD 19 1D  call    XXXsyntaxCheck_1D19;
0962 C7        db 0C7h             ; XXX BASIC token ???
0963 2B        dec     hl

and

238B CD 19 1D  call    XXXsyntaxCheck_1D19;
238E D2        db 0D2h             ; XXX BASIC token ???
238F 54        ld      d, h
2390 5D        ld      e, l

And the first example, the one with the FDh prefix, called sub_1D14, which sits just before 1D19 and falls through into it.

1D13     loc_1D13:
1D13 23        inc     hl

1D14     ; XXX skip space and syntax check
1D14     sub_1D14:
1D14 7E        ld      a, (hl)
1D15 FE 20     cp      ' '
1D17 28 FA     jr      z, loc_1D13

1D19     ; expect @ hl byte following caller; syntax error on fail; skip spaces and get and qualify: Z if end-of-statement (could be nul, ELSE or colon), C if digit
1D19     XXXsyntaxCheck_1D19:
1D19 7E        ld      a, (hl)
1D1A E3        ex      (sp), hl        ; get the parameter pointer in hl
...
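The small loop in front of the check can be read as follows. This is only an illustrative model of the three instructions at 1D14 plus the `inc hl` at 1D13, assuming memory is a bytes-like object; the name `skip_spaces` is hypothetical.

```python
def skip_spaces(mem, hl):
    """Advance hl past any ASCII spaces, as the loop at 1D13/1D14 does."""
    while mem[hl] == 0x20:         # ld a, (hl) / cp ' ' / jr z, loc_1D13
        hl = (hl + 1) & 0xFFFF     # loc_1D13: inc hl, then back into 1D14
    return hl                      # first non-space: fall through to 1D19
```

In the ROM there is no return at this point: once a non-space byte is found, execution simply runs on into the syntax check at 1D19 with HL pointing at that byte.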

And that changes the thing that originally piqued my interest to the more sane: 

24F7     ; handle BASIC token LET 87h
24F7     bastokLET_24F7:
24F7 CD 34 21  call    sub_2134
24FA CD 14 1D  call    sub_1D14        ; XXX skip space and syntax check
24FD D2        db 0D2h             ; basic token '='
24FE D5        push    de
24FF 3A 55 FD  ld      a, (byte_FD55)  ; XXX data type?? 0 = int?, 1 = stringref?
2502 F5        push    af
...

Which is much more sane and doesn't involve undocumented instructions -- much less ones that do nothing.

I've seen this method of passing a parameter to a function before:  in the TRS-80 BASIC ROM.  E.g.:

;NOTE:  this is from the TRS-80 ROM; not Cefu
1C96     _impl_rst8_SyntaxCheckAndSkipWhitespace:
1C96 7E        ld      a, (hl)
1C97 E3        ex      (sp), hl        ; get the parameter pointer in hl
1C98 BE        cp      (hl)        ; see if the parameter matches the value at the buffer position that was passed in (HL at time of call)
1C99 23        inc     hl          ; (the parameter is in the code following the call; adjust return address past it)
1C9A E3        ex      (sp), hl        ; fixup return address, and restore buffer pointer into HL
1C9B CA 78 1D  jp      z, _impl_rst10_AdvanceAndSkipWhitespace ; advance past whitespace if we succeeded our syntax check
1C9E C3 97 19  jp      errSyntax       ; SN ERROR routine.

It's also interesting that the Cefu does not (directly) get to this function via a RST, whereas the TRS-80 does.  So maybe time to take a closer look at the RSTs....
