Not much progress in the last week or so, but something still. I've been trying to get the TI-99/4A ROMs to run on my CPU, but no luck yet.
X instruction
I found a bug in my implementation of the X instruction. One memory read was missing from the execution state sequence, which caused the instruction to execute what was essentially a random opcode. X is certainly a very CISCy instruction: it executes a single-word instruction provided as an operand. This is very different from the normal case, where instructions are fetched from the address in the program counter. Here one can write "X R5" to make the CPU execute the opcode stored in register 5. I had not tested the X instruction, but subsequently learned that it is used by the TI-99/4A ROMs. Of course there was a bug in the implementation...
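To illustrate the sequencing, here is a toy Python model of X with a register operand. It is only a sketch and not the VHDL from this project: the class `ToyCpu` and its methods are made up for illustration, only register-direct addressing is handled, and INC is the only other instruction modelled. The point is the extra read of the operand word, which then gets executed as an opcode.

```python
# Toy model of X sequencing on a TMS9900-like CPU (illustrative only).
X_BASE = 0x0480    # X opcode with Ts and S fields set to zero
INC_BASE = 0x0580  # INC opcode with Ts and S fields set to zero


class ToyCpu:
    """Hypothetical minimal CPU: word memory plus a workspace pointer."""

    def __init__(self, wp):
        self.wp = wp
        self.mem = {}

    def read_word(self, addr):
        return self.mem.get(addr & 0xFFFE, 0)

    def write_word(self, addr, value):
        self.mem[addr & 0xFFFE] = value & 0xFFFF

    def reg_addr(self, n):
        # Registers live in memory at WP + 2*n.
        return (self.wp + 2 * n) & 0xFFFF

    def execute(self, opcode):
        base = opcode & 0xFFC0
        ts = (opcode >> 4) & 0x3
        s = opcode & 0xF
        if ts != 0:
            raise NotImplementedError("only register-direct operands here")
        addr = self.reg_addr(s)
        if base == X_BASE:
            # The step that must not be skipped: read the operand word,
            # then treat that word as the opcode to execute.
            self.execute(self.read_word(addr))
        elif base == INC_BASE:
            self.write_word(addr, self.read_word(addr) + 1)
        else:
            raise NotImplementedError(hex(opcode))


cpu = ToyCpu(wp=0x8300)
cpu.write_word(cpu.reg_addr(1), 41)      # R1 = 41
cpu.write_word(cpu.reg_addr(5), 0x0581)  # R5 holds the opcode for INC R1
cpu.execute(0x0485)                      # X R5
r1_after = cpu.read_word(cpu.reg_addr(1))
```

If the read of R5 were left out, whatever happened to be on the internal data bus would be decoded instead, which matches the "random opcode" symptom described above.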
Single stepping
To find the bug, I modified the classic99 emulator so that it outputs instruction traces, namely the values of the PC and ST registers before each instruction is executed. These are stored in a text file. Then I modified the FPGA CPU to add single stepping capability: one control register bit enables single stepping mode (effectively by just asserting a DMA request, which makes the CPU stop), and another control register bit, when set, briefly releases the DMA request so that the CPU starts to execute an instruction. The DMA request is then immediately asserted again, but since my CPU implementation only samples DMA requests during opcode fetches, the instruction that has started runs to completion before the CPU stops again. Together these features enable single stepping. I then added a 64-bit debug register, which is readable over the USB connection and holds the values of the current opcode, PC and ST. That way I could create a trace of instructions similar to the one from the emulator run. I extended my host-side Windows program to support this.
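On the host side, turning a debug register value into a trace line could look roughly like the Python sketch below. The bit layout is purely an assumption made for the example (opcode in bits 47..32, PC in bits 31..16, ST in bits 15..0, top 16 bits unused); the post does not say how the three fields are packed. The function names and the output format are also hypothetical.

```python
def unpack_debug_register(value):
    """Split a 64-bit debug register value into (opcode, pc, st).

    ASSUMED layout, not taken from the real design:
    bits 47..32 = opcode, bits 31..16 = PC, bits 15..0 = ST.
    """
    opcode = (value >> 32) & 0xFFFF
    pc = (value >> 16) & 0xFFFF
    st = value & 0xFFFF
    return opcode, pc, st


def trace_line(value):
    """Format one trace line as 'PC ST OPCODE' in hex (made-up format)."""
    opcode, pc, st = unpack_debug_register(value)
    return f"{pc:04X} {st:04X} {opcode:04X}"


# Example: X R5 (>0485) about to execute at PC >0024 with ST >C000.
line = trace_line(0x0000_0485_0024_C000)
```

Whatever the real layout is, the emulator trace and the FPGA trace need to end up in the same text format so that they can be compared line by line.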
Comparing single-stepped instruction traces enabled me to see the differences in behaviour, and that is how I found that the X instruction was bogus. Unfortunately this method only goes so far, since there are vast timing differences. Single stepping is controlled from the host PC and therefore runs very slowly, so once interrupts are enabled the comparison method no longer works: interrupts are served way too slowly and are therefore always pending. Once the CPU exits the interrupt service routine, it just jumps right back in. I guess I may have to add an interrupt debug mask bit to disable video processor interrupts during single stepping runs.
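The comparison itself is simple: walk both traces in lockstep and stop at the first instruction where PC or ST differ. A minimal Python sketch, assuming each trace has already been parsed into a list of `(pc, st)` tuples (the function name and data shape are illustrative, not taken from the actual host program):

```python
from itertools import zip_longest


def first_divergence(reference, candidate):
    """Return (index, ref_entry, got_entry) for the first mismatch.

    Both arguments are sequences of (pc, st) tuples. If one trace is
    shorter, the missing entry is reported as None. Returns None when
    the traces are identical.
    """
    pairs = zip_longest(reference, candidate)
    for index, (ref, got) in enumerate(pairs):
        if ref != got:
            return index, ref, got
    return None


emulator = [(0x0024, 0x0000), (0x0026, 0x2000), (0x0028, 0x2000)]
fpga = [(0x0024, 0x0000), (0x0026, 0x0000), (0x0028, 0x0000)]
mismatch = first_divergence(emulator, fpga)
```

Only the first mismatch is interesting. After the ST or PC has gone wrong once, everything that follows tends to differ as well.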
Byte wide instruction flag bug
My VHDL code has a process which computes byte alignment, i.e. when the CPU is reading a byte from memory, this block aligns it properly as a 16-bit entity. This is done by moving the desired byte to the most significant byte of the 16-bit operand word and zeroing the least significant byte. Yes, this is different from x86: the TMS9900 handles byte operands not as the least significant bytes of registers, but as the most significant bytes. A mindset difference right there.
By comparing instruction traces I noticed that sometimes my FPGA CPU was setting status bits incorrectly. After some pretty intensive testing I realised that if the source operand of a byte wide operation was a register (not a memory location), the zero extension did not work, and instead the entire 16-bit contents of the register were passed to the ALU. The correct operation is to pass only the high byte of the register and zero out the least significant byte. This was difficult to find since the actual operation (datapath) worked correctly; only the flags were sometimes set incorrectly, due to the low byte having some non-zero bits. The bug was not there if the source byte was read from memory (for example by indirection, as in MOVB *R2,@>1234). Now this is fixed. I am sure there are a ton of nasty bugs like this one left to debug. But one more down!
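The effect on the flags can be reproduced with a few lines of Python. This is a sketch of the idea only, not a translation of the VHDL process. The helper names are invented, and only the three compare-to-zero status bits (L>, A>, EQ) are modelled, following my reading of the TMS9900 status register definition.

```python
def align_byte_operand(word, odd_address=False):
    """Return the 16-bit ALU operand for a byte access.

    The selected byte is placed in the high byte and the low byte is
    zeroed. A register source always uses the high byte of the register.
    """
    byte = (word & 0x00FF) if odd_address else (word >> 8) & 0xFF
    return byte << 8


def compare_to_zero_flags(operand):
    """Status bits derived from comparing a 16-bit result against zero."""
    operand &= 0xFFFF
    return {
        "L>": operand != 0,                            # logical greater than
        "A>": operand != 0 and not operand & 0x8000,   # arithmetic greater than
        "EQ": operand == 0,
    }


r2 = 0x0012  # high byte is zero, low byte holds leftover bits

buggy_flags = compare_to_zero_flags(r2)                      # whole register
fixed_flags = compare_to_zero_flags(align_byte_operand(r2))  # high byte only
```

With a register source of >0012, the byte being moved is >00, so EQ should be set. Passing the whole register to the ALU clears EQ and sets L> and A> instead, even though the byte written to the destination is still correct.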