Again time constrained...
Still running the core in simulation, I added support for byte operations. The TMS9900 has only one category of instructions that supports byte operations: the dual-operand instructions, which accept all addressing modes. These are the most flexible instructions.
In principle byte operations are simple, because they are done by reading and writing 16-bit values. The bus only supports 16-bit transfers, apart from single-bit CRU operations, which I don't support yet. So you read a 16-bit word and move the relevant byte into the most significant byte. When writing to memory, you need to do a read-modify-write cycle and put the byte back where it belongs.
For example, if the data word at address >1000 is >1234, the bytes are >12 at address >1000 and >34 at address >1001. (With TI assemblers the greater-than sign > denotes a hexadecimal number.) If you now do a MOVB to destination address >1000 with source data >55, the result is >55 at >1000 and still >34 at >1001. Since the bus only supports 16-bit values, the word at >1000 becomes >5534. Similarly, if you store >55 at >1001, the memory word at >1000 becomes >1255.
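The MOVB example above can be checked with a small behavioral model. This is a Python sketch, not the VHDL of the core. The name `write_byte` and the dictionary standing in for memory are assumptions made for illustration.

```python
# Behavioral sketch only: memory is a dict of 16-bit words keyed by even
# addresses, and a byte store is done as a read-modify-write of one word.

def write_byte(mem, addr, value):
    """Store byte `value` at byte address `addr` using a 16-bit read-modify-write."""
    word_addr = addr & 0xFFFE                  # the bus only sees word addresses
    word = mem[word_addr]                      # read cycle
    if (addr & 1) == 0:
        # Even address: replace the most significant byte, keep the low byte.
        word = ((value & 0xFF) << 8) | (word & 0x00FF)
    else:
        # Odd address: replace the least significant byte, keep the high byte.
        word = (word & 0xFF00) | (value & 0xFF)
    mem[word_addr] = word                      # write cycle
    return word

mem = {0x1000: 0x1234}
print(f">{write_byte(mem, 0x1000, 0x55):04X}")   # prints >5534

mem[0x1000] = 0x1234                             # restore the original word
print(f">{write_byte(mem, 0x1001, 0x55):04X}")   # prints >1255
```

Both cases touch the same word address >1000. Only the address LSB decides which half of the word is replaced.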
Simple, right? In principle yes, in practice not exactly, because there is an exception. If the write destination is a workspace register, you always modify the high byte of the register. Conceptually this is very simple for a programmer. Consider the add byte instruction AB @>1001,R2 with >1234 at >1000. The memory word at >1000 is read, and the least significant byte >34 (selected because the LSB of the address is 1) is shifted to the MSB with zero extension, giving the word >3400. That is added to the most significant byte of R2, and the least significant byte of R2 is preserved.
But if you look at the above as a hardware designer, and keep in mind that the registers are actually in memory, you may need to special-case direct register accesses to make sure you always deal with the high bytes of registers. This comes back to how the hardware stores effective addresses, as the least significant bit of the effective operand address becomes a byte shifter control line. Now that I think about it, special-casing the registers may not actually be necessary... So it is useful to write these blog entries :)
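To make the AB @>1001,R2 example concrete, here is a rough Python model of the arithmetic. The starting value of R2 (>10FF) and the function names are assumptions chosen for illustration. Status flags and the carry out of the byte addition are not modeled.

```python
def fetch_byte_operand(word, addr):
    """Return the addressed byte in the high half of a 16-bit word, low half zero."""
    if addr & 1:
        return (word & 0x00FF) << 8    # odd address: low byte moves up
    return word & 0xFF00               # even address: high byte stays put

def add_byte_to_register(reg, operand):
    """Add the operand's high byte to the register's high byte; keep the low byte."""
    high = ((reg & 0xFF00) + (operand & 0xFF00)) & 0xFF00   # carry is dropped here
    return high | (reg & 0x00FF)

mem_word = 0x1234                      # word at >1000, as in the text
r2 = 0x10FF                            # assumed starting value of R2
operand = fetch_byte_operand(mem_word, 0x1001)
r2 = add_byte_to_register(r2, operand)
print(f"operand=>{operand:04X} R2=>{r2:04X}")   # prints operand=>3400 R2=>44FF
```

The low byte >FF of R2 survives untouched, which is the behavior the programmer sees regardless of how the hardware gets there.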
Internally I use the following hardware block to handle read operand processing for bytes:
    -- Byte aligner
    process(ea, rd_dat, operand_mode, operand_word)
    begin
        -- We have a byte operation. If the data came from register,
        -- we don't need to do anything. If it came from memory,
        -- we will zero extend and possibly shift.
        if operand_word or operand_mode(5 downto 4) = "00" then
            read_byte_aligner <= rd_dat;
        else
            -- Not register operand. Need to check that EA is still valid.
            if ea(0) = '0' then
                read_byte_aligner <= rd_dat(15 downto 8) & x"00";
            else
                read_byte_aligner <= rd_dat(7 downto 0) & x"00";
            end if;
        end if;
    end process;
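A software reference model of this block can be handy when checking simulation results. The following Python function is a sketch that mirrors the VHDL process above line by line. It assumes `operand_word` is a boolean and `operand_mode` is a 6-bit integer whose bits 5..4 select the addressing mode, which is how the VHDL reads, but the function itself is not part of the core.

```python
def read_byte_aligner(ea, rd_dat, operand_mode, operand_word):
    """Python mirror of the VHDL byte aligner (illustrative reference model)."""
    mode_bits = (operand_mode >> 4) & 0b11
    if operand_word or mode_bits == 0b00:
        # Word operation or register operand: pass the data through unchanged.
        return rd_dat & 0xFFFF
    if (ea & 1) == 0:
        # Even effective address: keep the high byte, zero the low byte.
        return rd_dat & 0xFF00
    # Odd effective address: move the low byte up, zero the low byte.
    return (rd_dat & 0x00FF) << 8

print(hex(read_byte_aligner(0x1001, 0x1234, 0b100000, False)))  # prints 0x3400
print(hex(read_byte_aligner(0x1000, 0x1234, 0b100000, False)))  # prints 0x1200
print(hex(read_byte_aligner(0x1001, 0x1234, 0b000011, False)))  # prints 0x1234
```

Note that in the register case the low byte is passed through rather than zeroed, exactly as in the VHDL.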
These are the byte instructions:
- AB - add bytes
- CB - compare bytes
- SB - subtract bytes
- SOCB - set ones corresponding bytes (actually an OR operation)
- SZCB - set zeros corresponding bytes (an AND NOT operation)
- MOVB - move bytes
For both source and destination operands you have the five addressing modes. Using R3 as an example:
- R3
- *R3
- *R3+
- @LABEL
- @TABLE(R3)
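The five modes above fit into a 6-bit operand field. The sketch below assumes the standard TMS9900 encoding, with a 2-bit mode field above a 4-bit register number, which is consistent with the `operand_mode(5 downto 4) = "00"` register test in the byte aligner. The function name and the mode labels are made up for illustration.

```python
def decode_operand(field):
    """Split a 6-bit operand field into (mode name, register number)."""
    t = (field >> 4) & 0b11        # addressing mode bits
    reg = field & 0b1111           # workspace register number
    if t == 0b00:
        return ("register", reg)                 # R3
    if t == 0b01:
        return ("indirect", reg)                 # *R3
    if t == 0b11:
        return ("indirect autoincrement", reg)   # *R3+
    # t == 0b10 covers two modes, told apart by the register field.
    if reg == 0:
        return ("symbolic", reg)                 # @LABEL
    return ("indexed", reg)                      # @TABLE(R3)

for field in (0b000011, 0b010011, 0b110011, 0b100000, 0b100011):
    print(f"{field:06b} -> {decode_operand(field)}")
```

Symbolic and indexed addressing share one mode code, so under this encoding R0 cannot serve as an index register.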
So this definitely is a CISC architecture, as you can do things like:
AB *R3+,@TABLE(R2)
This reads the source byte from the address held in R3 and increments R3 by one. It then retrieves the immediate 16-bit address operand TABLE and adds it to R2 to form the indexed destination address. It then reads the byte at that destination address, adds the source byte to it, and writes the result back. As explained before, the actual read operations on the memory bus are 16-bit operations, so there is byte shuffling going on at the same time, depending on the actual effective addresses. And since the workspace registers are actually in memory, the CPU core must also calculate where they are, do read cycles for the registers, and do a write to update R3. So in terms of memory cycles on the 16-bit bus, we have the following:
- Opcode read (AB) from instruction stream
- Read of R3 for the source operand, from [W+(3 << 1)]. The value read is SA.
- Update of R3, so [W+(3 << 1)] gets written
- Fetch of source byte from *R3, i.e. from [SA]
- Fetch of the 16-bit address TABLE from instruction stream
- Fetch of R2 from [W+(2 << 1)]
- Fetch of destination byte from [TABLE+[W+(2 << 1)]]. This is the DA
- Write the result of the byte addition to [TABLE+[W+(2 << 1)]] i.e. to [DA]
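The cycle list above can be replayed with a small bus model that logs every 16-bit access. This is a sketch under stated assumptions: the values of W (>8300), PC (>6000), TABLE (>2000), the register contents and the memory contents are all invented for the example, the instruction is hard-coded rather than decoded, and the cycle order simply follows the list rather than any datasheet timing.

```python
class Bus:
    """16-bit bus model that records each read (R) and write (W) cycle."""
    def __init__(self, mem):
        self.mem = mem
        self.log = []

    def read(self, addr):
        addr &= 0xFFFE
        self.log.append(("R", addr))
        return self.mem.get(addr, 0)

    def write(self, addr, data):
        addr &= 0xFFFE
        self.log.append(("W", addr))
        self.mem[addr] = data & 0xFFFF

def get_byte(word, addr):
    """Pick the byte selected by the address LSB out of a 16-bit word."""
    return (word >> 8) & 0xFF if (addr & 1) == 0 else word & 0xFF

def put_byte(word, addr, b):
    """Merge byte b into the half of the word selected by the address LSB."""
    if (addr & 1) == 0:
        return ((b & 0xFF) << 8) | (word & 0x00FF)
    return (word & 0xFF00) | (b & 0xFF)

def ab_autoinc_indexed(bus, w, pc):
    """AB *R3+,@TABLE(R2), hard-coded; the cycle order follows the list above."""
    bus.read(pc)                               # opcode (not decoded in this sketch)
    sa = bus.read(w + (3 << 1))                # R3 holds the source address SA
    bus.write(w + (3 << 1), sa + 1)            # byte op: R3 is incremented by one
    src = get_byte(bus.read(sa), sa)           # source byte from [SA]
    table = bus.read(pc + 2)                   # TABLE from the instruction stream
    r2 = bus.read(w + (2 << 1))                # index register R2
    da = (table + r2) & 0xFFFF                 # destination address DA
    dst_word = bus.read(da)                    # destination word from [DA]
    result = (get_byte(dst_word, da) + src) & 0xFF
    bus.write(da, put_byte(dst_word, da, result))   # write back to [DA]

W, PC = 0x8300, 0x6000                         # assumed workspace pointer and PC
mem = {
    0x6000: 0xB8B3,                            # opcode word for AB *R3+,@TABLE(R2)
    0x6002: 0x2000,                            # TABLE
    0x8304: 0x0004,                            # R2
    0x8306: 0x1001,                            # R3
    0x1000: 0x1234,                            # source word, byte >34 at >1001
    0x2004: 0x10FF,                            # destination word
}
bus = Bus(mem)
ab_autoinc_indexed(bus, W, PC)
for kind, addr in bus.log:
    print(f"{kind} >{addr:04X}")
print(f"[>2004] = >{mem[0x2004]:04X}, R3 = >{mem[0x8306]:04X}")
```

With these assumed values the log shows eight bus cycles, six reads and two writes, and the destination word goes from >10FF to >44FF while R3 advances to >1002.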
So things get fairly complex - and the above does not show the ALU operations or program counter increments, or the 8 to 16-bit and 16-bit to 8-bit shifts. Definitely a CISC machine.