Experiment to create a minimal 4-bit CPU that can do something fun
I made a couple of changes:
    spg 3   ; set page register to 3
    pga     ; move accumulator value to page register
    out 5   ; set output register to 5
    out     ; move accumulator value to output register
These changes will add some functionality while simplifying the instruction decoding - always nice when that happens!
To make this diagram I tried using Digikey's online schematic tool. The end result isn't too bad but it's not particularly nice to use. If anyone has any good suggestions for tools for making this kind of block diagram, please let me know!
Here's where the design is at the moment:
The accumulator (A) can take the value B, A+B, A nand B, or IN, where B is either a 4-bit immediate, or a 4-bit value from RAM. The accumulator can be sent to an output register, or written to RAM.
The 8-bit RAM address is generated from 4 bits in the instruction word, and 4 bits from a page register (PG). So there are sixteen pages, each 16 words (nibbles) in size - an address of $3F is location F on page 3.
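As a quick illustration of the address formation described above, here is a minimal Python sketch. The function name `ram_address` is made up for this example; the only thing taken from the design is that the page register supplies the upper 4 bits and the instruction operand the lower 4 bits.

```python
def ram_address(page: int, operand: int) -> int:
    """Combine the 4-bit page register (PG) and the 4-bit operand
    into an 8-bit RAM address: PG is the high nibble."""
    assert 0 <= page <= 0xF and 0 <= operand <= 0xF
    return (page << 4) | operand


# Location F on page 3 is address $3F.
print(hex(ram_address(0x3, 0xF)))
```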
In the diagram I've got the input to PG coming from an immediate value, but thinking about it now, why doesn't it come from the accumulator? Then you can compute page addresses and have some indirection. The only reason I can see for not doing that is that you can't have a value in the accumulator and then save it to a specific 8-bit address - you need to clobber the accumulator to update PG. But you could save it temporarily to the current page - perhaps location 0 of each page is kept free for such a thing. More thinking is necessary here - a lot of this stuff has been paged out of my head.
Instructions come from a 4K ROM and are 8 bits wide. The 4-bit operand is either an immediate value or a RAM location. Or - in the case of a jump - the lower 4 bits of the 12-bit jump target. So where do the other 8 bits come from? They come from the next location in ROM, which handily is already available - it's the input to the instruction register rather than the output. All thanks to the "pipelined" nature of instruction fetching. The jump logic will be the topic of the next update.
It would be nice to be able to make use of the carry output from the 74LS283 adder, but it's going to require at least one extra chip to store the carry bit, and maybe more to decode the opcode into a "write carry" signal.
The alternative is to OR all the accumulator's bits together to test for zero. That doesn't need any chips, just four diodes connected like this:
"jnz" (jump if accumulator is not zero) will be our conditional jump.
The question now is, how do you synthesise a carry bit in software if you don't have one in hardware? I couldn't find much information about this - a common definition of the carry bit is "1 if the result of A+B is less than A (or B)", but that's not very helpful - it's not very easy to do an unsigned comparison without a carry flag! In the end I found the answer in the source code for the Gigatron, which I knew doesn't have a carry flag.
    Q = A + B
    if top (sign) bit of Q is set:
        carry bit = top bit of (A & B)
    else:
        carry bit = top bit of (A | B)
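The rule above is easy to check exhaustively for 4-bit values. This is a small Python sketch of that check, not code from the Gigatron; `synth_carry` is a name made up for the example.

```python
def synth_carry(a: int, b: int) -> int:
    """Derive the carry out of a 4-bit add without a hardware carry flag."""
    q = (a + b) & 0xF                 # 4-bit result, carry discarded
    if q & 0x8:                       # top bit of Q set
        return ((a & b) >> 3) & 1     # carry = top bit of (A & B)
    return ((a | b) >> 3) & 1         # carry = top bit of (A | B)


# Compare against the true carry for every pair of nibbles.
for a in range(16):
    for b in range(16):
        assert synth_carry(a, b) == (a + b) >> 4
print("carry rule holds for all 256 nibble pairs")
```

The reason it works: the top bit of Q is the XOR of the two top bits and the carry into that position, so knowing Q's top bit narrows the carry out down to either "both top bits set" or "at least one top bit set".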
This is where having a NAND operation becomes very useful. ANDing A and B is just a case of NANDing, then inverting:
    lda $A
    nan $B
    nan f    ; nand with 0b1111 = invert
OR is ~(~A . ~B), i.e. NAND with both inputs inverted. This requires a temporary location:
    lda $A
    nan f
    sta $notA
    lda $B
    nan f
    nan $notA    ; ~A nand ~B == A or B
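The two constructions above can be sanity-checked with a few lines of Python. This is only a sketch that models the 4-bit NAND; the helper names are invented for the example.

```python
MASK = 0xF  # 4-bit data path


def nand(a: int, b: int) -> int:
    """4-bit NAND, the only logic operation the ALU provides."""
    return ~(a & b) & MASK


def and_via_nand(a: int, b: int) -> int:
    # nan $B ; nan f  -> NAND, then invert by NANDing with 0b1111
    return nand(nand(a, b), MASK)


def or_via_nand(a: int, b: int) -> int:
    not_a = nand(a, MASK)      # lda $A ; nan f ; sta $notA
    not_b = nand(b, MASK)      # lda $B ; nan f
    return nand(not_b, not_a)  # nan $notA


for a in range(16):
    for b in range(16):
        assert and_via_nand(a, b) == a & b
        assert or_via_nand(a, b) == a | b
print("AND and OR from NAND verified for all nibble pairs")
```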
Putting it all together, here's how to add two 8-bit numbers:
    ; input values
            lda f       ; a=0xff (big endian, stored at $0/$1)
            sta $0
            lda f
            sta $1
            lda 5       ; b=0x52 (stored at $2/$3)
            sta $2
            lda 2
            sta $3

    ; add two 8-bit numbers
            lda $1      ; add lo nibbles
            add $3
            sta $5      ; store result at $5
            nan %1000   ; check hi bit
            nan f
            jnz set
            lda $1      ; msb clr: a or b
            nan f
            sta $f      ; $f = not a
            lda $3
            nan f
            nan $f
            jmp next
    set:
            lda $1      ; msb set: a and b
            nan $3
            nan f
    next:
            nan %1000   ; hi bit is carry
            nan f
            jnz carry
            jmp addhi   ; acc already zero if no carry
    carry:
            lda 1
    addhi:
            add $0      ; add hi nibs + carry
            add $2
            sta $4      ; store result at $4
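For anyone who wants to check the routine without hardware, here is a Python sketch that follows the same steps using nibble-sized operations only. It is a model of the listing above, not an emulator of the CPU; the function name `add8_nibbles` is made up.

```python
def add8_nibbles(a_hi: int, a_lo: int, b_hi: int, b_lo: int):
    """Add two 8-bit numbers held as nibble pairs, synthesising the carry."""
    lo = (a_lo + b_lo) & 0xF          # lda $1 ; add $3 ; sta $5
    if lo & 0x8:                      # nan %1000 ; nan f ; jnz set
        carry_src = a_lo & b_lo       # msb set: a and b
    else:
        carry_src = a_lo | b_lo       # msb clr: a or b
    carry = 1 if carry_src & 0x8 else 0   # hi bit is carry
    hi = (carry + a_hi + b_hi) & 0xF  # add $0 ; add $2 ; sta $4
    return hi, lo


# Same inputs as the listing: 0xff + 0x52 = 0x151, low byte 0x51
print(add8_nibbles(0xF, 0xF, 0x5, 0x2))  # (5, 1)
```

Like the assembly version, this drops the carry out of the high nibble, so the result is the sum modulo 256.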
Here is what I am thinking of for the "ALU":
That allows for these instructions:
    000m vvvv   lda   acc = value
    001m vvvv   add   acc = acc + value
    010m vvvv   nan   acc = ~(acc & value)
    011x xxxx   in    acc = in

    m: 0=immediate value, 1=value from given RAM address
    v: 4-bit value
    x: don't care
I really like how this turned out, because the decoding for these instructions requires no additional chips!
    opcode bits:
    3210 vvvv
    ||||
    |||\- operand/memory mux select
    ||\-- accumulator source mux select
    |\--- -"-
    \---- accumulator write enable (active low)
Two opcodes are taken up by "in" but it's worth that small cost.
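To show how directly the opcode bits map onto control signals, here is a small Python model of the four accumulator instructions. It is a sketch under assumptions: the names `decode` and `execute` are invented, RAM is modelled as a flat 256-entry list addressed by page and operand, and opcodes with bit 3 set (not covered by the table above) simply leave the accumulator unchanged.

```python
def decode(instruction: int) -> dict:
    """Split an 8-bit instruction into the control signals listed above."""
    opcode = (instruction >> 4) & 0xF
    return {
        "acc_write": ((opcode >> 3) & 1) == 0,  # bit 3, active low
        "acc_source": (opcode >> 1) & 0b11,     # bits 2..1: 0=B, 1=A+B, 2=nand, 3=in
        "operand_from_ram": bool(opcode & 1),   # bit 0: the "m" bit
        "value": instruction & 0xF,             # 4-bit operand
    }


def execute(acc: int, instruction: int, ram: list, page: int = 0, in_value: int = 0) -> int:
    """Return the new accumulator value after one instruction."""
    d = decode(instruction)
    if not d["acc_write"]:
        return acc  # assumption: other opcodes do not touch the accumulator
    b = ram[(page << 4) | d["value"]] if d["operand_from_ram"] else d["value"]
    if d["acc_source"] == 0:
        return b                       # lda
    if d["acc_source"] == 1:
        return (acc + b) & 0xF         # add
    if d["acc_source"] == 2:
        return ~(acc & b) & 0xF        # nan
    return in_value & 0xF              # in


ram = [0] * 256
ram[3] = 7
acc = execute(0, 0x13, ram)    # lda $3  -> 7
acc = execute(acc, 0x22, ram)  # add 2   -> 9
acc = execute(acc, 0x4F, ram)  # nan f   -> inverted: 6
print(acc)
```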
If we're going to write characters to the display we need a font, which is a lot of constant data. The easiest thing is for that constant data to reside in ROM in the form of instructions, which populate RAM:
    lda %1011   ; load accumulator with 4 bits
    sta $1      ; store at some memory location (upper 4 address bits will come from somewhere else)
    lda %0101   ; next 4 bits
    sta $2
    ...
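Writing those lda/sta pairs by hand for a whole font would get tedious, so a small generator script could produce them. The following Python sketch is hypothetical tooling, not part of the project; it follows the syntax of the snippet above and assumes the data fits within one 16-nibble page.

```python
def emit_loader(nibbles, start=1):
    """Emit lda/sta pairs that copy constant nibbles from ROM into RAM.

    nibbles: iterable of 4-bit values (e.g. font rows split into nibbles)
    start:   first location within the current page
    """
    nibbles = list(nibbles)
    assert start + len(nibbles) <= 16, "data must fit in one 16-nibble page"
    lines = []
    for offset, nib in enumerate(nibbles):
        assert 0 <= nib <= 0xF
        lines.append(f"lda %{nib:04b}")           # immediate load of 4 bits
        lines.append(f"sta ${start + offset:x}")  # store at the next location
    return lines


print("\n".join(emit_loader([0b1011, 0b0101])))
```

Each nibble of data costs two instructions (two bytes of ROM), which is worth keeping in mind when sizing the font.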
The next part of the program then reads the data in RAM and clocks it out serially.
Control flow: we will have something simple like "jump if accumulator is zero". Is this enough? Can you have subroutines when you can only jump to an immediate address? Maybe if you have a jump table at the end of each subroutine, selecting which caller to jump back to.
Addressing: similar question - can all addresses be immediate, or do we need the ability to store addresses in RAM?
There will be many tradeoffs to look at - if removing a couple of chips causes the code size to balloon to make up for it, it may not be worth it.
Thanks for sharing - looks like quite a capable machine for a four-bitter!
Interesting project! 4-bit CPUs are underappreciated, but untold millions of them were produced, driving so many devices in the '70s and '80s. My attempt was to use a single Am2901 slice as the 4-bit CPU core and add some GAL22V10s for instruction decode, input MUX, etc. Sadly, I never documented it sufficiently, but one test program gives an idea of how it worked: https://github.com/zpekic/tinycomputer/blob/master/testprog1.mif
Hi @Kyle McInnes, cool design concepts!
I agree with @zpekic: 4-bit CPUs are a good architecture for learning and understanding many things. They can also do cool things!
Take a look at:
https://github.com/edson-acordi/4bit-microcomputer