-
Bus Access Timing Conflict - Does it Matter?
04/12/2020 at 18:30 • 0 comments
I've written before about timing constraints in the GitHub project. https://github.com/portingle/spam-1/blob/master/docs/timing-considerations.md
I've recently been putting v2 of this processor's logic into Verilog to road test the design before committing to breadboard. Honestly, sometimes I think it would be less hassle to go straight to h/w flying by the seat of the pants, but then I'd have missed out on a bit of new learning with Verilog.
I have four devices that take turns writing to the bus; the ROM, the RAM, the ALU, the UART.
If more than one device is inadvertently asserting on the bus simultaneously then there are risks of high current spikes and potentially damage to chips.
The ROM and ALU both connect to the bus via a 74HCT245 and the RAM and UART connect directly. In each case there is a control line that drives the active-low output enable of the device.
The logic behind the four control lines is ...
    // _decodedOp = the 8 outputs of a 74138 decoder chip - decoding the op code
    assign _rom_out = _decodedOp[op_DEV_eq_ROM_sel];
    assign #11 _ram_out = _decodedOp[op_DEV_eq_RAM_sel] && _decodedOp[op_DEV_eq_RAMZP_sel];
    assign #11 _uart_out = _decodedOp[op_DEV_eq_UART_sel] && _decodedOp[op_RAMZP_eq_UART_sel];
    assign #11 _alu_out = _decodedOp[op_NONREG_eq_OPREGY_sel] && _decodedOp[op_REGX_eq_ALU_sel] && _decodedOp[op_RAMZP_eq_REG_sel];
You can see that the _rom_out line is a direct connection to one of the 8 output wires of the 74138, whereas the other three control lines include an AND gate.
The timings shown, ie 11ns, are realistic for the 74HCT AND gates.
So .... today I put a constraint into my Verilog model to check that at most one of the bus control lines is active at any instant, and the check started failing.
The check looks like this....
    wire [3:0] rrau = {_rom_out, _ram_out, _alu_out, _uart_out};

    always @* begin
        // control lines are active low, so at most one bit may be 0
        assert (rrau == 4'b1111 || rrau == 4'b1110 || rrau == 4'b1101 || rrau == 4'b1011 || rrau == 4'b0111)
        else begin
            $error("Contention on data control lines Rom,Ram,Alu,Uart=%4b !!!!!!!!!!!!!!!!!!!!!!!!\n", rrau);
            $finish_and_return(1);
        end
    end
Going back to the control line logic we find that the 11ns delay is the likely culprit.
Since _rom_out transitions earlier than the other three control lines, each time the ROM takes over the bus two lines are momentarily low at the same time. The ROM starts writing before the previous device has had time to relinquish control of the bus.
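To make the overlap concrete, here is a rough Python sketch of the same situation. It is not the Verilog model from the project: the one-sample-per-nanosecond waveforms, the function names and the hand-over at 50ns are all assumptions made for illustration. Only the 11ns AND gate delay and the "at most one line low" rule come from the text above.

```python
AND_DELAY_NS = 11  # 74HCT AND gate delay quoted in the post


def at_most_one_low(lines):
    """Mirror of the Verilog check: enables are active low, so at most one may be 0."""
    return sum(1 for v in lines if v == 0) <= 1


def delayed(ideal, delay_ns):
    """Delay an ideal waveform (one sample per ns) by delay_ns samples."""
    return [ideal[0]] * delay_ns + ideal[:len(ideal) - delay_ns]


def contention_ns(rom_delay_ns, ram_delay_ns, switch_at_ns=50, length_ns=100):
    """Count the nanoseconds during which more than one enable is low.

    Hypothetical scenario: the RAM drives the bus until switch_at_ns,
    then the ROM takes over. ALU and UART stay disabled (high).
    """
    ram_ideal = [0 if t < switch_at_ns else 1 for t in range(length_ns)]
    rom_ideal = [1 if t < switch_at_ns else 0 for t in range(length_ns)]
    ram = delayed(ram_ideal, ram_delay_ns)
    rom = delayed(rom_ideal, rom_delay_ns)
    return sum(
        1 for t in range(length_ns)
        if not at_most_one_low((rom[t], ram[t], 1, 1))
    )


# _rom_out comes straight off the decoder, _ram_out goes through an AND gate
print(contention_ns(rom_delay_ns=0, ram_delay_ns=AND_DELAY_NS))             # 11
# with a matching delay on _rom_out the overlap disappears in this idealised model
print(contention_ns(rom_delay_ns=AND_DELAY_NS, ram_delay_ns=AND_DELAY_NS))  # 0
```

Real gates won't have perfectly matched delays, so treat the second result as the best case rather than a guarantee.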
But, does it matter??
I think it's unlikely that damage will occur because the vanishingly short 11ns window doesn't allow for much heating to occur in the gates involved in the push-pull.
It is probable that there will be some short current spikes on the power rail, however a capacitor ought to fix that if it's an issue.
But, what other approaches could have been used to avoid or reduce the contention?
Well, it's common to see home brew CPUs using a ROM for control logic. However, I don't think a ROM would have been 100% certain to avoid this kind of thing, due to uncertainty of the ROM output values during switching, but again if it's a very short contention then the same observations as above apply.
Alternatively, I could put a redundant AND gate on the _rom_out line to even up the timings.
Diode logic would also work in the case I have above, and that would have resulted in equivalent timings for all routes, if I also include a diode on _rom_out.
All the above approaches go some way to mitigate the problem by reducing the time window of potential contention to smaller and smaller intervals.
Is there a better approach?
I could have organised the control logic so that all the control wires return to logic high for an interval before one of them alone goes to logic low. But that's a lot of extra logic. Or perhaps some snazzy logic to make the negative transitions take longer than the positive transitions.
Clearly, glitching is a common issue: various 74xx series data sheets state "glitch free" logic in their description.
I think a better approach would have been to use open-collector outputs for the ROM/RAM/ALU/UART. Open-collector outputs tolerate transient overlaps like this because the devices can only pull the line low and never drive against each other. However, firstly, the RAM and UART don't have open-collector outputs and, secondly, I can't find a reasonably priced open-collector alternative to the 74HCT245.
Conclusion
I've decided to live with it. :(
Am I wrong?
How could I do better?
(see also the following which applies not just to FPGA designs https://zipcpu.com/blog/2017/08/21/rules-for-newbies.html)
-
74245 Verilog
03/16/2020 at 08:14 • 0 comments
Still learning Verilog.
I needed a 74245 but couldn't find the code for one, so came up with my own.
Had to learn lots of lessons on INOUT pins and how to write tests for them. It was a lot harder than I thought, and there are still some things I had to do where I don't understand why something simpler didn't work.
-
Waking up..
03/06/2020 at 00:28 • 0 comments
Waking up the design of the ALU and the instruction set and will then get back to the Verilog sim.
Will hopefully have something to share on the YouTube channel soon too.
Thanks
-
Still Here!
02/08/2020 at 00:34 • 0 comments
This project is still alive. I just got distracted by life, and also because I started another project to test the chips I'd bought from eBay and AliExpress as I was a bit doubtful about them.
The other project is an integrated circuit tester that has the uncommon feature that it can also detect tristate outputs. https://hackaday.io/project/169707-integrated-circuit-tester-tristate-too
Back soon on SPAM-1 !
-
EPROM/EEPROM alternatives?
09/16/2019 at 00:11 • 0 comments
Ben Eater has just released his latest vid; this time he's demonstrating the build of a 6502 based computer. In this build he uses a 32k x 8 EEPROM, as he did in his earlier CPU series.
This reminded me of some research I'd done for memory for my own project SPAM-1 which I've attempted to write up here.
BIG ROMS
It's fairly common to use an EEPROM in place of combinatorial logic for an ALU or control logic.
A limiting factor with these devices is the width of the address and data buses, which limits the control space. For example if I wanted to use a ROM as an 8 bit ALU then I'd probably want to have an address space of something like 21 bits (2 lots of 8 bits for the arguments, 4 bits for the ALU operation selection, 1 bit carry in) and a data bus of 16 bits to allow for the 8 bit result plus flags like Carry, Neg, Zero, Overflow and comparison bits.
One can find EPROMs that would fit that bill but these need a UV lamp, higher programming voltages etc, and so I wondered what else was out there. Also of potential concern is that EPROMs and EEPROMs are typically quite slow.
Hunting around it seems that Flash memory is quite common and a lot faster than EPROMs or even EEPROMs. There are some rather large and cheap 16 bit data width NOR Flash devices that look really interesting: https://mou.sr/2NguVcp - some up to 26 bits of address space.
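As a rough illustration of where the 21 bits go, here is a small Python sketch that packs the ALU inputs into a ROM address and builds one 16 bit data word for an add. The bit ordering, the flag positions and the OP_ADD opcode number are all assumptions made up for this example; they are not the SPAM-1 layout.

```python
A_BITS, B_BITS, OP_BITS, CARRY_BITS = 8, 8, 4, 1
ADDRESS_BITS = A_BITS + B_BITS + OP_BITS + CARRY_BITS

OP_ADD = 0  # hypothetical opcode number, for illustration only


def rom_address(a, b, op, carry_in):
    """Pack the ALU inputs into one address. Assumed layout: [op | carry_in | a | b]."""
    return (op << 17) | (carry_in << 16) | (a << 8) | b


def rom_word_for_add(a, b, carry_in):
    """Build the 16 bit word stored for an add: 8 bit result plus flags.

    Assumed flag positions: bit 8 carry, bit 9 zero, bit 10 negative, bit 11 overflow.
    """
    total = a + b + carry_in
    result = total & 0xFF
    carry = total >> 8
    zero = 1 if result == 0 else 0
    neg = result >> 7
    # signed overflow: both operands have the opposite sign to the result
    overflow = 1 if ((a ^ result) & (b ^ result) & 0x80) else 0
    return (overflow << 11) | (neg << 10) | (zero << 9) | (carry << 8) | result


print(ADDRESS_BITS)                              # 21
print(hex(rom_address(0x12, 0x34, OP_ADD, 1)))   # 0x11234
print(hex(rom_word_for_add(127, 1, 0)))          # 0xc80
```

Filling the whole ROM would mean looping over all 2^21 addresses and writing one such word per address, with a branch per operation.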
I need to find a programmer that's guaranteed to program these things - which one !!?
SMALL ROMS
My research for SPAM-1 includes looking for a faster alternative to the usual program memory EEPROM, so 32kx8 or 64kx8 would be fine for this application.
The EEPROM is probably around the 120ns to 140ns mark, however RAMs can be as fast as a 25ns read cycle.
What I found myself looking at was various forms of non-volatile RAM for the program memory, for example a FRAM or an integrated battery backed nvSRAM.
One additional potential advantage of using non-volatile RAM is not needing a regular EPROM programmer. Non-volatile RAM behaves like RAM but retains its data when disconnected from power. So there's no need for a custom EEPROM programmer if you already have something like an Arduino with the 50 pin I/O (or a Nano with a bunch of shift registers!) - you can use that to write to the NV RAM. Unlike an EEPROM there are no complicated JEDEC timing and control signal protocols; instead you just manipulate the I/O on the device like it's a RAM.
Here's a scratch pad of my notes on the non-volatile RAM:
Some of the more recent devices don't come in DIP but one can use an "interposer" to adapt the chips back to DIP. Eg see these http://www.proto-advantage.com/store/interposer.php
---
ROM alternatives:
The M48Z35Y nvRAM is 32kx8 and £12 retail https://uk.rs-online.com/web/p/nvram/1686072/
This comes in a 28 pin DIP package you can plug straight into a breadboard.
----
CY14B256LA-SZ45XI This 45ns nvSRAM is 32kx8 and costs about £10.
It is SOIC-32 so needs an adapter to DIP-32.
A SOIC-32 to 32 pin DIP interposer can be about £5.
There is a 25ns version for £2 more.
If you can find the STK16C88 then this is already in a DIP package so no soldering!!
---
FRAM is another special RAM type but is SMT only.
Eg ://www.mouser.co.uk/Semiconductors/Memory-ICs/F-RAM/_/N-488wv?P=1z0vhtuZ1yyxb3oZ1yxt9bd
You can then pick up a SMT to DIP adapter for the FRAM for between $1 and $5, depending on your choice of FRAM.
Using this adapter you can mount the FRAM on breadboard for programming and for use in the final circuit.
---
So say you were running at 3.6volts then you might get one of these for £3.50 https://www.mouser.co.uk/ProductDetail/ROHM-Semiconductor/MR48V256CTAZAARL?qs=sGAEpiMZZMtsPi73Z94q0LLtWXL8TrlCrovkBv8dHEM%3D
... and use a TSOP1 adapter like this for £5, perhaps ..
http://www.proto-advantage.com/store/product_info.php?products_id=2200248
---
If using 5v then it's a bit more expensive for the FRAM, at £14 for a FM18W08 FRAM.
The 28 pin SOIC to 28 pin DIP adapter is only £1 and the SOIC package is bigger which you might find easier to solder.
http://www.hobbytronics.co.uk/soic-dip-breakout-28?utm_source=google&utm_medium=googleshopping&utm_campaign=googlebase&gclid=CjwKCAjwwvfrBRBIEiwA2nFiPdOdl0rjmINok3TjMRHf6SGsT8L8cazImt3jfQeJAdN3Try3Q5Q54xoCeuAQAvD_BwE
-
AliExpress?!
09/08/2019 at 12:34 • 0 comments
Wow, stuff arrived in about 2 weeks, not the expected 30-40 days.
-
Clock Gating
09/07/2019 at 11:54 • 5 comments
I'd prefer not to mess with the clock. If I'm using a chip like the 8 bit register 74HC377, which has a LE (latch enable) and a clock, then I can bring LE high to disable the input to the latch.
But what if I'm using the 74HC574, which is also an 8 bit register similar to the 74377 but swaps the LE for OE (output enable)? If I need control over whether data is clocked into this chip then the only option left to me is to gate the clock.
One of the main things to avoid with clock gating is the problem of creating spurious clock signals. Whether there is a risk of this depends on the method taken for the gating and the relative timing of the raw clock versus the other signals with which you are gating the raw clock.
Whilst a spurious clock may or may not be a problem when latching, it is more likely to cause an issue with a counter, such as in the program counter or within a microcode circuit's timing. Having the counter advance unexpectedly and with an extremely short duty cycle would likely cause a malfunction, including violating timing constraints of other components within the overall circuit.
A common case is where the clock is ANDed with some other control signal - I need to do exactly this in a register file I'm designing because I'm using the 74HC574, which doesn't have a LE input. I need the 74HC574 for its tristate output, which means I need to compromise on the clock gating. It's a question of reducing the complexity. But now I'm worried about gating effects. A spurious clock might make me latch the wrong value. Surely an AND of a control signal and the clock won't cause a problem?
The paper A Review of Clock Gating Techniques covers a few of the approaches to clock gating and the first option discussed is a "simple" AND gated approach. The paper demonstrates how the relative timing aspect can cause spurious triggers.
However in the paper I wondered whether the analysis was correct, whether a hazard had been missed. Below I've highlighted the transition I'm concerned about in this extract.
It seems to me that there is a risk that this transition might cause a spurious clock on Gclk. This might happen if the en went high moments before the fall of the raw clk. Whether or not this potential glitch occurs in practice depends on the derivation of the en. In a CPU the en signal might be generated from some combinatorial logic and is likely to lag the clk due to propagation delays, in which case the problem I'm concerned about wouldn't happen.
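The hazard can be sketched with a toy Python model of the AND gate. This is only an illustration: the 100ns clock period, the one-sample-per-nanosecond waveforms and the two chosen en rise times are assumptions, and the gate itself is treated as having zero delay.

```python
def gated_clock(clk, en):
    """Simple AND gating: Gclk = clk AND en, sample by sample."""
    return [c & e for c, e in zip(clk, en)]


def high_pulse_widths(signal):
    """Return the width in samples of every high pulse in the signal."""
    widths, run = [], 0
    for v in signal:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


LENGTH_NS = 300
PERIOD_NS = 100  # assumed clock period, 50% duty cycle, one sample per ns
clk = [1 if (t % PERIOD_NS) < 50 else 0 for t in range(LENGTH_NS)]


def enable_rising_at(t_rise):
    """en is low until t_rise and high afterwards."""
    return [1 if t >= t_rise else 0 for t in range(LENGTH_NS)]


# en goes high 3ns before clk falls at t=50: a 3ns runt pulse appears on Gclk
print(high_pulse_widths(gated_clock(clk, enable_rising_at(47))))  # [3, 50, 50]
# en lags the falling edge, as it would after some combinatorial logic: no runt pulse
print(high_pulse_widths(gated_clock(clk, enable_rising_at(55))))  # [50, 50]
```

The first case is the transition I'm worried about; the second is what I'd expect when en is derived from logic that lags the clock.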
The same paper goes on to criticise NOR and also other techniques - worth reading.
Timing matters.
-
Timing is Everything
08/29/2019 at 14:56 • 0 comments
Researching Propagation Delays
One might at some point ask the question "what limits the speed of a CPU?". In most circuits the speed limit probably comes down to the well documented timing limitations of the components you've chosen to use. To be fair, factors such as capacitance, or even inductance, damping out or creating higher frequency signals might complicate things, but I haven't the experience to say how big a role capacitance and inductance play in limiting home brew CPU speeds; that is, CPUs that run at 10MHz or less. My instincts tell me that capacitance and inductance are unlikely to be a huge concern unless attempting to run at higher frequencies, and in any case these factors would probably be highly dependent on the specific layout of one's CPU.
We know that running a home brew CPU in the 1MHz-5MHz range is entirely feasible and it's easy to find folk doing this, such as Warren Toomey and his CSCvon8.
Why not faster?
State changes in electronics are fast but not instantaneous. All electronic components have delays around the time needed to set up the inputs of the component and/or delays between an input changing and the output of the component reacting to that change. During these intervals the outputs of the component cannot be relied on to have a stable and deterministic value, and these kinds of delays put a hard limit on how fast one can run a given component.
In addition, many components will state how long the inputs of the device (eg the data in of a latch) must be held stable prior to a trigger or enable signal. If the inputs are not held stable sufficiently long prior to the trigger then the data seen by the device is unpredictable; ie potentially random behaviour.
When we put several components together in a circuit then the individual delays of the components sum up on the critical path of the larger circuit to further reduce the maximum operating frequency of the circuit as a whole.
For example, consider the logic around an instruction that latches a constant from the program memory into a register. The program memory, eg a ROM, will have associated control logic around it and the output of the ROM will be loaded into a register, and all those components will have their own delays.
For example, let's say the control logic, a 74HCT151 multiplexer, takes 25ns to stabilise the output enable flag on the ROM, the ROM, an AT28C256-15, takes 150ns to display the new data value and finally the register, a 74HCT374, takes 12ns to latch the ROM output, then the critical path through those components is 187ns. In this case the fastest one could expect to operate this path would be of the order of 5.3MHz (= 1 divided by 0.000,000,187 seconds).
And depending on the choice of components this max clock rate could be considerably less. For example, if instead we choose the slower AT28C256-35 ROM then we are looking at a tACC (time from address to data output stabilisation) or tCE (time from chip enable to data output stabilisation) of 350ns, which would mean a max frequency of no more than 1/(25+350+12)ns = 2.6MHz approx.
All the calcs above are approximations from the data sheets (which I may have misread) but the principle stands that longer delays in the individual components translate to lower operating frequencies for the CPU.
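The arithmetic for the two ROM examples can be written down as a few lines of Python. The delay figures are the ones quoted in the text; the function name and the dictionary layout are just my own choices for the sketch.

```python
def max_clock_mhz(delays_ns):
    """Upper bound on clock rate for a path: 1 / (sum of the delays along it)."""
    total_ns = sum(delays_ns.values())
    return 1000.0 / total_ns  # 1 / ns expressed in MHz


fast_rom_path = {
    "74HCT151 control logic": 25,
    "AT28C256-15 ROM": 150,
    "74HCT374 register": 12,
}

slow_rom_path = {
    "74HCT151 control logic": 25,
    "AT28C256-35 ROM": 350,
    "74HCT374 register": 12,
}

print(sum(fast_rom_path.values()))              # 187
print(round(max_clock_mhz(fast_rom_path), 2))   # 5.35
print(sum(slow_rom_path.values()))              # 387
print(round(max_clock_mhz(slow_rom_path), 2))   # 2.58
```

Swapping in different data sheet figures is then just a matter of editing the dictionaries.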
If one uses a more sophisticated simulation or modelling tool than Logisim then the tool will probably calculate your critical path for you and tell you the timings. The critical path is the slowest path and thus the path likely to have the greatest impact on speed.
Most of the time I figure you can work it out roughly like I have above and then decide if you want to make adjustments to your design.
One significant improvement to the 150ns ROM example given above might be to have the ROM contain a simple boot program that copies the program code from ROM into a much faster SRAM chip at power-on and then run the program entirely from the SRAM. For example, it is easy to find 32Kx8 SRAM chips with a 20ns read cycle time instead of the 150ns EEPROM, and this might get the CPU clock up to speeds closer to 10MHz. I've not tested that yet of course, but you get the idea.
More info on propagation delays can be found in the MIT Computational Structures lecture notes which are quite detailed. And, of course, don't forget to look in your data sheets.
Clock Domain Crossing
If interested in clocks and timing then you should also take a look at Warren Toomey's answer to a question I posed him on clock timing where he goes into a lot of useful and interesting details (thanks).
However, Warren also pointed me to a page about "Clock Domain Crossing" which I think bit me a while back when I was doing something "clever" with dividing my clock. I did it badly and started experiencing an issue where I got two clock pulses when expecting only one. I figured out on my own that my stupid divider wasn't reliable but didn't have any clue what the technical term for my stupid problem was. I am now wiser thanks to Warren and I advise you to visit that link also.
I know that the comments section in one of Ben Eater's vids contains a bunch of folk talking about why he didn't just "gate the clock" or something like that as it would have been easier.
I will now paraphrase the whole thing as "don't mess with the clock!".
Advice on clocks for newbies
Please also go and read the "Rules for new FPGA designers" page regardless of whether you are planning an FPGA or discrete IC based design. This page is linked off the Clock Domain Crossing page that I mentioned earlier.
You'll see immediately that I (also Ben Eater and others) violate rule 2 by having the PC advance on one edge and the execution firing on the other edge. In fact that same page says that this setup "acts like separate clocks", which I agree with. They act like inverse clocks and Ben Eater even says we need inverse clocks in one of his vids. The important point is knowing what the options are and the pros and cons, then making an informed decision (and simulating first !!!).
Like myself, Warren Toomey also had a signal glitch problem in the Crazy Small CPU build that was resolved by bringing the clock signal into the logic. He discusses the problem in Crazy Small CPU #13 at about 2m27s. The approach of bringing in the system clock is also cited on the newbie advice page as a potential solution to such problems.
The same newbie advice page gives useful guidance on synchronising buttons and other external inputs with the system clock. Coincidentally the need to synchronise a button press with the system clock came up recently where I had to synchronise the system reset button with the program counter logic, and I achieved this by use of an SR latch.
-
Great Advice
08/16/2019 at 00:26 • 0 comments
Been speaking to Warren Toomey of Crazy Small CPU fame and The Unix Heritage Society.
I strongly recommend you check out CrazySmallCPU and the YouTube playlist for the build. Also see CSCvon8, the 8 bit successor. There are some really interesting compromises in there that will probably directly influence my own build, in particular the ALU.
Many others have also used a ROM of some kind, typically an EEPROM, for the combinatorial logic of an ALU (and other components) but I've struggled a bit with most of them as they are typically 4 bit or don't support features like the full set of typical ALU flags out etc. I wanted an 8 bit ALU that would support exploring arbitrary arithmetic and also a full set of flags and compare, and I also wanted more than just Add and Sub; I wanted a much more complete set of arithmetic and logic ops. The problem was I couldn't work out how to do it on the EEPROMs I was able to locate.
Doing all that with a pair of 8 bit wide ROMs seemed complicated as well, but Warren identified a 2Megx16 bit wide EPROM that has all the data pins necessary for his (and my) needs and also has a wide enough address range to support 2 sets of 8 bits of data in, plus 5 bits left over for ALU operations. They are also cheap, however they are hard'ish to find, need UV erasure and won't program using the Arduino that most folk use for the EEPROMs - so there are some other challenges.
Anyway - that was incredibly useful to me.
Also useful was Warren's recent blog post on control signal timing that covered in detail some questions I had. Ben Eater's series and others don't go into a great deal of info on this and I'm grateful for the experiences shared by Warren.
Also useful to me were pointers about I/O which I will almost certainly implement.
-
Decisions decisions
08/11/2019 at 13:22 • 0 comments
Need to decide
- whether to further simplify
or go the other way and ...
- add further general purpose registers
- and IO - IO might be achieved by memory mapping the devices, or by adding further special purpose registers etc
Also ...
- consider adopting a higher level language - ? something like ttlcpu ?
- deciding whether to use higher level instructions to drive the microcode or just put this complexity in the Assembler (or Compiler) instead
And ..
- working out how to integrate with VGA - probably using a NovaVGA and some shift registers as IO.
- working out how to integrate with input devices (eg a potentiometer if we were to implement a Pong game for instance)