In this post I want to go over the architecture of COMET68k in some more detail. Specifically I'll describe the busses that exist, which peripherals are on them, and how they function.
Basic Architecture
COMET68k is an enhanced collection of peripherals integrated onto one PCB, enough to make a rather capable standalone or single board computer (if all you care about is serial for input/output). These peripherals are connected to two basic busses, with a 3rd bus providing expansion capability:
- The processor "local" bus contains all 16-bit capable peripherals and devices
- The peripheral "X" bus contains all 8-bit peripherals
- The expansion "system" bus is a buffered copy of the local bus
All of the above busses relate to the data portion of the bus. The address bus is largely singular, except when it comes to the expansion header where most signals are buffered in some way. One address signal is synthesised from the upper data strobe to produce XA0 for the X bus to permit 8-bit devices to be readable and writeable on sequential addresses.
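To make the XA0 idea a little more concrete, here is a minimal behavioural sketch in Python. It is not taken from the CPLD source. It assumes that XA0 simply follows the level of the active-low upper data strobe during a normal peripheral access, and the function names `xa0_from_uds` and `x_bus_address` are made up for illustration.

```python
def xa0_from_uds(uds_n: int) -> int:
    """Derive XA0 from the active-low upper data strobe (assumed mapping)."""
    # UDS/ asserted (0): the even byte on D15-D8 is selected, so XA0 = 0.
    # UDS/ negated (1) during an access: only the odd byte is wanted, so XA0 = 1.
    return 1 if uds_n else 0


def x_bus_address(cpu_address: int, uds_n: int) -> int:
    """Combine the CPU address lines (A1 and up) with the synthesised XA0."""
    # The 68000 has no A0 pin, so bit 0 of the X bus address comes from XA0.
    return (cpu_address & ~1) | xa0_from_uds(uds_n)
```

Under that assumption, a byte access to the even address and a byte access to the odd address of the same word land on two sequential X bus addresses, which is what lets an 8-bit device appear at consecutive byte addresses.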
Here is a diagram showing how all of the peripherals fit in to these busses:
Local Bus
The local bus is connected directly to the CPU. It's a 16-bit wide bus connecting to DRAM, the ethernet controller, the X bus buffers and latches, and the buffers for the system bus. The CPU and ethernet controller can both be bus masters, and therefore have direct 16-bit access to DRAM to read instructions and data. This is particularly important for the ethernet controller so that it can DMA packet data as quickly as possible to and from DRAM during transmit and receive operations.
I chose the Am79C90 ethernet controller specifically for its capability to work directly from memory (also known as "zero copy") as this removes the need to copy packets to and from FIFOs or RAM internal to the ethernet controller itself. Although it complicates the system design a little bit, it helps improve performance.
X Bus
I refer to this as the "cross" or "X" bus because it allows 8-bit peripheral data to be directed to or from either half of the CPU local bus. The X bus houses all 8-bit peripherals such as the RTC, UARTs, and some GPIO to allow control of some status LEDs, ethernet loopback options, and reading of configuration jumpers and other status bits. It also houses both of the ROM sockets that the CPU executes code from.
As I mentioned in my first post, I have implemented a technique which permits execution of code from a single 8-bit wide ROM, despite the CPU having a 16-bit interface. This is achieved using a latch and buffer which, when directed appropriately, can enable a 16-bit value to be read from the ROM and presented to the CPU. The idea is shamelessly borrowed from an HP JetDirect print server card that I bought in a rummage sale. But I want to give you a more detailed look at how it works, and I think an (annotated) timing diagram is probably the best way to do this, along with a brief explanation of a 68000 cycle (you can click the image to access the Wavedrom website and view the source of this diagram).
At the falling edge of CPU state S0 the CPU asserts an address on to the address bus (a), arming address decoding circuitry.
At the rising edge following S1 the CPU then asserts its Address Strobe signal to indicate that the address is valid (b). This causes the ROM chip select signal to be asserted (c), and the ROM starts outputting the lower byte of data.
At the next negative edge of the 40MHz clock the X bus state machine has started to assert some of the control signals required to interface the X bus to the CPU local bus, in this case allowing the contents of the X Data bus through to the lower half of the CPU local bus (d) via a latch. It then proceeds to implement a small delay to satisfy the access time of the ROM.
At the end of this delay several things happen at marker (e):
- The latch enable signal is brought high to stabilise the data on the lower half of the CPU local bus
- XA0 is brought low to begin accessing the upper half of the word that is being read from ROM
- The buffer between the X Data bus and the upper half of the CPU local bus is enabled
- DTACK/ is asserted to indicate to the CPU that it may end the cycle when it is ready
At this point the X bus state machine moves to a state where it waits for AS/ to be negated. At the falling edge of the CPU clock at the end of S4 the CPU will begin looking for DTACK/ to be asserted so that it can end its current cycle. We have already asserted it, so the CPU starts ending the cycle straight away. The cycle doesn't end instantly, though: it takes one more CPU clock cycle's worth of time to finish, providing sufficient time for the upper half of the word in the ROM to be presented and stabilise on the upper half of the CPU local bus.
At (f) the CPU is ending its bus cycle by negating AS/. This causes DTACK/ to be negated by the X bus state machine, along with the ROM chip select which stops outputting data. The X bus state machine recognises that AS/ has been negated at the next falling edge of the 40MHz clock, and will then reset all of its control signals to their initial state (g) and proceed back to its idle state to wait for another cycle.
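The data path of the sequence above can be modelled in a few lines of Python. This is only a rough sketch of the byte ordering, not the actual state machine: it ignores all timing, and the function name `read_word_from_byte_rom` and the sample ROM contents are invented for illustration.

```python
def read_word_from_byte_rom(rom: bytes, word_address: int) -> int:
    """Model one CPU word read served by a single 8-bit ROM on the X bus."""
    base = word_address & ~1      # the 68000 reads words from even addresses

    # Phase 1: XA0 = 1 selects the odd byte, which belongs on D7-D0.
    low_latch = rom[base | 1]     # held by the latch from marker (e) onwards

    # Phase 2: XA0 = 0 selects the even byte, passed by the buffer to D15-D8.
    high_buffer = rom[base]

    # The CPU samples both halves together when it ends the cycle.
    return (high_buffer << 8) | low_latch


# Sample contents: a 68000 NOP (0x4E71) followed by BRA.S to itself (0x60FE).
rom_image = bytes([0x4E, 0x71, 0x60, 0xFE])
first_word = read_word_from_byte_rom(rom_image, 0)
second_word = read_word_from_byte_rom(rom_image, 2)
```

The point of the model is the ordering: the odd (lower) byte is fetched first and held in the latch, then XA0 flips and the even (upper) byte is passed straight through, so both halves are valid when the CPU samples the bus.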
And there we have it: a word has been read from a single 8-bit ROM and presented to the CPU on its 16-bit data bus without requiring any wait states. This scheme and its timing are heavily tied to the CPLD clock (40MHz) being 4x the CPU clock. At higher CPU speeds the scheme needs modification to adjust the point at which DTACK/ is asserted, so that all required timing is met and the CPU doesn't end up reading garbage from the ROM.
For those of you who are familiar with the 68k, you may have noticed that there is no mention of the UDS/ and LDS/ signals above. For simplicity's sake, a read of any size from the ROM always causes a full word to be read and presented towards the CPU. This makes use of the behaviour of the CPU whereby it will take whatever data it needs from the appropriate halves of the data bus given the operation being performed. And since a word read fits within a single CPU cycle anyway, there is no advantage gained from handling byte reads.
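For a sense of scale, the clock relationship works out as follows. This is simple arithmetic from the figures above, plus the standard 68000 read cycle of eight half-clock states (S0 to S7) when no wait states are inserted. The variable names are just for illustration.

```python
CPLD_CLOCK_HZ = 40_000_000
CPU_CLOCK_DIVIDER = 4          # the CPLD clock is 4x the CPU clock
STATES_PER_READ_CYCLE = 8      # S0..S7, no wait states

cpu_clock_hz = CPLD_CLOCK_HZ // CPU_CLOCK_DIVIDER   # 10 MHz
cpld_period_ns = 1e9 / CPLD_CLOCK_HZ                # 25 ns
cpu_period_ns = 1e9 / cpu_clock_hz                  # 100 ns
state_ns = cpu_period_ns / 2                        # each S-state is half a CPU clock
read_cycle_ns = STATES_PER_READ_CYCLE * state_ns    # 400 ns
cpld_clocks_per_read = read_cycle_ns / cpld_period_ns  # 16 CPLD clocks
```

So the X bus state machine has 16 of its own clock periods to fit both byte accesses into one CPU read cycle, which is why changing the CPU speed shifts the point at which DTACK/ has to be asserted.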
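A tiny sketch of why the always-read-a-word approach is safe: for a byte read the 68000 only looks at one half of the data bus, chosen by whether the address is even or odd. The function name `byte_seen_by_cpu` is hypothetical.

```python
def byte_seen_by_cpu(word_on_bus: int, byte_address: int) -> int:
    """Return the byte a 68000 takes from a full word on its data bus."""
    if byte_address & 1:
        return word_on_bus & 0xFF       # odd address: lower half, D7-D0
    return (word_on_bus >> 8) & 0xFF    # even address: upper half, D15-D8
```

Whichever byte the CPU wants, it is already present on the correct half of the bus, so the X bus logic never needs to know the size of the access.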
System Bus
The system bus permits the system to be expanded with external peripherals on additional cards. It mostly consists of a buffered copy of the address and data busses from the CPU, along with many of the control signals. Physically it follows the VMEbus standard for its pinout, but logically it is more like a standard asynchronous expansion bus with multi-master capability to allow DMA operations.
It is a bi-directional bus, whereby an off-board master taking control of the bus causes all of the on-board peripherals and memories to be accessible externally. This would permit, for example, a second ethernet interface to be provided on another card, which would be able to send and receive packets from memory contained on the COMET68k board itself. This is the theory at least. At the time of writing this is the only functionality in the design that has not yet been tested, because a design issue with my rev 1 board left me without any expansion capability.
CPLD
The CPLD glues everything together and implements a lot of logic that might otherwise be done in discrete chips. For a board with this level of integration and in such a form factor, it would not be possible without using a CPLD (or a small FPGA) as the quantity of discrete logic required would far exceed the board space available.
The CPLD is responsible for many and varied tasks:
- Clock generation for the CPU, ethernet controller and timers
- DRAM refreshing
- Address decoding for various peripheral address ranges
- Bus watchdog to terminate bus cycles if they are not otherwise terminated in a timely manner
- Interrupt prioritisation
- Bus arbitration between the multiple devices that can be masters
- Specialised handlers for the X bus and ethernet controller
There is so much going on in the CPLD that it will be covered in a post of its own in the future (maybe even a couple). Needless to say, nothing happens without this crucial piece of the puzzle.
The CPLD I have used is an EPM7128S and current macrocell usage is approximately 110 of the available 128. It still blows my mind that so much logic can be crammed into such a small device! Macrocell usage can likely be optimised and potentially further reduced (but maybe by only a small amount) by shuffling some signals about to improve fitting, and changing some of the logic from synchronous to asynchronous. This is perhaps something for another time though!
Discrete Registers (GPIO)
Two system-level registers are implemented via some discrete buffers and latches to provide some basic GPIO. These provide some basic I/O for reading configuration jumpers, controlling status LEDs, enabling loopback for the ethernet phy, and reading some flags which can help determine the cause of a reset.
The flags that determine the reset cause are something I have brought over from my hobby work with microcontrollers. Microcontrollers tend to implement a register with a (more comprehensive) set of bits that can provide all kinds of insight into reset causes, such as power-on, brownout, watchdog, and oftentimes various other reasons. In COMET68k I have implemented two bits that provide a good level of visibility (but do still leave some "blind spots" because a variety of different resets either don't set any flags or are treated the same):
- Power-on reset
- Watchdog timeout
The POR flag is set whenever the system powers up, or if a brownout is detected (VCC drops to 4V or less). The watchdog timeout flag is set whenever a reset is generated due to the watchdog timer not being reset soon enough (1.6 seconds according to the MAX705 supervisor that I use). Combinations of these bits being set or cleared can help software determine why it is starting up.
For example, the POR bit being set would indicate that the system has just turned on, and may prompt software to perform a rigorous memory test and initialise some areas of memory in different ways. If the POR bit is clear, software can probably skip the memory test because this was likely already performed. If the watchdog bit is set, and/or the POR bit is clear, the software may like to preserve some areas of memory which may contain logs or other information to help a developer determine what happened, or to help it recover back to the state it was in before the reset.
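The decision described above could be sketched like this. It is a model of the policy only: the bit positions `POR_FLAG` and `WDT_FLAG` are hypothetical (the real register layout isn't given here), and the `StartupAction` names are invented for illustration.

```python
from enum import Enum

# Hypothetical bit positions; the real register layout may differ.
POR_FLAG = 0x01
WDT_FLAG = 0x02


class StartupAction(Enum):
    COLD_BOOT = "run the full memory test and initialise memory"
    WATCHDOG_RECOVERY = "skip the memory test and keep logs for diagnosis"
    WARM_BOOT = "skip the memory test and keep preserved memory"


def choose_startup_action(status: int) -> StartupAction:
    """Map the two reset-cause flags to a startup policy."""
    if status & POR_FLAG:
        # Power was just applied (or browned out): memory contents are unknown.
        return StartupAction.COLD_BOOT
    if status & WDT_FLAG:
        # The watchdog fired: memory is intact and may explain what happened.
        return StartupAction.WATCHDOG_RECOVERY
    # Some other reset with power maintained: memory is still valid.
    return StartupAction.WARM_BOOT
```

Checking the POR flag first is a deliberate choice in this sketch: if power has just been applied, nothing in memory can be trusted, regardless of what the watchdog flag says.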
Some General Updates
Revision 2
At this point in time I have fixed a small number of minor issues, and made a couple of enhancements for revision 2:
- I had no pull ups/downs on chip select and buffer enable signals, and I believe this contributed towards excessive current draw on power-up. This is what caused me to remove the expansion bus buffers on my original PCB (which you can see in the photos). For rev 2 I have added pull ups/downs as required on many of these signals to hopefully alleviate this issue.
- I determined that the POR and watchdog timeout flags weren't as useful as I had hoped in rev 1, because it seems that the watchdog flag was set on power-up. A design change (which I tested on a breadboard this time!) has fixed that, but means that the watchdog flag cannot be set unless the POR flag is cleared. This means software must read the status register on startup to ensure that future events will be captured correctly, but for software that doesn't care about these status flags it makes no difference.
- The trace which I added at the last moment before sending off my rev 1 PCB design for manufacturing was meant to provide a means for me to do "something" with the CPLD under software control. Instead it caused all of the power planes to be shorted together, requiring some fine drilling-out of some vias with a Dremel. It has been re-connected with a bodge wire and repurposed as a software IRQ mechanism. This helped me implement better task yielding in my 68k FreeRTOS port, and is very welcome.
- I discovered that the timers can be gated in software, so the external hardware gate that I had run was no longer needed. This has been re-routed to the CPLD so I still have a spare signal that I can use for "something" with the CPLD. Exactly what, I don't know at this stage. It was always just a "nice to have" kind of thing that might find some purpose in the future.
- With a small amount of shuffling of some address decoding related to the GPIO, I have provided myself with an ability to read the status of the ethernet phy's link status pin. This may be useful in FreeRTOS+TCP to provide a way to report to FreeRTOS when the ethernet link has gone down or come up.
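The rev 2 behaviour of the reset flags from the list above can be illustrated with a small behavioural model. This is a sketch under assumptions rather than a description of the real circuit: it assumes that reading the status register is what clears the POR flag, and it leaves the watchdog flag set until the next power-up because the way it is cleared isn't described. The class name `ResetFlagModel` is invented.

```python
from typing import Tuple


class ResetFlagModel:
    """Behavioural model of the rev 2 reset-cause flags (assumed semantics)."""

    def __init__(self) -> None:
        self.por = False
        self.watchdog = False

    def power_on(self) -> None:
        self.por = True
        self.watchdog = False

    def watchdog_timeout(self) -> None:
        # The watchdog flag can only be captured once POR has been cleared.
        if not self.por:
            self.watchdog = True

    def read_status(self) -> Tuple[bool, bool]:
        # Assumption: reading the register clears the POR flag.
        flags = (self.por, self.watchdog)
        self.por = False
        return flags


# Software that reads the register at startup sees later watchdog resets.
board = ResetFlagModel()
board.power_on()
first_boot = board.read_status()       # POR set, watchdog clear
board.watchdog_timeout()
after_watchdog = board.read_status()   # POR clear, watchdog set

# Software that never reads the register never captures the watchdog event.
ignored = ResetFlagModel()
ignored.power_on()
ignored.watchdog_timeout()
```

The second half of the example shows the trade-off: if software never reads the register, the POR flag stays set and the watchdog event is not recorded.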
Design Files
With the design now stabilised I will work to get my GitHub repo cleaned up a bit and release it to the public in the near future.
Thanks for reading! :-)