-
Back in the Game
10/12/2018 at 14:28
It's been a while, but I thought I would add a log.
I decided to write a generic routine to allow text to be sent to the LCD screen on the dev board I have. I wrote a specific one a long time ago that sent 'Hello World' out to the display.
Now, with the Dual and Single complexes, I thought: let's start to put bits of code together so that they can be used.
It was time to take the technique used for getting unformatted text out on the 'console' in the simulator and add in the LCD option.
Extracted the specific routine and then started to write the more generic routine.
Ran it up and hit a problem. This can be quite disheartening in a project that has been running this long. When you hit problems you start to think
'Is it all worthwhile?'
An interregnum ensues, in which phased array audio is investigated and HP calculators are used.
But the laptop is with me on holiday and I start to skin the onion: all bugs have layers, and each layer brings tears to the eye.
Crack the first RTL bug: it sometimes skipped the stack pointer update! Move on to the Assembler, fix a bug there, hit a new RTL bug (where have the instructions gone?), fix that. Finally fix a typo in the Assembler (this was part of the old code but not up to date with the present instruction name).
Now I have the PIO block sending out the control signalling for the LCD. The next step is to get it all synthed again and plugged into the dev board.
The next thing after this will be to add in a proper comms link. There is still no RS232 block in there, which hopefully should not be too hard to add!
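As a rough illustration of what a generic LCD text routine has to do, here is a small Python model of the write sequence. It is a sketch, not the Trinity assembler: it assumes an HD44780-compatible two-line display driven in 8-bit mode, and `pio_write` is a hypothetical stand-in for the PIO block. Real hardware also needs the enable strobe and the delays between commands, which are left out here.

```python
# Sketch only: models the sequence of PIO writes, not the real Trinity code.
# Assumption: an HD44780-compatible two-line LCD driven in 8-bit mode.

# Function set (8-bit, 2 lines), display on, clear display, entry mode.
LCD_INIT = [0x38, 0x0C, 0x01, 0x06]


def lcd_writes(text, line=0):
    """Return the (rs, byte) pairs needed to show `text` on one line.

    rs=0 marks a command write, rs=1 a character data write.
    """
    writes = [(0, cmd) for cmd in LCD_INIT]
    # Set DDRAM address: line 0 starts at 0x00, line 1 at 0x40.
    writes.append((0, 0x80 | (0x40 if line else 0x00)))
    # After the preamble the controller takes plain ASCII codes.
    writes.extend((1, ord(ch)) for ch in text)
    return writes


def send_to_lcd(text, pio_write, line=0):
    """Push the writes out through a caller-supplied PIO function (hypothetical)."""
    for rs, byte in lcd_writes(text, line):
        pio_write(rs, byte)
```

Keeping the text-to-writes step separate from the PIO access means the same sequence can be checked in the simulator 'console' before it goes anywhere near the board.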
-
Full !
01/14/2018 at 17:22
It's been a while since I synthesized anything relating to Trinity.
I thought I would, however, take the Dual Core Complex and build it.
When it was synthesized back in 2011 it was around 13000 logic blocks, with lots of room.
Now the Dual Core Complex, Timer, PIO, Interrupt Controller and Memory Interconnect have consumed the Cyclone III FPGA: it's at 95% fill!
I spotted some long paths in there which I wasn't expecting, so I found a way of reducing those. I also added the PLL to give a clean clock.
The timing reports advise that I can get up to 30 MHz, but I've got the clocks down at 25 MHz.
Yes, it would be great to get higher than this, but at 95% fill I am pleasantly surprised it goes this fast.
There is a basic rule of thumb: once you start to get above 60% fill, timing becomes harder, and the more fill the harder. This is because the logic can't all be placed next to each other, and more logic requires more routing within the FPGA. This is a continuous issue with FPGA work.
Because I'm not doing this professionally I don't need to get it up to 100 MHz; 25 MHz is fine.
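To put the 30 MHz and 25 MHz figures into time, a quick calculation of the clock periods shows how much margin the slower clock leaves. This is only arithmetic on the two numbers quoted above; the variable names are illustrative.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


fmax_mhz = 30.0   # limit reported by the timing analysis
clock_mhz = 25.0  # what the clocks are actually set to

# Extra time per cycle gained by running below the reported limit.
margin_ns = period_ns(clock_mhz) - period_ns(fmax_mhz)

print(f"period at {clock_mhz:.0f} MHz: {period_ns(clock_mhz):.2f} ns")
print(f"period at {fmax_mhz:.0f} MHz: {period_ns(fmax_mhz):.2f} ns")
print(f"margin: {margin_ns:.2f} ns")
```

Running at 25 MHz leaves roughly 6.7 ns per cycle in hand over the reported limit.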
One thing did appear to be quite odd. The General Purpose register file appeared to be about 1500 Logic elements (1024 registers), however the Control and Status registers were at ~20k Logic elements, which seemed to be twice the expected area. This is something to examine.
There is also something that is quite good to know: I appear to have used only 10% of the available onboard SRAM, which means that I can possibly expand the memories from their 4 KB blocks, although the routing may become an issue.
I have been looking at the tools again and have discovered that there is a way of reprogramming the memories without having to go through the whole process of resynthing the FPGA. Just get the updated .mif files, then run the tool and re-assemble. So all good.
A few days ago I was chatting to a friend about documentation in Software and was astonished at how little is done in comparison to Hardware. Well, that has come back to bite me: I've been looking at some assembler I wrote six years ago, attempting to update it for the new Dual Core complex, and there aren't even any comments !!
Mea Culpa !
I intend to get the code converted and run up the LCD that is on the dev board that I have.
By the looks of it there needs to be a bit of preamble, which foxed me, and then we are into ASCII, which is good news.
What next? Well, I think I need to get some kind of RS232 input in there so I can communicate properly.
This is not going to be as trivial as it sounds.
There is an RS232 port on the board, but I need to work out how to connect to it, put an RS232 block within the code and then put in support to use it. Open ended projects are cool in this respect.
-
Partial Success
10/06/2017 at 21:29
I have managed to get the latest binary for GHDL, 0.34.
This allows me to select signals to wave up.
Or rather it almost does.
Unfortunately it cannot cope with For Generate loops, which means that if I want a specific signal, every instance of it in the loop is waved up as well.
Still this is a significant bit of progress.
Also I've put in a request for this to be fixed in future iterations.
-
Rebuild
09/24/2017 at 14:12
I've created a Dual Core Complex that now has the Trinity Net block in it.
I then created a better framework which instantiates this 'node' in a three dimensional matrix.
The matrix is sized as 2x2x2, which gives 8 dual-core nodes and a total of 16 cores.
I ran a test program, but I am now back to the original issue with the previous array of cores, which is that it takes a long time for anything to be simulated.
10 us takes 1 minute 40 seconds as each signal is recorded. I really need to get the latest version of the simulator to see if I can reduce the number of signals recorded.
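The core count follows from the matrix size and the two cores in each node. The sketch below enumerates the nodes and gives every core a flat ID. The numbering scheme (x fastest, then y, then z) is an assumption made for illustration and is not taken from the design.

```python
from itertools import product

DIMS = (2, 2, 2)     # matrix size from the log
CORES_PER_NODE = 2   # each node is a Dual Core Complex


def core_ids(dims=DIMS, cores_per_node=CORES_PER_NODE):
    """Map (x, y, z, core) to a flat processor ID (numbering is an assumption)."""
    ids = {}
    for x, y, z in product(*(range(d) for d in dims)):
        # Flatten the node coordinate with x varying fastest.
        node = (z * dims[1] + y) * dims[0] + x
        for core in range(cores_per_node):
            ids[(x, y, z, core)] = node * cores_per_node + core
    return ids


if __name__ == "__main__":
    ids = core_ids()
    print(len(ids), "cores")
```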
-
Challenging
09/04/2017 at 21:57
It's been an interesting couple of weeks.
Testing out the Single Complex had, as you may expect, a wealth of entertainments.
The Arbiter needed some TLC. When asked to copy a simple count into memory, the DMA copied it to somewhere else.
Interrupts. Interrupts opened up several bugs. The first being that the interrupt needed to be extended beyond a single pulse.
Interrupts also flagged up an issue with respect to Jumps. An interrupt was successfully called but the return was incorrect. The jump was flushing out the return address before it was captured.
The Jump is now held at the Execution stage until the next instruction is just about to arrive, thus keeping a valid address for the Interrupt return address.
An enhancement was put in with respect to the Branch Prediction, which means it is now more efficient.
With a simple update to the CSR registers it was possible to add in a Processor ID. The reason for this was to allow a multiple core environment which would allow a shared ROM. This allows a core to determine which one it is and then run the appropriate code. It was also very simple to add in the extra core.
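The shared ROM idea can be pictured as common reset code that reads the Processor ID and branches to a per-core entry point. The Python below only models that control flow; the entry point names, the ID values and the fallback for unknown IDs are all hypothetical.

```python
def core0_main():
    """Hypothetical entry point for the first core."""
    return "core 0: system setup"


def core1_main():
    """Hypothetical entry point for the second core."""
    return "core 1: worker loop"


# Hypothetical table: one entry point per Processor ID, all in the same ROM.
ENTRY_POINTS = {0: core0_main, 1: core1_main}


def boot(read_processor_id):
    """Shared reset code: read the ID from the CSR, then run that core's code."""
    pid = read_processor_id()
    # Unknown IDs fall back to the worker code (an assumption of this sketch).
    entry = ENTRY_POINTS.get(pid, core1_main)
    return entry()
```

For example, `boot(lambda: 0)` runs the core 0 path, while any other ID ends up in the worker code.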
By adding the Timer I can now start to work on Coarse Grain Multitasking.
So fun fun fun :).
-
Simple Complex
07/02/2017 at 15:18
I now have a first cut Simple Complex. This has Trinity, a simple word DMA, 16 kB of Data SRAM, 8 kB of Instruction SRAM, 4 kB of ROM, Interconnect, PIO and the Interrupt controller.
I need to test it to ensure that it works.
Note the Instruction SRAM, the intention is that this will be an area which will allow the block to receive a block of code and then run from it.
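One way to picture the three memories is as a small address map with a decode function. The sizes below come from the list above, but the base addresses are invented for the example and do not reflect the real map.

```python
KB = 1024

# (name, base, size): sizes from the log, base addresses are assumptions.
MEMORY_MAP = [
    ("rom",        0x0000, 4 * KB),
    ("instr_sram", 0x2000, 8 * KB),
    ("data_sram",  0x4000, 16 * KB),
]


def decode(addr):
    """Return (region name, offset) for a byte address, or None if unmapped."""
    for name, base, size in MEMORY_MAP:
        if base <= addr < base + size:
            return name, addr - base
    return None
```

Code received into the Instruction SRAM would then be run by jumping to an address inside the `instr_sram` window.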
-
Application ?
05/01/2017 at 15:07
I was thinking about applications for Trinity Net.
It comes to mind that one use for a distributed array would be to analyse Radio Astronomy data.
While it would not be up to the demands of the Square Kilometre Array, it would be interesting to have something which could analyse an array of data.
-
Interrupt
04/15/2017 at 11:45
I now have a first cut nested interrupt controller and will need to test it. There is an initial interrupt conditioning block so that interrupts can be pulse, level and asynchronous. The conditioning is set up via registers. It can have up to 256 interrupts at present, but in theory the number can be programmable at build time.
Assuming this is all OK, I just need to write a simple GPIO block and maybe a Hitachi LCD text driver, rather than the bit-bashed code I used to display "Hello World" wayyyy back.
After that I can put together a "Processor Complex" which will have the main features of a minimal microcontroller by adding in the DMA, Interrupt Controller, Non Blocking Interconnect and the Open Cores UART.
This will allow me to place the elements into a small and thus cheap FPGA board to allow deployment into projects.
The Multicore environment will not be forgotten. I still have plans for this, initially formalising the system so that it is extracted from the testbench environment and becomes a deployable bit of IP with an enhanced Master Core.
-
Simple DMA
03/19/2017 at 17:24
I now have a simple DMA. It just moves words around the memory, given a to and from address and the length in words.
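In behavioural terms a word DMA of this kind amounts to very little. The Python model below is a sketch with hypothetical names, addresses in words, and no handling of overlapping source and destination ranges.

```python
def dma_copy(mem, src, dst, length):
    """Copy `length` words from word address `src` to word address `dst`.

    `mem` is a list of words standing in for the memory. Overlapping
    ranges are not handled specially in this sketch.
    """
    if src < 0 or dst < 0 or length < 0:
        raise ValueError("addresses and length must be non-negative")
    if src + length > len(mem) or dst + length > len(mem):
        raise ValueError("transfer runs past the end of memory")
    # One word per step, like a simple word-by-word engine.
    for i in range(length):
        mem[dst + i] = mem[src + i]
```

For example, with `mem = list(range(8))`, calling `dma_copy(mem, 0, 4, 4)` leaves the first four words duplicated in the second half of the list.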
Now that this is complete I can start to put together the Interrupt Controller.
Once that is complete I can create a simple processor complex comprising a Trinity Core, Non Blocking Multi Master Slave Interconnect, Interrupt Controller, an Open Cores UART, simple PIO block and a simple Two Line LCD Display Driver.
This can be then used for a simple FPGA Board which will allow some 'Hardware' Hobby work.
However this does not mean that the Multicore work will have come to a standstill.
On the contrary I will take the back plane and formalise the structure a bit more.
Once that is done I can have two sets of designs. One for a small FPGA and the other which can be arrayed as I need.
-
Interconnect
03/01/2017 at 14:38
I am in the middle of writing a Non Blocking Interconnect. So far I have written a programmable Master Decode Block and have now just completed the Master to Slave Arbiter block.
So far Trinity has sat inside a very simple memory system. The problem with that is that it only allows one Master and prohibits the use of DMA to shuffle data around, as the DMA would have to be a Master.
The reason why I mention Non Blocking is that if you had a blocking interconnect you could not have two masters accessing two different memories at the same time, which would slow things down.
How will this improve things? Well, as mentioned, it will allow a DMA block to be used. This would mean that in the Multicore environment we could, for example, relocate some code/data that has arrived to somewhere else, or shuffle some data into the payload area in preparation for it to be transmitted back.
I've managed to re-use some code for the Arbiter from the Find First One execute block in the Execution unit (reaping the rewards :) ).
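The difference can be shown with a one-cycle model in which each slave arbitrates independently, so masters only wait when they target the same slave. The names and the fixed priority by dictionary order are assumptions of this sketch, not details of the actual interconnect.

```python
def grant(requests):
    """Arbitrate one cycle of a non-blocking interconnect.

    `requests` maps a master name to the slave it wants this cycle.
    Returns a dict of slave -> granted master. Each slave grants the
    first master that asks for it (dict order acts as a fixed priority,
    which is an assumption), so masters aiming at different slaves all
    proceed in the same cycle.
    """
    granted = {}
    for master, slave in requests.items():
        if slave not in granted:
            granted[slave] = master
    return granted
```

With `{"cpu": "rom", "dma": "data_sram"}` both masters are granted, whereas `{"cpu": "data_sram", "dma": "data_sram"}` grants only the CPU and the DMA has to wait.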
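A find-first-one operation maps naturally onto a fixed-priority arbiter: treat the request lines as a bit vector and grant the lowest set bit. The sketch below shows the idea in Python. Giving the lowest-numbered requester the highest priority is an assumption here.

```python
def find_first_one(requests):
    """Index of the lowest set bit in `requests`, or None if no bit is set."""
    if requests == 0:
        return None
    # requests & -requests isolates the lowest set bit.
    return (requests & -requests).bit_length() - 1


def arbitrate(requests):
    """One-hot grant for the lowest-numbered requester (assumed priority order)."""
    return requests & -requests
```

With requests of `0b10100`, master 2 wins and the grant vector is `0b00100`.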
The initial DMA will be very simple, just a word to word movement for speed.
The next block to write will be an interrupt handler. Trinity has the concept of a Processor State Level. Should the core be dealing with an Exception or an Interrupt, it can increase this level to a determined value. This can then be used to triage out a further interrupt if its associated programmable 'Priority State Class' is below that of the present Processor State Level.
Say, for example, Trinity is dealing with a UART interrupt. It may set its Processor State Level to 1. If a really time critical interrupt then comes in, it has to be dealt with as soon as possible. Trinity services it first and then comes back to the UART interrupt, because that one is less important.
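The triage rule can be modelled in a few lines. In this sketch an interrupt is held off only when its Priority State Class is below the current Processor State Level, and taking an interrupt sets the level to that class. Both the handling of equal values and the automatic level update are assumptions made for illustration, as are the class and method names.

```python
class InterruptTriage:
    """Toy model of Processor State Level based interrupt triage."""

    def __init__(self):
        # Stack of levels so nested interrupts can unwind; 0 = normal running.
        self.level_stack = [0]

    @property
    def level(self):
        """Current Processor State Level."""
        return self.level_stack[-1]

    def request(self, priority_class):
        """Return True if the interrupt is taken, False if it is held off."""
        if priority_class < self.level:
            return False
        # Assumption: the handler raises the level to the interrupt's class.
        self.level_stack.append(priority_class)
        return True

    def ret(self):
        """Return from the current handler and restore the previous level."""
        if len(self.level_stack) > 1:
            self.level_stack.pop()
```

In the UART example, the UART handler runs at level 1, a class 3 interrupt pre-empts it, and returning from that handler drops the level back to 1 so the UART work can finish.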