Thoughts about a Software Managed MMU

Yann Guidon / YGDES wrote 09/01/2018 at 19:27

Overall, an interesting read :-)

It contains some well-known old techniques, some more recent ideas, and I recognise thought processes I had followed long ago...

The idea of using ASI to reduce thrashing is pretty cool and new to me :-)

Julian wrote 09/01/2018 at 10:43

I've often thought that the inability to implement large CAMs in FPGAs (and other similar issues brought up by the lack of support for internal tristate buses of any kind) is the major weakness of the entire architecture. Have you looked at the #F-CPU project? If memory serves (and it's been a good 15 years since I looked at it in depth, so I could be wrong), they ended up with an architecture pretty similar to the one you're discussing in their FC0 design.

Yann Guidon / YGDES wrote 09/01/2018 at 12:03

CAMs are great... in theory :-)

I remember they were one of the causes of trouble in the FC0 design.

It's not possible to get rid of them totally, but FC1 won't rely heavily on them.

Samuel A. Falvo II wrote 09/01/2018 at 17:31

My only interaction with the F-CPU project was convincing them to abandon explicit AND/OR/etc. instructions and just use a generic ALU instruction which takes a look-up table for bitwise operations. My inspiration was how the Amiga's blitter could perform literally *any* logical operation because of this programmability.
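As a rough illustration of the look-up-table idea (not F-CPU's or the Amiga's actual encoding), a two-operand bitwise unit can be modelled as a 4-bit truth table applied to every bit position. The function name `rop2`, the bit ordering of the table and the `width` parameter are assumptions made for this sketch.

```python
def rop2(lut, a, b, width=64):
    """Apply a 4-entry truth table to each bit pair of a and b.

    Assumed encoding: bit (2*a_i + b_i) of `lut` is the result for
    input bits (a_i, b_i). A real ISA may order the table differently.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    na, nb = ~a & mask, ~b & mask
    result = 0
    if lut & 0b0001:
        result |= na & nb   # positions where a=0, b=0
    if lut & 0b0010:
        result |= na & b    # positions where a=0, b=1
    if lut & 0b0100:
        result |= a & nb    # positions where a=1, b=0
    if lut & 0b1000:
        result |= a & b     # positions where a=1, b=1
    return result


# With this encoding, the usual fixed opcodes are just four of the
# sixteen possible tables.
AND, OR, XOR = 0b1000, 0b1110, 0b0110
ANDN = 0b0010   # (~a) & b, the operand order used by MMX PANDN
NOR = 0b0001    # one of the operations a fixed 4-opcode set lacks
```

For example, `rop2(NOR, x, y)` gives a NOR in one operation, where a set limited to AND, ANDN, OR and XOR needs two.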

As you might imagine, since that was my only interaction, nobody remembers this happened. Oh well.

I tried looking up information about how FC0 and FC1 handle memory management, but maybe I just don't know the online resources well enough, because I couldn't find any details. Would someone care to give an executive summary and links to relevant info here? Thanks!

Yann Guidon / YGDES wrote 09/01/2018 at 18:35

Samuel, I do remember the ROP2 thing :-) Having arbitrary boolean operations is an extremely precious feature, and I was easily convinced because at that time I was writing a heavily boolean program using the limited MMX instruction set extension (having only AND, ANDN, OR and XOR drove me crazy...)

However, normal CPUs "usually" use the trick outlined at https://hackaday.io/project/46000-pdp-processor-design-principles/log/143828-how-to-design-a-better-alu

Speed is critical, and the ROP2 unit costs area, gates, power and time to perform operations that are already mostly performed by the ALU, which explains why opcodes are often limited to the above 4. I know this sucks but "it works for 99.99% of the cases".

Concerning the memory management, it's described in the FC0 manual http://archives.f-cpu.org/manual-20021116/ (damnit, this thing is so old, I'm almost ashamed, but it contains a lot of interesting bits). It was always quite fuzzy because a modern memory system is ... complicated, and we didn't have a whole working system, whereas x86 and others had a working machine before they added an MMU.

FC0 had "a pool of 63 registers" and some could be linked to a line of L0 cache. The lookup used some sort of CAM. It was required because we didn't want to "lock" registers into a given fixed function. This proved to be a bad choice, making the FC0 overly complicated and needlessly slow. FC1 will have 64 registers as well, but roughly half of them will have fixed functions, mostly for memory access. This moves the CAM complexity to software and the compiler: aliases will be handled by the coder and not by the hardware (going further down the RISC route). This will also simplify the MMU, because each of the 4 load/store units will have its own small TLB that can be looked up in parallel with less effort, and a larger L2 TLB will handle the misses, aliases and conflicts.
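The two-level TLB arrangement described above can be sketched as a small behavioural model. This is only an illustration under stated assumptions, not the FC1 design: the page size, the TLB capacities, the LRU replacement policy, the dictionary used as a page table and all names (`TLB`, `MemorySystem`, `translate`) are hypothetical.

```python
from collections import OrderedDict

PAGE_BITS = 12  # assumption: 4 KiB pages


class TLB:
    """Tiny fully associative TLB with LRU replacement (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # vpn -> pfn, least recently used first

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, pfn):
        self.entries[vpn] = pfn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the LRU entry


class MemorySystem:
    """One small TLB per load/store unit, backed by a shared L2 TLB."""

    def __init__(self, page_table, units=4, l1_size=4, l2_size=64):
        self.page_table = page_table  # stands in for the software miss handler
        self.l1 = [TLB(l1_size) for _ in range(units)]
        self.l2 = TLB(l2_size)
        self.stats = {"l1_hit": 0, "l2_hit": 0, "miss": 0}

    def translate(self, unit, vaddr):
        vpn = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        pfn = self.l1[unit].lookup(vpn)
        if pfn is not None:
            self.stats["l1_hit"] += 1
        else:
            pfn = self.l2.lookup(vpn)
            if pfn is not None:
                self.stats["l2_hit"] += 1
            else:
                self.stats["miss"] += 1
                pfn = self.page_table[vpn]  # KeyError models a page fault
                self.l2.insert(vpn, pfn)
            self.l1[unit].insert(vpn, pfn)  # refill this unit's small TLB
        return (pfn << PAGE_BITS) | offset
```

In hardware the four small TLBs would be probed in parallel, one per unit; the sequential model only shows which level services each access.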

@Samuel A. Falvo II what else would you like to know?
Regards!

Yann Guidon / YGDES wrote 09/01/2018 at 19:02

Ah, I forgot a little detail...
.... sorry ...

Since the early days of F-CPU, my understanding of how operating systems work, and my vision of their principles, have evolved a lot, of course :-)

I believe that microkernels are an unavoidable necessity, one day or another, and this requires features that deeply affect CPU cores and organisation, even programming. For example, code and data spaces are strictly separated (Harvard-style) to prevent many, many issues that are unfortunately too common in today's processors (security and safety, mostly, but not only).

I also envision a huge, flat, 64-bit address space shared by all applications, with very strict rules for aliasing. Threads communicate by passing pointers to blocks that then change ownership, which requires a flexible and fine-grained memory protection system.
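A minimal sketch of that ownership hand-off, assuming a simple registry of blocks in one shared address space. Everything here (the `BlockRegistry` class, its method names, the `ProtectionFault` exception, identifying threads by plain strings) is hypothetical and only illustrates the rule that a block has exactly one owner at a time.

```python
class ProtectionFault(Exception):
    """Raised when a thread touches memory it does not own."""


class BlockRegistry:
    """Tracks which thread owns each block of a shared flat address space."""

    def __init__(self):
        self.blocks = {}  # base address -> (size, owner)

    def allocate(self, base, size, owner):
        self.blocks[base] = (size, owner)

    def check(self, thread, addr):
        """Return True if `thread` may access `addr`, else raise."""
        for base, (size, owner) in self.blocks.items():
            if base <= addr < base + size:
                if owner != thread:
                    raise ProtectionFault(f"{thread} does not own {addr:#x}")
                return True
        raise ProtectionFault(f"{addr:#x} is not in any block")

    def send(self, sender, receiver, base):
        """Pass a block by pointer: the sender gives up ownership."""
        size, owner = self.blocks[base]
        if owner != sender:
            raise ProtectionFault(f"{sender} cannot send a block it does not own")
        self.blocks[base] = (size, receiver)
```

After `send("A", "B", base)`, thread B may access the block and any later access by thread A faults, so no copy and no alias is needed for the message.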

Aliasing is required for the program memory (because several threads can run the same code) but this is easier to handle because there are barely any writes. This makes it a candidate for "broadcast" in case a "write to many threads" is required.

I admit that so far, the system is still very... unclear. It's not my priority (I have a job and a wife now) but it's still very important and I welcome discussions about this subject.

f4hdk wrote 09/02/2018 at 04:49

@Yann Guidon / YGDES you say "FC1 will have [...]"

Does it mean that F-CPU is not dead, and that you plan to revive the FC1 project?

Yann Guidon / YGDES wrote 09/02/2018 at 04:58

@f4hdk : F-CPU is still alive as one of the greatest vaporware projects ever :-P

FC0 (its first architecture and ISA) is now history, after more than a decade hidden under the rug.

FC1 is being considered, benefiting from my experience of the last years. There are some discussions at #F-CPU.
