The trap logic
When I started DME, I decided I'd do a RISC kind of architecture (actually Bill Buzbee decided it for me, as I took his Magic1.6 ISA as a starting point). So for the trap mechanism, I opted for a RISC like approach as well. DME has two sets of registers: USER and IRQ. User programs run in the user registers until a trap happens, the cpu then immediately switches register banks and continuous operating using the IRQ registers. The upside of this, is that during most traps the OS does not need to spend time saving user registers as they are preserved and can just be switched back to once the OS returns to user space.
The flaw
DME does not service IRQs when it is in its IRQ mode. Initially, I thought that would be no problem: it would finish servicing the current IRQ and then after it returned to user space it would immediately trap again and service the missed IRQ (it does get saved). This thinking is still mostly valid except for: syscalls. As I started fleshing out the OS it occurred to me that often a sycall, for example 'read', would trigger DME into IRQ mode only to kick-off some operation, f.e. read from the SD Card, and then go to sleep (in IRQ-mode!) waiting for the operation to finish. This 'finish' is signaled by an IRQ! Now, if every process is in IRQ mode waiting for an IRQ, the system will deadlock as the IRQ will never get serviced.
Possible workarounds
1. Don't sleep waiting for IRQ, wait(poll) for finish
I could simply make a process wait for whatever operation needs to happen (keyboard input, reading a disk block) instead of putting it to sleep. In practice, since DME will never have multiple users working on it at the same time, the user experience of both approaches would probably be much the same. However, it means I don't get to play with IRQs as much in my OS. Just as important, this does not work for the timer IRQ.
2. I could tell my scheduler that 'if there aren't any processes to run' (all are sleeping), it should run each irq handler. This works, and is actually what my OS does now (simple to implement). However, again this does not work for the timer_irq: i cannot run that each time there is nothing to run (it would accomplish nothing, as there is nothing to run!). It means timer_irq is only triggered when a process returns to user space.
3. I could have a special user space process that always operates in user space and never does a syscall and thus is always runnable to catch irqs. Although this works, its an expensive solution. The process would eat CPU slices just to catch irqs. No good.
A better design
The real solution is of course to add another bank of registers: user -> system -> irq. User would always trap to system. When operating in system we could then still catch (real, hardware) irqs which would push us into irq mode. Only when in irq mode are all irqs disabled.
DME is defined in verilog and the addition of a third register bank would not be too complex I think (ha!). However, I've decided against tackling that task atm as I want to keep the momentum going. I may decide to do it later, but lets first see how big of a problem this turns out to be. Meanwhile, i'm going with workaround (2).
Discussions
Become a Hackaday.io Member
Create an account to leave a comment. Already have an account? Log In.