I'm not sure why it's not widespread yet, or why it's not implemented or even talked about, but...
Imagine you free some memory, and the kernel reclaims it for other purposes. It can never be sure that the physical data left in RAM won't be read by another thread to exfiltrate precious information. So the kernel spends some of its precious time clearing page after page, just in case, writing 0s all over the place to clean up after the patrons.
It's something the hardware could do, by marking a page as "read as zero" or "trap on read dirty" in the TLB. Writes would not trap and you can read your own freshly written data. In fact it's as if you allocated a new cache line without reading the cache...
The cache already tracks dirty bits, and a coarser "dirty map" could be stored per page, with one bit per cache line. With 256-bit (32-byte) lines, that's 128 bits, which corresponds to a 4 KiB page.
It's still preliminary but I'm sure people have worked on this already, because scrubbing data has been a thing for a long time.
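To make the idea more concrete, here is a minimal software simulation of a "read as zero" page. It is only a sketch under assumptions that are not in the text above: a 4 KiB page, and hypothetical names (`page_t`, `page_reclaim`, `page_read`, `page_write`). Real hardware would do this in the TLB and cache, not in C. Reclaiming a page clears only the 128-bit map, reads of lines that were never written return zero, and the first write to a line scrubs just that one line.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u                          /* assumed page size */
#define LINE_BYTES 32u                            /* 256-bit cache line */
#define LINES_PER_PAGE (PAGE_BYTES / LINE_BYTES)  /* 128 lines */

typedef struct {
    uint8_t  data[PAGE_BYTES];              /* may hold stale data from the previous owner */
    uint64_t written[LINES_PER_PAGE / 64];  /* 128-bit map: 1 = line written since reclaim */
} page_t;

/* Returns 1 if the given line has been written since the last reclaim. */
int line_written(const page_t *p, unsigned line) {
    return (int)((p->written[line / 64] >> (line % 64)) & 1u);
}

/* Hand the page to a new owner: clear the 16-byte map, not the 4096 data bytes. */
void page_reclaim(page_t *p) {
    memset(p->written, 0, sizeof p->written);
}

/* Reads of a line that was never written return zero instead of stale data. */
uint8_t page_read(const page_t *p, unsigned addr) {
    assert(addr < PAGE_BYTES);
    return line_written(p, addr / LINE_BYTES) ? p->data[addr] : 0;
}

/* Writes never trap. The first write to a line scrubs only that line,
   so the rest of the line cannot leak the previous owner's bytes. */
void page_write(page_t *p, unsigned addr, uint8_t value) {
    assert(addr < PAGE_BYTES);
    unsigned line = addr / LINE_BYTES;
    if (!line_written(p, line)) {
        memset(&p->data[line * LINE_BYTES], 0, LINE_BYTES);
        p->written[line / 64] |= (uint64_t)1 << (line % 64);
    }
    p->data[addr] = value;
}
```

With line granularity, a partial write still costs one line-sized clear, but the cost is paid only for lines that are actually touched rather than for the whole page up front.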
Discussions
Another thread or another process? Because Linux (and probably other OSes) already clears pages before giving them to another process, otherwise there would be far more cross-process leakage. Clearing pages to zero is not time-consuming enough, in the scheme of things, to warrant the expense of extra specialised hardware in the memory path. Just throw in a bit more CPU power to handle the block clear. Even the Z80 had an instruction for this.
the difference between process and thread is ... context-dependent :-D
In Unix this is clear.
In NPM, I chose to use "thread" because it's processor-centric: a sequence of instructions. It has a set of rights and an owner like in UNIX but can't split/fork into sub-threads (a UNIX process can contain several threads).
Clearing pages uses bandwidth, and I expect that a LOT of pages will be requested because that's how communication between programs occurs. Reuse is advised but... if one program constantly sends data streams to another program, the block is yielded from P1 to P2 and then discarded/freed.
Another solution would be to have "read only" and "write only" attributes, attached to different thread IDs. But then this doesn't allow P2 to send the data to P3 or P4 later, without copy...
The trick is that the cache system has a whole mechanism to handle "dirty" attributes to cache lines and individual bytes, so it would be interesting to expand on this...
Threads in a Linux process belong to the same memory protection domain so nothing is guaranteed about another thread being able or not to read memory of this thread.
Blocks used by I/O don't need to be zeroed. They will be overwritten when the block is next used for I/O. Of course they are dirty in the kernel, but if the attacker has kernel access it's game over anyway. The block cache also applies to program loading and file I/O; in fact, you do not want to clear the blocks in the file cache in case they're used again. AFAICT the only place where the kernel needs to clear memory is for the brk/sbrk calls.
"Threads in a Linux process belong to the same memory protection domain so nothing is guaranteed about another thread being able or not to read memory of this thread."
Yes, Linux threads are different. I played a bit with them back in the 2.4 era :-)
"Blocks used by I/O don't need to be zeroed. They will be overwritten when the block is next used for I/O."
Yes, but the monolithic kernel relies on the assumption that the I/O is trusted.
With a microkernel and derivatives like GNU Hurd, nothing is trusted (except maybe the monitor/microkernel).
I'll have to write and explain more details in more logs :-)
Sounds like a NOSM.
What is the criterion to stop reading as zero? Would it be necessary for each location in the freshly allocated memory block to have an individual dirty bit, which would be reset by the first write to that location?
I am not sure yet. I'm throwing ideas at the wall to see which ones stick.
For now, "manual" clearing of the blocks is necessary. It's not convenient though because it eats up memory bandwidth.
I wonder if/how I could allocate some space in the page table entries to store the dirty bits of the cache lines. For 512-byte blocks, that's 16 bits to store somewhere...
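As a quick check of that arithmetic, here is a small helper, assuming one dirty bit per 256-bit (32-byte) cache line. The function name `dirty_map_bits` and its parameters are illustrative only, not part of any existing design.

```c
#include <assert.h>

/* Number of per-line dirty bits needed to cover one block.
   block_bytes: size of the block in bytes (hypothetical parameter)
   line_bits:   cache line width in bits (256 in this discussion) */
unsigned dirty_map_bits(unsigned block_bytes, unsigned line_bits) {
    unsigned line_bytes = line_bits / 8;
    /* The block must be a whole number of cache lines. */
    assert(line_bytes != 0 && block_bytes % line_bytes == 0);
    return block_bytes / line_bytes;
}
```

For example, `dirty_map_bits(512, 256)` gives 16, matching the figure above, while a 4096-byte block with the same line width would need 128 bits.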