
Re-birth (2) : the modulo.

A project log for miniMAC - Not an Ethernet Transceiver

custom(izable) circuit for sending some megabytes over differential pairs.

Yann Guidon / YGDES, 04/07/2026 at 00:28

The last log, 138. Re-birth (1), found 3 more coding flaws, and there are probably even more.

At this point I am starting to doubt the choice of the modulus: being so close to 2^18 makes it pretty unlikely to overflow and create a carry (remember: the carry creates an additional avalanche). But I keep forgetting that Y changes too (it's fixed for now), so I'll have to observe the activity of the carries. It will be interesting to compare the different values...

The principle of the dual cycle is to pass the first sum through the adder again, this time adding a constant (derived from the modulus), and to see whether the result overflows. If it does, the new (adjusted, wrapped-around) sum is written to the register; otherwise the register is not modified.

It's simple but not obvious because the whole thing mixes the numerical bases: the first phase is mod 262144 (2^18) and the second phase is mod 258114, so there is an adjustment of 262144 - 258114 = 4030.

Managing the carry bit is notoriously tricky in this situation and I have lost some hair already...
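To make the dual cycle concrete, here is a minimal Python sketch of the two passes through one 18-bit adder. It is only a behavioural model, not the actual circuit or the VHDL reference: the names (two_phase_add, carry1, carry2) are made up for illustration, and it assumes both operands are already reduced (below 258114). It also assumes that the carry of the first pass has to be remembered and ORed with the carry of the second pass, which is one plausible reading of the "tricky carry" problem.

```python
N_BITS = 18
WRAP = 1 << N_BITS        # 262144: the first phase wraps here
MODULUS = 258114          # modulus quoted in the log
ADJUST = WRAP - MODULUS   # 4030: constant added in the second phase
MASK = WRAP - 1


def two_phase_add(reg, y):
    """Model of (reg + y) mod MODULUS done in two passes (illustrative only).

    Assumes 0 <= reg < MODULUS and 0 <= y < MODULUS.
    """
    # Phase 1: plain binary add; the register keeps 18 bits,
    # the 19th bit is kept aside as carry1.
    total = reg + y
    carry1 = total >> N_BITS
    reg = total & MASK

    # Phase 2: feed the register back into the adder with the constant.
    total = reg + ADJUST
    carry2 = total >> N_BITS

    # Assumption: the adjusted sum is kept if either pass overflowed,
    # otherwise the register is left untouched.
    if carry1 or carry2:
        reg = total & MASK
    return reg


if __name__ == "__main__":
    for a, y in [(0, 0), (258113, 1), (258113, 258113), (200000, 60000)]:
        print(a, y, two_phase_add(a, y), (a + y) % MODULUS)
```

With reduced inputs the two carries cannot both be set, because the largest possible sum (2 x 258113) stays below 2^19 minus the adjustment.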

................................................

Now something occurs to me. There could be a way out of this madness.

First: I have noticed how ASICs have a different balance from FPGAs. DFFs are 3x more costly than typical logic gates, which means that an adder is more economical than a whole pipeline register.

Second: It may be better to chain the adders, instead of cycling and sequencing and managing. This makes a slower clock but it does more in one period. Particularly if the 2nd add is a constant, which would use fewer gates.

Third (and this is the stroke of genius that didn't occur to me until this morning): what if the constant add is moved upstream so it doesn't bother us? There would be a simpler pipeline and less complexity...

Fourth: the pre-add would also work as error detection, because it does the work of the comparator.

Except that the adjustment is conditional...

Engineering really is the art of moving problems around.

................................................

Back to the circuit. The RB1 does one thing: out = in + Y.

In the RB2 (with the modulo implemented), one way to know whether it has overflowed is to check the carry bit, but there are 4030 cases out of 262144 (roughly 1 chance in 65, though it's actually more complicated than that) where the sum lies between the modulus and 2^18. The low chance makes me think again that the modulus is too high, but the 2nd turn is still necessary, if only because the result provides the corrected sum.
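As a rough check of the "more complicated than that" part, the sketch below counts exactly how many operand pairs produce a sum that lands in the window between the modulus and 2^18. It rests on an assumption the log does not make: both operands are taken as uniform and independent in the range 0 to 258113, which is certainly not true while Y is fixed. The function names are hypothetical.

```python
from fractions import Fraction

N_BITS = 18
WRAP = 1 << N_BITS    # 262144
MODULUS = 258114      # modulus quoted in the log


def window_pairs(modulus, wrap):
    """Count pairs (a, y), both in [0, modulus), with modulus <= a + y < wrap.

    Valid when wrap <= 2 * modulus - 1, which holds for the values above.
    """
    gap = wrap - modulus
    # A sum s >= modulus - 1 is reached by (2 * modulus - 1 - s) pairs, so the
    # window adds (modulus - 1) + (modulus - 2) + ... for 'gap' terms.
    return gap * (modulus - 1) - gap * (gap - 1) // 2


def window_fraction(modulus=MODULUS, wrap=WRAP):
    """Fraction of all operand pairs whose sum falls in the window."""
    return Fraction(window_pairs(modulus, wrap), modulus * modulus)


if __name__ == "__main__":
    frac = window_fraction()
    print(f"window hit rate: {float(frac):.5f}, about 1 in {float(1 / frac):.2f}")
```

Under that uniform assumption the window is hit a little more often than 1 in 65, so the quick estimate is in the right range. Real traffic and a changing Y will give different numbers, which is why the carries still have to be observed.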

Actually we want 2 things:

Then we select the result according to the carry bit of sum2. This is a rather typical configuration that I have seen in previous works.
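Here is a small Python model of that carry-select arrangement, done in one pass with two chained adders. It assumes that the two wanted quantities are the plain sum and the same sum plus the 4030 adjustment, with sum2 kept 19 bits wide so that its top bit is the carry. The names sum1, sum2 and carry_select_add are illustrative only, and the operands are assumed to be already below the modulus.

```python
N_BITS = 18
WRAP = 1 << N_BITS        # 262144
MODULUS = 258114          # modulus quoted in the log
ADJUST = WRAP - MODULUS   # 4030
MASK = WRAP - 1


def carry_select_add(a, y):
    """One-pass model of (a + y) mod MODULUS (illustrative only).

    Assumes 0 <= a < MODULUS and 0 <= y < MODULUS.
    """
    sum1 = a + y              # first adder: the plain sum
    sum2 = sum1 + ADJUST      # second adder: constant add, 19 bits wide
    carry = sum2 >> N_BITS    # set exactly when a + y >= MODULUS
    # Multiplexer: the carry of sum2 picks the wrapped or the plain result.
    return (sum2 & MASK) if carry else sum1


if __name__ == "__main__":
    for a, y in [(0, 0), (258113, 1), (258000, 114), (258113, 258113)]:
        print(a, y, carry_select_add(a, y), (a + y) % MODULUS)
```

In this form a single carry bit is enough: with reduced inputs, sum2 never exceeds 19 bits, so its top bit covers both the "sum above 2^18" case and the "sum between the modulus and 2^18" case.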

................................................

I think I have uncovered a 5th flaw in the previous design, probably present in the VHDL "reference" as well: the A/X pipes are fed with B/Y but these values change with the phase, because they are not stabilised/registered => race condition
 
No wonder I had such a bad time making it work !!!

................................................

Now there are 2 ways to make the modulo adder work.

(as I have seen somewhere and it looks good) or

And I'd love to use this since this would reduce the effort on the Y/B pipelines that reuse the input. Pre-adjusting the input saves multiplexers (constant or classic) so in a way, this simplifies the system.

This adds an adder in the pipeline but with two alternating constants (0 and 4030).

However, I recently found that a 2-cycle system needs more registers to prevent race conditions => A one-cycle solution would prevent this !

Furthermore, it appears that factoring the adjustment into the input does not save much. There is still the need for the two bigger adders, and the added temporary registers bloat the system.

................................................

Fortunately I have found a paper that covers modulo-M adders, not just 2^n+1 and 2^n-1 (which amounts to some form of EAC / End-Around-Carry).
