
YGMII (draft)

A project log for miniPHY

the thing you need to plug to a miniMAC

Yann Guidon / YGDES, 4 days ago

The TT IHP26a experience has taught valuable lessons...

One of them is the need to transmit data reliably and efficiently, at high speed and over few pins, while keeping the interface simple.

I'd like to have a 6-pin "link" interface à la SHARC but it does not look possible or at least practical, 6 is too tight. But 7 or 8 looks good.

For comparison, RGMII uses 6 signals in each direction (source clocking included) plus 2 config pins (MDC/MDIO) for a total of 14.

I'm also throwing some TMPI into the mix to reduce power draw and EMI, so that's one more pin.

I have come up with this 5-bit popcount

The argument being:

that's 20 bits = 5×4

5 being odd, it is not considered as the number of words, but 4 is perfect.

So we get 5 bits of data, plus 1 bit of TMPI and 2 bits of staggered/4-phase clock, to keep transitions low.

There would be a TMPI-AC: it's a time-differential transition minimisation, so we need to "clear" the initial value. This is done by a clock sequence that does not complete a full 4-phase cycle, just doing a pulse with all-0s (or something).
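
To make the inversion idea concrete, here is a minimal Python sketch of classic bus-invert coding on a 5-bit word, used as a stand-in. It is not the actual TMPI circuit, which may well differ, notably in where the accumulator sits. The names `Encoder`, `clear` and `decode` are hypothetical, and the "invert when 3 or more of the 5 wires would toggle" rule is an assumption.

```python
WIDTH = 5                      # data wires per word, as in the log
MASK = (1 << WIDTH) - 1


def popcount(x):
    """Number of set bits in x."""
    return bin(x).count("1")


class Encoder:
    """Bus-invert style encoder (assumed model, not the real TMPI)."""

    def __init__(self):
        self.wires = 0         # last value driven on the data wires

    def clear(self):
        """Reset the reference state, like the short 'clear' clock pulse."""
        self.wires = 0

    def encode(self, data):
        data &= MASK
        toggles = popcount(data ^ self.wires)
        invert = toggles > WIDTH // 2          # 3 or more wires would flip
        self.wires = (data ^ MASK) if invert else data
        return self.wires, int(invert)


def decode(wires, invert):
    """Undo the optional inversion on the receiving side."""
    return (wires ^ MASK) if invert else wires


if __name__ == "__main__":
    enc = Encoder()
    for word in (0b11111, 0b00001, 0b11110):
        wires, inv = enc.encode(word)
        print(f"{word:05b} -> wires {wires:05b}, invert {inv}")
```

In this model at most 2 of the 5 data wires toggle per word; the inversion flag can toggle too, and that is not counted here.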

.

(state machine diagram here)

00 => 01 => 00 : clear the ACcumulator

00 => 01 => 11 => 10 =>  00 : data without inversion

00 => 10 => 11 => 01 =>  00 : data with inversion
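
The three sequences above can be checked with a small Python model of the receiver side. This is only a sketch: the function names and the "invalid" label are made up for illustration, and it assumes that the clock pair is sampled as 2-character strings and that every frame starts and ends at 00.

```python
# Clock-pair sequences exactly as listed in the log.
SEQUENCES = {
    ("00", "01", "00"): "clear",
    ("00", "01", "11", "10", "00"): "data",
    ("00", "10", "11", "01", "00"): "data_inverted",
}


def is_gray(seq):
    """True when each step changes exactly one of the two clock wires."""
    return all(
        sum(a != b for a, b in zip(s, t)) == 1
        for s, t in zip(seq, seq[1:])
    )


def classify(states):
    """Cut a stream of clock-pair samples into frames and name each one."""
    frames = []
    current = []
    for s in states:
        if current and s == current[-1]:
            continue                      # repeated sample, no transition
        current.append(s)
        if s == "00" and len(current) > 1:
            frames.append(SEQUENCES.get(tuple(current), "invalid"))
            current = ["00"]              # the closing 00 opens the next frame
    return frames


if __name__ == "__main__":
    stream = ["00", "01", "00",
              "01", "11", "10", "00",
              "10", "11", "01", "00"]
    print(classify(stream))
```

Each listed sequence changes only one clock wire per step, which is what keeps the transition count low; `is_gray` checks exactly that property.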

.

With 4-phase signals, there is the potential to transmit another bit by changing the phase, i.e. swapping which signal sends the first rising edge. This can encode a 2nd-level TMPS bit, for even lower power draw.

So it is actually transmitting 21 bits over 8 wires in 4 cycles, which is an efficiency of 21/32 (about 66%) but with far fewer transitions.
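
The pin and bit budget can be double-checked in a few lines of Python. The constant names are only illustrative; the numbers come from the text above (5 data wires, 1 TMPI wire, 2 clock wires, 4 cycles, plus the one phase-encoded bit).

```python
from fractions import Fraction

DATA_WIRES = 5
TMPI_WIRES = 1
CLOCK_WIRES = 2
CYCLES = 4

wires = DATA_WIRES + TMPI_WIRES + CLOCK_WIRES      # total pins used
payload_bits = DATA_WIRES * CYCLES + 1             # 20 data bits + 1 phase bit
efficiency = Fraction(payload_bits, wires * CYCLES)

print(wires, payload_bits, efficiency, float(efficiency))
# prints: 8 21 21/32 0.65625
```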

Now there is the challenge of designing a 20-bit popcount.

Maybe I could even drop the 5-bit popcount that adds another wire, but it's too easy/alluring to pass up... Saving one pin is great anyway, I'll have to check how much it reduces power draw.

.

Just for the record here is the 6-bit TMPS circuit

...

Anyway, for that MAJ20 circuit...

I can easily have four 3-bit partial sums and they must be added together. The circuit must detect a value of 10 or more, requiring 4 bits. No need to compute the 5th bit if a custom adder is used.
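
A behavioural Python sketch of that structure can serve as a reference model to test a gate-level version against. It assumes the 20 bits are split into four 5-bit groups and that the "custom adder" saturates at 15 instead of producing a 5th bit. Both points are assumptions, and `popcount5` and `maj20` are hypothetical names.

```python
def popcount5(group):
    """Population count of a 5-bit group; the result (0..5) fits in 3 bits."""
    return bin(group & 0b11111).count("1")


def maj20(word):
    """True when at least 10 of the 20 bits of word are set."""
    groups = [(word >> (5 * i)) & 0b11111 for i in range(4)]
    p = [popcount5(g) for g in groups]   # four 3-bit partial sums
    lo = p[0] + p[1]                     # 0..10, fits in 4 bits
    hi = p[2] + p[3]                     # 0..10, fits in 4 bits
    total4 = min(lo + hi, 15)            # saturate: no 5th bit is needed
    return total4 >= 10                  # threshold from the log: 10 or more


if __name__ == "__main__":
    print(maj20(0b1111111111))           # 10 bits set
    print(maj20(0b0111111111))           # 9 bits set
```

Saturating works because any sum of 16 or more is already above the threshold, so the exact value no longer matters.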

...

I get this circuit

I have not found how to combine both the word-level and quibble-level TMPI. There are now 2 possibilities:

The most attractive version at first glance would be the 7-bit version but the 8-bit version would be required for extra performance. Combining both would be rad though.
