Re-birth (3) : the Y/B paths

It's time to restore the Y path. For the encoder, it's mostly a copy-paste-rename of the X path, then the results are interconnected.

(Seeing this, and knowing what I now know, I realise it could become a mean turbo code system if the Y output were interleaved with the X output.)

Copy-paste is easy, and just as easy to mess up. Beware of overconfidence.

Just as in the encoder, the decoder's B path is a copy-paste of the Y path, but its source is not the decoded output: it's a delayed version of the input.

The A register has no feedback; it's just a temporal fence. The T register is simply a delay (initialised with INIT_X).

............................

The copy-paste-rename took longer than it should, as usual, but the result matched the theory immediately. The encoder shows its duplication:

// Scrambler
int Message_in, OpM, OpY2, ResX, X, Y, CX,
    CXin, CXout, Scrambled_out, ResX2,
    OpX, OpY, CYin, CoutY, CoutY2, ResY, ResY2, CY;

void init_scrambler() {
  X=INIT_X;
  Y=INIT_Y;
  CX=0;
  CY=0;
}

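void cycle_scrambler() {
  // X path: message + Y + carry from the previous cycle
  OpM = Message_in;
  OpY2 = Y;
  CXin = CX;
  ResX = OpM + OpY2 + CXin;

  // Y path: X + Y + carry, using X and Y from before this cycle
  OpX = X;
  OpY = Y;
  CYin = CY;
  ResY = OpX + OpY + CYin;

  // If the raw or the adjusted sum overflows 18 bits,
  // keep the adjusted sum and remember the carry
  X = ResX & MASK18;
  ResX2 = ResX + ADJUST;
  CX = (ResX | ResX2) >> 18;
  if (CX)
    X = ResX2 & MASK18;

  // Same reduction for Y
  Y = ResY & MASK18;
  ResY2 = ResY + ADJUST;
  CY = (ResY | ResY2) >> 18;
  if (CY)
    Y = ResY2 & MASK18;

  Scrambled_out = X;
}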
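
As a sanity check, the scrambler can be wrapped in a small standalone harness and compared against the first five vectors listed further down in this log. This is only a sketch: the `scrambler_t` struct and the function names are hypothetical, `MASK18` is assumed to be 18 ones (its definition is not shown here), and the 17-bit limit on the message is inferred from `MASK17` in the descrambler.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: MASK18 is not defined in this log, 18 ones is the natural guess. */
#define MASK18 0x3FFFFu
#define INIT_X 187319u
#define INIT_Y 111981u
#define ADJUST 4030u

/* Hypothetical wrapper around the scrambler state (globals in the log). */
typedef struct { uint32_t X, Y, CX, CY; } scrambler_t;

void scrambler_init(scrambler_t *s) {
  s->X = INIT_X;
  s->Y = INIT_Y;
  s->CX = 0;
  s->CY = 0;
}

/* One cycle, same arithmetic as cycle_scrambler(); returns the scrambled word. */
uint32_t scrambler_cycle(scrambler_t *s, uint32_t message) {
  assert(message <= 0x1FFFFu);            /* assumption: 17-bit messages */
  uint32_t ResX  = message + s->Y + s->CX;
  uint32_t ResY  = s->X + s->Y + s->CY;   /* uses the old X and Y */
  uint32_t ResX2 = ResX + ADJUST;
  uint32_t ResY2 = ResY + ADJUST;
  s->CX = (ResX | ResX2) >> 18;
  s->X  = (s->CX ? ResX2 : ResX) & MASK18;
  s->CY = (ResY | ResY2) >> 18;
  s->Y  = (s->CY ? ResY2 : ResY) & MASK18;
  return s->X;
}

/* Counts mismatches against the first five [message, scrambled] pairs. */
int count_vector_mismatches(void) {
  static const uint32_t vectors[5][2] = {
    {   1234, 113215 }, {  38650,  79836 }, {  76066, 230468 },
    { 113482,  89606 }, {  19826, 226419 }
  };
  scrambler_t s;
  int mismatches = 0;
  scrambler_init(&s);
  for (int i = 0; i < 5; i++)
    if (scrambler_cycle(&s, vectors[i][0]) != vectors[i][1])
      mismatches++;
  return mismatches;
}
```

Calling `count_vector_mismatches()` from a test should return 0 with the redefined INIT_X and INIT_Y; any other value points at a constant or a carry that differs from the reference.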

The decoder is a copy-paste-rename of the B path and the asymmetry is obvious:

// Descrambler
int Scrambled_in, OpM2, OpB2, B, A, ResA, CA,
    CAin, CAout, Message_out, error, ResA2,
    T, OpT, CBin, OpB, ResB, ResB2, CoutB, CoutB2, CB;

void init_descrambler() {
  A=-1;
  B=INIT_Y;
  T=INIT_X;
  CA=1;
  CB=0;
}

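void cycle_descrambler() {
  // A path: Scrambled_in - B, done by adding the complement of B plus CA
  OpM2 = Scrambled_in;
  OpB2 = (~B)&MASK18;
  CAin = CA;
  ResA = OpM2 + CAin + OpB2;

  // B path: same recurrence as the scrambler's Y path,
  // fed by T, the previous scrambled input
  OpB = B;
  OpT = T;
  CBin = CB;
  ResB = OpB + OpT + CBin;
  T = Scrambled_in;

  // No carry out means the subtraction wrapped: add MODULUS back
  A  = ResA & MASK18;
  ResA2 = A + MODULUS;
  CA = (ResA >> 18) & 1;
  if (!CA)
    A = ResA2 & MASK18;

  // Same reduction as the scrambler's Y register
  B = ResB & MASK18;
  ResB2 = ResB + ADJUST;
  CB = (ResB | ResB2) >> 18;
  if (CB)
    B = ResB2 & MASK18;

  // Low 17 bits are the message, bit 17 flags an error
  Message_out = A & MASK17;
  error = (A >> 17 ) & 1;
}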
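
The reverse direction can be checked the same way: feed the scrambled words into a standalone copy of the descrambler and see whether the original messages come back. Again this is a sketch under assumptions: `descrambler_t` and the function names are hypothetical, and `MASK17`, `MASK18` and `MODULUS` are assumed to be 17 ones, 18 ones and 258114 (the value given for MODULO in the VHDL generics).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: the masks are not defined in this log. */
#define MASK17  0x1FFFFu
#define MASK18  0x3FFFFu
#define INIT_X  187319u
#define INIT_Y  111981u
#define ADJUST  4030u
#define MODULUS 258114u

/* Hypothetical wrapper; A is omitted from the state because it has no feedback. */
typedef struct { uint32_t B, T, CA, CB; } descrambler_t;

void descrambler_init(descrambler_t *d) {
  d->B = INIT_Y;
  d->T = INIT_X;
  d->CA = 1;
  d->CB = 0;
}

/* One cycle, same arithmetic as cycle_descrambler(); returns the 18-bit A.
   The message is A & MASK17, the error flag is bit 17 of A. */
uint32_t descrambler_cycle(descrambler_t *d, uint32_t scrambled) {
  uint32_t ResA = scrambled + d->CA + (~d->B & MASK18);
  uint32_t ResB = d->B + d->T + d->CB;
  d->T = scrambled;                        /* T delays the input by one cycle */

  uint32_t A = ResA & MASK18;
  d->CA = (ResA >> 18) & 1;
  if (!d->CA)
    A = (A + MODULUS) & MASK18;

  uint32_t ResB2 = ResB + ADJUST;
  d->CB = (ResB | ResB2) >> 18;
  d->B  = (d->CB ? ResB2 : ResB) & MASK18;
  return A;
}

/* Counts words that do not descramble to the expected message without error. */
int count_descramble_mismatches(void) {
  static const uint32_t vectors[5][2] = {   /* [message, scrambled] */
    {   1234, 113215 }, {  38650,  79836 }, {  76066, 230468 },
    { 113482,  89606 }, {  19826, 226419 }
  };
  descrambler_t d;
  int mismatches = 0;
  descrambler_init(&d);
  for (int i = 0; i < 5; i++) {
    uint32_t A = descrambler_cycle(&d, vectors[i][1]);
    if ((A & MASK17) != vectors[i][0] || ((A >> 17) & 1) != 0)
      mismatches++;
  }
  return mismatches;
}
```

The fourth word is the interesting one: it is the first where the scrambler wrapped around, so it exercises the `if (!d->CA)` correction.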

..................

The VHDL passes without effort, but the tests don't match the C vectors. INIT_X and INIT_Y differed between the two versions, so I have redefined them:

#define INIT_X  (187319) // "101101101110110111"
#define INIT_Y  (111981) // "011011010101101101"
...
generic (
  INIT_X : std_ulogic_vector(17 downto 0) := "101101101110110111"; -- 187319
  INIT_Y : std_ulogic_vector(17 downto 0) := "011011010101101101"; -- 111981
  ADJUST : std_ulogic_vector(17 downto 0) := "000000111110111110"; -- 4030
  MODULO : std_ulogic_vector(17 downto 0) := "111111000001000010"  -- 258114
);
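
Since the mismatch came from constants that differed between the C and VHDL sources, a tiny cross-check of the bit strings against the decimal values can catch this kind of slip early. The sketch below is not part of the project; `from_bits` and `constants_consistent` are hypothetical helper names. It also checks that ADJUST + MODULO equals 2^18, which holds for the values above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse a string of '0'/'1' characters as a binary number. */
unsigned long from_bits(const char *bits) {
  return strtoul(bits, NULL, 2);
}

/* Returns 1 if every VHDL bit string matches the decimal value in its comment. */
int constants_consistent(void) {
  unsigned long init_x = from_bits("101101101110110111");
  unsigned long init_y = from_bits("011011010101101101");
  unsigned long adjust = from_bits("000000111110111110");
  unsigned long modulo = from_bits("111111000001000010");
  return init_x == 187319UL
      && init_y == 111981UL
      && adjust == 4030UL
      && modulo == 258114UL
      && adjust + modulo == (1UL << 18);   /* ADJUST is the complement of MODULO */
}
```

A single `assert(constants_consistent());` at the start of the C test bench would be enough to flag a stale constant before any vectors are generated.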

And now the vectors are good! Each pair is [message in, scrambled out]:

[   1234,  113215],
[  38650,   79836],
[  76066,  230468],
[ 113482,   89606],
[  19826,  226419],
[  57242,   95327],
[  94658,  101049],
[   1002,  102721],
[  38418,  241186],
[  75834,  123209],
[ 113250,  143698],
[  19594,  173252],
[  57010,   96252],
[  94426,   48807],
[    770,   51404],
[  38186,  137627],
[  75602,  226447],
[ 113018,  143376],
[  19362,   18054],
[  56778,  198847],
[  94194,  254317],
[    538,  101394],
[  37954,  135014],
[  75370,   15711],
[ 112786,  188142],
[  19130,  110197],
[  56546,   77641],
[  93962,  225255],
[    306,  209240],
[  37722,  213797],
[  75138,  202340],
[ 112554,  195440],
[  18898,   46011],
[  56314,   20754],
[  93730,  104182],
[     74,   31280],
[  37490,  172878],
[  74906,  241574],
[ 112322,  193754],
[  18666,   83559],
[  56082,   56616],
[  93498,  177592],
[ 130914,   13510],
[  37258,   97447],
[  74674,  148373],
[ 112090,   25122],
[  18434,   79840],
[  55850,  142378],
[  93266,    1520],
[ 130682,  181315],
[  37026,   89179],
[  74442,   49796],
[ 111858,  176392],
[  18202,  132532],
[  55618,   88226],
[  93034,      61],
[ 130450,  125704],
[  36794,   32109],
[  74210,  195229],
[ 111626,    6640],
[  17970,  108214],
[  55386,  152270],
[  92802,   39786],
[ 130218,  229473],
[  36562,  175603],
[  73978,  184378],
[ 111394,  139284],
[  17738,  230007],
[  55154,  148593],
[  92570,  157903],
[ 129986,   85799],
[  36330,  150047],
[  73746,   15148],
[ 111162,  202612],
[  17506,  124104],
[  54922,  106018],
[  92338,    9425],
[ 129754,  152860],
[  36098,   68629],
[  73514,     791],
[ 110930,  106837],
[  17274,   13972],
[  54690,  158225],
[  92106,  209613],
[ 129522,  147140],
[  35866,    4984],
[  73282,  189541],
[ 110698,  231941],
[  17042,   69712],
[  54458,   80956],
[  91874,  188085],
[ 129290,   48343],
[  35634,  142773],
[  73050,  228532],
[ 110466,  150607],
[  16810,   27370],
[  54226,  215394],
[  91642,   22066],
[ 129058,   16763],
[  35402,  203288]

And finally: Verilog.

And it's a weird bug:

8240000.00ns INFO     cocotb.tb                          RB3 Descrambling Mode
 - in: 113215   found: 1234   expected: 1234
 - in: 79836   found: 79834   expected: 38650
 - in: 230468   found: 230467   expected: 76066
 - in: 89606   found: 89605   expected: 113482
 - in: 226419   found: 226418   expected: 19826

So after the first word, the output is just the input minus 1 or 2.

...

A few more bugs later and https://github.com/ygdes/miniMAC_IHP/actions/runs/24273856501:

The size is roughly the same but the double-adder makes it 2× faster in the end, since I could push to 5ns cycle time (at the cost of quite a few buffers). The IO can't stand 200MHz so it's an exercise and an exploration, which is useful in itself.

Now if it worked at 200MHz, that would be 400MBytes per second, or 4Gbps, almost a Gen2 PCIe link.
