Close

Brute-forcing the permutations

A project log for miniMAC - Not an Ethernet Transceiver

custom(izable) circuit for sending some megabytes over differential pairs.

yann-guidon-ygdesYann Guidon / YGDES 12/25/2025 at 08:330 Comments

The manual design is getting tedious, I have to automate it.

The final form (VHDL source code) will be something like, for the encoder :

Avalanche_D
Permutation_P1
Avalanche_E
Permutation_P2
Avalanche_D
Permutation_P3
Avalanche_E

For now, P1=P3 for the ease of design but it doesn't have to be so.

The permutation can be a cyclical interleaving but the parameters must be determined and it might be too "regular".

Then a Galois sequence is a natural choice because it is not too regular and can be reversed.

By chance, we have 18 bits and 18+1=19 is a prime number ! so the generators are 2;3;10;13;14;15
So for each of the 3 layers, we have 6 sequences to test, but each sequence must also be "rotated" for the 18 bits, 6*18=108 combinations. This makes 108×108×108=1,259,712 combinations to test!

It is possible but not practical with GHDL. I must code it in C, along with the figures of merit. I can then use __builtin_popcount(x) and other things directly and scan the whole range in a few minutes.

....

OK there is one more parameter to add to the generators : offset and index are different so each layer has 6×18×18=1944 combinations (should be all different). The 3 layers amount to 7,346,640,384 tests... so yeah I'll run that on 6 cores or something, and I'll strip all the computations down to the absolute minimum.

...

But Galois sequences have some lousy patterns:

Generator[0] =  2 :  1,  2,  4,  8, 16, 13,  7, 14,  9, 18, 17, 15, 11,  3,  6, 12,  5, 10
Generator[1] =  3 :  1,  3,  9,  8,  5, 15,  7,  2,  6, 18, 16, 10, 11, 14,  4, 12, 17, 13
Generator[2] = 10 :  1, 10,  5, 12,  6,  3, 11, 15, 17, 18,  9, 14,  7, 13, 16,  8,  4,  2
Generator[3] = 13 :  1, 13, 17, 12,  4, 14, 11, 10, 16, 18,  6,  2,  7, 15,  5,  8,  9,  3
Generator[4] = 14 :  1, 14,  6,  8, 17, 10,  7,  3,  4, 18,  5, 13, 11,  2,  9, 12, 16, 15
Generator[5] = 15 :  1, 15, 16, 12,  9,  2, 11, 13,  5, 18,  4,  3,  7, 10, 17,  8,  6, 14

The 18s are aligned, right in the middle, reminding me that the sequence is symmetrical :-/

OTOH we have to start somewhere since 18! = 6.4E15

My Galois generator can already generate 1944 sequences but that's less than 1 billionth of the sequence space, and I can't scan it exhaustively. It's possible to do 1944^2=3779136 by permuting two of these sequences. Then, it's possible to compute the pipeline up to some point, for the 18 interesting values, and be more thorough at the last stage.

Some permutation table would help too, though I'm not sure if it would slow things down instead, because the LUT would occupy about 1MiB. Making the array would be worth it only if it is used more than a million times but it's unlikely. However, the 9-bit tiles fits in 2K bytes and it's perfect.

....

C-ing...

...

And I can generate 3.7 million permutations in about 0.24 second, using 3 stages of increasing buffers.

#include <stdlib.h>
#include <stdio.h>

#define GENCOUNT (6)
#define BITS     (18)
#define MODULUS  (19)
#define PERMS    (GENCOUNT*BITS*BITS) // 1944

unsigned char generators[GENCOUNT]={ 2, 3, 10, 13, 14, 15 };
unsigned char sequences[GENCOUNT][BITS];
unsigned char perms[PERMS][BITS];

unsigned char T[BITS];

static inline void combine_perm(
    unsigned char* perm_val,
    unsigned char* perm_index,
    unsigned char* perm_dst
) {
  for (unsigned int i=0; i<BITS; i++)
    perm_dst[i] = perm_val[perm_index[i] & 31];
}

int main(int argc, char **argv) {

  for (unsigned int gen = 0; gen < GENCOUNT; gen++) {
    unsigned int generator=generators[gen];
    unsigned int num=1;
    for (unsigned int index=0; index<BITS; index++) {
      sequences[gen][index] = num;
      num = (num * generator) % MODULUS;
    }
  }

  unsigned int perm=0;
  // Scan 18*18*6=1944 sequences
  for (    unsigned int gen = 0;    gen < GENCOUNT; gen++) {
    for (  unsigned int index = 0;  index < BITS;   index++) {
      for (unsigned int offset = 0; offset < BITS;  offset++) {
        unsigned int i=index;
        unsigned int k=0;
        do {
          // convert Galois domain (which excludes 0) to bit index domain
          unsigned int j=sequences[gen][i]-1;
          j += offset;
          if (j >= BITS)
              j -= BITS; // wrap around / modulo
          perms[perm][k++]=j;
          i++;
          if (i>=BITS)
              i=0;  // circular addressing
        } while (i!=index);
        perm++;
      }
    }
  }

  for (unsigned int i=0; i < PERMS; i++) {
    for (unsigned int j=0; j < PERMS; j++) {
      combine_perm(perms[i], perms[j], T);
      for (unsigned int k=0; k < BITS; k++)
        putchar(T[k]+'a');
      putchar('\n');
    }
  }

  return EXIT_SUCCESS;
}

... of which only 1/36th are unique (104976) :-/

So i'll stay with 1944*1944*1944 = 7,346,640,384 as exhaustive scan, then I'll switch to pure random-driven permutations with reductions...

Discussions