Close

Brute-forcing in C

A project log for miniMAC - Not an Ethernet Transceiver

custom(izable) circuit for sending some megabytes over differential pairs.

yann-guidon-ygdesYann Guidon / YGDES 4 days ago0 Comments

As I try to find the best permutations in C, the tiles need to be emulated as well, and I have come up with this C code, optimised for speed on recent OOO CPUs:

static inline unsigned int Encode_layer(unsigned int X) {
  // inversions
  unsigned int Y = (X & 0x02010) * 30;

  // Gray encoding
  unsigned int G = (X & 0x03C1E) >> 1;

  return X ^ Y ^ G ;
}

static inline unsigned int Decode_layer(unsigned int X) {
  unsigned int
    Y =  X & 0x02010,
    A = (Y          ) >> 4,
    B = (X & 0x03018) >> 3;
  Y = Y*30;
    A ^= (X & 0x0381C) >> 2;
    B ^= (X & 0x03C1E) >> 1;
  return X ^ Y ^ A ^ B;
}

I process both 9-bit "tiles" at once, in the same word (SWAR approach).

For each tile, bits 5-9 are inverted by bit 4, and bits 0-4 are a Gray cascade.

Optimising the permutation was a different story and I opted to allocate 3MB of L3 cache...

..

..

..  (typingtypingtypingtypingtypingtypingtypingtyping)

..

And I was dubious but I got some pretty impressive results. The top ones so far look like this, after the 3rd layer :

min    Encoder      Decoder
sum  sum max min  sum max min  permutation #s
 5    99  15  2   165  18  3    1663:1314   
 5    99  15  2   165  18  3    1672:1476   
 5    99  15  2   165  18  3    1825:1323  
 5    99  15  2   165  18  3    1834:1485  
 5   100  12  2   118  11  3    1087:1380  
 5   100  12  2   118  11  3    1096:1542  
 5   100  12  2   118  11  3    1249:1371  
 5   100  12  2   118  11  3    1258:1533  
 5   101  14  1   145  14  4     201:1353  
 5   101  14  1   145  14  4     210:1515  
 5   101  14  1   145  14  4      39:1362  
 5   101  14  1   145  14  4      48:1524  
 5   101  14  1   159  12  4     201:1362  
 5   101  14  1   159  12  4     210:1524  
 5   101  14  1   159  12  4      39:1353  
 5   101  14  1   159  12  4      48:1515  
 5   102  11  2   120  10  3     476: 108  
 5   102  11  2   120  10  3     485: 270  
 5   102  11  2   120  10  3     638: 117  
 5   102  11  2   120  10  3     647: 279  

So far I have a good set of criteria to filter further brute-forcing for the 4th layer (3rd permutation). And I have optimised the heck out of it so it produces these results in under 1 second, so the further tests will take only like 20 minutes...

And if I'm patient enough I can relax the thresholds to find more interesting and better behaving configurations.

And it seems that 4 layers could be enough after all :-)

.....................

Adding the 3rd permutation, I get these results:

 Min      Encoder       Decoder
 Sum    Sum Max Min   Sum Max Min    Perm. numbers
  12    127  13  3    200  14   9    1663:1482:1458
  12    127  13  3    200  14   9    1672:1320:1467
  12    127  13  3    200  14   9    1825:1491:1296
  12    127  13  3    200  14   9    1834:1329:1305
  12    130  12  3    196  17   9     669:1154: 386
  12    130  12  3    196  17   9     678: 992: 395
  12    130  12  3    196  17   9     831:1163: 548
  12    130  12  3    196  17   9     840:1001: 557
  12     99  10  2    203  15  10     218:1241:1478
  12     99  10  2    203  15  10     227:1079:1487
  12     99  10  2    203  15  10      56:1232:1316
  12     99  10  2    203  15  10      65:1070:1325

The decoder gets impressive avalanche but the encoder doesn't take off :-/ the minimum avalanche doesn't get above 3...

Which is explained by a bug in the code ...

.......................

.......................

And with the stupid bug found, here it goes:

 13   148 12  5   172 13  8   - 1646:1465:1157
 13   148 12  5   172 13  8   - 1655:1303:1166
 13   148 12  5   172 13  8   - 1808:1474: 995
 13   148 12  5   172 13  8   - 1817:1312:1004
 13   148 12  5   184 13  8   - 1646:1465:1166
 13   148 12  5   184 13  8   - 1655:1303:1157
 13   148 12  5   184 13  8   - 1808:1474:1004
 13   148 12  5   184 13  8   - 1817:1312: 995
 13   177 13  8   127 10  5   - 1425:1705: 762
 13   177 13  8   127 10  5   - 1434:1867: 771
 13   177 13  8   127 10  5   - 1587:1696: 924
 13   177 13  8   127 10  5   - 1596:1858: 933
 13   178 12  8   149 13  5   - 1730:  30:1314
 13   178 12  8   149 13  5   - 1739: 192:1323
 13   178 12  8   149 13  5   - 1892:  21:1476
 13   178 12  8   149 13  5   - 1901: 183:1485
 13   195 14  7   150 11  6   - 1731: 854:1016
 13   195 14  7   150 11  6   - 1740: 692:1025
 13   195 14  7   150 11  6   - 1893: 863:1178
 13   195 14  7   150 11  6   - 1902: 701:1187
 13   196 15  7   156 11  6   -  398:1847:1353
 13   196 15  7   156 11  6   -  407:1685:1362
 13   196 15  7   156 11  6   -  560:1838:1515
 13   196 15  7   156 11  6   -  569:1676:1524

I expected a slightly higher minimal score, but if a particular behaviour is needed, on the send or the receive side, then we can select a score of 8 at the cost of "only" 5 on the other side.

But these results are actually excellent, in fact. So here is the dump of the four sets of permutations (highlighted above).

Dumping permutations:
Perm398 =
 forward(  6 16  8  3  7  1 17 11 12 15  5 13  0 14  2  4 10  9 )
 reverse( 12  5 14  3 15 10  0  4  2 17 16  7  8 11 13  9  1  6 )
Perm1847 =
 forward( 17  2  9  0 16  6 11  7  8  4  1 12  3  5 15 10 14 13 )
 reverse(  3 10  1 12  9 13  5  7  8  2 15  6 11 17 16 14  4  0 )
Perm1353 =
 forward( 10  1 12  9  5  6  2  7 15 13  4 11 14  0 17  3 16  8 )
 reverse( 13  1  6 15 10  4  5  7 17  3  0 11  2  9 12  8 16 14 )

Dumping permutations:
Perm407 =
 forward( 15  7 17 12 16 10  8  2  3  6 14  4  9  5 11 13  1  0 )
 reverse( 17 16  7  8 11 13  9  1  6 12  5 14  3 15 10  0  4  2 )
Perm1685 =
 forward(  4  1 12  3  5 15 10 14 13 17  2  9  0 16  6 11  7  8 )
 reverse( 12  1 10  3  0  4 14 16 17 11  6 15  2  8  7  5 13  9 )
Perm1362 =
 forward(  1 10  3  0 14 15 11 16  6  4 13  2  5  9  8 12  7 17 )
 reverse(  3  0 11  2  9 12  8 16 14 13  1  6 15 10  4  5  7 17 )

Dumping permutations:
Perm560 =
 forward( 15  5 13  0 14  2  4 10  9  6 16  8  3  7  1 17 11 12 )
 reverse(  3 14  5 12  6  1  9 13 11  8  7 16 17  2  4  0 10 15 )
Perm1838 =
 forward(  8 11  0  9  7 15  2 16 17 13 10  3 12 14  6  1  5  4 )
 reverse(  2 15  6 11 17 16 14  4  0  3 10  1 12  9 13  5  7  8 )
Perm1515 =
 forward( 13  4 11 14  0 17  3 16  8 10  1 12  9  5  6  2  7 15 )
 reverse(  4 10 15  6  1 13 14 16  8 12  9  2 11  0  3 17  7  5 )

Dumping permutations:
Perm569 =
 forward(  6 14  4  9  5 11 13  1  0 15  7 17 12 16 10  8  2  3 )
 reverse(  8  7 16 17  2  4  0 10 15  3 14  5 12  6  1  9 13 11 )
Perm1676 =
 forward( 13 10  3 12 14  6  1  5  4  8 11  0  9  7 15  2 16 17 )
 reverse( 11  6 15  2  8  7  5 13  9 12  1 10  3  0  4 14 16 17 )
Perm1524 =
 forward(  4 13  2  5  9  8 12  7 17  1 10  3  0 14 15 11 16  6 )
 reverse( 12  9  2 11  0  3 17  7  5  4 10 15  6  1 13 14 16  8 )

By altering the Galois sequences (some permutations here and there), more results are obtained. So far, I haven't found better scores. I'll continue a bit, then try another approach.

.

Note : all "hits" go in quartets, I don't know yet why, but we find the same deltas : 9 between a pair of results and  171 between extremes... at least for the permutation number of the 1st level.

.

Source code is there : PermParam_20251226.tgz

.

.

.

Quadrupling the number of sequences, using xors (that do some swap actually), I get this one:

14  169 188  14  6   14  8 -  840  812 5745

Perm840 =
 forward(  2  7  0  6  9  1 15 13 12  3 16  5 17 14  4  8 10 11 )
 reverse(  2  5  0  9 14 11  3  1 15  4 16 17  8  7 13  6 10 12 )

Perm812 =
 forward(  1 10 15  8 14 17  9  5  3  2 11  6 13  7  4 12 16  0 )
 reverse( 17  0  9  8 14  7 11 13  3  6  1 10 15 12  4  2 16  5 )

Perm5745 =
 forward( 14  1  8 10  0  5 15 16 12 13  6 11 17  9  2  4  3  7 )
 reverse(  4  1 14 16 15  5 10 17  2 13  3 11  8  9  0  6  7 12 )

 It's imbalanced but still looks nice.

This is achieved by expanding the Galois sequences with XOR by a number (between 1 and 3) to create alternate versions. I get 7776 sequences, more opportunities for cross-checking and finding happy numerical accidents.

There is also the balanced ones:

14  188 168  14  7   14  7 - 1965 7515 4021
14  188 168  14  7   14  7 - 1974 7677 4030
14  188 168  14  7   14  7 - 2127 7506 4183
14  188 168  14  7   14  7 - 2136 7668 4192

Perm1965 =
 forward(  3  5  9 17 16 10 15 12  1  2  0 14  6  7 13  8 11  4 )
 reverse( 10  8  9  0 17  1 12 13 15  2  5 16  7 14 11  6  4  3 )

Perm7515 =
 forward( 17  2 11  0  6 16  8  9 10 14  1  7 13 15  5 12  4  3 )
 reverse(  3 10  1 17 16 14  4 11  6  7  8  2 15 12  9 13  5  0 )

Perm4021 =
 forward(  4 17  6  5  1 15  7 14 16 13  0  9 10  8 12  2  3 11 )
 reverse( 10  4 15 16  0  3  2  6 13 11 12 17 14  9  7  5  8  1 )


Perm1974 =
 forward( 12 14  0  8  7  1  6  3 10 11  9  5 15 16  4 17  2 13 )
 reverse(  2  5 16  7 14 11  6  4  3 10  8  9  0 17  1 12 13 15 )

Perm7677 =
 forward( 14  1  7 13 15  5 12  4  3 17  2 11  0  6 16  8  9 10 )
 reverse( 12  1 10  8  7  5 13  2 15 16 17 11  6  3  0  4 14  9 )

Perm4030 =
 forward( 13  8 15 14 10  6 16  5  7  4  9  0  1 17  3 11 12  2 )
 reverse( 11 12 17 14  9  7  5  8  1 10  4 15 16  0  3  2  6 13 )


Perm2127 =
 forward(  2  0 14  6  7 13  8 11  4  3  5  9 17 16 10 15 12  1 )
 reverse(  1 17  0  9  8 10  3  4  6 11 14  7 16  5  2 15 13 12 )

Perm7506 =
 forward(  8 11  2  9 15  7 17  0  1  5 10 16  4  6 14  3 13 12 )
 reverse(  7  8  2 15 12  9 13  5  0  3 10  1 17 16 14  4 11  6 )

Perm4183 =
 forward( 13  0  9 10  8 12  2  3 11  4 17  6  5  1 15  7 14 16 )
 reverse(  1 13  6  7  9 12 11 15  4  2  3  8  5  0 16 14 17 10 )

And now I understand why I get these deltas of 9 : the permutations are identical, rotated/offset by 9 because the tiles are symmetrical too, so the whole permutations are +9 everywhere... It's obvious from Perm1965 vs Perm1974.

Discussions