Stats with GrayPar17 and PEAC16x2

I know I need a better PEAC scrambler, but I first want to establish a baseline against which to compare later developments.

So I hooked the now-validated GrayPar17 layer up to the existing PEAC16x2 to see what would happen, and it's... interesting.

Disclaimer:

C/D is now part of the "payload" but is not processed by PEAC, so some stats might be slightly off.
Input data comes from an LFSR, which might not be representative, but it injects enough "noise" to help sample the space.
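
For reference, here is a minimal sketch of an LFSR-driven input source. The log does not say which LFSR is used, so the 16-bit Galois form, the tap mask 0xB400 (a commonly cited maximal-length example), the seed and the helper names are all assumptions made for illustration.

```python
def lfsr16(seed=0xACE1, taps=0xB400):
    """Yield successive states of a 16-bit Galois LFSR (assumed polynomial)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps
        yield state


def lfsr_period(seed=0xACE1, taps=0xB400):
    """Count the steps until the LFSR returns to its seed."""
    steps = 0
    for state in lfsr16(seed, taps):
        steps += 1
        if state == seed:
            return steps


def payload_words(count, width=20, seed=0xACE1):
    """Build `count` words of `width` bits from the LFSR's low output bit."""
    gen = lfsr16(seed)
    words = []
    for _ in range(count):
        word = 0
        for _ in range(width):
            word = (word << 1) | (next(gen) & 1)
        words.append(word)
    return words


if __name__ == "__main__":
    print(lfsr_period())
    print([hex(w) for w in payload_words(4)])
```

The 20-bit width matches the 20 bit positions scanned in this log; everything else can be swapped for the actual generator.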

Things to check:

the number of iterations required to detect an error (if it is ever detected): plot the histogram of the latency and find an eventual upper limit (to help define the protocol)
see which burst lengths are harder to detect, and where
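
The two checks above could be driven by a harness along these lines. This is only a sketch: `detect` is a hypothetical callback standing in for the real PEAC16x2 + GrayPar17 check (not shown in this log), and `toy_detect` is a placeholder that exists purely to exercise the loop.

```python
from collections import Counter


def scan_bit_flips(words, detect, width=20, max_loops=30):
    """Flip each bit position in every word and tally how the error is caught.

    `detect(clean, corrupted, max_loops)` is a hypothetical callback: it
    returns the number of loops needed to flag the error, or None if the
    error is still undetected after `max_loops` iterations.
    """
    tallies = {}
    for index in range(width):
        tally = Counter()
        for word in words:
            latency = detect(word, word ^ (1 << index), max_loops)
            tally["missed" if latency is None else latency] += 1
        tallies[index] = tally
    return tallies


def worst_latency(tallies):
    """Largest observed latency across all indices (misses are ignored)."""
    seen = [k for tally in tallies.values() for k in tally if k != "missed"]
    return max(seen) if seen else None


# Placeholder detector, NOT the real PEAC16x2 + GrayPar17 logic:
# it flags any difference after one loop.
def toy_detect(clean, corrupted, max_loops):
    return 1 if clean != corrupted else None


demo = scan_bit_flips(range(1000), toy_detect)
print(demo[3], worst_latency(demo))
```

Each per-index tally doubles as a latency histogram, and the largest key seen gives a candidate upper limit for the protocol.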

Overall GrayPar17 does a pretty good job at filtering most events, and I could refine the rough stats from the log "69. spurious errors".

At first, I only check 1-bit flips. The upper limit is 30 loop iterations; ideally it should be much shorter, but I want to see what happens.

Scan of all 20 bit positions, 100K times each:

~~~ Testing PEAC16+GrayPar17 w. error ~~~
burst length=1
index=0   Stage1: 50019  Stage2: 49981
index=1   Stage1:100000
index=2   Stage1: 33246  Stage2: 66754
index=3   Stage1: 50501  Stage2: 46343  Stage3: 3049  missed:107
index=4   Stage1:100000
index=5   Stage1: 33300  Stage2: 66700
index=6   Stage1: 49853  Stage2: 43765  Stage3: 6377  missed:5
index=7   Stage1:100000
index=8   Stage1: 33241  Stage2: 66759
index=9   Stage1: 50003  Stage2: 49997
index=10  Stage1:100000
index=11  Stage1: 33407  Stage2: 66593
index=12  Stage1: 50307  Stage2: 46534  Stage3: 3042  missed:117
index=13  Stage1:100000
index=14  Stage1: 33384  Stage2: 66616
index=15  Stage1: 49821  Stage2: 43858  Stage3: 6316  missed:5
index=16  Stage1:100000
index=17  Stage1: 33221  Stage2: 66779
index=18  Stage1: 50043  Stage2: 37403  Stage3:12552  missed:2
index=19  Stage1: 66702  Stage2: 33298

Stage 1 is detection at the output of SUB3; it should average a 50% detection rate.
Stage 2 is immediate detection by comparison of the marker: it catches most of the rest but could miss 1/4 (1/8).
Stage 3 is when the marker is not rejected immediately, but only after a number of iterations.
Missed: PEAC has not bubbled the error up to the carry within 30 cycles...

Offending indices are 3 and 12 (1/1000th chance of missing a single bit flip), followed by 6, 15 and 18 (< 1/10000).
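
As a quick sanity check of those orders of magnitude, the miss rates can be recomputed from the "missed" column of the scan above (100,000 injections per index). The variable and function names are illustrative only.

```python
TRIALS_PER_INDEX = 100_000

# "missed" counts copied from the 30-loop scan output above.
MISSED = {3: 107, 6: 5, 12: 117, 15: 5, 18: 2}


def one_in(count, trials=TRIALS_PER_INDEX):
    """Express a miss count as a rounded '1 in N' figure."""
    return round(trials / count)


# Worst offenders first.
for index, count in sorted(MISSED.items(), key=lambda item: -item[1]):
    print(f"index {index:2d}: {count:3d} missed, about 1 in {one_in(count):,}")
```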

OTOH indices 1, 4, 7, 10, 13, 16 are spot on. But that's only 1 in 3 positions.

The other positions are caught immediately at stage 2.

What makes indices 3 and 12 so sensitive? And why is 9 not affected?

2-9-16 are the summed output of the parities and the markers, these bits do not appear fragile.
3-6-12-15-18 all belong to the middle V/P2
in particular, 3 and 12 are closely related: they are neighbours in the parity circuits and share the input/output bit 6 (and 2 and 12).
paradoxically, the excellent 1-4-7-10-13-16 signals are related to P3, which is combined through a longer chain in the ADD3/SUB3 units.
the remaining bits 5-8-11-14-17 are related to P1/S0 by a simple XOR.

So the ADD3/SUB3 system might not be ideal; an end-around carry could become necessary.

I also want to see how long it takes for an error beyond stage 2 to be detected.

Pushing to 50 loops, I get:

 1 :   5830 - *****************************************************************************
 2 :   7574 - ****************************************************************************************************
 3 :   4492 - ************************************************************
 4 :   2897 - ***************************************
 5 :   2249 - ******************************
 6 :   1587 - *********************
 7 :   1359 - ******************
 8 :    998 - **************
 9 :    756 - **********
10 :    599 - ********
11 :    559 - ********
12 :    447 - ******
13 :    371 - *****
14 :    317 - *****
15 :    171 - ***
16 :    209 - ***
17 :    189 - ***
18 :    155 - ***
19 :     97 - **
20 :    104 - **
21 :    129 - **
22 :     71 - *
23 :     46 - *
24 :     24 - *
25 :     46 - *
26 :     52 - *
27 :     15 - *
28 :     18 - *
29 :     23 - *
30 :     31 - *
31 :     18 - *
32 :      4 - *
33 :      9 - *
34 :     14 - *
35 :     12 - *
36 :     17 - *
37 :     12 - *
38 :      0 -
39 :     12 - *
40 :     11 - *
41 :     10 - *
42 :      0 -
43 :     10 - *
44 :      0 -
45 :      6 - *
46 :      4 - *
47 :      6 - *

That's 31560 errors detected by PEAC, but still 14 are missed out of 2 million error injections.

The first 10 loops already catch 28341 errors, or about 90% of the 31560, but what is an acceptable error rate?

Convergence is a problem because pushing to 20 loops catches 30960, or 98%, but is it enough yet, and can the protocol accommodate such a long delay? Particularly since, with such a long latency, other errors could pop up and "cancel" the first flip.
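
The cumulative figures can be recomputed directly from the histogram, which also makes it easy to ask how many loops a given coverage target would need. This is a small sketch with illustrative names; the shares are relative to the errors PEAC eventually caught, so the 14 misses are not included.

```python
# Latency histogram copied from the 50-loop run above: entry i is the
# number of errors PEAC flagged after i+1 loops (loops 1 to 47).
LATENCY_COUNTS = [
    5830, 7574, 4492, 2897, 2249, 1587, 1359, 998, 756, 599,
    559, 447, 371, 317, 171, 209, 189, 155, 97, 104,
    129, 71, 46, 24, 46, 52, 15, 18, 23, 31,
    18, 4, 9, 14, 12, 17, 12, 0, 12, 11,
    10, 0, 10, 0, 6, 4, 6,
]


def caught_within(loops, counts=LATENCY_COUNTS):
    """Number of errors detected in at most `loops` iterations."""
    return sum(counts[:loops])


def share_within(loops, counts=LATENCY_COUNTS):
    """Fraction of all PEAC-detected errors caught in at most `loops`."""
    return caught_within(loops, counts) / sum(counts)


def loops_needed(share, counts=LATENCY_COUNTS):
    """Smallest loop count whose cumulative share reaches `share`."""
    total = sum(counts)
    running = 0
    for loops, count in enumerate(counts, start=1):
        running += count
        if running >= share * total:
            return loops
    return None


for loops in (10, 20, 30):
    print(f"{loops:2d} loops: {caught_within(loops):5d} caught, "
          f"{share_within(loops):.1%}")
print("loops for 99%:", loops_needed(0.99))
```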

Hopefully, the convergence should be solved by a more appropriate PEAC structure.
