With the last log 139. Heuristic breakthrough, we found a sieve-like structure for the moduli that give perfect and maximal orbits. We also found that perfect and maximal orbits obey to the same rules except for one (the base 4 "detail").
I have only tested up to the prime 131 but it is interesting to get more of these because they will help to better sieve out more candidates with little effort. In return, more effort is needed with more thorough heuristics. As a warm-up, let's find the prime numbers that match our criteria: let's find a list of primes, up to 1 million for example, and keep only those with 1 and 9 as last digits. Well I know I could write a sieve but I have better than that to do so let's find something ready in the interwebs. It was a breeze with https://numbergenerator.org/numberlist/prime-numbers. which let me download a file of 78496 odd numbers up to 999983. That should be enough.
grep '[19]$' Primes.1M.txt > Primes.19.txt wc -l Primes.19.txt
ok that was easy. 39210 numbers left. What is way less easy is to get the corresponding pairs of forbidden remainders for each of these bases. I have already noticed some funky patterns in them and the sieve implies some successive elimination algorithm but let's not get too dizzy: the numbers, not the ideas, rule, so let's make them speak first. But for that we need *more* than the 131540 orbits already computed. It's going to take even more time than before...
Fortunately now we have even more insight into them and we can apply 2 new heuristics:
- We can stop the iteration once they exceed the maximal threshold, thus halving the computation time one in every 4 unsieved candidates. This would save about 20% of computation time overall.
- We can pre-sieve out more numbers. A previous version would discard all 2 mod 5 candidates but we can do much better.
Oh and we can run the code on more cores at once and concatenate the results. But first, a bit of benchmarking.
gPEAC_bases_basic.c scans the first 10000 orbits in
... * 9974 99490648 -49745324 * 9975 49755299 0 * 9978 99570460 -49785230 * 9985 49855104 0 1383 / 10000 147.60user 0.02system 2:27.94elapsed 99%CPU
Stopping the long iterations early is easy to code. Here is the output of gPEAC_bases_earlystop.c
M 9959 49595819 P 9974 49745329 M 9975 49755299 P 9978 49785235 M 9985 49855104 1383 / 10000 153.53user 0.01system 2:33.95elapsed 99%CPU
I expected a 1/5th speed increase but there is no gain, probably because the early stop adds a test/condition in the inner loop. More inlining and peep-hole optimising might be required, removing conditions inside the computation. gcc -O3 barely made a dent:
M 9985 49855104 1383 / 10000 147.81user 0.01system 2:28.18elapsed 99%CPU
A bit of unrolling removed a few seconds. Streamlining the return mechanism (smaller loop exit computation, gotos...) removes a few more seconds. But nothing significant yet.
I got a small gain with this code:
uint64_t orbit(uint64_t modulo, uint64_t max) {
uint64_t len=0, X = 1, Y = 0;
reloop:
Y += X;
if (Y >= modulo) {
Y -= modulo;
X++;
}
if ((0== ((Y ^ 1) | X))
|| (len > max))
goto return_1;
X += Y;
if (X >= modulo) {
X -= modulo;
Y++;
}
len+=2;
if ((X^1) | Y )
goto reloop;
return len;
return_1:
len++;
return len;
}
I like that it saves the S variable but the actual gain is in the % range. The loop exit conditions don't affect much the runtime. I suppose that the most taxing part is the "if > modulo" part, which flushes the pipeline so often. I can come up with some x86 sequences but inline code is dirty, right ?
.
.
.
Oh wait, I opened another can of worms. See the next log: The end of benchmarking.
Discussions
Become a Hackaday.io Member
Create an account to leave a comment. Already have an account? Log In.