The article is out!
I have about 14 pages in GNU/Linux Magazine France #213, where I go through the whole optimisation process and go even further.

This article goes beyond the methods shown so far, and I will try to develop them. The AVX SIMD extension brings some significant advantages and requires a deep redesign of the algorithm, but the basic ideas remain the same, summarised in the following graph:

I'll have to update the project, or maybe create a new one. Meanwhile, I'm pretty proud:
