Square roots

Project Logs

Collapse

The binary comb instruction
Yann Guidon / YGDES • 09/18/2022 at 23:10 • 0 comments
I remember being "somewhat focused" on integer square roots, from about 1994 to 1996 until other pet ideas took over. I saw some algorithms from naïve/crude to more established and the Newton method seemed nice. Except that it never converges fast enough and Newton-Raphson is too computationally intensive for small/weak processors. The critical aspect was now to find the best approximation with the fewest complexity and I didn't even have my Baccalauréat when I found a way, which I still haven't seen out there. The problem is that it requires a new opcode but this one would be very simple to compute.
It didn't give it a proper name that I remember, but today I would call this opcode COMB because it acts like a comb on binary data. It would keep one bit in each pair and dump the other. As an illustration, if you have 8 bits callled ABCDEFGH respectively, the result would be 0000ACEG. It drops one bit every 2 bits, and pads the MSB with 0.
Mathematically, it selects all the odd indices of bit, which are a way to rearrange the polynomial construction of the value. The algorithm would be :
- COMB n to o
- COMB (n>>1) to e
- a = e + (o × 1.41... normalised to the binary representation)
Of course, if the opcode map is large enough, one could implement two opcode versions to select the odd and even bits to save one shift instruction.
The multiply with a constant is necessary to get enough precision for somewhat large values, though the mathematical definition is clearly still not fully respected (there are extra terms to compute), but I have tested this and seen that the approximation is "good enough" to speed up Newton to satisfaction.
A cruder version would use no multiply and provide a worse approximation. In fact I believe it was my first idea :
a = COMB ( n | n>>1 )
It's faster but lousy.
Anyway, COMBE and COMBO look like opcodes that can be very easily implemented in #F-CPU for example.

View project log

Discussions

Tim wrote 09/18/2022 at 20:34

The old ARM sqrt has been floating around forever, here seems to be a summary

https://www.pertinentdetail.org/sqrt

Are you sure? yes | no

Yann Guidon / YGDES wrote 09/18/2022 at 22:36

Great link, even though a bit specialised and slow...

Related : https://www.pertinentdetail.org/invsqrt

Don't know if I should mirror them here though.

Are you sure? yes | no

Tim wrote 09/18/2022 at 22:49

>slow...

nah... scroll down a bit for 3 cycles/bit version.

(and please don't rebuke with newton raphson or the "quake" invsqrt, that is not from quake ;) )

Are you sure? yes | no

Tim wrote 09/18/2022 at 22:54

Here is a good page on algorithms based on numeric approximation, that is useful for floats:

http://www.azillionmonkeys.com/qed/sqroot.html

Are you sure? yes | no

Olivier Pirson wrote 09/18/2022 at 17:17

My goal is to compute ⌊√n⌋ for n unsigned integer on 64 bits, exactly and in efficient way.

Are you sure? yes | no

Yann Guidon / YGDES wrote 09/18/2022 at 15:33

https://gcc.godbolt.org/z/1deK4sffE

My friend Olivier Pirson is still struggling with integer 64-bit sqrt and getting fast code from portable C source...

Are you sure? yes | no

Square roots

Become a Hackaday.io member

Just one more thing

Description

Project Logs Collapse

The binary comb instruction

Enjoy this project?

Discussions

Become a Hackaday.io Member

Does this project spark your interest?

Report project as inappropriate

Send message

Remove Member

Project Logs

Collapse