fast integer square root approximation

I am currently searching for a very fast integer square root approximation, where floor(sqrt(x)) <= veryFastIntegerSquareRoot(x) <= x

The square root routine is used for calculating prime numbers, which get considerably faster if only values below or equals sqrt(x) are checked for being a divisor of x .

What I am currently having is this function from Wikipedia, adjusted a small bit to work with 64-bit integers.

Because I have no other function to compare against (or more precise, the function is too precise for my purposes, and it probably takes more time, than being higher than the actual result.)


Loopfree/jumpfree (well: almost ;-) Newton-Raphson:

/* static will allow inlining */
static unsigned usqrt4(unsigned val) {
    unsigned a, b;

    if (val < 2) return val; /* avoid div/0 */

    a = 1255;       /* starting point is relatively unimportant */

    b = val / a; a = (a+b) /2;
    b = val / a; a = (a+b) /2;
    b = val / a; a = (a+b) /2;
    b = val / a; a = (a+b) /2;

    return a;
}

For 64-bit ints you will need a few more steps (my guess: 6)


On modern PC hardware, computing the square root of n may well be faster using floating point arithmetics than any kind of fast integer math.

Note however that it may not be required at all: you can instead square the candidates and stop when the square exceeds the value of n . The dominant operation is the division anyway:

#define PBITS32  ((1<<2) | (1<<3) | (1<<5) | (1<<7) | (1<<11) | (1<<13) | 
                  (1UL<<17) | (1UL<<19) | (1UL<<23) | (1UL<<29) | (1UL<<31))

int isprime(unsigned int n) {
    if (n < 32)
        return (PBITS32 >> n) & 1;
    if ((n & 1) == 0)
        return 0;
    for (unsigned int p = 3; p * p <= n; p += 2) {
        if (n % p == 0)
            return 0;
    }
    return 1;
}

This version can be faster as DIV is slow and for small numbers (Val<20k) this version reduces the error to less than 5%. Tested on ARM M0 (With no DIV hardware acceleration)

static uint32_t usqrt4_6(uint32_t val) {
    uint32_t a, b;

    if (val < 2) return val; /* avoid div/0 */
    a = 1255;       /* starting point is relatively unimportant */
    b = val / a; a = (a + b)>>1;
    b = val / a; a = (a + b)>>1;
    b = val / a; a = (a + b)>>1;
    b = val / a; a = (a + b)>>1;
    if (val < 20000) {  
        b = val / a; a = (a + b)>>1;    // < 17% error Max
        b = val / a; a = (a + b)>>1;    // < 5%  error Max
    }
    return a;
}
链接地址: http://www.djcxy.com/p/85738.html

上一篇: g ++和gcc的区别

下一篇: 快整数平方根近似