most 1 bit for x=2^k?

Possible Duplicate:
Efficient bitwise operations for counting bits or find the right|left most ones

I wrote a search algorithm. In order to optimize it, I want to use a 32bit int to mark if number 0~31 could be used. That is, if I have a state State , I could easily fetch all possible number by using:

x = State & -State
State -= x

But actually, I need to know where is the 1 in x (notice that there is only 1 in x 's binary form)? For example, if x = 0000 0100 , I want to know that is the third.

I know I could do that by using a for loop, but it seems that it would cost much time.

I wonder that if there is a method using bitwise operation ? And would static_cast<int>(log2(x)) costs much time?

Thanks for any help.


Many CPUs have a native hardware instruction, something like CTZ ("count trailing zeros"). GCC exposes this through the built-in function __builtin_ctz ; other compilers should have similar facilities.


Nobody seems to have mentioned the obvious solution - to use a giant switch for all possible powers of two.

The below code implements this, plus binary search and division by two.

NOTE : these functions expect that the input will be a power of two; they may return nonsense if it is not.

#include <inttypes.h>
#include <stdio.h>

int get_exp_switch (uint64_t x)
{
    switch (x) {
        case (uint64_t)1 << 0: return 0;
        case (uint64_t)1 << 1: return 1;
        case (uint64_t)1 << 2: return 2;
        case (uint64_t)1 << 3: return 3;
        case (uint64_t)1 << 4: return 4;
        case (uint64_t)1 << 5: return 5;
        case (uint64_t)1 << 6: return 6;
        case (uint64_t)1 << 7: return 7;
        case (uint64_t)1 << 8: return 8;
        case (uint64_t)1 << 9: return 9;
        case (uint64_t)1 << 10: return 10;
        case (uint64_t)1 << 11: return 11;
        case (uint64_t)1 << 12: return 12;
        case (uint64_t)1 << 13: return 13;
        case (uint64_t)1 << 14: return 14;
        case (uint64_t)1 << 15: return 15;
        case (uint64_t)1 << 16: return 16;
        case (uint64_t)1 << 17: return 17;
        case (uint64_t)1 << 18: return 18;
        case (uint64_t)1 << 19: return 19;
        case (uint64_t)1 << 20: return 20;
        case (uint64_t)1 << 21: return 21;
        case (uint64_t)1 << 22: return 22;
        case (uint64_t)1 << 23: return 23;
        case (uint64_t)1 << 24: return 24;
        case (uint64_t)1 << 25: return 25;
        case (uint64_t)1 << 26: return 26;
        case (uint64_t)1 << 27: return 27;
        case (uint64_t)1 << 28: return 28;
        case (uint64_t)1 << 29: return 29;
        case (uint64_t)1 << 30: return 30;
        case (uint64_t)1 << 31: return 31;
        case (uint64_t)1 << 32: return 32;
        case (uint64_t)1 << 33: return 33;
        case (uint64_t)1 << 34: return 34;
        case (uint64_t)1 << 35: return 35;
        case (uint64_t)1 << 36: return 36;
        case (uint64_t)1 << 37: return 37;
        case (uint64_t)1 << 38: return 38;
        case (uint64_t)1 << 39: return 39;
        case (uint64_t)1 << 40: return 40;
        case (uint64_t)1 << 41: return 41;
        case (uint64_t)1 << 42: return 42;
        case (uint64_t)1 << 43: return 43;
        case (uint64_t)1 << 44: return 44;
        case (uint64_t)1 << 45: return 45;
        case (uint64_t)1 << 46: return 46;
        case (uint64_t)1 << 47: return 47;
        case (uint64_t)1 << 48: return 48;
        case (uint64_t)1 << 49: return 49;
        case (uint64_t)1 << 50: return 50;
        case (uint64_t)1 << 51: return 51;
        case (uint64_t)1 << 52: return 52;
        case (uint64_t)1 << 53: return 53;
        case (uint64_t)1 << 54: return 54;
        case (uint64_t)1 << 55: return 55;
        case (uint64_t)1 << 56: return 56;
        case (uint64_t)1 << 57: return 57;
        case (uint64_t)1 << 58: return 58;
        case (uint64_t)1 << 59: return 59;
        case (uint64_t)1 << 60: return 60;
        case (uint64_t)1 << 61: return 61;
        case (uint64_t)1 << 62: return 62;
        case (uint64_t)1 << 63: return 63;
        default: return 0; // not allowed
    }
}

int get_exp_simple (uint64_t x)
{
    int i = -1;
    do {
        i++;
        x /= 2;
    } while (x > 0);
    return i;
}

int get_exp_binsearch (uint64_t x)
{
    int left = 63;
    int right = 0;

    while (left > right) {
        int middle = (left + right) / 2;
        uint64_t middle_value = (uint64_t)1 << middle;
        if (x < middle_value) {
            left = middle - 1;
        }
        else if (x > middle_value) {
            right = middle + 1;
        }
        else {
            return middle;
        }
    }

    return left;
}

int main ()
{
    uint64_t sum = 0;

    for (int j = 0; j < 100000; j++) {
        for (int i = 0; i < 64; i++) {
            uint64_t x = (uint64_t)1 << i;
            int l = get_exp_switch(x);
            //int l = get_exp_simple(x);
            //int l = get_exp_binsearch(x);
            sum += l;
            //printf("%" PRIu64 ": %dn", x, l);
        }
    }

    printf("%" PRIu64 "n", sum);

    return 0;
}

Benchmark results on my (64-bit) system with clang -O2 :

get_exp_switch: 0m0.103s
get_exp_simple: 0m0.196s
get_exp_binsearch: 0m0.158s

However, note that as you use larger integers (bignum), the binary search method will quickly begin to outperform the simple method, and the code size of the switch method may get unacceptably large.


如果您必须使用按位运算符,那么您可以使用位扫描技巧,但大多数编译器对于找到第一个/最后一个位(又名BitScanForward和BitScanReverse)具有内在性,这些功能在现代处理器上非常快,应该使用从技术上讲,这些都是按位操作,即它们在位上操作,它们只是没有操作符形式)。

链接地址: http://www.djcxy.com/p/89582.html

上一篇: CUDA中缓冲区的按位移动

下一篇: x = 2 ^ k时最多1位?