Does C++ compiler recognizes powers of 2?

2018-06-10 23:26:09

I'm building a custom hash where I sum all the letters in string according to formula:

string[0] * 65536 + string[1] * 32768 + string[2] * 16384 + ...

And I've came to a problem whether I should have these numbers defined as constants in int array like this:

const int MULTIPLICATION[] = {
    65536,
    32768,
    16384,
    8192,
    4096,
    2048,
    1024,
    512,
    256,
    128,
    64,
    32,
    16,
    8,
    4,
    2,
    1
}

Or, maybe I should just generate these numbers while counting hash itself (while probably losing some speed due to them not being generated already)? I'll need to count this hash millions of times and the main thing I want compiler to understand is that instead of normal MUL operation

MOV EBX, 8
MUL EBX

it would do

SHL EAX, 3

Does compiler understand that if I'm multiplying by power of 2 to shift bits instead of usual multiplication?

Another question, I'm pretty sure it does shift bits when you write in c++ number *= 2; But just to clarify, does it?

Thanks, I've found out how to view dissasembly in debugger. Yes, compiler does understand to shift bits if you use it like

number *= 65536

However, it does normal multiplication if you do

number1 = 65536
number *= number1;

Try it!

What compiler are you using? You can tell most compilers to leave the intermediate files in place after compilation, or to just compile (and not assemble), so you can actually look at the assembly code it generated.

You can see on this other question of mine that this is just what I did.

For example, in gcc the -S flag means "compile only". And -masm=intel generates more readable assembly, IMO.

Edit

That all said, I think the following is the algorithm you're looking for (untested):

// Rotate right by n bits
#define ROR(a, n)   ((a >> n) | (a << (sizeof(a)*8-n)))


int custom_hash(const char* str, int len) {
    int hash = 0;
    int mult = 0x10000;  // 65536, but more obvious

    for (int i=0; i<len; i++) {
        hash += str[i] * mult;
        mult = ROR(mult, 1);    
    }

    return mult;
}

First of all, you didn't specify what happens when you have more than 16 characters (what is the multiplier?) So in this implementation, I used a bitwise rotate. x86 has bitwise rotate instructions ( ror and rol for rotate right and left, respectively). However, C provides no way of expressing a rotate operation. So I define the ROR macro which does the rotate for you. (Understanding how it works is left as an exercise for the reader!)

In my loop, I start the multiplier at 0x10000 (65536) like you did. Each iteration of the loop, I rotate the multiplier right by one bit. This essentially divides it by two, until you get to 1, after which it becomes 0x80000000.

The answer depends on your compiler, hardware architecture, and possibly other things.

It is not even obvious a priori that replacing such multiplication with a shift is the optimal thing to do. I think that generally one should let instruction-level optimizations to the compiler.

That said, let's see what my compiler does :)

int i, j;

int main() {
  j = i * 8;
}

This, compiled using gcc 4.7.2 with -O3 , results in

_main:
LFB0:
        movq    _i@GOTPCREL(%rip), %rax
        movl    (%rax), %edx
        movq    _j@GOTPCREL(%rip), %rax
        sall    $3, %edx                  ;<<<<<<<<<< THE SHIFT INSTRUCTION
        movl    %edx, (%rax)
        ret

So, in my environment, the answer is clearly "yes".

As to your other question, don't precompute MULTIPLICATION . To get the coefficients in

string[0] * 65536 + string[1] * 32768 + string[2] * 16384 + ...

just start with coeff = 65536 and shift it one bit to the right every iteration:

coeff >>= 1;

你为什么不使用C ++中的shift运算符？

(string[0] << 16) + (string[1] << 15) + (string[2] << 14) + ...

链接地址: http://www.djcxy.com/p/31572.html

上一篇: 在PHP中执行FOR vs FOREACH

下一篇: C ++编译器是否可以识别2的幂次？