Assembly memory math and looping

I'm struggling to figure out how a certain block would function. Given the following bytes at an address on the heap:

    004B0000 73 6D 67 66 74 smgft

and the following assembly:

    77A701B8 xor eax, eax
    77A701BA mov ecx, 4
    77A701BF lea edi, DWORD PTR DS:[ecx+4B0000]
    77A701C5 xor DWORD PTR DS:[edi], ecx
    77A701C7 loopd short ntdll.77A701BF

The problem is to provide the value of the five bytes on the heap in ASCII after the instructions have executed. What I understand so far is as follows:

xor eax, eax ; 0 out eax

mov ecx, 4 ; set ecx 4

lea edi, dword ptr ds:[ecx+4b0000] ; this loads into EDI whatever is stored at ecx+4b0000, so 4b0004. I'm not sure what this would grab. I'm not even sure what 4b0000 would get, since it's 5 bytes. mgft, or smgf? I think smgf? And how does the +4h affect this? Makes it 736D676678?

xor dword ptr ds:[edi], ecx ; So this will xor 4h with the newly loaded dword at edi, but what does it do with it in the loopd?

loopd short ntdll.77A701BF ; So this is a "loop while equal" but I'm not sure what that translates to with a xor above it. And does it decrement ecx? But then it jumps back to the lea line.


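The lea edi, dword ptr ds:[ecx+4b0000] instruction computes the address ecx+0x004b0000 and stores that number in EDI; it doesn't access memory at all. The loopd instruction is a plain loop that uses ECX as its counter, not a "loop while equal", so it ignores the result of the xor. It behaves like " ecx = ecx - 1; if(ecx != 0) goto ntdll.77A701BF ".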
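To make the loop behaviour concrete, here is a rough Python emulation of the listing. It is only a sketch: the function name run_loop and the three zero padding bytes are my own assumptions (the real heap contents after the five bytes are unknown, but they are XORed with zero, so their values don't matter).

```python
def run_loop(heap: bytes, base: int = 0x004B0000) -> bytearray:
    """Emulate the lea / xor / loopd sequence on a copy of the heap bytes."""
    # Assumption: pad with 3 zero bytes, because the DWORD access at base+4
    # touches 3 bytes beyond the 5 shown in the dump.
    mem = bytearray(heap) + bytearray(3)
    ecx = 4                                   # mov ecx, 4
    while True:
        edi = ecx + base                      # lea: address arithmetic only
        off = edi - base                      # index into our local copy
        dword = int.from_bytes(mem[off:off + 4], "little")
        dword ^= ecx                          # xor DWORD PTR DS:[edi], ecx
        mem[off:off + 4] = dword.to_bytes(4, "little")
        ecx -= 1                              # loopd: decrement ECX ...
        if ecx == 0:                          # ... and fall through at zero
            break
    return mem[:len(heap)]


result = run_loop(bytes.fromhex("73 6D 67 66 74"))
print(result.decode("ascii"))  # sleep
```

The loop body runs with ecx = 4, 3, 2, 1, and byte 0 ('s') is never touched.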

Note that this code can be unrolled, so that it becomes:

    xor eax, eax

    lea edi, DWORD PTR DS:[4+0x004B0000]
    xor DWORD PTR DS:[edi], 0x00000004

    lea edi, DWORD PTR DS:[3+0x004B0000]
    xor DWORD PTR DS:[edi], 0x00000003

    lea edi, DWORD PTR DS:[2+0x004B0000]
    xor DWORD PTR DS:[edi], 0x00000002

    lea edi, DWORD PTR DS:[1+0x004B0000]
    xor DWORD PTR DS:[edi], 0x00000001

    xor ecx,ecx

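Which can be optimised more. Each constant fits in a single byte and x86 is little-endian, so every DWORD XOR only changes the byte at the lowest address and can be narrowed to a BYTE XOR: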

    xor BYTE PTR DS:[0x004B0004], 0x04
    xor BYTE PTR DS:[0x004B0003], 0x03
    xor BYTE PTR DS:[0x004B0002], 0x02
    xor BYTE PTR DS:[0x004B0001], 0x01

    xor eax, eax       ;May be unnecessary if value unused by later code
    mov edi,0x004B0001 ;May be unnecessary if value unused by later code
    xor ecx, ecx       ;May be unnecessary if value unused by later code
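A quick way to convince yourself that a DWORD XOR with a small constant only alters the byte at the lowest address is to try it in Python. The neighbouring bytes AA BB CC below are made up purely for illustration; only the 74 comes from the heap dump.

```python
# 0x74 is the real byte at 0x004B0004; AA BB CC are hypothetical neighbours.
before = bytes.fromhex("74 AA BB CC")

# Read as a little-endian DWORD, XOR with 0x00000004, write back.
dword = int.from_bytes(before, "little")
after = (dword ^ 0x00000004).to_bytes(4, "little")

print(after.hex(" "))  # 70 aa bb cc
```

Only the first byte changes, which is why the BYTE PTR form gives the same memory contents.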

Which can be optimised a little more by combining the XORs:

    xor DWORD PTR DS:[0x004B0001], 0x04030201

    xor eax, eax       ;May be unnecessary if value unused by later code
    mov edi,0x004B0001 ;May be unnecessary if value unused by later code
    xor ecx, ecx       ;May be unnecessary if value unused by later code

Note: Yes, this is a misaligned XOR, but likely faster than multiple smaller aligned XORs on modern CPUs as it doesn't cross a cache line boundary.

Essentially, the entire loop can be reduced to a single instruction. After it runs, the five bytes are 73 6C 65 65 70, which is "sleep" in ASCII.
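If you want to check that the combined XOR really matches the four separate ones, a small Python comparison along these lines should do. The variable names are mine, and the byte order assumes x86's little-endian layout.

```python
heap = bytes.fromhex("73 6D 67 66 74")

# Four byte-sized XORs, as in the BYTE PTR version.
by_bytes = bytearray(heap)
for offset, key in ((4, 0x04), (3, 0x03), (2, 0x02), (1, 0x01)):
    by_bytes[offset] ^= key

# One DWORD XOR at offset 1, as in the combined version.
by_dword = bytearray(heap)
value = int.from_bytes(by_dword[1:5], "little") ^ 0x04030201
by_dword[1:5] = value.to_bytes(4, "little")

print(by_bytes == by_dword, by_dword.decode("ascii"))  # True sleep
```

The constant reads 0x04030201 rather than 0x01020304 because the lowest byte of the DWORD lands on the lowest address, 0x004B0001.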
