What does the compiler do in this assembly code?

I'm trying to understand what a C compiler will do when it compiles to assembly. The code I've compiled to assembly is this:

void main() {
    int x = 10;
    int y = 10;
    int a = x + y;
}

Which produces the following assembly:

                .Ltext0:
                    .globl  main
                main:
                .LFB0:
0000 55             pushq   %rbp
0001 4889E5         movq    %rsp, %rbp
0004 C745F40A       movl    $10, -12(%rbp)
000b C745F80A       movl    $10, -8(%rbp)
0012 8B45F8         movl    -8(%rbp), %eax
0015 8B55F4         movl    -12(%rbp), %edx
0018 01D0           addl    %edx, %eax
001a 8945FC         movl    %eax, -4(%rbp)
001d 5D             popq    %rbp
001e C3             ret

However I'm having some trouble understanding what in particular is going on in this snippet. I understand all the labels, and some of the assembly. Here's what I think it does:

  • push rbp? - is this for a stack frame or something?
  • set stack pointer to base pointer? (ie clear stack)
  • moves 10 into stack? Offset by -12? Why 12, and why is it negative?
  • moves 10 into stack, though this time at -8 instead of -12 (difference of 4, perhaps bytes or something?)
  • move value at -8 into eax
  • move value at -12 into edx
  • add eax and edx
  • move the value from eax into the stack
  • pop rbp? end of function stack frame possibly?
  • return from the function??

Can anyone clarify certain points of this assembly, such as why the compiler chooses -8 and -12, why it uses eax and edx rather than some other registers, and why it pushes and pops rbp?


    push rbp? - is this for a stack frame or something?

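    Yes. The compiler sets up a stack frame for the function's local variables. pushq %rbp saves the caller's frame pointer, and movq %rsp, %rbp then makes RBP point at the new frame (it copies RSP into RBP; nothing is cleared). This pair is the standard function prologue, and it gives every local variable a fixed offset from RBP.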

    moves 10 into stack? Offset by -12? Why 12, and why is it negative?

    In this case, the compiler chose the 4 bytes (the size of an int) from -12(%rbp) to -9(%rbp) for the variable x.

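    The stack grows toward lower addresses, so once a stack frame is created, local variables sit below RBP and are accessed with negative offsets, while function arguments passed on the stack sit above it and are accessed with positive offsets: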

    ------------------------------------------------------
                            | R |
         New stack (locals) | B | Old stack (parameters)
                            | P |
    ------------------------------------------------------
                              ^
                              RBP is updated to point here as well so you get negative offsets (to the left) for locals and positive offsets (to the right) for parameters.
    

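    Note that the saved RBP and the function's return address also take up space above RBP, so the first stack-passed parameter is at offset 16 (offset 8 on 32-bit systems). On x86-64 System V, the first six integer arguments are passed in registers, so only arguments beyond those are found on the stack.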
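To make the addressing concrete, here is a minimal sketch in C that models the frame as a byte array and replays the body of main one instruction at a time. The array, the helper names store32 and load32, and the position chosen for rbp are all made up for illustration; a real stack is laid out by the compiler and the OS, not by code like this.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a small byte array standing in for the stack. */
enum { STACK_SIZE = 64 };

/* Store a 32-bit value at base + offset, like `movl %reg, off(%rbp)`. */
static void store32(uint8_t *base, int offset, int32_t value) {
    memcpy(base + offset, &value, sizeof value);
}

/* Load a 32-bit value from base + offset, like `movl off(%rbp), %reg`. */
static int32_t load32(const uint8_t *base, int offset) {
    int32_t value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Mirrors the body of main from the listing, one statement per instruction. */
int32_t simulate_main_body(void) {
    uint8_t stack[STACK_SIZE] = {0};
    uint8_t *rbp = stack + 32;       /* arbitrary frame base inside the array */
    assert(rbp - 12 >= stack);       /* the locals must fit below the base */

    store32(rbp, -12, 10);           /* movl $10, -12(%rbp)    x = 10 */
    store32(rbp, -8, 10);            /* movl $10, -8(%rbp)     y = 10 */
    int32_t eax = load32(rbp, -8);   /* movl -8(%rbp), %eax */
    int32_t edx = load32(rbp, -12);  /* movl -12(%rbp), %edx */
    eax += edx;                      /* addl %edx, %eax */
    store32(rbp, -4, eax);           /* movl %eax, -4(%rbp)    a = x + y */

    return load32(rbp, -4);          /* read a back out of its slot */
}
```

Calling simulate_main_body() returns the value that ends up in the slot for a. The point of the model is only that every local is "RBP plus a fixed negative number", which is all the generated code relies on.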

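    Often the function must also move RSP down before using its locals, like this: subq $12, %rsp. That tells everything that uses the stack afterwards (pushes, called functions) that those 12 bytes are taken. On the way out, the function undoes it with addq $12, %rsp, or with leave, which copies RBP back into RSP and pops RBP. (Real compilers usually reserve a multiple of 16 bytes rather than exactly 12, to keep the stack aligned.) In your example none of this is needed: main calls no other function, and the x86-64 System V ABI guarantees such a leaf function 128 bytes below RSP, the "red zone", that nothing else will overwrite, so the compiler can use that space without adjusting RSP.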
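As a rough illustration of how much a compiler might subtract from RSP, the sketch below rounds the size of the locals up to a multiple of 16. This assumes the usual x86-64 System V requirement that the stack stay 16-byte aligned at calls; the function names are hypothetical, and an actual compiler may reserve more (or nothing at all, as in the listing above).

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of 16. */
size_t round_up_16(size_t bytes) {
    return (bytes + 15) & ~(size_t)15;
}

/* Hypothetical helper: bytes a compiler might subtract from RSP for a
   function with `locals_bytes` of locals, assuming 16-byte alignment. */
size_t frame_reservation(size_t locals_bytes) {
    size_t reserved = round_up_16(locals_bytes);
    /* The reservation is aligned and never smaller than what was asked for. */
    assert(reserved % 16 == 0 && reserved >= locals_bytes);
    return reserved;
}
```

For the three ints in the question (12 bytes), frame_reservation(12) gives 16, which is why a non-leaf version of this function would typically show subq $16, %rsp rather than subq $12, %rsp.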

    moves 10 into stack, though this time at -8 instead of -12

    Once again this references a local variable, except this time the compiler chose the 4 bytes from -8(%rbp) to -5(%rbp) for the variable y. The two offsets differ by 4 because an int is 4 bytes here.
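The offsets -12, -8 and -4 fall out of simple arithmetic: three 4-byte ints packed back to back, ending exactly where RBP points. Here is a small sketch of that arithmetic, using a made-up struct to stand in for the block of locals. It assumes a 4-byte int and the ordering seen in the listing; a compiler is free to order or pad locals differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct standing in for the three locals, in the order the
   listing places them from low to high addresses. */
struct frame_locals {
    int x;  /* -12(%rbp) in the listing */
    int y;  /*  -8(%rbp) */
    int a;  /*  -4(%rbp) */
};

/* Offset of a member relative to the end of the block, i.e. relative to RBP
   if the block ends exactly where RBP points. */
static long offset_from_rbp(size_t member_offset) {
    assert(sizeof(int) == 4);  /* assumption: 4-byte int, as on x86-64 */
    return (long)member_offset - (long)sizeof(struct frame_locals);
}

long x_offset(void) { return offset_from_rbp(offsetof(struct frame_locals, x)); }
long y_offset(void) { return offset_from_rbp(offsetof(struct frame_locals, y)); }
long a_offset(void) { return offset_from_rbp(offsetof(struct frame_locals, a)); }
```

Under those assumptions the three functions give -12, -8 and -4, matching the operands in the listing.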

    At the end of the function, popq %rbp restores the caller's frame pointer, leaving the stack as it was on entry:

    ------------------------------------------------------
                            | R |
         New stack (locals) | B | Old stack (parameters)
                            | P |
    ------------------------------------------------------
                              ^
                              RSP points here, so `popq %rbp` restores the caller's RBP and moves RSP back up to the return address, ready for `ret`
    

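    The compiler uses EAX and EDX because the calling convention treats them as scratch (caller-saved) registers: a function may overwrite them without saving and restoring them. EAX also holds a function's integer return value, so results tend to land there. Historically EAX was the "accumulator" and EDX the "data" register, and a few instructions (such as mul and div) still use the pair implicitly, which is why you'll often find them together. For an ordinary add, though, any general-purpose register would work.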


    To understand the assembly generated by a compiler, you'll have to understand stack frames. SP is the stack pointer, and BP points to the current stack frame, which is used to address local variables (hence moving the value 10 to [bp-12] and [bp-8]). The compiler then loads those values into the first available registers (eax and edx in this case), performs the add, and stores the result at [bp-4]. Finally, it restores the old frame pointer and returns.
