What does the compiler do in this assembly code?
I'm trying to understand what a C compiler will do when it compiles to assembly. The code I've compiled to assembly is this:
void main() {
    int x = 10;
    int y = 10;
    int a = x + y;
}
Which produces the following assembly:
.Ltext0:
.globl main
main:
.LFB0:
0000 55 pushq %rbp
0001 4889E5 movq %rsp, %rbp
0004 C745F40A movl $10, -12(%rbp)
000b C745F80A movl $10, -8(%rbp)
0012 8B45F8 movl -8(%rbp), %eax
0015 8B55F4 movl -12(%rbp), %edx
0018 01D0 addl %edx, %eax
001a 8945FC movl %eax, -4(%rbp)
001d 5D popq %rbp
001e C3 ret
However I'm having some trouble understanding what in particular is going on in this snippet. I understand all the labels, and some of the assembly. Here's what I think it does:
Can anyone clarify certain points of this assembly, such as the reasoning behind the compiler's choice of -8 and -12, why it chooses eax and edx over other registers, why it pushes and pops rbp, etc.?
push rbp? - is this for a stack frame or something?
Yes. The compiler creates a stack frame for local variables. pushq %rbp / movq %rsp, %rbp is the standard method for doing this. It allows easy access to local variables.
moves 10 into stack? Offset by -12? Why 12, and why is it negative?
In this case, the compiler chose to use the 4-byte (int size) part of the stack from -12(%rbp) to -9(%rbp) for the variable x.
Once a stack frame is created, you can access local variables with negative offsets, and function arguments with positive offsets:
------------------------------------------------------
| R |
New stack (locals) | B | Old stack (parameters)
| P |
------------------------------------------------------
^
RBP is updated to point here as well so you get negative offsets (to the left) for locals and positive offsets (to the right) for parameters.
Note that the saved RBP and the function's return address also take up space on the stack, so you need to add 16 bytes to any parameter offset (8 bytes on 32-bit systems).
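To see where the 16 and the 8 come from, here is a minimal sketch that describes what sits at and above the frame pointer as a C struct and asks the compiler for the offset of the first stack parameter. The struct and function names are made up for illustration, and the layout is only a model of the frame, not something the compiler exposes. (On x86-64 System V the first few integer arguments travel in registers, so only later ones show up at these positive offsets.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of what sits at and above the frame pointer on a 64-bit target,
   in address order. Field names are illustrative only. */
struct frame_view64 {
    uint64_t saved_rbp;          /*  0(%rbp): stored by pushq %rbp        */
    uint64_t return_address;     /*  8(%rbp): stored by the call          */
    uint64_t first_stack_param;  /* 16(%rbp): first parameter on stack    */
};

/* The same model for a 32-bit target, where each slot is 4 bytes. */
struct frame_view32 {
    uint32_t saved_ebp;          /* 0(%ebp) */
    uint32_t return_address;     /* 4(%ebp) */
    uint32_t first_stack_param;  /* 8(%ebp) */
};

/* The model only works if the structs have no padding. */
static_assert(sizeof(struct frame_view64) == 24, "unexpected padding");
static_assert(sizeof(struct frame_view32) == 12, "unexpected padding");

/* Offset of the first stack parameter relative to the frame pointer. */
size_t first_param_offset64(void) {
    return offsetof(struct frame_view64, first_stack_param);
}

size_t first_param_offset32(void) {
    return offsetof(struct frame_view32, first_stack_param);
}
```

Two pointer-sized slots (saved frame pointer plus return address) come before the first parameter, which is why the offset is twice the pointer size in both cases.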
Often, you have to update RSP before doing any work with local variables, like this: subq $12, %rsp. When leaving the function, use addq $12, %rsp or leave (which copies RBP back into RSP and then pops RBP). This example moves the stack pointer to reserve 12 bytes on the stack. When you're done with them, you simply restore the stack pointer. In your example, though, none of this is needed, because the function calls nothing else, so nothing can overwrite the locals sitting below RSP. On x86-64 System V, such a leaf function may use the 128-byte "red zone" below RSP without adjusting RSP at all.
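The 12 above is exactly the space the three ints need. Compilers targeting x86-64 usually reserve a rounded-up amount instead, because the System V ABI expects RSP to be 16-byte aligned whenever a function is called. A minimal sketch of that rounding, assuming a 16-byte alignment; the helper name aligned_frame_size is hypothetical, and a real compiler may reserve more for its own reasons:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stack alignment (x86-64 System V uses 16 bytes at call sites). */
enum { STACK_ALIGNMENT = 16 };

/* Round the bytes needed for locals up to the next multiple of 16.
   Hypothetical helper for illustration, not a compiler API. */
size_t aligned_frame_size(size_t locals_bytes) {
    /* Guard against wrap-around in the addition below. */
    assert(locals_bytes <= (size_t)-1 - (STACK_ALIGNMENT - 1));
    return (locals_bytes + STACK_ALIGNMENT - 1)
           & ~(size_t)(STACK_ALIGNMENT - 1);
}
```

With this rule, the 12 bytes of locals in the question would turn into subq $16, %rsp in a function that also calls other functions.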
moves 10 into stack, though this time at -8 instead of -12
Once again, this references a local variable, except this time the compiler chose the 4-byte section from -8(%rbp) to -5(%rbp) for variable y.
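To make the offsets concrete, here is a toy model of the function body in C. It assumes a 16-byte array standing in for stack memory and a pointer named rbp standing in for the register; store32, load32 and replay_main_body are names invented for this sketch. Each line of replay_main_body mirrors one instruction from the listing in the question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model: a byte array plays the role of stack memory. */
enum { FRAME_BYTES = 16 };

/* Store a 32-bit value at rbp + offset, like `movl %reg, offset(%rbp)`. */
static void store32(uint8_t *rbp, int offset, int32_t value) {
    assert(offset >= -FRAME_BYTES && offset <= -4);
    memcpy(rbp + offset, &value, sizeof value);
}

/* Load a 32-bit value from rbp + offset, like `movl offset(%rbp), %reg`. */
static int32_t load32(const uint8_t *rbp, int offset) {
    int32_t value;
    assert(offset >= -FRAME_BYTES && offset <= -4);
    memcpy(&value, rbp + offset, sizeof value);
    return value;
}

/* Replays the body of main() from the listing and returns the final a. */
int32_t replay_main_body(void) {
    uint8_t stack[FRAME_BYTES];
    uint8_t *rbp = stack + FRAME_BYTES;   /* locals live below rbp     */

    store32(rbp, -12, 10);                /* movl $10, -12(%rbp)  ; x  */
    store32(rbp, -8, 10);                 /* movl $10, -8(%rbp)   ; y  */
    int32_t eax = load32(rbp, -8);        /* movl -8(%rbp), %eax       */
    int32_t edx = load32(rbp, -12);       /* movl -12(%rbp), %edx      */
    eax += edx;                           /* addl %edx, %eax           */
    store32(rbp, -4, eax);                /* movl %eax, -4(%rbp)  ; a  */
    return load32(rbp, -4);
}
```

The three ints occupy the non-overlapping ranges -12..-9, -8..-5 and -4..-1 relative to rbp, which is why the offsets step by 4.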
In this case, the popq %rbp restores the stack at the end of the function to what it was before entry:
------------------------------------------------------
| R |
New stack (locals) | B | Old stack (parameters)
| P |
------------------------------------------------------
^
RSP points here, so a `pop %rbp` will restore both RSP and RBP
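The push/pop pairing can be traced with a small simulation. This is a toy model, not real hardware: memory is an array of 8-byte slots, rsp and rbp hold slot indices rather than addresses, and all names (toy_cpu, toy_prologue, toy_epilogue) are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: the stack is an array of 8-byte slots that grows toward
   lower indices, and rsp/rbp are slot indices. */
enum { STACK_SLOTS = 8 };

struct toy_cpu {
    uint64_t rsp;
    uint64_t rbp;
    uint64_t mem[STACK_SLOTS];
};

/* pushq: move rsp down one slot, then store the value there. */
static void toy_push(struct toy_cpu *cpu, uint64_t value) {
    assert(cpu->rsp > 0 && cpu->rsp <= STACK_SLOTS);
    cpu->rsp -= 1;
    cpu->mem[cpu->rsp] = value;
}

/* popq: read the value at rsp, then move rsp back up one slot. */
static uint64_t toy_pop(struct toy_cpu *cpu) {
    assert(cpu->rsp < STACK_SLOTS);
    uint64_t value = cpu->mem[cpu->rsp];
    cpu->rsp += 1;
    return value;
}

void toy_prologue(struct toy_cpu *cpu) {
    toy_push(cpu, cpu->rbp);   /* pushq %rbp      */
    cpu->rbp = cpu->rsp;       /* movq %rsp, %rbp */
}

void toy_epilogue(struct toy_cpu *cpu) {
    cpu->rbp = toy_pop(cpu);   /* popq %rbp */
}
```

Because the function body never moves rsp, it still points at the saved rbp when the epilogue runs, so the single pop puts both registers back to their values at entry.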
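The compiler probably tries to use EAX and EDX first because EAX is the traditional accumulator register for arithmetic and EDX is the traditional data register. Both are also scratch (caller-saved) registers in the x86-64 System V calling convention, so the function can use them without saving and restoring them. You'll often find them paired in operations.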
To understand the assembly generated by a compiler, you'll have to understand stack frames. SP is the stack pointer; BP points to the current stack frame, which is used to address local variables (hence moving the value 10 to [bp-12] and [bp-8]). The code then loads those values into the first available registers for the addition (ax and dx in this case), performs the add, and stores the result at [bp-4]. Finally, it restores the old stack frame and returns.