What does subl do here?
So... I'm compiling into assembler, with gcc -S -O2 -m32:
void h(int y){int x; x=y+1; f(y); f(2); }
And it gives me the following:
.file "sample.c"
.text
.p2align 4,,15
.globl h
.type h, @function
h:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl 8(%ebp), %eax
movl %eax, (%esp)
call f
movl $2, 8(%ebp)
leave
jmp f
.size h, .-h
.ident "GCC: (GNU) 4.4.3 20100127 (Red Hat 4.4.3-4)"
.section .note.GNU-stack,"",@progbits
Now I know what pushl and movl do: they store the current frame pointer onto the stack and then set the frame pointer register to the value of the stack pointer.
Thanks!
The compiler is reserving space on the stack for locals and whatever other needs it might have. I'm not sure offhand why it's reserving 24 bytes (it doesn't seem to need or use it all).
When calling function f(), instead of using a push instruction to put the parameter on the stack, it uses a simple movl to the last location it reserved:
movl 8(%ebp), %eax ; get the value of `y` passed in to `h()`
movl %eax, (%esp) ; put that value on the stack for call to `f()`
A more interesting (in my opinion) thing happening here is how the compiler is handling the call to f(2):
movl $2, 8(%ebp) ; store 2 in the `y` argument passed to `h()`
; since `h()` won't be using `y` anymore
leave ; get rid of the stackframe for `h()`
jmp f ; jump to `f()` instead of calling it - it'll return
; directly to whatever called `h()`
To answer your question, "immed by the way?" - that is what the instruction reference uses to indicate that the value is encoded in the instruction itself instead of coming from somewhere else, like a register or memory location.
To answer those numbered questions:
1) subl $24,%esp means esp = esp - 24.
GNU AS uses AT&T syntax, which is the opposite of Intel syntax. AT&T has the destination on the right, Intel has the destination on the left. Also AT&T is explicit about the size of the arguments. Intel tries to deduce it or forces you to be explicit.
The stack grows down in memory. The memory at and after esp is the stack contents; addresses lower than esp are unused stack space. esp points to the last thing pushed onto the stack.
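To make "esp = esp - 24" concrete, here is a small C sketch of a toy stack. It is only a model: the byte array, the esp variable and the helper names (pushl, subl_esp, load_esp) are made up for illustration and are not how the CPU is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: a byte array stands in for stack memory, and `esp`
   is an index into it rather than a real CPU register. */
enum { STACK_BYTES = 64 };
static uint8_t  stack_mem[STACK_BYTES];
static uint32_t esp = STACK_BYTES;   /* empty stack: esp is just past the end */

/* pushl v: move esp down 4 bytes, then store v at the new top. */
static void pushl(uint32_t v) {
    assert(esp >= 4);
    esp -= 4;
    memcpy(&stack_mem[esp], &v, sizeof v);
}

/* subl $n, %esp: reserve n bytes without storing anything. */
static void subl_esp(uint32_t n) {
    assert(esp >= n);
    esp -= n;
}

/* Read the 32-bit value at off(%esp). */
static uint32_t load_esp(uint32_t off) {
    uint32_t v;
    assert(esp + off + 4 <= STACK_BYTES);
    memcpy(&v, &stack_mem[esp + off], sizeof v);
    return v;
}
```

After pushl(7) and subl_esp(24), the pushed value is still there, just 24 bytes further from the top: load_esp(24) reads it back.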
2) x86 instruction encoding mostly allows the following:
movl rm,r ' move value from register or memory to a register
movl r,rm ' move a value from a register to a register or memory
movl imm,rm ' move an immediate value to a register or memory
There is no memory-to-memory instruction format. (Strictly speaking, you can do memory-to-memory operations with movs, or with push mem followed by pop mem, but neither encodes two memory operands in the same instruction.)
"Immediate" means the value is encoded right into the instruction. For example, to store 15 at the address in ebx:
movl $15,(%ebx)
15 is an "immediate" value.
The parentheses make it use the register as a pointer to memory.
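3) movl 8(%ebp),%eax means: load into eax the 4-byte value at address ebp + 8.
ebp is the frame pointer here. In 32-bit mode, each push and pop on the stack is 4 bytes wide, and most variables take up 4 bytes anyway. After the pushl %ebp / movl %esp,%ebp pair, (%ebp) holds the caller's saved ebp, 4(%ebp) holds the return address, and 8(%ebp) holds the first argument. So you could say 8(%ebp) means: starting at the frame pointer, skip two 4-byte slots (4 x 2 = 8) and give me the value there, which is y.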
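A rough C model of what ends up where may help. This is a sketch under assumptions: memory is a small byte array, addresses are indexes into it, and enter_h, store32 and load32 are hypothetical helpers. The return address and caller's ebp passed in are arbitrary placeholder numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of 32-bit stack memory; addresses are indexes into `mem`. */
enum { MEM_BYTES = 64 };
static uint8_t  mem[MEM_BYTES];
static uint32_t esp = MEM_BYTES, ebp = 0;

static void store32(uint32_t addr, uint32_t v) {
    assert(addr + 4 <= MEM_BYTES);
    memcpy(&mem[addr], &v, sizeof v);
}

static uint32_t load32(uint32_t addr) {
    uint32_t v;
    assert(addr + 4 <= MEM_BYTES);
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

static void pushl(uint32_t v) {
    assert(esp >= 4);
    esp -= 4;
    store32(esp, v);
}

/* Build the frame h() sees once its three-instruction prologue has run. */
static void enter_h(uint32_t y, uint32_t return_addr, uint32_t caller_ebp) {
    pushl(y);            /* caller pushes the argument        */
    pushl(return_addr);  /* call h pushes the return address  */
    pushl(caller_ebp);   /* pushl %ebp                        */
    ebp = esp;           /* movl %esp, %ebp                   */
    esp -= 24;           /* subl $24, %esp                    */
}
```

After enter_h(41, 0x1234, 0xBEEF), load32(ebp) is the saved ebp, load32(ebp + 4) is the return address, and load32(ebp + 8) is 41, the model's version of movl 8(%ebp),%eax.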
Typically, 32-bit code uses ebp to point to the beginning of the local variables in a function. In 16-bit x86 code, there was no way to use the stack pointer as a pointer (hard to believe, right?). So what people did was copy sp to bp and use bp as the local frame pointer. This became completely unnecessary when 32-bit mode came out (80386), which did have a way to use the stack pointer directly. Unfortunately, ebp makes debugging easier, so we ended up continuing to use ebp in 32-bit code (it's trivially easy to make a stack dump if ebp is being used).
Thankfully, amd64 gave us a new ABI which does not require a frame pointer. 64-bit code typically uses rsp to access local variables, and rbp is available to hold a variable.
4) Explained above
5) leave is an old instruction that simply does movl %ebp,%esp and popl %ebp, and saves a few code bytes. What it actually does is undo the changes to the stack and restore the caller's ebp. The called function must preserve ebp in the x86 ABI.
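As a sanity check on that description, here is a toy C model of the prologue and of leave. Everything in it is illustrative: the array, the starting esp (pretending the argument and return address are already pushed) and the 0xCAFE placeholder for the caller's ebp are assumptions, not values from the question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_BYTES = 64 };
static uint8_t  mem[MEM_BYTES];
static uint32_t esp = MEM_BYTES - 8;  /* pretend arg + return address are already pushed */
static uint32_t ebp = 0xCAFE;         /* made-up stand-in for the caller's ebp */

static void pushl(uint32_t v) {
    assert(esp >= 4);
    esp -= 4;
    memcpy(&mem[esp], &v, sizeof v);
}

static uint32_t popl(void) {
    uint32_t v;
    assert(esp + 4 <= MEM_BYTES);
    memcpy(&v, &mem[esp], sizeof v);
    esp += 4;
    return v;
}

static void prologue(void) {
    pushl(ebp);    /* pushl %ebp      */
    ebp = esp;     /* movl %esp, %ebp */
    esp -= 24;     /* subl $24, %esp  */
}

static void leave(void) {
    esp = ebp;     /* movl %ebp, %esp: drops the 24 reserved bytes   */
    ebp = popl();  /* popl %ebp: restores the caller's frame pointer */
}
```

Calling prologue() and then leave() puts both esp and ebp back exactly where they started, which is why the function can safely ret (or, as here, jmp) afterwards.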
On entry to the function, the compiler did subl $24,%esp to make room for local variables and sometimes temp storage that it didn't have enough registers to hold.
The best way to "imagine" the stack frame in your mind is to see it as a structure sitting on the stack. The first members of the imaginary structure are the most recently "pushed" values. So when you push to a stack, imagine inserting a new member at the beginning of the structure, while none of the other members moved. When you "pop" from the stack, you get the value of the first member of the imaginary struct, and that (first) line of the structure disappears from existence.
Stack frame manipulation is mostly just moving the stack pointer to make more or less room in that imaginary struct we call the stack frame. Subtracting from the stack pointer just puts multiple imaginary members at the start of the struct in one step. Adding to the stack pointer makes the first so many members disappear.
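If it helps, the imaginary struct for h() can be written down literally in C. This is a hypothetical picture, not something the compiler emits: the struct name, member names and the EBP_OFFSET macro are invented for illustration, and the layout assumes the 32-bit frame from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of h()'s frame, lowest address first.
   The first member is where esp points after `subl $24, %esp`. */
struct h_frame {
    uint8_t  reserved[24];  /* space made by subl $24, %esp          */
    uint32_t saved_ebp;     /* pushed by pushl %ebp; ebp points here */
    uint32_t return_addr;   /* pushed by the call into h             */
    uint32_t y;             /* argument pushed by h's caller         */
};

/* Offset of a member relative to ebp instead of esp. */
#define EBP_OFFSET(member) \
    ((long)offsetof(struct h_frame, member) - \
     (long)offsetof(struct h_frame, saved_ebp))

/* The struct has no padding, so its size is 24 + 3 * 4 bytes. */
static void check_frame_layout(void) {
    assert(sizeof(struct h_frame) == 36);
    assert(EBP_OFFSET(y) == 8);   /* matches 8(%ebp) in the listing */
}
```

Seen this way, subl $24,%esp is what adds the reserved member at the front, and 8(%ebp) is just the y member addressed from saved_ebp.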
The end of the code you posted is not typical. Where you see that jmp, a function typically ends with a call followed by a ret. The compiler was clever about it and did a "tail call optimization", meaning it just cleans up what it did to the stack and jumps to f. When f(2) returns, it will actually return straight to the caller (not back to the code you posted).
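At the C level, what makes this possible is that f(2) is the very last thing h() does. A minimal sketch, assuming f takes an int and returns nothing (the question never shows f, so this recording version of f, the calls array and ncalls are stand-ins):

```c
#include <assert.h>

static int calls[4];
static int ncalls = 0;

/* Stand-in for the external f() from the question; it records its argument. */
void f(int y) {
    assert(ncalls < 4);
    calls[ncalls++] = y;
}

void h(int y) {
    int x;
    x = y + 1;   /* has no observable effect, so -O2 drops it             */
    (void)x;     /* only here to silence unused-variable warnings          */
    f(y);        /* ordinary call: h still has work to do afterwards       */
    f(2);        /* tail position: nothing left in h, so gcc may use jmp f */
}
```

Whether the second call is compiled as call plus ret or as a plain jmp, the observable behavior is the same: h(5) calls f(5) and then f(2).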