How do stack and registers work in assembler?

2018-06-28 14:31:40

I know there are registers such as EBP, ESP, and EAX, and that the stack is built using them. I'm confused about whether a certain register (i.e. EBP) is THE stack, with ESP and the other registers stacked on top of EBP.

Or is the stack just a way of visualizing memory allocations (a boundary) to understand memory better, while the registers are the real memory?

What confuses me is what happens when the main function calls a function:

In main, before calling a function, any parameters to the function are pushed from EAX to ESP. Then "call" is executed on the function, which pushes the return address (the address of the instruction after "call" in main) onto the stack. (I think the return address is stacked on ESP with the parameters to the function in order to be stacked on EBP when the function is called. And I think this is wrong?) Then EIP is moved to the beginning of the function.

Then, when the function is entered, EBP is pushed (again? Is this because EBP holds nothing while inside the function? But isn't EBP a register that already contains some value from main?) and the ESP value is pushed on EBP (this is why I think EBP is THE stack. Everything is stacked on EBP at this point, isn't it?). Then a "sub" is applied to ESP with some value to make room for the function's local variables. (Does ESP still hold the value it had when it was pushed on EBP at function entry, or was it emptied?)

At the end of the function, the function executes "leave" and "ret". "leave" erases the function's stack frame (EBP? Or ESP? Or just a "stack frame" which is neither EBP nor ESP? If it erases either EBP or ESP, what happens to main's EBP? I read that EBP is re-initialized from the stack pointer, but when was the stack pointer pushed on the stack?), and then "ret" moves EIP to the return address that was pushed in main when the "call" on the function was executed.

So it's all confusing to me because I'm not sure whether the "stack" is a certain register or a flexible memory boundary used for better understanding. I'm also not sure where and when the stack pointer is pushed on the stack.

The "stack" is just memory. Somewhere in your processor you have a "stack pointer". The thing about the stack is that you don't care exactly where in memory it is; everything is addressed relative to the stack pointer, as the stack pointer plus or minus some number of memory locations.

Hope or assume that your stack has enough space to do what your program needs to do (that is another topic). In that respect the stack is just a bunch of memory, which is much more than a single register's worth of data.

Think of the stack as literally a stack of something: a stack of memory locations actually, but perhaps picture a stack of index cards on which you write various things. Pictures usually help.

[     ]  <- sp

On some processors the stack pointer points at the current item on the "top" of the stack, and on others it points at the first free location. On x86 it points at the current top item, so I will run with that method. Also, on some processors the stack naturally grows "down", meaning that as you add something to the stack the stack pointer address gets smaller; x86 works this way. On some the stack grows up, but that is less common. Visually, though, it makes more sense to draw your stack of note cards stacked up on a table, rather than hanging upside down and pushed against the ceiling by some force.

So we have the above picture before we prepare to call a function. Let's say the stack pointer is pointing at the top of the stack. We don't care who or what is on the top of the stack for the moment, other than that it is someone's data that we shouldn't touch. That is another property of a stack: one side of the stack pointer is fair game, and the other side is someone's data that you should not touch unless it is your own. When you call another function, make sure the stack pointer points at the top of the stack so that pushing doesn't destroy anything.

So we want to pass two arguments to our function. We push those in reverse order so that when the function gets called they appear in their natural order. This is arbitrary and based on the compiler's calling convention; so long as the rule is the same all the time, it doesn't matter what order is used. Pushing in reverse is pretty common, though.

fun( a, b);

before we push b

[stuff] <-sp

after we push b

[  b  ] <- sp
[stuff]

where each [item] is one memory location on the stack of some fixed size. Let's assume 32 bits for now, but it could be 64 bits.

then we push a

[  a  ] <- sp
[  b  ] 
[stuff]

and we are ready to call the function, so assume that a call puts the return address on the stack

call fun

[retadd] <- sp
[  a  ] 
[  b  ] 
[stuff]

So right now within the fun function relative to the stack pointer we can address the various items in the stack:

[retadd] <- sp + 0
[  a  ]  <- sp + 4
[  b  ]  <- sp + 8
[stuff]  <- sp + 12

assuming a 32 bit wide stack in this example.
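
The picture above can be mimicked with a toy model. This is only a minimal sketch in Python, not real assembly: the dictionary standing in for memory, the starting address 0x1000, and the helper name push are all assumptions made for illustration.

```python
# Toy model of the pictures above. The names (memory, sp, push) and the
# starting address are hypothetical; a real stack lives in RAM, not a dict.
WORD = 4                  # assume 32-bit stack slots, as in the text

memory = {}               # address -> value
sp = 0x1000               # arbitrary address for the current top of stack
memory[sp] = "stuff"      # the caller's data, which we must not touch


def push(value):
    """Move sp down one slot, then store value at the new top of stack."""
    global sp
    sp -= WORD
    memory[sp] = value


push("b")                 # arguments are pushed in reverse order
push("a")
push("retadd")            # the call instruction pushes the return address

# Inside fun, everything is found relative to sp.
layout = [(offset, memory[sp + offset]) for offset in range(0, 16, WORD)]
for offset, value in layout:
    print(f"sp + {offset:<2} -> {value}")
```

Nothing in this model is "stacked on a register": sp is just a number that says where in memory the top item currently is.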

Stack frames are generally not required. They help make the code more readable, and thus easier for the compiler folks to debug, but a frame pointer does burn a register (which may or may not be general purpose, depending on your architecture). Here is how that picture works:

push fp     ; we are going to modify fp, so save the caller's value first
fp = sp     ; frame pointer (ebp) = stack pointer (esp)


[  fp ]  <- sp + 0  <- fp + 0
[retadd] <- sp + 4  <- fp + 4
[  a  ]  <- sp + 8  <- fp + 8
[  b  ]  <- sp + 12 <- fp + 12
[stuff]  <- sp + 16 <- fp + 16

So if I want to access the first parameter passed into my function I can access it at the memory address of fp+8.

Now say I want to have two local variables, which usually live on the stack, so I need to make room for them. I can either push dummy data or just modify the stack pointer; either way I end up with

[  x  ]  <- sp + 0  <- fp - 8
[  x  ]  <- sp + 4  <- fp - 4
[  fp ]  <- sp + 8  <- fp + 0
[retadd] <- sp + 12 <- fp + 4
[  a  ]  <- sp + 16 <- fp + 8
[  b  ]  <- sp + 20 <- fp + 12
[stuff]  <- sp + 24 <- fp + 16

And now the frame pointer starts to make a whole lot of sense. As I muck with the stack pointer, the location of my parameters relative to the stack pointer changes as well: the first parameter used to be at sp+8 and now it is at sp+16. The compiler or programmer would have to keep track of that at each point in the function to know where everything is, which is quite doable but sometimes not done that way.

But even though we messed with the stack pointer, the frame pointer did not move; we didn't touch it, so our first parameter is still at fp+8. No matter how much is added to or removed from the stack, so long as we don't touch the frame pointer between that initial save-and-set and the very end of the function, we can access the passed parameters and the local variables using known offsets throughout the function.
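
To make that concrete, here is a small sketch in Python (a toy model, not assembly). The dictionary used as memory, the addresses 0x1000 and 0x2000, and the variable names are assumptions picked for the example; only the offsets match the pictures above.

```python
# Hypothetical toy model: memory is a dict, sp and fp are plain integers.
WORD = 4                          # assume 32-bit stack slots

memory = {}
sp = 0x1000                       # arbitrary starting address
fp = 0x2000                       # stand-in for the caller's frame pointer


def push(value):
    """Move sp down one slot, then store value at the new top of stack."""
    global sp
    sp -= WORD
    memory[sp] = value


push("b")                         # caller pushes the arguments
push("a")
push("retadd")                    # call pushes the return address

push(fp)                          # save the caller's frame pointer
fp = sp                           # fp now marks this function's frame

a_from_sp_before = (fp + 8) - sp  # distance from sp to a before locals
sp -= 2 * WORD                    # make room for two local variables
a_from_sp_after = (fp + 8) - sp   # distance from sp to a after locals

print(a_from_sp_before, a_from_sp_after)   # the sp-relative offset moved
print(memory[fp + 8], memory[fp + 12])     # the fp-relative offsets did not
```

The saved copy of the caller's frame pointer sits at memory[fp], which is what gets popped back just before returning.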

Just before returning we re-adjust the stack pointer so that it points at the saved frame pointer

[  fp ]  <- sp + 0  <- fp + 0
[retadd] <- sp + 4  <- fp + 4
[  a  ]  <- sp + 8  <- fp + 8
[  b  ]  <- sp + 12 <- fp + 12
[stuff]  <- sp + 16 <- fp + 16

then we pop the saved frame pointer to restore the caller's frame pointer, so that it is not messed up for the rest of the caller's function

[retadd] <- sp + 0
[  a  ]  <- sp + 4
[  b  ]  <- sp + 8
[stuff]  <- sp + 12

then we return from the function, which pops the return address that the stack pointer points at and jumps to it

[  a  ]  <- sp + 0
[  b  ]  <- sp + 4
[stuff]  <- sp + 8

then the calling function cleans up the stack to what it was before it started to make the call to fun

[stuff]  <- sp + 0
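
The whole sequence of pictures, from the first push to the caller's cleanup, can be replayed as a sketch in Python. It is a toy model under assumptions: memory is a dictionary, the starting values of sp and fp are arbitrary, and push and pop are hypothetical helpers rather than real instructions.

```python
# Hypothetical toy model of one complete call; none of these names come
# from a real API.
WORD = 4                   # assume 32-bit stack slots

memory = {}
sp = 0x1000                # arbitrary starting stack pointer
fp = 0x2000                # the caller's frame pointer (arbitrary value)
sp_before_call = sp
fp_before_call = fp


def push(value):
    """Move sp down one slot, then store value at the new top of stack."""
    global sp
    sp -= WORD
    memory[sp] = value


def pop():
    """Read the value at the top of stack, then move sp up one slot."""
    global sp
    value = memory[sp]
    sp += WORD
    return value


# Caller: push the arguments in reverse order, then "call".
push("b")
push("a")
push("retadd")

# Callee prologue: save the caller's fp, set up our own, reserve locals.
push(fp)
fp = sp
sp -= 2 * WORD

# Callee epilogue: drop the locals, restore the caller's fp, return.
sp = fp
fp = pop()
return_address = pop()

# Caller: remove the two arguments it pushed.
sp += 2 * WORD

print(sp == sp_before_call, fp == fp_before_call, return_address)
```

Both registers end up exactly where they started, which is the point of the save and restore steps.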

There are many web pages and books that talk about stack basics, too many to mention.

You are correct in your understanding that the stack is just a place in memory. A stack is quite big compared to a register.

You can look at the stack like a stack of pancakes. The property of a stack is that you can only add or remove elements from the top.

There are two registers that help organize this memory structure. The first one is (E)SP, which is short for Stack Pointer. The other one is (E)BP, which is the Base Pointer.

To understand why we need the two registers, we need to look at the operations that the stack permits. There is the PUSH and the POP.

PUSH does 2 things:

SUB ESP,4
MOV [ESP],REGISTER

This decreases the stack pointer, and saves the register into the new place.

POP does the opposite:

MOV REGISTER,[ESP]
ADD ESP,4

This moves the contents of the top of the stack to the register, and moves the pointer accordingly.
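
The two-step nature of PUSH and POP can be sketched at the byte level in Python. This is a toy model under assumptions: the 64-byte memory block, the lowercase name esp, and the helper functions are made up for the example, and values are stored little-endian as x86 does.

```python
# Hypothetical byte-level model; the memory size and register name are
# assumptions made for this sketch.
memory = bytearray(64)     # a tiny block of RAM reserved for the stack
esp = len(memory)          # empty stack: esp sits just past the last byte


def push(value):
    """SUB ESP,4 followed by MOV [ESP],value."""
    global esp
    esp -= 4
    memory[esp:esp + 4] = value.to_bytes(4, "little")


def pop():
    """MOV value,[ESP] followed by ADD ESP,4."""
    global esp
    value = int.from_bytes(memory[esp:esp + 4], "little")
    esp += 4
    return value


push(0x11111111)
push(0x22222222)
first = pop()              # the most recently pushed value comes off first
second = pop()
print(hex(first), hex(second), esp)
```

Note that pop only moves esp; the old bytes stay in memory until a later push overwrites them, so nothing is actually "erased".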

Let's look now at the way a function uses its parameters.

At the start of a function, we are able to access the parameters at [ESP+4], [ESP+8], and so on ([ESP] itself holds the return address). But what happens when we want to have some local variables? Changing ESP will make the above offsets invalid.

This is where the Base Pointer comes in. At the start of every function we have the so called prolog:

PUSH EBP
MOV EBP,ESP

This saves the previous Base Pointer on the stack and copies the Stack Pointer into EBP, so that we can reach the parameters at fixed offsets from EBP without worrying about the changing Stack Pointer.

At the end of the function you will see an epilog, which restores ESP from EBP and POPs back the old value of EBP (the LEAVE instruction does both).

Using EBP as a base or frame pointer is optional. Some compilers (like Microsoft's) have an option to disable frame pointers, in which case EBP is freed up to be used as a general-purpose register, and all stack-relative references are made as offsets from ESP.
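
Without a frame pointer, something has to keep track of how far ESP has moved since the function was entered. A rough sketch of that bookkeeping in Python follows; the function name and its parameters are hypothetical and not taken from any compiler, and 4-byte stack slots are assumed.

```python
# Hypothetical helper, not taken from any compiler: it only does the
# bookkeeping needed when EBP is not used as a frame pointer.
WORD = 4                       # assume 32-bit stack slots


def param_offset_from_esp(index, bytes_below_return_address):
    """Offset of parameter `index` (0-based) from the current ESP.

    bytes_below_return_address is how far ESP has moved down since
    function entry (locals reserved plus anything pushed).
    """
    return bytes_below_return_address + WORD + index * WORD


at_entry = param_offset_from_esp(0, 0)          # [ESP+4] on entry
after_locals = param_offset_from_esp(0, 8)      # two 4-byte locals reserved
after_push = param_offset_from_esp(0, 12)       # plus one pushed register
print(at_entry, after_locals, after_push)
```

The same parameter is reached through a different ESP offset at each point in the function, which is the tracking that a frame pointer lets you avoid.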

In 16 bit real mode, SP can't be used as a base register or index for a memory operand, so BP has to be used for stack relative references.
