Stack-based virtual machine function call/return implementation issues

2018-06-28 18:43:09

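Today I decided to write a small stack-based virtual machine in C++11 for fun. Everything went smoothly until I got to function calls and returns.

I have been trying to follow calling guidelines similar to those of x86 assembly, but I am getting really confused.

I am having trouble dealing with stack base pointer offsets and return values.

Keeping track of the registers used for return values and of the arguments on the stack (for function calls) seems very hard.

I have created a simple assembly-like language and a compiler for it. Here is a commented example (my virtual machine compiles and executes it). I have tried to explain what happens and to share my thoughts in the comments.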

//!ssvasm

$require_registers(3);

// C++ style preprocessor define directives to refer to registers
$define(R0, 0);
$define(R1, 1);
$define(R2, 2); 

// Load the 2.f float constant value into register R0
loadFloatCVToR(R0, 2.f);

// I want to pass 2.f as an argument to my next function call:
// I have to push it on top of the stack (as with x86 assembly)
pushRVToS(R0);

// I call the FN_QUAD function here: calling a function pushes both
// the current `stack base offset` and the `return instruction index`
// on the stack
callPI(FN_QUAD); 

// And get rid of the now-useless argument that still lies on top of the stack
// by dumping it into the unused R2 register 
popSVToR(R2);

halt(); // Halt virtual machine execution



$label(FN_DUP); // Function FN_DUP - returns its argument, duplicated

// I need the arg, but since it's beneath `old offset` and `return instruction`
// it has to be copied into a register - I choose R0 - ...

// To avoid losing other data in R0, I "save" it by pushing it on the stack
// (Is this the correct way of saving a register's contents?)
pushRVToS(R0);

// To put the arg in R0, I need to copy the value under the top two stack values
// (Read as: "move stack value offset by 2 from base to R0")
// (Is this how I should deal with arguments? Or is there a better way?)
moveSBOVToR(R0, 2);

// Function logic: I duplicate the value by pushing it twice and adding
pushRVToS(R0); pushRVToS(R0); addFloat2SVs();

// The result is on top of the stack - I store it in R1, to get it from the caller
// (Is this how I should deal with return values? Or is there a better way?)
popSVToR(R1);

popSVToR(R0); // Restore R0 with its old value (it's now at the top of the stack)

// Return to the caller: this pops twice - it uses `old stack base offset` and
// unconditionally jumps to `return instruction index`
returnPI();



$label(FN_QUAD); // Function FN_QUAD

pushRVToS(R0);
moveSBOVToR(R0, 2);

// Call duplicate twice (using the first call's return value as the second
// call's argument)
pushRVToS(R0); callPI(FN_DUP); popSVToR(R2);
pushRVToS(R1); callPI(FN_DUP); popSVToR(R2);

popSVToR(R0);
returnPI();

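I have never programmed in assembly before, so I am not sure whether the techniques I am using are correct (or efficient).

Is the way I am handling arguments, return values and registers correct?

Should the caller of a function push the arguments, then call, then pop the arguments? Using registers seems easier, but I have read that x86 uses the stack to pass arguments. I am fairly sure the approach I am using here is wrong.

Should I push both the `old stack offset` and the `return instruction index` on a function call? Or should I store the `old stack offset` in a register? (Or avoid storing it altogether?)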

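What you are talking about is a calling convention. In other words, the definition of who builds the stack (the caller or the callee), how it is built, and what the stack should look like.

There are many ways to do it, and none is better than the others; you just have to stay consistent.

Describing the different calling conventions here would take too long, so you should look at the Wikipedia article, which is quite complete.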

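Still, briefly: the x86 C calling convention (cdecl) specifies that the caller saves the registers it cares about and builds the stack, which leaves the callee free to use those registers, to return a value, or simply to do its work. (Strictly speaking, in cdecl only EAX, ECX and EDX are caller-saved; the callee must preserve the other registers, and the return value comes back in EAX.)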
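As a rough illustration of the caller-saves idea, here is a minimal C++ sketch. `Machine`, `fn_dup` and `call_dup_preserving_r0` are hypothetical names invented for this example, and plain C++ functions stand in for VM instructions; it is not the asker's VM, only a model of who saves what.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical machine: three float registers and one data stack.
struct Machine {
    std::array<float, 3> regs{};
    std::vector<float> stack;
};

// Callee: free to clobber R0 and R1. It reads its argument from the top of
// the stack and leaves the result in R1 (the return register in this sketch).
void fn_dup(Machine& m) {
    assert(!m.stack.empty());
    m.regs[0] = m.stack.back();
    m.regs[1] = m.regs[0] + m.regs[0];
}

// Caller: saves the register it still needs, pushes the argument, calls,
// removes the argument, then restores the saved register.
float call_dup_preserving_r0(Machine& m, float arg) {
    m.stack.push_back(m.regs[0]);   // caller-save R0
    m.stack.push_back(arg);         // push argument
    fn_dup(m);                      // call
    m.stack.pop_back();             // caller removes the argument
    m.regs[0] = m.stack.back();     // restore R0
    m.stack.pop_back();
    return m.regs[1];               // result arrives in the return register
}
```

With this split, the callee never has to push and pop registers "just in case"; only the caller, which knows which registers are live, pays for saving them.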

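As for the specific questions at the end of your post, the best approach is to use the stack the same way C does: push the arguments, store EIP and EBP on it last, and leave the registers free to use. Stack space is not as limited as the number of registers.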
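To make that layout concrete, here is a minimal sketch of a stack VM that pushes the return address and the old frame pointer on every call and addresses arguments relative to the frame pointer. The opcodes, `MiniVM` and `run_quad` are assumptions made up for this example (they are not the asker's instruction set), and integers replace floats to keep it short.

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcodes, chosen for this sketch only.
enum Op { PUSH, LOADARG, PUSHRET, ADD, CALL, RET, POPN, HALT };

struct MiniVM {
    std::vector<int> code;
    std::vector<int> stack;
    int ip = 0;   // index of the next instruction
    int fp = 0;   // stack index just above the saved (return ip, old fp) pair
    int ret = 0;  // return-value register, in the role of EAX

    void push(int v) { stack.push_back(v); }
    int pop() {
        assert(!stack.empty());
        int v = stack.back();
        stack.pop_back();
        return v;
    }

    int run() {
        for (;;) {
            switch (code[ip++]) {
            case PUSH: push(code[ip++]); break;
            case LOADARG: {
                // Frame layout: [args...][return ip][old fp] | fp
                int k = code[ip++];          // 0 = last argument pushed
                push(stack[fp - 3 - k]);
                break;
            }
            case PUSHRET: push(ret); break;
            case ADD: { int b = pop(); int a = pop(); push(a + b); break; }
            case CALL: {
                int target = code[ip++];
                push(ip);                    // return address (like EIP)
                push(fp);                    // caller's frame pointer (like EBP)
                fp = static_cast<int>(stack.size());
                ip = target;
                break;
            }
            case RET: {
                ret = pop();                 // result goes to the return register
                stack.resize(fp);            // drop the callee's temporaries
                fp = pop();                  // restore the caller's frame pointer
                ip = pop();                  // jump back to the caller
                break;
            }
            case POPN: {                     // caller removes n arguments
                int n = code[ip++];
                stack.resize(stack.size() - n);
                break;
            }
            case HALT: return ret;
            default: assert(false && "unknown opcode"); return ret;
            }
        }
    }
};

// quad(x) = dup(dup(x)) with dup(x) = x + x; addresses are worked out by hand.
int run_quad(int x) {
    MiniVM vm;
    vm.code = {
        PUSH, x, CALL, 13, POPN, 1, HALT,        // 0..6   main
        LOADARG, 0, LOADARG, 0, ADD, RET,        // 7..12  dup
        LOADARG, 0, CALL, 7, POPN, 1, PUSHRET,   // 13..19 quad
        CALL, 7, POPN, 1, PUSHRET, RET           // 20..25
    };
    int result = vm.run();
    assert(vm.stack.empty());                    // caller cleaned up its argument
    return result;
}
```

Because arguments are read at a fixed offset from `fp`, the callee does not need to copy them into a register first, and no register has to be saved around the call.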

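I solved this on my stack machine in the following way:

A void function call instruction (taking no arguments) works like this:

There is _stack[] (the main stack) and _cstack[] (the call stack, which holds information about calls, such as the size of the return value).

When a function is called (a VCALL, i.e. void function call, is encountered), the following is done: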

        u64& _next = _peeknext; // refers to the next bytecode (which is the function address)
        // _next is unsigned, so only the upper bound needs checking
        // (the original `_next > -1` compares against the maximum u64 and is always false)
        AssertAbort(_next < _PROGRAM_SIZE, "Can't call function. Invalid address");
        cstack_push(ip + 2); // address to return to (current address + 2, to account for function parameters next to the function call)
        cstack_push(fp); // current frame pointer
        cstack_push(_STACK_SIZE); // current stack size
        cstack_push(0); // size of the return value (would be 4 for int, 8 for long, etc.); 0 here because the call is void
        ip = (_next) - 1; // address to jump to (-1 to counter the increment of the program counter (ip) on each iteration)

Then, when a RET (return) instruction is encountered, the following is done:

        AssertAbort(cstackhas(3), "Can't return. No address to return to.");
        u64 return_size = cstack_pop(); // pop the size of the return value from the call stack
        _STACK_SIZE = cstack_pop(); // set the stack size to what it was before the function call, not accounting for the return value size
        fp = cstack_pop(); // reset the frame pointer to where it was before the function call
        ip = cstack_pop() - 1; // set the program counter to the address stored on the call stack by the last function call

        _stack.resize(_STACK_SIZE + return_size); // resize the main stack: keep room for the return value (its size in bytes) and discard the rest
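The two-stack idea above can be sketched as a small self-contained program. `TwoStackVM`, `CallRecord` and `demo_call_and_return` are hypothetical names, the return size is counted in stack slots rather than bytes, and, unlike the snippet above, this sketch explicitly copies the return value down to where the caller's stack ended before truncating. Treat it as one possible reading of the approach, not the answerer's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The four values the answer pushes on the call stack, grouped in a struct.
struct CallRecord {
    std::size_t return_ip;
    std::size_t saved_fp;
    std::size_t saved_stack_size;
    std::size_t return_size;   // in stack slots (an assumption of this sketch)
};

struct TwoStackVM {
    std::vector<int> stack;          // main (data) stack
    std::vector<CallRecord> cstack;  // call stack
    std::size_t ip = 0;
    std::size_t fp = 0;

    void call(std::size_t target, std::size_t return_ip, std::size_t return_size) {
        cstack.push_back({return_ip, fp, stack.size(), return_size});
        fp = stack.size();           // the new frame starts at the current top
        ip = target;
    }

    void ret() {
        assert(!cstack.empty());
        CallRecord r = cstack.back();
        cstack.pop_back();
        assert(stack.size() >= r.saved_stack_size + r.return_size);
        // Move the return value down so it sits right above the caller's data.
        std::copy(stack.end() - static_cast<std::ptrdiff_t>(r.return_size),
                  stack.end(),
                  stack.begin() + static_cast<std::ptrdiff_t>(r.saved_stack_size));
        stack.resize(r.saved_stack_size + r.return_size);
        fp = r.saved_fp;
        ip = r.return_ip;
    }
};

// Simulates one call that leaves two temporaries and one return value.
TwoStackVM demo_call_and_return() {
    TwoStackVM vm;
    vm.stack = {10, 20};             // caller's data
    vm.call(100, 7, 1);              // jump to 100, return to 7, one-slot result
    vm.stack.push_back(1);           // callee temporaries
    vm.stack.push_back(2);
    vm.stack.push_back(42);          // return value on top
    vm.ret();
    return vm;
}
```

Keeping the bookkeeping on a separate call stack means the callee's data stack never contains return addresses, so argument offsets do not have to skip over them.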

This may well be useless to you by now, since the question is quite old, but feel free to ask anything if you like :)

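The best solution depends on the machine.

If push and pop on the stack are as fast as using registers (an on-chip stack, or a stack held in on-chip L1 cache) and at the same time you have a very limited number of registers, then using the stack makes sense.

If you have plenty of registers, you can use some of them to hold counters (pointers) or variables.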

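In general, for modules to communicate with each other, or for other languages to be translated (or compiled) into your assembly, you should specify an application binary interface (ABI).

You should compare the ABIs of different hardware (or virtual machines) to find the techniques that suit your machine. Once you have defined your ABI, programs should conform to it in order to be binary compatible.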
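One way to pin an ABI down for a small VM is to write it as a table of constants shared by the compiler and the VM. The sketch below is only an illustration: `Abi`, `kAbi`, `locate_arg` and `caller_must_save` are hypothetical names, and the numbers (8 registers, 2 argument registers, and so on) are arbitrary choices, not values from any real ABI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ABI description for a register-equipped VM.
struct Abi {
    std::size_t register_count;      // total number of registers
    std::size_t arg_registers;       // first N arguments go in R0..R(N-1)
    std::size_t return_register;     // register that holds the return value
    std::size_t first_callee_saved;  // registers below this are caller-saved
};

constexpr Abi kAbi{8, 2, 0, 4};

enum class ArgLocation { Register, Stack };

struct ArgSlot {
    ArgLocation where;
    std::size_t index;  // register number, or position among stack arguments
};

// Where does argument number i live under the given ABI?
inline ArgSlot locate_arg(const Abi& abi, std::size_t i) {
    if (i < abi.arg_registers) return ArgSlot{ArgLocation::Register, i};
    return ArgSlot{ArgLocation::Stack, i - abi.arg_registers};
}

// Must the caller save this register itself if it needs it after a call?
inline bool caller_must_save(const Abi& abi, std::size_t reg) {
    assert(reg < abi.register_count);
    return reg < abi.first_callee_saved;
}
```

A code generator would consult `locate_arg` when emitting a call and `caller_must_save` when deciding which live registers to spill, so every module agrees on the same rules.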
