Alternative syntax for the LEA instruction
Possible Duplicate:
What's the purpose of the LEA instruction?
When I need the value at an address, I can use the effective address, e.g. push dword [str+4]. But when I need to reference an address, I can't use push dword str+4 (which to me is the obvious and intuitive way to do it). Instead I need to use lea EAX, [str+4] and then push EAX. This is a bit confusing and also costs an extra processor instruction, albeit a 'zero-clock' one. (See this answer)
Is there some hardware-level explanation for this difference, or is it just a quirk of (NASM) assembly syntax?
Edit: Okay, so this comment asks the same question as I do, and it is answered in this comment, just as in Lucero's answer: the x86 does not support such addressing.
Assembly instructions directly represent x86 opcodes (no transforming compilation takes place as in higher-level languages). The opcodes have their limitations in what they can represent; as such, while address computations are possible as part of x86 addressing, value computations are not. LEA covers this gap by storing the result of the address computation in any register instead of only consuming it internally.
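To make the gap concrete, here is a minimal sketch assuming 32-bit NASM syntax. The choice of ebx as the base register is arbitrary and only for illustration. It shows the case where the address depends on a run-time register value, so it cannot be encoded as a constant operand:

```nasm
; Sketch only: 32-bit NASM; ebx is assumed to hold some base address at run time.
        push dword [ebx+4]      ; memory operand: the CPU computes ebx+4,
                                ; loads the dword stored there, and pushes it
        ; push ebx+4            ; no such encoding: an immediate operand must be
                                ; a constant, and ebx+4 is only known at run time
        lea  eax, [ebx+4]       ; same address computation, but the result
                                ; (the address itself) is kept in eax
        push eax                ; push the computed address
```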
Just use the correct syntax, you need the offset keyword:
push offset str+4
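The offset keyword is MASM-style syntax. In NASM, which by default has no offset keyword, a bare label expression already denotes the address, so the equivalent should be a push of an immediate. The following is a sketch under those assumptions (32-bit code; msg is a hypothetical label standing in for the asker's str):

```nasm
; Sketch only: 32-bit NASM; `msg` is a hypothetical label used for illustration.
section .data
msg:    db "hello, world", 0

section .text
        push dword [msg+4]      ; memory operand: pushes the 4 bytes stored at msg+4
        push dword msg+4        ; immediate operand: pushes the address msg+4,
                                ; a constant resolved by the assembler/linker
        add  esp, 8             ; discard the two pushed dwords
```

This only works because msg+4 is a constant once the program is linked; an address that depends on a register still needs lea or explicit arithmetic.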
The LEA instruction is handy for using the plumbing of the address generation logic, which gives very cheap ways to add and multiply that don't use the ALU. It is high on the list of tricks for programmers who write code generators. Not needed here, afaict.
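As an illustration of that trick, here is a short sketch in 32-bit NASM syntax. The register choices and values are arbitrary placeholders, not taken from the question:

```nasm
; Sketch only: 32-bit NASM; the values are arbitrary placeholders.
        mov  ebx, 10
        mov  ecx, 3
        lea  eax, [ebx+ecx*4+8] ; eax = 10 + 3*4 + 8 = 30; ebx and ecx are unchanged
        lea  edx, [ebx+ebx*4]   ; edx = ebx*5 = 50 (scale factor must be 1, 2, 4 or 8)
```

Unlike add or imul, lea does not modify the flags, and it can write its result to a register other than its inputs.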
This is more of a long comment (since it doesn't answer the question), but readers ought to know..
lea most certainly is not a zero-clock instruction. There are some of those, such as fxch (on everything with register renaming), nop (90 and 0F 1F) on Sandy Bridge, and certain idioms for setting a register to zero (xor or sub with itself, even for XMM registers), also on Sandy Bridge. They still have a limited throughput, of course, so they're not free.
lea always takes at least one cycle (at least on any processor I know of; it may not always have been this way). It is commonly executed on an ALU instead of an AGU (some AMDs and Atom are exceptions), but even in the cases where it's executed on an AGU it still takes a cycle or more. lea can even take more than 1 cycle, such as scaled lea on P4, Sandy Bridge (seems like I'm mentioning SB a lot in this post..) or AMD processors. In fact, on AMD K10 the lea that goes to the AGU is the slow case: it is the one that is scaled and/or has 3 arguments, and it takes a cycle longer than the fast one, which goes to an ALU.