Referencing the contents of a memory location. (x86 addressing modes)
I have a memory location that contains a character that I want to compare with another character (and it's not at the top of the stack, so I can't just pop it). How do I reference the contents of a memory location so I can compare it?
Basically, how do I do it syntactically?
For a more extended discussion of addressing modes (16/32/64-bit), see Agner Fog's "Optimizing Assembly" guide, section 3.3. That guide has much more detail than this answer on relocation for symbols and/or 32-bit position-independent code, among other things.
See also: table of AT&T(GNU) syntax vs. NASM syntax for different addressing modes, including indirect jumps / calls.
Also see the collection of links at the bottom of this answer.
Suggestions welcome, esp. on which parts were useful/interesting, and which parts weren't.
x86 (32 and 64bit) has several addressing modes to choose from. They're all of the form:
[base_reg + index_reg*scale + displacement] ; or a subset of this
[RIP + displacement] ; or RIP-relative: 64bit only. No index reg is allowed
(where scale is 1, 2, 4, or 8, and displacement is a signed 32-bit constant). All the other forms (except RIP-relative) are subsets of this that leave out one or more components. This means you don't need a zeroed index_reg to access [rsi], for example. In asm source code, the order in which you write the terms doesn't matter: [5 + rax + rsp + 15*4 + MY_ASSEMBLER_MACRO*2] works fine. (All the math on constants happens at assemble time, resulting in a single constant displacement.)
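To make the "order doesn't matter, constants fold" point concrete, here is a toy Python model of what an assembler does with the terms inside the brackets. It is only a sketch: fold_address, the term representation, and the value chosen for MY_ASSEMBLER_MACRO are all made up for illustration, and a real assembler handles more cases (symbols, the same register written twice, and so on).

```python
# Toy model of how an assembler folds "[5 + rax + rsp + 15*4 + MACRO*2]"
# into one base, one index*scale, and one constant displacement.
# fold_address and the term format are hypothetical, not a real assembler API.

VALID_SCALES = (1, 2, 4, 8)
STACK_POINTERS = ("esp", "rsp")   # cannot be encoded as an index register


def fold_address(terms):
    """Return (base, index, scale, disp) for a list of address terms.

    Each term is an int (a constant) or a (register_name, scale) tuple.
    """
    disp = 0
    regs = []
    for term in terms:
        if isinstance(term, int):
            disp += term              # constants collapse at "assemble time"
        else:
            regs.append(term)

    if len(regs) > 2:
        raise ValueError("at most two registers fit in one addressing mode")
    if not -2**31 <= disp < 2**31:
        raise ValueError("displacement must fit in a signed 32-bit field")

    scaled = [r for r in regs if r[1] != 1]
    if len(scaled) > 1:
        raise ValueError("only one register can be scaled")

    base = index = None
    scale = 1
    if scaled:
        index, scale = scaled[0]
        rest = [r for r in regs if r[1] == 1]
        base = rest[0][0] if rest else None
    elif len(regs) == 2:
        base, index = regs[0][0], regs[1][0]
        if index in STACK_POINTERS:   # source order is irrelevant: swap roles
            base, index = index, base
    elif len(regs) == 1:
        base = regs[0][0]

    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, or 8")
    if index in STACK_POINTERS:
        raise ValueError("the stack pointer cannot be an index register")
    return base, index, scale, disp


MY_ASSEMBLER_MACRO = 3   # assumed value, purely for the example
example = fold_address([5, ("rax", 1), ("rsp", 1), 15 * 4, MY_ASSEMBLER_MACRO * 2])
```

With the assumed macro value, the three constants fold into one displacement, and rsp ends up as the base because it can't be an index.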
The registers all have to be the same size as the mode you're in, unless you use an alternate address-size, requiring an extra prefix byte. Narrow pointers are rarely useful outside of the x32 ABI (ILP32 in long mode).
If you want to use al as an array index, for example, you need to zero- or sign-extend it to pointer width. (Having the upper bits of rax already zeroed before messing around with byte registers is sometimes possible, and is a good way to accomplish this.)
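A small Python sketch of why the extension matters, assuming rax holds leftover bits above al. The helper names are mine; they just model what movzx and movsx do to the value.

```python
def zero_extend(value, from_bits):
    """Model movzx: keep the low from_bits bits, clear everything above."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits):
    """Model movsx: replicate the top bit of the narrow value upward."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return (value ^ sign_bit) - sign_bit


# rax holds stale upper bits; only al (the low byte) is the intended index.
rax = 0xDEADBEEF_000000F0
unsigned_index = zero_extend(rax, 8)   # like: movzx ecx, al
signed_index = sign_extend(rax, 8)     # like: movsx rcx, al
```

Using rax directly as the index would add the stale upper bits to the address; which of the two extensions is right depends on whether the byte is meant as an unsigned or a signed index.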
Every possible subset of the general case is encodable, except ones using e/rsp*scale (obviously useless in "normal" code that always keeps a pointer to stack memory in esp).
Code size of the encodings: normally, a displacement in the range [-128, +127] can use the more compact disp8 encoding, saving 3 bytes vs. disp32. Code-size exceptions:
[reg*scale] by itself can only be encoded with a 32-bit displacement. Smart assemblers work around that by encoding lea eax, [rdx*2] as lea eax, [rdx + rdx], but that trick only works for scaling by 2.
It's impossible to encode e/rbp or r13 as the base register without a displacement byte, so [ebp] is encoded as [ebp + byte 0]. The no-displacement encodings with ebp as a base register instead mean there's no base register (e.g. for [disp + reg*scale]).
[e/rsp] requires a SIB byte even if there's no index register (whether or not there's a displacement). The mod/rm encoding that would specify [rsp] instead means that there's a SIB byte.
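The rules above can be turned into a rough byte counter. This Python sketch only counts ModRM + SIB + displacement bytes for 32/64-bit address size (no prefixes, opcode, or immediate), and addressing_bytes is a name I made up. One thing it adds that isn't spelled out above: r12 as a base needs a SIB byte for the same reason rsp does, since the two share the same low 3 register bits.

```python
NEEDS_DISP = {"ebp", "rbp", "r13"}   # no displacement-less base encoding exists
NEEDS_SIB = {"esp", "rsp", "r12"}    # their ModRM slot means "SIB byte follows"


def addressing_bytes(base=None, index=None, disp=0):
    """Count ModRM + SIB + displacement bytes for one memory operand.

    Prefixes, opcode and immediates are not counted. The scale factor is
    not a parameter because it never changes the size.
    """
    size = 1                                   # ModRM byte is always present
    if index is not None or base in NEEDS_SIB:
        size += 1                              # SIB byte
    if base is None:
        size += 4                              # [disp32] or [index*scale + disp32]
    elif disp == 0 and base not in NEEDS_DISP:
        pass                                   # no displacement bytes at all
    elif -128 <= disp <= 127:
        size += 1                              # disp8
    else:
        size += 4                              # disp32
    return size
```

For example, addressing_bytes(base="esi") is 1, while addressing_bytes(base="ebp") and addressing_bytes(base="esp") are both 2, for the two different reasons described above.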
See Table 2-5 in Intel's ref manual, and the surrounding section, for the details on the special cases. (They're the same in 32 and 64bit mode. Adding RIP-relative encoding didn't conflict with any other encoding, even without a REX prefix.)
For performance, it's typically not worth it to spend an extra instruction just to get smaller x86 machine code. On Intel CPUs with a uop cache, that cache is smaller than the L1 I$ and is the more precious resource, so minimizing fused-domain uops is typically more important than minimizing code bytes.
16-bit address size can't use a SIB byte, so all the one- and two-register addressing modes ([reg1 + reg2 + displacement], or a subset of that) are encoded into the single mod/rm byte. reg1 can be BX or BP, and reg2 can be SI or DI (or you can use any of those 4 registers by themselves). Scaling is not available. 16-bit code is obsolete for a lot of reasons, including this one, and not worth learning if you don't have to.
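As a quick way to check whether a register combination is legal with 16-bit address size, here is a minimal Python sketch of the rule just described. The function name and set names are mine, and displacements are ignored since any of these forms can carry one.

```python
BASE_REGS = {"bx", "bp"}
INDEX_REGS = {"si", "di"}


def is_valid_16bit_address(registers):
    """True if the registers can form a 16-bit [reg1 + reg2 + disp] operand."""
    regs = [r.lower() for r in registers]
    if any(r not in BASE_REGS | INDEX_REGS for r in regs):
        return False                      # e.g. ax, cx, dx, sp are not usable
    bases = [r for r in regs if r in BASE_REGS]
    indexes = [r for r in regs if r in INDEX_REGS]
    return len(bases) <= 1 and len(indexes) <= 1
```

So [bx + si] and [bp] pass, while [si + di], [bx + bp], and [ax] do not. An empty register list also passes, which corresponds to a plain [disp16].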
Note that the 16-bit restrictions apply in 32-bit code when the address-size prefix is used, so 16-bit LEA math is highly restrictive. However, you can work around that: lea eax, [edx + ecx*2] sets ax = dx + cx*2, because garbage in the upper bits of the source registers has no effect on the low 16 bits of the result.
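The reason this works is that addition and left shifts only propagate carries upward, so the low 16 bits of the result depend only on the low 16 bits of the inputs. A short Python check, with arbitrary example register values and a helper name (lea32) of my own:

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def lea32(base, index, scale, disp=0):
    """Model a 32-bit LEA: base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & MASK32


edx = 0xABCD_1234        # upper 16 bits are garbage; dx = 0x1234
ecx = 0x5555_F00D        # upper 16 bits are garbage; cx = 0xF00D

eax = lea32(edx, ecx, 2)                  # lea eax, [edx + ecx*2]
ax = eax & MASK16                         # what a reader of ax would see
expected = ((edx & MASK16) + (ecx & MASK16) * 2) & MASK16   # dx + cx*2
```

Here ax and expected come out equal even though the upper halves of edx and ecx are junk.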
How they're used
This table doesn't exactly match the hardware encodings of possible addressing modes, since I'm distinguishing between using a label (e.g. for global or static data) and using a small constant displacement. So I'm covering hardware addressing modes plus linker support for symbols.
If you have a pointer char array[] in esi:
mov al, esi: invalid, won't assemble. Without square brackets, it's not a load at all. It's an error because the registers aren't the same size.
mov al, [esi]: loads the byte pointed to.
mov al, [esi + ecx]: loads array[ecx].
mov al, [esi + 10]: loads array[10].
mov al, [esi + ecx*8 + 200]: loads array[ecx*8 + 200].
mov al, [global_array + 10]: loads from global_array[10]. In 64-bit mode, this can be a RIP-relative address. Using DEFAULT REL is recommended, to generate RIP-relative addresses by default instead of having to always use [rel global_array + 10]. There is no way to use an index register with a RIP-relative address directly. The normal method is lea rax, [global_array] followed by mov al, [rax + rcx*8 + 10], or similar.
mov al, [global_array + ecx + edx*2 + 10]: loads from global_array[ecx + edx*2 + 10]. Obviously you can index a static/global array with a single register. Even a 2D array using two separate registers is possible (pre-scaling one with an extra instruction, for scale factors other than 2, 4, or 8). Note that the global_array + 10 math is done at link time. The object file (assembler output, linker input) informs the linker of the +10 to add to the final absolute address, to put the right 4-byte displacement into the executable (linker output). This is why you can't use arbitrary expressions on link-time constants that aren't assemble-time constants (e.g. symbol addresses).
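A toy Python model of that hand-off between assembler and linker, to show why only "symbol + constant" works. Everything here is invented for illustration (apply_relocation, the tuple format, the address); real object files use relocation records with a fixed set of types, and the simple absolute ones compute symbol value plus addend.

```python
def apply_relocation(symbol_addresses, symbol, addend):
    """Return the 32-bit displacement the linker would patch in (S + A)."""
    value = symbol_addresses[symbol] + addend
    if not 0 <= value < 2**32:
        raise ValueError("absolute address does not fit in 32 bits")
    return value


# The assembler only knows the symbol name and the constant +10 ...
relocation = ("global_array", 10)
# ... the linker later decides where global_array actually lives.
final_layout = {"global_array": 0x0804A020}   # hypothetical address
disp32 = apply_relocation(final_layout, *relocation)
```

Since the record can only carry a symbol and an addend, something like [global_array*2] has no way to be expressed: the assembler doesn't know the address yet, and the linker has no "multiply" operation to apply to it.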
mov al, 0ABh: not a load at all, but instead an immediate constant that is stored inside the instruction. (Note that you need to prefix a 0 so the assembler knows it's a constant, not a symbol. Some assemblers will also accept 0xAB.) You can use a symbol as the immediate constant, to get an address into a register.
In NASM, mov esi, global_array assembles into a mov esi, imm32 that puts the address into esi. In MASM, mov esi, OFFSET global_array is needed to do the same thing, because MASM assembles mov esi, global_array into a load: mov esi, dword [global_array].
In 64-bit mode, addressing global symbols is usually done with RIP-relative addressing, which your assembler will do by default with the DEFAULT REL directive, or with mov al, [rel global_array + 10]. No index register can be used with RIP-relative addresses, only constant displacements. You can still do absolute addressing, and there's even a special form of mov that can load from a 64-bit absolute address (rather than the usual 32-bit sign-extended one). AT&T syntax calls that opcode movabs (also used for mov r64, imm64), while Intel/NASM syntax still calls it a form of mov.
Use lea rsi, [rel global_array] to get RIP-relative addresses into registers, since mov reg, imm would hard-code a non-relative address into the instruction bytes.
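For intuition about what ends up in the instruction bytes, here is a Python sketch of the rel32 computation. The addresses are made up and rip_relative_disp is my own helper name; the one hardware fact it relies on is that RIP-relative displacements are measured from the end of the instruction.

```python
def rip_relative_disp(insn_address, insn_length, target):
    """rel32 such that RIP-after-this-instruction + rel32 == target."""
    next_rip = insn_address + insn_length     # RIP points past the instruction
    rel32 = target - next_rip
    if not -2**31 <= rel32 < 2**31:
        raise ValueError("target is more than +/-2 GiB away")
    return rel32


# lea rsi, [rel global_array] is 7 bytes: REX.W + opcode + ModRM + disp32.
# Both addresses below are hypothetical.
rel32 = rip_relative_disp(insn_address=0x401000, insn_length=7,
                          target=0x404010)
```

Because only the distance is stored, the same bytes stay correct if code and data are loaded together at a different base address, which is what makes this position-independent.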
Note that OS X loads all code at an address outside the low 32 bits, so 32-bit absolute addressing is unusable. Position-independent code isn't required for executables, but you might as well because 64-bit absolute addressing is less efficient than RIP-relative. The macho64 object file format doesn't support relocations for 32-bit absolute addresses the way Linux ELF does. Make sure not to use a label name as a compile-time constant anywhere, except in an effective address like [global_array + constant], because that can be assembled to a RIP-relative addressing mode. E.g. [global_array + rcx] is not allowed, because RIP can't be used with any other registers, so it would have to be assembled with the absolute address of global_array hard-coded as the 32-bit displacement (which will be sign-extended to 64 bits).
Any and all of these addressing modes can be used with LEA to do integer math with a bonus of not affecting flags, regardless of whether it's a valid address. [esi*4 + 10] is usually only useful with LEA (unless the displacement is a symbol, instead of a small constant). In machine code, there is no encoding for scaled-register alone, so [esi*4] has to assemble to [esi*4 + 0], with 4 bytes of zeros for a 32-bit displacement. It's still often worth it to copy+shift in one instruction instead of a shorter mov + shl, because usually uop throughput is more of a bottleneck than code size, especially on CPUs with a decoded-uop cache.
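To illustrate the kind of integer math LEA gives you, here is a small Python model of the value a 32-bit LEA writes. The function name and the choice of x are mine; the multiply-by-5 and multiply-by-9 forms are the usual base + index*scale trick with the same register in both slots.

```python
MASK32 = 0xFFFFFFFF


def lea(base=0, index=0, scale=1, disp=0):
    """Value a 32-bit LEA would write: wraps at 2**32, touches no flags."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32


x = 7
times4_plus10 = lea(index=x, scale=4, disp=10)   # lea eax, [esi*4 + 10]
times5 = lea(base=x, index=x, scale=4)           # lea eax, [esi + esi*4]
times9 = lea(base=x, index=x, scale=8)           # lea eax, [esi + esi*8]
```

Scale factors like 3 or 6 aren't directly available, which is why lea(index=x, scale=3) raises in this sketch; you'd combine a base and an index (x + x*2) instead.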
You can specify segment overrides like mov al, fs:[esi]. A segment override just adds a prefix byte in front of the usual encoding. Everything else stays the same, with the same syntax.
You can even use segment overrides with RIP-relative addressing. In 64-bit mode, 32-bit absolute addressing takes one more byte to encode than RIP-relative, so mov eax, fs:[0] can most efficiently be encoded using a relative displacement that produces a known absolute address, i.e. choose rel32 so RIP+rel32 = 0. YASM will do this with mov ecx, [fs: rel 0], but NASM always uses disp32 absolute addressing, ignoring the rel specifier. I haven't tested MASM or gas.
If the operand size is ambiguous (e.g. in an instruction with an immediate and a memory operand), use byte / word / dword / qword / xmmword / ymmword to specify:
mov dword [rsi + 10], 0xAB ; NASM
mov dword ptr [rsi + 10], 0xAB ; MASM and GNU .intel_syntax noprefix
movl $0xAB, 10(%rsi) # GNU(AT&T): operand size from insn suffix
See the yasm docs for NASM-syntax effective addresses, and/or the wikipedia x86 entry's section on addressing modes. The wiki page says what's allowed in 16bit mode. Here's another "cheat sheet" for 32bit addressing modes.
There's also a more detailed guide to addressing modes, for 16-bit. The concepts carry over to 32-bit (16-bit just allows fewer register combinations and no scaling), so if you're finding addressing modes confusing, read it anyway.
Also see the x86 wiki page for links.