x86 spinlock using cmpxchg
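I'm new to GCC inline assembly and was wondering whether a spinlock (free of race conditions) could be implemented like this on a multi-core x86 machine (using AT&T syntax):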
spin_lock:
    mov 0 eax
    lock cmpxchg 1 [lock_addr]
    jnz spin_lock
    ret
spin_unlock:
    lock mov 0 [lock_addr]
    ret
You have the right idea, but your asm is broken:
cmpxchg can't work with an immediate operand, only registers.
lock is not a valid prefix for mov.
mov to an aligned address is atomic on x86, so you don't need lock anyway.
It has been some time since I've used AT&T syntax; I hope I remembered everything:
spin_lock:
    xorl %ecx, %ecx
    incl %ecx                          # newVal = 1
spin_lock_retry:
    xorl %eax, %eax                    # expected = 0
    lock; cmpxchgl %ecx, (lock_addr)
    jnz spin_lock_retry
    ret
spin_unlock:
    movl $0, (lock_addr)               # atomic release-store
    ret
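Since the question asks about GCC inline assembly specifically, here is a rough sketch of how the same retry loop might be wrapped in an inline-asm statement. The helper names (try_lock_asm, spin_lock_asm, spin_unlock_asm) are made up for illustration, and the constraints assume GCC or Clang targeting x86 or x86-64; on other targets the sketch simply falls back to the compiler builtin.

```c
/* Sketch only: the function names are hypothetical. */

/* Try once to change *lock from 0 to 1.
   Returns 1 if the lock was taken, 0 if it was already held. */
static inline int try_lock_asm(volatile int *lock)
{
#if defined(__i386__) || defined(__x86_64__)
    int expected = 0;   /* cmpxchg compares memory against EAX */
    int new_val = 1;    /* value stored if the comparison succeeds */
    __asm__ __volatile__(
        "lock; cmpxchgl %2, %1"
        : "+a"(expected), "+m"(*lock)   /* EAX and the lock word are read and written */
        : "r"(new_val)                  /* cmpxchg needs a register source operand */
        : "memory", "cc");
    /* On failure, cmpxchg loads the current (non-zero) value into EAX. */
    return expected == 0;
#else
    return __sync_bool_compare_and_swap(lock, 0, 1);
#endif
}

static inline void spin_lock_asm(volatile int *lock)
{
    while (!try_lock_asm(lock))
        ;   /* retry until the compare-exchange succeeds */
}

static inline void spin_unlock_asm(volatile int *lock)
{
    __asm__ __volatile__("" ::: "memory");  /* compiler barrier before the store */
    *lock = 0;                              /* plain aligned store */
}
```

The "memory" clobber stops the compiler from moving loads and stores of protected data across the lock operations; the lock prefix takes care of ordering at the hardware level.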
Note that GCC has atomic builtins, so you don't actually need to use inline asm to accomplish this:
void spin_lock(int *p)
{
    while(!__sync_bool_compare_and_swap(p, 0, 1));
}
void spin_unlock(int volatile *p)
{
    asm volatile ("":::"memory"); // acts as a memory barrier.
    *p = 0;
}
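For what it's worth, the __sync builtins are the older interface; newer GCC (4.7 and later) and Clang also provide __atomic builtins that take an explicit memory order. A minimal sketch of the same lock using them might look like the following. The function names are hypothetical, and it uses an atomic exchange rather than a compare-and-swap, which is a different but equivalent way to take a lock whose only states are 0 and 1.

```c
/* Sketch only: the function names are hypothetical. */

/* Returns 1 if the lock was taken, 0 if it was already held. */
int spin_trylock_atomic(int *p)
{
    /* The exchange returns the previous value: 0 means the lock was free. */
    return __atomic_exchange_n(p, 1, __ATOMIC_ACQUIRE) == 0;
}

void spin_lock_atomic(int *p)
{
    while (!spin_trylock_atomic(p))
        ;   /* retry until the previous value was 0 */
}

void spin_unlock_atomic(int *p)
{
    /* Release store: earlier accesses cannot be reordered after it. */
    __atomic_store_n(p, 0, __ATOMIC_RELEASE);
}
```

On x86 the exchange compiles to xchg, which is implicitly locked, and the release store is an ordinary mov, so the generated code is close to the hand-written asm above.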
As Bo says below, locked instructions incur a cost: every one you use must acquire exclusive access to the cache line and lock it down while lock cmpxchg runs, which can delay the unlocking thread, especially if multiple threads are waiting to take the lock. Even without many CPUs, it's still easy and worth it to optimize around this:
#include <immintrin.h> // for _mm_pause

void spin_lock(int volatile *p)
{
    while(!__sync_bool_compare_and_swap(p, 0, 1))
    {
        // spin read-only until a cmpxchg might succeed
        while(*p) _mm_pause(); // or maybe do{}while(*p) to pause first
    }
}
The pause instruction is vital for performance on HyperThreading CPUs when you've got code that spins like this: it lets the second thread execute while the first thread is spinning. On CPUs which don't support pause, it is treated as a nop.
pause also prevents memory-order mis-speculation when leaving the spin-loop, when it's finally time to do real work again.
Note that real spinlock implementations don't spin forever; they fall back to an OS-assisted sleep and notify mechanism. They may also take measures to improve fairness, and lots of other things the cmpxchg / pause loop doesn't do.
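To give a rough idea of what "not spinning forever" can look like, here is a simplified sketch that spins a bounded number of times and then gives up the CPU. It is not how any particular library does it: sched_yield() is only a crude stand-in for a real sleep and notify mechanism, and SPIN_LIMIT, cpu_relax, spin_lock_yield and spin_unlock_yield are names and values chosen purely for illustration.

```c
#include <sched.h>   /* sched_yield (POSIX) */

/* Sketch only: names and the spin limit are arbitrary assumptions. */
#define SPIN_LIMIT 1000

static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __builtin_ia32_pause();   /* emits the pause instruction */
#endif
}

void spin_lock_yield(int volatile *p)
{
    while (!__sync_bool_compare_and_swap(p, 0, 1)) {
        int spins = 0;
        /* Wait read-only until the lock looks free. */
        while (*p) {
            if (++spins < SPIN_LIMIT) {
                cpu_relax();
            } else {
                sched_yield();   /* let another runnable thread use the CPU */
                spins = 0;
            }
        }
    }
}

void spin_unlock_yield(int volatile *p)
{
    __asm__ __volatile__("" ::: "memory");  /* compiler barrier */
    *p = 0;
}
```

Yielding helps when the lock holder has been descheduled, but unlike a futex-style wait it does not put the waiter to sleep until the lock is actually released, and it does nothing for fairness.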
Spinning on a plain read between compare-and-swap attempts reduces contention on the memory bus:
void spin_lock(int volatile *p)
{
    // volatile forces a fresh load of *p on every iteration of the inner loop
    while(!__sync_bool_compare_and_swap(p, 0, 1)) while(*p);
}