Primary key conflict even when TABLOCKX and HOLDLOCK hints

2018-05-30 22:50:51

I have a table which is used to create locks with unique key to control execution of a critical section over multiple servers, ie only one thread at a time from all the web servers can enter that critical section.

The lock mechanism starts by trying to add a record to the database, and if successful it enters the region, otherwise it waits. When it exits the critical section, it removes that key from the table. I have the following procedure for this:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION
  DECLARE @startTime DATETIME2
  DECLARE @lockStatus INT
  DECLARE @lockTime INT
  SET @startTime = GETUTCDATE()

  IF EXISTS (SELECT * FROM GuidLocks WITH (TABLOCKX, HOLDLOCK) WHERE Id = @lockName)
  BEGIN
    SET @lockStatus = 0
  END
  ELSE
  BEGIN
    INSERT INTO GuidLocks VALUES (@lockName, GETUTCDATE())
    SET @lockStatus = 1
  END

  SET @lockTime = (SELECT DATEDIFF(millisecond, @startTime, GETUTCDATE()))
  SELECT @lockStatus AS Status, @lockTime AS Duration
COMMIT TRANSACTION GetLock

So I do a SELECT on the table and use TABLOCKX and HOLDLOCK so I get an exclusive lock on the complete table and hold it until the end of the transaction. Then depending on the result, I either return fail status (0), or create a new record and return (1).

However, I am getting this exception from time to time and I just don't know how it is happening:

System.Data.SqlClient.SqlException: Violation of PRIMARY KEY constraint 'PK_GuidLocks'. Cannot insert duplicate key in object 'dbo.GuidLocks'. The duplicate key value is (XXXXXXXXX). The statement has been terminated.

Any idea how this is happening? How is it possible that two threads managed to obtain an exclusive lock on the same table and tried to insert rows at the same time?

UPDATE : It looks readers might have not fully understand my question here, so I would like to elaborate: My understanding is that using TABLOCKX obtains an exclusive lock on the table. I also understood from the documentation (and I could be mistaken) that if I use the HOLDLOCK statement, then the lock will be held till the end of the transaction, which in this case, I assume (and apparently my assumption is wrong, but that's what I understood from the documentation) is the outer transaction initiated by the BEGIN TRANSACTION statement and ended by COMMIT TRANSACTION statement. So the way I understand things here is that by the time SQL Server reach the SELECT statement having the TABLOCKX and HOLDLOCK, it will try to obtain an exclusive lock on the whole table, and will not release it until the execution of COMMIT TRANSACTION . If that's the case, how comes two threads seam to be trying to execute the same INSERT statement at the same time?

In my concurrent programming text many years ago, we read the parable of the blind train engineers who needed to transport trains both directions through a single track pass across the Andes only one track wide. In the first mutex model, an engineer would walk up to a synchronization bowl at the top of the pass and, if it was empty, place a pebble in to lock the pass. After driving through the pass he would remove his pebble to unlock the pass for the next train. This is the mutex model you have implemented and it doesn't work. In the parable a crach occurred soon after implementation, and sure enough there were two pebbles in the bowl - we have encountered a READ-READ-WRITE-WRTE anomaly due to the multi-threaded environment.

The parable then describes a second mutex model, where there is already a single pebble in the bowl. Each engineer walks up to the bowl and removes the pebble if one is there, placing it in his pocket while he drives through the pass. Then he restores the pebble to unlock the pass for the next train. If an engineer finds the bowl empty he keeps trying (or blocks for some length of time) until a pebble is available. This is the model that works.

You can implement this (correct) model by having ( only ever ) a single row in the GuidLocks table with a (by default) NULL value for the lock holder. In a suitable transaction each process UPDATES (in place) this single row with it's SPID exactly if the old value IS NULL; returning 1 if this succeeds and 0 if it fails. It again updates this column back to NULL when it releases the lock.

This will ensure that the resource being locked actually includes the row being modified, which in your case is clearly not always true.

See the answer by usr to this question for an interesting example.

I believe that you are being confused by the error message - clearly the engine is locating the row of a potential conflict before testing for the existence of a lock, resulting in a misleading error message, and that since (due to implementing model 1 above instead of model 2) the TABLOCK is being held on the resource used by the SELECT instead of the resource used by an INSERT/UPDATE, a second process is able to sneak in.

Note that, especially in the presence of support for snapshot isolation, the resource on which you have taken your TABLOCKX (the table snapshot before any inserts) does not guarantee to include the resource to which you have written the lock specifics (the table snapshot after an insert) .

If you look up the documentation for tablock and holdlock, you'll see that it is not doing what you think it is:

Tablock: Specifies that the acquired lock is applied at the table level. The type of lock that is acquired depends on the statement being executed. For example, a SELECT statement may acquire a shared lock. By specifying TABLOCK, the shared lock is applied to the entire table instead of at the row or page level. If HOLDLOCK is also specified, the table lock is held until the end of the transaction.

So the reason that your query is not working is because you are only getting a shared lock from the table. What Frisbee is attempting to point out is that you don't need to re-implement all of the transaction isolating and locking code because there is a more natural syntax that handles this implicitly. His version is better than yours because it's much more easy to not make a mistake that introduces bugs.

More generally, when ordering statements in your query, you should place the statements requiring the more restrictive lock first.

Use an app lock.

exec sp_getapplock @resource = @lockName, 
     @LockMode='Exclusive', 
     @LockOwner = 'Session';

Your approach is incorrect from many point of view: granularity (table lock), scope (transaction which commit), leakage (will leak locks). Session scope app locks is what you actually intend to use.

链接地址: http://www.djcxy.com/p/5622.html

上一篇: 默认锁定设置SQL Server 2008 R2

下一篇: 主键冲突，即使TABLOCKX和HOLDLOCK提示