minimizing lock scope for JDK8 ConcurrentHashMap check

1.

I have multiple threads updating a ConcurrentHashMap. Each thread appends a list of integers to the Value of a map entry based on the key. There is no removal operation by any thread.

The main point here is that I want to minimize the scope of locking and synchronization as much as possible.

I saw that the doc of computeIf...() method says "Some attempted update operations on this map by other threads may be blocked while computation is in progress", which is not so encouraging. On the other hand, when I look into its source code, I failed to observe where it locks/synchronizes on the entire map.

Therefore, I wonder about the comparison of theoretical performance of using computeIf...() and the following home-grown 'method 2'.

2.

Also, I feel that the problem I described here is perhaps one of the most simplified check-n-set (or generally a 'compound') operation you can carry out on ConcurrentHashMap .

Yet I'm not quite confident and can't quite find much guideline about how to do even this kind of simple compound operations on ConcurrentHashMap, without Locking/Synchronizing on the entire map .

So any general good practice advice for this will be much appreciated.

public void myConcurrentHashMapTest1() {

    ConcurrentHashMap<String, List<Integer>> myMap = new ConcurrentHashMap<String, List<Integer>>();

    // MAP KEY: a Word found by a thread on a page of a book 
    String myKey = "word1";

    // -- Method 1: 
    // Step 1.1 first, try to use computeIfPresent(). doc says it may lock the
    //      entire myMap. 
    myMap.computeIfPresent(myKey, (key,val) -> val.addAll(getMyVals()));
    // Step 1.2 then use computeIfAbsent(). Again, doc says it may lock the
    //      entire myMap. 
    myMap.computeIfAbsent(myKey, key -> getMyVals());    
}

public void myConcurrentHashMapTest2() {        
    // -- Method 2: home-grown lock splitting (kind of). Will it theoretically 
    //      perform better? 

    // Step 2.1: TRY to directly put an empty list for the key
    //      This may have no effect if the key is already present in the map
    List<Integer> myEmptyList = new ArrayList<Integer>();
    myMap.putIfAbsent(myKey, myEmptyList);

    // Step 2.2: By now, we should have the key present in the map
    //      ASSUMPTION: no thread does removal 
    List<Integer> listInMap = myMap.get(myKey);

    // Step 2.3: Synchronize on that list, append all the values 
    synchronized(listInMap){
        listInMap.addAll(getMyVals());
    }

}

public List<Integer> getMyVals(){
    // MAP VALUE: e.g. Page Indices where word is found (by a thread)
    List<Integer> myValList = new ArrayList<Integer>(); 
    myValList.add(1);
    myValList.add(2);

    return myValList;
}

You're basing your assumption (that using ConcurrentHashMap as intended will be too slow for you) on a misinterpretation of the Javadoc. The Javadoc doesn't state that the whole map will be locked. It also doesn't state that each computeIfAbsent() operation performs pessimistic locking.

What could actually be locked is a bin (aka bucket) which corresponds to a single element in the internal array backing of the ConcurrentHashMap . Note that this is not Java 7's map segment containing multiple buckets. When such a bin is locked, potentially blocked operations are solely updates for keys that hash to the same bin.

On the other hand, your solution doesn't mean that all internal locking within ConcurrentHashMap is avoided - computeIfAbsent() is just one of the methods that can degrade to using a synchronized block while updating. Even the putIfAbsent() with which you're initially putting an empty list for some key, can block if it doesn't hit an empty bin.

What's worse though is that your solution doesn't guarantee the visibility of your synchronized bulk updates. You are guaranteed that a get() happens-before a putIfAbsent() which value it observes, but there's no happens-before between your bulk updates and a subsequent get() .

PS You can read further about the locking in ConcurrentHashMap in its OpenJDK implementation: http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/concurrent/ConcurrentHashMap.java, lines 313-352.


As already explained by Dimitar Dimitrov, a compute… method doesn't generally lock the entire map. In the best case, ie there's no need to increase the capacity and there's no hash collision, only the mapping for the single key is locked.

However, there are still things you can do better:

  • generally, avoid performing multiple lookups. This applies to both variants, using computeIfPresent followed by computeIfAbsent , as well as using putIfAbsent followed by get
  • it's still recommended to minimize the code executed when holding a lock, ie don't invoke getMyVals() while holding the lock as it doesn't depend on the map's state
  • Putting it together, the update should look like:

    // compute without holding a lock
    List<Integer> toAdd=getMyVals();
    // update the map
    myMap.compute(myKey, (key,val) -> {
        if(val==null) val=toAdd; else val.addAll(toAdd);
        return val;
    });
    

    or

    // compute without holding a lock
    List<Integer> toAdd=getMyVals();
    // update the map
    myMap.merge(myKey, toAdd, (a,b) -> { a.addAll(b); return a; });
    

    which can be simplified to

    myMap.merge(myKey, getMyVals(), (a,b) -> { a.addAll(b); return a; });
    
    链接地址: http://www.djcxy.com/p/91900.html

    上一篇: 使用ConcurrentHashMap时的并发问题

    下一篇: 尽量减少JDK8 ConcurrentHashMap检查的锁定范围