hardware mutex, based@XCHG instruction

[[OS by William Stallings]] P214 is concise and clear.

The atomic XCHG instruction on Intel processors can swap a local variable (held in cpu register) with a main memory location visible to all threads.

This XCHG instruction alone can implement a simple mutex usable by multiple threads on multiple processors sharing the same main memory.

Another atomic instruction, atomic test-n-set instruction, can also by itself implement mutex.

Both mutexes have only academic value because their Limitations are severe:

  • even with a single mutex, deadlock is possible at the hardware level — ThrA in critical section can get interrupted and preempted — perfectly normal. Now ThrA has to wait for high-priority ThrH to complete, but ThrH is busy-waiting (wheel-spinning) to enter the same critical section 😦
    • Note in this example there’s no “lock object” per-se, only a critical section in code !
  • by default, unsuccessful threads like ThrH are stuck in busy-wait — wasting CPU and generating heat 😉

 

Advertisements

std::defer_lock #kill1@4deadlock factors

Only in the lock() call does the thread grabs both locks together. This breaks “incremental acquisition” , one of the four deadlock conditions.

Sample code from https://en.cppreference.com/w/cpp/thread/lock_tag

  std::unique_lock<std::mutex> ulock1(mutex1,std::defer_lock);
  std::unique_lock<std::mutex> ulock2(mutex2,std::defer_lock);

  // now grab both locks
  std::lock(ulock1,ulock2); 

deadlock avoidance: release all locks fail-fast

In some deadlock avoidance algorithms, at runtime we need to ensure we immediately release every lock already acquired, without hesitation, so to speak. Here’s one design my friend Shanyou shared with me.

  • Tip: limit the number of “grabbing” functions, functions that explicit grab any lock.

Suppose a grabbing function f1 acquires a lock and calls another function f2 (a red flag to concurrency designers). In such a situation, f2() will use -1 return value to indicate “must release lock”. If f2() calls a grabbing function f3(), and if f3() fails to grab, it propagates “-1” to f2 and f1.

  1. f3() could use trylock.
  2. f3() could check lock hierarchy and return -1.

calling unknown code while hold`(or Not hold`) lock #XR

In c++, if the unknown code does a fork(), then all locks are replicated to new born process. See after fork(): threads

[[eff c#]] warns against calling “unknown” code while holding a lock. This advice applies to java and c++ too.

Now, What defines “unknown”? Any delegate can have an inv list containing unknown callbacks. Raising an event involves delegate invocation. If you receives an Action instance, realize it’s a delegate instance too.
Another category of “unknown code” might be implementations of an interface. You get a IComparable instance, and you just call Compare() without knowing the implementation
Suppose within obj1.method1() we grab a lock and then call the unknown code. Such unknown code could circle back to call obj1.method2(). If method2() also grabs the same lock, then no issue, thanks to c# reentrancy. Note c# reentrancy is implemented by counting — http://stackoverflow.com/questions/3687505/recursive-nested-locking-in-c-sharp-with-the-lock-statement
If method2 is on another thread, then we get a classic single-lock deadlock — method2 would be blocked by the lock held by method1(), hopefully for a short time only. If method1() is unfortunately waiting for method2 to finish its work on the other thread, then the deadlock would persist.
Even if there’s no locking involved, method2 may circle back to invoke method1 (on the same or a separate thread, on the same obj1 or another object), recursively. If this recursion is unplanned (like unplanned pregnancy:), then stack overflow may ensue.
Dumb luck if unintentional recursion doesn’t hit stack overflow.

break the 4 deadlock conditions

My long-time java friend told me lock ordering and lock timeout are the 2 standard solutions, but here are additional suggestions directly or indirectly addressing deadlocks.

(“[top x]” means one of my favorite x  techniques.)

For want of a better word, “stuck” is my general term meaning “no-progress” and includes
– lock grab,
– Object.wait(), waking up and grabbing lock
– join(),
– socket accept(),
– long sleep(),
– endless loop,
– unbounded *recursion*

Now the solutions.
–to provide preemption
* [top 3] remain interruptible while getting “stuck”
** even if a recursion/loop calls no blocking method, the thread can still check for the interrupt status from time to time and exit if necessary. see http://download.oracle.com/javase/tutorial/essential/concurrency/interrupt.html
* tryInterruptibly(). Now, if thread 1 knows another thread is holding a mutex it can forcibly grab it, but how does it know which thread? –to break incremental acquisition

*[top 2] have a timeout while getting “stuck” — a wide-spread implementation
** example — in an endless loop, check how long we have been stuck and exit if necessary.
** example — lock timeout — A thread tries to acquire a lock in a specified period. Upon timeout, it release all it’s locks and go to wait status for a period. For example, release 1st chopstick if 2nd chopstick remains unavailable for x seconds. Achievable with tryLock()? Yes P279 [[ java concurrency in practice ]] I believe tryLock() can tell u the 2nd chopstick is unavailable and give you a chance to give up 1st chopstick held. See P69 [[pthreads]] and deadlock avoidance: release all locks voluntarily
* open call? impractical

–to break circular wait
* lockInterruptibly() of the Lock interface. It let’s u try to acquire a lock and remain responsive to interruption.
* reentrant/recursive lock. By default pthreads locks aren’t so a single thread can deadlock itself.
* tryLock()
* [top 2] globally fixed resource acquisition order
* if you identify that 3 threads form a wait-for circle, then reverse one of the threads’ acquisition order
* finally() to release lock if using Lock objects. “synchronized” doesn’t need this.
* Release locks asap. minimize lock scope.
* [top 5] Make all methods finite. Avoid using locks while inside infinite loops or unbounded recursions.

–to break the need for mutex resources
* immutables. no need to lock anything to access them. CopyOnWrite is thread safe because it’s immutable.
* ThreadLocal
* copy On Write
* copy the data for each thread, and sync up later
* replacing sync with atomic variables? both options can be criticized as risky

deadlock involving 1 lock (again):suspend/resume methods

if you app uses just one lock, you can still get deadlock. Eg: ThreadM (master) starts ThreadS (slave). S acquires a lock. M calls suspend method on S, then tries to acquire the lock, which is held by the suspended S. S is waiting to be resumed by M — deadlock.

Now the 4 conditions of deadlock:
* wait-for cycle? indeed
* mutually exclusive resource? Just one resource
* incremental acquisition? indeed
* non-Preemptive? Indeed. But if there’s a third thread, it can restart ThreadS.

In multithreaded apps, I feel single-lock deadlock is very common. Here’s another example — “S gets a lock, and wait for ConditionA. Only M can create conditionA, but M is waiting for the lock.”

Here, ConditionA can be many things, including a Boolean flag, a JMS message, a file, a count to reach a threshold, a data condition in DB, a UI event where UI depends partly on ThreadM.

deadlock involving 1 lock,2 threads

See also ..

Deadlock is seldom found in real projects. Even if an observed problem is actually due to deadlock, the problem is usually resolved without the deadlock being identified. Most managers and teams are happy to see a problem disappear and not motivated to uncover root cause(s).

The classic 2-lock deadlock situation is usually hard to reproduce, which makes it a rare “natural disaster”, but possibly catastrophic. I’ve never seen it but if a deadlock were consistently reproducible, then it would be much easier to resolve.

However, if you are serious about deadlock risks in real projects, you can’t ignore one of the more common deadlock patterns — 2 threads deadlock each other with a single lock.

I’d /go out on a limb/ to claim that this pattern is arguably more common than the 2-lock textbook pattern. It may account for up to 90% of all uncovered deadlock cases.

P90 [[eff c#]] also shows a single-lock deadlock. Here’s my recollection — ThrA acquires the lock and goes into a while-loop, periodically checking a flag to be set by ThrB. ThrB wants to set the flag but need the lock.

real world DB deadlock reduction — insert hotspot

One of my multi-threaded java apps (using ExecutorSerivce) started getting database deadlocks when I increased thread count from 1 to 5.

* In Dev/Test environment, i got up to 4 deadlocks out of 20,000-40,000 inserts.
* In Production, I got 3000+ deadlocks out of 20,000 inserts.

Solution: I changed the clustered index from the table’s identity col to an AccountNumber column.

Result? down to below 5 deadlocks.

Analysis: Probably the clustered index on identity col created a hotspot for inserts. Since new rows all get similar identity values, all 5 threads need to write to the same data pages (sybase defaults to page lock). Each thread (with its own db connection and spid) in the deadlock probably writes to 2 pages within on transaction. If one thread already wrote on page 1 and needs to write on page 2 before commit(), but the other thread already wrote to page 2 and needs page 1, then this scenario meets

* wait-for circle
* incremental lock acquisition
* mutex
* non-preemption

Simple deadlock demo


package com.test1;

/**
* showcasing Thread.yield(), Thread.join(), anon class
*/
public class DeadlockSimpleDemo {
public static void main(String[] args) {
Thread thr1 = startNewThread("lockA", "lockB");
Thread thr2 = startNewThread("lockB", "lockA"); // identical string literals are the same objects
// thr1.join();
// thr2.join();
}

private static Thread startNewThread(final String lock1, final String lock2) {
Thread ret = new Thread() {
@Override
public void run() {
synchronized (lock1) {
System.out.println(currentThread() + lock1 + " acquired, grabbing "
+ lock2);
yield(); // needed to produce a deadlock
synchronized (lock2) {
System.out.println(currentThread() + lock1
+ " acquired, -> then acquired " + lock2);
}
}
System.out.println(currentThread() + "end of run()");
}
};
ret.start();
return ret;
}
}