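What is a spurious thread wakeup
Condition locks can produce spurious wakeups in different languages and even on different operating systems. Condition-lock libraries across languages recommend that callers put wait() inside a loop; see "Why condition locks produce spurious wakeups".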
synchronized (lock) {
    while (!cond) {
        lock.wait(); // in Java, wait() must be called while holding lock's monitor
    }
}
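The same loop idiom applies to explicit condition objects. The following is a minimal sketch using java.util.concurrent.locks.Condition, whose await() is also allowed to return spuriously. The class name ConditionLoopSketch, the field cond, and the helper demo() are hypothetical names chosen for illustration; they are not from the original article.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: waiting on a Condition inside a loop.
class ConditionLoopSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private boolean cond = false; // guarded by lock

    // Blocks until cond is true, rechecking after every wakeup.
    void awaitCond() throws InterruptedException {
        lock.lock();
        try {
            while (!cond) {
                ready.await(); // may return without a matching signal
            }
        } finally {
            lock.unlock();
        }
    }

    // Sets cond and wakes all waiters.
    void signalCond() {
        lock.lock();
        try {
            cond = true;
            ready.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Starts one waiter, signals it, and reports whether the waiter finished.
    static boolean demo() {
        ConditionLoopSketch sketch = new ConditionLoopSketch();
        Thread waiter = new Thread(() -> {
            try {
                sketch.awaitCond();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        sketch.signalCond();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + demo());
    }
}
```

Because the waiter checks cond before and after every await(), the sketch behaves the same whether the signal arrives before the waiter starts waiting, after it, or not at all on a given wakeup.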
An excerpt from Wikipedia's explanation:
This means that when you wait on a condition variable, the wait may (occasionally) return when no thread specifically broadcast or signaled that condition variable. Spurious wakeups may sound strange, but on some multiprocessor systems, making condition wakeup completely predictable might substantially slow all condition variable operations. The race conditions that cause spurious wakeups should be considered rare.
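In short: a thread waiting on a condition variable may be woken even though no other thread has broadcast or signaled that condition variable. On multiprocessor systems, making condition wakeup fully predictable would slow down all condition variable operations, and the races that cause spurious wakeups are rare.
Based on what I have learned, the way "wakeup" is implemented at the operating system level makes spurious wakeups unavoidable. The designers leave the problem unfixed for two reasons:
- Fixing it would reduce system performance, so the cost outweighs the benefit.
- Even if it were fixed, synchronization issues would still require wait() to be inside a loop.
I won't dig into the first reason. For the second, the following demonstrates a real scenario in Java.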
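Simulated scenario
A typical producer-consumer scenario: there are two consumer threads and one producer thread. The consumers and the producer are all one-shot: each consumes or produces exactly one product. The initial product count is 0, and the three threads start at (approximately) the same time.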
/**
 * @description: adapted from http://www.reibang.com/p/da312eee4ac4
 * @author: alonwang
 * @create: 2019-07-19 15:54
 **/
public class SpuriousWakeUp {
    private final Object lock = new Object();
    private int product = 0;
    // If there is no product, wait on the lock object to be woken; if there is one, consume it.
    private Runnable consumer = () -> {
        System.out.println(Thread.currentThread().getName() + " prepare consume");
        synchronized (lock) {
            if (product <= 0) { // replace with while to fix the spurious wakeup problem
                try {
                    System.out.println(Thread.currentThread().getName() + " wait");
                    lock.wait();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println(Thread.currentThread().getName() + " wakeup");
            }
            product--;
            System.out.println(Thread.currentThread().getName() + " consumed product:" + product);
            if (product < 0) {
                System.err.println(Thread.currentThread().getName() + " spurious lock happend, product: " + product);
            }
        }
    };
    // Produce one product, then wake one consumer waiting on the lock object.
    private Runnable producer = () -> {
        System.out.println(Thread.currentThread().getName() + " prepare produce");
        synchronized (lock) {
            product += 1;
            System.out.println(Thread.currentThread().getName() + " produced product: " + product);
            lock.notify();
        }
    };
    public void producerAndConsumer() {
        // Start 2 consumers and 1 producer.
        Thread c1 = new Thread(consumer);
        Thread c2 = new Thread(consumer);
        Thread p = new Thread(producer);
        c1.start();
        c2.start();
        p.start();
    }
    public static void main(String[] args) {
        // Run 100 times so that the abnormal behavior is likely to be triggered.
        for (int i = 0; i < 100; i++) {
            new SpuriousWakeUp().producerAndConsumer();
        }
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.exit(0);
    }
}
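Observed behavior
Expected behavior: regardless of the order in which the three threads run, there is only one producer and no product initially, but there are two consumer threads, so one consumer thread must block forever because it has nothing to consume.
Actual behavior: in some runs, the consumer thread that should block forever is woken abnormally and consumes a product, leaving the product count at -1.
An excerpt of the output showing the abnormal behavior: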
Thread-100 prepare consume
Thread-100 consumed product:0
Thread-99 wakeup
Thread-99 consumed product:-1
Thread-99 spurious lock happend, product: -1
Thread-108 prepare consume
Thread-108 wait
Thread-112 prepare consume
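Explanation
Let the two consumers be C1 and C2 and the producer be P. The sequence behind the anomaly above is: C1 finds no product and waits; P produces one product and calls notify(); before C1 reacquires the lock, C2 enters the synchronized block, sees one product, and consumes it; C1 then reacquires the lock, resumes after wait() without rechecking, and decrements the count to -1.
To understand the problem, you need to know the following:
- A thread that cannot acquire the lock is blocked and waits on the Contention List.
- A thread that holds the lock and calls wait() voluntarily gives up the lock and waits in the Wait Set to be woken.
- After a thread calls notify(), the woken thread can only proceed once the notifier has exited the synchronized block and released the lock (I have not yet worked out the exact internal order of the wakeup and the lock release).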
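The observable part of the last point can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: the class NotifyOrderDemo, its fields, and the event strings are hypothetical names, not from the original article. It records events while holding the lock and shows that the notifier can keep working inside the synchronized block after notify(), and that the waiter resumes only after the notifier leaves the block. It does not reveal the JVM-internal order of wakeup and release.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: notify() does not release the lock by itself.
class NotifyOrderDemo {
    private final Object lock = new Object();
    private final List<String> events = new ArrayList<>(); // guarded by lock
    private boolean waiting = false;  // waiter has entered the monitor, guarded by lock
    private boolean notified = false; // guard condition for the waiter, guarded by lock

    List<String> run() {
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                waiting = true;
                while (!notified) {
                    try {
                        lock.wait(); // releases the lock while waiting
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                // Reached only after the lock has been reacquired.
                events.add("waiter resumed");
            }
        });
        waiter.start();

        boolean done = false;
        while (!done) {
            synchronized (lock) {
                // If waiting is true here, the waiter is inside wait(),
                // because it set the flag and released the lock atomically.
                if (waiting) {
                    notified = true;
                    lock.notify();
                    // Still inside the synchronized block after notify().
                    events.add("notifier still holds the lock");
                    done = true;
                }
            }
            if (!done) {
                Thread.yield();
            }
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(new NotifyOrderDemo().run());
    }
}
```

The notifier's event is always recorded first, since the waiter cannot append its own event until it owns the lock again.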
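The core of the problem: after C1 is woken, it still has to reacquire the lock before it continues, and wakeup plus lock acquisition is not atomic. After the wakeup, another thread may take the lock first; by the time C1 gets the lock again, the product is already gone. Because C1 simply resumes after wait(), it never rechecks the product count, which produces the anomaly.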
Solution: replace if with while
With while, the thread rechecks the guard condition every time it is woken, which keeps the logic correct.
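A minimal sketch of the fixed consumer follows. It is a reduced variant of the scenario, not the author's exact program: the class GuardedWaitDemo, the counters consumed and minProduct, and the helper runOnce() are hypothetical additions for illustration. The leftover consumer, which can never be satisfied, is interrupted so that each run terminates; that shutdown step is an assumption of this sketch and not part of the original demo.

```java
// Hypothetical illustration: the consumer rechecks its guard in a while loop.
class GuardedWaitDemo {
    private final Object lock = new Object();
    private int product = 0;    // guarded by lock
    private int consumed = 0;   // number of successful consumes, guarded by lock
    private int minProduct = 0; // lowest product count seen, guarded by lock

    private void consume() {
        synchronized (lock) {
            while (product <= 0) { // recheck after every wakeup
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // used here only to shut down the leftover consumer
                }
            }
            product--;
            consumed++;
            minProduct = Math.min(minProduct, product);
        }
    }

    private void produce() {
        synchronized (lock) {
            product++;
            lock.notify();
        }
    }

    // Runs two consumers and one producer; returns {minProduct, consumed}.
    static int[] runOnce() {
        GuardedWaitDemo demo = new GuardedWaitDemo();
        Thread c1 = new Thread(demo::consume);
        Thread c2 = new Thread(demo::consume);
        Thread p = new Thread(demo::produce);
        c1.start();
        c2.start();
        p.start();
        try {
            // Poll until one consumer has taken the single product.
            while (true) {
                synchronized (demo.lock) {
                    if (demo.consumed == 1) {
                        break;
                    }
                }
                Thread.sleep(1);
            }
            // Interrupt both; this is a no-op for the one that already finished.
            c1.interrupt();
            c2.interrupt();
            c1.join();
            c2.join();
            p.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (demo.lock) {
            return new int[] {demo.minProduct, demo.consumed};
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            int[] result = runOnce();
            if (result[0] < 0) {
                System.err.println("product went negative: " + result[0]);
            }
        }
        System.out.println("finished 100 runs");
    }
}
```

With the while loop, a consumer that wakes up and finds the product already taken goes back to waiting instead of decrementing, so the count should never drop below zero in any run.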