The futex synchronization mechanism consists of two parts: atomic operations in user space and the futex system call in the kernel. The prototype of the system call is:
int futex (int *uaddr, int op, int val, const struct timespec *timeout,
int *uaddr2, int val3);
Inside the kernel, the futex system call hands the actual work to do_futex():
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
u32 __user *uaddr2, u32 val2, u32 val3)
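The futex system call takes many parameters, and do_futex() takes one more. A single futex entry point performs different operations depending on the operation type, and the arguments each operation needs differ. How many arguments an operation uses, and what each of them means, is determined by the op parameter. The operation types are defined as follows:
//The basic block and wake operations: FUTEX_WAIT blocks the calling process (thread) on the futex variable at uaddr (only if *uaddr == val),
//and FUTEX_WAKE wakes up to val processes (threads) blocked on the futex variable at uaddr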
#define FUTEX_WAIT 0
#define FUTEX_WAKE 1
//Associated a file descriptor with the futex; removed from the kernel in Linux 2.6.26
#define FUTEX_FD 2
//Similar to the basic wake operation, but it does more than wake val processes (threads) waiting on uaddr:
//it also moves up to val2 of the remaining waiters on uaddr to the wait queue of uaddr2
//(as if they had been woken and then immediately forced to block on uaddr2)
#define FUTEX_REQUEUE 3
//FUTEX_CMP_REQUEUE adds one check on top of FUTEX_REQUEUE:
//the operation is performed only if *uaddr equals val3; otherwise it returns immediately so that user space can retry.
#define FUTEX_CMP_REQUEUE 4
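//This operation does several things in one call. It modifies the value at uaddr2, wakes up to val processes on the wait queue of uaddr,
//and, if the old value at uaddr2 satisfies a condition, also wakes up to val2 processes on the wait queue of uaddr2.
//How the value at uaddr2 is modified, and which condition must hold before uaddr2 waiters are woken, are both packed into the val3 parameter.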
#define FUTEX_WAKE_OP 5
//Futex lock operations with priority inheritance (PI)
#define FUTEX_LOCK_PI 6
#define FUTEX_UNLOCK_PI 7
#define FUTEX_TRYLOCK_PI 8
//Extends the basic block and wake operations with an additional bitset parameter, val3:
//a process waiting with a given bitset is woken only by a wake call whose bitset shares at least one bit with it.
#define FUTEX_WAIT_BITSET 9
#define FUTEX_WAKE_BITSET 10
//FUTEX_CMP_REQUEUE_PI is the priority-inheritance version of FUTEX_CMP_REQUEUE;
//FUTEX_WAIT_REQUEUE_PI is its companion and is used in place of the plain FUTEX_WAIT
#define FUTEX_WAIT_REQUEUE_PI 11
#define FUTEX_CMP_REQUEUE_PI 12
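To make the two basic operations concrete, here is a minimal user-space sketch. It assumes a Linux system where the raw system call is reached through syscall(2) with SYS_futex, since glibc exports no futex() wrapper; the helper names futex_call, futex_wait_on and futex_wake_n are made up for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: forwards the six futex arguments to the raw syscall. */
static long futex_call(uint32_t *uaddr, int op, uint32_t val,
                       const struct timespec *timeout,
                       uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}

/* FUTEX_WAIT: block only while *uaddr == expected.
 * Returns 0 when woken; returns -1 with errno == EAGAIN if the value
 * had already changed by the time the kernel checked it. */
static long futex_wait_on(uint32_t *uaddr, uint32_t expected)
{
    return futex_call(uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* FUTEX_WAKE: wake at most n waiters; returns how many were woken. */
static long futex_wake_n(uint32_t *uaddr, uint32_t n)
{
    return futex_call(uaddr, FUTEX_WAKE, n, NULL, NULL, 0);
}
```

A waiter would normally call futex_wait_on() in a loop that re-checks the futex variable after every return, because a return value of 0 does not guarantee that the condition it was waiting for now holds.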
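The futex system call itself adjusts its arguments according to the operation type op and then calls do_futex(...). There are three main cases:
- When op is a blocking operation, the utime passed in by the user is the timeout for the wait. If utime is not NULL, it is converted from struct timespec to ktime_t and then passed to do_futex(...). For FUTEX_WAIT the value is relative, so the current time is added to it first.
- When op is one of the FUTEX_*_REQUEUE_* operations, utime is not a timeout at all: it carries the requeue count (the value compared against *uaddr travels in val3). It is cast to a u32 value, val2, and passed to do_futex(...). This is the extra parameter that do_futex(...) has compared with futex(...).
- When op is FUTEX_WAKE_OP, utime goes through the same cast as in the second case, but here it holds the number of processes to wake on uaddr2.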
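Seen from user space, the second case means an integer has to be smuggled through the pointer-typed fourth argument. The following is a sketch under the same assumptions as any raw futex call (Linux, syscall(2) with SYS_futex); the wrapper name futex_cmp_requeue and its parameter names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper: wake up to nr_wake waiters on `from` and move up to
 * nr_requeue further waiters onto `to`, provided *from still equals
 * `expected`. Fails with errno == EAGAIN when the comparison fails. */
static long futex_cmp_requeue(uint32_t *from, uint32_t *to,
                              uint32_t nr_wake, uint32_t nr_requeue,
                              uint32_t expected)
{
    /* The fourth argument is declared as a timespec pointer, but for the
     * requeue operations the kernel casts it back to the integer val2. */
    const struct timespec *val2_slot =
        (const struct timespec *)(uintptr_t)nr_requeue;

    return syscall(SYS_futex, from, FUTEX_CMP_REQUEUE, nr_wake,
                   val2_slot, to, expected);
}
```

On success the call returns the number of waiters woken plus the number requeued, so with no waiters at all it returns 0.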
linux/kernel/futex.c
SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
		struct timespec __user *, utime, u32 __user *, uaddr2,
		u32, val3)
{
	struct timespec ts;
	ktime_t t, *tp = NULL;
	u32 val2 = 0;
	int cmd = op & FUTEX_CMD_MASK;
	if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI ||
		      cmd == FUTEX_WAIT_BITSET ||
		      cmd == FUTEX_WAIT_REQUEUE_PI)) {
		if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
			return -EFAULT;
		if (!timespec_valid(&ts))
			return -EINVAL;
		t = timespec_to_ktime(ts);
		if (cmd == FUTEX_WAIT)
			t = ktime_add_safe(ktime_get(), t);
		tp = &t;
	}
	/*
	 * requeue parameter in 'utime' if cmd == FUTEX_*_REQUEUE_*.
	 * number of waiters to wake in 'utime' if cmd == FUTEX_WAKE_OP.
	 */
	if (cmd == FUTEX_REQUEUE || cmd == FUTEX_CMP_REQUEUE ||
	    cmd == FUTEX_CMP_REQUEUE_PI || cmd == FUTEX_WAKE_OP)
		val2 = (u32) (unsigned long) utime;
	return do_futex(uaddr, op, val, tp, uaddr2, val2, val3);
}
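One detail in the listing above is easy to miss: only FUTEX_WAIT has the current time added to its timeout, so its timeout is relative, while the other blocking operations receive an absolute deadline. The sketch below shows both conventions from user space. It assumes Linux and the raw syscall(2) interface, and relies on the documented behaviour that FUTEX_WAIT_BITSET measures its deadline against CLOCK_MONOTONIC unless FUTEX_CLOCK_REALTIME is set. The two helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* FUTEX_WAIT: the timeout is an interval starting now. */
static long futex_wait_relative_ms(uint32_t *uaddr, uint32_t expected, long ms)
{
    struct timespec rel = { .tv_sec = ms / 1000,
                            .tv_nsec = (ms % 1000) * 1000000L };

    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &rel, NULL, 0);
}

/* FUTEX_WAIT_BITSET: the timeout is an absolute point in time, so the
 * caller computes "now + ms" itself before making the call. */
static long futex_wait_deadline_ms(uint32_t *uaddr, uint32_t expected, long ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {   /* carry into seconds */
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, expected, &deadline,
                   NULL, FUTEX_BITSET_MATCH_ANY);
}
```

When nobody wakes the caller, both helpers return -1 with errno set to ETIMEDOUT once the time has passed.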
do_futex(...) mainly dispatches on the operation type encoded in op. For example, FUTEX_WAIT calls futex_wait(uaddr, flags, val, timeout, val3) and FUTEX_WAKE calls futex_wake(uaddr, flags, val, val3); these are the basic futex block and wake operations. FUTEX_WAIT_BITSET and FUTEX_WAKE_BITSET end up in the same two functions. The only difference is that FUTEX_WAIT and FUTEX_WAKE first set the bitset parameter val3 to match everything (FUTEX_BITSET_MATCH_ANY) and then fall through to the bitset case. The flags argument passed to these functions records whether the futex variable is shared between processes or private to one process; its value is derived from op.
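The bitset rule itself is a single bitwise test. The function below is only a user-space model of that test, not kernel code, and its name is made up; it shows why FUTEX_BITSET_MATCH_ANY turns the bitset variants back into the plain operations.

```c
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the matching rule: a waiter is eligible for a wake call
 * when the two bitsets have at least one bit in common. */
static bool bitset_wakes(uint32_t waiter_bitset, uint32_t wake_bitset)
{
    return (waiter_bitset & wake_bitset) != 0;
}
```

Since FUTEX_BITSET_MATCH_ANY has all 32 bits set, it overlaps with every non-zero bitset, which is exactly the behaviour of plain FUTEX_WAIT and FUTEX_WAKE.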
linux/kernel/futex.c
long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
		u32 __user *uaddr2, u32 val2, u32 val3)
{
	int cmd = op & FUTEX_CMD_MASK;
	unsigned int flags = 0;
	if (!(op & FUTEX_PRIVATE_FLAG))
		flags |= FLAGS_SHARED;
	if (op & FUTEX_CLOCK_REALTIME) {
		flags |= FLAGS_CLOCKRT;
		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI)
			return -ENOSYS;
	}
	switch (cmd) {
	case FUTEX_LOCK_PI:
	case FUTEX_UNLOCK_PI:
	case FUTEX_TRYLOCK_PI:
	case FUTEX_WAIT_REQUEUE_PI:
	case FUTEX_CMP_REQUEUE_PI:
		if (!futex_cmpxchg_enabled)
			return -ENOSYS;
	}
	switch (cmd) {
	case FUTEX_WAIT:
		val3 = FUTEX_BITSET_MATCH_ANY;
	case FUTEX_WAIT_BITSET:
		return futex_wait(uaddr, flags, val, timeout, val3);
	case FUTEX_WAKE:
		val3 = FUTEX_BITSET_MATCH_ANY;
	case FUTEX_WAKE_BITSET:
		return futex_wake(uaddr, flags, val, val3);
	case FUTEX_REQUEUE:
		return futex_requeue(uaddr, flags, uaddr2, val, val2, NULL, 0);
	case FUTEX_CMP_REQUEUE:
		return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 0);
	case FUTEX_WAKE_OP:
		return futex_wake_op(uaddr, flags, uaddr2, val, val2, val3);
	case FUTEX_LOCK_PI:
		return futex_lock_pi(uaddr, flags, val, timeout, 0);
	case FUTEX_UNLOCK_PI:
		return futex_unlock_pi(uaddr, flags);
	case FUTEX_TRYLOCK_PI:
		return futex_lock_pi(uaddr, flags, 0, timeout, 1);
	case FUTEX_WAIT_REQUEUE_PI:
		val3 = FUTEX_BITSET_MATCH_ANY;
		return futex_wait_requeue_pi(uaddr, flags, val, timeout, val3,
					     uaddr2);
	case FUTEX_CMP_REQUEUE_PI:
		return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 1);
	}
	return -ENOSYS;
}
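The first few lines of do_futex(...) split op into a command and two option bits. The same split can be reproduced in user space with the constants exported by <linux/futex.h>. This is an illustrative sketch only: the struct futex_op_parts and the function split_futex_op are hypothetical names, not part of any API.

```c
#include <linux/futex.h>
#include <stdbool.h>

/* Hypothetical container for the three pieces packed into op. */
struct futex_op_parts {
    int  cmd;             /* FUTEX_WAIT, FUTEX_WAKE, ... */
    bool shared;          /* true unless FUTEX_PRIVATE_FLAG is set */
    bool realtime_clock;  /* true if FUTEX_CLOCK_REALTIME is set */
};

/* Mirrors the masking done at the top of do_futex(). */
static struct futex_op_parts split_futex_op(int op)
{
    struct futex_op_parts parts;

    parts.cmd = op & FUTEX_CMD_MASK;
    parts.shared = !(op & FUTEX_PRIVATE_FLAG);
    parts.realtime_clock = (op & FUTEX_CLOCK_REALTIME) != 0;
    return parts;
}
```

For instance, FUTEX_WAIT_PRIVATE decodes to the command FUTEX_WAIT with shared set to false, which is why a process-private futex never gets FLAGS_SHARED in the kernel.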
The following sections analyze futex_wait(...) and futex_wake(...) in detail.
References:
http://www.tuicool.com/articles/feUR73
http://blog.csdn.net/jianchaolv/article/details/7544316