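Preface
OOM stands for Out Of Memory: an app is terminated because it uses too much memory. It is really a memory-management mechanism of the operating system. A phone has a limited amount of memory that cannot be used without bound, so when memory runs short the system kills lower-priority processes to free memory for higher-priority ones. That process kill is the OOM (the kernel code calls this mechanism jetsam).
So how exactly is an OOM triggered? The material currently available is fairly vague, and none of it gives a clear account.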
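Exploring the source code
Fortunately this part of xnu is open source: the full xnu kernel source can be downloaded from opensource.apple.com. The code that manages memory status lives mainly in kern_memorystatus.c (and its header, kern_memorystatus.h).
Priority queue
First, the system ranks processes by priority, and it maintains one system-wide priority queue.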
#define MEMSTAT_BUCKET_COUNT (JETSAM_PRIORITY_MAX + 1)
typedef struct memstat_bucket {
TAILQ_HEAD(, proc) list;
int count;
} memstat_bucket_t;
memstat_bucket_t memstat_bucket[MEMSTAT_BUCKET_COUNT];
kern_memorystatus.c defines a struct called memstat_bucket_t. The struct is simple: count is the number of processes at this priority, and list is a linked list that holds those processes. (A linked list is used because insertion and removal are cheap.)
After the memstat_bucket_t struct, the kernel defines an array of memstat_bucket_t that serves as the priority queue for the system's processes. Each priority maps to one memstat_bucket_t, which holds every process at that priority.
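To make the layout concrete, here is a minimal user-space sketch of the same idea: an array of buckets indexed by priority, each holding a count and a list. The names sketch_proc, sketch_bucket, sketch_insert, sketch_remove and sketch_set_priority are hypothetical, and a hand-rolled doubly linked list stands in for the kernel's TAILQ macros.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PRIORITY_MAX 21
#define SKETCH_BUCKET_COUNT (SKETCH_PRIORITY_MAX + 1)

/* Hypothetical stand-in for the kernel's struct proc. */
typedef struct sketch_proc {
    int pid;
    int priority;
    struct sketch_proc *prev, *next;
} sketch_proc;

/* One bucket per priority: a process count plus a doubly linked list. */
typedef struct {
    sketch_proc *head, *tail;
    int count;
} sketch_bucket;

sketch_bucket sketch_buckets[SKETCH_BUCKET_COUNT];

/* Append a process to the tail of the bucket for its priority: O(1). */
void sketch_insert(sketch_proc *p) {
    assert(p->priority >= 0 && p->priority < SKETCH_BUCKET_COUNT);
    sketch_bucket *b = &sketch_buckets[p->priority];
    p->next = NULL;
    p->prev = b->tail;
    if (b->tail) b->tail->next = p; else b->head = p;
    b->tail = p;
    b->count++;
}

/* Unlink a process from its bucket: also O(1), no traversal needed. */
void sketch_remove(sketch_proc *p) {
    sketch_bucket *b = &sketch_buckets[p->priority];
    if (p->prev) p->prev->next = p->next; else b->head = p->next;
    if (p->next) p->next->prev = p->prev; else b->tail = p->prev;
    p->prev = p->next = NULL;
    b->count--;
}

/* A priority change is a remove from one bucket plus an insert into another. */
void sketch_set_priority(sketch_proc *p, int new_priority) {
    sketch_remove(p);
    p->priority = new_priority;
    sketch_insert(p);
}
```

An app moving from foreground to background would, in this sketch, be a single sketch_set_priority call, which is why cheap insertion and removal matter here.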
kern_memorystatus.h defines the available priorities:
#define JETSAM_PRIORITY_REVISION 2
#define JETSAM_PRIORITY_IDLE_HEAD -2
/* The value -1 is an alias to JETSAM_PRIORITY_DEFAULT */
#define JETSAM_PRIORITY_IDLE 0
#define JETSAM_PRIORITY_IDLE_DEFERRED 1 /* Keeping this around till all xnu_quick_tests can be moved away from it.*/
#define JETSAM_PRIORITY_AGING_BAND1 JETSAM_PRIORITY_IDLE_DEFERRED
#define JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC 2
#define JETSAM_PRIORITY_AGING_BAND2 JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC
#define JETSAM_PRIORITY_BACKGROUND 3
#define JETSAM_PRIORITY_ELEVATED_INACTIVE JETSAM_PRIORITY_BACKGROUND
#define JETSAM_PRIORITY_MAIL 4
#define JETSAM_PRIORITY_PHONE 5
#define JETSAM_PRIORITY_UI_SUPPORT 8
#define JETSAM_PRIORITY_FOREGROUND_SUPPORT 9
#define JETSAM_PRIORITY_FOREGROUND 10
#define JETSAM_PRIORITY_AUDIO_AND_ACCESSORY 12
#define JETSAM_PRIORITY_CONDUCTOR 13
#define JETSAM_PRIORITY_HOME 16
#define JETSAM_PRIORITY_EXECUTIVE 17
#define JETSAM_PRIORITY_IMPORTANT 18
#define JETSAM_PRIORITY_CRITICAL 19
#define JETSAM_PRIORITY_MAX 21
/* TODO - tune. This should probably be lower priority */
#define JETSAM_PRIORITY_DEFAULT 18
#define JETSAM_PRIORITY_TELEPHONY 19
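As you can see, foreground is 10 and background is 3. When memory is tight, background processes are killed first. Normally the foreground process is killed only after every process with a lower priority than foreground has been killed and memory is still tight.
OOM types
There are currently 11 kill causes defined (counting the invalid placeholder):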
/* Cause */
enum {
kMemorystatusInvalid = JETSAM_REASON_INVALID,
kMemorystatusKilled = JETSAM_REASON_GENERIC,
kMemorystatusKilledHiwat = JETSAM_REASON_MEMORY_HIGHWATER, //high water
kMemorystatusKilledVnodes = JETSAM_REASON_VNODE, // vnode
kMemorystatusKilledVMPageShortage = JETSAM_REASON_MEMORY_VMPAGESHORTAGE, // vm page shortage
kMemorystatusKilledVMThrashing = JETSAM_REASON_MEMORY_VMTHRASHING, // vm thrashing
kMemorystatusKilledFCThrashing = JETSAM_REASON_MEMORY_FCTHRASHING, // fc thrashing
kMemorystatusKilledPerProcessLimit = JETSAM_REASON_MEMORY_PERPROCESSLIMIT, // per process limit
kMemorystatusKilledDiagnostic = JETSAM_REASON_MEMORY_DIAGNOSTIC, // diagnostic
kMemorystatusKilledIdleExit = JETSAM_REASON_MEMORY_IDLE_EXIT, // idle exit
kMemorystatusKilledZoneMapExhaustion = JETSAM_REASON_ZONE_MAP_EXHAUSTION // map exhaustion
};
Each type has a corresponding string that is written to the log:
/* For logging clarity */
static const char *memorystatus_kill_cause_name[] = {
"" ,
"jettisoned" , /* kMemorystatusKilled */
"highwater" , /* kMemorystatusKilledHiwat */
"vnode-limit" , /* kMemorystatusKilledVnodes */
"vm-pageshortage" , /* kMemorystatusKilledVMPageShortage */
"vm-thrashing" , /* kMemorystatusKilledVMThrashing */
"fc-thrashing" , /* kMemorystatusKilledFCThrashing */
"per-process-limit" , /* kMemorystatusKilledPerProcessLimit */
"diagnostic" , /* kMemorystatusKilledDiagnostic */
"idle-exit" , /* kMemorystatusKilledIdleExit */
"zone-map-exhaustion" , /* kMemorystatusKilledZoneMapExhaustion */
};
When our app triggers an OOM, the system writes a log to Settings -> Privacy -> Analytics -> Analytics Data on the phone, in a file named JetsamEvent-xxx. Open the file and the reason field states the OOM type.
Here is a JetsamEvent file from my phone:
...
"largestProcess" : "Boom",
"genCounter" : 23,
"processes" : [
{
"uuid" : "ebd916c8-96e7-3b8f-985d-027098a13fd6",
"states" : [
"daemon",
"idle"
],
"killDelta" : 1887,
"genCount" : 0,
"age" : 200706725,
"purgeable" : 0,
"fds" : 50,
"coalition" : 268,
"rpages" : 34,
"reason" : "vm-pageshortage",
"pid" : 2205,
"cpuTime" : 0.0030500000000000002,
"name" : "xpcproxy",
"lifetimeMax" : 79
},
...
Here we can see that the process using the most memory is Boom, and that the OOM type is vm-pageshortage.
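Going the other way, from a reason string in the log back to the kernel's table, is a simple lookup. The following is a minimal sketch that reuses the strings from memorystatus_kill_cause_name[]. The function name sketch_cause_from_reason is hypothetical, and the returned value is only the position in that table. Whether it equals the numeric JETSAM_REASON_* value depends on definitions not shown in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same strings, in the same order, as memorystatus_kill_cause_name[]. */
static const char *const sketch_cause_names[] = {
    "", "jettisoned", "highwater", "vnode-limit", "vm-pageshortage",
    "vm-thrashing", "fc-thrashing", "per-process-limit", "diagnostic",
    "idle-exit", "zone-map-exhaustion",
};

/* Return the table position for a "reason" string, or -1 if it is unknown. */
int sketch_cause_from_reason(const char *reason) {
    assert(reason != NULL);
    size_t n = sizeof(sketch_cause_names) / sizeof(sketch_cause_names[0]);
    for (size_t i = 1; i < n; i++) {   /* position 0 is the empty placeholder */
        if (strcmp(reason, sketch_cause_names[i]) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

A log-analysis tool could use a lookup like this to bucket JetsamEvent entries by cause before counting them.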
How an OOM is triggered
An OOM is normally triggered in one of two ways: synchronously or asynchronously. For example, this is how an OOM of type VMPageShortage is triggered:
boolean_t memorystatus_kill_on_VM_page_shortage(boolean_t async) {
if (async) {
return memorystatus_kill_process_async(-1, kMemorystatusKilledVMPageShortage);
} else {
os_reason_t jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_VMPAGESHORTAGE);
if (jetsam_reason == OS_REASON_NULL) {
printf("memorystatus_kill_on_VM_page_shortage -- sync: failed to allocate jetsam reason\n");
}
return memorystatus_kill_process_sync(-1, kMemorystatusKilledVMPageShortage, jetsam_reason);
}
}
The synchronous path is simple and blunt: it kills the process with the given pid. If the pid is -1, it kills the lowest-priority process in the priority queue. (If several processes share the same priority, the system sorts them by memory usage and kills the one using the most.)
static boolean_t
memorystatus_kill_process_sync(pid_t victim_pid, uint32_t cause, os_reason_t jetsam_reason) {
boolean_t res;
uint32_t errors = 0;
if (victim_pid == -1) {
/* No pid, so kill first process */
res = memorystatus_kill_top_process(TRUE, TRUE, cause, jetsam_reason, NULL, &errors);
} else {
res = memorystatus_kill_specific_process(victim_pid, cause, jetsam_reason);
}
if (errors) {
memorystatus_clear_errors();
}
return res;
}
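The victim-selection rule described for the pid == -1 case can be sketched as follows. This is an illustration of the rule as stated in this article (lowest priority first, ties broken by the largest memory use), not the kernel's sorting code. The type sketch_victim_proc and the function sketch_pick_victim are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process record; footprint is measured in pages. */
typedef struct {
    int pid;
    int priority;
    unsigned long footprint_pages;
} sketch_victim_proc;

/*
 * Return the index of the process to kill, or -1 if the table is empty.
 * Rule: the lowest priority wins; among equal priorities, the largest
 * footprint wins.
 */
int sketch_pick_victim(const sketch_victim_proc *procs, size_t n) {
    assert(procs != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0 ||
            procs[i].priority < procs[best].priority ||
            (procs[i].priority == procs[best].priority &&
             procs[i].footprint_pages > procs[best].footprint_pages)) {
            best = (int)i;
        }
    }
    return best;
}
```

With a foreground app at priority 10 and two background apps at priority 3, the larger of the two background apps is chosen, and the foreground app is chosen only when nothing lower remains.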
The asynchronous path instead sets a flag recording that memory is in trouble, then wakes the dedicated memory-management thread, which manages the memory state, triggers the OOM, kills some processes, and reclaims memory.
static boolean_t
memorystatus_kill_process_async(pid_t victim_pid, uint32_t cause) {
/*
* TODO: allow a general async path
*
* NOTE: If a new async kill cause is added, make sure to update memorystatus_thread() to
* add the appropriate exit reason code mapping.
*/
if ((victim_pid != -1) || (cause != kMemorystatusKilledVMPageShortage && cause != kMemorystatusKilledVMThrashing &&
cause != kMemorystatusKilledFCThrashing && cause != kMemorystatusKilledZoneMapExhaustion)) {
return FALSE;
}
kill_under_pressure_cause = cause;
memorystatus_thread_wake();
return TRUE;
}
The memory status thread
The system has a dedicated thread for managing memory status. When something is wrong with the memory state, or memory pressure is too high, it applies a set of policies to kill some apps and reclaim memory.
With some unrelated code removed, the memory status thread looks like this:
static void
memorystatus_thread(void *param __unused, wait_result_t wr __unused)
{
static boolean_t is_vm_privileged = FALSE;
boolean_t post_snapshot = FALSE;
uint32_t errors = 0;
uint32_t hwm_kill = 0;
boolean_t sort_flag = TRUE;
boolean_t corpse_list_purged = FALSE;
int jld_idle_kills = 0;
if (is_vm_privileged == FALSE) {
/* one-time initialization */
thread_wire(host_priv_self(), current_thread(), TRUE);
is_vm_privileged = TRUE;
if (vm_restricted_to_single_processor == TRUE)
thread_vm_bind_group_add();
thread_set_thread_name(current_thread(), "VM_memorystatus");
memorystatus_thread_block(0, memorystatus_thread);
}
// the actual memory-management loop
while (memorystatus_action_needed()) {
boolean_t killed;
int32_t priority;
uint32_t cause;
uint64_t jetsam_reason_code = JETSAM_REASON_INVALID;
os_reason_t jetsam_reason = OS_REASON_NULL;
cause = kill_under_pressure_cause;
switch (cause) {
case kMemorystatusKilledFCThrashing:
jetsam_reason_code = JETSAM_REASON_MEMORY_FCTHRASHING;
break;
case kMemorystatusKilledVMThrashing:
jetsam_reason_code = JETSAM_REASON_MEMORY_VMTHRASHING;
break;
case kMemorystatusKilledZoneMapExhaustion:
jetsam_reason_code = JETSAM_REASON_ZONE_MAP_EXHAUSTION;
break;
case kMemorystatusKilledVMPageShortage:
/* falls through */
default:
jetsam_reason_code = JETSAM_REASON_MEMORY_VMPAGESHORTAGE;
cause = kMemorystatusKilledVMPageShortage;
break;
}
/* trigger for HIGHWATER-type OOMs */
boolean_t is_critical = TRUE;
if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
if (is_critical == FALSE) {
/*
* For now, don't kill any other processes.
*/
break;
} else {
goto done;
}
}
jetsam_reason = os_reason_create(OS_REASON_JETSAM, jetsam_reason_code);
if (jetsam_reason == OS_REASON_NULL) {
printf("memorystatus_thread: failed to allocate jetsam reason\n");
}
// the core OOM trigger
if (memorystatus_act_aggressive(cause, jetsam_reason, &jld_idle_kills, &corpse_list_purged, &post_snapshot)) {
goto done;
}
os_reason_ref(jetsam_reason);
/* LRU: kill one process at the lowest priority */
killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
sort_flag = FALSE;
if (killed) {
/* Jetsam Loop Detection */
if (memorystatus_jld_enabled == TRUE) {
if ((priority == JETSAM_PRIORITY_IDLE) || (priority == system_procs_aging_band) || (priority == applications_aging_band)) {
jld_idle_kills++;
}
}
if ((priority >= JETSAM_PRIORITY_UI_SUPPORT) && (total_corpses_count() > 0) && (corpse_list_purged == FALSE)) {
task_purge_all_corpses();
corpse_list_purged = TRUE;
}
goto done;
}
if (memorystatus_avail_pages_below_critical()) {
/*
* Still under pressure and unable to kill a process - purge corpse memory
*/
if (total_corpses_count() > 0) {
task_purge_all_corpses();
corpse_list_purged = TRUE;
}
if (memorystatus_avail_pages_below_critical()) {
/*
* Still under pressure and unable to kill a process - panic
*/
panic("memorystatus_jetsam_thread: no victim! available pages:%llu\n", (uint64_t)memorystatus_available_pages);
}
}
done:
/*
* We do not want to over-kill when thrashing has been detected.
* To avoid that, we reset the flag here and notify the
* compressor.
*/
if (is_reason_thrashing(kill_under_pressure_cause)) {
kill_under_pressure_cause = 0;
#if CONFIG_JETSAM
vm_thrashing_jetsam_done();
#endif /* CONFIG_JETSAM */
} else if (is_reason_zone_map_exhaustion(kill_under_pressure_cause)) {
kill_under_pressure_cause = 0;
}
os_reason_free(jetsam_reason);
}
kill_under_pressure_cause = 0;
if (errors) {
memorystatus_clear_errors();
}
}
That is a lot of code, so let's go through it step by step.
Entry condition
The real core is the while (memorystatus_action_needed()) loop. memorystatus_action_needed is the key check that decides whether an OOM is triggered:
/* Does cause indicate vm or fc thrashing? */
static boolean_t
is_reason_thrashing(unsigned cause)
{
switch (cause) {
case kMemorystatusKilledVMThrashing:
case kMemorystatusKilledFCThrashing:
return TRUE;
default:
return FALSE;
}
}
/* Is the zone map almost full? */
static boolean_t
is_reason_zone_map_exhaustion(unsigned cause)
{
if (cause == kMemorystatusKilledZoneMapExhaustion)
return TRUE;
return FALSE;
}
static boolean_t
memorystatus_action_needed(void)
{
return (is_reason_thrashing(kill_under_pressure_cause) ||
is_reason_zone_map_exhaustion(kill_under_pressure_cause) ||
memorystatus_available_pages <= memorystatus_available_pages_pressure);
}
The loop is entered, and an OOM triggered, when kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or when the currently available memory, memorystatus_available_pages, is at or below the threshold memorystatus_available_pages_pressure.
high-water
Once inside the loop, execution first reaches memorystatus_act_on_hiwat_processes:
/* trigger for HIGHWATER-type OOMs */
boolean_t is_critical = TRUE;
if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
if (is_critical == FALSE) {
/*
* For now, don't kill any other processes.
*/
break;
} else {
goto done;
}
}
This is the key function for triggering HIGHWATER-type OOMs:
static boolean_t
memorystatus_act_on_hiwat_processes(uint32_t *errors, uint32_t *hwm_kill, boolean_t *post_snapshot, __unused boolean_t *is_critical)
{
boolean_t killed = memorystatus_kill_hiwat_proc(errors);
if (killed) {
*hwm_kill = *hwm_kill + 1;
*post_snapshot = TRUE;
return TRUE;
} else {
memorystatus_hwm_candidates = FALSE;
}
return FALSE;
}
memorystatus_act_on_hiwat_processes calls memorystatus_kill_hiwat_proc directly, and the core logic is all in memorystatus_kill_hiwat_proc.
static boolean_t
memorystatus_kill_hiwat_proc(uint32_t *errors)
{
pid_t aPid = 0;
proc_t p = PROC_NULL, next_p = PROC_NULL;
boolean_t new_snapshot = FALSE, killed = FALSE;
int kill_count = 0;
unsigned int i = 0;
uint32_t aPid_ep;
uint64_t killtime = 0;
clock_sec_t tv_sec;
clock_usec_t tv_usec;
uint32_t tv_msec;
os_reason_t jetsam_reason = OS_REASON_NULL;
jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_HIGHWATER);
proc_list_lock();
next_p = memorystatus_get_first_proc_locked(&i, TRUE);
while (next_p) {
uint64_t footprint_in_bytes = 0;
uint64_t memlimit_in_bytes = 0;
boolean_t skip = 0;
p = next_p;
next_p = memorystatus_get_next_proc_locked(&i, p, TRUE);
aPid = p->p_pid;
aPid_ep = p->p_memstat_effectivepriority;
if (p->p_memstat_state & (P_MEMSTAT_ERROR | P_MEMSTAT_TERMINATED)) {
continue;
}
/* skip if no limit set */
if (p->p_memstat_memlimit <= 0) {
continue;
}
footprint_in_bytes = get_task_phys_footprint(p->task);
memlimit_in_bytes = (((uint64_t)p->p_memstat_memlimit) * 1024ULL * 1024ULL); /* convert MB to bytes */
skip = (footprint_in_bytes <= memlimit_in_bytes);
#if CONFIG_FREEZE
if (!skip) {
if (p->p_memstat_state & P_MEMSTAT_LOCKED) {
skip = TRUE;
} else {
skip = FALSE;
}
}
#endif
if (skip) {
continue;
} else {
if (memorystatus_jetsam_snapshot_count == 0) {
p->p_memstat_state |= P_MEMSTAT_TERMINATED;
killtime = mach_absolute_time();
absolutetime_to_microtime(killtime, &tv_sec, &tv_usec);
tv_msec = tv_usec / 1000;
{
memorystatus_update_jetsam_snapshot_entry_locked(p, kMemorystatusKilledHiwat, killtime);
if (proc_ref_locked(p) == p) {
proc_list_unlock();
/*
* memorystatus_do_kill drops a reference, so take another one so we can
* continue to use this exit reason even after memorystatus_do_kill()
* returns
*/
os_reason_ref(jetsam_reason);
killed = memorystatus_do_kill(p, kMemorystatusKilledHiwat, jetsam_reason);
/* Success? */
if (killed) {
proc_rele(p);
kill_count++;
goto exit;
}
proc_list_lock();
proc_rele_locked(p);
p->p_memstat_state &= ~P_MEMSTAT_TERMINATED;
p->p_memstat_state |= P_MEMSTAT_ERROR;
*errors += 1;
}
i = 0;
next_p = memorystatus_get_first_proc_locked(&i, TRUE);
}
}
}
proc_list_unlock();
exit:
os_reason_free(jetsam_reason);
return killed;
}
It first calls memorystatus_get_first_proc_locked(&i, TRUE) to take the lowest-priority process from the priority queue. If that process's memory is within its limit (footprint_in_bytes <= memlimit_in_bytes), it is skipped and the next process is fetched with memorystatus_get_next_proc_locked. If its memory exceeds the limit, memorystatus_do_kill kills the process and the loop ends.
Note that the memory measure used here is phys_footprint. So far, though, I have never seen a high-water OOM among the OOM types on my own phone. My guess is that the high-water threshold is fairly high and hard to hit. Take a look at the OOM types on your own phone, and let me know if you find a high-water one.
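Stripped of locking, snapshots and error handling, the high-water scan reduces to the comparison below. This is a minimal sketch under stated assumptions: sketch_hiwat_proc and sketch_find_hiwat_victim are hypothetical names, the array is assumed to be ordered from lowest to highest priority, and footprint_bytes stands in for the value the kernel reads from the task's phys_footprint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int pid;
    int32_t memlimit_mb;       /* <= 0 means no limit is set */
    uint64_t footprint_bytes;  /* stand-in for the task's phys_footprint */
} sketch_hiwat_proc;

/*
 * Scan processes in the given order and return the pid of the first one
 * whose footprint exceeds its limit, or -1 if none does.
 */
int sketch_find_hiwat_victim(const sketch_hiwat_proc *procs, size_t n) {
    assert(procs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (procs[i].memlimit_mb <= 0) {
            continue;   /* no limit set: never a high-water victim */
        }
        /* convert MB to bytes, as the kernel code does */
        uint64_t limit_bytes = (uint64_t)procs[i].memlimit_mb * 1024ULL * 1024ULL;
        if (procs[i].footprint_bytes > limit_bytes) {
            return procs[i].pid;
        }
    }
    return -1;
}
```

A process sitting exactly at its limit is skipped, because the kernel's skip test is footprint_in_bytes <= memlimit_in_bytes.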
normal kill
If no process is over its high-water mark, execution continues into memorystatus_act_aggressive. This is the usual OOM trigger, and most OOMs happen here.
static boolean_t
memorystatus_act_aggressive(uint32_t cause, os_reason_t jetsam_reason, int *jld_idle_kills, boolean_t *corpse_list_purged, boolean_t *post_snapshot)
{
if (memorystatus_jld_enabled == TRUE) {
boolean_t killed;
uint32_t errors = 0;
/* Jetsam Loop Detection - locals */
memstat_bucket_t *bucket;
int jld_bucket_count = 0;
struct timeval jld_now_tstamp = {0,0};
uint64_t jld_now_msecs = 0;
int elevated_bucket_count = 0;
/* Jetsam Loop Detection - statics */
static uint64_t jld_timestamp_msecs = 0;
static int jld_idle_kill_candidates = 0; /* Number of available processes in band 0,1 at start */
static int jld_eval_aggressive_count = 0; /* Bumps the max priority in aggressive loop */
static int32_t jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;
microuptime(&jld_now_tstamp);
jld_now_msecs = (jld_now_tstamp.tv_sec * 1000);
proc_list_lock();
switch (jetsam_aging_policy) {
case kJetsamAgingPolicyLegacy:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
jld_bucket_count += bucket->count;
break;
case kJetsamAgingPolicySysProcsReclaimedFirst:
case kJetsamAgingPolicyAppsReclaimedFirst:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
bucket = &memstat_bucket[system_procs_aging_band];
jld_bucket_count += bucket->count;
bucket = &memstat_bucket[applications_aging_band];
jld_bucket_count += bucket->count;
break;
case kJetsamAgingPolicyNone:
default:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
break;
}
bucket = &memstat_bucket[JETSAM_PRIORITY_ELEVATED_INACTIVE];
elevated_bucket_count = bucket->count;
proc_list_unlock();
if ( (jld_bucket_count == 0) ||
(jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
jld_timestamp_msecs = jld_now_msecs;
// First reclaim the very low priority processes: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count becomes 0
jld_idle_kill_candidates = jld_bucket_count;
*jld_idle_kills = 0;
jld_eval_aggressive_count = 0;
jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;
}
// Normally the processes that can be reclaimed at any time (JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band) go first; this branch is entered only after they have been reclaimed
if (*jld_idle_kills > jld_idle_kill_candidates) {
jld_eval_aggressive_count++;
if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
(total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
task_purge_all_corpses();
*corpse_list_purged = TRUE;
}
else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
(memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {
} else {
jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
}
}
// kill background processes first
/* Visit elevated processes first */
while (elevated_bucket_count) {
elevated_bucket_count--;
os_reason_ref(jetsam_reason);
killed = memorystatus_kill_elevated_process(
cause,
jetsam_reason,
jld_eval_aggressive_count,
&errors);
if (killed) {
*post_snapshot = TRUE;
// still under pressure, so keep killing apps
if (memorystatus_avail_pages_below_pressure()) {
/*
* Still under pressure.
* Find another pinned processes.
*/
continue;
} else {
return TRUE;
}
} else {
break;
}
}
// then kill from the lowest remaining priority (this can reach the foreground app)
killed = memorystatus_kill_top_process_aggressive(
kMemorystatusKilledVMThrashing,
jld_eval_aggressive_count,
jld_priority_band_max,
&errors);
if (killed) {
/* Always generate logs after aggressive kill */
*post_snapshot = TRUE;
*jld_idle_kills = 0;
return TRUE;
}
}
return FALSE;
}
return FALSE;
}
There is a lot of logic here, so let's take it slowly.
First there is jld_bucket_count, the number of low-priority processes that can be killed outright.
switch (jetsam_aging_policy) {
case kJetsamAgingPolicyLegacy:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
jld_bucket_count += bucket->count;
break;
case kJetsamAgingPolicySysProcsReclaimedFirst:
case kJetsamAgingPolicyAppsReclaimedFirst:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
bucket = &memstat_bucket[system_procs_aging_band];
jld_bucket_count += bucket->count;
bucket = &memstat_bucket[applications_aging_band];
jld_bucket_count += bucket->count;
break;
case kJetsamAgingPolicyNone:
default:
bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
jld_bucket_count = bucket->count;
break;
}
if ( (jld_bucket_count == 0) ||
(jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
jld_timestamp_msecs = jld_now_msecs;
// First reclaim the very low priority processes: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count becomes 0
jld_idle_kill_candidates = jld_bucket_count;
*jld_idle_kills = 0;
jld_eval_aggressive_count = 0;
jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;
}
// Normally the processes that can be reclaimed at any time (JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band) go first; this branch is entered only after they have been reclaimed
if (*jld_idle_kills > jld_idle_kill_candidates) {
// this is where our app most often triggers an OOM
}
killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
if (killed) {
jld_idle_kills++;
}
jetsam_aging_policy determines which priority bands can be killed outright. Normally the kJetsamAgingPolicyAppsReclaimedFirst or kJetsamAgingPolicySysProcsReclaimedFirst case is taken, so jld_bucket_count is the sum of the process counts in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band buckets.
*jld_idle_kills is the number of low-priority processes already killed; each time one is killed, jld_idle_kills++. Because jld_idle_kill_candidates = jld_bucket_count, the condition if (*jld_idle_kills > jld_idle_kill_candidates) becomes true only after all of the jld_bucket_count low-priority processes mentioned above have been killed.
So when memory runs short, the system first reclaims processes in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band bands.
Now let's look inside that condition:
if (*jld_idle_kills > jld_idle_kill_candidates) {
jld_eval_aggressive_count++;
if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
(total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
task_purge_all_corpses();
*corpse_list_purged = TRUE;
}
else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
(memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {
} else {
jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
}
}
// kill background processes first
/* Visit elevated processes first */
while (elevated_bucket_count) {
elevated_bucket_count--;
os_reason_ref(jetsam_reason);
killed = memorystatus_kill_elevated_process(
cause,
jetsam_reason,
jld_eval_aggressive_count,
&errors);
if (killed) {
*post_snapshot = TRUE;
// still under pressure, so keep killing apps
if (memorystatus_avail_pages_below_pressure()) {
/*
* Still under pressure.
* Find another pinned processes.
*/
continue;
} else {
return TRUE;
}
} else {
break;
}
}
// then kill from the lowest remaining priority (this can reach the foreground app)
killed = memorystatus_kill_top_process_aggressive(
kMemorystatusKilledVMThrashing,
jld_eval_aggressive_count,
jld_priority_band_max,
&errors);
if (killed) {
/* Always generate logs after aggressive kill */
*post_snapshot = TRUE;
*jld_idle_kills = 0;
return TRUE;
}
}
It first kills background processes via memorystatus_kill_elevated_process. After each kill it checks memory pressure, again using memorystatus_available_pages:
static boolean_t memorystatus_avail_pages_below_pressure(void) {
return (memorystatus_available_pages <= memorystatus_available_pages_pressure);
}
If memorystatus_available_pages is still at or below the threshold, it kills the next process. Once all background processes have been killed and memory is still under pressure, memorystatus_kill_top_process_aggressive kills the lowest-priority remaining process. This is the key to FOOM (foreground OOM): if the foreground app is by then the lowest-priority process, a FOOM occurs and the foreground app is killed.
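The kill-then-recheck loop over background processes can be simulated with plain numbers. This is a sketch under stated assumptions: sketch_kill_background is a hypothetical name, and freed_pages[i] is an assumed amount of memory returned by killing process i (in the kernel the effect of a kill shows up through memorystatus_available_pages rather than being known in advance).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Kill background processes one at a time, re-checking pressure after each.
 * Returns true once available pages rise above the threshold; false means
 * every background process is gone and the caller would escalate to the
 * aggressive kill.
 */
bool sketch_kill_background(const unsigned *freed_pages, size_t n,
                            unsigned *avail_pages, unsigned pressure_pages,
                            size_t *killed) {
    assert(avail_pages != NULL && killed != NULL);
    *killed = 0;
    for (size_t i = 0; i < n; i++) {
        *avail_pages += freed_pages[i];   /* memory returned by this kill */
        (*killed)++;
        if (*avail_pages > pressure_pages) {
            return true;                  /* no longer <= threshold: stop */
        }
    }
    return false;
}
```

Landing exactly on the threshold still counts as pressure, because the kernel check is memorystatus_available_pages <= memorystatus_available_pages_pressure.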
Computing memorystatus_available_pages
Whether a FOOM is triggered mainly comes down to whether memorystatus_available_pages is below the threshold. So how is memorystatus_available_pages computed?
In the source we find:
#define VM_CHECK_MEMORYSTATUS do { \
memorystatus_pages_update( \
vm_page_pageable_external_count + \
vm_page_free_count + \
(VM_DYNAMIC_PAGING_ENABLED() ? 0 : vm_page_purgeable_count) \
); \
} while(0)
void memorystatus_pages_update(unsigned int pages_avail)
{
memorystatus_available_pages = pages_avail;
...
}
So memorystatus_available_pages = vm_page_pageable_external_count + vm_page_free_count + vm_page_purgeable_count, where the purgeable term is added only when VM_DYNAMIC_PAGING_ENABLED() is false.
- vm_page_pageable_external_count: on iOS, the number of pages that already have a backing copy and can be reclaimed when memory runs short
- vm_page_free_count: the number of unused pages
- vm_page_purgeable_count: the number of purgeable pages
In addition, memorystatus_available_pages_pressure works out to 15% of the device's total memory. In other words, an OOM is triggered once available memory falls to 15% of system memory or less.
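A small worked example puts numbers on this. The sketch below sums the counters the way the macro does and derives a threshold from the 15% figure quoted above. The function names are hypothetical, the 15% is taken from this article rather than computed the way the kernel computes it, and the 16 KiB page size in the usage note is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sum the page counters the way VM_CHECK_MEMORYSTATUS does. */
uint32_t sketch_available_pages(uint32_t pageable_external, uint32_t free_count,
                                uint32_t purgeable, bool dynamic_paging_enabled) {
    return pageable_external + free_count +
           (dynamic_paging_enabled ? 0 : purgeable);
}

/* Pressure threshold as 15% of total pages (the figure quoted in the text). */
uint32_t sketch_pressure_threshold(uint64_t total_bytes, uint32_t page_size) {
    assert(page_size > 0);
    uint64_t total_pages = total_bytes / page_size;
    return (uint32_t)(total_pages * 15 / 100);
}

/* Same comparison as memorystatus_avail_pages_below_pressure(). */
bool sketch_below_pressure(uint32_t avail_pages, uint32_t pressure_pages) {
    return avail_pages <= pressure_pages;
}
```

For a device with 2 GiB of memory and an assumed 16 KiB page size, there are 131072 pages in total, so the threshold is 19660 pages, or roughly 307 MiB.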
Logic summary
Taken as a whole, the logic of memorystatus_thread is:
1. If kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or if the currently available memory memorystatus_available_pages is at or below the threshold memorystatus_available_pages_pressure, enter the OOM logic
2. Walk every process and compare its phys_footprint with its limit; if a process is above its limit, kill it with type high-water, which triggers an OOM
3. If processes remain in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, or applications_aging_band buckets, kill one lowest-priority process, then go back to the check in step 1
4. Once all low-priority processes have been killed, if memorystatus_available_pages is still below the threshold, kill background processes first. After each kill, check whether memorystatus_available_pages is still below the threshold; once it no longer is, stop and go back to step 1
5. Once all background-priority processes have been killed, call memorystatus_kill_top_process_aggressive to kill the foreground process, then go back to step 1
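The steps above can be condensed into a single decision function that answers one question: given the current state, where would the next kill come from? This sketch follows the summary list rather than the kernel's exact control flow, and every name in it (sketch_step, sketch_snapshot, sketch_next_step) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_step {
    SKETCH_STEP_NONE,        /* no action needed */
    SKETCH_STEP_HIWAT,       /* kill a process over its memory limit */
    SKETCH_STEP_IDLE,        /* kill from the idle / aging bands */
    SKETCH_STEP_BACKGROUND,  /* kill a background process */
    SKETCH_STEP_AGGRESSIVE,  /* kill the lowest remaining process */
};

struct sketch_snapshot {
    bool thrashing_or_zone_exhaustion;  /* cause-based entry condition */
    unsigned avail_pages;
    unsigned pressure_pages;
    int over_limit_procs;   /* processes above their memory limit */
    int idle_procs;         /* processes left in the idle / aging bands */
    int background_procs;   /* processes left in the background band */
};

enum sketch_step sketch_next_step(const struct sketch_snapshot *s) {
    assert(s != NULL);
    /* Step 1: entry condition */
    if (!s->thrashing_or_zone_exhaustion && s->avail_pages > s->pressure_pages) {
        return SKETCH_STEP_NONE;
    }
    /* Step 2: high-water victims go first */
    if (s->over_limit_procs > 0) return SKETCH_STEP_HIWAT;
    /* Step 3: then the bands that can be reclaimed at any time */
    if (s->idle_procs > 0) return SKETCH_STEP_IDLE;
    /* Step 4: then background processes */
    if (s->background_procs > 0) return SKETCH_STEP_BACKGROUND;
    /* Step 5: nothing cheaper is left, so the foreground app is at risk */
    return SKETCH_STEP_AGGRESSIVE;
}
```

Calling it repeatedly while decrementing the matching counter after each kill walks through the same order as the list: high-water, idle, background, and finally the aggressive kill.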
Summary
According to the source, there are three ways a foreground OOM can happen:
- A synchronous kill triggered directly, such as an OOM of type kMemorystatusKilledPerProcessLimit. Explaining it would take another article, so it is outside the scope of this one
- footprint_in_bytes > memlimit_in_bytes, which triggers a high-water OOM. So far I have not seen this type on my own phone
- After all background processes have been killed, memorystatus_available_pages <= memorystatus_available_pages_pressure still holds, so the system kills our app
Monitoring and fixing OOMs is still a hard problem in the iOS world. For most apps the OOM rate is probably well above the crash rate, because crash monitoring already has very mature solutions (you fix the crash from its stack trace), while OOM monitoring still has a long way to go.