ANR問題分析的一般套路

TIM截圖20200604083317.png

ANR(App Not Responding)基本上99%的App都有疾捍,即使是系統(tǒng),也有system_anr。

一、ANR初步了解

1异赫、發(fā)生原因

一句話總結(jié):沒有在規(guī)定的時(shí)間內(nèi),干完要干的事情头岔,就會(huì)發(fā)生ANR塔拳。

2、ANR分類

從發(fā)生的場景分類:

  • Input事件超過5s沒有被處理完
  • Service處理超時(shí)峡竣,前臺(tái)20s靠抑,后臺(tái)200s
  • BroadcastReceiver處理超時(shí),前臺(tái)10S适掰,后臺(tái)60s
  • ContentProvider執(zhí)行超時(shí)颂碧,比較少見
    從發(fā)生的原因分:
  • 主線程有耗時(shí)操作,如有復(fù)雜的layout布局类浪,IO操作等载城。
  • 被Binder對(duì)端block
  • 被子線程同步鎖block
  • Binder被占滿導(dǎo)致主線程無法和SystemServer通信
  • 得不到系統(tǒng)資源(CPU/RAM/IO)
    從進(jìn)程的角度分:
  • 問題出在當(dāng)前進(jìn)程:
  • 主線程本身耗時(shí), 或則主線程的消息隊(duì)列存在耗時(shí)操作;
  • 主線程被本進(jìn)程的其他子線程所blocked;
  • 問題出在遠(yuǎn)端進(jìn)程(一般是binder call或socket等通信方式)

二、ANR的Log解讀

2.1费就、Log獲取

發(fā)生了ANR問題诉瓦,通常會(huì)抓一份bugreport

adb bugreprot  xxx

最為重要的是,生成的bugreport有anr的trace力细,如果要單獨(dú)拿出來也行

adb pull /data/anr/traces.txt  xxx

一份完整的bugreport包含下面的信息睬澡,對(duì)分析ANR問題很關(guān)鍵

Log名稱 作用 獲取命令
system.log 包含ANR發(fā)生時(shí)間點(diǎn)信息、ANR發(fā)生前的CPU信息,還包含大量系統(tǒng)服務(wù)輸出的信息 adb logcat –b system
main.log 包含ANR發(fā)生前應(yīng)用自身輸出的信息,可供分析應(yīng)用是否有異常;此外還包含輸出的GC信息眠蚂,可供分析內(nèi)存回收的速度,判斷系統(tǒng)是否處于低內(nèi)存或內(nèi)存碎片化狀態(tài) adb logcat –b main
event.log 包含AMS與WMS輸出的應(yīng)用程序聲明周期信息,可供分析窗口創(chuàng)建速度以及焦點(diǎn)轉(zhuǎn)換情況 adb logcat –b  event
kernel.log 包含kernel打出的信息,LowMemoryKiller殺進(jìn)程煞聪、內(nèi)存碎片化或內(nèi)存不足,mmc驅(qū)動(dòng)異常都可以在這里找到。

2.1河狐、案例一:sp耗時(shí)問題導(dǎo)致應(yīng)用ANR

一般先搜索ANR in獲取最直觀的信息米绕,如下:

06-16 16:16:28.590  1853  2073 E ActivityManager:**ANR in** com.android.camera (com.android.camera/.Camera)
06-16 16:16:28.590  1853  2073 E ActivityManager: PID: 27661
06-16 16:16:28.590  1853  2073 E ActivityManager: Reason: Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.)
06-16 16:16:28.590  1853  2073 E ActivityManager: Load: 16.25 / 29.48 / 38.33
06-16 16:16:28.590  1853  2073 E ActivityManager: CPU usage from 0ms to 8058ms later:
06-16 16:16:28.590  1853  2073 E ActivityManager:   58% 291/mediaserver: 51% user + 6.7% kernel / faults: 2457 minor 4 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   27% 317/mm-qcamera-daemon: 21% user + 5.8% kernel / faults: 15965 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 288/debuggerd: 0% user + 0.3% kernel / faults: 21615 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   17% 27661/com.android.camera: 10% user + 6.8% kernel / faults: 2412 minor 34 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   16% 1853/system_server: 10% user + 6.4% kernel / faults: 1754 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   10% 539/sensors.qcom: 7.8% user + 2.6% kernel / faults: 16 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   4.4% 277/surfaceflinger: 1.8% user + 2.6% kernel / faults: 14 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   4% 203/mmcqd/0: 0% user + 4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   2.6% 3510/com.android.phone: 1.9% user + 0.6% kernel / faults: 1148 minor 8 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   2.1% 2902/com.android.systemui: 1.6% user + 0.4% kernel / faults: 1272 minor 32 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.6% 3110/com.miui.whetstone: 1.6% user + 0% kernel / faults: 2614 minor 22 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 99/kswapd0: 0% user + 0.8% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.4% 217/jbd2/mmcblk0p25: 0% user + 1.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.4% 223/logd: 0.7% user + 0.7% kernel / faults: 4 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.9% 12808/kworker/0:1: 0% user + 0.9% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 35/kworker/u:2: 0% user + 0.8% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 3222/com.miui.sysbase: 0% user + 0% kernel / faults: 1314 minor 12 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 3446/com.android.nfc: 0.4% user + 0.3% kernel / faults: 1223 minor 9 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.7% 10866/kworker/u:1: 0% user + 0.7% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.6% 642/mdss_fb0: 0% user + 0.6% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.6% 29336/kworker/u:7: 0% user + 0.6% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 6/kworker/u:0: 0% user + 0.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 22924/kworker/u:6: 0% user + 0.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.3% 4421/mpdecision: 0% user + 0.3% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.2% 276/servicemanager: 0.1% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.2% 289/rild: 0.2% user + 0% kernel / faults: 20 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 4161/mcd: 0% user + 0% kernel / faults: 9 minor 1 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 3/ksoftirqd/0: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 5/kworker/0:0H: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 7/kworker/u:0H: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 215/flush-179:0: 0% user + 0% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 321/displayfeature: 0.1% user + 0% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 368/irq/33-cpubw_hw: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 403/qmuxd: 0% user + 0.1% kernel / faults: 60 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 3491/com.xiaomi.finddevice: 0% user + 0% kernel / faults: 706 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 29330/ksoftirqd/1: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager: 96% TOTAL: 56% user + 29% kernel + 6.3% iowait + 4.1% softirq

遇到ANR問題瑟捣,擺在我們面前的trace是不是第一案發(fā)現(xiàn)場,如果ANR發(fā)生的輸出的信息很多栅干,當(dāng)時(shí)的CPU和I/O資源比較緊張迈套,那么這段日志輸出的時(shí)間點(diǎn)可能會(huì)延遲10秒到20秒都有可能,所以我們有時(shí)候需要提高警惕碱鳞,現(xiàn)在舉例逐行解讀一下:

06-16 16:16:28.590  1853  2073 E ActivityManager: ANR in com.android.camera (com.android.camera/.Camera)

這一行得知ANR發(fā)生的時(shí)間是06-16 16:16:28.590桑李,發(fā)生的進(jìn)程是com.android.camera ,具體在com.android.camera/.Camera窿给,其中1853是systemserver的pid贵白,2073是ActivityManager線程的pid,ActivityManager是一個(gè)系統(tǒng)線程崩泡。
其實(shí)在Events log中也有對(duì)應(yīng)的信息禁荒,搜索關(guān)鍵字am_anr

06-16 16:16:20.536  1853  2073 I am_anr  : [0,27661,com.android.camera,952745541,Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.)]

由此同樣可以確定ANR的時(shí)間點(diǎn),類型角撞,進(jìn)程pid呛伴,進(jìn)程名稱等,繼續(xù)看下一行

06-16 16:16:28.590  1853  2073 E ActivityManager: PID: 27661

這一行得知ANR進(jìn)程的pid是27661谒所,特殊情況热康,如果pid為0,說明在發(fā)生ANR之前劣领,這個(gè)進(jìn)程就被LowMemoryKiller殺死了或者出現(xiàn)了Crash姐军,這種情況下,是無法接收到系統(tǒng)的廣播或者按鍵消息的尖淘,故出現(xiàn)ANR

06-16 16:16:28.590  1853  2073 E ActivityManager: Reason: Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.)

這一行得知ANR發(fā)生的原因是Input dispatching timed out

06-16 16:16:28.590  1853  2073 E ActivityManager: Load: 16.25 / 29.48 / 38.33

這行得知Cpu的負(fù)載奕锌,在Linux操作系統(tǒng)上,輸入uptime也能得到一段時(shí)間的負(fù)載村生。

wangjing@wangjing-OptiPlex-7050:~$ uptime
20:09:54 up 71 days, 10:48,  1 user,  load average: 0.99, 0.78, 0.86

那么負(fù)載是什么意思呢歇攻?Load后面的三個(gè)數(shù)字的意思分別是1分鐘、5分鐘梆造、15分鐘內(nèi)系統(tǒng)的平均負(fù)荷。

當(dāng)CPU完全空閑的時(shí)候葬毫,平均負(fù)荷為0镇辉;當(dāng)CPU工作量飽和的時(shí)候,平均負(fù)荷為1贴捡,通過Load可以判斷系統(tǒng)負(fù)荷是否過重忽肛。

有一個(gè)形象的比喻:個(gè)CPU想象成一座大橋,橋上只有一根車道烂斋,所有車輛都必須從這根車道上通過屹逛,系統(tǒng)負(fù)荷為0础废,意味著大橋上一輛車也沒有,系統(tǒng)負(fù)荷為0.5罕模,意味著大橋一半的路段有車评腺,系統(tǒng)負(fù)荷為1.0,意味著大橋的所有路段都有車淑掌,也就是說大橋已經(jīng)"滿"了蒿讥,系統(tǒng)負(fù)荷為2.0,意味著車輛太多了抛腕,大橋已經(jīng)被占滿了(100%)芋绸,后面等著上橋的車輛還有一倍。大橋的通行能力担敌,就是CPU的最大工作量摔敛;橋梁上的車輛,就是一個(gè)個(gè)等待CPU處理的進(jìn)程(process)全封。

經(jīng)驗(yàn)法則是這樣的
當(dāng)系統(tǒng)負(fù)荷持續(xù)大于0.7马昙,你必須開始調(diào)查了,問題出在哪里售貌,防止情況惡化给猾。
當(dāng)系統(tǒng)負(fù)荷持續(xù)大于1.0,你必須動(dòng)手尋找解決辦法颂跨,把這個(gè)值降下來敢伸。
當(dāng)系統(tǒng)負(fù)荷達(dá)到5.0,就表明你的系統(tǒng)有很嚴(yán)重的問題

而我們現(xiàn)在的手機(jī)是多核CPU架構(gòu)恒削,八核的多的是池颈,意味著Cpu處理的能力就乘以了8,每個(gè)核運(yùn)行的時(shí)間可以從下面的文件中得到钓丰,/sys/devices/system/cpu/cpu%d/cpufreq/stats/time_in_state
中讀取的躯砰,%d代表是CPU的核。文件中記錄了 CPU 從開機(jī)到讀取文件時(shí)携丁,在各個(gè)頻率下的運(yùn)行時(shí)間琢歇,單位:10 mS。

使用adb shell cat /sys/devices/system/cpu/cpu1/cpufreq/stats/time_in_state查看
頻度       時(shí)間
652800 1813593
1036800 46484
1401600 521974
1689600 2956667
1843200 83065
1958400 53516
2016000 251693

關(guān)于負(fù)荷詳細(xì)的可以看理解Linux系統(tǒng)負(fù)荷梦鉴,不過多擴(kuò)展李茫。

06-16 16:16:28.590  1853  2073 E ActivityManager: CPU usage from 0ms to 8058ms later:
06-16 16:16:28.590  1853  2073 E ActivityManager:   58% 291/mediaserver: 51% user + 6.7% kernel / faults: 2457 minor 4 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   27% 317/mm-qcamera-daemon: 21% user + 5.8% kernel / faults: 15965 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 288/debuggerd: 0% user + 0.3% kernel / faults: 21615 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   17% 27661/com.android.camera: 10% user + 6.8% kernel / faults: 2412 minor 34 major
....
06-16 16:16:28.590  1853  2073 E ActivityManager: 96% TOTAL: 56% user + 29% kernel + 6.3% iowait + 4.1% softirq
.....

這一段日志可以得到ANR發(fā)生的時(shí)候,Top進(jìn)程的Cpu占用情況肥橙,user代表是用戶空間魄宏,kernel是內(nèi)核空間,一般的有如下的規(guī)律存筏。

  • kswapd0 cpu占用率偏高宠互,系統(tǒng)整體運(yùn)行會(huì)緩慢味榛,從而引起各種ANR。把問題轉(zhuǎn)給"內(nèi)存優(yōu)化"予跌,請(qǐng)他們進(jìn)行優(yōu)化搏色。

  • logd CPU占用率偏高,也會(huì)引起系統(tǒng)卡頓和ANR匕得,因?yàn)楦鱾€(gè)進(jìn)程輸出LOG的操作被阻塞從而執(zhí)行的極為緩慢

  • Vold占用CPU過高继榆,會(huì)引起系統(tǒng)卡頓和ANR,請(qǐng)負(fù)責(zé)存儲(chǔ)的同學(xué)先調(diào)查

  • qcom.sensor CPU占用率過高汁掠,會(huì)引起卡頓略吨,請(qǐng)系統(tǒng)同學(xué)調(diào)查

  • 應(yīng)用自身CPU占用率較高,高概率應(yīng)用自身問題

  • 系統(tǒng)CPU占用率不高考阱,但主線程在等待一個(gè)鎖翠忠,高概率應(yīng)用自身問題

  • 應(yīng)用處于D狀態(tài),發(fā)生ANR乞榨,如果最后的操作是refriger秽之,那么是應(yīng)用被凍結(jié)了,正常情況下是功耗優(yōu)化引起的
    好了吃既,通過上面的日志我們得到了ANR的基本信息考榨,要得到阻塞的地方,還要靠trace文件鹦倚。一般都在anr目錄下河质。在這個(gè)trace文件中搜索主線程的堆棧,如下:

----- pid 27661 at 2017-06-16 16:16:20 -----
Cmd line: com.android.camera
"main" prio=5 tid=1 Waiting
 | group="main" sCount=1 dsCount=0 obj=0x75a4b5c8 self=0xb4cf6500
 | sysTid=27661 nice=-10 cgrp=default sched=0/0 handle=0xb6f6cb34
 | state=S schedstat=( 11242036155 8689191757 38520 ) utm=895 stm=229 core=0 HZ=100
 | stack=0xbe4ea000-0xbe4ec000 stackSize=8MB
 | held mutexes=
 at java.lang.Object.wait!(Native method)
 - waiting on <0x09e6a059> (a java.lang.Object)
 at java.lang.Thread.parkFor$(Thread.java:1220)
 - locked <0x09e6a059> (a java.lang.Object)
 at sun.misc.Unsafe.park(Unsafe.java:299)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:810)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:970)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1278)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:203)
 at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:366)
 at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
 at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3605)
 at android.app.ActivityThread.access$1300(ActivityThread.java:153)
 at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1399)
 at android.os.Handler.dispatchMessage(Handler.java:102)
 at android.os.Looper.loop(Looper.java:154)
 at android.app.ActivityThread.main(ActivityThread.java:5528)
 at java.lang.reflect.Method.invoke!(Native method)
 at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:740)
 at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:630)

解讀一下部分字段的含義

字段 含義
tid=1 線程號(hào)
sysTid=27661 主線程的線程號(hào)和進(jìn)程號(hào)相同
Waiting 線程狀態(tài)震叙,其中state也是線程狀態(tài)掀鹅,如果state=D代表底層被blocked了。
nice nice值越小媒楼,則優(yōu)先級(jí)越高乐尊。因?yàn)槭侵骶€程此處nice=-10, 可以看到優(yōu)先級(jí)很高了
schedstat 括號(hào)中的3個(gè)數(shù)字,依次是Running, Runable, Switch划址,Running時(shí)間扔嵌。Running時(shí)間:CPU運(yùn)行的時(shí)間,單位ns夺颤。  Runable時(shí)間:RQ隊(duì)列的等待時(shí)間对人,單位ns》鞴玻  Switch次數(shù):CPU調(diào)度切換次數(shù)
utm 該線程在用戶態(tài)所執(zhí)行的時(shí)間,單位是jiffies
stm 該線程在內(nèi)核態(tài)所執(zhí)行的時(shí)間姻几,單位是jiffies
sCount 此線程被掛起的次數(shù)
dsCount 線程被調(diào)試器掛起的次數(shù)宜狐,當(dāng)一個(gè)進(jìn)程被調(diào)試后势告,sCount會(huì)重置為0,調(diào)試完畢后sCount會(huì)根據(jù)是否被正常掛起增長抚恒,但是dsCount不會(huì)被重置為0咱台,所以dsCount也可以用來判斷這個(gè)線程是否被調(diào)試過
self 線程本身的地址

在說一下線程的狀態(tài)
|狀態(tài)|值|說明|
|THREAD_ZOMBIE|0|TERMINATED|
|THREAD_RUNNING| 1 |RUNNABLE or running now|
|THREAD_TIMED_WAIT| 2 |TIMED_WAITING in Object.wait()|
|THREAD_MONITOR| 3 |BLOCKED on a monitor|
|THREAD_INITIALIZING| 5 |allocated not yet running|
|THREAD_STARTING |6 |started not yet on thread list|
|THREAD_NATIVE| 7 |off in a JNI native method|
|THREAD_VMWAIT |8 |waiting on a VM resource|
|THREAD_SUSPENDED| 9 |suspended usually by GC or debugger|
那么這個(gè)問題怎么搞呢,通過上面的基礎(chǔ)介紹與trace文件俭驮,我們知道回溺,blocked點(diǎn)是

 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:203)
 at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:366)
 at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
 at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3605)
 at android.app.ActivityThread.access$1300(ActivityThread.java:153)

先來看QueuedWork.waitToFinish

77    /**
78     * Finishes or waits for async operations to complete.
79     * (e.g. SharedPreferences$Editor#startCommit writes)
80     *
81     * Is called from the Activity base class's onPause(), after
82     * BroadcastReceiver's onReceive, after Service command handling,
83     * etc.  (so async work is never lost)
84     */
85    public static void waitToFinish() {
86        Runnable toFinish;
     //等待所有等待完成的任務(wù)完成
87        while ((toFinish = sPendingWorkFinishers.poll()) != null) {
88            toFinish.run();
89        }
90    }

QueuedWork.waitToFinish會(huì)在Activity的onPause或者BroadcastReceiver的onReceive之后被調(diào)用,目的是確保異步任務(wù)執(zhí)行完成.在waitToFinish中遍歷sPendingWorkFinishers所有等待完成的任務(wù)混萝,并等待他們的完成遗遵。在來看SharedPreferencesImpl.apply,這個(gè)方法里面會(huì)將等待寫入到文件系統(tǒng)的任務(wù)放到QueuedWork的等待完成隊(duì)列里

361        public void apply() {
362            final MemoryCommitResult mcr = commitToMemory();
363            final Runnable awaitCommit = new Runnable() {
364                    public void run() {
365                        try {
366                            mcr.writtenToDiskLatch.await();
367                        } catch (InterruptedException ignored) {
368                        }
369                    }
370                };
371
     //將等待寫入到文件系統(tǒng)的任務(wù)放到QueuedWork的等待隊(duì)列中
372            QueuedWork.add(awaitCommit);
373            ... ... ... ... ... ...
388        }

雖然apply方法本身可以很快返回逸嘀,但是當(dāng)Activity的onPause被調(diào)用時(shí)车要,會(huì)等待寫入到文件系統(tǒng)的任務(wù)完成.也就是說,雖然apply本身不會(huì)阻塞調(diào)用線程崭倘,但是會(huì)將等待時(shí)間轉(zhuǎn)嫁到主線程.因此翼岁,如果寫入任務(wù)執(zhí)行比較慢,activity, service, broadcast在生命周期結(jié)束時(shí), sp操作沒有完成,就會(huì)阻塞主線程造成ANR司光。分析到這里琅坡,明顯是個(gè)系統(tǒng)問題了,而App也是無能為力残家,
一般的看trace有如下規(guī)律:

  • 發(fā)生ANR時(shí)榆俺,trace中找不到相應(yīng)進(jìn)程,檢查一下Android Runtime是否因?yàn)閼?yīng)用的崩潰給ShutDown了跪削,如果ShutDown了谴仙,此時(shí)要去查ShutDown的原因。
  • 應(yīng)用發(fā)生ANR碾盐,如果主線程正在執(zhí)行g(shù)etContentProvider晃跺,那么它正在請(qǐng)求另一個(gè)應(yīng)用的ContentProvider,此時(shí)要查一下目標(biāo)ContentProvider的宿主進(jìn)程毫玖,看看正在做什么
  • 主線程執(zhí)行數(shù)據(jù)庫操作或網(wǎng)絡(luò)請(qǐng)求掀虎,應(yīng)該是應(yīng)用自身問題
  • 主線程等待其他線程持有的鎖,而目標(biāo)線程執(zhí)行數(shù)據(jù)庫操作或網(wǎng)絡(luò)請(qǐng)求付枫,那么是應(yīng)用自身問題烹玉。

這里只是先搞一個(gè)案例熟悉一下ANR分析基本流程。到這里總結(jié)一下上面的套路:

  • 根據(jù)進(jìn)程尋找主線程的trace阐滩,發(fā)現(xiàn)被blocked的地方
  • 結(jié)合源碼進(jìn)行分析解決

當(dāng)然通過這兩個(gè)步驟就定位到ANR發(fā)生的原因二打,說明我們的運(yùn)氣比較好,然而大多數(shù)時(shí)候不是這個(gè)樣子的

上面我們分析了一個(gè)系統(tǒng)問題導(dǎo)致的ANR掂榔,這里你可能會(huì)想继效,我的app什么活都沒有干症杏,竟然發(fā)生ANR了,以后可以把鍋直接甩給系統(tǒng)了瑞信,非也厉颤,具體問題還是要具體分析,懷疑系統(tǒng)凡简,我們要有證據(jù)逼友,證據(jù)從哪里來,還是從Log里面來秤涩,繼續(xù)看第三節(jié)帜乞,系統(tǒng)耗時(shí)分析方案

三、系統(tǒng)耗時(shí)分析方案

系統(tǒng)做了一些耗時(shí)分析的操作溉仑,在一些手機(jī)廠商中挖函,Log里面還有其他的加強(qiáng),這里列舉比較通用的一些

1.binder_sample

  • A.功能說明: 監(jiān)控每個(gè)進(jìn)程的主線程的binder transaction的耗時(shí)情況, 當(dāng)超過閾值時(shí),則輸出相應(yīng)的目標(biāo)調(diào)用信息浊竟,默認(rèn)1000ms打開怨喘。
  • B.log格式: 52004 binder_sample (descriptor|3),(method_num|1|5),(time|1|3),(blocking_package|3),(sample_percent|1|6)
  • C.log實(shí)例:
2754 2754 I binder_sample: [android.app.IActivityManager,35,2900,android.process.media,5]

從上面的log中可以得出
1.主線程2754;
2.執(zhí)行android.app.IActivityManager接口
3.所對(duì)應(yīng)方法code =35(即STOP_SERVICE_TRANSACTION)4.所花費(fèi)時(shí)間為2900ms.
5.該block所在package為 android.process.media,最后一個(gè)參數(shù)是sample比例(沒有太大價(jià)值)

2振定、dvm_lock_sample

  • A.功能說明: 當(dāng)某個(gè)線程等待lock的時(shí)間blocked超過閾值,則輸出當(dāng)前的持鎖狀態(tài) ;
  • B.log格式: 20003 dvm_lock_sample (process|3),(main|1|5),(thread|3),(time|1|3),(file|3),(line|1|5),(ownerfile|3),(ownerline|1|5),(sample_percent|1|6)
  • C.log實(shí)例:
dvm_lock_sample: [system_server,1,Binder_9,1500,ActivityManagerService.java,6403,-,1448,0]

意思是system_server: Binder_9,執(zhí)行到ActivityManagerService.java的6403行代碼,一直在等待AMS鎖, 而該鎖所同一文件的1448行代碼所持有, 從而導(dǎo)致Binder_9線程被阻塞1500ms.

3必怜、 binder starved

  • A.功能說明: 當(dāng)system_server等進(jìn)程的線程池使用完, 無空閑線程時(shí), 則binder通信都處于饑餓狀態(tài), 則饑餓狀態(tài)超過一定閾值則輸出信息;
  • B.云控參數(shù): persist.sys.binder.starvation (默認(rèn)值16ms)
  • C.log實(shí)例:
1232 1232 "binder thread pool (16 threads) starved for 100 ms"
  • D.log解析: system_server進(jìn)程的 線程池已滿的持續(xù)長達(dá)100ms

一般有了這些信息之后,可以輔助我們確定問題原因歸屬是系統(tǒng)原因還是App原因后频,看下面的案例二:

4梳庆、案例二:瘋狂Binder Call導(dǎo)致應(yīng)用ANR

搜索ANR in

08-28 18:54:00.110  1000  1825  1848 E ActivityManager: ANR in com.jeejen.family (com.jeejen.family/com.jeejen.home.launcher.ShoppingActivity)
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: PID: 20576
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Reason: Input dispatching timed out (com.jeejen.family/com.jeejen.home.launcher.WelcomeActivity, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 2.  Wait queue head age: 10064.4ms.)
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Parent: com.jeejen.family/com.jeejen.home.launcher.WelcomeActivity
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Load: 1.25 / 1.1 / 1.37
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: CPU usage from 5166ms to 0ms ago (2018-08-28 18:53:51.270 to 2018-08-28 18:53:56.436):
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   7.7% 1825/system_server: 5.6% user + 2.1% kernel / faults: 1329 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   3.6% 20683/com.jeejen.family:pushcenter_pushservice: 3% user + 0.5% kernel / faults: 542 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   2.7% 4114/cnss_diag: 1.9% user + 0.7% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   2.1% 422/kworker/u16:7: 0% user + 2.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.9% 20830/com.jeejen.family:store: 1.3% user + 0.5% kernel / faults: 199 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.7% 20608/com.jeejen.family:pushcenter: 1.1% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.5% 725/android.hardware.sensors@1.0-service: 0.7% user + 0.7% kernel / faults: 1 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.9% 3538/com.android.systemui: 0.7% user + 0.1% kernel / faults: 11 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 241/crtc_commit:111: 0% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 419/kworker/u16:4: 0% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 786/surfaceflinger: 0.5% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 185/IPCRTR_dsps_sme: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 730/android.hardware.wifi@1.0-service: 0.1% user + 0.1% kernel / faults: 28 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 820/dsps_IPCRTR: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 1147/msm_irqbalance: 0.1% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 4113/sugov:0: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 10/rcuop/0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 18/ksoftirqd/1: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 34/ksoftirqd/3: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 53/rcuop/5: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 61/rcuop/6: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 242/crtc_event:111: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 538/ueventd: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 577/jbd2/sda22-8: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 591/logd: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 719/android.hardware.graphics.composer@2.1-service: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 928/thermal-engine: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3490/cds_mc_thread: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3491/cds_ol_rx_threa: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3680/com.android.phone: 0% user + 0.1% kernel / faults: 16 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 4248/com.miui.daemon: 0.1% user + 0% kernel / faults: 4 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 4488/com.miui.powerkeeper: 0.1% user + 0% kernel / faults: 10 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 5545/com.lbe.security.miui: 0% user + 0.1% kernel / faults: 6 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 6490/kworker/u17:2: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 7535/kworker/u16:15: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 7723/kworker/3:5: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 15111/kworker/1:0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 15138/kworker/3:0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 19857/kworker/0:3: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 20492/kworker/5:3: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: 3.8% TOTAL: 2% user + 1.1% kernel + 0% iowait + 0.3% irq + 0.1% softirq

按照上面的套路看看,各方面比較正常卑惜,發(fā)生時(shí)間大概是08-28 18:54:00.110 膏执,在看主線程的trace。

----- pid 20576 at 2018-08-28 18:53:56 -----
Cmd line: com.jeejen.family
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x77ffca18 self=0xecfce000
| sysTid=20576 nice=-10 cgrp=default sched=0/0 handle=0xf0bf2494
| state=S schedstat=( 628294395 402363898 957 ) utm=42 stm=20 core=4 HZ=100
| stack=0xff5fe000-0xff600000 stackSize=8MB
| held mutexes=
kernel: (couldn't read /proc/self/task/20576/stack)
native: #00 pc 00053cfc /system/lib/libc.so (__ioctl+8)
native: #01 pc 00021cd3 /system/lib/libc.so (ioctl+30)
native: #02 pc 0003d3f5 /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+204)
native: #03 pc 0003dde3 /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+26)
native: #04 pc 0003713d /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+36)
native: #05 pc 000c3cf1 /system/lib/libandroid_runtime.so (android_os_BinderProxy_transact(_JNIEnv*, _jobject*, int, _jobject*, _jobject*, int)+200)
at android.os.BinderProxy.transactNative(Native method)
at android.os.BinderProxy.transact(Binder.java:1127)
at android.net.wifi.IWifiManager$Stub$Proxy.getConnectionInfo(IWifiManager.java:1441)
at android.net.wifi.WifiManager.getConnectionInfo(WifiManager.java:1778)
at org.chromium.net.NetworkChangeNotifierAutoDetect$WifiManagerDelegate.getWifiInfoLocked(NetworkChangeNotifierAutoDetect.java:28)
at org.chromium.net.NetworkChangeNotifierAutoDetect$WifiManagerDelegate.getWifiSsid(NetworkChangeNotifierAutoDetect.java:22)
- locked <0x0f4edae7> (a java.lang.Object)
at org.chromium.net.NetworkChangeNotifierAutoDetect.getCurrentNetworkState(NetworkChangeNotifierAutoDetect.java:67)
at org.chromium.net.NetworkChangeNotifierAutoDetect.<init>(NetworkChangeNotifierAutoDetect.java:21)
at org.chromium.net.NetworkChangeNotifier.setAutoDetectConnectivityStateInternal(NetworkChangeNotifier.java:61)

看樣子是binder call阻塞了露久,調(diào)用的接口是IWifiManager.getConnectionInfo()更米。因?yàn)槭莃inder調(diào)用,查看一下binder_sample毫痕。

08-28 18:54:01.384 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,16004,com.jeejen.family,100]
08-28 18:54:04.868 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,3479,com.jeejen.family,100]
08-28 18:56:12.712 10171 21885 21885 I binder_sample: [android.net.wifi.IWifiManager,24,8963,com.jeejen.family,100]

可以看到在ANR附近的時(shí)間征峦,使用IWifiManager接口的binder調(diào)用確實(shí)耗時(shí)比較長,那么這個(gè)是不是系統(tǒng)原因呢消请?那么去看他的對(duì)端Sysytem的代碼栏笆。

1763    /**
1764     * See {@link android.net.wifi.WifiManager#getConnectionInfo()}
1765     * @return the Wi-Fi information, contained in {@link WifiInfo}.
1766     */
1767    @Override
1768    public WifiInfo getConnectionInfo() {
1769        enforceAccessPermission();
1770        mLog.trace("getConnectionInfo uid=%").c(Binder.getCallingUid()).flush();
1771        /*
1772         * Make sure we have the latest information, by sending
1773         * a status request to the supplicant.
1774         */
1775        return mWifiStateMachine.syncRequestConnectionInfo();
1776    }
1521    public WifiInfo syncRequestConnectionInfo() {
1522        WifiInfo result = new WifiInfo(mWifiInfo);
1523        return result;
1524    }

getConnectionInfo直接通過wifiService調(diào)用wifiStateMachine中syncRequestConnectionInfo,這部分的實(shí)現(xiàn)不會(huì)阻塞住,難道Binder被沾滿了臊泰?從trace中并沒有看出這一點(diǎn)蛉加,那么怎么回事呢?我們嘗試復(fù)現(xiàn)這個(gè)問題,好在比較容易復(fù)現(xiàn)针饥。

09-04 18:24:29.182 D/WifiStateMachine( 1312): syncRequestConnectionInfo/in SSID: MIOffice-5G, BSSID: 70:3a:0e:2c:bb:f1, MAC: 80:ad:16:4c:0b:fe, Supplicant state: COMPLETED, RSSI: -44, Link speed: 400Mbps, Frequency: 5180MHz, Net ID: 0, Metered hint: false, score: 60
09-04 18:24:29.182 D/WifiStateMachine( 1312): syncRequestConnectionInfo/out SSID: MIOffice-5G, BSSID: 70:3a:0e:2c:bb:f1, MAC: 80:ad:16:4c:0b:fe, Supplicant state: COMPLETED, RSSI: -44, Link speed: 400Mbps, Frequency: 5180MHz, Net ID: 0, Metered hint: false, score: 60

發(fā)現(xiàn)在主線程進(jìn)行大量輸出上面的Log祟偷,極簡桌面在1分鐘內(nèi)調(diào)用此接口160次,導(dǎo)致了SystemServer不能及時(shí)響應(yīng)這個(gè)App打厘,造成了App自己的ANR。

像Binder call導(dǎo)致ANR的問題很常見贺辰,存在被blocked的風(fēng)險(xiǎn)户盯,此刻可以嘗試放到異步里面執(zhí)行,其次不要短時(shí)間的大量Binder調(diào)用饲化,這樣行為輕則App自己有問題莽鸭,重則系統(tǒng)發(fā)生Watchdog死機(jī)重啟。

4吃靠、案例三:廣播超時(shí)導(dǎo)致App的ANR

繼續(xù)看一個(gè)案例三硫眨,按照上面的套路,首先在event log中查看發(fā)生ANR的時(shí)間巢块。

12-17 06:02:14.463  1566  1583 I am_anr  : [0,8769,com.android.updater,952680005,Broadcast of Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) }]

發(fā)生ANR的時(shí)間是am_anr 時(shí)間點(diǎn)是12-17 06:02:14.463礁阁,繼續(xù)看Log

12-17 06:02:00.370  1566  1583 W BroadcastQueue: Timeout of broadcast BroadcastRecord{21ef8c2 u0 android.intent.action.BOOT_COMPLETED} - receiver=android.os.BinderProxy@2a6c365, started 60006ms ago
12-17 06:02:00.370  1566  1583 W BroadcastQueue: Receiver during timeout: ResolveInfo{5a8283a com.android.updater/.BootCompletedReceiver m=0x108000}
12-17 06:02:00.370  1566  1583 I am_broadcast_discard_app: [0,35584194,android.intent.action.BOOT_COMPLETED,49,ResolveInfo{5a8283a com.android.updater/.BootCompletedReceiver m=0x108000}]

但是我們發(fā)現(xiàn)在12-17 06:02:00.370 已經(jīng)發(fā)生ANR,說明event log中的時(shí)間是個(gè)大概值族奢,可能由于系統(tǒng)資源比較緊張?jiān)斐梢欢ǔ潭鹊臏罄驯铡S捎谑莂ndroid.intent.action.BOOT_COMPLETED這個(gè)廣播接收出現(xiàn)的ANR,那么我們順藤摸瓜越走。

12-17 06:01:00.383  1566  3524 I ActivityManager: Start proc 8769:com.android.updater/9802 for broadcast com.android.updater/.BootCompletedReceiver caller=null

在12-17 06:01:00.383的時(shí)候啟動(dòng)了廣播進(jìn)程

12-17 06:01:36.721  8769  8769 D BootCompletedReceiver: onReceive android.intent.action.BOOT_COMPLETED
12-17 06:02:14.725 8769 8769 D UpdateService: onCreate

在12-17 06:01:36.721的時(shí)候棚品,客戶端BootCompletedReceiver onReceiver方法 開始回調(diào),然后onReceive 啟動(dòng)UpdateService廊敌,調(diào)用UpdateService.onCreate時(shí)間是12-17 06:02:14.725铜跑。根據(jù)上面的分析有兩個(gè)初步的疑問。

啟動(dòng)廣播是在12-17 06:01:00.383骡澈,廣播的onReceiver方法開始回調(diào)是在12-17 06:01:36.721锅纺,ANR的時(shí)間是在12-17 06:02:00.370 ,那么為什么啟動(dòng)了36s后我才收到bootcompleted廣播秧廉,這本身就不正常吧伞广,其次為什么廣播啟動(dòng)UpdateService耗時(shí)將近化了一分多鐘?分析到這里App同學(xué)認(rèn)為無法分析下去疼电,八成是系統(tǒng)原因嚼锄,從cpu的統(tǒng)計(jì)上來看,認(rèn)為是某些程序占用過高導(dǎo)致的蔽豺,貼出下面的Log区丑。

12-17 06:02:19.286  1566  1583 E ActivityManager: ANR in com.android.updater
12-17 06:02:19.286  1566  1583 E ActivityManager: PID: 8769
12-17 06:02:19.286  1566  1583 E ActivityManager: Reason: Broadcast of Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) }
12-17 06:02:19.286  1566  1583 E ActivityManager: Load: 0.0 / 0.0 / 0.0
12-17 06:02:19.286  1566  1583 E ActivityManager: CPU usage from 0ms to 18846ms later (2017-12-17 06:02:00.379 to 2017-12-17 06:02:19.224):
12-17 06:02:19.286  1566  1583 E ActivityManager:   195% 6142/com.immomo.momo: 195% user + 0% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   2.3% 8170/com.tencent.mm: 2.3% user + 0% kernel / faults: 448 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.7% 1566/system_server: 0.4% user + 0.3% kernel / faults: 150 minor 1 major
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.4% 90/kworker/u16:3: 0% user + 0.4% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.3% 4704/com.tencent.mm:push: 0.1% user + 0.2% kernel / faults: 116 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.3% 8769/com.android.updater: 0.2% user + 0.1% kernel / faults: 1600 minor 2 major
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 4790/com.tencent.mm:patch: 0.2% user + 0% kernel / faults: 748 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 329/mmc-cmdqd/0: 0% user + 0.2% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 5429/com.tencent.mm:push: 0% user + 0.1% kernel / faults: 17 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 5435/com.tencent.mm:patch: 0.2% user + 0% kernel / faults: 82 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 8712/com.tencent.mm:exdevice: 0.1% user + 0% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 432/logd: 0.1% user + 0% kernel / faults: 4 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 844/msm_irqbalance: 0% user + 0.1% kernel / faults: 4 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 7580/kworker/u16:2: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 7/rcu_preempt: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 1240/zygote: 0% user + 0.1% kernel / faults: 84 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0% 3216/com.xiaomi.simactivate.service: 0% user + 0% kernel / faults: 5 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 8645/kworker/7:0: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 8730/kworker/4:2: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0% 45/rcuop/4: 0% user + 0% kernel

然而cpu占用195%并不算高,在多核中每個(gè)核最大占用率都是100%(八核占用率是800%),其次Load: 0.0 / 0.0 / 0.0沧侥,Load在15分鐘這段時(shí)間都是0可霎,停止運(yùn)轉(zhuǎn)了?貌似這個(gè)Log也不太正確宴杀。然而在小米手機(jī)上癣朗,ANR的監(jiān)控會(huì)有加強(qiáng)的,輸出了下面的Log旺罢。

12-17 06:02:14.693  8769  8769 W MIUI-BLOCK-MONITOR: The msg { when=-36s107ms what=113 obj=ReceiverData{intent=Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) } packageName=com.android.updater resultCode=0 resultData=null resultExtras=null} target=android.app.ActivityThread$H planTime=1513461660613 dispatchTime=1513461696720 finishTime=0 } took 74080ms and took 37973ms after dispatch.

我們對(duì)每一個(gè)Message都額外記錄了它各個(gè)狀態(tài)的時(shí)間點(diǎn)旷余,方便我們進(jìn)行分析

  • when:消息從應(yīng)該被執(zhí)行到發(fā)生anr的時(shí)間
  • planTime:消息計(jì)劃被執(zhí)行的時(shí)間點(diǎn)
  • dispatchTime:消息真正被執(zhí)行的時(shí)間點(diǎn)
  • finishTime:消息完成時(shí)的時(shí)間點(diǎn)
    計(jì)算消息執(zhí)行的時(shí)間為:-when-(dispatchTime-planTime)=0,那么這意味著什么呢扁达?意味著113這個(gè)Message正要開始執(zhí)行正卧,還沒有開始執(zhí)行就發(fā)生了ANR,在主線程的Looper消息隊(duì)列中跪解,等待了36秒炉旷。這份日志中沒有主線程的trace,有也沒有作用叉讥,因?yàn)榭梢钥吹竭@個(gè)消息還沒有執(zhí)行呢窘行?那么這個(gè)36秒期間在做什么呢?有比較多下面的Log节吮。
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: The binder call took 3973ms.
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: java.lang.Throwable
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.AnrMonitor.checkBinderCallTime(AnrMonitor.java:591)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.BinderProxy.transact(Binder.java:623)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.content.pm.IPackageManager$Stub$Proxy.getApplicationInfo(IPackageManager.java:2658)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ApplicationPackageManager.getApplicationInfoAsUser(ApplicationPackageManager.java:340)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ApplicationPackageManager.getApplicationInfo(ApplicationPackageManager.java:333)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.core.ManifestParser.create(SourceFile:64)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.core.SdkManager.start(SourceFile:186)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at java.lang.reflect.Method.invoke(Native Method)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.external.a.abx()
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.external.a.attachBaseContext()
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Application.attach(Application.java:193)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Instrumentation.newApplication(Instrumentation.java:1009)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Instrumentation.newApplication(Instrumentation.java:993)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.LoadedApk.makeApplication(LoadedApk.java:800)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.handleBindApplication(ActivityThread.java:5471)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.-wrap2(ActivityThread.java)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1584)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.Handler.dispatchMessage(Handler.java:102)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.Looper.loop(Looper.java:163)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.main(ActivityThread.java:6221)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at java.lang.reflect.Method.invoke(Native Method)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:904)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:794)

36秒期間都在做bindApplication操作抽高,這種情況下極有可能是系統(tǒng)狀態(tài)此時(shí)并不樂觀的原因,其次通過上面的分析也可以看到BootCompletedReceiver的onReceiver是在主線程處理的透绩,啟動(dòng)Service也花費(fèi)了很多的時(shí)間翘骂,我們也可以考慮在注冊(cè)Receiver的時(shí)候,可以指定Handler帚豪,讓onReceiver運(yùn)行在子線程中(怎么做碳竟,可以看源碼哦)

本篇文章主要梳理三個(gè)小案例,總結(jié)ANR問題的分析套路狸臣,最后在次總結(jié)分析步驟:

  • 抓取bugreport莹桅,搜索ANR in,查看發(fā)生的時(shí)間和進(jìn)程烛亦,Cpu的負(fù)載有沒有問題

  • 抓取bugreport诈泼,搜索ANR in,查看發(fā)生的時(shí)間和進(jìn)程煤禽,Cpu的負(fù)載有沒有問題

  • 根據(jù)進(jìn)程尋找主線程的trace铐达,發(fā)現(xiàn)被blocked的地方,如果是Binder call則檬果,進(jìn)一步確認(rèn)下對(duì)端的情況瓮孙;如果是耗時(shí)操作唐断,直接修改成異步,懷疑系統(tǒng)執(zhí)行慢可以看看binder_sample杭抠,dvm_lock等信息脸甘,其次gc多不多,lmk殺進(jìn)程是不是很頻繁偏灿,都可以看出系統(tǒng)的健康狀態(tài)丹诀。

  • 結(jié)合源碼進(jìn)行分析解決

本文只是記錄一些案例和分析手段,指導(dǎo)思想就是找出主線程在過去一段時(shí)間內(nèi)被block的原因翁垂。

參考

應(yīng)用與系統(tǒng)穩(wěn)定性第一篇---ANR問題分析的一般套路

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
  • 序言:七十年代末忿墅,一起剝皮案震驚了整個(gè)濱河市,隨后出現(xiàn)的幾起案子沮峡,更是在濱河造成了極大的恐慌,老刑警劉巖亿柑,帶你破解...
    沈念sama閱讀 206,602評(píng)論 6 481
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件邢疙,死亡現(xiàn)場離奇詭異,居然都是意外死亡望薄,警方通過查閱死者的電腦和手機(jī)疟游,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,442評(píng)論 2 382
  • 文/潘曉璐 我一進(jìn)店門,熙熙樓的掌柜王于貴愁眉苦臉地迎上來痕支,“玉大人颁虐,你說我怎么就攤上這事∥孕耄” “怎么了另绩?”我有些...
    開封第一講書人閱讀 152,878評(píng)論 0 344
  • 文/不壞的土叔 我叫張陵,是天一觀的道長花嘶。 經(jīng)常有香客問我笋籽,道長,這世上最難降的妖魔是什么椭员? 我笑而不...
    開封第一講書人閱讀 55,306評(píng)論 1 279
  • 正文 為了忘掉前任车海,我火速辦了婚禮,結(jié)果婚禮上隘击,老公的妹妹穿的比我還像新娘侍芝。我一直安慰自己,他們只是感情好埋同,可當(dāng)我...
    茶點(diǎn)故事閱讀 64,330評(píng)論 5 373
  • 文/花漫 我一把揭開白布州叠。 她就那樣靜靜地躺著,像睡著了一般莺禁。 火紅的嫁衣襯著肌膚如雪留量。 梳的紋絲不亂的頭發(fā)上,一...
    開封第一講書人閱讀 49,071評(píng)論 1 285
  • 那天,我揣著相機(jī)與錄音楼熄,去河邊找鬼忆绰。 笑死,一個(gè)胖子當(dāng)著我的面吹牛可岂,可吹牛的內(nèi)容都是我干的错敢。 我是一名探鬼主播,決...
    沈念sama閱讀 38,382評(píng)論 3 400
  • 文/蒼蘭香墨 我猛地睜開眼缕粹,長吁一口氣:“原來是場噩夢啊……” “哼稚茅!你這毒婦竟也來了?” 一聲冷哼從身側(cè)響起平斩,我...
    開封第一講書人閱讀 37,006評(píng)論 0 259
  • 序言:老撾萬榮一對(duì)情侶失蹤亚享,失蹤者是張志新(化名)和其女友劉穎,沒想到半個(gè)月后绘面,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體欺税,經(jīng)...
    沈念sama閱讀 43,512評(píng)論 1 300
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 35,965評(píng)論 2 325
  • 正文 我和宋清朗相戀三年揭璃,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了晚凿。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點(diǎn)故事閱讀 38,094評(píng)論 1 333
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡瘦馍,死狀恐怖歼秽,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情情组,我是刑警寧澤燥筷,帶...
    沈念sama閱讀 33,732評(píng)論 4 323
  • 正文 年R本政府宣布,位于F島的核電站院崇,受9級(jí)特大地震影響荆责,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜亚脆,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 39,283評(píng)論 3 307
  • 文/蒙蒙 一做院、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧濒持,春花似錦键耕、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 30,286評(píng)論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至官套,卻和暖如春酒奶,著一層夾襖步出監(jiān)牢的瞬間蚁孔,已是汗流浹背。 一陣腳步聲響...
    開封第一講書人閱讀 31,512評(píng)論 1 262
  • 我被黑心中介騙來泰國打工惋嚎, 沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留杠氢,地道東北人。 一個(gè)月前我還...
    沈念sama閱讀 45,536評(píng)論 2 354
  • 正文 我出身青樓另伍,卻偏偏與公主長得像鼻百,于是被迫代替她去往敵國和親。 傳聞我的和親對(duì)象是個(gè)殘疾皇子摆尝,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 42,828評(píng)論 2 345