文章有錯(cuò)誤之處芍碧,歡迎批評(píng)指正!
什么是ANR
在Android中号俐,如果主線程被長時(shí)間阻塞泌豆,導(dǎo)致無法響應(yīng)用戶的操作,即造成ANR(Application Not Responding)吏饿。通常的表現(xiàn)是彈出一個(gè)應(yīng)用無響應(yīng)的對(duì)話框踪危,讓用戶選擇強(qiáng)制退出或者等待。
注意:主線程做耗時(shí)操作本身是不會(huì)產(chǎn)生ANR的猪落,導(dǎo)致ANR的根本還是應(yīng)用程序無法在一定時(shí)間內(nèi)響應(yīng)用戶的操作贞远。因?yàn)橹骶€程被耗時(shí)操作阻塞了,主線程無法對(duì)下一個(gè)操作進(jìn)行響應(yīng)才會(huì)ANR笨忌,沒有需要響應(yīng)的操作自然就不會(huì)產(chǎn)生ANR蓝仲,或者應(yīng)該這樣說:主線程做耗時(shí)操作,非常容易引發(fā)ANR官疲。
ANR的類型
KeyDispatch Timeout :按鍵或觸摸事件在特定時(shí)間內(nèi)無響應(yīng)袱结。超時(shí)時(shí)間5秒。超時(shí)時(shí)間是在ActivityManagerService
類中定義的途凫。
// How long we wait until we timeout on key dispatching.
static final int KEY_DISPATCHING_TIMEOUT = 5*1000;
Broadcast Timeout :BroadcastReceiver在特定時(shí)間內(nèi)無法處理完成垢夹。前臺(tái)廣播10秒,后臺(tái)廣播60秒维费。超時(shí)時(shí)間是在ActivityManagerService
類中定義的果元。
// How long we allow a receiver to run before giving up on it.
static final int BROADCAST_FG_TIMEOUT = 10*1000;
static final int BROADCAST_BG_TIMEOUT = 60*1000;
Service Timeout :Service在特定的時(shí)間內(nèi)生命周期函數(shù)無法處理完成。前臺(tái)服務(wù)20秒犀盟,后臺(tái)服務(wù)200秒而晒。超時(shí)時(shí)間是在ActiveServices
類中定義的。
// How long we wait for a service to finish executing.
static final int SERVICE_TIMEOUT = 20*1000;
// How long we wait for a service to finish executing.
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
ContentProvider Timeout :ContentProvider在特定的時(shí)間內(nèi)沒有完成發(fā)布阅畴。超時(shí)時(shí)間10秒倡怎。超時(shí)時(shí)間是在ActivityManagerService
類中定義的。
// How long we wait for an attached process to publish its content providers
// before we decide it must be hung.
static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
關(guān)于ANR類型的詳細(xì)信息不在本篇文章的敘述范圍之內(nèi)恶阴,請(qǐng)自行查閱資料诈胜。
造成ANR的常見原因
- 應(yīng)用在主線程上進(jìn)行長時(shí)間的計(jì)算。
- 應(yīng)用在主線程上執(zhí)行耗時(shí)的I/O的操作冯事。
- 主線程處于阻塞狀態(tài)焦匈,等待獲取鎖。
- 主線程與其他線程之間發(fā)生死鎖昵仅。
- 主線程在對(duì)另一個(gè)進(jìn)程進(jìn)行同步Binder調(diào)用缓熟,而后者需要很長時(shí)間才能返回累魔。(如果我們知道調(diào)用遠(yuǎn)程方法需要很長時(shí)間,我們應(yīng)該避免在主線程調(diào)用)
上述原因都會(huì)造成主線程被長時(shí)間阻塞够滑,導(dǎo)致無法響應(yīng)用戶的操作垦写,從而造成ANR。
ANR原因排查
ANR發(fā)生以后彰触,在Logcat中有相應(yīng)的日志輸出梯投,并且會(huì)在/data/anr/
目錄下輸出一個(gè)traces.tx
文件,該文件記錄了ANR的更加詳細(xì)的信息况毅,我們可以導(dǎo)出分析分蓖。接下來我們就依次模擬上述5種方式來制造ANR,然后分析產(chǎn)生的Logcat和traces.txt文件尔许。
測(cè)試環(huán)境:Android Studio 3.6.1
測(cè)試手機(jī): HUAWEI MLA-AL10么鹤,Android版本: 7.0
1.應(yīng)用在主線程上進(jìn)行長時(shí)間的計(jì)算
//使用冒泡排序?qū)σ粋€(gè)大數(shù)組排序
private fun sortBigArray() {
val currTime = System.currentTimeMillis()
val random = IntArray(1000000)
for (i in random.indices) {
random[i] = (Math.random() * 10000000).toInt()
}
BubbleSort.sort(random)
println("耗時(shí)" + (System.currentTimeMillis() - currTime) + "ms")
for (i in random.indices) {
println(random[i].toString())
}
}
我們點(diǎn)擊一個(gè)按鈕調(diào)用sortBigArray()方法,內(nèi)部調(diào)用BubbleSort類的sort()方法對(duì)一個(gè)大數(shù)組(100萬)進(jìn)行排序味廊,然后點(diǎn)擊幾次返回鍵蒸甜,然后就出現(xiàn)ANR了。
我們先看一下Logcat日志輸出
//debug級(jí)別日志
2020-06-03 21:20:24.209 com.example.android.jetpackdemo I/art: Wrote stack traces to '/data/anr/traces.txt'
//error級(jí)別日志
2020-06-03 21:20:28.048 ? E/ActivityManager: ANR in com.example.android.jetpackdemo (com.example.android.jetpackdemo/.StartActivity)
PID: 15564
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
Load: 7.7 / 7.48 / 7.35
CPU usage from 294322ms to 0ms ago (2020-06-03 21:15:29.817 to 2020-06-03 21:20:24.139):
4.1% 2001/system_server: 3.1% user + 0.9% kernel / faults: 64102 minor 6 major
3.3% 29428/adbd: 0.8% user + 2.4% kernel / faults: 131259 minor
1.1% 508/logd: 0.5% user + 0.6% kernel / faults: 18 minor
0.7% 2661/com.android.systemui: 0.6% user + 0.1% kernel / faults: 1648 minor 1 major
0.7% 607/surfaceflinger: 0.4% user + 0.3% kernel / faults: 21 minor
0.7% 24463/com.huawei.hwid.persistent: 0.6% user + 0% kernel / faults: 4650 minor 1 major
0.5% 4018/com.huawei.android.launcher: 0.4% user + 0% kernel / faults: 16025 minor 3 major
0.5% 24301/fingerprint_log: 0% user + 0.5% kernel
0.4% 28932/com.huawei.appmarket: 0.3% user + 0% kernel / faults: 2526 minor
//...
2020-06-03 21:20:28.048 ? E/ActivityManager: CPU usage from 1721ms to 2250ms later (2020-06-03 21:20:25.860 to 2020-06-03 21:20:26.389):
99% 15564/com.example.android.jetpackdemo: 97% user + 1.8% kernel / faults: 37 minor
99% 15564/oid.jetpackdemo: 99% user + 0% kernel
7.5% 2001/system_server: 3.7% user + 3.7% kernel / faults: 5 minor
5.6% 2014/ActivityManager: 1.8% user + 3.7% kernel
1.8% 2813/Binder:2001_5: 1.8% user + 0% kernel
1.8% 2862/Binder:2001_6: 0% user + 1.8% kernel
1.8% 3089/Binder:2001_7: 1.8% user + 0% kernel
5.3% 29428/adbd: 0% user + 5.3% kernel / faults: 480 minor
3.5% 29430/->transport: 0% user + 3.5% kernel
1.7% 29428/adbd: 0% user + 1.7% kernel
1.3% 53/rcuop/6: 0% user + 1.3% kernel
16% TOTAL: 14% user + 2.1% kernel + 0.2% irq + 0.2% softirq
在上面的日志中輸出了堆棧信息的保存在 /data/anr/traces.txt
文件中余佛。
com.example.android.jetpackdemo I/dalvikvm: Wrote stack traces to '/data/anr/traces.txt'
發(fā)生ANR進(jìn)程的包名信息柠新,所在的類,進(jìn)程id和ANR的類型
2020-06-03 21:20:28.048 ? E/ActivityManager: ANR in com.example.android.jetpackdemo (com.example.android.jetpackdemo/.StartActivity)
PID: 15564
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
包名com.example.android.jetpackdemo
衙熔,具體的類com.example.android.jetpackdemo.StartActivity
登颓,進(jìn)程號(hào)是PID: 15564
搅荞,ANR的類型是Input dispatching timed out
红氯。
CPU usage from 294322ms to 0ms ago (2020-06-03 21:15:29.817 to 2020-06-03 21:20:24.139):
4.1% 2001/system_server: 3.1% user + 0.9% kernel / faults: 64102 minor 6 major
3.3% 29428/adbd: 0.8% user + 2.4% kernel / faults: 131259 minor
1.1% 508/logd: 0.5% user + 0.6% kernel / faults: 18 minor
//...
2020-06-03 21:20:28.048 ? E/ActivityManager: CPU usage from 1721ms to 2250ms later (2020-06-03 21:20:25.860 to 2020-06-03 21:20:26.389):
99% 15564/com.example.android.jetpackdemo: 97% user + 1.8% kernel / faults: 37 minor
99% 15564/oid.jetpackdemo: 99% user + 0% kernel
7.5% 2001/system_server: 3.7% user + 3.7% kernel / faults: 5 minor
5.6% 2014/ActivityManager: 1.8% user + 3.7% kernel
1.8% 2813/Binder:2001_5: 1.8% user + 0% kernel
//...
注意:
在ANR發(fā)生之前,2020-06-03 21:15:29.817 to 2020-06-03 21:20:24.139咕痛,這段時(shí)間CPU的使用并不高痢甘。
在ANR發(fā)生的時(shí)候,2020-06-03 21:20:25.860 to 2020-06-03 21:20:26.389茉贡,這段時(shí)間CPU的使用相當(dāng)高塞栅,已經(jīng)達(dá)到99%了。
99% 15564/com.example.android.jetpackdemo: 97% user + 1.8% kernel
- 99%:內(nèi)存占用率
- 15564/com.example.android.jetpackdemo:進(jìn)程id和進(jìn)程名腔丧。
這兩段CPU 信息分別代表ANR發(fā)生前和ANR時(shí)的CPU占用率放椰,在輸出的CPU使用信息中我們也可以看出一些端倪,我們注意到我們的進(jìn)程CPU的占用率比較高愉粤,說明我們的進(jìn)程比較忙碌砾医,這里需要說明一下,進(jìn)程忙碌并不一定代表主線程忙碌衣厘,也可能是進(jìn)程中的后臺(tái)線程忙碌如蚜。
但是現(xiàn)在我們雖然知道了ANR發(fā)生的所在的類压恒,但是如何精確定位到具體的哪一行代碼呢?這就需要分析發(fā)生ANR的時(shí)候保存的traces.txt文件了错邦。
導(dǎo)出traces文件
使用adb命令導(dǎo)出traces.txt文件
adb pull /data/anr/traces.txt traces_1.txt
/data/anr/traces.txt: 1 file pulled, 0 skipped. 28.5 MB/s (701726 bytes in 0.023s)
如果入到permission相關(guān)問題探赫,請(qǐng)使用bugreport命令導(dǎo)出,參考 Capture and read bug reports撬呢。
traces.txt部分信息
----- pid 15564 at 2020-06-03 21:20:24 -----
Cmd line: com.example.android.jetpackdemo
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
ABI: 'arm64'
//...
在traces.txt文件的最頂部伦吠,首先輸出的是發(fā)生ANR的進(jìn)程號(hào)和包名信息,然后我們可以在traces.txt中搜索我們的進(jìn)程號(hào)或者包名魂拦。
"main" prio=5 tid=1 Runnable
| group="main" sCount=0 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=15564 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=R schedstat=( 22116939220 18299419 428 ) utm=2209 stm=2 core=5 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes= "mutator lock"(shared held)
at com.example.android.jetpackdemo.BubbleSort.sort(BubbleSort.java:45)
at com.example.android.jetpackdemo.StartActivity.sortBigArray(StartActivity.kt:76)
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:47)
at java.lang.reflect.Method.invoke!(Native method)
at androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
at android.view.View.performClick(View.java:5646)
at android.view.View$PerformClick.run(View.java:22473)
at android.os.Handler.handleCallback(Handler.java:761)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:156)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
//...
我們首先看一下和線程相關(guān)的部分信息讨勤。
"main" prio=5 tid=1 Runnable
| group="main" sCount=0 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=15564 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=R schedstat=( 22116939220 18299419 428 ) utm=2209 stm=2 core=5 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes= "mutator lock"(shared held)
線程基本信息:
線程名:main
線程優(yōu)先級(jí):prio=5,優(yōu)先級(jí)取值范圍[1,10]晨另,詳見
Thread
類:
//最小取值
public final static int MIN_PRIORITY = 1;
//默認(rèn)優(yōu)先級(jí)
public final static int NORM_PRIORITY = 5;
//最大取值
public final static int MAX_PRIORITY = 10;
- 線程id: tid=1潭千,1代表主線程
- 線程狀態(tài):Runnable,狀態(tài)取值如下借尿,詳見
Thread.State
枚舉類:
NEW, //線程還沒啟動(dòng)
RUNNABLE, //正在執(zhí)行
BLOCKED, //等待獲取鎖
WAITING, //等待其他線程執(zhí)行一個(gè)特定的動(dòng)作刨晴,比如說調(diào)用Object.notify或Object.notifyAll()
TIMED_WAITING, //等待一定時(shí)間
TERMINATED //執(zhí)行完畢
- 線程組名稱:group="main"
- 線程被掛起的次數(shù):sCount=0
- 線程被調(diào)試器掛起的次數(shù):dsCount=0
- 線程的java的對(duì)象地址:obj= 0x77d21af8
- 線程本身的Native對(duì)象地址:self= 0x7fa2ea2a00
線程調(diào)度信息:
- Linux系統(tǒng)中內(nèi)核線程id: sysTid= 15564 與進(jìn)程號(hào)相同
- 線程調(diào)度優(yōu)先級(jí):nice=-10,詳細(xì)信息可參考 淺析Linux線程調(diào)度
- 線程調(diào)度組:cgrp=default
- 線程調(diào)度策略和優(yōu)先級(jí):sched=0/0
- 線程處理函數(shù)地址:handle= 0x7fa6f4ba98
線程的堆棧信息:
- 堆棧地址和大新贩:stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
held mutexes:
- held mutexes 到底是什么意思我沒有找到官方的文檔解釋狈癞,網(wǎng)上大多數(shù)關(guān)于held mutexes的解釋也都是一筆帶過沒有實(shí)際參考意義,我們這里先忽略這個(gè)東西茂契,并不會(huì)影響我們排查問題蝶桶。
從上面traces.txt文件中這段信息可以看出,導(dǎo)致ANR的最終原因是在BubbleSort.java的第45行掉冶。
at com.example.android.jetpackdemo.BubbleSort.sort(BubbleSort.java:45)
at com.example.android.jetpackdemo.StartActivity.sortBigArray(StartActivity.kt:76)
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:47)
at java.lang.reflect.Method.invoke!(Native method)
2.應(yīng)用在主線程上執(zhí)行耗時(shí)的I/O的操作
/**
* 拷貝文件真竖,注意要有讀寫權(quán)限
*/
private fun doIo() {
val prePath = Environment.getExternalStorageDirectory().path
val file = File("${prePath}/test/View.java")
if (file.exists()) {
Log.d(TAG, "doIo: ${file.length()}")
val reader = FileReader(file)
val fileWriter = FileWriter("${prePath}/test/ViewCopy.java", true)
for (index in 0 until 5) {
var count: Int
while (reader.read().also { count = it } != -1) {
fileWriter.write(count)
}
try {
reader.reset()
} catch (e: IOException) {
Log.d(TAG, "doIo: error ${e.message}")
}
}
}
}
調(diào)用doIo()方法以后,多次點(diǎn)擊返回鍵厌小,制造ANR恢共。
Logcat日志輸出
2020-06-04 21:05:24.462 ? E/ActivityManager: ANR in com.example.android.jetpackdemo (com.example.android.jetpackdemo/.StartActivity)
PID: 16295
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
Load: 7.49 / 7.45 / 7.24
CPU usage from 87491ms to 0ms ago (2020-06-04 21:03:53.035 to 2020-06-04 21:05:20.526):
7.8% 2001/system_server: 6.1% user + 1.7% kernel / faults: 34095 minor 3 major
4.2% 28932/com.huawei.appmarket: 3.7% user + 0.5% kernel / faults: 12314 minor 5 major
2.8% 2661/com.android.systemui: 2.2% user + 0.5% kernel / faults: 4222 minor 1 major
2% 412/msm-core:sampli: 0% user + 2% kernel
1.7% 24463/com.huawei.hwid.persistent: 1.5% user + 0.1% kernel / faults: 3317 minor 1 major
1.5% 607/surfaceflinger: 1% user + 0.5% kernel / faults: 24 minor
//...
2020-06-04 21:05:24.462 ? E/ActivityManager: CPU usage from 1696ms to 2226ms later (2020-06-04 21:05:22.222 to 2020-06-04 21:05:22.752):
84% 16295/com.example.android.jetpackdemo: 84% user + 0% kernel / faults: 562 minor 1 major
68% 16295/oid.jetpackdemo: 68% user + 0% kernel
12% 16317/RenderThread: 12% user + 0% kernel
1.8% 16307/HeapTaskDaemon: 1.8% user + 0% kernel
+0% 16461/DeferredSaveThr: 0% user + 0% kernel
9.1% 2001/system_server: 1.8% user + 7.3% kernel / faults: 7 minor
7.3% 2014/ActivityManager: 0% user + 7.3% kernel
3.6% 2536/Binder:2001_3: 3.6% user + 0% kernel
5.5% 607/surfaceflinger: 2.7% user + 2.7% kernel
1.3% 607/surfaceflinger: 1.3% user + 0% kernel
1.3% 658/Binder:607_1: 0% user + 1.3% kernel
4.3% 2661/com.android.systemui: 2.9% user + 1.4% kernel / faults: 26 minor
4.3% 3614/RenderThread: 2.9% user + 1.4% kernel
1.4% 2661/ndroid.systemui: 1.4% user + 0% kernel
1.3% 25/rcuop/2: 0% user + 1.3% kernel
1.3% 339/irq/171-tsens_i: 0% user + 1.3% kernel
1.5% 11851/mdss_fb0: 0% user + 1.5% kernel
1.6% 14246/kworker/u16:5: 0% user + 1.6% kernel
1.6% 16318/kworker/u16:4: 0% user + 1.6% kernel
15% TOTAL: 13% user + 1.8% kernel
從上面的日志信息中我們也看出來發(fā)生ANR的時(shí)候,我們的進(jìn)程com.example.android.jetpackdemo
CPU占用率是比較高的璧亚,說明我們進(jìn)程內(nèi)存在比較忙碌的線程讨韭。然后我們繼續(xù)看一下對(duì)應(yīng)的traces.txt文件。
traces.txt部分信息
----- pid 16295 at 2020-06-04 21:05:20 -----
Cmd line: com.example.android.jetpackdemo
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
通過進(jìn)程號(hào)pid 16295搜索
"main" prio=5 tid=1 Runnable
| group="main" sCount=0 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=16295 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=R schedstat=( 16406184130 12254163 407 ) utm=1630 stm=10 core=6 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes= "mutator lock"(shared held)
native: #00 pc 0000000000478088 /system/lib64/libart.so (_ZN3art15DumpNativeStackERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEEiP12BacktraceMapPKcPNS_9ArtMethodEPv+220)
native: #01 pc 0000000000478084 /system/lib64/libart.so (_ZN3art15DumpNativeStackERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEEiP12BacktraceMapPKcPNS_9ArtMethodEPv+216)
native: #02 pc 000000000044c604 /system/lib64/libart.so (_ZNK3art6Thread9DumpStackERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEEbP12BacktraceMap+524)
native: #03 pc 0000000000463f60 /system/lib64/libart.so (_ZN3art14DumpCheckpoint3RunEPNS_6ThreadE+820)
native: #04 pc 000000000044d510 /system/lib64/libart.so (_ZN3art6Thread21RunCheckpointFunctionEv+192)
native: #05 pc 00000000000ff870 /system/lib64/libart.so (_ZN3art27ScopedObjectAccessUncheckedD2Ev+576)
native: #06 pc 000000000010a764 /system/lib64/libart.so (_ZN3art8CheckJNI23GetPrimitiveArrayRegionEPKcNS_9Primitive4TypeEP7_JNIEnvP7_jarrayiiPv+1164)
native: #07 pc 0000000000022ee4 /system/lib64/libjavacore.so (???)
native: #08 pc 00000000004747a8 /data/dalvik-cache/arm64/system@framework@boot-core-libart.oat (Java_libcore_icu_NativeConverter_encode__J_3CI_3BI_3IZ+244)
at libcore.icu.NativeConverter.encode(Native method)
at java.nio.charset.CharsetEncoderICU.encodeLoop(CharsetEncoderICU.java:169)
at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:579)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:271)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
- locked <0x05b5279d> (a java.io.FileWriter)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:113)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:194)
at com.example.android.jetpackdemo.StartActivity.doIo(StartActivity.kt:116)
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:65)
at java.lang.reflect.Method.invoke!(Native method)
at androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
at android.view.View.performClick(View.java:5646)
at android.view.View$PerformClick.run(View.java:22473)
at android.os.Handler.handleCallback(Handler.java:761)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:156)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
我們重點(diǎn)看一下這段信息
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:194)
at com.example.android.jetpackdemo.StartActivity.doIo(StartActivity.kt:116)
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:65)
從上面這段信息可以看出癣蟋,導(dǎo)致ANR的最終原因是在OutputStreamWriter.java的第194行透硝。而我們的代碼出問題的地方是StartActivity.kt的116行。
3.主線程處于阻塞狀態(tài)疯搅,等待獲取鎖
//鎖資源
val lockedResource: Any = Any()
fun onClick(v: View) {
when (v.id) {
R.id.btnWaitLockedResource -> {
LockTask().execute(arrayListOf<Int>())
Log.d(TAG, "onClick: 主線程先睡眠一會(huì)濒生,避免先獲取到鎖")
Thread.sleep(200)
Log.d(TAG, "onClick: 主線程先睡眠結(jié)束,嘗試獲取鎖")
synchronized(lockedResource) {
for (index in 0 until 10) {
Log.d(TAG, "onClick: 主線程獲取到鎖了$index")
}
}
}
}
}
//LockTask后臺(tái)線程
inner class LockTask : AsyncTask<MutableList<Int>, Int, Unit>() {
override fun doInBackground(vararg params: MutableList<Int>) =
synchronized(lockedResource) {
val list = params[0]
for (i in 0 until 1000000) {
list.add((Math.random() * 10000000).toInt())
}
list.forEach {
Log.d(TAG, "doInBackground: for each element is $it")
}
}
}
調(diào)用onClick()方法以后秉撇,先讓后臺(tái)線程獲取鎖甜攀,然后主線程再嘗試獲取鎖秋泄。然后多次點(diǎn)擊返回鍵,制造ANR规阀。
Logcat日志輸出
2020-06-04 09:55:04.396 ? E/ActivityManager: ANR in com.example.android.jetpackdemo (com.example.android.jetpackdemo/.StartActivity)
PID: 20008
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
Load: 8.27 / 7.73 / 7.37
CPU usage from 83152ms to 0ms ago (2020-06-04 09:53:36.842 to 2020-06-04 09:54:59.995) with 99% awake:
19% 508/logd: 15% user + 3.5% kernel / faults: 533 minor 1 major
5.5% 2001/system_server: 3.9% user + 1.5% kernel / faults: 10843 minor 7 major
4.9% 28932/com.huawei.appmarket: 4.3% user + 0.6% kernel / faults: 13003 minor 79 major
2.6% 2661/com.android.systemui: 2.2% user + 0.3% kernel / faults: 7158 minor 2 major
1.5% 607/surfaceflinger: 0.9% user + 0.6% kernel / faults: 190 minor 1 major
1.2% 24307/logcat: 0.7% user + 0.4% kernel
0.8% 11161/com.android.settings: 0.6% user + 0.1% kernel / faults: 9084 minor 20 major
0.6% 24305/logcat: 0.2% user + 0.3% kernel
0.4% 24301/fingerprint_log: 0% user + 0.4% kernel
0.3% 15363/kworker/u16:10: 0% user + 0.3% kernel
0.2% 6831/kworker/u16:5: 0% user + 0.2% kernel
0.2% 837/imonitor: 0% user + 0.1% kernel
//...
2020-06-04 09:55:04.396 ? E/ActivityManager: CPU usage from 2211ms to 2742ms later (2020-06-04 09:55:02.206 to 2020-06-04 09:55:02.737):
105% 20008/com.example.android.jetpackdemo: 92% user + 13% kernel / faults: 220 minor
99% 20096/AsyncTask #1: 86% user + 13% kernel
5.6% 20019/HeapTaskDaemon: 5.6% user + 0% kernel
103% 508/logd: 99% user + 3.7% kernel / faults: 8 minor
92% 24315/logd.reader.per: 92% user + 0% kernel
7.5% 511/logd.writer: 5.6% user + 1.8% kernel
3.7% 24314/logd.reader.per: 0% user + 3.7% kernel
1.8% 24313/logd.reader.per: 0% user + 1.8% kernel
11% 2661/com.android.systemui: 11% user + 0% kernel / faults: 52 minor
9.3% 3614/RenderThread: 7.5% user + 1.8% kernel
1.8% 2661/ndroid.systemui: 1.8% user + 0% kernel
9.3% 607/surfaceflinger: 9.3% user + 0% kernel
3.7% 607/surfaceflinger: 3.7% user + 0% kernel
1.8% 2614/Binder:607_4: 0% user + 1.8% kernel
5.6% 2001/system_server: 1.8% user + 3.7% kernel / faults: 2 minor
5.6% 2014/ActivityManager: 0% user + 5.6% kernel
3.3% 19794/adbd: 1.6% user + 1.6% kernel / faults: 147 minor
1.6% 19794/adbd: 0% user + 1.6% kernel
1.6% 19796/->transport: 0% user + 1.6% kernel
1.6% 19797/<-transport: 0% user + 1.6% kernel
3.4% 24307/logcat: 0% user + 3.4% kernel
1.3% 624/mm-pp-dpps: 0% user + 1.3% kernel
1.3% 717/ABA_THREAD: 1.3% user + 0% kernel
1.6% 18971/kworker/0:2: 0% user + 1.6% kernel
1.6% 18974/kworker/u16:0: 0% user + 1.6% kernel
1.6% 19095/mdss_fb0: 0% user + 1.6% kernel
1.7% 24301/fingerprint_log: 1.7% user + 0% kernel
1.7% 24305/logcat: 1.7% user + 0% kernel
31% TOTAL: 26% user + 4% kernel + 0.2% irq + 0.2% softirq
從上面的日志信息中我們也看出來我們的進(jìn)程CPU占用率是比較高的恒序,說明我們進(jìn)程內(nèi)存在比較忙碌的線程。然后我們繼續(xù)看一下對(duì)應(yīng)的traces.txt文件谁撼。
traces.txt部分信息
----- pid 20008 at 2020-06-04 09:55:00 -----
Cmd line: com.example.android.jetpackdemo
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
通過進(jìn)程號(hào)pid 20008搜索
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=20008 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=S schedstat=( 278831875 7233747 156 ) utm=22 stm=5 core=0 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes=
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:61)
- waiting to lock <0x0f8c80b0> (a java.lang.Object) held by thread 16
at java.lang.reflect.Method.invoke!(Native method)
at androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
at android.view.View.performClick(View.java:5646)
at android.view.View$PerformClick.run(View.java:22473)
at android.os.Handler.handleCallback(Handler.java:761)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:156)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
關(guān)鍵信息
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:61)
- waiting to lock <0x0f8c80b0> (a java.lang.Object) held by thread 16
在StartActivity的61行歧胁,在等待一個(gè)鎖對(duì)象<0x0f8c80b0>
,該對(duì)象是一個(gè)Object
對(duì)象(a java.lang.Object)厉碟,這個(gè)鎖對(duì)象正在被線程id為16的線程持有喊巍。那么我們下面在traces.txt文件中搜索一下這個(gè)鎖對(duì)象<0x0f8c80b0>
。如下所示:
DALVIK THREADS (16):
"AsyncTask #1" prio=5 tid=16 Runnable
| group="main" sCount=0 dsCount=0 obj=0x12cd61f0 self=0x7f93187200
| sysTid=20096 nice=10 cgrp=bg_non_interactive sched=0/0 handle=0x7f84346450
| state=R schedstat=( 13814173056 6030204 1355 ) utm=1193 stm=188 core=3 HZ=100
| stack=0x7f84244000-0x7f84246000 stackSize=1037KB
| held mutexes= "mutator lock"(shared held)
at java.lang.Integer.stringSize(Integer.java:414)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:630)
at java.lang.StringBuilder.append(StringBuilder.java:220)
at com.example.android.jetpackdemo.StartActivity$LockTask.doInBackground(StartActivity.kt:107)
- locked <0x0f8c80b0> (a java.lang.Object)
at com.example.android.jetpackdemo.StartActivity$LockTask.doInBackground(StartActivity.kt:99)
at android.os.AsyncTask$2.call(AsyncTask.java:316)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:255)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
at java.lang.Thread.run(Thread.java:776)
關(guān)鍵信息
at com.example.android.jetpackdemo.StartActivity$LockTask.doInBackground(StartActivity.kt:107)
- locked <0x0f8c80b0> (a java.lang.Object)
我們看到正是這個(gè)AsyncTask在107行持有鎖對(duì)象0x0f8c80b0
箍鼓,導(dǎo)致主線程無法獲取鎖而阻塞崭参,最終導(dǎo)致ANR。
4.主線程與其他線程之間發(fā)生死鎖
val resourceFirst = "resourceFirst"
val resourceSecond = "resourceSecond"
private fun mockDeadLock() {
//啟動(dòng)一個(gè)后臺(tái)線程
thread(start = false) {
synchronized(resourceSecond) {
Log.d(TAG, "工作線程獲取了鎖 resourceSecond")
Thread.sleep(100)
Log.d(TAG, "工作線程嘗試獲取鎖 resourceFirst")
synchronized(resourceFirst) {
while (true) {
Log.d(TAG, "工作線程 mockDeadLock")
}
}
}
}.start()
//主線程睡眠30ms后開始獲取鎖
Thread.sleep(30)
synchronized(resourceFirst) {
Log.d(TAG, "主線程獲取了鎖 resourceFirst")
Log.d(TAG, "主線程嘗試獲取鎖 resourceSecond")
synchronized(resourceSecond) {
Log.d(TAG, "主線程獲取了鎖 resourceFirst")
while (true) {
Log.d(TAG, "主線程 mockDeadLock")
}
}
}
}
上面這段代碼邏輯:
- 工作線程先獲取鎖
resourceSecond
款咖,然后睡眠100ms保證主線程能獲取到鎖resourceFirst
何暮。 - 主線程睡眠30ms后先獲取鎖
resourceFirst
,然后再嘗試獲取鎖resourceSecond
铐殃,這時(shí)候是獲取不到的海洼,因?yàn)楣ぷ骶€程已經(jīng)持有鎖resourceSecond
并且不釋放。 - 工作線程睡眠結(jié)束以后嘗試獲取鎖
resourceFirst
富腊,這時(shí)候是獲取不到的坏逢,因?yàn)橹骶€程持有了鎖resourceFirst
并且不釋放。 - 最終赘被,造成死鎖是整。
調(diào)用mockDeadLock()方法以后,多次點(diǎn)擊返回鍵帘腹,制造ANR贰盗。
Logcat輸出
2020-06-04 15:07:41.246 ? E/ActivityManager: ANR in com.example.android.jetpackdemo (com.example.android.jetpackdemo/.StartActivity)
PID: 13626
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
Load: 7.53 / 6.81 / 6.4
CPU usage from 177565ms to 0ms ago (2020-06-04 15:04:39.715 to 2020-06-04 15:07:37.281):
11% 2001/system_server: 7.1% user + 4.4% kernel / faults: 68219 minor 37 major
3.4% 2661/com.android.systemui: 2.8% user + 0.6% kernel / faults: 20555 minor 29 major
2% 508/logd: 0.9% user + 1.1% kernel / faults: 76 minor
1.8% 607/surfaceflinger: 1.1% user + 0.7% kernel / faults: 82 minor 1 major
0% 24463/com.huawei.hwid.persistent: 0% user + 0% kernel / faults: 7819 minor 24 major
0.9% 2823/com.huawei.systemmanager:service: 0.6% user + 0.2% kernel / faults: 13277 minor 12 major
//...
2020-06-04 15:07:41.246 ? E/ActivityManager: CPU usage from 1714ms to 2243ms later (2020-06-04 15:07:38.994 to 2020-06-04 15:07:39.523):
12% 2001/system_server: 9% user + 3.6% kernel / faults: 8 minor
10% 2014/ActivityManager: 5.4% user + 5.4% kernel
1.8% 2399/UEventObserver: 1.8% user + 0% kernel
1.5% 13652/kworker/u16:7: 0% user + 1.5% kernel
2.3% TOTAL: 1.1% user + 1.1% kernel
上面的Logcat輸出并沒有關(guān)于我們進(jìn)程的CUP信息,說明我們的進(jìn)程CPU占用率很低阳欲。那么我們繼續(xù)看一下traces.txt文件。
traces.txt部分信息
----- pid 13626 at 2020-06-04 15:07:37 -----
Cmd line: com.example.android.jetpackdemo
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
通過進(jìn)程號(hào)pid 13626搜索
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=13626 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=S schedstat=( 288564792 6939269 224 ) utm=23 stm=5 core=0 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes=
at com.example.android.jetpackdemo.StartActivity.mockDeadLock(StartActivity.kt:142)
- waiting to lock <0x0a43b5c8> (a java.lang.String) held by thread 17
at com.example.android.jetpackdemo.StartActivity.onClick(StartActivity.kt:70)
at java.lang.reflect.Method.invoke!(Native method)
at androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
at android.view.View.performClick(View.java:5646)
at android.view.View$PerformClick.run(View.java:22473)
at android.os.Handler.handleCallback(Handler.java:761)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:156)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
主線程狀態(tài)是線程狀態(tài)是Blocked陋率,說明正在等待獲取鎖對(duì)象球化,等待獲取的鎖對(duì)象<0x0a43b5c8>
是一個(gè)String對(duì)象(a java.lang.String),該對(duì)象被線程id為17的線程持有瓦糟。然后我們搜索這個(gè)鎖對(duì)象筒愚。
"Thread-2" prio=5 tid=17 Blocked
| group="main" sCount=1 dsCount=0 obj=0x12c89dc0 self=0x7f931cd000
| sysTid=13763 nice=0 cgrp=default sched=0/0 handle=0x7f84344450
| state=S schedstat=( 886406 280365 2 ) utm=0 stm=0 core=0 HZ=100
| stack=0x7f84242000-0x7f84244000 stackSize=1037KB
| held mutexes=
at com.example.android.jetpackdemo.StartActivity$mockDeadLock$1.invoke(StartActivity.kt:127)
- waiting to lock <0x0ec26674> (a java.lang.String) held by thread 1
- locked <0x0a43b5c8> (a java.lang.String)
at com.example.android.jetpackdemo.StartActivity$mockDeadLock$1.invoke(StartActivity.kt:21)
at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
Thread-2,線程狀態(tài)是Blocked菩浙,說明正在等待獲取鎖對(duì)象巢掺,等待獲取的鎖對(duì)象<0x0ec26674>
是一個(gè)String對(duì)象(a java.lang.String)句伶,這個(gè)對(duì)象被線程id為1的線程(也就是主線程)持有。并且當(dāng)前線程持有鎖對(duì)象<0x0a43b5c8>
陆淀。
最終考余,主線程和工作線程Thread-2造成死鎖,導(dǎo)致應(yīng)用無響應(yīng)轧苫。
5.主線程在對(duì)另一個(gè)進(jìn)程進(jìn)行同步Binder調(diào)用楚堤,而后者需要很長時(shí)間才能返回
我們的代碼是實(shí)現(xiàn)從客戶端的兩個(gè)EditText中獲取兩個(gè)數(shù)字,然后通過Binder調(diào)用服務(wù)端的方法計(jì)算兩個(gè)數(shù)的和返回給客戶端含懊,然后客戶端講計(jì)算結(jié)果展示在界面上身冬。完整代碼請(qǐng)參考 AIDLDemo。
客戶端部分代碼
private IMyAidlInterface iMyAidlInterface;
private ServiceConnection conn = new ServiceConnection() {
@Override
public void onServiceConnected(ComponentName name, IBinder service) {
//獲取Binder對(duì)象
iMyAidlInterface = IMyAidlInterface.Stub.asInterface(service);
}
//...
};
public void onClick(View view) {
switch (view.getId()) {
case R.id.btn_count:
mNum1 = Integer.parseInt(etNum1.getText().toString());
mNum2 = Integer.parseInt(etNum2.getText().toString());
try {
//在主線程進(jìn)行同步binder調(diào)用
mTotal = iMyAidlInterface.add(mNum1, mNum2);
} catch (RemoteException e) {
e.printStackTrace();
Log.e(TAG, "onClick: " + e.getMessage());
}
editShowResult.setText("mTotal=" + mTotal);
break;
}
}
服務(wù)端部分代碼
public class IRemoteService extends Service {
private static final String TAG = "IRemoteService";
private IBinder iBinder = new IMyAidlInterface.Stub() {
@Override
public int add(int num1, int num2) throws RemoteException {
Log.d(TAG, "remote method add: start sleep thread id =" + Thread.currentThread().getId()+"," +
"thread name = "+Thread.currentThread().getName());
try {
//睡眠一段時(shí)間岔乔,然后才進(jìn)行計(jì)算
Thread.sleep(120000);
} catch (InterruptedException e) {
e.printStackTrace();
}
Log.d(TAG, "remote method add: finish sleep return calculate result");
return num1 + num2;
}
};
public IRemoteService() {
}
@Override
public IBinder onBind(Intent intent) {
return iBinder;
}
}
注意:我們需要先把Binder服務(wù)端運(yùn)行起來酥筝,然后再運(yùn)行Binder客戶端執(zhí)行相應(yīng)的方法。
Logcat輸出
2020-06-04 15:49:47.006 2001-2014/? E/ActivityManager: ANR in com.hm.aidlclient (com.hm.aidlclient/.BaseKnowledgeActivity)
PID: 18096
Reason: Input dispatching timed out (Waiting to send key event because the focused window has not finished processing all of the input events that were previously delivered to it. Outbound queue length: 0. Wait queue length: 2.)
Load: 7.55 / 7.26 / 6.87
CPU usage from 755516ms to 0ms ago (2020-06-04 15:37:07.545 to 2020-06-04 15:49:43.062) with 99% awake:
5.1% 2001/system_server: 3.5% user + 1.5% kernel / faults: 139606 minor 17 major
1.2% 508/logd: 0.5% user + 0.6% kernel / faults: 35 minor
0.9% 2661/com.android.systemui: 0.7% user + 0.1% kernel / faults: 13039 minor 4 major
0.8% 12442/adbd: 0.2% user + 0.6% kernel / faults: 23957 minor
0.7% 607/surfaceflinger: 0.4% user + 0.3% kernel / faults: 183 minor 2 major
0.6% 28932/com.huawei.appmarket: 0.5% user + 0.1% kernel / faults: 9311 minor 64 major
0.4% 24463/com.huawei.hwid.persistent: 0.3% user + 0% kernel / faults: 11607 minor 6 major
0.5% 24301/fingerprint_log: 0% user + 0.5% kernel
0.3% 4128/com.google.android.gms: 0.2% user + 0% kernel / faults: 26970 minor 16 major
//...
2020-06-04 15:49:47.006 2001-2014/? E/ActivityManager: CPU usage from 1701ms to 2232ms later (2020-06-04 15:49:44.762 to 2020-06-04 15:49:45.293):
28% 2001/system_server: 21% user + 7.2% kernel / faults: 38 minor
16% 2010/HeapTaskDaemon: 16% user + 0% kernel
9% 2014/ActivityManager: 1.8% user + 7.2% kernel
1.8% 2001/system_server: 0% user + 1.8% kernel
1.8% 2540/NetdConnector: 1.8% user + 0% kernel
9% 607/surfaceflinger: 9% user + 0% kernel
5.4% 607/surfaceflinger: 5.4% user + 0% kernel
1.8% 658/Binder:607_1: 0% user + 1.8% kernel
1.8% 677/EventThread: 0% user + 1.8% kernel
7.1% 2661/com.android.systemui: 5.3% user + 1.7% kernel / faults: 38 minor
8.9% 3614/RenderThread: 7.1% user + 1.7% kernel
1.3% 508/logd: 1.3% user + 0% kernel
1.3% 624/mm-pp-dpps: 1.3% user + 0% kernel
2.7% 717/ABA_THREAD: 1.3% user + 1.3% kernel
1.5% 15978/mdss_fb0: 0% user + 1.5% kernel
1.6% 18228/logcat: 0% user + 1.6% kernel
8.3% TOTAL: 5.8% user + 2.5% kernel
Logcat輸出的信息中并沒有什么有價(jià)值的信息雏门。那么我們繼續(xù)看一下traces.txt文件樱哼。
traces.txt中客戶端相關(guān)信息
----- pid 18096 at 2020-06-04 15:49:43 -----
Cmd line: com.hm.aidlclient
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
通過進(jìn)程號(hào)pid 18096搜索
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=18096 nice=-10 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=S schedstat=( 464662186 22498334 359 ) utm=38 stm=8 core=0 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes=
kernel: __switch_to+0x70/0x7c
kernel: binder_thread_read+0x4cc/0x13f0
kernel: binder_ioctl+0x53c/0xbcc
kernel: do_vfs_ioctl+0x570/0x5a8
kernel: SyS_ioctl+0x60/0x88
kernel: el0_svc_naked+0x24/0x28
native: #00 pc 000000000006ad6c /system/lib64/libc.so (__ioctl+4)
native: #01 pc 000000000001fa48 /system/lib64/libc.so (ioctl+144)
native: #02 pc 00000000000555a4 /system/lib64/libbinder.so (_ZN7android14IPCThreadState14talkWithDriverEb+260)
native: #03 pc 0000000000056388 /system/lib64/libbinder.so (_ZN7android14IPCThreadState15waitForResponseEPNS_6ParcelEPi+352)
native: #04 pc 000000000004b250 /system/lib64/libbinder.so (_ZN7android8BpBinder8transactEjRKNS_6ParcelEPS1_j+72)
native: #05 pc 0000000000103354 /system/lib64/libandroid_runtime.so (???)
native: #06 pc 0000000000b36238 /data/dalvik-cache/arm64/system@framework@boot-framework.oat (Java_android_os_BinderProxy_transactNative__ILandroid_os_Parcel_2Landroid_os_Parcel_2I+196)
at android.os.BinderProxy.transactNative(Native method)
at android.os.BinderProxy.transact(Binder.java:617)
at com.hm.aidlserver.IMyAidlInterface$Stub$Proxy.add(IMyAidlInterface.java:90)
at com.hm.aidlclient.BaseKnowledgeActivity.onClick(BaseKnowledgeActivity.java:109)
at com.hm.aidlclient.BaseKnowledgeActivity_ViewBinding$1.doClick(BaseKnowledgeActivity_ViewBinding.java:41)
at butterknife.internal.DebouncingOnClickListener.onClick(DebouncingOnClickListener.java:22)
at android.view.View.performClick(View.java:5646)
at android.view.View$PerformClick.run(View.java:22473)
at android.os.Handler.handleCallback(Handler.java:761)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:156)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
這里我們看到Binder客戶端主線程的狀態(tài)是Native
,這個(gè)狀態(tài)是native線程的一個(gè)狀態(tài)剿配,對(duì)應(yīng)java線程的RUNNABLE
狀態(tài)搅幅。更詳細(xì)的對(duì)應(yīng)關(guān)系可以參考VMThread.java。然后從上面的信息中我們只看到BinderProxy調(diào)用了transactNative()方法呼胚,這是一個(gè)本地方法茄唐,最終會(huì)調(diào)用服務(wù)端Binder對(duì)象的transact()方法,實(shí)現(xiàn)真正的跨進(jìn)程通信蝇更。除了這些我們沒有看到其他有用的信息了沪编。那么我們接下來看一看服務(wù)端的一些信息,看看能不能找到一些線索年扩。
traces.txt中服務(wù)端相關(guān)信息
----- pid 17773 at 2020-06-04 15:49:43 -----
Cmd line: com.hm.aidlserver
Build fingerprint: 'HUAWEI/MLA-AL10/HWMLA:7.0/HUAWEIMLA-AL10/C00B364:user/release-keys'
通過進(jìn)程號(hào)pid 17773搜索
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 obj=0x77d21af8 self=0x7fa2ea2a00
| sysTid=17773 nice=0 cgrp=default sched=0/0 handle=0x7fa6f4ba98
| state=S schedstat=( 213791882 16481247 206 ) utm=18 stm=3 core=1 HZ=100
| stack=0x7fd42e0000-0x7fd42e2000 stackSize=8MB
| held mutexes=
kernel: __switch_to+0x70/0x7c
kernel: SyS_epoll_wait+0x2d4/0x394
kernel: SyS_epoll_pwait+0xc4/0x150
kernel: el0_svc_naked+0x24/0x28
native: #00 pc 000000000006ac80 /system/lib64/libc.so (__epoll_pwait+8)
native: #01 pc 000000000001e21c /system/lib64/libc.so (epoll_pwait+64)
native: #02 pc 00000000000181d8 /vendor/lib64/libutils.so (_ZN7android6Looper9pollInnerEi+156)
native: #03 pc 000000000001808c /vendor/lib64/libutils.so (_ZN7android6Looper8pollOnceEiPiS1_PPv+60)
native: #04 pc 00000000000f66dc /system/lib64/libandroid_runtime.so (_ZN7android18NativeMessageQueue8pollOnceEP7_JNIEnvP8_jobjecti+48)
native: #05 pc 0000000000b91ec0 /data/dalvik-cache/arm64/system@framework@boot-framework.oat (Java_android_os_MessageQueue_nativePollOnce__JI+140)
at android.os.MessageQueue.nativePollOnce(Native method)
at android.os.MessageQueue.next(MessageQueue.java:356)
at android.os.Looper.loop(Looper.java:138)
at android.app.ActivityThread.main(ActivityThread.java:6517)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)
服務(wù)端的進(jìn)程號(hào)是pid 17773蚁廓,我們看到服務(wù)端的主線程中也沒有什么線索,不要慌厨幻,這里我們似乎忘了一點(diǎn)什么相嵌,那就是服務(wù)端的Binder對(duì)象是運(yùn)行在服務(wù)端的Binder線程池中的。那我們?cè)趺凑业骄唧w是Binder線程池中的哪個(gè)線程呢况脆?其實(shí)在traces.txt文件中也是輸出了的饭宾。
----- binder transactions -----
18096:18096(m.hm.aidlclient:m.hm.aidlclient) -> 17773:17788(m.hm.aidlserver:Binder:17773_2) code: 1
----- end binder transactions -----
上面這段信息的意思就是,我們是在進(jìn)程id為18096格了,內(nèi)核線程為18096的線程(就是主線程)向進(jìn)程id為17773看铆,內(nèi)核線id為17788的線程發(fā)起跨進(jìn)程通信。內(nèi)核線程id為17788的線程的線程名稱是Binder:17773_2
盛末。那么我們就搜索一下Binder:17773_2
弹惦。搜索結(jié)果如下所示:
"Binder:17773_2" prio=5 tid=10 Sleeping
| group="main" sCount=1 dsCount=0 obj=0x32c064c0 self=0x7f9a624800
| sysTid=17788 nice=-10 cgrp=default sched=0/0 handle=0x7fa0fc3450
| state=S schedstat=( 3077762 6086666 14 ) utm=0 stm=0 core=6 HZ=100
| stack=0x7fa0ec9000-0x7fa0ecb000 stackSize=1005KB
| held mutexes=
at java.lang.Thread.sleep!(Native method)
- sleeping on <0x05eea4a7> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:379)
- locked <0x05eea4a7> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:321)
at com.hm.aidlserver.IRemoteService$1.add(IRemoteService.java:18)
at com.hm.aidlserver.IMyAidlInterface$Stub.onTransact(IMyAidlInterface.java:55)
at android.os.Binder.execTransact(Binder.java:565)
這里我們終于發(fā)現(xiàn)了原因否淤,我們看到Binder:17773_2
狀態(tài)是Sleeping,就是服務(wù)端的Binder對(duì)象的add()方法內(nèi)部第18行調(diào)用了Thread.sleep方法造成長時(shí)間無法返回棠隐,從而使客戶端方法執(zhí)行無法結(jié)束石抡,最終導(dǎo)致ANR。
總結(jié):本篇文章列舉了幾種常見原因造成ANR的示例宵荒,并分析了相關(guān)日志和traces.txt文件汁雷。然而在真實(shí)的場(chǎng)景中可能還會(huì)有各種稀奇古怪的原因造成ANR,排查起來也會(huì)復(fù)雜的多报咳,所以最重要的還是防患于未然侠讯,在實(shí)際的開發(fā)過程中盡量避免主線程被長時(shí)間阻塞。
參考鏈接: