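1 ANR Definition
ANR stands for Application Not Responding. When it occurs, the system usually shows a "not responding" dialog, although some systems suppress the dialog. ANR logs are stored under /data/anr/. These logs record VM-level information only and do not record native behavior.
1.1 Conditions and Cases for an ANR
An ANR occurs only when all three of the following conditions hold:
- Main thread: only a response timeout on the main thread of the application process produces an ANR.
- Timeout: the limit depends on the context in which the ANR arises, but an ANR occurs whenever that limit passes without a response.
- Input event or specific operation: input events are device input such as key presses and touches; specific operations are the lifecycle callbacks of BroadcastReceiver and Service.
Because the context differs, the cause of an ANR differs too. There are three main cases: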
- InputDispatchingTimeout: the main thread fails to respond to a screen touch or keyboard input event within 5 seconds.
- BroadcastQueueTimeout: a BroadcastReceiver's onReceive() runs on the main thread and fails to finish within the time limit (10 seconds); for background processes the limit is 60 seconds.
- ServiceTimeout: a less common type, caused by a Service lifecycle function failing to finish within the time limit (20 seconds).
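The limits listed above can be kept in one place when writing tooling or tests around them. The sketch below is only an illustration: the enum name `AnrTimeout` and its methods are hypothetical, and the numbers are the ones quoted in this article, which may differ across Android versions and vendors.

```java
import java.util.concurrent.TimeUnit;

class AnrTimeoutDemo {
    public static void main(String[] args) {
        // Print each context together with its limit in milliseconds.
        for (AnrTimeout t : AnrTimeout.values()) {
            System.out.println(t + " limit = " + t.limitMillis() + " ms");
        }
    }
}

/** Hypothetical lookup table; the values are the ones quoted in this article. */
enum AnrTimeout {
    INPUT_DISPATCHING(5),
    BROADCAST_FOREGROUND(10),
    BROADCAST_BACKGROUND(60),
    SERVICE(20);

    private final long limitSeconds;

    AnrTimeout(long limitSeconds) {
        this.limitSeconds = limitSeconds;
    }

    long limitMillis() {
        return TimeUnit.SECONDS.toMillis(limitSeconds);
    }

    /** True when the elapsed time is strictly greater than the limit for this context. */
    boolean isExceeded(long elapsedMillis) {
        return elapsedMillis > limitMillis();
    }
}
```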
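2 How ANR Works
2.1 How an ANR Is Triggered
For a detailed code analysis, see reference [3].
An ANR is produced as follows:
- Check whether the window can receive the event:
  - If the window is paused, the event cannot be dispatched.
  - If the window's connection is not registered, the event cannot be dispatched.
  - If the window's connection is no longer valid, the event cannot be dispatched.
  - If the window's pending event queue, outboundQueue, is full, the event cannot be dispatched.
  - For a key event, if the window's outboundQueue or waitQueue contains data, the event cannot be dispatched.
  - For a non-key event, if the window's waitQueue is not empty and its head event has been waiting for more than 500 ms, the event cannot be dispatched.
- If the event cannot be dispatched, set a timer and retry after 5 s by default.
  - A successful dispatch in the meantime resets the timer.
- If dispatch still fails on the retry, call onANRLocked to show the ANR dialog.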
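The timer logic in the steps above can be modelled as a small state machine. This is a simplified sketch, not the real InputDispatcher code: the class `DispatchTimeoutTracker`, its `Result` enum, and the `onAttempt` method are hypothetical names, and the window checks are collapsed into a single boolean.

```java
/** Hypothetical model of the retry timer; the real logic lives in the native InputDispatcher. */
class DispatchTimeoutTracker {
    enum Result { DISPATCHED, WAITING, ANR }

    private final long timeoutMs;
    private long deadlineMs = -1; // -1 means no timer is armed

    DispatchTimeoutTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /**
     * Records one dispatch attempt.
     * windowReady stands in for all the "can the window receive the event" checks.
     */
    Result onAttempt(boolean windowReady, long nowMs) {
        if (windowReady) {
            deadlineMs = -1; // a successful dispatch resets the timer
            return Result.DISPATCHED;
        }
        if (deadlineMs < 0) {
            deadlineMs = nowMs + timeoutMs; // first failure arms the timer
            return Result.WAITING;
        }
        // Still failing: report an ANR once the deadline has passed.
        return nowMs >= deadlineMs ? Result.ANR : Result.WAITING;
    }

    public static void main(String[] args) {
        DispatchTimeoutTracker tracker = new DispatchTimeoutTracker(5000);
        System.out.println(tracker.onAttempt(false, 0));
        System.out.println(tracker.onAttempt(true, 4000));
        System.out.println(tracker.onAttempt(false, 6000));
        System.out.println(tracker.onAttempt(false, 11000));
    }
}
```

Passing the clock value in as a parameter keeps the model deterministic, which makes the reset-on-success behaviour easy to check.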
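2.2 Showing the ANR Dialog
For a detailed code analysis, see reference [3].
The ANR dialog is produced as follows:
- onANRLocked calls back from the native layer into IMS in the Java layer.
- IMS hands the ANR over to AMS.
- AMS handles the ANR:
  - The ANR information is dumped on the ServiceThread thread.
  - The AppErrorDialog is shown on the UIThread thread.
2.3 Collecting ANR Information
This code lives in the ActivityManagerService class: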
final void appNotResponding(ProcessRecord app, ActivityRecord activity,
        ActivityRecord parent, boolean aboveSystem, final String annotation) {
    ArrayList<Integer> firstPids = new ArrayList<Integer>(5);
    SparseArray<Boolean> lastPids = new SparseArray<Boolean>(20);
    // mController is an IActivityController instance reserved for the Monkey test tool; null by default
    if (mController != null) {
        try {
            // 0 == continue, -1 = kill process immediately
            int res = mController.appEarlyNotResponding(app.processName, app.pid, annotation);
            if (res < 0 && app.pid != MY_PID) {
                app.kill("anr", true);
            }
        } catch (RemoteException e) {
            mController = null;
            Watchdog.getInstance().setActivityController(null);
        }
    }
    // Update the CPU stats
    long anrTime = SystemClock.uptimeMillis();
    if (MONITOR_CPU_USAGE) {
        updateCpuStatsNow();
    }
    synchronized (this) {
        // Skip this ANR in certain cases: the system is shutting down,
        // or the process is already in the ANR or crash state
        if (mShuttingDown) {
            Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);
            return;
        } else if (app.notResponding) {
            Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);
            return;
        } else if (app.crashing) {
            Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);
            return;
        }
        // Mark the app so the same ANR is not processed twice
        // (this is the duplicate case checked above)
        app.notResponding = true;
        // Record the ANR in the Event Log
        EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
                app.processName, app.info.flags, annotation);
        // Add the current app to firstPids
        firstPids.add(app.pid);
        // Add the parent process to firstPids when there is one
        int parentPid = app.pid;
        if (parent != null && parent.app != null && parent.app.pid > 0) parentPid = parent.app.pid;
        if (parentPid != app.pid) firstPids.add(parentPid);
        if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);
        // Walk the LRU process list: persistent processes go to firstPids, the rest to lastPids
        for (int i = mLruProcesses.size() - 1; i >= 0; i--) {
            ProcessRecord r = mLruProcesses.get(i);
            if (r != null && r.thread != null) {
                int pid = r.pid;
                if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {
                    if (r.persistent) {
                        firstPids.add(pid);
                    } else {
                        lastPids.put(pid, Boolean.TRUE);
                    }
                }
            }
        }
    }
    // Build the ANR message in info. It is printed to logcat later under the ActivityManager tag
    // and contains the ANR process, the reason, and the CPU state at the time,
    // all of which matter when analyzing an ANR
    StringBuilder info = new StringBuilder();
    info.setLength(0);
    info.append("ANR in ").append(app.processName);
    if (activity != null && activity.shortComponentName != null) {
        info.append(" (").append(activity.shortComponentName).append(")");
    }
    info.append("\n");
    info.append("PID: ").append(app.pid).append("\n");
    if (annotation != null) {
        info.append("Reason: ").append(annotation).append("\n");
    }
    if (parent != null && parent != activity) {
        info.append("Parent: ").append(parent.shortComponentName).append("\n");
    }
    final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
    // Dump the stack traces to the traces file
    File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,
            NATIVE_STACKS_OF_INTEREST);
    String cpuInfo = null;
    if (MONITOR_CPU_USAGE) {
        updateCpuStatsNow();
        synchronized (mProcessCpuTracker) {
            cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);
        }
        info.append(processCpuTracker.printCurrentLoad());
        info.append(cpuInfo);
    }
    info.append(processCpuTracker.printCurrentState(anrTime));
    // Print the message to logcat
    Slog.e(TAG, info.toString());
    // If no traces file was created, send SIGQUIT so that only this process dumps its threads
    if (tracesFile == null) {
        // There is no trace file, so dump (only) the alleged culprit's threads to the log
        Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
    }
    // Also write the ANR information to the DropBox directory, so besides the traces file
    // there is a dropbox entry recording the ANR
    addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
            cpuInfo, tracesFile, null);
    //...
    synchronized (this) {
        mBatteryStatsService.noteProcessAnr(app.processName, app.uid);
        if (!showBackground && !app.isInterestingToUserLocked() && app.pid != MY_PID) {
            app.kill("bg anr", true);
            return;
        }
        // Set the app's notResponding state, and look up the errorReportReceiver
        makeAppNotRespondingLocked(app,
                activity != null ? activity.shortComponentName : null,
                annotation != null ? "ANR " + annotation : "ANR",
                info.toString());
        // Send SHOW_NOT_RESPONDING_MSG to show the ANR dialog
        Message msg = Message.obtain();
        HashMap<String, Object> map = new HashMap<String, Object>();
        msg.what = SHOW_NOT_RESPONDING_MSG;
        msg.obj = map;
        msg.arg1 = aboveSystem ? 1 : 0;
        map.put("app", app);
        if (activity != null) {
            map.put("activity", activity);
        }
        mUiHandler.sendMessage(msg);
    }
}
When an ANR occurs, the following steps run in order:
- Write the ANR reason to the EventLog. The am_anr entry in the EventLog is therefore the record closest to the moment the ANR was triggered.
- Collect and dump the stack traces of every thread in the important processes. This step is time-consuming.
- Output the CPU usage of each process and the overall CPU load.
- Save the traces file and the CPU usage information to DropBox, that is, the /data/system/dropbox directory.
- Depending on the process type, either kill the process silently in the background or show a dialog to the user.
The important processes whose traces are dumped include:
- firstPids: the ANR process comes first, then system_server, followed by all persistent processes.
- Native processes: mediaserver, sdcard, and surfaceflinger under /system/bin/.
- lastPids: every process in mLruProcesses that is not in firstPids.
For the full details, see reference [4].
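3 Common Causes of ANR and Solutions
- Main thread blocked
Solution: avoid deadlocks, move time-consuming or blocking work to worker threads, and avoid calling join(), sleep(), or wait() on the main thread. A typical failure is the UI thread waiting for a worker thread to release a lock, which leaves it unable to handle user input.
- I/O operations
Solution: avoid reading files or accessing databases on the main thread as far as possible, and do not overuse SharedPreferences.
- Frequent UI refreshes
Solution: avoid refreshing the UI in real time at a high rate. For something like download progress, sample the updates to lower the refresh frequency.
- ANRs in the major components
Solution: a BroadcastReceiver should not do complex work; it can start a Service from onReceive() to handle it. Avoid starting an Activity from an Intent receiver, because that creates a new screen and steals focus from whatever the user is currently running. If your application needs to show the user something in response to a broadcast Intent, use the Notification Manager.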
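The sampling idea for progress updates can be sketched in plain Java. The class `ProgressThrottler` and its parameters are hypothetical; on Android the sink would post to the UI, while here it simply receives the value. The clock is passed in explicitly so the behaviour is easy to verify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

/** Hypothetical helper that forwards at most one progress update per interval. */
class ProgressThrottler {
    private final long minIntervalMs;
    private final IntConsumer sink;
    private long lastEmitMs;
    private boolean emittedOnce = false;

    ProgressThrottler(long minIntervalMs, IntConsumer sink) {
        this.minIntervalMs = minIntervalMs;
        this.sink = sink;
    }

    /** Forwards the update if the interval has elapsed, or if it is the final 100%. */
    void onProgress(int percent, long nowMs) {
        boolean due = !emittedOnce || nowMs - lastEmitMs >= minIntervalMs;
        if (due || percent >= 100) {
            sink.accept(percent);
            lastEmitMs = nowMs;
            emittedOnce = true;
        }
    }

    public static void main(String[] args) {
        List<Integer> shown = new ArrayList<>();
        ProgressThrottler throttler = new ProgressThrottler(100, p -> shown.add(p));
        // Simulate a download reporting progress every 10 ms.
        for (int percent = 0; percent <= 100; percent++) {
            throttler.onProgress(percent, percent * 10L);
        }
        System.out.println("updates forwarded: " + shown);
    }
}
```

Always letting the final 100% through avoids a progress bar that stops just short of completion when the last update falls inside an interval.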
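4 ANR Analysis
An ANR leaves behind traces.txt and an ANR history record in the DropBox directory, so one way to analyze it is to read those files. DDMS can help as well. The two approaches are much the same and differ mainly in when they apply. Before turning to ANR analysis, let's look at how a plain Java application is analyzed.
4.1 Analyzing Java Thread Call Stacks
Why cover Java analysis before Android ANR analysis? Thread states need a brief introduction before ANR can be explained, and the Java case is an easy way into the method. The JDK ships a key command for analyzing and debugging Java applications, jstack. It is used like this: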
jstack {pid}
The pid can be obtained with the jps command, which lists all Java virtual machine processes currently running on the system, for example:
> jps
7266 Test
7267 Jps
The output shows two JVM processes on the system, 7266 and 7267. To see what the Test process is doing, run jstack on it. The output is straightforward: it prints the state and call stack of every thread in the process and even includes some simple analysis:
2016-06-20 14:01:54
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode):
"Attach Listener" daemon prio=5 tid=0x00007fde7385d800 nid=0x3507 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"DestroyJavaVM" prio=5 tid=0x00007fde73873000 nid=0x1303 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Thread-1" prio=5 tid=0x00007fde73872800 nid=0x4a03 waiting for monitor entry [0x000000011cb30000]
java.lang.Thread.State: BLOCKED (on object monitor)
at Test.rightLeft(Test.java:48)
- waiting to lock <0x00000007d56540a0> (a Test$LeftObject)
- locked <0x00000007d5656180> (a Test$RightObject)
at Test$2.run(Test.java:68)
at java.lang.Thread.run(Thread.java:745)
"Thread-0" prio=5 tid=0x00007fde73871800 nid=0x4803 waiting for monitor entry [0x000000011ca2d000]
java.lang.Thread.State: BLOCKED (on object monitor)
at Test.leftRight(Test.java:34)
- waiting to lock <0x00000007d5656180> (a Test$RightObject)
- locked <0x00000007d56540a0> (a Test$LeftObject)
at Test$1.run(Test.java:60)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" daemon prio=5 tid=0x00007fde73821000 nid=0x4403 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=5 tid=0x00007fde73035000 nid=0x4203 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=5 tid=0x00007fde7381e000 nid=0x4003 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=5 tid=0x00007fde7481d800 nid=0x300f runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=5 tid=0x00007fde73010000 nid=0x2d03 in Object.wait() [0x000000011aacb000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d55047f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
- locked <0x00000007d55047f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" daemon prio=5 tid=0x00007fde7300f000 nid=0x2b03 in Object.wait() [0x000000011a9c8000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d5504410> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
- locked <0x00000007d5504410> (a java.lang.ref.Reference$Lock)
"VM Thread" prio=5 tid=0x00007fde7300c800 nid=0x2903 runnable
"GC task thread#0 (ParallelGC)" prio=5 tid=0x00007fde74000800 nid=0x2103 runnable
"GC task thread#1 (ParallelGC)" prio=5 tid=0x00007fde7400c000 nid=0x2303 runnable
"GC task thread#2 (ParallelGC)" prio=5 tid=0x00007fde7400c800 nid=0x2503 runnable
"GC task thread#3 (ParallelGC)" prio=5 tid=0x00007fde7400d000 nid=0x2703 runnable
"VM Periodic Task Thread" prio=5 tid=0x00007fde7481e000 nid=0x4603 waiting on condition
JNI global references: 110
Found one Java-level deadlock:
=============================
"Thread-1":
waiting to lock monitor 0x00007fde73818ab8 (object 0x00000007d56540a0, a Test$LeftObject),
which is held by "Thread-0"
"Thread-0":
waiting to lock monitor 0x00007fde73819f58 (object 0x00000007d5656180, a Test$RightObject),
which is held by "Thread-1"
Java stack information for the threads listed above:
===================================================
"Thread-1":
at Test.rightLeft(Test.java:48)
- waiting to lock <0x00000007d56540a0> (a Test$LeftObject)
- locked <0x00000007d5656180> (a Test$RightObject)
at Test$2.run(Test.java:68)
at java.lang.Thread.run(Thread.java:745)
"Thread-0":
at Test.leftRight(Test.java:34)
- waiting to lock <0x00000007d5656180> (a Test$RightObject)
- locked <0x00000007d56540a0> (a Test$LeftObject)
at Test$1.run(Test.java:60)
at java.lang.Thread.run(Thread.java:745)
Found 1 deadlock.
4.1.1 Basic Thread Information
The output contains every thread. Take one of them:
"Thread-1" prio=5 tid=0x00007fde73872800 nid=0x4a03 waiting for monitor entry [0x000000011cb30000]
java.lang.Thread.State: BLOCKED (on object monitor)
at Test.rightLeft(Test.java:48)
- waiting to lock <0x00000007d56540a0> (a Test$LeftObject)
- locked <0x00000007d5656180> (a Test$RightObject)
at Test$2.run(Test.java:68)
at java.lang.Thread.run(Thread.java:745)
- "Thread-1" prio=5 tid=0x00007fde73872800 nid=0x4a03 waiting for monitor entry [0x000000011cb30000]
The line starts with the thread name, "Thread-1". prio=5 is the priority. tid is the JVM's identifier for the thread (an address), and nid is the native, OS-level thread id in hexadecimal. After that comes a description of what the thread is doing: waiting for monitor entry means the thread is waiting to enter a critical section, and the bracketed value [0x000000011cb30000] is an address in the thread's stack.
These descriptions vary. The ones that appear above are:
- waiting on condition (waiting for some event to occur)
- waiting for monitor entry (waiting to enter a critical section)
- runnable (running)
- in Object.wait (waiting inside Object.wait())
- java.lang.Thread.State: BLOCKED (on object monitor)
This line gives the thread state. Java's six thread states are defined in Thread.java:
//Thread.java
public class Thread implements Runnable {
    ...
    public enum State {
        /**
         * The thread has been created, but has never been started.
         */
        NEW,
        /**
         * The thread may be run.
         */
        RUNNABLE,
        /**
         * The thread is blocked and waiting for a lock.
         */
        BLOCKED,
        /**
         * The thread is waiting.
         */
        WAITING,
        /**
         * The thread is waiting for a specified amount of time.
         */
        TIMED_WAITING,
        /**
         * The thread has been terminated.
         */
        TERMINATED
    }
    ...
}
- at xxx: the call stack
at Test.rightLeft(Test.java:48)
- waiting to lock <0x00000007d56540a0> (a Test$LeftObject)
- locked <0x00000007d5656180> (a Test$RightObject)
at Test$2.run(Test.java:68)
at java.lang.Thread.run(Thread.java:745)
This is the thread's call stack. When jstack ran, the thread had reached line 48 of Test.java. Between line 68 and line 48 it locked a Test$RightObject, and it is now waiting to lock a Test$LeftObject.
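The thread states can also be observed from code with Thread.getState(). The sketch below is a minimal illustration with a hypothetical class name, `ThreadStateDemo`: it captures a thread's state before it starts, while it is blocked on a monitor held by the caller, and after it has finished.

```java
class ThreadStateDemo {
    /** Returns the states seen before start, while blocked on a monitor, and after termination. */
    static Thread.State[] observe() {
        final Object lock = new Object();
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do: entering the monitor is the point of the demo.
            }
        });
        Thread.State before = worker.getState();
        Thread.State blocked;
        try {
            synchronized (lock) {
                worker.start();
                // Poll until the worker is parked on the monitor that this thread holds.
                while (worker.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1);
                }
                blocked = worker.getState();
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Thread.State after = worker.getState();
        return new Thread.State[] { before, blocked, after };
    }

    public static void main(String[] args) {
        for (Thread.State state : observe()) {
            System.out.println(state);
        }
    }
}
```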
4.1.2 jstack's Analysis
Found one Java-level deadlock:
=============================
"Thread-1":
waiting to lock monitor 0x00007fde73818ab8 (object 0x00000007d56540a0, a Test$LeftObject),
which is held by "Thread-0"
"Thread-0":
waiting to lock monitor 0x00007fde73819f58 (object 0x00000007d5656180, a Test$RightObject),
which is held by "Thread-1"
Java stack information for the threads listed above:
===================================================
"Thread-1":
at Test.rightLeft(Test.java:48)
- waiting to lock <0x00000007d56540a0> (a Test$LeftObject)
- locked <0x00000007d5656180> (a Test$RightObject)
at Test$2.run(Test.java:68)
at java.lang.Thread.run(Thread.java:745)
"Thread-0":
at Test.leftRight(Test.java:34)
- waiting to lock <0x00000007d5656180> (a Test$RightObject)
- locked <0x00000007d56540a0> (a Test$LeftObject)
at Test$1.run(Test.java:60)
at java.lang.Thread.run(Thread.java:745)
The report is detailed. jstack concludes that the application has a Java-level deadlock: Thread-1 is waiting for a lock held by Thread-0, and Thread-0 is waiting for a lock held by Thread-1. That is indeed the case, as the source code confirms:
import java.util.concurrent.TimeUnit;

public class Test {
    public static class LeftObject {
    }
    public static class RightObject {
    }
    private Object leftLock = new LeftObject();
    private Object rightLock = new RightObject();
    // Takes leftLock first, then rightLock
    public void leftRight() {
        synchronized (leftLock) {
            try {
                TimeUnit.SECONDS.sleep(3);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            synchronized (rightLock) {
                System.out.println("leftRight");
            }
        }
    }
    // Takes rightLock first, then leftLock: the opposite order, which causes the deadlock
    public void rightLeft() {
        synchronized (rightLock) {
            try {
                TimeUnit.SECONDS.sleep(3);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            synchronized (leftLock) {
                System.out.println("rightLeft");
            }
        }
    }
    public static void main(String[] args) {
        final Test test = new Test();
        new Thread(new Runnable() {
            @Override
            public void run() {
                test.leftRight();
            }
        }).start();
        new Thread(new Runnable() {
            @Override
            public void run() {
                test.rightLeft();
            }
        }).start();
    }
}
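One common way to remove this kind of deadlock is to make every code path acquire the locks in the same order. The sketch below is a variant of the example under that assumption. The class `OrderedLockDemo` and the helper `runBoth` are hypothetical, and the sleep is shortened to keep the demo quick.

```java
import java.util.concurrent.TimeUnit;

/** Variant of the deadlock example in which both methods take the locks in the same order. */
class OrderedLockDemo {
    private final Object leftLock = new Object();
    private final Object rightLock = new Object();

    void leftRight() {
        synchronized (leftLock) {
            pause();
            synchronized (rightLock) {
                System.out.println("leftRight");
            }
        }
    }

    void rightLeft() {
        // Same order as leftRight(): leftLock first, then rightLock.
        synchronized (leftLock) {
            pause();
            synchronized (rightLock) {
                System.out.println("rightLeft");
            }
        }
    }

    private static void pause() {
        try {
            TimeUnit.MILLISECONDS.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs both methods concurrently and reports whether both threads finished. */
    static boolean runBoth() {
        OrderedLockDemo demo = new OrderedLockDemo();
        Thread a = new Thread(demo::leftRight);
        Thread b = new Thread(demo::rightLeft);
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both finished: " + runBoth());
    }
}
```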
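4.2 Analyzing ANR with DDMS
With that background, let's see how to analyze an Android ANR. Android's DDMS tool provides something similar to the jstack command and lets us inspect the Android VM's threads in real time while debugging.
4.2.1 Using the DDMS Update Threads Tool
Using the Update Threads tool in DDMS takes these steps:
1) Select the process to inspect;
2) Click the Update Threads button;
3) View the state of all threads of that process in the Threads view.
4.2.2 Reading the Update Threads Output
The Update Threads tool shows the state of every thread in the current process. The upper half is the thread list, and selecting an entry shows that thread's current call stack in the lower half.
1. Thread list
The thread list has several columns. ID is a sequence number, and a "*" marks a daemon thread. Tid is the thread id, Status is the thread state, utime is the cumulative time spent executing user code, stime is the cumulative time spent executing system code, and Name is the thread name. The exact utime and stime values rarely matter for this analysis. What deserves attention is the main thread and the thread states.
2. main thread
The main thread is the application's main thread. Select it in the thread list: when nobody is interacting with the app, its call stack almost always ends in the message loop waiting for the next message, much like the "main" thread in the trace shown in section 4.3.1.
This is an idle wait. The main thread waits for other threads or processes to post messages to it and then handles the corresponding events. If something goes wrong while the main thread is executing, an ANR results. Combined with the thread-state analysis below, a main thread whose status stays at MONITOR almost always indicates an ANR.
3. Thread states
The Java analysis above showed only six thread states, yet the Android VM reports more than six. The answer is again in the source. VMThread.java contains this code: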
/**
 * Holds a mapping from native Thread statuses to Java one. Required for
 * translating back the result of getStatus().
 */
static final Thread.State[] STATE_MAP = new Thread.State[] {
    Thread.State.TERMINATED,     // ZOMBIE
    Thread.State.RUNNABLE,       // RUNNING
    Thread.State.TIMED_WAITING,  // TIMED_WAIT
    Thread.State.BLOCKED,        // MONITOR
    Thread.State.WAITING,        // WAIT
    Thread.State.NEW,            // INITIALIZING
    Thread.State.NEW,            // STARTING
    Thread.State.RUNNABLE,       // NATIVE
    Thread.State.WAITING,        // VMWAIT
    Thread.State.RUNNABLE        // SUSPENDED
};
In addition, Thread.cpp in the native layer contains:
const char* dvmGetThreadStatusStr(ThreadStatus status) {
    switch (status) {
        case THREAD_ZOMBIE:         return "ZOMBIE";
        case THREAD_RUNNING:        return "RUNNABLE";
        case THREAD_TIMED_WAIT:     return "TIMED_WAIT";
        case THREAD_MONITOR:        return "MONITOR";
        case THREAD_WAIT:           return "WAIT";
        case THREAD_INITIALIZING:   return "INITIALIZING";
        case THREAD_STARTING:       return "STARTING";
        case THREAD_NATIVE:         return "NATIVE";
        case THREAD_VMWAIT:         return "VMWAIT";
        case THREAD_SUSPENDED:      return "SUSPENDED";
        default:                    return "UNKNOWN";
    }
}
So the Android VM has 10 thread states, which map as follows:
| State in Thread.java | State in Thread.cpp | Description |
|---|---|---|
| TERMINATED | ZOMBIE | The thread is dead and has stopped running |
| RUNNABLE | RUNNING/RUNNABLE | The thread is runnable or running |
| TIMED_WAITING | TIMED_WAIT | The thread called wait, sleep, or join with a timeout |
| BLOCKED | MONITOR | The thread is blocked waiting for an object lock |
| WAITING | WAIT | The thread called wait without a timeout |
| NEW | INITIALIZING | Newly created, initializing and being allocated resources |
| NEW | STARTING | Newly created, starting up |
| RUNNABLE | NATIVE | Executing a native JNI function |
| WAITING | VMWAIT | Waiting for a VM resource |
| RUNNABLE | SUSPENDED | The thread is suspended, usually by GC or the debugger |
| | UNKNOWN | Unknown state |
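When reading DDMS output or traces it can help to translate a native status name back to its Java state. The lookup below restates the table as code. The class `DalvikStateMapper` is a hypothetical helper, not part of the Android source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that restates the mapping table above as a lookup. */
class DalvikStateMapper {
    private static final Map<String, Thread.State> NATIVE_TO_JAVA = new HashMap<>();

    static {
        NATIVE_TO_JAVA.put("ZOMBIE", Thread.State.TERMINATED);
        NATIVE_TO_JAVA.put("RUNNING", Thread.State.RUNNABLE);
        NATIVE_TO_JAVA.put("RUNNABLE", Thread.State.RUNNABLE); // string printed for THREAD_RUNNING
        NATIVE_TO_JAVA.put("TIMED_WAIT", Thread.State.TIMED_WAITING);
        NATIVE_TO_JAVA.put("MONITOR", Thread.State.BLOCKED);
        NATIVE_TO_JAVA.put("WAIT", Thread.State.WAITING);
        NATIVE_TO_JAVA.put("INITIALIZING", Thread.State.NEW);
        NATIVE_TO_JAVA.put("STARTING", Thread.State.NEW);
        NATIVE_TO_JAVA.put("NATIVE", Thread.State.RUNNABLE);
        NATIVE_TO_JAVA.put("VMWAIT", Thread.State.WAITING);
        NATIVE_TO_JAVA.put("SUSPENDED", Thread.State.RUNNABLE);
    }

    /** Returns the Java state for a native status name, or null for names such as UNKNOWN. */
    static Thread.State toJavaState(String nativeStatus) {
        return NATIVE_TO_JAVA.get(nativeStatus);
    }

    public static void main(String[] args) {
        System.out.println("MONITOR -> " + toJavaState("MONITOR"));
        System.out.println("VMWAIT -> " + toJavaState("VMWAIT"));
    }
}
```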
4.2.3 An Example
public class MainActivity extends AppCompatActivity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        Button mBtn = (Button) findViewById(R.id.button);
        mBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                // Runs on the main thread
                print();
            }
        });
    }

    // Reads the file line by line and logs every line
    public void print() {
        BufferedReader bufferedReader = null;
        String tmp = null;
        try {
            bufferedReader = new BufferedReader(new FileReader(new File(Environment.getExternalStorageDirectory() + "/test")));
            while ((tmp = bufferedReader.readLine()) != null) {
                Log.i("wangchen", tmp);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (bufferedReader != null) {
                try {
                    bufferedReader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
This is a simple Activity: clicking the button reads a file and prints its contents to logcat. Nothing looks seriously wrong, yet the Activity stopped responding when the button was clicked.
In DDMS, the main thread stayed in the following call stack while the app was unresponsive:
at android.util.Log.println_native(Native Method)
at android.util.Log.i(Log.java:173)
at com.example.wangchen.androitest.MainActivity.print(MainActivity.java:37)
at com.example.wangchen.androitest.MainActivity$1.onClick(MainActivity.java:26)
at android.view.View.performClick(View.java:4446)
at android.view.View$PerformClick.run(View.java:18480)
at android.os.Handler.handleCallback(Handler.java:733)
at android.os.Handler.dispatchMessage(Handler.java:95)
at android.os.Looper.loop(Looper.java:136)
at android.app.ActivityThread.main(ActivityThread.java:5314)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:515)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:864)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:680)
at dalvik.system.NativeStart.main(Native Method)
From the main-thread analysis above, the main thread should normally be idle and waiting. If it spends a long time on a single task, other events sent to it cannot be handled in time, which leads to an ANR. Here the test file is 30 MB, so printing it in full takes a long time and the ANR is no surprise. File reads and writes should therefore run off the main thread.
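The same read loop can be moved off the calling thread. The sketch below shows the idea in plain Java with CompletableFuture; it is not the fix used by the original author. The class `BackgroundReadDemo` and its methods are hypothetical, and the code counts lines instead of logging them so that the result is easy to check. On Android the result would be delivered back to the main thread through a callback rather than by blocking on it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BackgroundReadDemo {
    /** Reads the source on the worker executor so the calling thread never blocks on I/O. */
    static CompletableFuture<Integer> countLinesAsync(Reader source, Executor worker) {
        return CompletableFuture.supplyAsync(() -> {
            try (BufferedReader reader = new BufferedReader(source)) {
                int lines = 0;
                while (reader.readLine() != null) {
                    lines++;
                }
                return lines;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, worker);
    }

    /** Demo-only wrapper that waits for the result; a UI thread should use a callback instead. */
    static int countLinesBlocking(String text) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return countLinesAsync(new StringReader(text), worker).join();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        countLinesAsync(new StringReader("first\nsecond\nthird"), worker)
                .thenAccept(count -> System.out.println("lines read: " + count))
                .join();
        worker.shutdown();
    }
}
```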
4.3 Analyzing the traces File
During development most ANRs can be diagnosed with DDMS, but not every ANR shows up during development. When one occurs in testing or after release, the traces file is what we analyze. As noted earlier, it sits under /data/anr, and even on a non-rooted phone it can be pulled out with adb. A traces file covers every active process at the time of the ANR, and each process section covers all of its threads, so the file is usually large. The process that caused the ANR is generally recorded first, so analyzing that first section in as much detail as possible is the key to reading a traces file.
4.3.1 Contents of the traces File
adb pull /data/anr/traces.txt ./mytraces.txt
This command retrieves the traces file, whose content looks like this:
// Process header: pid, current time, process name
----- pid 4280 at 2016-05-30 00:17:13 -----
Cmd line: com.quicinc.cne.CNEService
// Process resource status
Build fingerprint: 'Xiaomi/virgo/virgo:6.0.1/MMB29M/6.3.21:user/release-keys'
ABI: 'arm'
Build type: optimized
Zygote loaded classes=4124 post zygote classes=18
Intern table: 51434 strong; 17 weak
JNI: CheckJNI is off; globals=286 (plus 277 weak)
Libraries: /system/lib/libandroid.so /system/lib/libcompiler_rt.so /system/lib/libjavacrypto.so /system/lib/libjnigraphics.so /system/lib/libmedia_jni.so /system/lib/libmiuinative.so /system/lib/libsechook.so /system/lib/libwebviewchromium_loader.so libjavacore.so (9)
Heap: 50% free, 16MB/33MB; 33690 objects
Dumping cumulative Gc timings
Total number of allocations 33690
Total bytes allocated 16MB
Total bytes freed 0B
Free memory 16MB
Free memory until GC 16MB
Free memory until OOME 111MB
Total memory 33MB
Max memory 128MB
Zygote space size 1624KB
Total mutator paused time: 0
Total time waiting for GC to complete: 0
Total GC count: 0
Total GC time: 0
Total blocking GC count: 0
Total blocking GC time: 0
suspend all histogram: Sum: 102us 99% C.I. 3us-25us Avg: 8.500us Max: 25us
// Per-thread information
DALVIK THREADS (10):
"Signal Catcher" daemon prio=5 tid=2 Runnable
| group="system" sCount=0 dsCount=0 obj=0x12c470a0 self=0xaeb8b000
| sysTid=4319 nice=0 cgrp=default sched=0/0 handle=0xb424f930
| state=R schedstat=( 111053493 34114006 33 ) utm=6 stm=5 core=0 HZ=100
| stack=0xb4153000-0xb4155000 stackSize=1014KB
| held mutexes= "mutator lock"(shared held)
native: #00 pc 00370e89 /system/lib/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int, char const*, art::ArtMethod*, void*)+160)
native: #01 pc 003504f7 /system/lib/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+150)
native: #02 pc 0035a3fb /system/lib/libart.so (art::DumpCheckpoint::Run(art::Thread*)+442)
native: #03 pc 0035afb9 /system/lib/libart.so (art::ThreadList::RunCheckpoint(art::Closure*)+212)
native: #04 pc 0035b4e7 /system/lib/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+142)
native: #05 pc 0035bbf7 /system/lib/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+334)
native: #06 pc 00333d3f /system/lib/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+74)
native: #07 pc 0033b0a5 /system/lib/libart.so (art::SignalCatcher::HandleSigQuit()+928)
native: #08 pc 0033b989 /system/lib/libart.so (art::SignalCatcher::Run(void*)+340)
native: #09 pc 0003f54f /system/lib/libc.so (__pthread_start(void*)+30)
native: #10 pc 00019c2f /system/lib/libc.so (__start_thread+6)
(no managed stack frames)
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 obj=0x7541b3c0 self=0xb4cf6500
| sysTid=4280 nice=-1 cgrp=default sched=0/0 handle=0xb6f5cb34
| state=S schedstat=( 52155108 81807757 159 ) utm=2 stm=3 core=0 HZ=100
| stack=0xbe121000-0xbe123000 stackSize=8MB
| held mutexes=
native: #00 pc 00040984 /system/lib/libc.so (__epoll_pwait+20)
native: #01 pc 00019f5b /system/lib/libc.so (epoll_pwait+26)
native: #02 pc 00019f69 /system/lib/libc.so (epoll_wait+6)
native: #03 pc 00012c57 /system/lib/libutils.so (android::Looper::pollInner(int)+102)
native: #04 pc 00012ed3 /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+130)
native: #05 pc 00082bed /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, _jobject*, int)+22)
native: #06 pc 0000055d /data/dalvik-cache/arm/system@framework@boot.oat (Java_android_os_MessageQueue_nativePollOnce__JI+96)
at android.os.MessageQueue.nativePollOnce(Native method)
at android.os.MessageQueue.next(MessageQueue.java:323)
at android.os.Looper.loop(Looper.java:135)
at android.app.ActivityThread.main(ActivityThread.java:5435)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:735)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:625)
// ...
----- end 4280 -----
4.3.2 An Example
A traces.txt entry generally looks like the following. This one comes from an ANR caused by forcing the main thread to sleep:
DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0 hwl=0 hwll=0)
"main" prio=5 tid=1 Sleeping
  | group="main" sCount=1 dsCount=0 obj=0x73f11000 self=0xf3c25800
  | sysTid=2957 nice=0 cgrp=default sched=0/0 handle=0xf7770ea0
  | state=S schedstat=( 107710942 40533261 131 ) utm=4 stm=6 core=2 HZ=100
  | stack=0xff49d000-0xff49f000 stackSize=8MB
  | held mutexes=
  at java.lang.Thread.sleep!(Native method)
  - sleeping on <0x31fd6f5d> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:1031)
  - locked <0x31fd6f5d> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:985)
  at com.sunny.demo.MainActivity.startMethod(MainActivity.java:21)
  at java.lang.reflect.Method.invoke!(Native method)
  at java.lang.reflect.Method.invoke(Method.java:372)
  at android.view.View$1.onClick(View.java:4015)
  at android.view.View.performClick(View.java:4780)
  at android.view.View$PerformClick.run(View.java:19866)
  at android.os.Handler.handleCallback(Handler.java:739)
  at android.os.Handler.dispatchMessage(Handler.java:95)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:5254)
  at java.lang.reflect.Method.invoke!(Native method)
  at java.lang.reflect.Method.invoke(Method.java:372)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:903)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:698)
- Line 1: a fixed header, "DALVIK THREADS", indicating that what follows are the threads currently running in the Dalvik VM.
- Line 2: the values of the mutexes used by the threads in this process. Some phones do not print this line.
- Line 3: the thread name ("main"), priority ("prio=5"), thread id ("tid=1"), and state (Sleeping). Other common states are Native and Waiting.
- Line 4: the thread group ("main"), the number of times the thread was suspended normally ("sCount=1"), the number of times it was suspended by the debugger ("dsCount=0"), the Java thread object associated with it ("obj=0x73f11000"), and the address of the thread itself ("self=0xf3c25800").
- Line 5: scheduling information: the native thread id under Linux ("sysTid=2957"), the scheduling priority ("nice=0"), the control group ("cgrp=default"), the scheduling policy ("sched=0/0"), and the handler address ("handle=0xf7770ea0").
- Line 6: more of the thread's current context: the scheduling statistics read from /proc/[pid]/task/[tid]/schedstat ("schedstat=( 107710942 40533261 131 )"), followed by run-time figures, namely the time spent in user mode in jiffies ("utm=4"), the time scheduled in kernel mode ("stm=6"), and the CPU core that last ran the thread ("core=2").
- Line 7: the address range of the thread's stack ("stack=0xff49d000-0xff49f000") and the stack size ("stackSize=8MB").
- The remaining lines are the thread's call stack, which is the core of ANR analysis.
In the call stack, the frame at com.sunny.demo.MainActivity.startMethod(MainActivity.java:21) points straight to the problem.
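When many traces need to be scanned, the thread header line described above can be parsed with a regular expression. The parser below is a rough sketch that covers only the header format shown in this article. The class `TraceHeaderParser` and its fields are hypothetical, and other Android versions may print the line differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for header lines such as: "main" prio=5 tid=1 Sleeping */
class TraceHeaderParser {
    static final class ThreadHeader {
        final String name;
        final boolean daemon;
        final int priority;
        final int tid;
        final String state;

        ThreadHeader(String name, boolean daemon, int priority, int tid, String state) {
            this.name = name;
            this.daemon = daemon;
            this.priority = priority;
            this.tid = tid;
            this.state = state;
        }
    }

    // Quoted name, optional "daemon" marker, priority, thread id, then the state word.
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\"( daemon)? prio=(\\d+) tid=(\\d+) (\\w+)$");

    /** Returns the parsed header, or null when the line is not a thread header. */
    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new ThreadHeader(
                m.group(1),
                m.group(2) != null,
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                m.group(5));
    }

    public static void main(String[] args) {
        ThreadHeader h = parse("\"main\" prio=5 tid=1 Sleeping");
        System.out.println(h.name + " is " + h.state);
    }
}
```

A main thread whose parsed state is something other than Native is a reasonable first place to look, since the idle main thread in section 4.3.1 is reported as Native.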
5 ANR Detection
5.1 StrictMode
StrictMode is a utility class in the Android SDK for detecting policy violations in code. It checks two broad categories of problems:
1. Thread policy (ThreadPolicy)
- detectCustomSlowCalls: detects custom slow calls.
- detectDiskReads: detects disk reads.
- detectDiskWrites: detects disk writes.
- detectNetwork: detects network operations.
2. VM policy (VmPolicy)
- detectActivityLeaks: detects leaked Activity instances.
- detectLeakedClosableObjects: detects Closeable objects that were never closed.
- detectLeakedSqlLiteObjects: detects leaked SQLite objects.
- setClassInstanceLimit: detects when the number of instances of a class exceeds a limit.
ThreadPolicy can therefore catch potentially slow operations on the main thread, and fixing what it reports reduces the chance of an ANR. Note: use it only in debug builds. Run the following code where the application initializes, for example in the onCreate method of the Application or MainActivity class:
@Override
protected void onCreate(Bundle savedInstanceState) {
    if (BuildConfig.DEBUG) {
        // Enable every thread-policy check
        StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder().detectAll().penaltyLog().build());
        // Enable every VM-policy check
        StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder().detectAll().penaltyLog().build());
    }
    super.onCreate(savedInstanceState);
}
In the initialization code above, penaltyLog prints violations to Logcat and detectAll enables every check. You can enable only some of the checks as needed.
5.2 BlockCanary
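BlockCanary is a non-intrusive performance monitoring library. Its usage is similar to LeakCanary's, and it monitors jank on the application's main thread. It relies on the main thread's message queue: it compares the timestamps at the start and end of each message dispatch to decide whether the dispatch exceeded a configured threshold.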
Add the dependency in build.gradle:
dependencies {
    compile 'com.github.moduth:blockcanary-android:1.2.1'
    // To enable BlockCanary monitoring and alerts only in debug builds,
    // use these two lines in place of the one above
    debugCompile 'com.github.moduth:blockcanary-android:1.2.1'
    releaseCompile 'com.github.moduth:blockcanary-no-op:1.2.1'
}
Then install it in the Application class and supply a context implementation:
public class DemoApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();
        // Call this during initialization in the main process
        BlockCanary.install(this, new AppBlockCanaryContext()).start();
    }
}
public class AppBlockCanaryContext extends BlockCanaryContext {
    // Implement the context callbacks here: app identifier, user uid, network type,
    // block threshold, log save location, and so on
}
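The timing principle described for BlockCanary, comparing the start and end of each message dispatch against a threshold, can be sketched in plain Java. This is not BlockCanary's implementation: the class `SlowDispatchMonitor`, its `Listener` interface, and the injected clock are all hypothetical, and a plain Runnable stands in for a message on the main thread's queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical monitor that times each dispatched task and reports the slow ones. */
class SlowDispatchMonitor {
    interface Listener {
        void onBlock(String label, long elapsedMs);
    }

    private final long thresholdMs;
    private final LongSupplier clockMs;
    private final Listener listener;

    SlowDispatchMonitor(long thresholdMs, LongSupplier clockMs, Listener listener) {
        this.thresholdMs = thresholdMs;
        this.clockMs = clockMs;
        this.listener = listener;
    }

    /** Runs one message and reports it when it took longer than the threshold. */
    void dispatch(String label, Runnable message) {
        long start = clockMs.getAsLong();
        try {
            message.run();
        } finally {
            long elapsed = clockMs.getAsLong() - start;
            if (elapsed > thresholdMs) {
                listener.onBlock(label, elapsed);
            }
        }
    }

    public static void main(String[] args) {
        long[] now = {0}; // fake clock so the demo does not depend on real time
        List<String> reports = new ArrayList<>();
        SlowDispatchMonitor monitor = new SlowDispatchMonitor(
                1000, () -> now[0], (label, ms) -> reports.add(label + ":" + ms));
        monitor.dispatch("fast", () -> now[0] += 10);
        monitor.dispatch("slow", () -> now[0] += 1500);
        System.out.println("blocked dispatches: " + reports);
    }
}
```

Injecting the clock keeps the example deterministic; a real monitor would read a monotonic system clock and would also capture the main thread's stack while the slow message is still running.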
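References:
[1] Android App Not Responding: Do You Really Understand It?
[2] How Android ANRs Arise and How to Analyze Them ★
[3] Android System Architecture: How IMS Produces an ANR ★★
[4] Understanding How Android Collects ANR Information ★
[5] Android ANR: Principles and Solutions
[6] Analyzing ANRs in Android Applications ★
[7] Causes of ANR and How to Locate Them ★
[8] Causes of ANR and How to Locate Them √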