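Introduction
These days my work feels almost completely detached from Android. I am basically just writing Java, and I may well be let go once my probation ends. A lot of what we use is proprietary to the company and I have no idea where to start studying it; my mentor's guidance amounts to a few sentences like "go look at the code in such-and-such a place". If it weren't for an expert sitting next to me, I really couldn't keep going. To the veterans out there: if I do get cut, please recommend me a job. I'm not afraid of overtime, only of having no discussion and no guidance.
In writing this post I also want to thank a former colleague at Xiaomi. She finished her master's degree at BUPT yet stayed very modest, and she was a great senior to me. Her job at the time was handling ANR issues, so some of the material here is quoted from her. There are of course points that are not explained clearly, such as broadcast ANR and input ANR, which we will cover in dedicated posts later.
That's enough digression; on to the main topic. ANR: Application Not Responding.
When an app stops responding and that dialog pops up, the Android system is letting us know that some faulty operation or code has kept the app from executing. The dialog exists to improve the user experience, but we developers should avoid ANRs if we still want to compete with iOS. Otherwise we really will lose.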
Core concepts
First, ANRs fall into four types:
- ServiceTimeout: Service (bind, create, start, unbind, and so on); an ANR occurs if handling does not complete within 20s in the foreground or 200s in the background
- BroadcastTimeout: BroadcastReceiver; an ANR occurs if handling does not complete within 10s in the foreground or 60s in the background
- KeyDispatchTimeout: mainly key or touch events; timing starts at the touch, and an ANR occurs if handling does not complete within 5s
- ProcessContentProviderPublishTimedOutLocked: ContentProvider; an ANR occurs if publish does not complete within 20s
Service-ANR
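If we understand the Service lifecycle, we know that some of its lifecycle methods run on the main thread. So we start by checking whether the methods that run on the main thread do any time-consuming work:
- onCreate()
- onStartCommand(), and so on
If the code logic looks fine, start checking the system state instead: CPU usage, the status of system services, and so on, to judge whether the process that hit the ANR was affected by abnormal system behavior.
Example of the error log:
Reason: Executing service com.android.bluetooth/.btservice.AdapterService (package/component name)
So what is the mechanism behind a Service ANR? We can surely guess: the work ran on the main thread past the time limit, so an ANR occurred. If you had to design this yourself, how would you do it? You would plant a bomb first, and if nobody defuses it before the time is up, it explodes. Service ANR works exactly this way.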
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.executingServices.size() == 0 || proc.thread == null) {
        return;
    }
    long now = SystemClock.uptimeMillis();
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;
    // If SERVICE_TIMEOUT_MSG has still not been removed when the timeout expires, the service timeout flow runs
    mAm.mHandler.sendMessageAtTime(msg,
            proc.execServicesFg ? (now+SERVICE_TIMEOUT) : (now+ SERVICE_BACKGROUND_TIMEOUT));
}
This sends the delayed message SERVICE_TIMEOUT_MSG. The delay is:
- for a foreground service, SERVICE_TIMEOUT, i.e. timeout=20s;
- for a background service, SERVICE_BACKGROUND_TIMEOUT, i.e. timeout=200s;
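The "plant a bomb, defuse it in time" idea can be sketched in plain Java. This is only an illustration under my own assumptions, not AOSP code: `TimeoutBomb`, `plant`, `defuse` and `timedOut` are hypothetical names, and a `ScheduledExecutorService` stands in for the delayed `SERVICE_TIMEOUT_MSG` on the handler.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class, not AOSP code: models "plant a bomb, defuse it in time".
final class TimeoutBomb {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean exploded = new AtomicBoolean(false);
    private ScheduledFuture<?> pending;

    // Plant: plays the role of sending the delayed SERVICE_TIMEOUT_MSG.
    synchronized void plant(long timeoutMillis) {
        pending = timer.schedule(() -> exploded.set(true),
                timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Defuse: plays the role of removeMessages(SERVICE_TIMEOUT_MSG, ...).
    synchronized void defuse() {
        if (pending != null) {
            pending.cancel(false);
        }
    }

    boolean hasExploded() {
        return exploded.get();
    }

    void shutdown() {
        timer.shutdownNow();
    }

    // Runs the work under a deadline and reports whether the deadline fired first.
    static boolean timedOut(Runnable work, long timeoutMillis) {
        TimeoutBomb bomb = new TimeoutBomb();
        bomb.plant(timeoutMillis);
        try {
            work.run();
        } finally {
            bomb.defuse();
            bomb.shutdown();
        }
        return bomb.hasExploded();
    }
}
```

Calling `TimeoutBomb.timedOut(work, 20_000)` would correspond loosely to the foreground case: if `work` returns before the timer fires, the pending task is cancelled and nothing "explodes".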
private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
        boolean finishing) {
    r.executeNesting--;
    if (r.executeNesting <= 0) {
        if (r.app != null) {
            r.app.execServicesFg = false;
            r.app.executingServices.remove(r);
            if (r.app.executingServices.size() == 0) {
                // Remove the service-start timeout message
                mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
            } else if (r.executeFg) {
                ...
            }
            if (inDestroying) {
                mDestroyingServices.remove(r);
                r.bindings.clear();
            }
            mAm.updateOomAdjLocked(r.app);
        }
        r.executeFg = false;
        ...
        if (finishing) {
            if (r.app != null && !r.app.persistent) {
                r.app.services.remove(r);
            }
            r.app = null;
        }
    }
}
Once handleCreateService() has run, the service-start timeout message SERVICE_TIMEOUT_MSG is removed. When an ANR occurs during service startup, the log shows "executing service [info of the ServiceRecord that timed out]"; this usually means the service's onCreate() callback took too long.
Where is it called from?
private void handleCreateService(CreateServiceData data) {
    // If the app was in the background and about to GC but is being brought back to an active state, skip this GC.
    unscheduleGcIdler();
    LoadedApk packageInfo = getPackageInfoNoCheck(data.info.applicationInfo, data.compatInfo);
    java.lang.ClassLoader cl = packageInfo.getClassLoader();
    // Create the target service object via reflection
    Service service = (Service) cl.loadClass(data.info.name).newInstance();
    ...
    try {
        // Create the ContextImpl object
        ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
        context.setOuterContext(service);
        // Create the Application object
        Application app = packageInfo.makeApplication(false, mInstrumentation);
        service.attach(context, this, data.info.name, data.token, app,
                ActivityManagerNative.getDefault());
        // Call the service's onCreate() method
        service.onCreate();
        mServices.put(data.token, service);
        // Report that service creation is done
        ActivityManagerNative.getDefault().serviceDoneExecuting(
                data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
    } catch (Exception e) {
        ...
    }
}
It is called right after onCreate() returns.
Broadcast timeout
The main thread did not finish executing a BroadcastReceiver's onReceive function within 10/60 seconds.
Example of the error log:
Reason: Broadcast of Intent { act=android.net.wifi.WIFI_STATE_CHANGED flg=0x4000010
cmp=com.android.settings/.widget.SettingsAppWidgetProvider (has extras) }
As with Service, some broadcast callbacks also run on the main thread. onReceive() can, however, be scheduled on another thread: registering the receiver with Context.registerReceiver(BroadcastReceiver, IntentFilter, String, Handler) lets you specify a Handler, so that onReceive() is dispatched on a non-main thread.
final void processNextBroadcast(boolean fromMsg) {
    ...
    r = mOrderedBroadcasts.get(0); // get the first of the ordered (serial) broadcasts
    boolean forceReceive = false;
    int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
    if (mService.mProcessesReady && r.dispatchTime > 0) {
        long now = SystemClock.uptimeMillis();
        if ((numReceivers > 0) && (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
            broadcastTimeoutLocked(false); // broadcast handling timed out, so forcibly finish this broadcast
        }
    }
    ...
    if (r.receivers == null || r.nextReceiver >= numReceivers
            || r.resultAbort || forceReceive) {
        if (r.resultTo != null) {
            // Deliver the broadcast message; this ends up calling onReceive()
            performReceiveLocked(r.callerApp, r.resultTo,
                    new Intent(r.intent), r.resultCode,
                    r.resultData, r.resultExtras, false, false, r.userId);
        }
        ...
    // part3: get the next receiver
    r.receiverTime = SystemClock.uptimeMillis();
    if (recIdx == 0) {
        r.dispatchTime = r.receiverTime;
        r.dispatchClockTime = System.currentTimeMillis();
    }
    if (!mPendingBroadcastTimeoutMessage) {
        long timeoutTime = r.receiverTime + mTimeoutPeriod;
        setBroadcastTimeoutLocked(timeoutTime); // set the delayed broadcast-timeout message
    }
    ...
}
The code checks whether the current time has already passed r.dispatchTime + 2 × mTimeoutPeriod × numReceivers.
In other words, each time the broadcast is dispatched to a receiver the time is updated, and afterwards a timeout message is sent.
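To make the two deadlines concrete, here is a small arithmetic sketch. It only restates the expressions visible in the listing above; the class and method names (`BroadcastDeadlines`, `overallDeadline`, `receiverDeadline`, `overallTimedOut`) are hypothetical and the times are plain millisecond values.

```java
// Hypothetical helper that mirrors the two expressions in processNextBroadcast().
final class BroadcastDeadlines {
    // Whole-broadcast deadline: dispatchTime + 2 * timeoutPeriod * numReceivers.
    static long overallDeadline(long dispatchTime, long timeoutPeriod, int numReceivers) {
        return dispatchTime + 2 * timeoutPeriod * numReceivers;
    }

    // Per-receiver deadline: receiverTime + timeoutPeriod.
    static long receiverDeadline(long receiverTime, long timeoutPeriod) {
        return receiverTime + timeoutPeriod;
    }

    // Same condition as the listing: only meaningful when there is at least one receiver.
    static boolean overallTimedOut(long now, long dispatchTime,
                                   long timeoutPeriod, int numReceivers) {
        return numReceivers > 0
                && now > overallDeadline(dispatchTime, timeoutPeriod, numReceivers);
    }
}
```

For example, a foreground broadcast (10s period) dispatched at t=1000ms to 3 receivers has an overall deadline of 1000 + 2 × 10000 × 3 = 61000ms, while each individual receiver gets 10000ms from the moment it is handed the broadcast.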
Input-ANR
The main thread did not finish processing an input event within 5 seconds.
Example log output:
Reason: Input dispatching timed out (Waiting because the focused window has not finished
processing the input events that were previously delivered to it.)
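- 1: The UI thread should, as far as possible, do only UI-related work
- 2: Put time-consuming work (database operations, I/O, network connections, or anything else that may block the UI thread) in a separate thread
- 3: Use a Handler wherever possible for interaction between the UI thread and other threads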
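The three rules above can be illustrated off-device. The following is a plain-Java analogue under my own assumptions, not Android code: two single-thread executors named "ui" and "io" stand in for the main Looper and a worker thread, and `OffMainThread.loadThenRender` is a hypothetical name. The slow step runs on "io" and only the result is handed back to "ui".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogue (not Android API): "ui" plays the main thread, "io" a worker.
final class OffMainThread {
    // Returns {thread that did the slow work, thread that consumed the result}.
    static String[] loadThenRender() {
        ExecutorService ui = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui"));
        ExecutorService io = Executors.newSingleThreadExecutor(r -> new Thread(r, "io"));
        try {
            return CompletableFuture
                    // The slow work (database, file, network) would go here, off the UI thread.
                    .supplyAsync(() -> Thread.currentThread().getName(), io)
                    // Posting the result back is roughly what a Handler does on Android.
                    .thenApplyAsync(ioThread -> new String[] {
                            ioThread, Thread.currentThread().getName()}, ui)
                    .join();
        } finally {
            ui.shutdown();
            io.shutdown();
        }
    }
}
```

On a device the same shape would typically use a worker thread plus a Handler bound to the main Looper; the executors here are only a stand-in so the sketch runs on a desktop JVM.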
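InputDispatcherThread is a thread that handles one round of message dispatch at a time. An input event, treated as a message, has to queue up and wait to be dispatched. Each Connection maintains two queues, outboundQueue and waitQueue:
outboundQueue: events waiting to be sent to the window
waitQueue: events already sent to the window
Once publishKeyEvent completes, the event counts as dispatched and is moved from outboundQueue to waitQueue.
After this round of processing the event has left InputDispatcher, but whether the window actually received it is still unknown until the receiver sends back its "finished" notification.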
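The two-queue bookkeeping can be modelled in a few lines. This is a simplified sketch, not the native InputDispatcher: `ConnectionQueues`, `Entry` and all method names are hypothetical, and the timeout check (the oldest unacknowledged event has waited more than 5s) is my assumption about how the 5-second rule relates to waitQueue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of one Connection's queues; not the real InputDispatcher.
final class ConnectionQueues {
    static final long INPUT_TIMEOUT_MS = 5_000; // the 5s figure from the text

    static final class Entry {
        final String event;
        long deliveryTime; // when the event was handed to the window
        Entry(String event) { this.event = event; }
    }

    private final Deque<Entry> outboundQueue = new ArrayDeque<>(); // not yet sent
    private final Deque<Entry> waitQueue = new ArrayDeque<>();     // sent, not yet finished

    void enqueue(String event) {
        outboundQueue.addLast(new Entry(event));
    }

    // Publish: move the next event from outboundQueue to waitQueue.
    boolean publishNext(long now) {
        Entry e = outboundQueue.pollFirst();
        if (e == null) {
            return false;
        }
        e.deliveryTime = now;
        waitQueue.addLast(e);
        return true;
    }

    // The window's "finished" notification retires the oldest waiting event.
    boolean finishOldest() {
        return waitQueue.pollFirst() != null;
    }

    // Assumed check: the oldest unacknowledged event has waited longer than the timeout.
    boolean isTimedOut(long now) {
        Entry oldest = waitQueue.peekFirst();
        return oldest != null && now - oldest.deliveryTime > INPUT_TIMEOUT_MS;
    }

    int outboundSize() { return outboundQueue.size(); }
    int waitSize() { return waitQueue.size(); }
}
```

The point of the model is that an event sitting in waitQueue is the window's responsibility: as long as the app's main thread does not send "finished", the entry stays there and eventually trips the timeout.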
I am not familiar with this part yet; a dedicated series on input events will follow later.
ContentProvider timeout
The main thread did not finish ContentProvider-related operations within the allotted time.
Example log:
Reason: timeout publishing content providers (no ANR dialog is shown).
This kind of ANR arises at application startup, when AMS.attachApplicationLocked() is called and all ContentProviders of the starting process are being published.
private final void processContentProviderPublishTimedOutLocked(ProcessRecord app){
    cleanupAppInLaunchingProvidersLocked(app, true);
    removeProcessLocked(app, false, true, "timeout publishing content providers");
}
AMS.attachApplicationLocked()
    |-->mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG)
AMS.publishContentProviders()
    |-->mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG)
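Common causes of ANR
- Time-consuming operations on the main thread
- Layout hierarchy too deep
- Too much code inside a for loop
- I/O performed on the main thread
- File operations performed on the main thread, and so on
- Blocked by the remote end of a Binder call
- Blocked by a synchronization lock held by a child thread
- Binder threads exhausted, so the main thread cannot communicate with SystemServer
- Unable to obtain system resources (CPU, RAM, IO, etc.)
- And so on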
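The "blocked by a lock held by a child thread" cause is easy to reproduce on a desktop JVM. The sketch below is an illustration under my own assumptions: `LockBlockDemo` and the thread names are hypothetical, and "fake-main" merely plays the role of an app's main thread. A plain JVM reports the state as `BLOCKED`; Android traces show the corresponding waiting-for-monitor state for the real main thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: a worker holds a lock that the "main" thread needs.
final class LockBlockDemo {
    static Thread.State observeBlockedState() {
        final Object lock = new Object();
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            synchronized (lock) {
                lockHeld.countDown();          // tell the test the lock is taken
                try {
                    release.await();           // keep holding it, like a slow child thread
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "worker");

        Thread fakeMain = new Thread(() -> {
            synchronized (lock) {
                // The main thread's work would happen here once the lock is free.
            }
        }, "fake-main");

        try {
            worker.start();
            lockHeld.await();
            fakeMain.start();

            // Poll until the thread is parked on the monitor (bounded wait of 2s).
            Thread.State state = fakeMain.getState();
            long deadline = System.currentTimeMillis() + 2000;
            while (state != Thread.State.BLOCKED && System.currentTimeMillis() < deadline) {
                Thread.sleep(5);
                state = fakeMain.getState();
            }
            release.countDown();
            worker.join();
            fakeMain.join();
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

If the real main thread sat in this state past one of the timeouts listed earlier, the result would be an ANR even though the main thread itself runs no slow code.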
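Log records
For ANRs the most routine approach is to read the logs and investigate the problem from there.
The files in the anr directory can be pulled to a computer for inspection:
adb pull /data/anr .
The property system can be queried with
adb shell getprop dalvik.vm.stack-trace-file
to look up the corresponding property value.
Once an ANR is triggered, AppErrors.appNotResponding() is called: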
final void appNotResponding(ProcessRecord app, ActivityRecord activity,
        ActivityRecord parent, boolean aboveSystem, final String annotation) {
    ArrayList<Integer> firstPids = new ArrayList<Integer>(5);
    SparseArray<Boolean> lastPids = new SparseArray<Boolean>(20);
    ...
    // Record the ANR time
    long anrTime = SystemClock.uptimeMillis();
    // Update the CPU state
    if (ActivityManagerService.MONITOR_CPU_USAGE) {
        mService.updateCpuStatsNow();
    }
    // Ignore the ANR in certain scenarios
    synchronized (mService) {
        if (mService.mShuttingDown) {
            Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);
            return;
        } else if (app.notResponding) {
            Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);
            return;
        } else if (app.crashing) {
            Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);
            return;
        }
    }
    // To avoid running this code repeatedly for the same app's ANR, mark it here; this is one of the special cases checked above
    app.notResponding = true;
    // Record the ANR in the event log
    EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
            app.processName, app.info.flags, annotation);
    // Add the current app to the firstPids list
    firstPids.add(app.pid);
    // If possible, add the parent process to the firstPids list
    int parentPid = app.pid;
    ...
    // Store the ANR information in the info variable; it is later printed to logcat under the ActivityManager tag and
    // contains the ANR process, the reason, and the CPU state at the time, all of which are very important for ANR analysis
    StringBuilder info = new StringBuilder();
    info.setLength(0);
    info.append("ANR in ").append(app.processName);
    if (activity != null && activity.shortComponentName != null) {
        info.append(" (").append(activity.shortComponentName).append(")");
    }
    info.append("\n");
    info.append("PID: ").append(app.pid).append("\n");
    if (annotation != null) {
        info.append("Reason: ").append(annotation).append("\n");
    }
    if (parent != null && parent != activity) {
        info.append("Parent: ").append(parent.shortComponentName).append("\n");
    }
    // Dump the ANR information to the traces file; two variants, one with native-layer information and one without
    ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
    String[] nativeProcs = NATIVE_STACKS_OF_INTEREST;
    // don't dump native PIDs for background ANRs
    File tracesFile = null;
    if (isSilentANR) {
        // A file is returned here; its path is `/data/anr/traces.txt`
        // How to look it up: adb shell getprop dalvik.vm.stack-trace-file
        tracesFile = mService.dumpStackTraces(true, firstPids, null, lastPids,
                null);
    } else {
        tracesFile = mService.dumpStackTraces(true, firstPids, processCpuTracker, lastPids,
                nativeProcs);
    }
    // Update the CPU info again and output it to the system log
    String cpuInfo = null;
    if (ActivityManagerService.MONITOR_CPU_USAGE) {
        mService.updateCpuStatsNow();
        synchronized (mService.mProcessCpuTracker) {
            cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
        }
        info.append(processCpuTracker.printCurrentLoad());
        info.append(cpuInfo);
    }
    info.append(processCpuTracker.printCurrentState(anrTime));
    Slog.e(TAG, info.toString());
    // The ANR information above has now been written to /data/anr/traces.txt
    // Send Process.SIGNAL_QUIT=3 to the lower layer
    if (tracesFile == null) {
        Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
    }
    // Save the traces file and the CPU usage info to dropbox, i.e. the data/system/dropbox directory
    // Naming: system_server/system_app/data_app + type + ..., for example:
    // data_app_anr@1501989621992.txt.gz
    // data_app_crash@1501989671926.txt
    mService.addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
            cpuInfo, tracesFile, null);
    synchronized (mService) {
        mService.mBatteryStatsService.noteProcessAnr(app.processName, app.uid);
        // If this is a background ANR, kill the process and finish
        if (isSilentANR) {
            app.kill("bg anr", true);
            return;
        }
        // Set the app's not-responding state and look up the errorReportReceiver
        makeAppNotRespondingLocked(app,
                activity != null ? activity.shortComponentName : null,
                annotation != null ? "ANR " + annotation : "ANR",
                info.toString());
        // Show the ANR dialog
        Message msg = Message.obtain();
        HashMap<String, Object> map = new HashMap<String, Object>();
        msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
        msg.obj = map;
        msg.arg1 = aboveSystem ? 1 : 0;
        map.put("app", app);
        if (activity != null) {
            map.put("activity", activity);
        }
        // Send the SHOW_NOT_RESPONDING_UI_MSG message to the UI thread
        mService.mUiHandler.sendMessage(msg);
    }
}
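Let's summarize what happened above:
- The CPU information is updated immediately, and the value is written to event_log:
/** 2721 cpu (total|1|6),(user|1|6),(system|1|6),(iowait|1|6),(irq|1|6),(softirq|1|6) */ public static final int CPU = 2721;
- Some ANRs are ignored
- am_anr is written to event_log; this is the record made immediately when the ANR occurs
- The ANR information is stored in the info variable and later printed to logcat under the ActivityManager tag; it contains the ANR process, the reason, and the CPU state at the time, all of which are very important for ANR analysis
- The ANR information is written to the /data/anr/traces file
- If nothing was written to the traces file, Process.SIGNAL_QUIT=3 is sent to the lower layer
- The traces file and the CPU usage information are saved to dropbox, i.e. the /data/system/dropbox directory
- If it is a background ANR, the process is killed and the flow ends
- The ANR dialog is shown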
Details
How does the information end up in /data/anr/traces.txt?
1. AMS.dumpStackTraces
public static File dumpStackTraces(boolean clearTraces, ArrayList<Integer> firstPids,
        ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids, String[] nativeProcs) {
    //tracesPath = "data/anr/traces.txt"
    String tracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null);
    if (tracesPath == null || tracesPath.length() == 0) {
        return null;
    }
    File tracesFile = new File(tracesPath);
    try {
        if (clearTraces && tracesFile.exists()) tracesFile.delete();
        tracesFile.createNewFile();
        FileUtils.setPermissions(tracesFile.getPath(), 0666, -1, -1); // -rw-rw-rw-
    } catch (IOException e) {
        Slog.w(TAG, "Unable to prepare ANR traces file: " + tracesPath, e);
        return null;
    }
    //[2]
    dumpStackTraces(tracesPath, firstPids, processCpuTracker, lastPids, nativeProcs);
    return tracesFile;
}
2.dumpStackTraces()
private static void dumpStackTraces(String tracesPath, ArrayList<Integer> firstPids,
        ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids, String[] nativeProcs){
    FileObserver observer = new FileObserver(tracesPath, FileObserver.CLOSE_WRITE) {
        @Override
        public synchronized void onEvent(int event, String path) { notify(); }
    };
    try {
        observer.startWatching();
        // Take the pids of the processes involved in the ANR and send Process.SIGNAL_QUIT=3 to each of them
        if (firstPids != null) {
            try {
                int num = firstPids.size();
                for (int i = 0; i < num; i++) {
                    synchronized (observer) {
                        final long sime = SystemClock.elapsedRealtime();
                        Process.sendSignal(firstPids.get(i), Process.SIGNAL_QUIT);
                        observer.wait(1000); // Wait for write-close, give up after 1 sec
                    }
                }
            } catch (InterruptedException e) {
                Slog.wtf(TAG, e);
            }
        }
        // Next, collect the stacks of the native pids
        if (nativeProcs != null) {
            int[] pids = Process.getPidsForCommands(nativeProcs);
            if (pids != null) {
                for (int pid : pids) {
                    final long sime = SystemClock.elapsedRealtime();
                    Debug.dumpNativeBacktraceToFileTimeout(pid, tracesPath, 10); // [3] dump the native process's trace, with a timeout limit
                }
            }
        }
        if (processCpuTracker != null) {
            processCpuTracker.init();
            System.gc();
            processCpuTracker.update();
            try {
                synchronized (processCpuTracker) {
                    processCpuTracker.wait(500); // measure over 1/2 second.
                }
            } catch (InterruptedException e) {
            }
            processCpuTracker.update();
            // Pick the top 5 processes by CPU usage from lastPids and dump their stacks
            final int N = processCpuTracker.countWorkingStats();
            int numProcs = 0;
            for (int i=0; i<N && numProcs<5; i++) {
                ProcessCpuTracker.Stats stats = processCpuTracker.getWorkingStats(i);
                if (lastPids.indexOfKey(stats.pid) >= 0) {
                    numProcs++;
                    try {
                        synchronized (observer) {
                            final long stime = SystemClock.elapsedRealtime();
                            Process.sendSignal(stats.pid, Process.SIGNAL_QUIT);
                            observer.wait(1000); // Wait for write-close, give up after 1 sec
                        }
                    } catch (InterruptedException e) {
                        Slog.wtf(TAG, e);
                    }
                } else if (DEBUG_ANR) {
                    Slog.d(TAG, "Skipping next CPU consuming process, not a java proc: "
                            + stats.pid);
                }
            }
        }
    } finally {
        observer.stopWatching();
    }
}
Summary:
- Collect the call stacks of the processes related to the ANR
  - the process in which the ANR occurred
  - the parent of the ANR process (the ANR process is created by AMS, AMS lives in the system_server process, so system_server is the parent of the ANR process)
  - all persistent processes in mLruProcesses
- Collect the call stacks of the native processes
  - "/system/bin/audioserver"
  - "/system/bin/cameraserver"
  - "/system/bin/drmserver"
  - "/system/bin/mediadrmserver"
  - "/system/bin/mediaserver"
  - "/system/bin/sdcard"
  - "/system/bin/surfaceflinger"
  - "media.codec" // system/bin/mediacodec
  - "media.extractor" // system/bin/mediaextractor
  - "com.android.bluetooth" // Bluetooth service
- Collect the stacks of the lastPids processes
  - only the top five are collected
Note how much time is spent waiting while this information is collected.
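The "top five" selection in the last step can be sketched independently of AOSP. The names below (`TopCpuPicker`, `Stat`, `pick`) are hypothetical; the sketch sorts by CPU usage itself rather than assuming the tracker already returns entries in that order, and it only counts processes that appear in `lastPids`, as the loop in the listing does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical helper: choose which lastPids processes get their stacks dumped.
final class TopCpuPicker {
    static final class Stat {
        final int pid;
        final double cpuPercent;
        Stat(int pid, double cpuPercent) {
            this.pid = pid;
            this.cpuPercent = cpuPercent;
        }
    }

    // Returns up to `limit` pids, highest CPU usage first, restricted to lastPids.
    static List<Integer> pick(List<Stat> stats, Set<Integer> lastPids, int limit) {
        List<Stat> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator.comparingDouble((Stat s) -> s.cpuPercent).reversed());

        List<Integer> picked = new ArrayList<>();
        for (Stat s : sorted) {
            if (picked.size() >= limit) {
                break;
            }
            // Processes outside lastPids are skipped and do not count toward the limit.
            if (lastPids.contains(s.pid)) {
                picked.add(s.pid);
            }
        }
        return picked;
    }
}
```

Since each selected process costs up to one second of waiting for its dump, capping the list at five keeps the worst-case delay of this step bounded.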
3.Debug.dumpNativeBacktraceToFileTimeout()
static void android_os_Debug_dumpNativeBacktraceToFileTimeout(JNIEnv* env, jobject clazz,
        jint pid, jstring fileName, jint timeoutSecs)
{
    if (fileName == NULL) {
        jniThrowNullPointerException(env, "file == null");
        return;
    }
    const jchar* str = env->GetStringCritical(fileName, 0);
    String8 fileName8;
    if (str) {
        fileName8 = String8(reinterpret_cast<const char16_t*>(str),
                env->GetStringLength(fileName));
        env->ReleaseStringCritical(fileName, str);
    }
    // Open the file (data/anr/traces.txt)
    int fd = open(fileName8.string(), O_CREAT | O_WRONLY | O_NOFOLLOW | O_CLOEXEC | O_APPEND, 0666);
    if (fd < 0) {
        fprintf(stderr, "Can't open %s: %s\n", fileName8.string(), strerror(errno));
        return;
    }
    dump_backtrace_to_file_timeout(pid, fd, timeoutSecs);//[4]
    close(fd);
}
4.dump_backtrace_to_file_timeout()
int dump_backtrace_to_file_timeout(pid_t tid, int fd, int timeout_secs) {
    // Send the dump request and obtain sock_fd
    int sock_fd = make_dump_request(DEBUGGER_ACTION_DUMP_BACKTRACE, tid, timeout_secs);
    if (sock_fd < 0) {
        return -1;
    }
    int result = 0;
    char buffer[1024];
    ssize_t n;
    int flag = 0;
    // Read from sock_fd and write the data into data/anr/traces.txt
    while ((n = TEMP_FAILURE_RETRY(read(sock_fd, buffer, sizeof(buffer)))) > 0) {
        flag = 1;
        if (TEMP_FAILURE_RETRY(write(fd, buffer, n)) != n) {
            result = -1;
            break;
        }
    }
    close(sock_fd);
    ...
    return result;
}
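In short, it sends DEBUGGER_ACTION_DUMP_BACKTRACE to the lower layer to request a sock_fd handle for the dump; the lower layer calls dump_backtraces() to gather the information, which is then written into data/anr/traces.txt.
When an ANR occurs, the timestamp closest to the ANR itself is that of the am_anr log entry. After that, all kinds of information are printed: the low-level dumps, the call stacks of the processes, and so on. Finally traces.txt is written to the data/system/dropbox directory and renamed according to the rule described above.
Supplement
Process.sendSignal(stats.pid, Process.SIGNAL_QUIT); sends signal 3 (SIGQUIT) to the process. Although it is nominally a quit signal, the Android runtime handles it by dumping its thread stacks, which is why it is used here.
Summary of ANR-related logs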
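- system.log contains the time at which the ANR occurred, the CPU information before the ANR, and a large amount of output from system services.
- main.log contains the app's own output before the ANR, which helps determine whether the app was misbehaving; it also contains GC output, which helps gauge how fast memory is being reclaimed and whether the system is low on memory or suffering from memory fragmentation.
- event.log contains the application lifecycle information emitted by AMS and WMS, which helps analyze window creation speed and focus changes.
- kernel.log contains the kernel's output; LowMemoryKiller kills, memory fragmentation or shortage, and mmc driver errors can all be found here.
For example:
Search the system log for the keyword "ANR in":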
09-16 00:50:10 820 907 E ActivityManager: ANR in com.android.systemui, time=130090695
09-16 00:50:10 820 907 E ActivityManager: Reason: Broadcast of Intent { act=android.intent.action.TIME_TICK
flg=0x50000114 (has extras) }
09-16 00:50:10 820 907 E ActivityManager: Load: 30.4 / 22.34 / 19.94
09-16 00:50:10 820 907 E ActivityManager: Android time :[2015-10-16 00:50:05.76] [130191,266]
09-16 00:50:10 820 907 E ActivityManager: CPU usage from 6753ms to -4ms ago:
09-16 00:50:10 820 907 E ActivityManager: 47% 320/netd: 3.1% user + 44% kernel / faults: 14886 minor 3 major
09-16 00:50:10 820 907 E ActivityManager: 15% 10007/com.sohu.sohuvideo: 2.8% user + 12% kernel / faults: 1144
minor
09-16 00:50:10 820 907 E ActivityManager: 13% 10654/hif_thread: 0% user + 13% kernel
09-16 00:50:10 820 907 E ActivityManager: 11% 175/mmcqd/0: 0% user + 11% kernel
09-16 00:50:10 820 907 E ActivityManager: 5.1% 12165/app_process: 1.6% user + 3.5% kernel / faults: 9703 minor
540 major
09-16 00:50:10 820 907 E ActivityManager: 3.3% 29533/com.android.systemui: 2.6% user + 0.7% kernel / faults:
8402 minor 343 major
09-16 00:50:10 820 907 E ActivityManager: 3.2% 820/system_server: 0.8% user + 2.3% kernel / faults: 5120 minor
523 major
09-16 00:50:10 820 907 E ActivityManager: 2.5% 11817/com.netease.pomelo.push.l.messageservice_V2: 0.7% user +
1.7% kernel / faults: 7728 minor 687 major
09-16 00:50:10 820 907 E ActivityManager: 1.6% 11887/com.android.email: 0.5% user + 1% kernel / faults: 6259
minor 587 major
09-16 00:50:10 820 907 E ActivityManager: 1.4% 11854/com.android.settings: 0.7% user + 0.7% kernel / faults:
5404 minor 471 major
09-16 00:50:10 820 907 E ActivityManager: 1.4% 11869/android.process.acore: 0.7% user + 0.7% kernel / faults:
6131 minor 561 major
09-16 00:50:10 820 907 E ActivityManager: 1.3% 11860/com.tencent.mobileqq: 0.1% user + 1.1% kernel / faults:
5542 minor 470 major
...
09-16 00:50:10 820 907 E ActivityManager: +0% 12832/cat: 0% user + 0% kernel
09-16 00:50:10 820 907 E ActivityManager: +0% 13211/zygote64: 0% user + 0% kernel
09-16 00:50:10 820 907 E ActivityManager: 87% TOTAL: 3% user + 18% kernel + 64% iowait + 0.5% softirq
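Searching for "ANR in" by hand gets tedious on large logs, so a small scanner can help. The sketch below is an assumption-laden illustration: `AnrLogScanner`, `processOf` and `reasonOf` are hypothetical names, and the regular expressions are derived only from the sample lines shown above, so other log layouts may need different patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner for the "ANR in" and "Reason:" lines of the system log.
final class AnrLogScanner {
    // Process name ends at whitespace, a comma, or the "(" of an activity name.
    private static final Pattern ANR_IN =
            Pattern.compile("ActivityManager: ANR in ([^\\s,(]+)");
    private static final Pattern REASON =
            Pattern.compile("ActivityManager: Reason: (.*)$");

    // Returns the process name from an "ANR in" line, or null if the line is something else.
    static String processOf(String line) {
        Matcher m = ANR_IN.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Returns the text after "Reason: ", or null if the line is something else.
    static String reasonOf(String line) {
        Matcher m = REASON.matcher(line);
        return m.find() ? m.group(1) : null;
    }
}
```

Feeding it the first two lines of the sample above yields the process `com.android.systemui` and a reason beginning with `Broadcast of Intent`, which immediately tells you this is a broadcast timeout rather than an input or service timeout.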
Information in the call stack
main (thread name), prio (thread priority, default 5), tid (unique thread ID), Sleeping (current thread state)
"main" prio=5 tid=1 Sleeping
| group="main" sCount=1 dsCount=0 obj=0x73132d10 self=0x5598a5f5e0
// sysTid is the thread number (the main thread's thread number is the same as the process ID)
| sysTid=17027 nice=0 cgrp=default sched=0/0 handle=0x7fb6db6fe8
| state=S schedstat=( 420582038 5862546 143 ) utm=24 stm=18 core=6 HZ=100
| stack=0x7fefba3000-0x7fefba5000 stackSize=8MB
| held mutexes=
// Java stack trace (shows the call path of the code that caused the ANR; the most important information for ANR analysis)
at java.lang.Thread.sleep!(Native method)
- sleeping on <0x0c60f3c7> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:1031)
- locked <0x0c60f3c7> (a java.lang.Object) // object 0x0c60f3c7 is locked
at java.lang.Thread.sleep(Thread.java:985)
at android.os.SystemClock.sleep(SystemClock.java:120)
at org.code.ipc.MessengerService.onCreate(MessengerService.java:63) // the code that caused the ANR
at android.app.ActivityThread.handleCreateService(ActivityThread.java:2877)
at android.app.ActivityThread.access$1900(ActivityThread.java:150)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1427)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:148)
at android.app.ActivityThread.main(ActivityThread.java:5417)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
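Reading a trace like the one above usually comes down to two questions: what state is the main thread in, and which frame belongs to the app. Below is a minimal sketch of both; `TraceReader`, `parseHeader` and `firstAppFrame` are hypothetical names, and the header pattern is based solely on the `"main" prio=5 tid=1 Sleeping` line shown here.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reader for one thread section of a traces file.
final class TraceReader {
    // Matches e.g.: "main" prio=5 tid=1 Sleeping
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\"(?: daemon)? prio=(\\d+) tid=(\\d+) (\\w+)$");

    // Returns {name, prio, tid, state}, or null if the line is not a thread header.
    static String[] parseHeader(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(4)};
    }

    // Returns the first "at <packagePrefix>..." frame, i.e. the topmost app frame, or null.
    static String firstAppFrame(List<String> lines, String packagePrefix) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("at " + packagePrefix)) {
                return trimmed;
            }
        }
        return null;
    }
}
```

For the sample trace, the header reports the main thread as `Sleeping`, and the first frame under the `org.code.` prefix is `MessengerService.onCreate`, which is exactly the line that caused the ANR.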
Thread state | Meaning |
---|---|
SUSPENDED | Thread suspended, possibly for trace output, GC, or by the debugger |
NATIVE | Executing a JNI native function |
MONITOR | Thread blocked, waiting to acquire an object lock |
WAIT | Called wait() with no timeout |
TIMED_WAIT | Called wait, sleep, or join with a timeout argument |
VMWAIT | Waiting for a VM resource |
RUNNING/RUNNABLE | Thread is runnable or running |
INITIALIZING | Newly created, initializing, resources being allocated |
STARTING | Newly created, starting up |
ZOMBIE | Thread is dead and terminated, waiting for its parent thread to reap it |
UNKNOWN | Unknown state |
Auxiliary keywords
Log keyword | Meaning |
---|---|
am_proc_start | App process creation begins |
am_proc_bound | App process creation finished |
am_restart_activity realActivityStart | First launch of the app after the process is created |
am_resume_activity | Window resume begins |
am_on_resume_called | Window resume finished |
am_pause_activity | Window pause begins |
am_on_paused_called | Window pause finished |
am_failed_to_pause | Window pause timed out |
am_finish_activity | App finish begins |
am_proc_died | Process died (for example, killed by LowMemoryKiller) |
Obtaining logs
- logcat: the adb logcat command prints Android's current runtime logs; the -b option selects which log buffer to print, and each buffer corresponds to one logcat log type.
- adb logcat -b all
- adb logcat -b radio
- adb logcat -b system
- adb logcat -b events
- adb logcat -b main
ANR (Application Not Responding) log: /data/anr/traces.txt