At work recently I needed to deal with apps that crash, and found that Android provides a manager that records various kinds of failures: DropBoxManager.
What is DropBoxManager?
Enqueues chunks of data (from various sources – application crashes, kernel log records, etc.). The queue is size bounded and will drop old data if the enqueued data exceeds the maximum size. You can think of this as a persistent, system-wide, blob-oriented “logcat”. DropBoxManager entries are not sent anywhere directly, but other system services and debugging tools may scan and upload entries for processing.
DropBoxManager is a mechanism that Android introduced in Froyo (API level 8) for persistently storing system data. It is mainly used to record logs when the kernel, system processes, or user processes run into serious problems while Android is running. You can think of it as a persistent, system-level logcat.
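To make the "size bounded, drops old data" behaviour concrete, here is a minimal plain-Java sketch of a queue with that property. It is only a conceptual model: the class name BoundedEntryQueue and the character-count quota are assumptions made for illustration, and the real service stores entries as files and applies its own quota rules.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Conceptual model only: a size-bounded queue that evicts its oldest entries. */
final class BoundedEntryQueue {
    /** One stored entry: a tag plus its text payload. */
    static final class Entry {
        final String tag;
        final String text;

        Entry(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    private final Deque<Entry> entries = new ArrayDeque<Entry>();
    private final int maxChars; // hypothetical quota, counted in characters
    private int usedChars = 0;

    BoundedEntryQueue(int maxChars) {
        this.maxChars = maxChars;
    }

    /**
     * Appends an entry, then drops entries from the oldest end until the
     * quota is met. An entry larger than the whole quota is dropped as well.
     */
    void add(String tag, String text) {
        entries.addLast(new Entry(tag, text));
        usedChars += text.length();
        while (usedChars > maxChars && !entries.isEmpty()) {
            usedChars -= entries.removeFirst().text.length();
        }
    }

    int size() {
        return entries.size();
    }

    /** Tag of the oldest surviving entry, or null if the queue is empty. */
    String oldestTag() {
        Entry oldest = entries.peekFirst();
        return oldest == null ? null : oldest.tag;
    }
}
```

With a quota of 10 characters, adding two 5-character entries and then a third entry pushes the first one out, which is the behaviour the documentation describes.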
You obtain the service by calling getSystemService(String) with the DROPBOX_SERVICE argument, and can then query all the system error records stored in DropBoxManager.
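The query pattern can be sketched as a loop that keeps asking for the next entry after the last timestamp seen. On a device the call is DropBoxManager.getNextEntry(String tag, long msec) on the object returned by getSystemService(Context.DROPBOX_SERVICE), and as far as I know reading entries requires the READ_LOGS permission. So that the sketch runs on a plain JVM, the EntrySource interface, the Entry class, and InMemorySource below are hypothetical stand-ins, not Android classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class DropBoxCursorSketch {
    /** Hypothetical stand-in for DropBoxManager.Entry (tag, timestamp, text). */
    static final class Entry {
        final String tag;
        final long timeMillis;
        final String text;

        Entry(String tag, long timeMillis, String text) {
            this.tag = tag;
            this.timeMillis = timeMillis;
            this.text = text;
        }
    }

    /** Hypothetical stand-in for DropBoxManager.getNextEntry(String, long). */
    interface EntrySource {
        /** First entry strictly after afterMillis with this tag (null tag = any), or null. */
        Entry getNextEntry(String tag, long afterMillis);
    }

    /** Walks the store from the oldest entry to the newest one for the given tag. */
    static List<Entry> readAll(EntrySource source, String tag) {
        List<Entry> result = new ArrayList<Entry>();
        long cursor = 0; // time of the last entry seen so far
        Entry e;
        while ((e = source.getNextEntry(tag, cursor)) != null) {
            result.add(e); // on Android, a real Entry must also be close()d after use
            cursor = e.timeMillis;
        }
        return result;
    }

    /** In-memory source for trying the loop without a device. */
    static final class InMemorySource implements EntrySource {
        private final TreeMap<Long, Entry> byTime = new TreeMap<Long, Entry>();

        void add(Entry e) {
            byTime.put(e.timeMillis, e);
        }

        @Override
        public Entry getNextEntry(String tag, long afterMillis) {
            for (Entry e : byTime.tailMap(afterMillis, false).values()) {
                if (tag == null || tag.equals(e.tag)) {
                    return e;
                }
            }
            return null;
        }
    }
}
```

Persisting the cursor value between runs would let a tool resume from where it stopped instead of re-reading every entry.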
Which system errors does Android record by default?
The official documentation does not say exactly which system errors are recorded. The relevant information can be found by following EXTRA_TAG (the entry tag) from the DropBoxManager.java source file.
- crash (application forced to close, Force Close)
When the Java layer hits an uncaught exception, ActivityManagerService records a crash entry in DropBoxManager and shows the Force Close dialog to the user.
/**
 * Used by {@link com.android.internal.os.RuntimeInit} to report when an application crashes.
 * The application process will exit immediately after this call returns.
 * @param app object of the crashing app, null for the system server
 * @param crashInfo describing the exception
 */
public void handleApplicationCrash(IBinder app, ApplicationErrorReport.CrashInfo crashInfo) {
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}
/* Native crash reporting uses this inner version because it needs to be somewhat
 * decoupled from the AM-managed cleanup lifecycle
 */
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    EventLog.writeEvent(EventLogTags.AM_CRASH, Binder.getCallingPid(),
            UserHandle.getUserId(Binder.getCallingUid()), processName,
            r == null ? -1 : r.info.flags,
            crashInfo.exceptionClassName,
            crashInfo.exceptionMessage,
            crashInfo.throwFileName,
            crashInfo.throwLineNumber);
    addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
    crashApplication(r, crashInfo);
}
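On the app side, the report starts from an uncaught-exception handler. The following plain-Java sketch shows how the four fields passed to EventLog.writeEvent above (exceptionClassName, exceptionMessage, throwFileName, throwLineNumber) could be derived from a Throwable. CrashSummarySketch and CrashSummary are hypothetical names; the real ApplicationErrorReport.CrashInfo may compute these values differently, for example by walking to the root cause first.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CrashSummarySketch {
    /** Hypothetical plain-Java analogue of the CrashInfo fields logged above. */
    static final class CrashSummary {
        final String exceptionClassName;
        final String exceptionMessage;
        final String throwFileName;
        final int throwLineNumber;

        CrashSummary(String className, String message, String fileName, int lineNumber) {
            this.exceptionClassName = className;
            this.exceptionMessage = message;
            this.throwFileName = fileName;
            this.throwLineNumber = lineNumber;
        }
    }

    /** Builds a summary from the throwable's class, message, and top stack frame. */
    static CrashSummary summarize(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();
        String file = frames.length > 0 ? frames[0].getFileName() : null;
        int line = frames.length > 0 ? frames[0].getLineNumber() : -1;
        return new CrashSummary(t.getClass().getName(), t.getMessage(), file, line);
    }

    /** Installs a process-wide handler that stores a summary of any uncaught exception. */
    static void install(final AtomicReference<CrashSummary> sink) {
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            @Override
            public void uncaughtException(Thread thread, Throwable t) {
                sink.set(summarize(t));
            }
        });
    }
}
```

In this sketch the handler only stores the summary; on Android the framework's own handler reports the crash to ActivityManagerService and the process then exits.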
- anr (Application Not Responding, ANR)
When an application's main thread (the UI thread) fails to respond for a long time, ActivityManagerService records an anr entry in DropBoxManager and shows the Application Not Responding dialog to the user.
final void appNotResponding(ProcessRecord app, ActivityRecord activity, ActivityRecord parent,
        boolean aboveSystem, final String annotation) {
    //......
    addErrorToDropBox("anr", app, app.processName, activity, parent, annotation, cpuInfo,
            tracesFile, null);
    //......
}
- wtf (What a Terrible Failure)
The android.util.Log class provides static wtf methods that an application can call to report a condition that should never happen. Depending on system settings, the call adds a wtf entry to DropBoxManager through ActivityManagerService and/or terminates the current application process.
/**
 * Used by {@link Log} via {@link com.android.internal.os.RuntimeInit} to report serious errors.
 * @param app object of the crashing app, null for the system server
 * @param tag reported by the caller
 * @param system whether this wtf is coming from the system
 * @param crashInfo describing the context of the error
 * @return true if the process should exit immediately (WTF is fatal)
 */
public boolean handleApplicationWtf(final IBinder app, final String tag, boolean system,
        final ApplicationErrorReport.CrashInfo crashInfo) {
    final int callingUid = Binder.getCallingUid();
    final int callingPid = Binder.getCallingPid();
    if (system) {
        // If this is coming from the system, we could very well have low-level
        // system locks held, so we want to do this all asynchronously. And we
        // never want this to become fatal, so there is that too.
        mHandler.post(new Runnable() {
            @Override public void run() {
                handleApplicationWtfInner(callingUid, callingPid, app, tag, crashInfo);
            }
        });
        return false;
    }
    final ProcessRecord r = handleApplicationWtfInner(callingUid, callingPid, app, tag,
            crashInfo);
    if (r != null && r.pid != Process.myPid() &&
            Settings.Global.getInt(mContext.getContentResolver(),
                    Settings.Global.WTF_IS_FATAL, 0) != 0) {
        crashApplication(r, crashInfo);
        return true;
    } else {
        return false;
    }
}
ProcessRecord handleApplicationWtfInner(int callingUid, int callingPid, IBinder app, String tag,
        final ApplicationErrorReport.CrashInfo crashInfo) {
    final ProcessRecord r = findAppProcess(app, "WTF");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    EventLog.writeEvent(EventLogTags.AM_WTF, UserHandle.getUserId(callingUid), callingPid,
            processName, r == null ? -1 : r.info.flags, tag, crashInfo.exceptionMessage);
    addErrorToDropBox("wtf", r, processName, null, null, tag, null, null, crashInfo);
    return r;
}
- strict_mode (StrictMode Violation)
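StrictMode, as the name suggests, checks more strictly than normal mode, and is typically used to detect network, file, and similar operations that should not run on the main thread. When a violation is reported with the PENALTY_DROPBOX policy bit set, ActivityManagerService records it in DropBoxManager as a strict_mode violation, suppressing repeats of a stack it has already logged.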
public void handleApplicationStrictModeViolation(
        IBinder app,
        int violationMask,
        StrictMode.ViolationInfo info) {
    ProcessRecord r = findAppProcess(app, "StrictMode");
    if (r == null) {
        return;
    }
    if ((violationMask & StrictMode.PENALTY_DROPBOX) != 0) {
        Integer stackFingerprint = info.hashCode();
        boolean logIt = true;
        synchronized (mAlreadyLoggedViolatedStacks) {
            if (mAlreadyLoggedViolatedStacks.contains(stackFingerprint)) {
                logIt = false;
                // TODO: sub-sample into EventLog for these, with
                // the info.durationMillis? Then we'd get
                // the relative pain numbers, without logging all
                // the stack traces repeatedly. We'd want to do
                // likewise in the client code, which also does
                // dup suppression, before the Binder call.
            } else {
                if (mAlreadyLoggedViolatedStacks.size() >= MAX_DUP_SUPPRESSED_STACKS) {
                    mAlreadyLoggedViolatedStacks.clear();
                }
                mAlreadyLoggedViolatedStacks.add(stackFingerprint);
            }
        }
        if (logIt) {
            logStrictModeViolationToDropBox(r, info);
        }
    }
    //......
}
// Depending on the policy in effect, there could be a bunch of
// these in quick succession so we try to batch these together to
// minimize disk writes, number of dropbox entries, and maximize
// compression, by having more fewer, larger records.
private void logStrictModeViolationToDropBox(
        ProcessRecord process,
        StrictMode.ViolationInfo info) {
    if (info == null) {
        return;
    }
    final boolean isSystemApp = process == null ||
            (process.info.flags & (ApplicationInfo.FLAG_SYSTEM |
                    ApplicationInfo.FLAG_UPDATED_SYSTEM_APP)) != 0;
    final String processName = process == null ? "unknown" : process.processName;
    final String dropboxTag = isSystemApp ? "system_app_strictmode" : "data_app_strictmode";
    final DropBoxManager dbox = (DropBoxManager)
            mContext.getSystemService(Context.DROPBOX_SERVICE);
    // Exit early if the dropbox isn't configured to accept this report type.
    if (dbox == null || !dbox.isTagEnabled(dropboxTag)) return;
    boolean bufferWasEmpty;
    boolean needsFlush;
    final StringBuilder sb = isSystemApp ? mStrictModeBuffer : new StringBuilder(1024);
    synchronized (sb) {
        bufferWasEmpty = sb.length() == 0;
        appendDropBoxProcessHeaders(process, processName, sb);
        sb.append("Build: ").append(Build.FINGERPRINT).append("\n");
        sb.append("System-App: ").append(isSystemApp).append("\n");
        sb.append("Uptime-Millis: ").append(info.violationUptimeMillis).append("\n");
        if (info.violationNumThisLoop != 0) {
            sb.append("Loop-Violation-Number: ").append(info.violationNumThisLoop).append("\n");
        }
        if (info.numAnimationsRunning != 0) {
            sb.append("Animations-Running: ").append(info.numAnimationsRunning).append("\n");
        }
        if (info.broadcastIntentAction != null) {
            sb.append("Broadcast-Intent-Action: ").append(info.broadcastIntentAction).append("\n");
        }
        if (info.durationMillis != -1) {
            sb.append("Duration-Millis: ").append(info.durationMillis).append("\n");
        }
        if (info.numInstances != -1) {
            sb.append("Instance-Count: ").append(info.numInstances).append("\n");
        }
        if (info.tags != null) {
            for (String tag : info.tags) {
                sb.append("Span-Tag: ").append(tag).append("\n");
            }
        }
        sb.append("\n");
        if (info.crashInfo != null && info.crashInfo.stackTrace != null) {
            sb.append(info.crashInfo.stackTrace);
        }
        sb.append("\n");
        // Only buffer up to ~64k. Various logging bits truncate
        // things at 128k.
        needsFlush = (sb.length() > 64 * 1024);
    }
    // Flush immediately if the buffer's grown too large, or this
    // is a non-system app. Non-system apps are isolated with a
    // different tag & policy and not batched.
    //
    // Batching is useful during internal testing with
    // StrictMode settings turned up high. Without batching,
    // thousands of separate files could be created on boot.
    if (!isSystemApp || needsFlush) {
        new Thread("Error dump: " + dropboxTag) {
            @Override
            public void run() {
                String report;
                synchronized (sb) {
                    report = sb.toString();
                    sb.delete(0, sb.length());
                    sb.trimToSize();
                }
                if (report.length() != 0) {
                    dbox.addText(dropboxTag, report);
                }
            }
        }.start();
        return;
    }
    // System app batching:
    if (!bufferWasEmpty) {
        // An existing dropbox-writing thread is outstanding, so
        // we don't need to start it up. The existing thread will
        // catch the buffer appends we just did.
        return;
    }
    // Worker thread to both batch writes and to avoid blocking the caller on I/O.
    // (After this point, we shouldn't access AMS internal data structures.)
    new Thread("Error dump: " + dropboxTag) {
        @Override
        public void run() {
            // 5 second sleep to let stacks arrive and be batched together
            try {
                Thread.sleep(5000); // 5 seconds
            } catch (InterruptedException e) {}
            String errorReport;
            synchronized (mStrictModeBuffer) {
                errorReport = mStrictModeBuffer.toString();
                if (errorReport.length() == 0) {
                    return;
                }
                mStrictModeBuffer.delete(0, mStrictModeBuffer.length());
                mStrictModeBuffer.trimToSize();
            }
            dbox.addText(dropboxTag, errorReport);
        }
    }.start();
}
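The duplicate suppression in handleApplicationStrictModeViolation can be tried in isolation. Below is a minimal plain-Java sketch of the same idea: remember a bounded set of fingerprints, log only the ones not seen yet, and clear the set once it is full. ViolationDeduper is a hypothetical class, and hashing the stack trace string is an assumption standing in for info.hashCode().

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of bounded duplicate suppression keyed by a stack fingerprint. */
final class ViolationDeduper {
    private final Set<Integer> seen = new HashSet<Integer>();
    private final int maxTracked; // plays the role of MAX_DUP_SUPPRESSED_STACKS

    ViolationDeduper(int maxTracked) {
        this.maxTracked = maxTracked;
    }

    /** Returns true if this stack has not been seen since the set was last cleared. */
    synchronized boolean shouldLog(String stackTrace) {
        Integer fingerprint = stackTrace.hashCode(); // assumption: hash of the trace text
        if (seen.contains(fingerprint)) {
            return false; // duplicate, suppress it
        }
        if (seen.size() >= maxTracked) {
            seen.clear(); // bound memory by forgetting everything, as the original does
        }
        seen.add(fingerprint);
        return true;
    }
}
```

Clearing the whole set is crude but cheap: a stack that was suppressed before the reset gets logged once more afterwards, which trades a few repeated entries for a fixed memory bound.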
- lowmem (low memory)
When memory runs low, Android kills background applications to free memory. If there are no background applications left to kill, ActivityManagerService records a lowmem entry in DropBoxManager.
public void handleMessage(Message msg) {
    switch (msg.what) {
        //...
        case REPORT_MEM_USAGE: {
            //......
            Thread thread = new Thread() {
                @Override public void run() {
                    StringBuilder dropBuilder = new StringBuilder(1024);
                    StringBuilder logBuilder = new StringBuilder(1024);
                    //......
                    addErrorToDropBox("lowmem", null, "system_server", null,
                            null, tag.toString(), dropBuilder.toString(), null, null);
                    //......
                }
            };
            thread.start();
            break;
        }
        //......
    }
}
- watchdog
If the Watchdog detects a problem in the system process (system_server), it adds a watchdog entry to DropBoxManager and kills the system process.
/** This class calls its monitor every minute. Killing this process if they don't return **/
public class Watchdog extends Thread {
    //......
    @Override
    public void run() {
        boolean waitedHalf = false;
        while (true) {
            //......
            // If we got here, that means that the system is most likely hung.
            // First collect stack traces from all threads of the system process.
            // Then kill this process so that the system will restart.
            //......
            // Try to add the error to the dropbox, but assuming that the ActivityManager
            // itself may be deadlocked. (which has happened, causing this statement to
            // deadlock and the watchdog as a whole to be ineffective)
            Thread dropboxThread = new Thread("watchdogWriteToDropbox") {
                public void run() {
                    mActivity.addErrorToDropBox(
                            "watchdog", null, "system_server", null, null,
                            name, null, stack, null);
                }
            };
            dropboxThread.start();
            try {
                dropboxThread.join(2000); // wait up to 2 seconds for it to return.
            } catch (InterruptedException ignored) {}
            //......
        }
    }
    //......
}
- netstats_error
NetworkStatsService collects and persists network statistics. When it encounters an obvious error in the network statistics, it adds a netstats_error entry to DropBoxManager.
- BATTERY_DISCHARGE_INFO
BatteryService monitors the charging state and updates the device's battery information. When it sees a significant discharge event, it adds a BATTERY_DISCHARGE_INFO entry to DropBoxManager.
- Checks after System Server finishes starting
After System Server finishes starting, it runs a series of self-checks, including:
-- Boot
Every boot adds a SYSTEM_BOOT entry.
-- System Server restart
If this is not the first start of System Server since boot, a SYSTEM_RESTART entry is added. Normally System Server starts only once per boot, so a second start indicates a bug.
-- Kernel Panic
When a kernel panic occurs, the kernel writes some log information to the file system. Because the kernel is already dead at that point, there is no other chance to record the error. The only way to detect a kernel panic is to check after the device boots whether these log files exist. If they do, the previous shutdown was caused by a kernel panic, and the logs are recorded in DropBoxManager. The tag names and corresponding files are:
SYSTEM_LAST_KMSG, if /proc/last_kmsg exists.
APANIC_CONSOLE, if /data/dontpanic/apanic_console exists.
APANIC_THREADS, if /data/dontpanic/apanic_threads exists.
-- System Recovery
The presence of the file /cache/recovery/log is used to detect whether the device rebooted because of a system recovery, in which case a SYSTEM_RECOVERY_LOG entry is added to DropBoxManager.
private void logBootEvents(Context ctx) throws IOException {
    final DropBoxManager db = (DropBoxManager) ctx.getSystemService(Context.DROPBOX_SERVICE);
    final SharedPreferences prefs = ctx.getSharedPreferences("log_files", Context.MODE_PRIVATE);
    final String headers = new StringBuilder(512)
            .append("Build: ").append(Build.FINGERPRINT).append("\n")
            .append("Hardware: ").append(Build.BOARD).append("\n")
            .append("Revision: ")
            .append(SystemProperties.get("ro.revision", "")).append("\n")
            .append("Bootloader: ").append(Build.BOOTLOADER).append("\n")
            .append("Radio: ").append(Build.RADIO).append("\n")
            .append("Kernel: ")
            .append(FileUtils.readTextFile(new File("/proc/version"), 1024, "...\n"))
            .append("\n").toString();
    String recovery = RecoverySystem.handleAftermath();
    if (recovery != null && db != null) {
        db.addText("SYSTEM_RECOVERY_LOG", headers + recovery);
    }
    if (SystemProperties.getLong("ro.runtime.firstboot", 0) == 0) {
        String now = Long.toString(System.currentTimeMillis());
        SystemProperties.set("ro.runtime.firstboot", now);
        if (db != null) db.addText("SYSTEM_BOOT", headers);
        // Negative sizes mean to take the *tail* of the file (see FileUtils.readTextFile())
        addFileToDropBox(db, prefs, headers, "/proc/last_kmsg",
                -LOG_SIZE, "SYSTEM_LAST_KMSG");
        addFileToDropBox(db, prefs, headers, "/cache/recovery/log",
                -LOG_SIZE, "SYSTEM_RECOVERY_LOG");
        addFileToDropBox(db, prefs, headers, "/data/dontpanic/apanic_console",
                -LOG_SIZE, "APANIC_CONSOLE");
        addFileToDropBox(db, prefs, headers, "/data/dontpanic/apanic_threads",
                -LOG_SIZE, "APANIC_THREADS");
    } else {
        if (db != null) db.addText("SYSTEM_RESTART", headers);
    }
    // Scan existing tombstones (in case any new ones appeared)
    File[] tombstoneFiles = TOMBSTONE_DIR.listFiles();
    for (int i = 0; tombstoneFiles != null && i < tombstoneFiles.length; i++) {
        addFileToDropBox(db, prefs, headers, tombstoneFiles[i].getPath(),
                LOG_SIZE, "SYSTEM_TOMBSTONE");
    }
    // Start watching for new tombstone files; will record them as they occur.
    // This gets registered with the singleton file observer thread.
    sTombstoneObserver = new FileObserver(TOMBSTONE_DIR.getPath(), FileObserver.CLOSE_WRITE) {
        @Override
        public void onEvent(int event, String path) {
            try {
                String filename = new File(TOMBSTONE_DIR, path).getPath();
                addFileToDropBox(db, prefs, headers, filename, LOG_SIZE, "SYSTEM_TOMBSTONE");
            } catch (IOException e) {
                Slog.e(TAG, "Can't log tombstone", e);
            }
        }
    };
    sTombstoneObserver.startWatching();
}
- SYSTEM_TOMBSTONE (native process crash)
A tombstone is the core dump log that Android uses to record native process crashes. Once the system service has finished starting, it registers an observer that watches for changes to the tombstone files. Whenever a new tombstone file is generated, a SYSTEM_TOMBSTONE entry is added to DropBoxManager.
How does DropBoxManager store its records?
DropBoxManager uses file storage. All records are kept in the /data/system/dropbox directory, one file per record. When a text file grows beyond the file system's minimum block size, DropBoxManager automatically compresses it. File names usually begin with the TAG argument passed to DropBoxManager.
$ adb shell ls -l /data/system/dropbox
-rw------- system system 258 2012-11-21 11:36 SYSTEM_RESTART@1353469017940.txt
-rw------- system system 39 2012-11-21 11:40 event_data@1353469222884.txt
-rw------- system system 39 2012-11-21 12:10 event_data@1353471022975.txt
-rw------- system system 34 2012-11-21 18:10 event_log@1353492624170.txt
-rw------- system system 34 2012-11-21 18:40 event_log@1353494424296.txt
-rw------- system system 34 2012-11-22 10:10 event_log@1353550227432.txt
-rw------- system system 1528 2012-11-21 22:54 system_app_crash@1353509648395.txt
-rw------- system system 1877 2012-11-21 11:36 system_app_strictmode@1353469014395.txt
-rw------- system system 3724 2012-11-21 11:36 system_app_strictmode@1353469014924.txt.gz
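Judging from the listing above, file names follow the shape TAG@TIMESTAMP.txt, with .gz appended when the entry was compressed. A small parser for that shape might look like the sketch below. It is based only on the names shown here: DropBoxFileName is a hypothetical helper, treating the number as milliseconds is an assumption, and the real service may use other suffixes that this code simply rejects.

```java
/** Sketch: splits a dropbox file name into tag, timestamp, and compression flag. */
final class DropBoxFileName {
    final String tag;
    final long timeMillis;
    final boolean compressed;

    private DropBoxFileName(String tag, long timeMillis, boolean compressed) {
        this.tag = tag;
        this.timeMillis = timeMillis;
        this.compressed = compressed;
    }

    /** Parses TAG@MILLIS.txt or TAG@MILLIS.txt.gz; returns null for any other shape. */
    static DropBoxFileName parse(String name) {
        int at = name.lastIndexOf('@');
        if (at <= 0) {
            return null; // no separator, or an empty tag
        }
        String rest = name.substring(at + 1);
        boolean gz = rest.endsWith(".gz");
        if (gz) {
            rest = rest.substring(0, rest.length() - ".gz".length());
        }
        if (!rest.endsWith(".txt")) {
            return null; // only the text entries seen in the listing are handled
        }
        String digits = rest.substring(0, rest.length() - ".txt".length());
        try {
            return new DropBoxFileName(name.substring(0, at), Long.parseLong(digits), gz);
        } catch (NumberFormatException e) {
            return null; // the part after '@' was not a number
        }
    }
}
```

For example, parsing system_app_strictmode@1353469014924.txt.gz yields the tag system_app_strictmode, the timestamp 1353469014924, and compressed set to true.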
How can you use DropBoxManager?
Use DropBoxManager to record error logs that need persistent storage
DropBoxManager offers an error logging mechanism besides logcat: a program can automatically record the relevant information to DropBoxManager when something goes wrong. Compared with logcat, DropBoxManager is better suited to automatic error capture, which avoids errors being missed through human oversight. And because DropBoxManager is a public Android service, compatibility problems are far less likely than with the many private implementations.
Automatic error reporting
You can combine DropBoxManager with the device's BugReport to report errors to a server automatically. Whenever a new record is created, DropBoxManager broadcasts a DropBoxManager.ACTION_DROPBOX_ENTRY_ADDED Intent. The device's BugReport service needs to listen for this Intent and then trigger the automatic upload.
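The listening side boils down to three steps: filter by tag, skip what was already handled, and hand the rest to an uploader. As far as I know the broadcast carries the tag and timestamp as extras (DropBoxManager.EXTRA_TAG and DropBoxManager.EXTRA_TIME), and a real receiver would read them inside a BroadcastReceiver. The sketch below models only the decision logic in plain Java so that it can run off-device; UploadTriggerSketch and Uploader are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch of the "entry added, so maybe upload" decision, modeled without Android classes. */
final class UploadTriggerSketch {
    /** Hypothetical uploader callback; a real service would queue a network job here. */
    interface Uploader {
        void upload(String tag, long timeMillis);
    }

    private final Set<String> watchedTags;
    private final Uploader uploader;
    private long lastUploadedMillis = -1; // newest entry time already handed off

    UploadTriggerSketch(Uploader uploader, String... watchedTags) {
        this.uploader = uploader;
        this.watchedTags = new HashSet<String>(Arrays.asList(watchedTags));
    }

    /**
     * Call with the tag and time carried by the "entry added" notification.
     * Returns true if an upload was triggered.
     */
    synchronized boolean onEntryAdded(String tag, long timeMillis) {
        if (!watchedTags.contains(tag)) {
            return false; // not a tag this reporter cares about
        }
        if (timeMillis <= lastUploadedMillis) {
            return false; // already handled, e.g. a repeated notification
        }
        lastUploadedMillis = timeMillis;
        uploader.upload(tag, timeMillis);
        return true;
    }
}
```

Keeping the filter separate from the receiver makes it easy to unit test, and the watched tag list is the natural place to limit uploads to crash-like entries.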