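I. Java crash handling
1. The Thread class contains an interface named UncaughtExceptionHandler.
Its Javadoc explains that when a thread is about to terminate because of an uncaught exception, the JVM queries the thread for its UncaughtExceptionHandler via getUncaughtExceptionHandler and calls that handler's uncaughtException method. If no UncaughtExceptionHandler has been set explicitly, the thread's ThreadGroup acts as the handler.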
/**
* Interface for handlers invoked when a <tt>Thread</tt> abruptly
* terminates due to an uncaught exception.
* <p>When a thread is about to terminate due to an uncaught exception
* the Java Virtual Machine will query the thread for its
* <tt>UncaughtExceptionHandler</tt> using
* {@link #getUncaughtExceptionHandler} and will invoke the handler's
* <tt>uncaughtException</tt> method, passing the thread and the
* exception as arguments.
* If a thread has not had its <tt>UncaughtExceptionHandler</tt>
* explicitly set, then its <tt>ThreadGroup</tt> object acts as its
* <tt>UncaughtExceptionHandler</tt>. If the <tt>ThreadGroup</tt> object
* has no
* special requirements for dealing with the exception, it can forward
* the invocation to the {@linkplain #getDefaultUncaughtExceptionHandler
* default uncaught exception handler}.
*/
@FunctionalInterface
public interface UncaughtExceptionHandler {
/**
* Method invoked when the given thread terminates due to the
* given uncaught exception.
* <p>Any exception thrown by this method will be ignored by the
* Java Virtual Machine.
* @param t the thread
* @param e the exception
*/
void uncaughtException(Thread t, Throwable e);
}
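As a quick illustration of the contract described in the Javadoc above, here is a minimal plain-JVM sketch (not Android code). The class name PerThreadHandlerDemo, the thread name "worker-1", and the message "boom" are made up for the example; only Thread.setUncaughtExceptionHandler comes from the JDK.

```java
import java.util.concurrent.atomic.AtomicReference;

final class PerThreadHandlerDemo {
    /** Runs a thread that throws and returns what its handler observed. */
    static String runAndCapture() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker-1");
        // Handler installed on this thread only; the runtime calls it
        // with the dying thread and the exception that escaped run().
        worker.setUncaughtExceptionHandler((t, e) ->
                seen.set(t.getName() + ": " + e.getMessage()));
        worker.start();
        try {
            worker.join(); // the handler has finished by the time join() returns
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCapture()); // prints "worker-1: boom"
    }
}
```

The handler runs on the failing thread itself, which is why join() is enough to observe its result.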
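Looking at ThreadGroup.uncaughtException: it first delegates to the parent group; at the root group it checks for the default UncaughtExceptionHandler (the one set through Thread.setDefaultUncaughtExceptionHandler), and if there is none, it only prints the stack trace to System.err. Nothing here terminates the process, so something else must have installed an UncaughtExceptionHandler on Thread.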
/**
* Called by the Java Virtual Machine when a thread in this
* thread group stops because of an uncaught exception, and the thread
* does not have a specific {@link Thread.UncaughtExceptionHandler}
* installed.
* <p>
* The <code>uncaughtException</code> method of
* <code>ThreadGroup</code> does the following:
* <ul>
* <li>If this thread group has a parent thread group, the
* <code>uncaughtException</code> method of that parent is called
* with the same two arguments.
* <li>Otherwise, this method checks to see if there is a
* {@linkplain Thread#getDefaultUncaughtExceptionHandler default
* uncaught exception handler} installed, and if so, its
* <code>uncaughtException</code> method is called with the same
* two arguments.
* <li>Otherwise, this method determines if the <code>Throwable</code>
* argument is an instance of {@link ThreadDeath}. If so, nothing
* special is done. Otherwise, a message containing the
* thread's name, as returned from the thread's {@link
* Thread#getName getName} method, and a stack backtrace,
* using the <code>Throwable</code>'s {@link
* Throwable#printStackTrace printStackTrace} method, is
* printed to the {@linkplain System#err standard error stream}.
* </ul>
* <p>
* Applications can override this method in subclasses of
* <code>ThreadGroup</code> to provide alternative handling of
* uncaught exceptions.
*
* @param t the thread that is about to exit.
* @param e the uncaught exception.
* @since JDK1.0
*/
public void uncaughtException(Thread t, Throwable e) {
if (parent != null) {
parent.uncaughtException(t, e);
} else {
Thread.UncaughtExceptionHandler ueh =
Thread.getDefaultUncaughtExceptionHandler();
if (ueh != null) {
ueh.uncaughtException(t, e);
} else if (!(e instanceof ThreadDeath)) {
System.err.print("Exception in thread \""
+ t.getName() + "\" ");
e.printStackTrace(System.err);
}
}
}
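The lookup order described above (per-thread handler first, then ThreadGroup, then the default handler) can be observed on a plain JVM. The following is a small sketch under that assumption; HandlerFallbackDemo and the thread names "plain" and "custom" are invented for the example, and the previous default handler is restored at the end.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class HandlerFallbackDemo {
    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> log.add("default:" + t.getName()));
        try {
            // No handler of its own: ThreadGroup forwards to the default handler.
            Thread plain = new Thread(() -> { throw new RuntimeException("a"); }, "plain");
            // Has its own handler: the default handler is never consulted.
            Thread custom = new Thread(() -> { throw new RuntimeException("b"); }, "custom");
            custom.setUncaughtExceptionHandler((t, e) -> log.add("own:" + t.getName()));
            startAndJoin(plain);
            startAndJoin(custom);
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return log;
    }

    private static void startAndJoin(Thread thread) {
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [default:plain, own:custom]
    }
}
```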
2. When is Thread's UncaughtExceptionHandler set?
From the AMS Activity launch flow, we know that app startup goes through roughly the following steps:
In RuntimeInit.commonInit(), Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler)) installs the default exception handler.
protected static final void commonInit() {
if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
/*
* set handlers; these apply to all threads in the VM. Apps can replace
* the default handler, but not the pre handler.
*/
LoggingHandler loggingHandler = new LoggingHandler();
RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
/*
* Install a time zone supplier that uses the Android persistent time zone system property.
*/
RuntimeHooks.setTimeZoneIdSupplier(() -> SystemProperties.get("persist.sys.timezone"));
/*
* Sets handler for java.util.logging to use Android log facilities.
* The odd "new instance-and-then-throw-away" is a mirror of how
* the "java.util.logging.config.class" system property works. We
* can't use the system property here since the logger has almost
* certainly already been initialized.
*/
LogManager.getLogManager().reset();
new AndroidConfig();
/*
* Sets the default HTTP User-Agent used by HttpURLConnection.
*/
String userAgent = getDefaultUserAgent();
System.setProperty("http.agent", userAgent);
/*
* Wire socket tagging to traffic stats.
*/
NetworkManagementSocketTagger.install();
initialized = true;
}
3. The source of the crash: KillApplicationHandler
The source shows that KillApplicationHandler actively kills the process in its finally block.
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
public void uncaughtException(Thread t, Throwable e) {
try {
ensureLogging(t, e);
if (mCrashing) return;
mCrashing = true;
if (ActivityThread.currentActivityThread() != null) {
ActivityThread.currentActivityThread().stopProfiling();
}
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
...
} finally {
// Try everything to make sure this process goes away.
Process.killProcess(Process.myPid());
System.exit(10);
}
}
}
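The shape of that handler (a re-entrancy guard, a report step, and termination in finally) can be mimicked off-device. The sketch below is a simulation only: FatalHandler, reporter, and terminator are hypothetical stand-ins for handleApplicationCrash and Process.killProcess/System.exit, so that the control flow can be run without actually exiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class KillInFinallyDemo {
    static final class FatalHandler implements Thread.UncaughtExceptionHandler {
        private final Consumer<Throwable> reporter; // stand-in for reporting the crash to AMS
        private final Runnable terminator;          // stand-in for killProcess + System.exit
        private volatile boolean crashing;

        FatalHandler(Consumer<Throwable> reporter, Runnable terminator) {
            this.reporter = reporter;
            this.terminator = terminator;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                if (crashing) return; // already handling a crash; finally still runs
                crashing = true;
                reporter.accept(e);
            } catch (Throwable t2) {
                // Reporting itself failed; fall through to termination.
            } finally {
                terminator.run(); // always reached, whatever happened above
            }
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        FatalHandler handler = new FatalHandler(
                e -> {
                    log.add("report:" + e.getMessage());
                    throw new IllegalStateException("reporter broke");
                },
                () -> log.add("terminate"));
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("npe"));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [report:npe, terminate]
    }
}
```

Even though the reporter throws, the terminate step still runs, which is the property the finally block in KillApplicationHandler relies on.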
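4. Other work done in KillApplicationHandler
uncaughtException hands the crash to AMS.handleApplicationCrash() for further processing. That path calls addErrorToDropBox() to record a log entry in the system. It can record Java crashes, native crashes, ANRs, and so on; the log directory is /data/system/dropbox.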
public void handleApplicationCrash(IBinder app,
ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
ProcessRecord r = findAppProcess(app, "Crash");
final String processName = app == null ? "system_server"
: (r == null ? "unknown" : r.processName);
handleApplicationCrashInner("crash", r, processName, crashInfo);
}
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
ApplicationErrorReport.CrashInfo crashInfo) {
...
addErrorToDropBox(
eventType, r, processName, null, null, null, null, null, null, crashInfo,
new Float(loadingProgress), incrementalMetrics, null);
mAppErrors.crashApplication(r, crashInfo);
}
5. Android's call flow for handling a Java crash
Uncaught exception -> the JVM invokes ->
KillApplicationHandler.uncaughtException {
try {
ActivityManager.getService().handleApplicationCrash(); // hand off to AMS
} finally { // exit the app process
Process.killProcess(Process.myPid());
System.exit(10);
}
}
-> AMS.handleApplicationCrash
-> AMS.handleApplicationCrashInner {
addErrorToDropBox(); // the system records the crash log
mAppErrors.crashApplication();
}
-> AppErrors.crashApplication
-> AppErrors.crashApplicationInner {
// handle the crash
if (!makeAppCrashingLocked()){
return;
}
// show the crash dialog
final Message msg = Message.obtain();
msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;
mService.mUiHandler.sendMessage(msg);
// handle the dialog result: restart, quit, etc.
int res = result.get(); // blocks
switch (res) {}
}
II. Native crash handling
1. Listening in the Java layer
From "Binder (5): Service registration flow - sending the registration request" we know:
After the phone boots, the system_server process starts and SystemServer's main method runs. main starts AMS through startBootstrapServices, and startOtherServices then calls AMS's systemReady. In the systemReady callback, mActivityManagerService.startObservingNativeCrashes() registers the native crash listener.
NativeCrashListener's run method opens a listening socket.
public void startObservingNativeCrashes() {
final NativeCrashListener ncl = new NativeCrashListener(this);
ncl.start();
}
final class NativeCrashListener extends Thread {
public void run() {
final byte[] ackSignal = new byte[1];
...
try {
FileDescriptor serverFd = Os.socket(AF_UNIX, SOCK_STREAM, 0);
final UnixSocketAddress sockAddr = UnixSocketAddress.createFileSystem(DEBUGGERD_SOCKET_PATH);
Os.bind(serverFd, sockAddr);
Os.listen(serverFd, 1);
Os.chmod(DEBUGGERD_SOCKET_PATH, 0777);
while (true) {
FileDescriptor peerFd = null;
try {
peerFd = Os.accept(serverFd, null /* peerAddress */);
if (peerFd != null) {
consumeNativeCrashData(peerFd);
}
} catch (Exception e) {
...
} finally {
...
}
}
} catch (Exception e) {
...
}
}
}
2. Reporting from native code
Native programs are dynamically linked and need a linker to run. linker is Android's dynamic linker; see linker_main.cpp. The call chain __linker_init -> __linker_init_post_relocation -> debuggerd_init ends up in the debuggerd_init function in debuggerd_handler.cpp.
/* This is the entry point for the linker, called from begin.S. This
* method is responsible for fixing the linker's own relocations, and
* then calling __linker_init_post_relocation().
*/
extern "C" ElfW(Addr) __linker_init(void* raw_args) {
...
ElfW(Addr) start_address = __linker_init_post_relocation(args);
return start_address;
}
static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args) {
#ifdef __ANDROID__
debuggerd_callbacks_t callbacks = {
.get_abort_message = []() {
return g_abort_message;
},
.post_dump = &notify_gdb_of_libraries,
};
debuggerd_init(&callbacks);
#endif
}
debuggerd_init registers debuggerd_signal_handler as the handler for the crash signals.
void debuggerd_init(debuggerd_callbacks_t* callbacks) {
...
struct sigaction action;
memset(&action, 0, sizeof(action));
sigfillset(&action.sa_mask);
action.sa_sigaction = debuggerd_signal_handler;
action.sa_flags = SA_RESTART | SA_SIGINFO;
// Use the alternate signal stack if available so we can catch stack overflows.
action.sa_flags |= SA_ONSTACK;
debuggerd_register_handlers(&action);
}
// /system/core/debuggerd/include/debuggerd/handler.h
static void __attribute__((__unused__)) debuggerd_register_handlers(struct sigaction* action) {
sigaction(SIGABRT, action, nullptr);
sigaction(SIGBUS, action, nullptr);
sigaction(SIGFPE, action, nullptr);
sigaction(SIGILL, action, nullptr);
sigaction(SIGSEGV, action, nullptr);
#if defined(SIGSTKFLT)
sigaction(SIGSTKFLT, action, nullptr);
#endif
sigaction(SIGSYS, action, nullptr);
sigaction(SIGTRAP, action, nullptr);
sigaction(DEBUGGER_SIGNAL, action, nullptr);
}
debuggerd_signal_handler uses clone to create a child thread that launches crash_dump to record the crash log. Once that thread has finished, resend_signal kills the current process.
static void debuggerd_signal_handler(int signal_number, siginfo_t* info, void* context) {
...
// clone a child thread that launches crash_dump
pid_t child_pid =
clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
&thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);
if (child_pid == -1) {
fatal_errno("failed to spawn debuggerd dispatch thread");
}
// wait for the child thread to start
futex_wait(&thread_info.pseudothread_tid, -1);
// wait for the child thread to finish
futex_wait(&thread_info.pseudothread_tid, child_pid);
...
if (info->si_signo == DEBUGGER_SIGNAL) {
...
} else {
// resend the signal
resend_signal(info);
}
}
static void resend_signal(siginfo_t* info) {
// Signals can either be fatal or nonfatal.
// For fatal signals, crash_dump will send us the signal we crashed with
// before resuming us, so that processes using waitpid on us will see that we
// exited with the correct exit status (e.g. so that sh will report
// "Segmentation fault" instead of "Killed"). For this to work, we need
// to deregister our signal handler for that signal before continuing.
if (info->si_signo != DEBUGGER_SIGNAL) {
signal(info->si_signo, SIG_DFL); // restore the default disposition, which kills the current process
int rc = syscall(SYS_rt_tgsigqueueinfo, __getpid(), __gettid(), info->si_signo, info);
if (rc != 0) {
fatal_errno("failed to resend signal during crash");
}
}
}
In crash_dump's main function, a child process is forked. It communicates with tombstoned to record the crash log and notifies AMS of the native crash.
// /system/core/debuggerd/crash_dump.cpp
int main(int argc, char** argv) {
...
// fork a child process
pid_t forkpid = fork();
if (forkpid == -1) {
PLOG(FATAL) << "fork failed";
} else if (forkpid == 0) {
fork_exit_read.reset();
} else {
// wait for the child process to finish
fork_exit_write.reset();
char buf;
TEMP_FAILURE_RETRY(read(fork_exit_read.get(), &buf, sizeof(buf)));
_exit(0);
}
...
// connect to tombstoned and write the log
{
ATRACE_NAME("tombstoned_connect");
LOG(INFO) << "obtaining output fd from tombstoned, type: " << dump_type;
g_tombstoned_connected =
tombstoned_connect(g_target_thread, &g_tombstoned_socket, &g_output_fd, dump_type);
}
if (g_tombstoned_connected) {
if (TEMP_FAILURE_RETRY(dup2(g_output_fd.get(), STDOUT_FILENO)) == -1) {
PLOG(ERROR) << "failed to dup2 output fd (" << g_output_fd.get() << ") to STDOUT_FILENO";
}
} else {
unique_fd devnull(TEMP_FAILURE_RETRY(open("/dev/null", O_RDWR)));
TEMP_FAILURE_RETRY(dup2(devnull.get(), STDOUT_FILENO));
g_output_fd = std::move(devnull);
}
...
// notify AMS
if (fatal_signal) {
// Don't try to notify ActivityManager if it just crashed, or we might hang until timeout.
if (thread_info[target_process].thread_name != "system_server") {
activity_manager_notify(target_process, signo, amfd_data);
}
}
...
// tell tombstoned that processing is complete
if (g_tombstoned_connected && !tombstoned_notify_completion(g_tombstoned_socket.get())) {
LOG(ERROR) << "failed to notify tombstoned of completion";
}
return 0;
}
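III. Crash optimization (Java layer)
1. Record log information:
Record device information, memory information, the crash log, screenshots, and so on.
2. Make crashes friendlier:
By default a crash makes the app vanish immediately. A custom handler can restart the app's UI instead, reducing the cases where the app simply exits.
Note that when restarting the app, the old process must be terminated to avoid other problems.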
Intent intent = new Intent(BaseApplication.this, MainActivity.class);
intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK
| Intent.FLAG_ACTIVITY_CLEAR_TASK |
Intent.FLAG_ACTIVITY_RESET_TASK_IF_NEEDED);
if (intent.getComponent() != null) {
// simulate a launch from the Launcher
intent.setAction(Intent.ACTION_MAIN);
intent.addCategory(Intent.CATEGORY_LAUNCHER);
}
BaseApplication.this.startActivity(intent);
android.os.Process.killProcess(android.os.Process.myPid());
System.exit(10);
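A custom handler of the kind mentioned above usually keeps a reference to whatever default handler was installed before it, so that logging is added without losing the platform behavior. The following plain-JVM sketch shows that chaining pattern. RecordingHandler is a hypothetical name, and the "system" handler here is only a stand-in for KillApplicationHandler so the example can run without killing the process.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ChainedHandlerDemo {
    /** Records the crash first, then always delegates to the previously installed handler. */
    static final class RecordingHandler implements Thread.UncaughtExceptionHandler {
        private final Thread.UncaughtExceptionHandler previous;
        private final List<String> records;

        RecordingHandler(Thread.UncaughtExceptionHandler previous, List<String> records) {
            this.previous = previous;
            this.records = records;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                records.add("recorded:" + e.getMessage());
            } finally {
                if (previous != null) previous.uncaughtException(t, e);
            }
        }
    }

    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler original = Thread.getDefaultUncaughtExceptionHandler();
        // Stand-in for the handler the platform installed at startup.
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> log.add("system:" + e.getMessage()));
        try {
            // Capture the current default, then put our handler in front of it.
            Thread.setDefaultUncaughtExceptionHandler(
                    new RecordingHandler(Thread.getDefaultUncaughtExceptionHandler(), log));
            Thread worker = new Thread(() -> { throw new IllegalStateException("oops"); });
            worker.start();
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(original);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [recorded:oops, system:oops]
    }
}
```

Delegating inside finally means a failure while recording cannot prevent the previous handler from running.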
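3. Don't crash:
During a crash, restart the looper on the main thread to keep the app from crashing.
Principle: an uncaught exception propagates up the call stack. The main thread runs a Looper loop, so the exception breaks out of that loop and the JVM eventually calls uncaughtException(). Calling Looper.loop() again on the main thread at that point restarts the loop, and the app can keep processing events.
Note: if the crash happens while an Activity is being shown, the screen goes black. One workaround is to hook and replace ActivityThread.mH.mCallback and wrap the Activity lifecycle calls in try/catch; if an exception occurs, finish the Activity that was about to be displayed.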
public class CrashHandler implements Thread.UncaughtExceptionHandler {
@Override
public void uncaughtException(@NonNull Thread thread, @NonNull Throwable ex) {
handleExceptionReocrd(ex); // record the log automatically
try { // let the user record the log
if (listener != null) listener.recordException(ex);
} catch (Throwable e) {
e.printStackTrace();
}
try { // whether to restart the app; restarting requires killing the process
if (listener != null && listener.restartApp()) return;
} catch (Exception e) {
Log.d(TAG, "uncaughtException->handleByUser:" + Log.getStackTraceString(e));
}
// not restarted; check whether safe mode is enabled
if (safeModelEnable) {
enterSafeModel(thread);
} else if (mDefaultHandler != null) {
// hand off to the system handler
Log.d(TAG, "uncaughtException: delegating to the system handler");
mDefaultHandler.uncaughtException(thread, ex);
} else {
// no system handler; exit the process directly
Log.w(TAG, "uncaughtException: exiting the process");
android.os.Process.killProcess(android.os.Process.myPid());
System.exit(10);
}
}
public void enterSafeModel(Thread thread) {
Log.w(CrashHandler.TAG, "setSafe--- thread-----" + thread.getName());
if (thread == Looper.getMainLooper().getThread()) {
while (true) { // keep re-entering the loop
try {
Log.e(TAG, "safeMode: abnormal exit detected, restarting looper");
Looper.loop();
} catch (Throwable e) {
Log.e(TAG, "safeMode: abnormal exit detected: " + Log.getStackTraceString(e));
}
}
}
}
}
}
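The "restart the loop" idea can be tried outside Android with a toy event loop. In the sketch below, loop() is a hypothetical, much simplified stand-in for Looper.loop(): it returns once the queue is empty, which the real Looper does not, so that the example terminates. It only illustrates that the events queued behind a failing one are still processed once the loop is re-entered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class LoopRestartDemo {
    /** Tiny stand-in for Looper.loop(): runs queued tasks until the queue is empty. */
    static void loop(Queue<Runnable> queue) {
        Runnable task;
        while ((task = queue.poll()) != null) {
            task.run(); // an exception here escapes and ends this loop() call
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        Queue<Runnable> queue = new ArrayDeque<>();
        queue.add(() -> log.add("event 1"));
        queue.add(() -> { throw new IllegalStateException("bad event"); });
        queue.add(() -> log.add("event 3"));

        // "Safe mode": whenever the loop dies with an exception, start it again.
        while (true) {
            try {
                loop(queue);
                break; // queue drained normally (toy loop only)
            } catch (RuntimeException e) {
                log.add("recovered: " + e.getMessage());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [event 1, recovered: bad event, event 3]
    }
}
```

The failing event itself is lost; only the events after it survive, which matches the black-screen caveat for crashes during Activity display.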