Mono Source Code Reading: Crash Mechanism
# Introduction
This article reads through the parts of the Mono source code that handle crash signals. The code lives in the following files:
mini.c
mini-posix.c
mini-exceptions.c
exceptions-arm.c
# Install Signal Handler
## add_signal_handler
Every signal handler is registered through add_signal_handler.
The functions in the Mono code base that call it to register signals are:
interp.c
- mono_runtime_install_handlers
mini-posix.c
- mono_runtime_posix_install_handlers
- mono_runtime_setup_stat_profiler (SIGPROF)
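The body of add_signal_handler is not shown in these notes. As a rough illustration of what such a registration helper tends to look like on POSIX, here is a minimal sketch that assumes sigaction with SA_SIGINFO and remembers the previous disposition so it can be chained later. The names install_handler, saved_handlers, record_signal and MAX_SIGNALS are hypothetical and are not taken from Mono.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

#define MAX_SIGNALS 65 /* hypothetical upper bound on signal numbers */

typedef void (*siginfo_handler) (int signo, siginfo_t *info, void *context);

/* Hypothetical table holding whatever was installed before our handler. */
static struct sigaction saved_handlers [MAX_SIGNALS];

/* Install `handler` for `signo` and remember the previous disposition.
 * Returns 0 on success, -1 on failure (same convention as sigaction). */
static int
install_handler (int signo, siginfo_handler handler)
{
    struct sigaction sa;

    assert (signo > 0 && signo < MAX_SIGNALS);
    memset (&sa, 0, sizeof (sa));
    sa.sa_sigaction = handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO; /* ask for the three-argument handler form */
    return sigaction (signo, &sa, &saved_handlers [signo]);
}

/* Set from the handler so the caller can see that it ran. */
static volatile sig_atomic_t last_signal = 0;

static void
record_signal (int signo, siginfo_t *info, void *context)
{
    (void) info;
    (void) context;
    last_signal = signo;
}
```

Calling install_handler (SIGUSR2, record_signal) followed by raise (SIGUSR2) leaves the signal number in last_signal, and the old disposition stays available in saved_handlers for a later chaining step.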
## mono_runtime_posix_install_handlers
This article focuses on the signal registration done under the mini directory.
Signals caught by Mono:
- SIGINT (if handle sigint)
- SIGFPE
- SIGQUIT
- SIGILL
- SIGBUS
- SIGUSR2(if mono_jit_trace_calls != null)
- SIGUSR1 (the signal returned by mono_thread_get_abort_signal ())
- SIGABRT
- SIGSEGV
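Constant | Description |
---|---|
SIGSEGV | Invalid memory access (segmentation fault): an attempt to access memory not allocated to the process, or to write to an address without write permission. |
SIGINT | External interrupt, usually initiated by the user. Sent when the user types the INTR character (usually Ctrl-C) to tell the foreground process group to terminate. |
SIGILL | Invalid program image, such as an illegal instruction. Usually caused by a corrupted executable or by an attempt to execute the data segment. A stack overflow can also raise this signal. |
SIGABRT | Abnormal termination, such as that initiated by abort(). |
SIGFPE | Sent on a fatal arithmetic error. This covers not only floating-point errors but all other arithmetic errors, such as overflow and division by zero. |
SIGQUIT | Similar to SIGINT, but controlled by the QUIT character (usually Ctrl-\). A process that exits on SIGQUIT produces a core file, so in that sense it resembles a program error signal. |
SIGBUS | Invalid address, including memory alignment errors, for example accessing a four-byte integer at an address that is not a multiple of 4. It differs from SIGSEGV in that SIGSEGV is triggered by an illegal access to a valid storage address (such as memory outside the process's own space, or read-only memory). |
SIGUSR1 | Reserved for user-defined use. |
SIGUSR2 | Reserved for user-defined use. |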
The code that registers the signals:
void
mono_runtime_posix_install_handlers (void)
{
    sigset_t signal_set;

    if (mini_get_debug_options ()->handle_sigint)
        add_signal_handler (SIGINT, mono_sigint_signal_handler);
    add_signal_handler (SIGFPE, mono_sigfpe_signal_handler);
    add_signal_handler (SIGQUIT, sigquit_signal_handler);
    add_signal_handler (SIGILL, mono_sigill_signal_handler);
    /* SIGBUS shares the SIGSEGV handler */
    add_signal_handler (SIGBUS, mono_sigsegv_signal_handler);
    if (mono_jit_trace_calls != NULL)
        add_signal_handler (SIGUSR2, sigusr2_signal_handler);
    add_signal_handler (mono_thread_get_abort_signal (), sigusr1_signal_handler);
    /* it seems to have become a common bug for some programs that run as parents
     * of many processes to block signal delivery for real time signals.
     * We try to detect and work around their breakage here.
     */
    sigemptyset (&signal_set);
    sigaddset (&signal_set, mono_thread_get_abort_signal ());
    sigprocmask (SIG_UNBLOCK, &signal_set, NULL);
    signal (SIGPIPE, SIG_IGN);
#ifndef MONO_CROSS_COMPILE
    add_signal_handler (SIGABRT, sigabrt_signal_handler);
    /* catch SIGSEGV */
    add_signal_handler (SIGSEGV, mono_sigsegv_signal_handler);
#endif
}
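The sigemptyset / sigaddset / sigprocmask sequence in the function above removes one signal from the inherited blocked mask. The following sketch isolates that step so its effect can be observed. The helper names block_signal, unblock_signal and is_signal_blocked are hypothetical, and in a multithreaded program pthread_sigmask would be the more careful choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Add `signo` to the caller's blocked mask, as a misbehaving parent might. */
static int
block_signal (int signo)
{
    sigset_t set;

    assert (signo > 0);
    sigemptyset (&set);
    sigaddset (&set, signo);
    return sigprocmask (SIG_BLOCK, &set, NULL);
}

/* Remove `signo` from the blocked mask: the same three calls as in the
 * function above, applied to a single signal. */
static int
unblock_signal (int signo)
{
    sigset_t set;

    assert (signo > 0);
    sigemptyset (&set);
    sigaddset (&set, signo);
    return sigprocmask (SIG_UNBLOCK, &set, NULL);
}

/* Returns 1 if `signo` is currently blocked, 0 if not, -1 on error. */
static int
is_signal_blocked (int signo)
{
    sigset_t current;

    /* With a NULL set, sigprocmask only reports the current mask. */
    if (sigprocmask (SIG_BLOCK, NULL, &current) != 0)
        return -1;
    return sigismember (&current, signo);
}
```

Blocking SIGUSR1 and then calling unblock_signal (SIGUSR1) brings is_signal_blocked back to 0, which is the state Mono wants for its abort signal regardless of what the parent process left behind.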
# Signal Handler
All signal handlers are defined with the SIG_HANDLER_SIGNATURE macro:
mini-posix.c
- sigabrt_signal_handler
- sigprof_signal_handler
- sigquit_signal_handler
- sigusr1_signal_handler
- sigusr2_signal_handler
mini.c
- mono_sigfpe_signal_handler
- mono_sigill_signal_handler
- mono_sigsegv_signal_handler
- mono_sigint_signal_handler
## SIGINT
void
SIG_HANDLER_SIGNATURE (mono_sigint_signal_handler)
{
    MonoException *exc;
    GET_CONTEXT;

    exc = mono_get_exception_execution_engine ("Interrupted (SIGINT).");
    mono_arch_handle_exception (ctx, exc, FALSE);
}
## mono_arch_handle_exception
/*
 * This is the function called from the signal handler
 */
gboolean
mono_arch_handle_exception (void *ctx, gpointer obj, gboolean test_only)
{
    MonoContext mctx;
    gboolean result;

    mono_arch_sigctx_to_monoctx (ctx, &mctx);
    result = mono_handle_exception (&mctx, obj, (gpointer)mctx.eip, test_only);
    /* restore the context so that returning from the signal handler will invoke
     * the catch clause
     */
    mono_arch_monoctx_to_sigctx (&mctx, ctx);
    return result;
}
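## SEGV
If mono_domain_get () returns NULL or the thread has no jit_tls, the thread is treated as an unmanaged thread. The handler then calls mono_chain_signal, which invokes the previously registered (chained) signal handler. If mono_chain_signal returns true, Mono returns without doing anything else; otherwise it calls mono_handle_native_sigsegv, which prints the stack and finally calls abort ().
If the thread is a managed thread, the logic is the same as throwing an exception in C#: mono_handle_exception is called to handle the C# exception.
The chained signal handler here is the previous signal handler, which Mono saved (as saved_handler) when it registered its own.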
void
SIG_HANDLER_SIGNATURE (mono_sigsegv_signal_handler)
{
    MonoJitInfo *ji;
    MonoJitTlsData *jit_tls = TlsGetValue (mono_jit_tls_id);
    gpointer ip;
    GET_CONTEXT;

#if defined(MONO_ARCH_SOFT_DEBUG_SUPPORTED) && defined(HAVE_SIG_INFO)
    if (mono_arch_is_single_step_event (info, ctx)) {
        mono_debugger_agent_single_step_event (ctx);
        return;
    } else if (mono_arch_is_breakpoint_event (info, ctx)) {
        mono_debugger_agent_breakpoint_hit (ctx);
        return;
    }
#endif

#if !defined(PLATFORM_WIN32) && defined(HAVE_SIG_INFO)
    if (mono_aot_is_pagefault (info->si_addr)) {
        mono_aot_handle_pagefault (info->si_addr);
        return;
    }
#endif

    /* The thread might no be registered with the runtime */
    if (!mono_domain_get () || !jit_tls) {
        if (mono_chain_signal (SIG_HANDLER_PARAMS))
            return;
        mono_handle_native_sigsegv (SIGSEGV, ctx);
    }

    ip = mono_arch_ip_from_context (ctx);

#ifdef _WIN64
    /* Sometimes on win64 we get null IP, but the previous frame is a valid managed frame */
    /* So pop and try again */
    if (!ip && ctx)
    {
        MonoContext *context = (MonoContext*)ctx;
        gpointer *sp = context->rsp;
        if (sp)
        {
            ip = context->rip = *sp;
            context->rsp += sizeof(gpointer);
        }
    }
#endif

    ji = mono_jit_info_table_find (mono_domain_get (), ip);

#ifdef MONO_ARCH_SIGSEGV_ON_ALTSTACK
    if (mono_handle_soft_stack_ovf (jit_tls, ji, ctx, (guint8*)info->si_addr))
        return;

    /* The hard-guard page has been hit: there is not much we can do anymore
     * Print a hopefully clear message and abort.
     */
    if (jit_tls->stack_size &&
        ABS ((guint8*)info->si_addr - ((guint8*)jit_tls->end_of_stack - jit_tls->stack_size)) < 32768) {
        const char *method;
        /* we don't do much now, but we can warn the user with a useful message */
        fprintf (stderr, "Stack overflow: IP: %p, fault addr: %p\n", mono_arch_ip_from_context (ctx), (gpointer)info->si_addr);
        if (ji && ji->method)
            method = mono_method_full_name (ji->method, TRUE);
        else
            method = "Unmanaged";
        fprintf (stderr, "At %s\n", method);
        _exit (1);
    } else {
        /* The original handler might not like that it is executed on an altstack... */
        if (!ji && mono_chain_signal (SIG_HANDLER_PARAMS))
            return;

        mono_arch_handle_altstack_exception (ctx, info->si_addr, FALSE);
    }
#else
    if (!ji) {
        if (mono_chain_signal (SIG_HANDLER_PARAMS))
            return;

        mono_handle_native_sigsegv (SIGSEGV, ctx);
    }

    mono_arch_handle_exception (ctx, NULL, FALSE);
#endif
}
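The branching in the handler above (ignoring the debugger, AOT page-fault and altstack paths) boils down to a small decision. The sketch below restates it as a pure function so the three outcomes are easy to see. SegvFacts, SegvAction and decide_segv_action are hypothetical names introduced only for illustration; they do not exist in Mono.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what the handler can end up doing. */
typedef enum {
    SEGV_CHAINED,           /* the previously installed handler took the signal */
    SEGV_NATIVE_CRASH,      /* mono_handle_native_sigsegv: print stack, abort () */
    SEGV_MANAGED_EXCEPTION  /* mono_arch_handle_exception: raise a managed exception */
} SegvAction;

/* Hypothetical snapshot of the facts the handler checks. */
typedef struct {
    int has_domain;      /* mono_domain_get () returned non-NULL */
    int has_jit_tls;     /* the thread has JIT TLS data */
    int ip_in_jit_code;  /* mono_jit_info_table_find () found a MonoJitInfo */
    int chain_handled;   /* mono_chain_signal () returned true */
} SegvFacts;

/* Mirrors the non-altstack branch of the handler shown above. */
static SegvAction
decide_segv_action (const SegvFacts *facts)
{
    int managed_thread;

    assert (facts != NULL);
    managed_thread = facts->has_domain && facts->has_jit_tls;
    if (!managed_thread || !facts->ip_in_jit_code)
        return facts->chain_handled ? SEGV_CHAINED : SEGV_NATIVE_CRASH;
    return SEGV_MANAGED_EXCEPTION;
}
```

Only a fault on a managed thread whose instruction pointer lies in JIT-compiled code is turned into a managed exception; everything else is either passed to the saved handler or treated as a native crash.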
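## mono_handle_native_sigsegv
Key points:
- mono_backtrace_from_context (OS X) converts the signal context into a MonoContext and walks back through every PC on the stack
- backtrace (non-OS X) likewise collects every PC on the stack
- backtrace_symbols converts each PC into a function (symbol) name
The stack is then printed to stderr.
Next, GDB is used to collect more detailed debugging information, which is also printed to stderr.
Finally, the handler for SIGABRT is removed and abort () is called to exit the program.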
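The backtrace / backtrace_symbols / abort sequence described above can be sketched as follows. This is a simplified illustration, not Mono's implementation: it assumes a libc that ships execinfo.h (glibc or macOS), it skips the GDB step, and the names dump_native_backtrace, abort_with_default_handler and MAX_FRAMES are hypothetical. Note that backtrace_symbols allocates memory, so calling it from a real signal handler is not async-signal-safe.

```c
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64 /* hypothetical cap on the number of frames collected */

/* Print the native call stack of the calling thread to `out`.
 * Returns the number of frames that were captured. */
static int
dump_native_backtrace (FILE *out)
{
    void *frames [MAX_FRAMES];
    char **names;
    int count, i;

    assert (out != NULL);
    count = backtrace (frames, MAX_FRAMES);     /* collect the PC of each frame */
    names = backtrace_symbols (frames, count);  /* map each PC to a symbol name */
    for (i = 0; i < count; ++i) {
        if (names)
            fprintf (out, "\t%s\n", names [i]);
        else
            fprintf (out, "\t%p\n", frames [i]);
    }
    free (names);
    return count;
}

/* Restore the default SIGABRT action first, so that abort () really ends the
 * process instead of re-entering a custom SIGABRT handler. */
static void
abort_with_default_handler (void)
{
    signal (SIGABRT, SIG_DFL);
    abort ();
}
```

A crash path would call dump_native_backtrace (stderr) and then abort_with_default_handler (), which matches the order described in the list above.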
# Throw Exception
## mono_arm_throw_exception
exceptions-arm.c
The code that throws the exception:
void
mono_arm_throw_exception (MonoObject *exc, unsigned long eip, unsigned long esp, gulong *int_regs, gdouble *fp_regs)
{
    static void (*restore_context) (MonoContext *);
    MonoContext ctx;
    gboolean rethrow = eip & 1;

    if (!restore_context)
        restore_context = mono_get_restore_context ();

    eip &= ~1; /* clear the optional rethrow bit */
    /* adjust eip so that it point into the call instruction */
    eip -= 4;

    /*printf ("stack in throw: %p\n", esp);*/
    MONO_CONTEXT_SET_BP (&ctx, int_regs [ARMREG_FP - 4]);
    MONO_CONTEXT_SET_SP (&ctx, esp);
    MONO_CONTEXT_SET_IP (&ctx, eip);
    memcpy (((guint8*)&ctx.regs) + (4 * 4), int_regs, sizeof (gulong) * 8);
    /* memcpy (&ctx.fregs, fp_regs, sizeof (double) * MONO_SAVED_FREGS); */

    if (mono_object_isinst (exc, mono_defaults.exception_class)) {
        MonoException *mono_ex = (MonoException*)exc;
        if (!rethrow)
            mono_ex->stack_trace = NULL;
    }
    mono_handle_exception (&ctx, exc, (gpointer)(eip + 4), FALSE);
    restore_context (&ctx);
    g_assert_not_reached ();
}
In short, the function works in three steps: save the context, call mono_handle_exception, then restore the context.
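One detail of mono_arm_throw_exception worth isolating is how the rethrow flag travels in bit 0 of eip. The sketch below shows that tagging scheme on its own, assuming 4-byte aligned ARM-mode addresses (which is what leaves the low bit free). encode_throw_ip, is_rethrow, decode_call_site and RETHROW_BIT are hypothetical names, not Mono functions.

```c
#include <assert.h>
#include <stdint.h>

#define RETHROW_BIT ((uintptr_t) 1) /* bit 0 carries the rethrow flag */

/* Pack the rethrow flag into the low bit of a 4-byte aligned address. */
static uintptr_t
encode_throw_ip (uintptr_t ip, int rethrow)
{
    assert (ip % 4 == 0); /* assumption: ARM-mode code, so bit 0 is unused */
    return rethrow ? (ip | RETHROW_BIT) : ip;
}

/* Same test as `gboolean rethrow = eip & 1` in the function above. */
static int
is_rethrow (uintptr_t tagged_ip)
{
    return (tagged_ip & RETHROW_BIT) != 0;
}

/* Clear the flag, then step back one 4-byte instruction so that the address
 * points into the call instruction, as `eip &= ~1; eip -= 4;` does. */
static uintptr_t
decode_call_site (uintptr_t tagged_ip)
{
    return (tagged_ip & ~RETHROW_BIT) - 4;
}
```

With a return address of 0x8004, the tagged value is 0x8005 for a rethrow and 0x8004 otherwise, and both decode to the call site 0x8000.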
## mono_handle_exception
mini-exceptions.c
- mono_handle_exception
- mono_handle_exception_internal
# MonoContext
For cross-platform compatibility, Mono unifies every platform's signal context into the MonoContext structure, which mainly holds register values. On ARM, for example, it stores PC, FP, SP and R0-R15.
typedef struct {
    gulong eip; // pc
    gulong ebp; // fp
    gulong esp; // sp
    gulong regs [16];
    double fregs [MONO_SAVED_FREGS];
} MonoContext;
eip -> sigctx.arm_pc (R15)
esp -> sigctx.arm_sp (R13)
ebp -> sigctx.arm_fp (R11)
regs -> sigctx.arm_r0, sizeof(gulong) * 16 (R0 ~ R15)
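The mapping above, together with the convert / handle / convert-back pattern used by mono_arch_handle_exception, can be illustrated with stand-in types. This is only a sketch: FakeSigContext, SketchContext, sigctx_to_sketch and sketch_to_sigctx are hypothetical, the real signal context layout is platform specific, and the floating-point registers are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the ARM signal context: R0..R15 stored contiguously. */
typedef struct {
    unsigned long arm_r [16];
} FakeSigContext;

/* Simplified copy of the MonoContext layout shown above (no fregs). */
typedef struct {
    unsigned long eip;       /* pc, R15 */
    unsigned long ebp;       /* fp, R11 */
    unsigned long esp;       /* sp, R13 */
    unsigned long regs [16]; /* R0..R15 */
} SketchContext;

/* Signal context -> unified context, following the mapping listed above. */
static void
sigctx_to_sketch (const FakeSigContext *sig, SketchContext *ctx)
{
    assert (sig != NULL && ctx != NULL);
    ctx->eip = sig->arm_r [15];
    ctx->esp = sig->arm_r [13];
    ctx->ebp = sig->arm_r [11];
    memcpy (ctx->regs, sig->arm_r, sizeof (ctx->regs));
}

/* Unified context -> signal context. The named fields are written last so a
 * handler that only changed eip/esp/ebp still takes effect. */
static void
sketch_to_sigctx (const SketchContext *ctx, FakeSigContext *sig)
{
    assert (sig != NULL && ctx != NULL);
    memcpy (sig->arm_r, ctx->regs, sizeof (sig->arm_r));
    sig->arm_r [15] = ctx->eip;
    sig->arm_r [13] = ctx->esp;
    sig->arm_r [11] = ctx->ebp;
}
```

Changing eip between the two conversions is what lets a signal handler return into a different place, such as a catch clause, instead of the faulting instruction.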
http://www.mono-project.com/docs/debug+profile/debug/
http://www.mono-project.com/docs/advanced/embedding/
define mono_backtrace
 select-frame 0
 set $i = 0
 while ($i < $arg0)
   set $foo = (char*) mono_pmip ($pc)
   if ($foo)
     printf "#%d %p in %s\n", $i, $pc, $foo
   else
     frame
   end
   up-silently
   set $i = $i + 1
 end
end
define mono_stack
 set $mono_thread = mono_thread_current ()
 if ($mono_thread == 0x00)
   printf "No mono thread associated with this thread\n"
 else
   set $ucp = malloc (sizeof (ucontext_t))
   call (void) getcontext ($ucp)
   call (void) mono_print_thread_dump ($ucp)
   call (void) free ($ucp)
 end
end
mono_chain_signal: calls the original handler
mono_handle_native_sigsegv: prints the stack and calls abort ()
mono_arch_handle_exception // exceptions-arm.c
mono_handle_exception_internal // mini-exceptions.c
if (!ji) {
    if (mono_chain_signal (SIG_HANDLER_PARAMS))
        return;
    mono_handle_native_sigsegv (SIGSEGV, ctx);
}
mono_arch_handle_exception (ctx, exc, FALSE);
mini.c
SIG_HANDLER_SIGNATURE (mono_sigfpe_signal_handler)
SIG_HANDLER_SIGNATURE (mono_sigill_signal_handler)
SIG_HANDLER_SIGNATURE (mono_sigsegv_signal_handler)
SIG_HANDLER_SIGNATURE (mono_sigint_signal_handler)
mini-posix.c
SIG_HANDLER_SIGNATURE (sigabrt_signal_handler)
SIG_HANDLER_SIGNATURE (sigusr1_signal_handler)
SIG_HANDLER_SIGNATURE (sigprof_signal_handler)
SIG_HANDLER_SIGNATURE (sigquit_signal_handler)
SIG_HANDLER_SIGNATURE (sigusr2_signal_handler)
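On an unmanaged thread the TLS data cannot be obtained. Two items matter most:
- mono_domain_get()
- jit_tls
Even if the TLS data could be read through ptrace, walking the stack still requires one of the following functions:
- mono_jit_walk_stack_from_ctx
- mono_walk_stack
Both functions access TLS internally, so the values cannot be passed in from outside. Reimplementing them is not practical either, because many of the structures they depend on are not accessible. Writing a custom stack walker is therefore not an option.
Suppose we only want the last PC in order to look up the C# function name. All C# function information is stored in the domain's jitInfoTable, so if the current domain object can be obtained,
mono_jit_info_table_find(domain, addr) returns the MonoJitInfo, mono_jit_info_get_method then gives the MonoMethod, and mono_method_full_name finally yields the function name.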
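The reason an unmanaged thread has no domain or jit_tls is that both live in thread-local storage, which is only filled in for threads that have attached to the runtime. The sketch below shows the same effect with plain pthread keys. It is an analogy rather than Mono code: attach_key, attach_current_thread, current_thread_is_attached and fresh_thread_is_attached are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread slot, playing the role of jit_tls. */
static pthread_key_t attach_key;
static pthread_once_t attach_key_once = PTHREAD_ONCE_INIT;

static void
create_key (void)
{
    int rc = pthread_key_create (&attach_key, NULL);
    assert (rc == 0);
    (void) rc;
}

/* "Attach" the calling thread by storing its per-thread data. */
static void
attach_current_thread (void *per_thread_data)
{
    pthread_once (&attach_key_once, create_key);
    pthread_setspecific (attach_key, per_thread_data);
}

/* A thread that never attached reads NULL, like `!jit_tls` in the handler. */
static int
current_thread_is_attached (void)
{
    pthread_once (&attach_key_once, create_key);
    return pthread_getspecific (attach_key) != NULL;
}

static void *
probe_thread (void *result)
{
    *(int *) result = current_thread_is_attached ();
    return NULL;
}

/* Run the check on a brand-new thread that never attached.
 * Returns 1 or 0, or -1 if the thread could not be created. */
static int
fresh_thread_is_attached (void)
{
    pthread_t thread;
    int attached = -1;

    if (pthread_create (&thread, NULL, probe_thread, &attached) != 0)
        return -1;
    pthread_join (thread, NULL);
    return attached;
}
```

The attaching thread sees its own data, while a freshly created thread sees NULL in the same slot. Code that runs in a signal handler on such a thread has nothing to read, which is why the stack-walking functions listed above cannot be used there.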
NOTE ATTRIBUTES
Created Date: 2018-05-25 00:55:50
Last Evernote Update Date: 2020-05-23 07:15:03