tombstone與debuggerd相關(guān)流程

tombstone的抓取與debuggerd的有關(guān)系是一個守護進(jìn)程送淆,用來檢測程序的崩潰税产,將程序崩潰前進(jìn)程的狀態(tài)記錄下來,保存在/data/tombstone文件夾下偷崩,最多10個辟拷;本質(zhì)上是對程序崩潰時某些信號的攔截

相關(guān)流程

客戶端流程

首先,Android程序的入口有一個linker的操作阐斜,大致流程如下:

bionic/linker/arch/arm64/begin.S
31ENTRY(_start)
32  mov x0, sp
33  bl __linker_init
34
35  /* linker init returns the _entry address in the main image */
36  br x0
37END(_start)

bionic/linker/linker.cpp
4442/*
4443 * This is the entry point for the linker, called from begin.S. This
4444 * method is responsible for fixing the linker's own relocations, and
4445 * then calling __linker_init_post_relocation().
4446 *
4447 * Because this method is called before the linker has fixed it's own
4448 * relocations, any attempt to reference an extern variable, extern
4449 * function, or other GOT reference will generate a segfault.
4450 */
4451extern "C" ElfW(Addr) __linker_init(void* raw_args) {
          ...
4522  // We have successfully fixed our own relocations. It's safe to run
4523  // the main part of the linker now.
4524  args.abort_message_ptr = &g_abort_message;
4525  ElfW(Addr) start_address = __linker_init_post_relocation(args, linker_addr);
4526
4527  INFO("[ Jumping to _start (%p)... ]", reinterpret_cast<void*>(start_address));
4528
4529  // Return the address that the calling assembly stub should jump to.
4530  return start_address;
4531}

4195/*
4196 * This code is called after the linker has linked itself and
4197 * fixed it's own GOT. It is safe to make references to externs
4198 * and other non-local data at this point.
4199 */
4200static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args, ElfW(Addr) linker_base) {
4201#if TIMING
4202  struct timeval t0, t1;
4203  gettimeofday(&t0, 0);
4204#endif
4205
4206  // Sanitize the environment.
4207  __libc_init_AT_SECURE(args);
4208
4209  // Initialize system properties
4210  __system_properties_init(); // may use 'environ'
4211
4212  debuggerd_init();
4213
4214  // Get a few environment variables.
4215  const char* LD_DEBUG = getenv("LD_DEBUG");
4216  if (LD_DEBUG != nullptr) {
4217    g_ld_debug_verbosity = atoi(LD_DEBUG);
4218  }
           ...
4412}
bionic/linker/debugger.cpp
302__LIBC_HIDDEN__ void debuggerd_init() {
303  struct sigaction action;
304  memset(&action, 0, sizeof(action));
305  sigemptyset(&action.sa_mask);
306  action.sa_sigaction = debuggerd_signal_handler;
307  action.sa_flags = SA_RESTART | SA_SIGINFO;
308
309  // Use the alternate signal stack if available so we can catch stack overflows.
310  action.sa_flags |= SA_ONSTACK;
311
312  sigaction(SIGABRT, &action, nullptr);
313  sigaction(SIGBUS, &action, nullptr);
314  sigaction(SIGFPE, &action, nullptr);
315  sigaction(SIGILL, &action, nullptr);
316  sigaction(SIGSEGV, &action, nullptr);
317#if defined(SIGSTKFLT)
318  sigaction(SIGSTKFLT, &action, nullptr);
319#endif
320  sigaction(SIGTRAP, &action, nullptr);
321}

為上面這幾個信號注冊信號處理函數(shù)衫冻,也就是說只有這幾個信號會生成tombstone

SIGILL(非法指令異常)

SIGABRT(abort退出異常)

SIGBUS(硬件訪問異常)

SIGFPE(浮點運算異常)

SIGSEGV(內(nèi)存訪問異常)

SIGSTKFLT(協(xié)處理器棧異常)

SIGTRAP(這是什么?好像不常見)

信號處理函數(shù)為:

258/*
259 * Catches fatal signals so we can ask debuggerd to ptrace us before
260 * we crash.
261 */
262static void debuggerd_signal_handler(int signal_number, siginfo_t* info, void*) {
263  // It's possible somebody cleared the SA_SIGINFO flag, which would mean
264  // our "info" arg holds an undefined value.
265  if (!have_siginfo(signal_number)) {
266    info = nullptr;
267  }
268
269  log_signal_summary(signal_number, info);
270
271  send_debuggerd_packet(info); //發(fā)送請求 第一次接受到信號是向debuggerd服務(wù)端發(fā)送請求,等待回應(yīng)表示鏈接上了
272
273  // We need to return from the signal handler so that debuggerd can dump the
274  // thread that crashed, but returning here does not guarantee that the signal
275  // will be thrown again, even for SIGSEGV and friends, since the signal could
276  // have been sent manually. Resend the signal with rt_tgsigqueueinfo(2) to
277  // preserve the SA_SIGINFO contents.
278  signal(signal_number, SIG_DFL); //將信號處理函數(shù)置空
279
280  struct siginfo si;
281  if (!info) {
282    memset(&si, 0, sizeof(si));
283    si.si_code = SI_USER;
284    si.si_pid = getpid();
285    si.si_uid = getuid();
286    info = &si;
287  } else if (info->si_code >= 0 || info->si_code == SI_TKILL) {
288    // rt_tgsigqueueinfo(2)'s documentation appears to be incorrect on kernels
289    // that contain commit 66dd34a (3.9+). The manpage claims to only allow
290    // negative si_code values that are not SI_TKILL, but 66dd34a changed the
291    // check to allow all si_code values in calls coming from inside the house.
292  }
293
294  int rc = syscall(SYS_rt_tgsigqueueinfo, getpid(), gettid(), signal_number, info); //給自己的相關(guān)線程再發(fā)送一次信號
295  if (rc != 0) {
296    __libc_format_log(ANDROID_LOG_FATAL, "libc", "failed to resend signal during crash: %s",
297                      strerror(errno));
298    _exit(0);
299  }
300}

客戶端向denggerd發(fā)送信息谒出,并等待回應(yīng)隅俘,通過socket的write & read

208static void send_debuggerd_packet(siginfo_t* info) {
209  // Mutex to prevent multiple crashing threads from trying to talk
210  // to debuggerd at the same time.
211  static pthread_mutex_t crash_mutex = PTHREAD_MUTEX_INITIALIZER;
212  int ret = pthread_mutex_trylock(&crash_mutex);
213  if (ret != 0) {
214    if (ret == EBUSY) {
215      __libc_format_log(ANDROID_LOG_INFO, "libc",
216          "Another thread contacted debuggerd first; not contacting debuggerd.");
217      // This will never complete since the lock is never released.
218      pthread_mutex_lock(&crash_mutex);
219    } else {
220      __libc_format_log(ANDROID_LOG_INFO, "libc",
221                        "pthread_mutex_trylock failed: %s", strerror(ret));
222    }
223    return;
224  }
225
226  int s = socket_abstract_client(DEBUGGER_SOCKET_NAME, SOCK_STREAM | SOCK_CLOEXEC);
227  if (s == -1) {
228    __libc_format_log(ANDROID_LOG_FATAL, "libc", "Unable to open connection to debuggerd: %s",
229                      strerror(errno));
230    return;
231  }
232
233  // debuggerd knows our pid from the credentials on the
234  // local socket but we need to tell it the tid of the crashing thread.
235  // debuggerd will be paranoid and verify that we sent a tid
236  // that's actually in our process.
237  debugger_msg_t msg;
238  msg.action = DEBUGGER_ACTION_CRASH;
239  msg.tid = gettid();
240  msg.abort_msg_address = reinterpret_cast<uintptr_t>(g_abort_message);
241  msg.original_si_code = (info != nullptr) ? info->si_code : 0;
242  ret = TEMP_FAILURE_RETRY(write(s, &msg, sizeof(msg)));
243  if (ret == sizeof(msg)) {
244    char debuggerd_ack;
245    ret = TEMP_FAILURE_RETRY(read(s, &debuggerd_ack, 1));
246    int saved_errno = errno;
247    notify_gdb_of_libraries();
248    errno = saved_errno;
249  } else {
250    // read or write failed -- broken connection?
251    __libc_format_log(ANDROID_LOG_FATAL, "libc", "Failed while talking to debuggerd: %s",
252                      strerror(errno));
253  }
254
255  close(s);
256}

debuggerd服務(wù)端啟動邻奠,dump流程

debuggerd守護進(jìn)程如何啟動,可以通過debuggerd -b 啟動为居,我們暫且不去說他碌宴,就說正常的啟動模式

941int main(int argc, char** argv) {
942  union selinux_callback cb;
943  if (argc == 1) {
944    cb.func_audit = audit_callback;
945    selinux_set_callback(SELINUX_CB_AUDIT, cb);
946    cb.func_log = selinux_log_callback;
947    selinux_set_callback(SELINUX_CB_LOG, cb);
948    return do_server();
949  }
950
951  bool dump_backtrace = false;
952  bool have_tid = false;
953  pid_t tid = 0;
954  for (int i = 1; i < argc; i++) {
955    if (!strcmp(argv[i], "-b")) {
956      dump_backtrace = true;
957    } else if (!have_tid) {
958      tid = atoi(argv[i]);
959      have_tid = true;
960    } else {
961      usage();
962      return 1;
963    }
964  }
965  if (!have_tid) {
966    usage();
967    return 1;
968  }
969  return do_explicit_dump(tid, dump_backtrace);
970}

啟動一個debuggerd服務(wù)端

849static int do_server() {
850  // debuggerd crashes can't be reported to debuggerd.
851  // Reset all of the crash handlers.
852  signal(SIGABRT, SIG_DFL);
853  signal(SIGBUS, SIG_DFL);
854  signal(SIGFPE, SIG_DFL);
855  signal(SIGILL, SIG_DFL);
856  signal(SIGSEGV, SIG_DFL);
857#ifdef SIGSTKFLT
858  signal(SIGSTKFLT, SIG_DFL);
859#endif
860  signal(SIGTRAP, SIG_DFL);
861
862  // Ignore failed writes to closed sockets
863  signal(SIGPIPE, SIG_IGN); //將debuggerd本身的crash忽略
864
865  // Block SIGCHLD so we can sigtimedwait for it.
866  sigset_t sigchld;
867  sigemptyset(&sigchld);
868  sigaddset(&sigchld, SIGCHLD);
869  sigprocmask(SIG_SETMASK, &sigchld, nullptr);
870
871  int s = socket_local_server(SOCKET_NAME, ANDROID_SOCKET_NAMESPACE_ABSTRACT,
872                              SOCK_STREAM | SOCK_CLOEXEC); //創(chuàng)建一個服務(wù)端,等待客戶端連接
873  if (s == -1) return 1;
874
875  typedef void (*NativeDebugInit)(void);
876  static NativeDebugInit s_func_ptr = NULL;
877  if(!s_func_ptr) {
878    void* handle = dlopen("libmiuindbg.so",RTLD_NOW);
879    if(handle) {
880      s_func_ptr = (NativeDebugInit)dlsym(handle,"hook_context_do_hook");
881    }
882  }
883
884  if(s_func_ptr) {
885    s_func_ptr();
886  }
887
888  // Fork a process that stays root, and listens on a pipe to pause and resume the target.
889  if (!start_signal_sender()) {
890    ALOGE("debuggerd: failed to fork signal sender");
891    return 1;
892  }
893
894  ALOGI("debuggerd: starting\n");
895
896  for (;;) {
897    sockaddr_storage ss;
898    sockaddr* addrp = reinterpret_cast<sockaddr*>(&ss);
899    socklen_t alen = sizeof(ss);
900
901    ALOGV("waiting for connection\n");
902    int fd = accept4(s, addrp, &alen, SOCK_CLOEXEC);
903    if (fd == -1) {
904      ALOGE("accept failed: %s\n", strerror(errno));
905      continue;
906    }
907
908    handle_request(fd); //處理客戶端的請求
909  }
910  return 0;
911}

處理客戶端發(fā)來的請求

808static void handle_request(int fd) {
809  ALOGV("handle_request(%d)\n", fd);
810
811  ScopedFd closer(fd);
812  debugger_request_t request;
813  memset(&request, 0, sizeof(request));
814  int status = read_request(fd, &request); //讀取客戶端的請求
815  if (status != 0) {
816    return;
817  }
818
819  ALOGW("debuggerd: handling request: pid=%d uid=%d gid=%d tid=%d\n", request.pid, request.uid,
820        request.gid, request.tid);
821
822#if defined(__LP64__)
823  // On 64 bit systems, requests to dump 32 bit and 64 bit tids come
824  // to the 64 bit debuggerd. If the process is a 32 bit executable,
825  // redirect the request to the 32 bit debuggerd.
826  if (is32bit(request.tid)) {
827    // Only dump backtrace and dump tombstone requests can be redirected.
828    if (request.action == DEBUGGER_ACTION_DUMP_BACKTRACE ||
829        request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
830      redirect_to_32(fd, &request);
831    } else {
832      ALOGE("debuggerd: Not allowed to redirect action %d to 32 bit debuggerd\n", request.action);
833    }
834    return;
835  }
836#endif
837
838  // Fork a child to handle the rest of the request.
839  pid_t fork_pid = fork();
840  if (fork_pid == -1) {
841    ALOGE("debuggerd: failed to fork: %s\n", strerror(errno));
842  } else if (fork_pid == 0) {
843    worker_process(fd, request); //處理request
844  } else {
845    monitor_worker_process(fork_pid, request);
846  }
847}

read客戶端發(fā)來的信息

197static int read_request(int fd, debugger_request_t* out_request) {
198  ucred cr;
199  socklen_t len = sizeof(cr);
200  int status = getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cr, &len);
201  if (status != 0) {
202    ALOGE("cannot get credentials");
203    return -1;
204  }
205
206  ALOGV("reading tid");
207  fcntl(fd, F_SETFL, O_NONBLOCK);
208
209  pollfd pollfds[1];
210  pollfds[0].fd = fd;
211  pollfds[0].events = POLLIN;
212  pollfds[0].revents = 0;
213  status = TEMP_FAILURE_RETRY(poll(pollfds, 1, 3000)); //輪詢fd句柄
215    ALOGE("timed out reading tid (from pid=%d uid=%d)\n", cr.pid, cr.uid);
216    return -1;
217  }
218
219  debugger_msg_t msg;
220  memset(&msg, 0, sizeof(msg));
221  status = TEMP_FAILURE_RETRY(read(fd, &msg, sizeof(msg))); //讀取客戶端信息
222  if (status < 0) {
223    ALOGE("read failure? %s (pid=%d uid=%d)\n", strerror(errno), cr.pid, cr.uid);
224    return -1;
225  }
226  if (status != sizeof(debugger_msg_t)) {
227    ALOGE("invalid crash request of size %d (from pid=%d uid=%d)\n", status, cr.pid, cr.uid);
228    return -1;
229  }
230
231  out_request->action = static_cast<debugger_action_t>(msg.action);
232  out_request->tid = msg.tid;
233  out_request->pid = cr.pid;
234  out_request->uid = cr.uid;
235  out_request->gid = cr.gid;
236  out_request->abort_msg_address = msg.abort_msg_address;
237  out_request->original_si_code = msg.original_si_code;
238
239  if (msg.action == DEBUGGER_ACTION_CRASH) {
240    // Ensure that the tid reported by the crashing process is valid.
241    // This check needs to happen again after ptracing the requested thread to prevent a race.
242    if (!pid_contains_tid(out_request->pid, out_request->tid)) {
243      ALOGE("tid %d does not exist in pid %d. ignoring debug request\n", out_request->tid,
244            out_request->pid);
245      return -1;
246    }
247  } else if (cr.uid == 0 || (cr.uid == AID_SYSTEM && msg.action == DEBUGGER_ACTION_DUMP_BACKTRACE)) {
248    // Only root or system can ask us to attach to any process and dump it explicitly.
249    // However, system is only allowed to collect backtraces but cannot dump tombstones.
250    status = get_process_info(out_request->tid, &out_request->pid,
251                              &out_request->uid, &out_request->gid);
252    if (status < 0) {
253      ALOGE("tid %d does not exist. ignoring explicit dump request\n", out_request->tid);
254      return -1;
255    }
256
257    if (!selinux_action_allowed(fd, out_request))
258      return -1;
259  } else {
260    // No one else is allowed to dump arbitrary processes.
261    return -1;
262  }
263  return 0;
264}

整體的dump流程

566static void worker_process(int fd, debugger_request_t& request) {
567  // Open the tombstone file if we need it.
568  std::string tombstone_path;
569  int tombstone_fd = -1;
570  switch (request.action) {
571    case DEBUGGER_ACTION_DUMP_TOMBSTONE:
572    case DEBUGGER_ACTION_CRASH:
573      tombstone_fd = open_tombstone(&tombstone_path); 
574      if (tombstone_fd == -1) {
575        ALOGE("debuggerd: failed to open tombstone file: %s\n", strerror(errno));
576        exit(1);
577      }
578      break;
579
580    case DEBUGGER_ACTION_DUMP_BACKTRACE:
581      break;
582
583    default:
584      ALOGE("debuggerd: unexpected request action: %d", request.action);
585      exit(1);
586  }
587
588  // At this point, the thread that made the request is blocked in
589  // a read() call.  If the thread has crashed, then this gives us
590  // time to PTRACE_ATTACH to it before it has a chance to really fault.
591  //
592  // The PTRACE_ATTACH sends a SIGSTOP to the target process, but it
593  // won't necessarily have stopped by the time ptrace() returns.  (We
594  // currently assume it does.)  We write to the file descriptor to
595  // ensure that it can run as soon as we call PTRACE_CONT below.
596  // See details in bionic/libc/linker/debugger.c, in function
597  // debugger_signal_handler().
598
599  // Attach to the target process.
        //通過ptrace監(jiān)控子進(jìn)程(要crash的應(yīng)用進(jìn)程)蒙畴,此時debuggerd變?yōu)槠涓高M(jìn)程贰镣,向應(yīng)用進(jìn)程發(fā)送sigstop;以后應(yīng)用進(jìn)程接受到的signal會先發(fā)到父進(jìn)程
600  if (!ptrace_attach_thread(request.pid, request.tid)) {
601    ALOGE("debuggerd: ptrace attach failed: %s", strerror(errno));
602    exit(1);
603  }
604
605  // DEBUGGER_ACTION_CRASH requests can come from arbitrary processes and the tid field in the
606  // request is sent from the other side. If an attacker can cause a process to be spawned with the
607  // pid of their process, they could trick debuggerd into dumping that process by exiting after
608  // sending the request. Validate the trusted request.uid/gid to defend against this.
609  if (request.action == DEBUGGER_ACTION_CRASH) {
610    pid_t pid;
611    uid_t uid;
612    gid_t gid;
613    if (get_process_info(request.tid, &pid, &uid, &gid) != 0) {
614      ALOGE("debuggerd: failed to get process info for tid '%d'", request.tid);
615      exit(1);
616    }
617
618    if (pid != request.pid || uid != request.uid || gid != request.gid) {
619      ALOGE(
620        "debuggerd: attached task %d does not match request: "
621        "expected pid=%d,uid=%d,gid=%d, actual pid=%d,uid=%d,gid=%d",
622        request.tid, request.pid, request.uid, request.gid, pid, uid, gid);
623      exit(1);
624    }
625  }
626
627  // Don't attach to the sibling threads if we want to attach gdb.
628  // Supposedly, it makes the process less reliable.
629  bool attach_gdb = should_attach_gdb(request);
630  if (attach_gdb) {
631    // Open all of the input devices we need to listen for VOLUMEDOWN before dropping privileges.
632    if (init_getevent() != 0) {
633      ALOGE("debuggerd: failed to initialize input device, not waiting for gdb");
634      attach_gdb = false;
635    }
636
637  }
638
639  std::set<pid_t> siblings;
640  if (!attach_gdb) {
641    ptrace_siblings(request.pid, request.tid, siblings);
642  }
643
644  // Generate the backtrace map before dropping privileges.
645  std::unique_ptr<BacktraceMap> backtrace_map(BacktraceMap::Create(request.pid));
646
647  int amfd = -1;
648  std::unique_ptr<std::string> amfd_data;
649  if (request.action == DEBUGGER_ACTION_CRASH) {
650    // Connect to the activity manager before dropping privileges.
651    amfd = activity_manager_connect();
652    amfd_data.reset(new std::string);
653  }
654
655  // Collect the list of open files.
656  OpenFilesList open_files;
657  populate_open_files_list(request.pid, &open_files);
658
659  bool succeeded = false;
660
661  // Now that we've done everything that requires privileges, we can drop them.
662  if (!drop_privileges()) {
663    ALOGE("debuggerd: failed to drop privileges, exiting");
664    _exit(1);
665  }
666
667  int crash_signal = SIGKILL;
668  succeeded = perform_dump(request, fd, tombstone_fd, backtrace_map.get(), siblings,
669                           &crash_signal, &open_files, amfd_data.get());
670  if (succeeded) {
671    if (request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
672      if (!tombstone_path.empty()) {
673        android::base::WriteFully(fd, tombstone_path.c_str(), tombstone_path.length()); //將dump結(jié)果寫到相關(guān)路徑下
674      }
675    }
676  }
677
678  if (attach_gdb || request.action == DEBUGGER_ACTION_CRASH) {
679    // Before detach we must send SIGSTOP to the target.
680    // Tell the signal process to send SIGSTOP to the target.
681    if (!send_signal(request.pid, 0, SIGSTOP)) {
682      ALOGE("debuggerd: failed to stop process for gdb attach: %s", strerror(errno));
683      attach_gdb = false;
684    }
685  }
686
687  if (!attach_gdb) {
688    // Tell the Activity Manager about the crashing process. If we are
689    // waiting for gdb to attach, do not send this or Activity Manager
690    // might kill the process before anyone can attach.
691    activity_manager_write(request.pid, crash_signal, amfd, *amfd_data.get());
692  }
693
694  if (ptrace(PTRACE_DETACH, request.tid, 0, 0) != 0) { //detach客戶端
695    ALOGE("debuggerd: ptrace detach from %d failed: %s", request.tid, strerror(errno));
696  }
697
698  for (pid_t sibling : siblings) {
699    ptrace(PTRACE_DETACH, sibling, 0, 0);
700  }
701
702  // Send the signal back to the process if it crashed and we're not waiting for gdb.
703  if (!attach_gdb && request.action == DEBUGGER_ACTION_CRASH) {
704    if (!send_signal(request.pid, request.tid, crash_signal)) {
705      ALOGE("debuggerd: failed to kill process %d: %s", request.pid, strerror(errno));
706    }
707  }
708
709  // Wait for gdb, if requested.
710  if (attach_gdb) {
711    wait_for_user_action(request);
712
713    // Now tell the activity manager about this process.
714    activity_manager_write(request.pid, crash_signal, amfd, *amfd_data.get());
715
716    // Tell the signal process to send SIGCONT to the target.
717    if (!send_signal(request.pid, 0, SIGCONT)) {
718      ALOGE("debuggerd: failed to resume process %d: %s", request.pid, strerror(errno));
719    }
720
721    uninit_getevent();
722  }
723
724  close(amfd);
725
726  exit(!succeeded);
727}

perform_dump:進(jìn)行dump的過程

484static bool perform_dump(const debugger_request_t& request, int fd, int tombstone_fd,
485                         BacktraceMap* backtrace_map, const std::set<pid_t>& siblings,
486                         int* crash_signal, OpenFilesList* open_files, std::string* amfd_data) {
487  if (TEMP_FAILURE_RETRY(write(fd, "\0", 1)) != 1) { //向應(yīng)用進(jìn)程(客戶端返回一個值)膳凝,表示連上了八孝,可以開始dump了
488    ALOGE("debuggerd: failed to respond to client: %s\n", strerror(errno));
489    return false;
490  }
491
492  int total_sleep_time_usec = 0;
493  while (true) {
494    int signal = wait_for_signal(request.tid, &total_sleep_time_usec); //因為此時已經(jīng)被ptrace_attach了,所以第二次客戶端發(fā)給自己的信號會在這里被接收
495    switch (signal) {
496      case -1:
497        ALOGE("debuggerd: timed out waiting for signal");
498        return false;
499
500      case SIGSTOP: //這里是attach時向客戶端發(fā)送的sigstop信號
501        if (request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
502          ALOGV("debuggerd: stopped -- dumping to tombstone");
503          engrave_tombstone(tombstone_fd, backtrace_map, request.pid, request.tid, siblings, signal,
504                            request.original_si_code, request.abort_msg_address, open_files, amfd_data); 
505        } else if (request.action == DEBUGGER_ACTION_DUMP_BACKTRACE) {
506          ALOGV("debuggerd: stopped -- dumping to fd");
507          dump_backtrace(fd, backtrace_map, request.pid, request.tid, siblings, nullptr);
508        } else {
509          ALOGV("debuggerd: stopped -- continuing");
              //此時通過debuggerd用PTRACE_CONT命令讓應(yīng)用繼續(xù)執(zhí)行鸠项,
              // 這樣應(yīng)用的read系統(tǒng)調(diào)用就可以返回到用戶態(tài)干跛,繼續(xù)執(zhí)行debuggerd_signal_handler()
               // 此時,debuggerd進(jìn)入下一次循環(huán)祟绊,block在wait_for_signal楼入,繼續(xù)等待應(yīng)用的下一個信號
510          if (ptrace(PTRACE_CONT, request.tid, 0, 0) != 0) {
511            ALOGE("debuggerd: ptrace continue failed: %s", strerror(errno));
512            return false;
513          }
514          continue;  // loop again //注意,這里是繼續(xù)循環(huán)牧抽,等待客戶端的第二次信號
515        }
516        break;
517
518      case SIGABRT:
519      case SIGBUS:
520      case SIGFPE:
521      case SIGILL:
522      case SIGSEGV:
523#ifdef SIGSTKFLT
524      case SIGSTKFLT:
525#endif
526      case SIGSYS:
527      case SIGTRAP:
528        ALOGV("stopped -- fatal signal\n");
529        *crash_signal = signal;
530        engrave_tombstone(tombstone_fd, backtrace_map, request.pid, request.tid, siblings, signal,
531                          request.original_si_code, request.abort_msg_address, open_files, amfd_data); //客戶端發(fā)的第二次信號被debuggerd接受嘉熊,開始dump
532        break; //dump完之后跳出循環(huán),執(zhí)行下面的操作
533
534      default:
535        ALOGE("debuggerd: process stopped due to unexpected signal %d\n", signal);
536        break;
537    }
538    break;
539  }
540
541  return true;
542}

本質(zhì)上有兩次通信扬舒;
第一次通信是進(jìn)程的signal handler通過socket與啟動的dubuggerd服務(wù)端進(jìn)行通信阐肤,客戶端向debuggerd寫request,服務(wù)端獲取request并返回一個值表示收到讲坎;同時attach到客戶端孕惜,作為父進(jìn)程;同時發(fā)送一個SIGSTOP信號晨炕,被接收時衫画,此時通過debuggerd用PTRACE_CONT命令讓應(yīng)用繼續(xù)執(zhí)行,這樣應(yīng)用的read系統(tǒng)調(diào)用就可以返回到用戶態(tài)瓮栗,繼續(xù)執(zhí)行debuggerd_signal_handler削罩,debuggerd進(jìn)入下一次循環(huán),block在wait_for_signal费奸,繼續(xù)等待應(yīng)用的下一個信號

客戶端收到答復(fù)之后弥激,將注冊的信號處理函數(shù)去掉,(這樣再接收到信號就可以正常的走kernel流程了)愿阐,然后再次發(fā)送一個信號

這里就是第二次通信微服,信號被父進(jìn)程debuggerd攔截,開始dump操作换况,dump操作完后進(jìn)行detach操作职辨,不再作為客戶端的父進(jìn)程

此時客戶端會進(jìn)入到默認(rèn)的信號處理邏輯中

2173int get_signal(struct ksignal *ksig)
2174{
2175    struct sighand_struct *sighand = current->sighand;
2176    struct signal_struct *signal = current->signal;
2177    int signr;
2178
2179    if (unlikely(current->task_works))
2180        task_work_run();
2181
2182    if (unlikely(uprobe_deny_signal()))
2183        return 0;
2184
2185    /*
2186     * Do this once, we can't return to user-mode if freezing() == T.
2187     * do_signal_stop() and ptrace_stop() do freezable_schedule() and
2188     * thus do not need another check after return.
2189     */
2190    try_to_freeze();
2191
2192relock:
2193    spin_lock_irq(&sighand->siglock);
2194    /*
2195     * Every stopped thread goes here after wakeup. Check to see if
2196     * we should notify the parent, prepare_signal(SIGCONT) encodes
2197     * the CLD_ si_code into SIGNAL_CLD_MASK bits.
2198     */
2199    if (unlikely(signal->flags & SIGNAL_CLD_MASK)) {
2200        int why;
2201
2202        if (signal->flags & SIGNAL_CLD_CONTINUED)
2203            why = CLD_CONTINUED;
2204        else
2205            why = CLD_STOPPED;
2206
2207        signal->flags &= ~SIGNAL_CLD_MASK;
2208
2209        spin_unlock_irq(&sighand->siglock);
2210
2211        /*
2212         * Notify the parent that we're continuing.  This event is
2213         * always per-process and doesn't make whole lot of sense
2214         * for ptracers, who shouldn't consume the state via
2215         * wait(2) either, but, for backward compatibility, notify
2216         * the ptracer of the group leader too unless it's gonna be
2217         * a duplicate.
2218         */
2219        read_lock(&tasklist_lock);
2220        do_notify_parent_cldstop(current, false, why);
2221
2222        if (ptrace_reparented(current->group_leader))
2223            do_notify_parent_cldstop(current->group_leader,
2224                        true, why);
2225        read_unlock(&tasklist_lock);
2226
2227        goto relock;
2228    }
2229
2230    for (;;) {
2231        struct k_sigaction *ka;
2232
2233        if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
2234            do_signal_stop(0))
2235            goto relock;
2236
2237        if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
2238            do_jobctl_trap();
2239            spin_unlock_irq(&sighand->siglock);
2240            goto relock;
2241        }
2242
2243        signr = dequeue_signal(current, &current->blocked, &ksig->info);
2244
2245        if (!signr)
2246            break; /* will return 0 */
2247
2248        if (unlikely(current->ptrace) && signr != SIGKILL) {
2249            signr = ptrace_signal(signr, &ksig->info);
2250            if (!signr)
2251                continue;
2252        }
2253
2254        ka = &sighand->action[signr-1];
2255
2256        /* Trace actually delivered signals. */
2257        trace_signal_deliver(signr, &ksig->info, ka);
2258
2259        if (ka->sa.sa_handler == SIG_IGN) /* Do nothing.  */
2260            continue;
2261        if (ka->sa.sa_handler != SIG_DFL) {
2262            /* Run the handler.  */
2263            ksig->ka = *ka;
2264
2265            if (ka->sa.sa_flags & SA_ONESHOT)
2266                ka->sa.sa_handler = SIG_DFL;
2267
2268            break; /* will return non-zero "signr" value */
2269        }
2270
2271        /*
2272         * Now we are doing the default action for this signal.
2273         */
2274        if (sig_kernel_ignore(signr)) /* Default is nothing. */
2275            continue;
2276
2277        /*
2278         * Global init gets no signals it doesn't want.
2279         * Container-init gets no signals it doesn't want from same
2280         * container.
2281         *
2282         * Note that if global/container-init sees a sig_kernel_only()
2283         * signal here, the signal must have been generated internally
2284         * or must have come from an ancestor namespace. In either
2285         * case, the signal cannot be dropped.
2286         */
2287        if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
2288                !sig_kernel_only(signr))
2289            continue;
2290
2291        if (sig_kernel_stop(signr)) {
2292            /*
2293             * The default action is to stop all threads in
2294             * the thread group.  The job control signals
2295             * do nothing in an orphaned pgrp, but SIGSTOP
2296             * always works.  Note that siglock needs to be
2297             * dropped during the call to is_orphaned_pgrp()
2298             * because of lock ordering with tasklist_lock.
2299             * This allows an intervening SIGCONT to be posted.
2300             * We need to check for that and bail out if necessary.
2301             */
2302            if (signr != SIGSTOP) {
2303                spin_unlock_irq(&sighand->siglock);
2304
2305                /* signals can be posted during this window */
2306
2307                if (is_current_pgrp_orphaned())
2308                    goto relock;
2309
2310                spin_lock_irq(&sighand->siglock);
2311            }
2312
2313            if (likely(do_signal_stop(ksig->info.si_signo))) {
2314                /* It released the siglock.  */
2315                goto relock;
2316            }
2317
2318            /*
2319             * We didn't actually stop, due to a race
2320             * with SIGCONT or something like that.
2321             */
2322            continue;
2323        }
2324
2325        spin_unlock_irq(&sighand->siglock);
2326
2327        /*
2328         * Anything else is fatal, maybe with a core dump.
2329         */
2330        current->flags |= PF_SIGNALED;
2331
2332        if (sig_kernel_coredump(signr)) {
2333            if (print_fatal_signals)
2334                print_fatal_signal(ksig->info.si_signo);
2335            proc_coredump_connector(current);
2336            /*
2337             * If it was able to dump core, this kills all
2338             * other threads in the group and synchronizes with
2339             * their demise.  If we lost the race with another
2340             * thread getting here, it set group_exit_code
2341             * first and our do_group_exit call below will use
2342             * that value and ignore the one we pass it.
2343             */
2344            do_coredump(&ksig->info);
2345        }
2346
2347        /*
2348         * Death signals, no core dump.
2349         */
2350        do_group_exit(ksig->info.si_signo);
2351        /* NOTREACHED */
2352    }
2353    spin_unlock_irq(&sighand->siglock);
2354
2355    ksig->sig = signr;
2356    return ksig->sig > 0;
2357}
412#define sig_kernel_coredump(sig) \
413 (((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_COREDUMP_MASK))
399        rt_sigmask(SIGQUIT)   |  rt_sigmask(SIGILL)    | \
400 rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGABRT)   | \
401        rt_sigmask(SIGFPE)    |  rt_sigmask(SIGSEGV)   | \
402 rt_sigmask(SIGBUS)    |  rt_sigmask(SIGSYS)    | \
403        rt_sigmask(SIGXCPU)   |  rt_sigmask(SIGXFSZ)   | \
404 SIGEMT_MASK 

可見coredump相應(yīng)的信號比tombstone多盗蟆,tombstone響應(yīng)的為coredump的子集戈二,能響應(yīng)coredump的信號如下舒裤,參考default action列表:

 *      +--------------------+------------------+
 *      |  POSIX signal      |  default action  |
 *      +--------------------+------------------+
 *      |  SIGHUP            |  terminate       |
 *      |  SIGINT            |  terminate       |
 *      |  SIGQUIT           |  coredump        |
 *      |  SIGILL            |  coredump        |
 *      |  SIGTRAP           |  coredump        |
 *      |  SIGABRT/SIGIOT    |  coredump        |
 *      |  SIGBUS            |  coredump        |
 *      |  SIGFPE            |  coredump        |
 *      |  SIGKILL           |  terminate(+)    |
 *      |  SIGUSR1           |  terminate       |
 *      |  SIGSEGV           |  coredump        |
 *      |  SIGUSR2           |  terminate       |
 *      |  SIGPIPE           |  terminate       |
 *      |  SIGALRM           |  terminate       |
 *      |  SIGTERM           |  terminate       |
 *      |  SIGCHLD           |  ignore          |
 *      |  SIGCONT           |  ignore(*)       |
 *      |  SIGSTOP           |  stop(*)(+)      |
 *      |  SIGTSTP           |  stop(*)         |
 *      |  SIGTTIN           |  stop(*)         |
 *      |  SIGTTOU           |  stop(*)         |
 *      |  SIGURG            |  ignore          |
 *      |  SIGXCPU           |  coredump        |
 *      |  SIGXFSZ           |  coredump        |
 *      |  SIGVTALRM         |  terminate       |
 *      |  SIGPROF           |  terminate       |
 *      |  SIGPOLL/SIGIO     |  terminate       |
 *      |  SIGSYS/SIGUNUSED  |  coredump        |
 *      |  SIGSTKFLT         |  terminate       |
 *      |  SIGWINCH          |  ignore          |
 *      |  SIGPWR            |  terminate       |
 *      |  SIGRTMIN-SIGRTMAX |  terminate       |
 *      +--------------------+------------------+
 *      |  non-POSIX signal  |  default action  |
 *      +--------------------+------------------+
 *      |  SIGEMT            |  coredump        |
 *      +--------------------+------------------+

那么如何tombstone添加一個信號呢?

拓展

debuggerd_init打不出log?

原因:
bionic/linker/linker_main.cpp

/*
211 * This code is called after the linker has linked itself and
212 * fixed it's own GOT. It is safe to make references to externs
213 * and other non-local data at this point.
214 */
215static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args) {
216  ProtectedDataGuard guard;
217
218#if TIMING
219  struct timeval t0, t1;
220  gettimeofday(&t0, 0);
221#endif
222
223  // Sanitize the environment.
224  __libc_init_AT_SECURE(args);
225
226  // Initialize system properties
227  __system_properties_init(); // may use 'environ'
228
229  // Register the debuggerd signal handler.
230#ifdef __ANDROID__
231  debuggerd_callbacks_t callbacks = {
232    .get_abort_message = []() {
233      return g_abort_message;
234    },
235    .post_dump = &notify_gdb_of_libraries,
236  };
237  debuggerd_init(&callbacks); //此時LD_DEBUG還沒有初始化
238#endif
239
240  g_linker_logger.ResetState();
241
242  // Get a few environment variables.
243  const char* LD_DEBUG = getenv("LD_DEBUG");
244  if (LD_DEBUG != nullptr) {
245    g_ld_debug_verbosity = atoi(LD_DEBUG);
246  }

bionic/linker/linker_debug.h

63#if LINKER_DEBUG_TO_LOG
64#define _PRINTVF(v, x...) \
65    do { \
66      if (g_ld_debug_verbosity > (v)) async_safe_format_log(5-(v), "linker", x); \
67    } while (0)
68#else /* !LINKER_DEBUG_TO_LOG */
69#define _PRINTVF(v, x...) \
70    do { \
71      if (g_ld_debug_verbosity > (v)) { async_safe_format_fd(1, x); write(1, "\n", 1); } \
72    } while (0)
73#endif /* !LINKER_DEBUG_TO_LOG */
74
75#define PRINT(x...)          _PRINTVF(-1, x)
76#define INFO(x...)           _PRINTVF(0, x)
77#define TRACE(x...)          _PRINTVF(1, x)

所以用INFO等等觉吭,級別不夠腾供,可以直接用async_safe_format_log進(jìn)行打印,就一定能打出來

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末鲜滩,一起剝皮案震驚了整個濱河市伴鳖,隨后出現(xiàn)的幾起案子,更是在濱河造成了極大的恐慌徙硅,老刑警劉巖榜聂,帶你破解...
    沈念sama閱讀 219,589評論 6 508
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件,死亡現(xiàn)場離奇詭異嗓蘑,居然都是意外死亡须肆,警方通過查閱死者的電腦和手機,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 93,615評論 3 396
  • 文/潘曉璐 我一進(jìn)店門桩皿,熙熙樓的掌柜王于貴愁眉苦臉地迎上來豌汇,“玉大人,你說我怎么就攤上這事泄隔【芗” “怎么了?”我有些...
    開封第一講書人閱讀 165,933評論 0 356
  • 文/不壞的土叔 我叫張陵佛嬉,是天一觀的道長逻澳。 經(jīng)常有香客問我,道長暖呕,這世上最難降的妖魔是什么赡盘? 我笑而不...
    開封第一講書人閱讀 58,976評論 1 295
  • 正文 為了忘掉前任,我火速辦了婚禮缰揪,結(jié)果婚禮上陨享,老公的妹妹穿的比我還像新娘。我一直安慰自己钝腺,他們只是感情好抛姑,可當(dāng)我...
    茶點故事閱讀 67,999評論 6 393
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著艳狐,像睡著了一般定硝。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上毫目,一...
    開封第一講書人閱讀 51,775評論 1 307
  • 那天蔬啡,我揣著相機與錄音诲侮,去河邊找鬼。 笑死箱蟆,一個胖子當(dāng)著我的面吹牛沟绪,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播空猜,決...
    沈念sama閱讀 40,474評論 3 420
  • 文/蒼蘭香墨 我猛地睜開眼绽慈,長吁一口氣:“原來是場噩夢啊……” “哼!你這毒婦竟也來了辈毯?” 一聲冷哼從身側(cè)響起坝疼,我...
    開封第一講書人閱讀 39,359評論 0 276
  • 序言:老撾萬榮一對情侶失蹤,失蹤者是張志新(化名)和其女友劉穎谆沃,沒想到半個月后钝凶,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體,經(jīng)...
    沈念sama閱讀 45,854評論 1 317
  • 正文 獨居荒郊野嶺守林人離奇死亡唁影,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 38,007評論 3 338
  • 正文 我和宋清朗相戀三年耕陷,在試婚紗的時候發(fā)現(xiàn)自己被綠了。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片夭咬。...
    茶點故事閱讀 40,146評論 1 351
  • 序言:一個原本活蹦亂跳的男人離奇死亡啃炸,死狀恐怖,靈堂內(nèi)的尸體忽然破棺而出卓舵,到底是詐尸還是另有隱情南用,我是刑警寧澤,帶...
    沈念sama閱讀 35,826評論 5 346
  • 正文 年R本政府宣布掏湾,位于F島的核電站裹虫,受9級特大地震影響,放射性物質(zhì)發(fā)生泄漏融击。R本人自食惡果不足惜筑公,卻給世界環(huán)境...
    茶點故事閱讀 41,484評論 3 331
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望尊浪。 院中可真熱鬧匣屡,春花似錦、人聲如沸拇涤。這莊子的主人今日做“春日...
    開封第一講書人閱讀 32,029評論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽鹅士。三九已至券躁,卻和暖如春,著一層夾襖步出監(jiān)牢的瞬間,已是汗流浹背也拜。 一陣腳步聲響...
    開封第一講書人閱讀 33,153評論 1 272
  • 我被黑心中介騙來泰國打工以舒, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留,地道東北人慢哈。 一個月前我還...
    沈念sama閱讀 48,420評論 3 373
  • 正文 我出身青樓蔓钟,卻偏偏與公主長得像,于是被迫代替她去往敵國和親岸军。 傳聞我的和親對象是個殘疾皇子奋刽,可洞房花燭夜當(dāng)晚...
    茶點故事閱讀 45,107評論 2 356

推薦閱讀更多精彩內(nèi)容

  • Spring Cloud為開發(fā)人員提供了快速構(gòu)建分布式系統(tǒng)中一些常見模式的工具(例如配置管理瓦侮,服務(wù)發(fā)現(xiàn)艰赞,斷路器,智...
    卡卡羅2017閱讀 134,672評論 18 139
  • 用兩張圖告訴你肚吏,為什么你的 App 會卡頓? - Android - 掘金 Cover 有什么料方妖? 從這篇文章中你...
    hw1212閱讀 12,734評論 2 59
  • 必備的理論基礎(chǔ) 1.操作系統(tǒng)作用: 隱藏丑陋復(fù)雜的硬件接口,提供良好的抽象接口罚攀。 管理調(diào)度進(jìn)程党觅,并將多個進(jìn)程對硬件...
    drfung閱讀 3,541評論 0 5
  • 我第一次讀到趙周老師的《這樣讀書就夠了》了的這本書的時候,內(nèi)心觸動特別大斋泄!甚至可以說讀過之后感到非常震撼杯瞻,原來讀書...
    梅飛菲閱讀 153評論 0 8
  • 妹妹今天入手新手機一部,下午和媽媽聊微信炫掐,媽媽說本來不想給她買這么貴的手機(其實不算貴)魁莉,我說你肯定是又舍不得花錢...
    LittltSunshines閱讀 197評論 0 2