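Mach-O Study Notes (Part 4)
I have been studying Mach-O recently. These notes record what I learned, organize my thinking, and deepen my understanding.
The demo used in the text below is attached.
Overview
Chapter 1 describes the basic structure of a Mach-O file.
Chapter 2 introduces symbols and analyzes the symbol table.
Chapter 3 explores dynamic linking.
Chapter 4 analyzes fishhook.
Chapter 5 analyzes BeeHive.
Chapter 6 covers app launch time.
Introduction to Fishhook
The official documentation describes fishhook as follows: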
fishhook is a very simple library that enables dynamically rebinding symbols in Mach-O binaries running on iOS in the simulator and on device.
Put simply, fishhook can rebind dynamically linked symbols.
Fishhook Source Code Analysis
The Fishhook Interface
Looking at the API in fishhook.h, it defines one struct and two functions:
struct rebinding {
    const char *name;   // name of the target symbol
    void *replacement;  // new symbol value (address) to install
    void **replaced;    // receives the original symbol value (address)
};
// Operates on every image loaded in the process
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel);
// Operates on one specified image
int rebind_symbols_image(void *header,
                         intptr_t slide,
                         struct rebinding rebindings[],
                         size_t rebindings_nel);
Usually only the former is used, and this article covers only rebind_symbols(). It takes two parameters: rebindings is an array of rebinding structs, and rebindings_nel is the length of that array.
Fishhook Example
This example shows how to use fishhook. It rebinds the symbol of the printf function so that it points to a custom function. The code is as follows:
// main.c
#include <stdio.h>
#include <stdarg.h>
#include "fishhook.h"
static int (*ori_printf)(const char *, ...);
int yzk_printf(const char *format, ...)
{
    int ret = 0;
    ori_printf("yzk print before\n");
    va_list arg;
    va_start(arg, format);
    ret = vprintf(format, arg);
    va_end(arg);
    ori_printf("yzk print after\n");
    return ret;
}
int main(void)
{
    // Rebind the `printf` symbol so that it points to the custom `yzk_printf` function
    struct rebinding printf_rebinding = { "printf", yzk_printf, (void *)&ori_printf };
    rebind_symbols((struct rebinding[1]){printf_rebinding}, 1);
    // Call `printf`; the logic that actually runs is the one defined in `yzk_printf`
    printf("aaa bbb test %d\n", 42);
    return 0;
}
Running the program shows that the result matches expectations: the call to printf actually executes yzk_printf().
rebind_symbols_for_image
In the short source of fishhook.c, two core functions carry out the rebind logic: rebind_symbols_for_image and perform_rebinding_with_section. The former locates the target sections; the latter performs the actual rebind inside a section, symbol by symbol.
First, rebind_symbols_for_image:
static void rebind_symbols_for_image(struct rebindings_entry *rebindings,
                                     const struct mach_header *header,
                                     intptr_t slide) {
    Dl_info info;
    if (dladdr(header, &info) == 0) {
        return;
    }
    segment_command_t *cur_seg_cmd;
    segment_command_t *linkedit_segment = NULL;
    struct symtab_command* symtab_cmd = NULL;
    struct dysymtab_command* dysymtab_cmd = NULL;
    uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t); // start address of the load commands
    for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) { // walk every load command that follows the header
        cur_seg_cmd = (segment_command_t *)cur; // the load command being visited
        if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
            // The command is LC_SEGMENT_64 or LC_SEGMENT (depends on the CPU architecture; the rest of this article assumes 64-bit)
            if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
                // The command is LC_SEGMENT_64(__LINKEDIT)
                linkedit_segment = cur_seg_cmd;
            }
        } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
            // The command is LC_SYMTAB
            symtab_cmd = (struct symtab_command*)cur_seg_cmd;
        } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
            // The command is LC_DYSYMTAB
            dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
        }
    }
    if (!symtab_cmd || !dysymtab_cmd || !linkedit_segment ||
        !dysymtab_cmd->nindirectsyms) {
        return;
    }
    // Find base symbol/string table addresses
    uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff; // base to which __LINKEDIT file offsets are added (slid vmaddr minus fileoff)
    nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff); // virtual memory address of the symbol table
    char *strtab = (char *)(linkedit_base + symtab_cmd->stroff); // virtual memory address of the string table
    // Get indirect symbol table (array of uint32_t indices into symbol table)
    uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff); // virtual memory address of the indirect symbol table
    cur = (uintptr_t)header + sizeof(mach_header_t);
    for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
        cur_seg_cmd = (segment_command_t *)cur;
        if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
            // Filter segments: search only the __DATA and __DATA_CONST segments
            if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
                strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
                continue;
            }
            // Walk the sections of the segment, looking for the target section types
            for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
                section_t *sect =
                    (section_t *)(cur + sizeof(segment_command_t)) + j;
                if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
                    // On a match, call `perform_rebinding_with_section()` to modify the contents of that section
                    perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
                }
                if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
                    perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
                }
            }
        }
    }
}
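The linkedit_base arithmetic in the function above can be checked with a small worked example. This is only a sketch: toy_linkedit, toy_linkedit_base, and toy_table_address are hypothetical names, and the struct keeps just the two fields of the real segment command that the computation needs. The point it illustrates is that symoff, stroff, and indirectsymoff are file offsets, so subtracting fileoff from the slid vmaddr gives a base to which those offsets can be added directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields of the __LINKEDIT segment
 * command used here; the real struct lives in <mach-o/loader.h>. */
struct toy_linkedit {
    uint64_t vmaddr;   /* preferred (unslid) virtual address of the segment */
    uint64_t fileoff;  /* offset of the segment's contents within the file */
};

/* Runtime address that corresponds to file offset 0 of the image,
 * computed the same way as linkedit_base in the listing above. */
static uintptr_t toy_linkedit_base(intptr_t slide, const struct toy_linkedit *seg) {
    assert(seg != NULL);
    return (uintptr_t)slide + (uintptr_t)seg->vmaddr - (uintptr_t)seg->fileoff;
}

/* In-memory address of a table whose location is given as a file offset
 * inside __LINKEDIT (for example symoff or stroff). */
static uintptr_t toy_table_address(uintptr_t linkedit_base, uint32_t file_offset) {
    return linkedit_base + file_offset;
}
```

With made-up values slide = 0x4000, vmaddr = 0x10008000, fileoff = 0x8000, and symoff = 0x8100, the base is 0x10004000 and the symbol table sits at 0x1000C100, which is the slid segment start plus the table's offset of 0x100 inside the segment.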
As the code shows, fishhook finds its target sections by matching on section type. An earlier chapter introduced the section struct; its flags field carries the section type. The flags values of __nl_symbol_ptr, __got, and __la_symbol_ptr in the main.out binary from that chapter are listed below:
Section
  sectname __nl_symbol_ptr
  segname  __DATA
  flags    0x00000006
Section
  sectname __got
  segname  __DATA
  flags    0x00000006
Section
  sectname __la_symbol_ptr
  segname  __DATA
  flags    0x00000007
0x06 and 0x07 correspond to the macros S_NON_LAZY_SYMBOL_POINTERS and S_LAZY_SYMBOL_POINTERS. The former marks a section that stores the addresses of non-lazy symbols; the latter marks a section that stores the addresses of lazy symbols.
Two important points follow.
First, fishhook locates sections such as __la_symbol_ptr by section type, not by name.
Second, fishhook can rebind data symbols as well as function symbols, because it can modify the __got section.
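The type test can be sketched in isolation. The constants below copy the values of SECTION_TYPE, S_NON_LAZY_SYMBOL_POINTERS, and S_LAZY_SYMBOL_POINTERS from <mach-o/loader.h> under hypothetical TOY_ names so that the snippet also builds off macOS; toy_is_symbol_pointer_section is likewise a made-up helper, not part of fishhook.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from <mach-o/loader.h>; the TOY_ prefix marks them as
 * local copies made for this sketch. */
#define TOY_SECTION_TYPE               0x000000ffu  /* low byte holds the type */
#define TOY_S_NON_LAZY_SYMBOL_POINTERS 0x6u
#define TOY_S_LAZY_SYMBOL_POINTERS     0x7u

/* True when the section's type (not its name) says that it holds
 * symbol pointers, which is the test fishhook applies to each section. */
static bool toy_is_symbol_pointer_section(uint32_t flags) {
    uint32_t type = flags & TOY_SECTION_TYPE;  /* drop the attribute bits */
    return type == TOY_S_LAZY_SYMBOL_POINTERS ||
           type == TOY_S_NON_LAZY_SYMBOL_POINTERS;
}
```

Masking with the type mask matters because the upper bits of flags hold section attributes, so comparing the whole flags word against 0x6 or 0x7 would miss any section that also sets an attribute bit.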
perform_rebinding_with_section
Finally, perform_rebinding_with_section:
static void perform_rebinding_with_section(struct rebindings_entry *rebindings,
                                           section_t *section,
                                           intptr_t slide,
                                           nlist_t *symtab,
                                           char *strtab,
                                           uint32_t *indirect_symtab) {
    // indirect_symtab is an array of uint32_t indices; section->reserved1 is the index of this section's first entry in indirect_symtab
    uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
    // Data of the current section (S_NON_LAZY_SYMBOL_POINTERS or S_LAZY_SYMBOL_POINTERS); every entry in the section is a pointer
    void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
    // Walk every pointer in the section
    for (uint i = 0; i < section->size / sizeof(void *); i++) {
        uint32_t symtab_index = indirect_symbol_indices[i]; // read the symbol table index from indirect_symtab
        if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
            symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
            // Skip INDIRECT_SYMBOL_ABS and INDIRECT_SYMBOL_LOCAL entries
            continue;
        }
        uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx; // look up the symbol by symtab_index and read its offset into the string table
        char *symbol_name = strtab + strtab_offset; // use the offset to read the symbol name from the string table
        // Walk the linked list of requested rebindings
        struct rebindings_entry *cur = rebindings;
        while (cur) {
            for (uint j = 0; j < cur->rebindings_nel; j++) {
                // Compare names after skipping the first character of the real symbol name, which is why the printf example above passes "printf" rather than "_printf"
                if (strlen(symbol_name) > 1 &&
                    strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
                    if (cur->rebindings[j].replaced != NULL &&
                        indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
                        *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i]; // save the old symbol address into replaced
                    }
                    indirect_symbol_bindings[i] = cur->rebindings[j].replacement; // write the new symbol address into the section
                    goto symbol_loop;
                }
            }
            cur = cur->next;
        }
    symbol_loop:;
    }
}
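The lookup chain in this function (pointer slot, indirect symbol table entry, symbol table entry, string table name) can be reproduced on toy data without any Mach-O headers. The sketch below is a simplified model, not fishhook's code: toy_nlist, toy_rebinding, and toy_rebind are hypothetical names, the special INDIRECT_SYMBOL_* entries are left out, and only a single rebinding is handled instead of a linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for nlist_64 and struct rebinding. */
struct toy_nlist { uint32_t n_strx; };  /* offset of the name in strtab */
struct toy_rebinding {
    const char *name;    /* name without the leading underscore */
    void *replacement;   /* address to install */
    void **replaced;     /* receives the previous address; may be NULL */
};

/* For slot i, indirect_indices[i] selects a symtab entry, whose n_strx
 * selects a name in strtab. Slots whose name (minus the first character)
 * equals rb->name are rewritten. Returns the number of rewritten slots. */
static int toy_rebind(void **slots, size_t nslots,
                      const uint32_t *indirect_indices,
                      const struct toy_nlist *symtab,
                      const char *strtab,
                      const struct toy_rebinding *rb) {
    assert(slots != NULL && rb != NULL);
    int hits = 0;
    for (size_t i = 0; i < nslots; i++) {
        const char *symbol_name = strtab + symtab[indirect_indices[i]].n_strx;
        if (symbol_name[0] == '\0' || symbol_name[1] == '\0') {
            continue;  /* name too short to carry a prefix plus a name */
        }
        if (strcmp(symbol_name + 1, rb->name) != 0) {
            continue;  /* a different symbol */
        }
        if (rb->replaced != NULL && slots[i] != rb->replacement) {
            *rb->replaced = slots[i];  /* hand back the old address */
        }
        slots[i] = rb->replacement;    /* install the new address */
        hits++;
    }
    return hits;
}
```

With a string table holding "_puts" and "_printf", a request for "printf" rewrites only the slot whose indirect entry leads to "_printf" and leaves the "_puts" slot untouched.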
Summary
Based on the analysis above, a short summary of fishhook:
- fishhook can hook only symbols that live in dynamic libraries; in other words, it cannot hook local symbols
- fishhook handles both function symbols and data symbols (global variables and global constants alike)
- The symbol name passed to fishhook is not the real symbol name: to rebind the _printf symbol, for example, pass "printf"
Problems with Fishhook
Fishhook reliably crashes on A12-chip devices running iOS 14.5 or later. The cause is this step:
indirect_symbol_bindings[i] = cur->rebindings[j].replacement; // write the new symbol address into the section
Here fishhook writes to memory with a plain assignment, so it crashes whenever the target memory is not writable. On earlier system versions the memory mapped for the __DATA_CONST segment was readable and writable, so the problem never surfaced. In the iOS 14.5 beta most of that memory became read-only, so the assignment crashes.
The fix is:
// Query the current protection (read/write/execute) of the given memory
static vm_prot_t get_protection(void *sectionStart) {
    mach_port_t task = mach_task_self();
    vm_size_t size = 0;
    vm_address_t address = (vm_address_t)sectionStart;
    memory_object_name_t object;
#ifdef __LP64__
    mach_msg_type_number_t count = VM_REGION_BASIC_INFO_COUNT_64;
    vm_region_basic_info_data_64_t info;
    kern_return_t info_ret = vm_region_64(
        task, &address, &size, VM_REGION_BASIC_INFO_64, (vm_region_info_64_t)&info, &count, &object);
#else
    mach_msg_type_number_t count = VM_REGION_BASIC_INFO_COUNT;
    vm_region_basic_info_data_t info;
    kern_return_t info_ret = vm_region(task, &address, &size, VM_REGION_BASIC_INFO, (vm_region_info_t)&info, &count, &object);
#endif
    if (info_ret == KERN_SUCCESS) {
        return info.protection;
    } else {
        return VM_PROT_READ;
    }
}
static void perform_rebinding_with_section(struct rebindings_entry *rebindings,
                                           section_t *section,
                                           intptr_t slide,
                                           nlist_t *symtab,
                                           char *strtab,
                                           uint32_t *indirect_symtab) {
    const bool isDataConst = strcmp(section->segname, SEG_DATA_CONST) == 0;
    const bool isAuthConst = strcmp(section->segname, SEG_AUTH_CONST) == 0;
    uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
    void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
    vm_prot_t oldProtection = VM_PROT_READ;
    vm_size_t trunc_address = (vm_size_t)indirect_symbol_bindings;
    vm_size_t trunc_size = 0;
    if (isDataConst || isAuthConst) { // the section belongs to the __DATA_CONST or __AUTH_CONST segment, so adjust the protection
        trunc_address = trunc_page((vm_size_t)indirect_symbol_bindings); // mprotect requires a page-aligned address, so round down to the page boundary
        trunc_size = (vm_size_t)indirect_symbol_bindings - trunc_address; // bytes between the page boundary and the start of the section
        pthread_mutex_lock(&mutex); // lock so that concurrent threads cannot misjudge the protection and write to read-only memory (`mutex` is not declared in this excerpt; it is assumed to be a file-scope pthread_mutex_t)
        oldProtection = get_protection((void *)trunc_address); // save the old protection so that it can be restored afterwards
        mprotect((void *)trunc_address, section->size+trunc_size, PROT_READ | PROT_WRITE); // make the memory readable and writable
    }
    for (uint i = 0; i < section->size / sizeof(void *); i++) {
        uint32_t symtab_index = indirect_symbol_indices[i];
        if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
            symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
            continue;
        }
        uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
        char *symbol_name = strtab + strtab_offset;
        bool symbol_name_longer_than_1 = symbol_name[0] && symbol_name[1];
        struct rebindings_entry *cur = rebindings;
        while (cur) {
            for (uint j = 0; j < cur->rebindings_nel; j++) {
                if (symbol_name_longer_than_1 &&
                    strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
                    if (cur->rebindings[j].replaced != NULL &&
                        indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
                        *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
                    }
                    indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
                    goto symbol_loop;
                }
            }
            cur = cur->next;
        }
    symbol_loop:;
    }
    if (isDataConst || isAuthConst) {
        int protection = 0;
        if (oldProtection & VM_PROT_READ) {
            protection |= PROT_READ;
        }
        if (oldProtection & VM_PROT_WRITE) {
            protection |= PROT_WRITE;
        }
        if (oldProtection & VM_PROT_EXECUTE) {
            protection |= PROT_EXEC;
        }
        mprotect((void *)trunc_address, section->size+trunc_size, protection); // restore the original protection
        pthread_mutex_unlock(&mutex); // unlock
    }
}