1. Reading the fishhook source
1.1 How fishhook works
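dyld binds lazy and non-lazy symbols by updating pointers in particular sections of the __DATA segment of a Mach-O binary. fishhook uses the symbol names passed to rebind_symbols to determine which locations need updating, then rebinds those symbols by writing the corresponding replacements.

For a given image, the __DATA segment may contain two sections relevant to dynamic symbol binding: __nl_symbol_ptr and __la_symbol_ptr.

__nl_symbol_ptr is an array of pointers to non-lazily bound data (bound when the library is loaded). __la_symbol_ptr is an array of pointers to imported functions, usually filled in by a routine called dyld_stub_binder the first time the symbol is called (dyld can also be told to bind these pointers at launch).

To find the name of the symbol that corresponds to a particular location in one of these sections, we have to go through several layers of indirection.

For the two relevant sections, the section header (struct section, declared in <mach-o/loader.h>) provides an offset (in the reserved1 field) into the so-called indirect symbol table. The indirect symbol table lives in the binary's __LINKEDIT segment and is simply an array of indexes into the symbol table (also in __LINKEDIT), in the same order as the pointers in the non-lazy and lazy symbol sections. So, given struct section nl_symbol_ptr, the symbol table index for the first address in that section is indirect_symbol_table[nl_symbol_ptr->reserved1]. The symbol table itself is an array of struct nlist (see <mach-o/nlist.h>), and each nlist contains an index into the string table in __LINKEDIT, where the actual symbol names are stored. So for each pointer in __nl_symbol_ptr and __la_symbol_ptr we can find the corresponding symbol, then the corresponding string to compare against the requested symbol names. If there is a match, we replace the pointer in the section with the replacement.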
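To make the chain of indirection concrete, here is a minimal sketch that resolves a pointer slot to its symbol name using small mock tables. Everything prefixed with demo_ is hypothetical and only stands in for the real structures: demo_nlist keeps just the string-table offset of struct nlist, and the table contents are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct nlist: only the string-table offset. */
typedef struct {
    uint32_t n_strx;
} demo_nlist;

/* Mock string table: NUL-separated names.
   Offsets: 1 -> "_printf", 9 -> "_NSLog", 16 -> "_open". */
static const char demo_strtab[] = "\0_printf\0_NSLog\0_open";

/* Mock symbol table: each entry points at a name in demo_strtab. */
static const demo_nlist demo_symtab[] = { {1}, {9}, {16} };

/* Mock indirect symbol table: indexes into demo_symtab. The section we
   inspect owns the entries starting at index 2 (its reserved1 value). */
static const uint32_t demo_indirect_symtab[] = { 0, 0, 2, 1, 0 };

/* Resolve pointer slot `slot` of a section whose reserved1 field is
   `reserved1`: indirect table -> symbol table -> string table. */
static const char *demo_symbol_name_for_slot(uint32_t reserved1, size_t slot) {
    uint32_t symtab_index = demo_indirect_symtab[reserved1 + slot];
    uint32_t strtab_offset = demo_symtab[symtab_index].n_strx;
    return demo_strtab + strtab_offset;
}
```

With these mock tables, slot 0 of a section whose reserved1 is 2 resolves to "_open" and slot 1 to "_NSLog". The real code performs the same three lookups, only against tables found in __LINKEDIT.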
1.2 Test code
#import "fishhook.h"

//--------------------------------- Hook NSLog -----------
// Function pointer that will receive the original NSLog
static void (*sys_nslog)(NSString *format, ...);
// The replacement function
void my_nslog(NSString *format, ...) {
    format = [format stringByAppendingString:@" why are you here again \n"];
    // Call the original implementation. Only the format string is passed on;
    // the variadic arguments are not forwarded in this demo.
    sys_nslog(format);
}
@implementation ViewController
- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"log is here, bro");
    struct rebinding nslog;
    nslog.name = "NSLog";
    nslog.replacement = my_nslog;
    nslog.replaced = (void *)&sys_nslog;
    struct rebinding rebs[1] = {nslog};
    rebind_symbols(rebs, 1);
    NSLog(@"log is here, bro");
}
@end
Output:
2020-03-16 09:47:38.526862+0800 Demo[28657:5210895] log is here, bro
2020-03-16 09:47:38.536892+0800 Demo[28657:5210895] log is here, bro why are you here again
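1.3 Attaching MachOView to the process
MachOView pops up an input box asking for a PID.
The PID is shown in Xcode under Show the Debug navigator, which you can switch to quickly with ⌘ + 7. Read the process's PID there and type it into the input box.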
1.4 Verifying the source against MachOView
Top-level data definitions and initialization
struct rebindings_entry {
    struct rebinding *rebindings;
    size_t rebindings_nel;
    struct rebindings_entry *next;
};
static struct rebindings_entry *_rebindings_head;
// Allocate storage for the rebinding structs passed in
// and link them into the rebindings_entry list
static int prepend_rebindings(struct rebindings_entry **rebindings_head,
                              struct rebinding rebindings[],
                              size_t nel) {
    // Allocate one rebindings_entry node
    struct rebindings_entry *new_entry = (struct rebindings_entry *) malloc(sizeof(struct rebindings_entry));
    if (!new_entry) {
        return -1;
    }
    // Allocate room for nel rebinding structs
    new_entry->rebindings = (struct rebinding *) malloc(sizeof(struct rebinding) * nel);
    if (!new_entry->rebindings) {
        free(new_entry);
        return -1;
    }
    // Copy the caller's rebindings into new_entry->rebindings
    memcpy(new_entry->rebindings, rebindings, sizeof(struct rebinding) * nel);
    // Record how many there are
    new_entry->rebindings_nel = nel;
    // The new entry always goes in front of the current head
    new_entry->next = *rebindings_head;
    // The head now points at the new entry
    *rebindings_head = new_entry;
    return 0;
}
This defines the rebindings_entry linked list. Every rebinding call passes in a struct rebinding rebindings[] array; a new rebindings_entry is created for it and inserted at the head of the list.
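One consequence of inserting at the head is lookup order. The sketch below is a simplified, hypothetical re-creation (demo_rebinding, demo_entry, demo_prepend and demo_lookup are illustrative names, not fishhook's API) in which a lookup walks the list from the head and returns the first match, so the most recently registered rebinding for a name is found first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down rebinding: a name and its replacement. */
struct demo_rebinding {
    const char *name;
    void *replacement;
};

struct demo_entry {
    struct demo_rebinding *rebindings;
    size_t nel;
    struct demo_entry *next;
};

/* Copy the caller's array into a new node and put it in front of *head. */
static int demo_prepend(struct demo_entry **head,
                        const struct demo_rebinding *rebindings, size_t nel) {
    struct demo_entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->rebindings = malloc(sizeof *e->rebindings * nel);
    if (!e->rebindings) {
        free(e);
        return -1;
    }
    memcpy(e->rebindings, rebindings, sizeof *e->rebindings * nel);
    e->nel = nel;
    e->next = *head;
    *head = e;
    return 0;
}

/* Walk from the head; the first entry that names the symbol wins. */
static void *demo_lookup(const struct demo_entry *head, const char *name) {
    for (const struct demo_entry *cur = head; cur; cur = cur->next) {
        for (size_t j = 0; j < cur->nel; j++) {
            if (strcmp(cur->rebindings[j].name, name) == 0) {
                return cur->rebindings[j].replacement;
            }
        }
    }
    return NULL;
}
```

Registering "NSLog" twice and then looking it up returns the replacement from the second registration, because that node sits closer to the head.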
The two public functions
static void _rebind_symbols_for_image(const struct mach_header *header, intptr_t slide) {
    // Find the matching symbols in this image and rebind them
    rebind_symbols_for_image(_rebindings_head, header, slide);
}
// Use this function when the target Mach-O image is already known
int rebind_symbols_image(void *header,
                         intptr_t slide,
                         struct rebinding rebindings[],
                         size_t rebindings_nel) {
    struct rebindings_entry *rebindings_head = NULL;
    int retval = prepend_rebindings(&rebindings_head, rebindings, rebindings_nel);
    rebind_symbols_for_image(rebindings_head, (const struct mach_header *) header, slide);
    if (rebindings_head) {
        free(rebindings_head->rebindings);
    }
    free(rebindings_head);
    return retval;
}
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
    int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
    if (retval < 0) {
        return retval;
    }
    // If this is the first call, register a callback for image additions
    // (dyld also invokes it for the images that are already loaded);
    // otherwise, just run on the existing images
    if (!_rebindings_head->next) {
        // Register _rebind_symbols_for_image for images added later;
        // dyld also calls it right away for every image already loaded
        _dyld_register_func_for_add_image(_rebind_symbols_for_image);
    } else {
        // _dyld_image_count() returns the number of loaded images
        uint32_t c = _dyld_image_count();
        for (uint32_t i = 0; i < c; i++) {
            // _dyld_get_image_header(i) returns the header pointer of image i
            // _dyld_get_image_vmaddr_slide(i) returns the ASLR slide of image i
            _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
        }
    }
    return retval;
}
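rebind_symbols_image and rebind_symbols are the two public functions for rebinding symbols. rebind_symbols_image rebinds symbols in one specified image, while rebind_symbols processes every image.
Whichever one is used, the work ends up in rebind_symbols_for_image, which locates the addresses of the relevant tables and sections.
Addresses of the relevant tables and sections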
static void rebind_symbols_for_image(struct rebindings_entry *rebindings,
                                     const struct mach_header *header,
                                     intptr_t slide) {
    Dl_info info;
    // Check that header belongs to an image loaded in this process; return if not
    if (dladdr(header, &info) == 0) {
        return;
    }
    // Variables filled in by the loop over the load commands below
    segment_command_t *cur_seg_cmd;
    // The __LINKEDIT segment command among the load commands
    segment_command_t *linkedit_segment = NULL;
    // The LC_SYMTAB load command
    struct symtab_command* symtab_cmd = NULL;
    // The LC_DYSYMTAB load command
    struct dysymtab_command* dysymtab_cmd = NULL;
    // Header address + sizeof(mach_header_t) skips the mach_header
    // and lands on the first load command
    uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
    // Walk the load commands to fill in the three variables above
    for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
        cur_seg_cmd = (segment_command_t *)cur;
        // LC_SEGMENT_ARCH_DEPENDENT is LC_SEGMENT_64 on 64-bit targets (LC_SEGMENT on 32-bit)
        if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
            // Found __LINKEDIT
            if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
                linkedit_segment = cur_seg_cmd;
            }
        }
        // LC_SYMTAB: this is symtab_cmd
        else if (cur_seg_cmd->cmd == LC_SYMTAB) {
            symtab_cmd = (struct symtab_command*)cur_seg_cmd;
        }
        // LC_DYSYMTAB: this is dysymtab_cmd
        else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
            dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
        }
    }
    // Return if any of them is missing or there are no indirect symbols:
    // this image has nothing to rebind
    if (!symtab_cmd || !dysymtab_cmd || !linkedit_segment ||
        !dysymtab_cmd->nindirectsyms) {
        return;
    }
    // Find base symbol/string table addresses
    // Base that the __LINKEDIT file offsets are added to: slide + vmaddr - fileoff
    uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
    // Address of the symbol table in memory
    nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
    // Address of the string table in memory
    char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);
    // Get indirect symbol table (array of uint32_t indices into symbol table)
    // Address of the indirect symbol table in memory
    uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);
    // As before, skip the mach_header to reach the first load command
    cur = (uintptr_t)header + sizeof(mach_header_t);
    // Walk the load commands again and rebind the matching symbols
    for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
        cur_seg_cmd = (segment_command_t *)cur;
        if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
            // Skip segments that are neither __DATA nor __DATA_CONST
            if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
                strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
                continue;
            }
            // Walk every section in the segment
            for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
                section_t *sect = (section_t *)(cur + sizeof(segment_command_t)) + j;
                // Lazy symbol pointer section (S_LAZY_SYMBOL_POINTERS)
                if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
                    // This function does the actual rebinding
                    perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
                }
                // Non-lazy symbol pointer section (S_NON_LAZY_SYMBOL_POINTERS)
                if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
                    // This function does the actual rebinding
                    perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
                }
            }
        }
    }
}
At the top, the header pointer plus the header size gives the address of the first load command. The loop over the load commands then collects three data structures:
// The __LINKEDIT segment command among the load commands
segment_command_t *linkedit_segment = NULL;
// The LC_SYMTAB load command
struct symtab_command* symtab_cmd = NULL;
// The LC_DYSYMTAB load command
struct dysymtab_command* dysymtab_cmd = NULL;
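The walk itself relies on the fact that every load command starts with a cmd field and a cmdsize field, so advancing by cmdsize reaches the next command. The following sketch reproduces that walk over a mock byte buffer. The demo_ names are hypothetical, and the DEMO_LC_ constants are redefined locally with the values from <mach-o/loader.h> so the code builds on any platform. Like the original loop, it trusts cmdsize and does no bounds checking.

```c
#include <stdint.h>
#include <string.h>

/* Values as defined in <mach-o/loader.h>, redefined for portability. */
enum {
    DEMO_LC_SYMTAB = 0x2,
    DEMO_LC_DYSYMTAB = 0xb,
    DEMO_LC_SEGMENT_64 = 0x19
};

/* Every load command begins with these two fields. */
struct demo_load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

/* Walk ncmds variable-size commands; return the byte offset of the first
   one whose cmd equals `wanted`, or -1 if there is none. */
static long demo_find_command(const unsigned char *cmds, uint32_t ncmds,
                              uint32_t wanted) {
    size_t offset = 0;
    for (uint32_t i = 0; i < ncmds; i++) {
        struct demo_load_command lc;
        memcpy(&lc, cmds + offset, sizeof lc); /* avoids unaligned reads */
        if (lc.cmd == wanted) {
            return (long)offset;
        }
        offset += lc.cmdsize; /* same step as cur += cur_seg_cmd->cmdsize */
    }
    return -1;
}

/* Helper for building a mock buffer: writes one zero-filled command of
   cmdsize bytes at `offset` and returns the offset of the next one. */
static size_t demo_put_command(unsigned char *buf, size_t offset,
                               uint32_t cmd, uint32_t cmdsize) {
    struct demo_load_command lc = { cmd, cmdsize };
    memset(buf + offset, 0, cmdsize);
    memcpy(buf + offset, &lc, sizeof lc);
    return offset + cmdsize;
}
```

Laying out a segment command, a symtab command and a dysymtab command back to back with demo_put_command and then calling demo_find_command for each type returns the offsets at which they were written.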
The core of the code follows:
// Base that the __LINKEDIT file offsets are added to
uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
In this run linkedit_segment->vmaddr is 4294995968 and linkedit_segment->fileoff is 28672. The decimal values do not make it obvious that the result is a base address, so let's print them in hex:
(lldb) p/x 4294995968
(long) $0 = 0x0000000100007000
(lldb) p/x 28672
(int) $1 = 0x00007000
(lldb) p/x 4294995968 - 28672
(long) $2 = 0x0000000100000000
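This shows that the expression yields the base address of the image in memory (the lldb arithmetic above leaves out the slide).
// Address of the symbol table in memory
nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
// Address of the string table in memory
char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);
The offsets of the symbol table and the string table come from struct symtab_command. Adding the base address gives the addresses of the two tables in memory.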
(lldb) p/x 0x0000000100000000 + 30200
(long) $3 = 0x00000001000075f8
(lldb) p/x 0x0000000100000000 + 33408
(long) $4 = 0x0000000100008280
MachOView confirms that both addresses are correct.
// Address of the indirect symbol table in memory
uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);
The indirect symbol table is located through struct dysymtab_command.
(lldb) p/x 0x0000000100000000 + 33224
(long) $5 = 0x00000001000081c8
That gives us the address of the indirect symbol table as well.
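The arithmetic from the lldb sessions can be collected into one small sketch. The demo_ names are hypothetical. The inputs are the values quoted in the text (vmaddr 4294995968, fileoff 28672, and the offsets 30200, 33408 and 33224), with the slide assumed to be 0 as in the lldb arithmetic.

```c
#include <stdint.h>

/* Addresses of the three __LINKEDIT tables, as plain integers. */
struct demo_tables {
    uint64_t symtab;
    uint64_t strtab;
    uint64_t indirect_symtab;
};

/* Base that file offsets are added to: slide + vmaddr - fileoff. */
static uint64_t demo_linkedit_base(uint64_t slide, uint64_t vmaddr,
                                   uint64_t fileoff) {
    return slide + vmaddr - fileoff;
}

/* Each table address is the base plus the file offset recorded in
   symtab_command (symoff, stroff) or dysymtab_command (indirectsymoff). */
static struct demo_tables demo_table_addresses(uint64_t base, uint32_t symoff,
                                               uint32_t stroff,
                                               uint32_t indirectsymoff) {
    struct demo_tables t = { base + symoff, base + stroff,
                             base + indirectsymoff };
    return t;
}
```

Calling demo_linkedit_base(0, 4294995968, 28672) gives 0x100000000, and feeding that base together with the three offsets into demo_table_addresses gives 0x1000075f8, 0x100008280 and 0x1000081c8, the same addresses lldb printed. With a non-zero slide every address shifts by that amount.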
The two sections related to dynamic symbol binding
// As before, skip the mach_header to reach the first load command
cur = (uintptr_t)header + sizeof(mach_header_t);
// Walk the load commands again and rebind the matching symbols
for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
        // Skip segments that are neither __DATA nor __DATA_CONST
        if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
            strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
            continue;
        }
        // Walk every section in the segment
        for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
            section_t *sect = (section_t *)(cur + sizeof(segment_command_t)) + j;
            // Lazy symbol pointer section (S_LAZY_SYMBOL_POINTERS)
            if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
                // This function does the actual rebinding
                perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
            }
            // Non-lazy symbol pointer section (S_NON_LAZY_SYMBOL_POINTERS)
            if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
                // This function does the actual rebinding
                perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
            }
        }
    }
}
For a given image, the __DATA segment contains two sections related to dynamic symbol binding: __nl_symbol_ptr and __la_symbol_ptr (the code also looks in __DATA_CONST). The loop finds these two sections and rebinds the symbols in each of them.
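The sections are recognised by type rather than by name: the low byte of the flags field holds the section type, and the remaining bits hold attributes. A minimal sketch of that test is shown here. The DEMO_ constants are local copies of the values in <mach-o/loader.h>, and demo_is_symbol_pointer_section is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the <mach-o/loader.h> values, so this builds anywhere. */
enum {
    DEMO_SECTION_TYPE = 0x000000ff,          /* mask for the type byte */
    DEMO_S_NON_LAZY_SYMBOL_POINTERS = 0x6,
    DEMO_S_LAZY_SYMBOL_POINTERS = 0x7
};

/* True when the section holds lazy or non-lazy symbol pointers, whatever
   attribute bits are set in the upper part of flags. */
static bool demo_is_symbol_pointer_section(uint32_t flags) {
    uint32_t type = flags & DEMO_SECTION_TYPE;
    return type == DEMO_S_LAZY_SYMBOL_POINTERS ||
           type == DEMO_S_NON_LAZY_SYMBOL_POINTERS;
}
```

Masking first matters because a section can carry attribute bits in the upper 24 bits of flags; comparing the raw flags value against the type constant would miss such a section.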
Rebinding the symbols
static void perform_rebinding_with_section(struct rebindings_entry *rebindings,
                                           section_t *section,
                                           intptr_t slide,
                                           nlist_t *symtab,
                                           char *strtab,
                                           uint32_t *indirect_symtab) {
    // reserved1 is the index of this section's first entry in the indirect symbol table,
    // so indirect_symtab + reserved1 is the array of symbol table indices for this section
    uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
    // The section's array of pointers in memory: slide + section->addr
    void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
    // Walk every pointer slot in the section
    for (uint i = 0; i < section->size / sizeof(void *); i++) {
        // Entry i of the indirect symbol table is an index into the symbol table
        uint32_t symtab_index = indirect_symbol_indices[i];
        if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
            symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
            continue;
        }
        // Read the symbol table entry to get the name's offset in the string table
        uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
        // The symbol name starts at strtab + strtab_offset
        char *symbol_name = strtab + strtab_offset;
        // Symbol names carry a leading "_", so a real name has at least two characters
        bool symbol_name_longer_than_1 = symbol_name[0] && symbol_name[1];
        struct rebindings_entry *cur = rebindings;
        // Walk every rebindings_entry registered so far
        while (cur) {
            // Check every rebinding in this entry
            for (uint j = 0; j < cur->rebindings_nel; j++) {
                // Skip the leading "_" and compare with the requested name
                if (symbol_name_longer_than_1 &&
                    strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
                    // If replaced was supplied and the slot does not
                    // already hold the replacement
                    if (cur->rebindings[j].replaced != NULL &&
                        indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
                        // Save the current (original) function address through replaced
                        *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
                    }
                    // Write the replacement into the slot
                    indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
                    goto symbol_loop;
                }
            }
            // Move on to the next entry in the list
            cur = cur->next;
        }
    symbol_loop:;
    }
}
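This part follows the description in the section on how fishhook works:
- indirect_symtab + section->reserved1 gives the start of this section's entries in the indirect symbol table.
- indirect_symbol_bindings is the array of function pointers stored in the section (for example __nl_symbol_ptr).
- Walk the indirect symbol table entries one by one to get each symbol table index, fetch the nlist at that index, and read its offset into the string table.
- The string table plus that offset gives the address of the first character of the symbol name.
- Symbol names in the string table start with "_", so a real name has at least 2 characters. &symbol_name[1] is the name with the leading "_" removed.
- Loop over the list of requested rebindings and compare each name with &symbol_name[1]. On a match, store the original function address through replaced, then overwrite the slot with the address of the replacement function.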
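The steps above can be tried out on mock data without any Mach-O file. In the sketch below, all demo_ names are hypothetical, plain void * slots stand in for the section's function pointers, and symbol_names[idx] stands in for the symbol table plus string table lookup. The two DEMO_INDIRECT_ constants are local copies of the values in <mach-o/loader.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the <mach-o/loader.h> values. */
#define DEMO_INDIRECT_SYMBOL_LOCAL 0x80000000u
#define DEMO_INDIRECT_SYMBOL_ABS   0x40000000u

/* Hypothetical stand-in for struct rebinding. */
struct demo_hook {
    const char *name;   /* symbol name without the leading "_" */
    void *replacement;  /* pointer to write into the slot */
    void **replaced;    /* where to save the original pointer, may be NULL */
};

static void demo_rebind_section(void **slots, const uint32_t *indirect_indices,
                                size_t count, const char **symbol_names,
                                const struct demo_hook *hooks, size_t nel) {
    for (size_t i = 0; i < count; i++) {
        uint32_t idx = indirect_indices[i];
        /* Local and absolute entries have no symbol table entry to look up. */
        if (idx == DEMO_INDIRECT_SYMBOL_ABS || idx == DEMO_INDIRECT_SYMBOL_LOCAL ||
            idx == (DEMO_INDIRECT_SYMBOL_LOCAL | DEMO_INDIRECT_SYMBOL_ABS)) {
            continue;
        }
        const char *name = symbol_names[idx];
        if (!name[0] || !name[1]) {
            continue; /* too short to be "_" plus a real name */
        }
        for (size_t j = 0; j < nel; j++) {
            if (strcmp(name + 1, hooks[j].name) != 0) {
                continue;
            }
            /* Save the original only if the slot is not already hooked. */
            if (hooks[j].replaced && slots[i] != hooks[j].replacement) {
                *hooks[j].replaced = slots[i];
            }
            slots[i] = hooks[j].replacement;
            break; /* first match wins, like goto symbol_loop */
        }
    }
}
```

Running it twice over the same slots leaves the saved original untouched on the second pass, because the slot already holds the replacement. That is the purpose of the inequality check before writing through replaced.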