Introduction
This article is a hands-on summary of Chapter 3 of 程序员的自我修养:链接、装载与库 (the structure diagrams were all drawn with Gliffy Diagrams). It uses the tools readelf and objdump to dissect object files and learn how they are laid out.
1. Object files
1.1 Definition of an object file
The file a compiler produces from source code is called an object file. On Linux, gcc -c xxxx.c compiles a source file into a .o file.
1.2 Review of the compilation process
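On Linux, object files use the ELF format and carry the .o suffix. On Windows, object files carry the .obj suffix. The file command shows that both simple.o and the mp4_player.obj used here are ELF relocatable files. The suffix alone does not determine the format, however: this mp4_player.obj targets ARM and is ELF, whereas .obj files produced by Microsoft's compiler use COFF rather than ELF.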
ckt@ubuntu:~/work/elf$ file simple.o
simple.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
ckt@ubuntu:~/work/elf$ file mp4_player.obj
mp4_player.obj: ELF 32-bit LSB relocatable, ARM, version 1 (SYSV), not stripped
An object file is just one kind of ELF file, the relocatable file. ELF files come in four types: relocatable file, executable file, shared object file, and core dump file.
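The four types are distinguished by the e_type field of the ELF header. The sketch below maps the numeric values from the ELF specification to readable names. The enum and function names are illustrative assumptions, chosen so they do not clash with the ET_* macros in elf.h.

```c
#include <stdint.h>

/* Numeric e_type values from the ELF specification. The enum names are
   hypothetical; <elf.h> exposes the same numbers as ET_REL, ET_EXEC, ... */
enum elf_file_type {
    ELF_TYPE_NONE = 0,
    ELF_TYPE_REL  = 1,  /* relocatable file, e.g. simple.o */
    ELF_TYPE_EXEC = 2,  /* executable file */
    ELF_TYPE_DYN  = 3,  /* shared object file */
    ELF_TYPE_CORE = 4   /* core dump file */
};

/* Translate an e_type value into a human-readable description. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case ELF_TYPE_NONE: return "No file type";
    case ELF_TYPE_REL:  return "Relocatable file";
    case ELF_TYPE_EXEC: return "Executable file";
    case ELF_TYPE_DYN:  return "Shared object file";
    case ELF_TYPE_CORE: return "Core dump file";
    default:            return "Unknown or processor-specific";
    }
}
```

Calling elf_type_name(1) yields the description for an object file such as simple.o.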
-
Example
On Linux, running gcc -c xxxx.c produces a .o file:
ckt@ubuntu:~/work/elf$ gcc -c simple.c
ckt@ubuntu:~/work/elf$ ls
simple.c simple.o
simple.c contains only the following function fun, whose body is empty:
void fun()
{
}
Opening simple.o in UltraEdit shows its contents: machine instructions, data, and so on. These bytes are what our program is made of. For programmers, code written in a high-level language (C/C++, Java, etc.) is the easiest to read and understand, but a computer only understands machine language. It works in binary, with 0 mapped to a low voltage level and 1 to a high one, and that is how a program ultimately runs.
We can analyze the object file simple.o with readelf and objdump. To get more out of these tools, it helps to understand the structure of an ELF file first. The raw bytes of simple.o are:
00000000h: 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 ; ?ELF............
00000010h: 01 00 3E 00 01 00 00 00 00 00 00 00 00 00 00 00 ; ..>.............
00000020h: 00 00 00 00 00 00 00 00 08 01 00 00 00 00 00 00 ; ................
00000030h: 00 00 00 00 40 00 00 00 00 00 40 00 0B 00 08 00 ; ....@.....@.....
00000040h: 55 48 89 E5 5D C3 00 00 00 47 43 43 3A 20 28 55 ; UH夊]?..GCC: (U
00000050h: 62 75 6E 74 75 2F 4C 69 6E 61 72 6F 20 34 2E 36 ; buntu/Linaro 4.6
00000060h: 2E 33 2D 31 75 62 75 6E 74 75 35 29 20 34 2E 36 ; .3-1ubuntu5) 4.6
00000070h: 2E 33 00 00 00 00 00 00 14 00 00 00 00 00 00 00 ; .3..............
00000080h: 01 7A 52 00 01 78 10 01 1B 0C 07 08 90 01 00 00 ; .zR..x......?..
00000090h: 1C 00 00 00 1C 00 00 00 00 00 00 00 06 00 00 00 ; ................
000000a0h: 00 41 0E 10 86 02 43 0D 06 41 0C 07 08 00 00 00 ; .A..?C..A......
000000b0h: 00 2E 73 79 6D 74 61 62 00 2E 73 74 72 74 61 62 ; ..symtab..strtab
000000c0h: 00 2E 73 68 73 74 72 74 61 62 00 2E 74 65 78 74 ; ..shstrtab..text
000000d0h: 00 2E 64 61 74 61 00 2E 62 73 73 00 2E 63 6F 6D ; ..data..bss..com
000000e0h: 6D 65 6E 74 00 2E 6E 6F 74 65 2E 47 4E 55 2D 73 ; ment..note.GNU-s
000000f0h: 74 61 63 6B 00 2E 72 65 6C 61 2E 65 68 5F 66 72 ; tack..rela.eh_fr
00000100h: 61 6D 65 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ame.............
00000110h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000120h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000130h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000140h: 00 00 00 00 00 00 00 00 1B 00 00 00 01 00 00 00 ; ................
00000150h: 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000160h: 40 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 ; @...............
00000170h: 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 ; ................
00000180h: 00 00 00 00 00 00 00 00 21 00 00 00 01 00 00 00 ; ........!.......
00000190h: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000001a0h: 48 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; H...............
000001b0h: 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 ; ................
000001c0h: 00 00 00 00 00 00 00 00 27 00 00 00 08 00 00 00 ; ........'.......
000001d0h: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000001e0h: 48 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; H...............
000001f0h: 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 ; ................
00000200h: 00 00 00 00 00 00 00 00 2C 00 00 00 01 00 00 00 ; ........,.......
00000210h: 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; 0...............
00000220h: 48 00 00 00 00 00 00 00 2B 00 00 00 00 00 00 00 ; H.......+.......
00000230h: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ; ................
00000240h: 01 00 00 00 00 00 00 00 35 00 00 00 01 00 00 00 ; ........5.......
00000250h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000260h: 73 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; s...............
00000270h: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ; ................
00000280h: 00 00 00 00 00 00 00 00 4A 00 00 00 01 00 00 00 ; ........J.......
00000290h: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000002a0h: 78 00 00 00 00 00 00 00 38 00 00 00 00 00 00 00 ; x.......8.......
000002b0h: 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ; ................
000002c0h: 00 00 00 00 00 00 00 00 45 00 00 00 04 00 00 00 ; ........E.......
000002d0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000002e0h: B0 04 00 00 00 00 00 00 18 00 00 00 00 00 00 00 ; ?..............
000002f0h: 09 00 00 00 06 00 00 00 08 00 00 00 00 00 00 00 ; ................
00000300h: 18 00 00 00 00 00 00 00 11 00 00 00 03 00 00 00 ; ................
00000310h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000320h: B0 00 00 00 00 00 00 00 54 00 00 00 00 00 00 00 ; ?......T.......
00000330h: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ; ................
00000340h: 00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 ; ................
00000350h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000360h: C8 03 00 00 00 00 00 00 D8 00 00 00 00 00 00 00 ; ?......?......
00000370h: 0A 00 00 00 08 00 00 00 08 00 00 00 00 00 00 00 ; ................
00000380h: 18 00 00 00 00 00 00 00 09 00 00 00 03 00 00 00 ; ................
00000390h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000003a0h: A0 04 00 00 00 00 00 00 0E 00 00 00 00 00 00 00 ; ?..............
000003b0h: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ; ................
000003c0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000003d0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000003e0h: 01 00 00 00 04 00 F1 FF 00 00 00 00 00 00 00 00 ; ......?........
000003f0h: 00 00 00 00 00 00 00 00 00 00 00 00 03 00 01 00 ; ................
00000400h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000410h: 00 00 00 00 03 00 02 00 00 00 00 00 00 00 00 00 ; ................
00000420h: 00 00 00 00 00 00 00 00 00 00 00 00 03 00 03 00 ; ................
00000430h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000440h: 00 00 00 00 03 00 05 00 00 00 00 00 00 00 00 00 ; ................
00000450h: 00 00 00 00 00 00 00 00 00 00 00 00 03 00 06 00 ; ................
00000460h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00000470h: 00 00 00 00 03 00 04 00 00 00 00 00 00 00 00 00 ; ................
00000480h: 00 00 00 00 00 00 00 00 0A 00 00 00 12 00 01 00 ; ................
00000490h: 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 ; ................
000004a0h: 00 73 69 6D 70 6C 65 2E 63 00 66 75 6E 00 00 00 ; .simple.c.fun...
000004b0h: 20 00 00 00 00 00 00 00 02 00 00 00 02 00 00 00 ; ...............
000004c0h: 00 00 00 00 00 00 00 00 ; ........
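The first 16 bytes of the dump form the e_ident array. As a rough sketch, the check below recognizes the magic number 7F 45 4C 46 and reads the class and byte-order bytes that follow it. The function names are illustrative assumptions, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Returns 32 or 64 when buf starts with a valid ELF identification,
   and 0 when the magic number or the class byte is not recognized. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (len < 16 || memcmp(buf, magic, sizeof magic) != 0)
        return 0;

    switch (buf[4]) {          /* e_ident[EI_CLASS] */
    case 1:  return 32;        /* ELFCLASS32 */
    case 2:  return 64;        /* ELFCLASS64 */
    default: return 0;
    }
}

/* e_ident[EI_DATA]: 1 means little-endian (LSB), 2 means big-endian (MSB).
   The caller is expected to have validated buf with elf_class_bits first. */
int elf_is_little_endian(const unsigned char *buf)
{
    return buf[5] == 1;
}
```

For simple.o the fifth byte is 02 and the sixth is 01, which agrees with the "ELF 64-bit LSB" reported by the file command.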
ELF file structure
Like a Java class file, an ELF file stores its data in a fixed format, and a tool that parses an object file walks through it field by field according to the data structure of each part. The structures are defined in /usr/include/elf.h. The overall layout of an ELF file is shown in the figure below:
[Figure: overall layout of an ELF file]
ELF Header
The ELF header is the first part of an ELF file. The header structure for 64-bit ELF files is:
typedef struct
{
    unsigned char e_ident[EI_NIDENT];  /* Magic number and other info */
    Elf64_Half    e_type;              /* Object file type */
    Elf64_Half    e_machine;           /* Architecture */
    Elf64_Word    e_version;           /* Object file version */
    Elf64_Addr    e_entry;             /* Entry point virtual address */
    Elf64_Off     e_phoff;             /* Program header table file offset */
    Elf64_Off     e_shoff;             /* Section header table file offset */
    Elf64_Word    e_flags;             /* Processor-specific flags */
    Elf64_Half    e_ehsize;            /* ELF header size in bytes */
    Elf64_Half    e_phentsize;         /* Program header table entry size */
    Elf64_Half    e_phnum;             /* Program header table entry count */
    Elf64_Half    e_shentsize;         /* Section header table entry size */
    Elf64_Half    e_shnum;             /* Section header table entry count */
    Elf64_Half    e_shstrndx;          /* Section header string table index */
} Elf64_Ehdr;
Next we use the first tool for analyzing object files, readelf. According to man readelf, its job is to display information about ELF files:
DESCRIPTION
readelf displays information about one or more ELF format object files.
Use readelf -h simple.o to parse the header. man readelf also documents the -h option, which displays the information in the ELF header:
OPTIONS
-h
--file-header
Displays the information contained in the ELF header at the start of the file.
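The header mainly holds basic information from which the size and starting offset of the parts that follow can be determined. The fields of most interest are usually the ELF file type, whether the file is 32-bit or 64-bit, the size of the header, the size of each section header entry, and the number of sections.
Read against Elf64_Ehdr, the parsed result is as follows: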
[Figure: output of readelf -h simple.o mapped to the fields of Elf64_Ehdr]
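The same fields can be read by hand from the first 64 bytes of the hex dump of simple.o. The sketch below decodes them as little-endian values at the offsets implied by Elf64_Ehdr. The struct ehdr_summary and the helper names are assumptions made for this illustration; a real program on Linux would normally use Elf64_Ehdr from elf.h directly.

```c
#include <stdint.h>

/* First 64 bytes of simple.o, copied from the hex dump. */
const unsigned char simple_o_header[64] = {
    0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x01, 0x00, 0x3E, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x08, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x40, 0x00, 0x0B, 0x00, 0x08, 0x00
};

/* Hypothetical summary of the header fields discussed in the text. */
struct ehdr_summary {
    uint16_t type, machine, ehsize, shentsize, shnum, shstrndx;
    uint64_t shoff;
};

static uint16_t read_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t read_u64le(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Decode selected Elf64_Ehdr fields from a little-endian 64-bit header. */
struct ehdr_summary parse_ehdr(const unsigned char *h)
{
    struct ehdr_summary s;
    s.type      = read_u16le(h + 16);  /* e_type */
    s.machine   = read_u16le(h + 18);  /* e_machine */
    s.shoff     = read_u64le(h + 40);  /* e_shoff */
    s.ehsize    = read_u16le(h + 52);  /* e_ehsize */
    s.shentsize = read_u16le(h + 58);  /* e_shentsize */
    s.shnum     = read_u16le(h + 60);  /* e_shnum */
    s.shstrndx  = read_u16le(h + 62);  /* e_shstrndx */
    return s;
}
```

Applied to simple_o_header, this gives a relocatable file (type 1) for x86-64 (machine 62, 0x3E), a 64-byte header, and 11 section headers of 64 bytes each starting at offset 0x108, with the section name string table at index 8.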
Section
With the header parsed, we move on to the sections. Each section is described by a section header with the following structure:
typedef struct
{
    Elf64_Word  sh_name;       /* Section name (string tbl index) */
    Elf64_Word  sh_type;       /* Section type */
    Elf64_Xword sh_flags;      /* Section flags */
    Elf64_Addr  sh_addr;       /* Section virtual addr at execution */
    Elf64_Off   sh_offset;     /* Section file offset */
    Elf64_Xword sh_size;       /* Section size in bytes */
    Elf64_Word  sh_link;       /* Link to another section */
    Elf64_Word  sh_info;       /* Additional section information */
    Elf64_Xword sh_addralign;  /* Section alignment */
    Elf64_Xword sh_entsize;    /* Entry size if section holds table */
} Elf64_Shdr;
Sections hold the machine code and data. Running readelf -S -W simple.o parses the section header table, and each column of the output maps onto a field of Elf64_Shdr.
ckt@ubuntu:~/work/elf$ readelf -S -W simple.o
There are 11 section headers, starting at offset 0x108:
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        0000000000000000 000040 000006 00  AX  0   0  4
  [ 2] .data             PROGBITS        0000000000000000 000048 000000 00  WA  0   0  4
  [ 3] .bss              NOBITS          0000000000000000 000048 000000 00  WA  0   0  4
  [ 4] .comment          PROGBITS        0000000000000000 000048 00002b 01  MS  0   0  1
  [ 5] .note.GNU-stack   PROGBITS        0000000000000000 000073 000000 00      0   0  1
  [ 6] .eh_frame         PROGBITS        0000000000000000 000078 000038 00   A  0   0  8
  [ 7] .rela.eh_frame    RELA            0000000000000000 0004b0 000018 18      9   6  8
  [ 8] .shstrtab         STRTAB          0000000000000000 0000b0 000054 00      0   0  1
  [ 9] .symtab           SYMTAB          0000000000000000 0003c8 0000d8 18     10   8  8
  [10] .strtab           STRTAB          0000000000000000 0004a0 00000e 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
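The Name column is not stored in the section header itself: sh_name is a byte offset into the .shstrtab section (section 8 here, as e_shstrndx says). The sketch below performs that lookup on the 0x54 bytes of .shstrtab taken from the hex dump at offset 0xb0. The function and variable names are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Contents of .shstrtab in simple.o: 0x54 bytes at file offset 0xb0.
   The final NUL of the table is the implicit terminator of the literal. */
const char simple_o_shstrtab[0x54] =
    "\0.symtab\0.strtab\0.shstrtab\0.text\0.data\0.bss\0.comment"
    "\0.note.GNU-stack\0.rela.eh_frame";

/* Resolve sh_name to a section name, or return NULL when the offset is
   out of range or the string is not terminated inside the table. */
const char *section_name(const char *strtab, size_t strtab_size,
                         uint32_t sh_name)
{
    if (sh_name >= strtab_size)
        return NULL;
    if (memchr(strtab + sh_name, '\0', strtab_size - sh_name) == NULL)
        return NULL;
    return strtab + sh_name;
}
```

In the dump, the section header at 0x148 begins with 1B 00 00 00, and offset 0x1b in the table is ".text". The header at 0x288 stores 0x4a, which points into the middle of ".rela.eh_frame" and therefore reads as ".eh_frame"; sharing a suffix this way keeps the string table small.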
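Of these, the sections we usually care about are .text (code), .data (initialized global variables and static local variables), and .bss (uninitialized global variables and static local variables). Each is examined later in this article.
From the output of readelf -S -W simple.o we can work out every component of simple.o and where it starts. The size reported by ls -l, 1224 bytes, matches the end offset of simple.o, 0x000004c8: the last section in the file, .rela.eh_frame, starts at 0x4b0 and is 0x18 bytes long.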
ckt@ubuntu:~/work/elf$ ls -l simple.o
-rw-rw-r-- 1 ckt ckt 1224 Apr 12 18:42 simple.o
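The file size can be cross-checked with a little arithmetic on the Off and Size columns. The sketch below takes the largest end offset over all sections that occupy file space, also considering the section header table itself (e_shoff plus e_shnum times e_shentsize). The struct and function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the readelf -S output, reduced to what the check needs. */
struct section_extent {
    uint64_t offset;         /* Off column */
    uint64_t size;           /* Size column */
    int      occupies_file;  /* 0 for NULL and NOBITS (.bss) sections */
};

/* Off/Size values copied from readelf -S -W simple.o. */
const struct section_extent simple_o_sections[11] = {
    { 0x000, 0x00, 0 },  /* [ 0] NULL            */
    { 0x040, 0x06, 1 },  /* [ 1] .text           */
    { 0x048, 0x00, 1 },  /* [ 2] .data           */
    { 0x048, 0x00, 0 },  /* [ 3] .bss (NOBITS)   */
    { 0x048, 0x2b, 1 },  /* [ 4] .comment        */
    { 0x073, 0x00, 1 },  /* [ 5] .note.GNU-stack */
    { 0x078, 0x38, 1 },  /* [ 6] .eh_frame       */
    { 0x4b0, 0x18, 1 },  /* [ 7] .rela.eh_frame  */
    { 0x0b0, 0x54, 1 },  /* [ 8] .shstrtab       */
    { 0x3c8, 0xd8, 1 },  /* [ 9] .symtab         */
    { 0x4a0, 0x0e, 1 }   /* [10] .strtab         */
};

/* Largest end offset of any file-backed piece of the object file. */
uint64_t elf_file_end(const struct section_extent *s, size_t n,
                      uint64_t shoff, uint64_t shentsize)
{
    uint64_t end = shoff + (uint64_t)n * shentsize;  /* header table end */

    for (size_t i = 0; i < n; i++) {
        if (!s[i].occupies_file)
            continue;
        if (s[i].offset + s[i].size > end)
            end = s[i].offset + s[i].size;
    }
    return end;
}
```

With e_shoff = 0x108 and e_shentsize = 64, the section header table ends at 0x3c8, exactly where .symtab begins, and the overall end is 0x4b0 + 0x18 = 0x4c8, which is 1224 in decimal.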
Parsing an object file
Having covered the ELF file structure, we now parse an object file. First prepare the source file SimpleSection.c, then run gcc -c SimpleSection.c to produce the object file SimpleSection.o.
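int printf(const char* format, ...);

int global_init_var = 84;   /* initialized global variable */
int global_uninit_var;      /* uninitialized global variable */

void func1(int i)
{
    printf("%d\n", i);
}

int main(void)
{
    static int static_var = 85;  /* initialized static local variable */
    static int static_var2;      /* uninitialized static local variable */
    int a = 1;
    int b;                       /* uninitialized local; its value is indeterminate */

    func1(static_var + static_var2 + a + b);
    return 0;
}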
In this part we use another command, objdump. According to man objdump, it displays information about object files:
DESCRIPTION
objdump displays information about one or more object files.
-
Viewing the sections of the object file
Running objdump -h SimpleSection.o lists the sections and gives the size of each one:
ckt@ubuntu:~/work/elf$ objdump -h SimpleSection.o
SimpleSection.o: file format elf64-x86-64
Sections:
Idx Name             Size      VMA               LMA               File off  Algn
  0 .text            00000052  0000000000000000  0000000000000000  00000040  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data            00000008  0000000000000000  0000000000000000  00000094  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000004  0000000000000000  0000000000000000  0000009c  2**2
                     ALLOC
  3 .rodata          00000004  0000000000000000  0000000000000000  0000009c  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment         0000002b  0000000000000000  0000000000000000  000000a0  2**0
                     CONTENTS, READONLY
  5 .note.GNU-stack  00000000  0000000000000000  0000000000000000  000000cb  2**0
                     CONTENTS, READONLY
  6 .eh_frame        00000058  0000000000000000  0000000000000000  000000d0  2**3
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
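Our code is stored in .text, initialized global variables and static local variables are stored in .data, and uninitialized global variables and static local variables are stored in .bss.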
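The sizes reported by objdump -h can be accounted for from the source. The sketch below is only a sanity check, assuming a 4-byte int as on x86-64, and the function names are made up for this illustration. .data holds global_init_var and static_var, .bss holds static_var2 (global_uninit_var is not counted there, for the reason given in the BSS section), and .rodata holds the format string "%d\n" with its terminating NUL.

```c
#include <stddef.h>

/* .data: global_init_var and static_var, two initialized ints. */
size_t expected_data_size(void)
{
    return 2 * sizeof(int);
}

/* .bss: static_var2, one uninitialized static int. */
size_t expected_bss_size(void)
{
    return sizeof(int);
}

/* .rodata: the string literal "%d\n", including the trailing NUL byte. */
size_t expected_rodata_size(void)
{
    return sizeof("%d\n");
}
```

On a platform with 4-byte int these evaluate to 8, 4, and 4, matching the Size column for .data, .bss, and .rodata. Note also that .bss and .rodata share the file offset 0x9c: .bss has no CONTENTS flag and takes up no bytes in the file.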
-
Code section
Running objdump -s -d SimpleSection.o gives the following result for the code section (.text):
[Figure: objdump -s -d SimpleSection.o output for the .text section]
Data section and read-only data section
Running objdump -s -d SimpleSection.o gives the following result for the data section (.data) and the read-only data section (.rodata):
[Figure: objdump -s -d SimpleSection.o output for the .data and .rodata sections]
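BSS section
Running objdump -x -s -d SimpleSection.o prints the symbol table of the object file, which tells us where each variable is stored. Only the uninitialized static local variable static_var2 is placed in .bss. global_uninit_var is not placed in any section; it is recorded as a COMMON symbol (shown as *COM* in the symbol table), which is unrelated to the .comment section. This is the behavior of the GCC 4.6 used here; GCC 10 and later default to -fno-common, which places such variables in .bss.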
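In addition, static variables explicitly initialized to 0 are also placed in .bss. An uninitialized static variable has the value 0 as well, so the compiler treats the two cases the same way and puts both in .bss. This saves disk space, because .bss occupies no space in the file.
For example, in the code below x1 is placed in .bss and x2 in .data: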
static int x1 = 0;
static int x2 = 12;