Preliminaries
- ELF file
  Short for "Executable and Linkable Format". When a C program is compiled and linked, the compiler turns each C source file (.c) into an object file (.o) that holds machine instructions encoded in a binary format the hardware understands. The linker then combines all the object files into a single binary image, which is the ELF file.
- Disk layout
  - The bootloader (boot.S and main.c) is stored in the first sector of the boot disk.
  - The kernel (which must be an ELF file) is stored from the second sector onward.
- Boot steps
  - The BIOS is loaded into memory and executed.
  - The BIOS initializes devices, sets up the interrupt routines, reads the first sector of the boot device into memory, and jumps to it.
  - Once the bootloader is running, boot.S switches to protected mode and sets up a stack so that C code can run, then calls bootmain().
  - bootmain() in main.c reads in the kernel and jumps to it.
Reading the code
The code uses two structs defined in inc/elf.h:
struct Elf { // ELF file header
    uint32_t e_magic; // must equal ELF_MAGIC
    uint8_t e_elf[12];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff; // file offset of the program header table
    uint32_t e_shoff; // file offset of the section header table
    uint32_t e_flags;
    uint16_t e_ehsize; // size of the ELF header itself
    uint16_t e_phentsize;
    uint16_t e_phnum; // number of program headers
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};
struct Proghdr { // one entry of the program header table
    uint32_t p_type;
    uint32_t p_offset; // offset of the segment from the start of the ELF file
    uint32_t p_va;
    uint32_t p_pa; // physical address
    uint32_t p_filesz;
    uint32_t p_memsz; // size in memory
    uint32_t p_flags; // read, write, execute permissions
    uint32_t p_align;
};
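First, the ELF format provides two views of a file: the linking view and the execution view.
The linking view is organized by section and is the one used at link time; the execution view is organized by segment and is the one used at run time. In the figure above, the left side shows the linking view and the right side shows the execution view. As it shows, one segment can contain several sections.
This article is concerned with execution. struct Proghdr is the program header that describes one segment, and a file can contain several of them.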
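To make the relationship between the two structs concrete, here is a minimal user-space sketch that validates an ELF image held in a byte buffer and counts its loadable segments. The function name count_loadable_segments and the bounds checks are illustrative additions, not part of JOS; the sketch also assumes a little-endian host, so that the magic number compares the same way it does on x86.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ELF_MAGIC     0x464C457FU /* "\x7FELF" read as a little-endian uint32_t */
#define ELF_PROG_LOAD 1           /* p_type of a loadable segment (PT_LOAD) */

struct Elf {
    uint32_t e_magic;
    uint8_t  e_elf[12];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

struct Proghdr {
    uint32_t p_type;
    uint32_t p_offset;
    uint32_t p_va;
    uint32_t p_pa;
    uint32_t p_filesz;
    uint32_t p_memsz;
    uint32_t p_flags;
    uint32_t p_align;
};

/* Hypothetical helper: returns the number of loadable segments described
 * by the image, or -1 if the buffer is too short or the magic is wrong. */
int count_loadable_segments(const uint8_t *image, size_t len)
{
    struct Elf eh;
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (eh.e_magic != ELF_MAGIC)
        return -1;

    int n = 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        /* Entry i sits e_phoff + i * sizeof(struct Proghdr) bytes into the file. */
        size_t off = (size_t) eh.e_phoff + (size_t) i * sizeof(struct Proghdr);
        struct Proghdr ph;
        if (off + sizeof ph > len)
            return -1;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == ELF_PROG_LOAD)
            n++;
    }
    return n;
}
```

Unlike bootmain(), which ignores p_type and loads every entry, this sketch filters on ELF_PROG_LOAD; that choice is only for illustration.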
The bootmain() function
#include <inc/x86.h>
#include <inc/elf.h>
/**********************************************************************
* This is a dirt simple boot loader, whose sole job is to boot
* an ELF kernel image from the first IDE hard disk.
*
* DISK LAYOUT
* * This program(boot.S and main.c) is the bootloader. It should
* be stored in the first sector of the disk.
*
* * The 2nd sector onward holds the kernel image.
*
* * The kernel image must be in ELF format.
*
* BOOT UP STEPS
* * when the CPU boots it loads the BIOS into memory and executes it
*
* the BIOS initializes devices, sets up the interrupt routines, and
* reads the first sector of the boot device(e.g., hard-drive)
* into memory and jumps to it.
*
* * Assuming this boot loader is stored in the first sector of the
* hard-drive, this code takes over...
*
* * control starts in boot.S -- which sets up protected mode,
* and a stack so C code can then run, then calls bootmain()
*
* * bootmain() in this file takes over, reads in the kernel and jumps to it.
**********************************************************************/
// Sector size is 512 bytes
#define SECTSIZE 512
// 0x10000 is scratch space where the kernel's ELF header is staged;
// the segments themselves are loaded at their own p_pa
#define ELFHDR ((struct Elf *) 0x10000) // scratch space
void readsect(void*, uint32_t);
void readseg(uint32_t, uint32_t, uint32_t);
void
bootmain(void)
{
    struct Proghdr *ph, *eph;
    // read 1st page off disk
    // Read 8*512 = 4096 bytes, starting at offset 0 of the kernel image, into ELFHDR
    readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);
    // is this a valid ELF?
    if (ELFHDR->e_magic != ELF_MAGIC)
        goto bad;
    // load each program segment (ignores ph flags)
    // ph: start of the program header table
    ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
    // eph: one past the last program header
    eph = ph + ELFHDR->e_phnum;
    for (; ph < eph; ph++)
        // p_pa is the load address of this segment (as well
        // as the physical address)
        // For each program header, load its segment:
        // copy p_memsz bytes from file offset p_offset to address p_pa
        readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
    // call the entry point from the ELF header
    // note: does not return!
    ((void (*)(void)) (ELFHDR->e_entry))();
bad:
    outw(0x8A00, 0x8A00);
    outw(0x8A00, 0x8E00);
    while (1)
        /* do nothing */;
}
Tricky syntax explained
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
ELFHDR is first cast to a uint8_t pointer so that the addition advances by bytes, which yields the start of the program header table; that address is then cast to a struct Proghdr pointer, ph.
((void (*)(void)) (ELFHDR->e_entry))();
This casts ELFHDR->e_entry to a pointer to a function that takes no arguments and returns nothing, then calls that function.
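The two casts can be tried out in ordinary user-space C. The sketch below is only an illustration: struct Rec, step_in_bytes, step_in_records, fake_entry, and call_address are hypothetical names, and converting a function pointer to uintptr_t and back is implementation-defined, although it works on common platforms such as x86 Linux.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-byte record, the same size as struct Proghdr. */
struct Rec { uint32_t field[8]; };

/* Bytes covered by "(uint8_t *) base + n": the pointer moves n bytes. */
size_t step_in_bytes(struct Rec *base, size_t n)
{
    return (size_t) (((uint8_t *) base + n) - (uint8_t *) base);
}

/* Bytes covered by "base + n" without the cast: n * sizeof(struct Rec). */
size_t step_in_records(struct Rec *base, size_t n)
{
    return (size_t) ((uint8_t *) (base + n) - (uint8_t *) base);
}

int entered = 0;

/* Stand-in for the kernel entry point. */
void fake_entry(void)
{
    entered = 1;
}

/* Same shape as ((void (*)(void)) (ELFHDR->e_entry))(); the integer is
 * reinterpreted as a "void f(void)" function pointer and called. */
void call_address(uintptr_t addr)
{
    ((void (*)(void)) addr)();
}
```

Without the uint8_t cast, adding e_phoff to a struct Elf pointer would advance by e_phoff * sizeof(struct Elf) bytes, which is why bootmain() casts before adding.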
Reading a segment
(This analysis covers only the logic; the internals of readsect and waitdisk are not discussed.)
// Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
// Might copy more than asked
void
readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
    uint32_t end_pa;
    end_pa = pa + count;
    // round down to sector boundary
    // i.e. align pa down to a multiple of SECTSIZE
    pa &= ~(SECTSIZE - 1);
    // translate from bytes to sectors, and kernel starts at sector 1
    // offset changes from a byte offset into a sector number
    offset = (offset / SECTSIZE) + 1;
    // If this is too slow, we could read lots of sectors at a time.
    // We'd write more to memory than asked, but it doesn't matter --
    // we load in increasing order.
    while (pa < end_pa) {
        // Since we haven't enabled paging yet and we're using
        // an identity segment mapping (see boot.S), we can
        // use physical addresses directly. This won't be the
        // case once JOS enables the MMU.
        // By now offset is a sector number,
        // and each call reads exactly one sector
        readsect((uint8_t*) pa, offset);
        pa += SECTSIZE;
        offset++;
    }
}
void
waitdisk(void)
{
    // wait for disk ready
    while ((inb(0x1F7) & 0xC0) != 0x40)
        /* do nothing */;
}
void
readsect(void *dst, uint32_t offset)
{
    // wait for disk to be ready
    waitdisk();
    outb(0x1F2, 1); // count = 1
    outb(0x1F3, offset);
    outb(0x1F4, offset >> 8);
    outb(0x1F5, offset >> 16);
    outb(0x1F6, (offset >> 24) | 0xE0);
    outb(0x1F7, 0x20); // cmd 0x20 - read sectors
    // wait for disk to be ready
    waitdisk();
    // read a sector
    insl(0x1F0, dst, SECTSIZE/4);
}
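The control flow of readseg can be exercised without real hardware by replacing the disk and physical memory with arrays. The following is a sketch under that assumption: disk, mem, NSECT, sim_readsect, and sim_readseg are hypothetical names, and the caller is responsible for keeping every access inside the two arrays.

```c
#include <stdint.h>
#include <string.h>

#define SECTSIZE 512
#define NSECT    8

uint8_t disk[NSECT * SECTSIZE]; /* hypothetical disk image, sector 0 = boot sector */
uint8_t mem[NSECT * SECTSIZE];  /* hypothetical stand-in for physical memory */

/* Stand-in for readsect(): copy one whole sector to address pa. */
void sim_readsect(uint32_t pa, uint32_t sector)
{
    memcpy(&mem[pa], &disk[sector * SECTSIZE], SECTSIZE);
}

/* Same logic as readseg(), with the port I/O replaced by memcpy. */
void sim_readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
    uint32_t end_pa = pa + count;

    pa &= ~(uint32_t) (SECTSIZE - 1);  /* round down to a sector boundary */
    offset = (offset / SECTSIZE) + 1;  /* byte offset -> sector; kernel starts at sector 1 */

    while (pa < end_pa) {
        sim_readsect(pa, offset);
        pa += SECTSIZE;
        offset++;
    }
}
```

For example, sim_readseg(612, 600, 612) asks for 600 bytes but copies disk sectors 2 and 3 in full to mem[512..1535], which is the "might copy more than asked" behavior noted in the original comment.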
Tricky syntax explained
- pa &= ~(SECTSIZE - 1);
  The corresponding assembly:
  (gdb)
  => 0x7cf1: and $0xfffffe00,%ebx
  0x00007cf1 in ?? ()
  As a uint32_t, 512 is 0x00000200 in hexadecimal. Subtracting 1 gives 0x000001ff, and bitwise negation gives 0xfffffe00, so the statement clears the low 9 bits of pa.
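The mask arithmetic can be checked directly. This short sketch (the function names are illustrative) compares the mask form with the equivalent divide-and-multiply form; the mask form is only valid because SECTSIZE is a power of two.

```c
#include <stdint.h>

#define SECTSIZE 512u

/* Mask form used by readseg(): clears the low 9 bits. */
uint32_t round_down_mask(uint32_t pa)
{
    return pa & ~(SECTSIZE - 1);
}

/* Equivalent arithmetic form: integer division discards the remainder. */
uint32_t round_down_div(uint32_t pa)
{
    return (pa / SECTSIZE) * SECTSIZE;
}
```

With pa = 0x10234, both functions return 0x10200; an address that is already sector-aligned, such as 0x7c00, is returned unchanged.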
Appendix 1. Assembly generated from main.c
To analyze exercise 3, it helps to review the assembly generated for each function.
// call bootmain()
=> 0x7c45: call 0x7d15
=> 0x7d15: push %ebp
=> 0x7d16: mov %esp,%ebp
=> 0x7d18: push %esi
=> 0x7d19: push %ebx
// push the arguments from right to left
=> 0x7d1a: push $0x0
=> 0x7d1c: push $0x1000
=> 0x7d21: push $0x10000
// call readseg((uint32_t) ELFHDR, SECTSIZE*8, 0)
=> 0x7d26: call 0x7cdc
=> 0x7cdc: push %ebp
=> 0x7cdd: mov %esp,%ebp
=> 0x7cdf: push %edi
=> 0x7ce0: push %esi
// The arguments are reached through offsets from ebp:
// ebp+8 holds arg1
// ebp+12 holds arg2
// ebp+16 holds arg3
=> 0x7ce1: mov 0x10(%ebp),%edi
=> 0x7ce4: push %ebx
=> 0x7ce5: mov 0xc(%ebp),%esi
=> 0x7ce8: mov 0x8(%ebp),%ebx
=> 0x7ceb: shr $0x9,%edi // (offset / SECTSIZE)
=> 0x7cee: add %ebx,%esi
=> 0x7cf0: inc %edi
=> 0x7cf1: and $0xfffffe00,%ebx
=> 0x7cf7: cmp %esi,%ebx // the while loop compares pa with end_pa
=> 0x7cf9: jae 0x7d0d // jump if pa >= end_pa (unsigned)
=> 0x7cfb: push %edi
=> 0x7cfc: push %ebx
=> 0x7cfd: inc %edi // offset++
=> 0x7cfe: add $0x200,%ebx // pa += SECTSIZE
// call readsect((uint8_t*) pa, offset)
=> 0x7d04: call 0x7c7c
=> 0x7c7c: push %ebp
=> 0x7c7d: mov %esp,%ebp
=> 0x7c7f: push %edi
=> 0x7c80: mov 0xc(%ebp),%ecx
// call waitdisk(void)
=> 0x7c83: call 0x7c6a
=> 0x7c6a: push %ebp
=> 0x7c6b: mov $0x1f7,%edx
=> 0x7c70: mov %esp,%ebp
=> 0x7c72: in (%dx),%al
=> 0x7c73: and $0xffffffc0,%eax
=> 0x7c76: cmp $0x40,%al
=> 0x7c78: jne 0x7c72
=> 0x7c7a: pop %ebp
=> 0x7c7b: ret
// waitdisk returns; execution continues in readsect
=> 0x7c88: mov $0x1f2,%edx
=> 0x7c8d: mov $0x1,%al
=> 0x7c8f: out %al,(%dx)
=> 0x7c90: mov $0x1f3,%edx
=> 0x7c95: mov %cl,%al
=> 0x7c97: out %al,(%dx)
=> 0x7c98: mov %ecx,%eax
=> 0x7c9a: mov $0x1f4,%edx
=> 0x7c9f: shr $0x8,%eax
=> 0x7ca2: out %al,(%dx)
=> 0x7ca3: mov %ecx,%eax
=> 0x7ca5: mov $0x1f5,%edx
=> 0x7caa: shr $0x10,%eax
=> 0x7cad: out %al,(%dx)
=> 0x7cae: mov %ecx,%eax
=> 0x7cb0: mov $0x1f6,%edx
=> 0x7cb5: shr $0x18,%eax
=> 0x7cb8: or $0xffffffe0,%eax
=> 0x7cbb: out %al,(%dx)
=> 0x7cbc: mov $0x1f7,%edx
=> 0x7cc1: mov $0x20,%al
=> 0x7cc3: out %al,(%dx)
// call waitdisk(void)
=> 0x7cc4: call 0x7c6a
=> 0x7c6a: push %ebp
=> 0x7c6b: mov $0x1f7,%edx
=> 0x7c70: mov %esp,%ebp
=> 0x7c72: in (%dx),%al
=> 0x7c73: and $0xffffffc0,%eax
=> 0x7c76: cmp $0x40,%al
=> 0x7c78: jne 0x7c72
=> 0x7c7a: pop %ebp
=> 0x7c7b: ret
=> 0x7cc9: mov 0x8(%ebp),%edi
=> 0x7ccc: mov $0x80,%ecx
=> 0x7cd1: mov $0x1f0,%edx
=> 0x7cd6: cld
=> 0x7cd7: repnz insl (%dx),%es:(%edi) // repeats ECX times (0x80 dwords = 512 bytes); ins does not test ZF
=> 0x7cd9: pop %edi
=> 0x7cda: pop %ebp
=> 0x7cdb: ret // return from readsect
=> 0x7d09: pop %eax
=> 0x7d0a: pop %edx
// jump back to the while condition
=> 0x7d0b: jmp 0x7cf7
... // repeats until the while loop exits
=> 0x7d0d: lea -0xc(%ebp),%esp
=> 0x7d10: pop %ebx
=> 0x7d11: pop %esi
=> 0x7d12: pop %edi
=> 0x7d13: pop %ebp
=> 0x7d14: ret // return from readseg
=> 0x7d2b: add $0xc,%esp
=> 0x7d2e: cmpl $0x464c457f,0x10000 // check e_magic
=> 0x7d38: jne 0x7d71
=> 0x7d3a: mov 0x1001c,%eax
=> 0x7d3f: movzwl 0x1002c,%esi
=> 0x7d46: lea 0x10000(%eax),%ebx
=> 0x7d4c: shl $0x5,%esi
=> 0x7d4f: add %ebx,%esi // ebx holds ph, esi holds eph
=> 0x7d51: cmp %esi,%ebx // the for loop compares ph with eph
=> 0x7d53: jae 0x7d6b // jump if ph >= eph (unsigned)
// push the arguments
=> 0x7d55: pushl 0x4(%ebx)
=> 0x7d58: pushl 0x14(%ebx)
=> 0x7d5b: add $0x20,%ebx
=> 0x7d5e: pushl -0x14(%ebx)
// call readseg(ph->p_pa, ph->p_memsz, ph->p_offset)
=> 0x7d61: call 0x7cdc
... // repeats until the for loop exits
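The constants in the listing above (0x1001c, 0x1002c, shl $0x5, and the displacements 0x4, 0x14, and -0x14 after add $0x20) all follow from the struct layouts. The sketch below checks them at compile time with C11 _Static_assert; it assumes a compiler that inserts no padding into these structs, which holds for the usual x86 ABIs, and ph_table_bytes is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

struct Elf {
    uint32_t e_magic;
    uint8_t  e_elf[12];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

struct Proghdr {
    uint32_t p_type;
    uint32_t p_offset;
    uint32_t p_va;
    uint32_t p_pa;
    uint32_t p_filesz;
    uint32_t p_memsz;
    uint32_t p_flags;
    uint32_t p_align;
};

/* mov 0x1001c,%eax loads e_phoff; movzwl 0x1002c,%esi loads e_phnum. */
_Static_assert(offsetof(struct Elf, e_phoff) == 0x1c, "e_phoff at 0x10000 + 0x1c");
_Static_assert(offsetof(struct Elf, e_phnum) == 0x2c, "e_phnum at 0x10000 + 0x2c");

/* shl $0x5,%esi multiplies e_phnum by sizeof(struct Proghdr) == 32. */
_Static_assert(sizeof(struct Proghdr) == 32, "one program header is 32 bytes");

/* pushl 0x4(%ebx) is p_offset, pushl 0x14(%ebx) is p_memsz, and after
 * add $0x20,%ebx, pushl -0x14(%ebx) is p_pa (0x20 - 0x14 == 0xc). */
_Static_assert(offsetof(struct Proghdr, p_offset) == 0x04, "arg3");
_Static_assert(offsetof(struct Proghdr, p_memsz) == 0x14, "arg2");
_Static_assert(offsetof(struct Proghdr, p_pa) == 0x0c, "arg1");

/* Byte length of a table with phnum entries, computed the way the compiler did. */
uint32_t ph_table_bytes(uint16_t phnum)
{
    return (uint32_t) phnum << 5;
}
```

The compiler folded ph++ into the argument setup: it advances ebx by 0x20 before pushing the last argument, which is why p_pa appears with a negative displacement.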
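Appendix 2. ELF in detail
- ELF executable
  Can be viewed as a header that carries loading information, followed by several program sections. Each program section is a contiguous chunk of code or data that must be loaded at a specific memory address. The boot loader does not modify the code or data; it only loads them into memory and starts executing.
- ELF binary
  Starts with a fixed-length ELF header, followed by a variable-length program header that lists each of the program sections to be loaded.
- program section
  Only three sections matter here.
  - .text: the program's executable instructions.
  - .rodata: read-only data, such as the ASCII string constants produced by the C compiler.
  - .data: the program's initialized data, such as a global variable declared with an initializer like int x = 5;.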
~/OS/lab/obj/kern$ objdump -h kernel
kernel:     file format elf32-i386
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001871  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000714  f0101880  00101880  00002880  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         000038d1  f0101f94  00101f94  00002f94  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      000018bb  f0105865  00105865  00006865  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data         0000a300  f0108000  00108000  00009000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
  5 .bss          00000644  f0112300  00112300  00013300  2**5
                  ALLOC
  6 .comment      00000034  00000000  00000000  00013300  2**0
                  CONTENTS, READONLY
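The columns to focus on are the VMA (link address) and LMA (load address) of .text. The link address is the memory address from which the section expects to execute, and the load address is, as the name suggests, the memory address at which the section is loaded. In most programs the two are the same; in the kernel shown here they differ (VMA 0xf0100000, LMA 0x00100000).
The boot loader uses the ELF program headers to decide how to load the sections: each program header specifies which part of the ELF object to read into memory and where to place it.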
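Using the .text row of the objdump output above, the relationship between link and load addresses can be written as simple arithmetic. This is a sketch only: struct SectionAddr and link_to_load are hypothetical names, and the calculation assumes the address falls in a region that shares .text's VMA-to-LMA displacement, as the loaded sections in the listing do.

```c
#include <stdint.h>

/* Hypothetical pair of addresses taken from one row of "objdump -h". */
struct SectionAddr {
    uint32_t vma; /* link address */
    uint32_t lma; /* load address */
};

/* Translate an address the code was linked to run at into the address
 * where the boot loader actually placed that byte. */
uint32_t link_to_load(uint32_t va, struct SectionAddr s)
{
    return va - s.vma + s.lma;
}
```

With .text at VMA 0xf0100000 and LMA 0x00100000, link_to_load(0xf0101880, text) gives 0x00101880, which matches the LMA listed for .rodata.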