AT&T Assembly Syntax
vivek, Mon, 2003-09-01 23:53
Updated: May/10 '06
Translated by harry, Thursday, 2018-06-28
This article is a 'quick-n-dirty' introduction to the AT&T assembly language syntax, as implemented in the GNU Assembler as(1). To a first-timer the AT&T syntax may seem a bit confusing, but if you have any kind of assembly language programming background, it's easy to catch up once you have a few rules in mind. I assume you have some familiarity with what is commonly referred to as the INTEL-syntax for assembly language instructions, as described in the x86 manuals. Due to its simplicity, I use the NASM (Netwide Assembler) variant of the INTEL-syntax to cite differences between the formats.
The GNU assembler is a part of the GNU Binary Utilities (binutils), and a back-end to the GNU Compiler Collection. Although as is not the preferred assembler for writing reasonably big assembler programs, it's a vital part of contemporary Unix-like systems, especially for kernel-level hacking. It is often criticised for its cryptic AT&T-style syntax; the usual argument is that as was written with an emphasis on being used as a back-end to GCC, with little concern for "developer-friendliness". If you are an assembler programmer hailing from an INTEL-syntax background, you'll find it somewhat stifling with regard to code-readability and code-generation. Nevertheless, it must be stated that many operating systems' code bases depend on 'as' as the assembler for generating low-level code.
The Basic Format
The structure of a program in AT&T-syntax is similar to any other assembler-syntax, consisting of a series of directives, labels, and instructions, each instruction composed of a mnemonic followed by at most three operands. The most prominent difference in the AT&T-syntax stems from the ordering of the operands.
For example, the general format of a basic data movement instruction in INTEL-syntax is,
mnemonic destination, source
whereas, in the case of AT&T, the general format is
mnemonic source, destination
To some (including myself), this format is more intuitive. The following sections describe the types of operands to AT&T assembler instructions for the x86 architecture.
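As a rough illustration of the operand-order rule, the C sketch below rewrites a two-operand line from INTEL order into AT&T order. The function name swap_operands is hypothetical, and the sketch assumes the simplest possible input: one mnemonic, one space, and two operands separated by a single comma. It only reorders the operands; it does not add the register and literal prefixes described in the following sections, so its output is not yet valid AT&T code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn "mnemonic dst, src" (INTEL order) into
 * "mnemonic src, dst" (AT&T order). Operands are copied unchanged.
 * Returns 0 on success, -1 if the line has no space or comma, or if
 * the output buffer is too small. */
int swap_operands(const char *intel, char *out, size_t out_size)
{
    assert(intel != NULL && out != NULL);

    const char *space = strchr(intel, ' ');
    if (space == NULL)
        return -1;
    const char *comma = strchr(space, ',');
    if (comma == NULL)
        return -1;

    const char *dst = space + 1;
    while (*dst == ' ')
        dst++;                        /* skip blanks before the first operand */
    const char *src = comma + 1;
    while (*src == ' ')
        src++;                        /* skip blanks after the comma */

    int mnemonic_len = (int)(space - intel);
    int dst_len = (int)(comma - dst);

    int n = snprintf(out, out_size, "%.*s %s, %.*s",
                     mnemonic_len, intel, src, dst_len, dst);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling swap_operands("mov bx, ax", buf, sizeof buf) leaves "mov ax, bx" in buf; a line without a comma, such as "ret", is rejected with -1.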
Registers
All register names of the IA-32 architecture must be prefixed by a '%' sign, e.g. %al, %bx, %ds, %cr0 etc.
mov %ax, %bx
The above example is the mov instruction that moves the value from the 16-bit register AX to 16-bit register BX.
Literal Values
All literal values must be prefixed by a '$' sign. For example,
mov $100, %bx
mov $'A', %al
The first instruction moves the value 100 into the register BX and the second one moves the numerical value of the ASCII character A into the AL register. The character is written in quotes because a bare $A would be read as the address of a symbol named A. To make things clearer, note that the below example is not a valid instruction,
mov %bx, $100
as it just tries to move the value in register bx to a literal value. It just doesn't make any sense.
Memory Addressing
In the AT&T Syntax, memory is referenced in the following way,
segment-override:signed-offset(base,index,scale)
parts of which can be omitted depending on the address you want. For example,
%es:100(%eax,%ebx,2)
Please note that the offsets and the scale should not be prefixed by '$'. A few more examples with their equivalent NASM-syntax should make things clearer.
GAS memory operand | NASM memory operand |
---|---|
100 | [100] |
%es:100 | [es:100] |
(%eax) | [eax] |
(%eax,%ebx) | [eax+ebx] |
(%ecx,%ebx,2) | [ecx+ebx*2] |
(,%ebx,2) | [ebx*2] |
-10(%eax) | [eax-10] |
%ds:-10(%ebp) | [ds:ebp-10] |
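The rows above all follow from the single pattern segment-override:signed-offset(base,index,scale) with parts left out. The C sketch below builds the GAS operand text from its parts to make the omission rules explicit. The struct mem_operand and the functions format_att, append and is_set are hypothetical names invented for this illustration; they are not part of any assembler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one memory operand. A NULL or empty
 * string means that part is omitted. */
struct mem_operand {
    const char *segment;   /* segment override, e.g. "es" */
    int         has_disp;  /* non-zero if a signed offset is present */
    long        disp;      /* the signed offset */
    const char *base;      /* base register, e.g. "eax" */
    const char *index;     /* index register, e.g. "ebx" */
    int         scale;     /* 1, 2, 4 or 8; printed only when not 1 */
};

static int is_set(const char *s)
{
    return s != NULL && strlen(s) > 0;
}

/* Append formatted text at out + *used; returns -1 if it does not fit. */
static int append(char *out, size_t size, size_t *used, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out + *used, size - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/* Write the GAS spelling of m into out. Registers get a '%' prefix;
 * the offset and the scale get no '$'. Returns 0 on success. */
int format_att(const struct mem_operand *m, char *out, size_t size)
{
    size_t used = 0;
    assert(size > 0);
    assert(m->scale == 1 || m->scale == 2 || m->scale == 4 || m->scale == 8);
    out[0] = '\0';

    if (is_set(m->segment) && append(out, size, &used, "%%%s:", m->segment) != 0)
        return -1;
    if (m->has_disp && append(out, size, &used, "%ld", m->disp) != 0)
        return -1;
    if (is_set(m->base) || is_set(m->index)) {
        if (append(out, size, &used, "(") != 0)
            return -1;
        if (is_set(m->base) && append(out, size, &used, "%%%s", m->base) != 0)
            return -1;
        if (is_set(m->index)) {
            if (append(out, size, &used, ",%%%s", m->index) != 0)
                return -1;
            if (m->scale != 1 && append(out, size, &used, ",%d", m->scale) != 0)
                return -1;
        }
        if (append(out, size, &used, ")") != 0)
            return -1;
    }
    return 0;
}
```

With only an index register and a scale of 2, the sketch keeps the leading comma and yields (,%ebx,2), matching the table row for [ebx*2].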
Example instructions,
mov %ax, 100
mov %eax, -100(%eax)
The first instruction moves the value in register AX to offset 100 in the data segment (the default segment), and the second one moves the value in the EAX register to [eax-100].
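To make the meaning of signed-offset(base,index,scale) concrete, here is a minimal C sketch of the offset such an operand selects: base + index * scale + offset. The function name effective_offset is hypothetical. It models 32-bit addressing only, treats an omitted base or index as 0 and an omitted scale as 1, and leaves out the segment, whose base the processor adds afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Offset selected by disp(base,index,scale) in 32-bit addressing.
 * The sum wraps modulo 2^32, which unsigned arithmetic gives us. */
uint32_t effective_offset(int32_t disp, uint32_t base,
                          uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}
```

For instance, with EAX holding 0x2000 the operand -100(%eax) corresponds to effective_offset(-100, 0x2000, 0, 1), which is 0x2000 minus 100.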
Operand Sizes
At times, especially when moving literal values to memory, it becomes necessary to specify the size-of-transfer or the operand-size. For example the instruction,
mov $10, 100
only specifies that the value 10 is to be moved to the memory offset 100, but not the transfer size. In NASM this is done by adding the casting keyword byte/word/dword etc. to any of the operands. In AT&T syntax, this is done by adding a suffix (b, w, or l) to the instruction. For example,
movb $10, %es:(%eax)
moves a byte value 10 to the memory location [es:eax], whereas,
movl $10, %es:(%eax)
moves a long value (dword) 10 to the same place.
A few more examples,
movl $100, %ebx
pushl %eax
popw %ax
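The effect of the suffix can be modelled in a few lines of C. This is only a sketch under stated assumptions: memory is a plain byte array, and suffix_size and store_immediate are hypothetical names. It shows why the size matters when a literal goes to memory: the same value 10 overwrites one, two or four bytes depending on the suffix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes transferred for each size suffix covered in the text. */
size_t suffix_size(char suffix)
{
    switch (suffix) {
    case 'b': return 1;   /* byte */
    case 'w': return 2;   /* word */
    case 'l': return 4;   /* long (dword) */
    default:  return 0;   /* not a suffix handled by this sketch */
    }
}

/* Model "mov<suffix> $value, offset": store the low bytes of value,
 * least significant byte first, as x86 does. Returns bytes written. */
size_t store_immediate(uint8_t *mem, size_t offset, char suffix, uint32_t value)
{
    size_t n = suffix_size(suffix);
    assert(n != 0);
    for (size_t i = 0; i < n; i++)
        mem[offset + i] = (uint8_t)(value >> (8 * i));
    return n;
}
```

In this model, store_immediate(mem, 0, 'b', 10) changes only mem[0], while the 'l' form also clears the three bytes that follow.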
Control Transfer Instructions
The jmp, call, ret, etc., instructions transfer control from one part of a program to another. They can be classified as control transfers to the same code segment (near) or to different code segments (far). The possible types of branch addressing are: relative offset (label), register, memory operand, and segment-offset pointers.
Relative offsets are specified using labels, as shown below.
label1:
.
.
jmp label1
Branch addressing using registers or memory operands must be prefixed by a '*'. To specify a "far" control transfer, an 'l' must be prefixed to the mnemonic, as in 'ljmp', 'lcall', etc. For example,
GAS syntax | NASM syntax |
---|---|
jmp *100 | jmp near [100] |
call *100 | call near [100] |
jmp *%eax | jmp near eax |
call *%ecx | call near ecx |
jmp *(%eax) | jmp near [eax] |
call *(%ebx) | call near [ebx] |
ljmp *100 | jmp far [100] |
lcall *100 | call far [100] |
ljmp *(%eax) | jmp far [eax] |
lcall *(%ebx) | call far [ebx] |
ret | retn |
lret | retf |
lret $0x100 | retf 0x100 |
Segment-offset pointers are specified using the following format:
jmp $segment, $offset
For example:
jmp $0x10, $0x100000
If you keep these few things in mind, you'll catch up real soon. As for more details on the GNU assembler, you could try the documentation.