.so mnx.mac
.TH AS 9
.\" unchecked (kjb)
.CD "as \(en assembler"
.SE "AS\(emASSEMBLER [IBM]"
.SP 1
.PP
This document describes the language accepted by the 80386 assembler
that is part of the Amsterdam Compiler Kit. Note that only the syntax is
described; only a few 386 instructions are shown as examples.
.SS "Tokens, Numbers, Character Constants, and Strings"
.PP
The syntax of numbers is the same as in C.
The constants 32, 040, and 0x20 all represent the same number, but are
written in decimal, octal, and hex, respectively.
The rules for character constants and strings are also the same as in C.
For example, \(fma\(fm is a character constant.
A typical string is "string".
Expressions may be formed with C operators, but must use [ and ] for
parentheses. (Normal parentheses are claimed by the operand syntax.)
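.PP
As an illustration only, the following sketch applies these rules.
The symbol name BUFSIZ is made up for the example and is not part of the
assembler.
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
.nf
	BUFSIZ = 4 * 0x20	! 128, same as 4 * 32 or 4 * 040
	mov eax, [BUFSIZ+2]/2	! [ ] group the sum: 130 / 2 = 65
	movb al, \(fma\(fm	! character constant, as in C
.fi
.HS
Writing (BUFSIZ+2)/2 instead would not group the sum, because normal
parentheses denote a memory operand.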
.SS "Symbols"
.PP
Symbols contain letters and digits, as well as three special characters:
dot, tilde, and underscore.
The first character may not be a digit or tilde.
.PP
The names of the 80386 registers are reserved. These are:
.HS
~~~al, bl, cl, dl
.br
~~~ah, bh, ch, dh
.br
~~~ax, bx, cx, dx, eax, ebx, ecx, edx
.br
~~~si, di, bp, sp, esi, edi, ebp, esp
.br
~~~cs, ds, ss, es, fs, gs
.HS
The xx and exx variants of the eight general registers are treated as
synonyms by the assembler. Normally "ax" is the 16-bit low half of the
32-bit "eax" register. The assembler determines whether a 16-bit or 32-bit
operation is meant solely by looking at the instruction or the
instruction prefixes. It is nevertheless best to use the proper register
names when writing assembly, so as not to confuse those who read the code.
.HS
The last group of 6 segment registers is used for selector + offset mode
addressing, in which the effective address is at a given offset in one of
the 6 segments.
.PP
Names of instructions and pseudo-ops are not reserved.
Alphabetic characters in opcodes and pseudo-ops must be in lower case.
.SS "Separators"
.PP
Commas, blanks, and tabs are separators and can be interspersed freely
between tokens, but not within tokens.
Commas are only legal between operands.
.SS "Comments"
.PP
The comment character is \*(OQ!\*(CQ.
The rest of the line is ignored.
.SS "Opcodes"
.PP
The opcodes are listed below.
Notes: (1) Different names for the same instruction are separated by \*(OQ/\*(CQ.
(2) Square brackets ([]) indicate that 0 or 1 of the enclosed characters
can be included.
(3) Curly brackets ({}) work similarly, except that one of the
enclosed characters \fImust\fR be included.
Thus square brackets indicate an option, whereas curly brackets indicate
that a choice must be made.
.sp
.if t .ta 0.25i 1.2i 3i
.if n .ta 2 10 24
.nf
.B "Data Transfer"
.HS
	mov[b]	dest, source	! Move word/byte from source to dest
	pop	dest	! Pop stack
	push	source	! Push stack
	xchg[b]	op1, op2	! Exchange word/byte
	xlat		! Translate
	o16		! Operate on a 16 bit object instead of 32 bit

.B "Input/Output"
.HS
	in[b]	source	! Input from source I/O port
	in[b]		! Input from DX I/O port
	out[b]	dest	! Output to dest I/O port
	out[b]		! Output to DX I/O port

.B "Address Object"
.HS
	lds	reg,source	! Load reg and DS from source
	les	reg,source	! Load reg and ES from source
	lea	reg,source	! Load effective address of source to reg
	{cdsefg}seg		! Specify seg register for next instruction
	a16		! Use 16 bit addressing mode instead of 32 bit

.B "Flag Transfer"
.HS
	lahf		! Load AH from flag register
	popf		! Pop flags
	pushf		! Push flags
	sahf		! Store AH in flag register

.B "Addition"
.HS
	aaa		! Adjust result of BCD addition
	add[b]	dest,source	! Add
	adc[b]	dest,source	! Add with carry
	daa		! Decimal Adjust after addition
	inc[b]	dest	! Increment by 1

.B "Subtraction"
.HS
	aas		! Adjust result of BCD subtraction
	sub[b]	dest,source	! Subtract
	sbb[b]	dest,source	! Subtract with borrow from dest
	das		! Decimal adjust after subtraction
	dec[b]	dest	! Decrement by one
	neg[b]	dest	! Negate
	cmp[b]	dest,source	! Compare

.B "Multiplication"
.HS
	aam		! Adjust result of BCD multiply
	imul[b]	source	! Signed multiply
	mul[b]	source	! Unsigned multiply

.B "Division"
.HS
	aad		! Adjust AX for BCD division
	o16 cbw		! Sign extend AL into AH
	o16 cwd		! Sign extend AX into DX
	cwde		! Sign extend AX into EAX
	cdq		! Sign extend EAX into EDX
	idiv[b]	source	! Signed divide
	div[b]	source	! Unsigned divide

.B "Logical"
.HS
	and[b]	dest,source	! Logical and
	not[b]	dest	! Logical not
	or[b]	dest,source	! Logical inclusive or
	test[b]	dest,source	! Logical test
	xor[b]	dest,source	! Logical exclusive or

.B "Shift"
.HS
	sal[b]/shl[b] dest,CL	! Shift logical left
	sar[b]	dest,CL	! Shift arithmetic right
	shr[b]	dest,CL	! Shift logical right

.B "Rotate"
.HS
	rcl[b]	dest,CL	! Rotate left, with carry
	rcr[b]	dest,CL	! Rotate right, with carry
	rol[b]	dest,CL	! Rotate left
	ror[b]	dest,CL	! Rotate right

.B "String Manipulation"
.HS
	cmps[b]		! Compare string element ds:esi with es:edi
	lods[b]		! Load from ds:esi into AL, AX, or EAX
	movs[b]		! Move from ds:esi to es:edi
	rep		! Repeat next instruction until ECX=0
	repe/repz		! Repeat next instruction while ECX!=0 and ZF=1
	repne/repnz		! Repeat next instruction while ECX!=0 and ZF=0
	scas[b]		! Compare es:edi with AL/AX/EAX
	stos[b]		! Store AL/AX/EAX in es:edi

.fi
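.PP
A minimal sketch of how the repeat prefix combines with a string
instruction. It assumes two made-up labels, src and dst, that name buffers
of at least 64 bytes, and it assumes ds and es already select the same data.
.HS
.if t .ta 0.25i 1.2i 3i
.if n .ta 2 10 24
.nf
	cld		! clear direction flag: addresses count upwards
	mov	esi, src	! source offset, used as ds:esi
	mov	edi, dst	! destination offset, used as es:edi
	mov	ecx, 64	! number of bytes to copy
	rep		! repeat the next instruction until ECX=0
	movsb		! move one byte from ds:esi to es:edi
.fi
.HS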
.B "Control Transfer"
.PP
\fIAs\fR accepts a number of special jump opcodes that can assemble to
either an instruction with a byte displacement, which can only reach targets
within \(mi126 to +129 bytes of the branch, or an instruction with a 32-bit
displacement. The assembler automatically chooses the byte or 32-bit
displacement form.
.PP
The English translation of the opcodes should be obvious, with
\*(OQl(ess)\*(CQ and \*(OQg(reater)\*(CQ for signed comparisons, and
\*(OQb(elow)\*(CQ and \*(OQa(bove)\*(CQ for unsigned comparisons. There are
lots of synonyms to allow you to write "jump if not that" instead of "jump
if this".
.PP
The \*(OQcall\*(CQ, \*(OQjmp\*(CQ, and \*(OQret\*(CQ instructions can be
either intrasegment or
intersegment. The intersegment versions are indicated with
the suffix \*(OQf\*(CQ.

.if t .ta 0.25i 1.2i 3i
.if n .ta 2 10 24
.nf
.B Unconditional
.HS
	jmp[f]	dest	! jump to dest (8 or 32-bit displacement)
	call[f]	dest	! call procedure
	ret[f]		! return from procedure

.B "Conditional"
.HS
	ja/jnbe		! if above/not below or equal (unsigned)
	jae/jnb/jnc		! if above or equal/not below/not carry (uns.)
	jb/jnae/jc		! if below/not above nor equal/carry (unsigned)
	jbe/jna		! if below or equal/not above (unsigned)
	jg/jnle		! if greater/not less nor equal (signed)
	jge/jnl		! if greater or equal/not less (signed)
	jl/jnge		! if less/not greater nor equal (signed)
	jle/jng		! if less or equal/not greater (signed)
	je/jz		! if equal/zero
	jne/jnz		! if not equal/not zero
	jno		! if overflow not set
	jo		! if overflow set
	jnp/jpo		! if parity not set/parity odd
	jp/jpe		! if parity set/parity even
	jns		! if sign not set
	js		! if sign set

.B "Iteration Control"
.HS
	jcxz	dest	! jump if ECX = 0
	loop	dest	! Decrement ECX and jump if ECX != 0
	loope/loopz dest	! Decrement ECX and jump if ECX != 0 and ZF = 1
	loopne/loopnz dest	! Decrement ECX and jump if ECX != 0 and ZF = 0

.B "Interrupt"
.HS
	int	n	! Software interrupt n
	into		! Interrupt if overflow set
	iretd		! Return from interrupt

.B "Flag Operations"
.HS
	clc		! Clear carry flag
	cld		! Clear direction flag
	cli		! Clear interrupt enable flag
	cmc		! Complement carry flag
	stc		! Set carry flag
	std		! Set direction flag
	sti		! Set interrupt enable flag

.fi
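.PP
The difference between the signed and unsigned conditional jumps can be
sketched as follows. The numeric labels 1 and 2 are placeholders for
whatever code handles each case.
.HS
.if t .ta 0.25i 1.2i 3i
.if n .ta 2 10 24
.nf
	cmp	eax, ebx	! compare eax with ebx
	jl	1f	! taken if eax < ebx, both read as signed
	jb	2f	! taken if eax < ebx, both read as unsigned
.fi
.HS
With eax holding 0xFFFFFFFF and ebx holding 1, the jl is taken because
the signed reading is \(mi1 < 1. A jb on its own would not be taken,
because the unsigned reading is 4294967295 > 1.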
.SS "Location Counter"
.PP
The special symbol \*(OQ.\*(CQ is the location counter. Its value
is the address of the first byte of the instruction in which the symbol
appears, and it can be used in expressions.
.SS "Segments"
.PP
There are four different assembly segments: text, rom, data and bss.
Segments are declared and selected by the \fI.sect\fR pseudo-op. It is
customary to declare all segments at the top of an assembly file like
this:
.HS
~~~.sect .text; .sect .rom; .sect .data; .sect .bss
.HS
The assembler accepts up to 16 different segments, but
.MX
expects only four to be used. Anything can in principle be assembled
into any segment, but the
.MX
bss segment may only contain uninitialized data.
Note that the \*(OQ.\*(CQ symbol refers to the location in the current
segment.
.SS "Labels"
.PP
There are two types: name and numeric. Name labels consist of a name
followed by a colon (:).
.PP
The numeric labels are single digits. The nearest 0: label may be
referenced as 0f in the forward direction, or 0b backwards.
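.PP
As a sketch of how numeric labels are typically used, the fragment below
counts eax down to zero. The choice of register and of the digits 0 and 1
is arbitrary.
.HS
.if t .ta 0.25i 1.2i 3i
.if n .ta 2 10 24
.nf
	cmp	eax, 0	! nothing to do if eax is already zero
	je	1f	! forward to the nearest 1: below
0:
	dec	eax	! decrement by one
	jne	0b	! back to the nearest 0: above
1:
	ret
.fi
.HS
Because only the nearest label in each direction is matched, the same
digits can be reused freely elsewhere in the file.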
.SS "Statement Syntax"
.PP
Each line consists of a single statement.
Blank or comment lines are allowed.
.SS "Instruction Statements"
.PP
The most general form of an instruction is
.HS
~~~label: opcode operand1, operand2 ! comment
.HS
.SS "Expression Semantics"
.PP
.tr ~~
The following operators can be used:
+ \(mi * / & | ^ ~ << (shift left) >> (shift right) \(mi (unary minus).
.tr ~
32-bit integer arithmetic is used.
Division produces a truncated quotient.
.SS "Addressing Modes"
.PP
Below is a list of the addressing modes supported.
Each one is followed by an example.
.HS
.ta 0.25i 3i
.nf
	constant	mov eax, 123456
	direct access	mov eax, (counter)
	register	mov eax, esi
	indirect	mov eax, (esi)
	base + disp.	mov eax, 6(ebp)
	scaled index	mov eax, (4*esi)
	base + index	mov eax, (ebp)(2*esi)
	base + index + disp.	mov eax, 10(edi)(1*esi)
.HS
.fi
Any of the constants or symbols may be replaced by expressions. Direct
access, constants and displacements may be any type of expression. A scaled
index with scale 1 may be written without the \*(OQ1*\*(CQ.
.SS "Call and Jmp"
.PP
The \*(OQcall\*(CQ and \*(OQjmp\*(CQ instructions can be interpreted
as a load into the instruction pointer.
.HS
.ta 0.25i 3i
.nf
	call _routine	! Direct, intrasegment
	call (subloc)	! Indirect, intrasegment
	call 6(ebp)	! Indirect, intrasegment
	call ebx	! Direct, intrasegment
	call (ebx)	! Indirect, intrasegment
	callf (subloc)	! Indirect, intersegment
	callf seg:offs	! Direct, intersegment
.HS
.fi
.SP 1
.SS "Symbol Assignment"
.SP 1
.PP
Symbols can acquire values in one of two ways.
Using a symbol as a label sets it to \*(OQ.\*(CQ for the current
segment with type relocatable.
Alternatively, a symbol may be given a value via an assignment of the form
.HS
~~~symbol = expression
.HS
in which the symbol is assigned the value and type of the expression.
.SP 1
.SS "Storage Allocation"
.SP 1
.PP
Space can be reserved for bytes, words, and longs using pseudo-ops.
They take one or more operands, and for each generate a value
whose size is a byte, word (2 bytes) or long (4 bytes). For example:
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
	.data1 2, 6	! allocate 2 bytes initialized to 2 and 6
.br
	.data2 3, 0x10	! allocate 2 words initialized to 3 and 16
.br
	.data4 010	! allocate a longword initialized to 8
.br
	.space 40	! allocate 40 bytes of zeros
.HS
allocates 50 (decimal) bytes of storage, initializing the first two
bytes to 2 and 6, the next two words to 3 and 16, then one longword with
value 8 (010 octal), and finally 40 bytes of zeros.
.SS "String Allocation"
.PP
The pseudo-ops \fI.ascii\fR and \fI.asciz\fR
take one string argument and generate the ASCII character
codes for the letters in the string.
The latter automatically terminates the string with a null (0) byte.
For example,
.HS
~~~.ascii "hello"
.br
~~~.asciz "world\en"
.HS
.SS "Alignment"
.PP
Sometimes it is necessary to force the next item to begin at a word, longword
or even a 16 byte address boundary.
The \fI.align\fR pseudo-op inserts zero or more null bytes so that the
current location becomes a multiple of its argument.
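.PP
A small sketch of the effect, assuming the fragment starts at a location
that is already a multiple of 4:
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
.nf
	.data1 1	! one byte, so the location is now odd
	.align 4	! pad with null bytes up to a multiple of 4
	.data4 0x12345678	! starts on a longword boundary
.fi
.HS
Under that assumption the \fI.align\fR line contributes three null bytes.
Had the location already been a multiple of 4, it would contribute none.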
.SS "Segment Control"
.PP
Every item assembled goes in one of the four segments: text, rom, data,
or bss. By using the \fI.sect\fR pseudo-op with argument
\fI.text, .rom, .data\fR or \fI.bss\fR, the programmer can force the
next items to go in a particular segment.
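.PP
The following sketch switches between segments within one file.
The labels counter and buffer are made up for the example, and the use of
\fI.space\fR to reserve room in bss is an assumption, on the grounds that
zero-filled storage needs no initial contents.
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
.nf
	.sect .text; .sect .rom; .sect .data; .sect .bss
	.sect .data
counter:
	.data4 0	! initialized data
	.sect .bss
buffer:
	.space 512	! room reserved without contents
	.sect .text
	mov eax, (counter)	! instructions go in text
.fi
.HS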
.SS "External Names"
.PP
A symbol can be given global scope by including it in a \fI.define\fR pseudo-op.
Multiple names may be listed, separated by commas.
It must be used to export symbols defined in the current program.
Names not defined in the current program are treated as "undefined
external" automatically, although it is customary to make this explicit
with the \fI.extern\fR pseudo-op.
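.PP
As a sketch, a module that exports one routine and refers to one name
defined elsewhere might begin as follows. Both names, _next and _limit,
are hypothetical.
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
.nf
	.define _next	! make _next visible to other modules
	.extern _limit	! defined in some other module
	.sect .text
_next:
	mov eax, (_limit)	! direct access to the external
	inc eax
	ret
.fi
.HS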
.SS "Common"
.PP
The \fI.comm\fR pseudo-op declares storage that can be common to more than
one module. There are two arguments: a name and an absolute expression giving
the size in bytes of the area named by the symbol.
The type of the symbol becomes
external. The statement can appear in any segment.
If you think this has something to do with FORTRAN, you are right.
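.PP
For instance, assuming the two arguments are separated by a comma like
other operands, a 1024-byte area under the made-up name _buffer could be
declared as:
.HS
.if t .ta 0.25i 3i
.if n .ta 2 24
.nf
	.comm _buffer, 1024	! 1024 bytes shared under one name
.fi
.HS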
.SS "Examples"
.PP
In the kernel directory, there are several assembly code files that are
worth inspecting as examples.
However, note that these files are designed to first be
run through the C preprocessor. (The very first character is a # to signal
this.) Thus they contain numerous constructs
that are not pure assembler.
For true assembler examples, compile any C program provided with
.MX
using the \fB\(enS\fR flag.
This produces an assembly language file with the same
name as the C source file, but ending with the .s suffix.