| 1 | .so mnx.mac
|
---|
| 2 | .TH AS 9
|
---|
| 3 | .\" unchecked (kjb)
|
---|
| 4 | .CD "as \(en assembler"
|
---|
| 5 | .SE "AS\(emASSEMBLER [IBM]"
|
---|
| 6 | .SP 1
|
---|
| 7 | .PP
|
---|
| 8 | This document describes the language accepted by the 80386 assembler
|
---|
| 9 | that is part of the Amsterdam Compiler Kit. Note that only the syntax is
|
---|
| 10 | described; only a few 386 instructions are shown as examples.
|
---|
| 11 | .SS "Tokens, Numbers, Character Constants, and Strings"
|
---|
| 12 | .PP
|
---|
| 13 | The syntax of numbers is the same as in C.
|
---|
| 14 | The constants 32, 040, and 0x20 all represent the same number, but are
|
---|
| 15 | written in decimal, octal, and hex, respectively.
|
---|
| 16 | The rules for character constants and strings are also the same as in C.
|
---|
| 17 | For example, \(fma\(fm is a character constant.
|
---|
| 18 | A typical string is "string".
|
---|
| 19 | Expressions may be formed with C operators, but must use [ and ] for
|
---|
| 20 | parentheses. (Normal parentheses are claimed by the operand syntax.)
|
---|
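As a small illustration of the rules above, the following sketch uses only constructs described in this document. The choice of registers and values is arbitrary.

```asm
	mov	eax, 32		! decimal constant
	mov	eax, 040	! the same value written in octal
	mov	eax, 0x20	! the same value written in hex
	movb	al, 'a'		! character constant, as in C
	mov	eax, [2+3]*4	! [ and ] group the sub-expression; the constant is 20
```

Writing (2+3)*4 instead would clash with the operand syntax, in which round parentheses denote a memory reference.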
| 21 | .SS "Symbols"
|
---|
| 22 | .PP
|
---|
| 23 | Symbols contain letters and digits, as well as three special characters:
|
---|
| 24 | dot, tilde, and underscore.
|
---|
| 25 | The first character may not be a digit or tilde.
|
---|
| 26 | .PP
|
---|
| 27 | The names of the 80386 registers are reserved. These are:
|
---|
| 28 | .HS
|
---|
| 29 | ~~~al, bl, cl, dl
|
---|
| 30 | .br
|
---|
| 31 | ~~~ah, bh, ch, dh
|
---|
| 32 | .br
|
---|
| 33 | ~~~ax, bx, cx, dx, eax, ebx, ecx, edx
|
---|
| 34 | .br
|
---|
| 35 | ~~~si, di, bp, sp, esi, edi, ebp, esp
|
---|
| 36 | .br
|
---|
| 37 | ~~~cs, ds, ss, es, fs, gs
|
---|
| 38 | .HS
|
---|
| 39 | The xx and exx variants of the eight general registers are treated as
|
---|
| 40 | synonyms by the assembler. Normally "ax" is the 16-bit low half of the
|
---|
| 41 | 32-bit "eax" register. The assembler determines if a 16 or 32 bit
|
---|
| 42 | operation is meant solely by looking at the instruction or the
|
---|
| 43 | instruction prefixes. It is however best to use the proper registers
|
---|
| 44 | when writing assembly to not confuse those who read the code.
|
---|
| 45 | .HS
|
---|
| 46 | The last group of 6 segment registers is used for selector + offset mode
|
---|
| 47 | addressing, in which the effective address is at a given offset in one of
|
---|
| 48 | the 6 segments.
|
---|
| 49 | .PP
|
---|
| 50 | Names of instructions and pseudo-ops are not reserved.
|
---|
| 51 | Alphabetic characters in opcodes and pseudo-ops must be in lower case.
|
---|
| 52 | .SS "Separators"
|
---|
| 53 | .PP
|
---|
| 54 | Commas, blanks, and tabs are separators and can be interspersed freely
|
---|
| 55 | between tokens, but not within tokens.
|
---|
| 56 | Commas are only legal between operands.
|
---|
| 57 | .SS "Comments"
|
---|
| 58 | .PP
|
---|
| 59 | The comment character is \*(OQ!\*(CQ.
|
---|
| 60 | The rest of the line is ignored.
|
---|
| 61 | .SS "Opcodes"
|
---|
| 62 | .PP
|
---|
| 63 | The opcodes are listed below.
|
---|
| 64 | Notes: (1) Different names for the same instruction are separated by \*(OQ/\*(CQ.
|
---|
| 65 | (2) Square brackets ([]) indicate that 0 or 1 of the enclosed characters
|
---|
| 66 | can be included.
|
---|
| 67 | (3) Curly brackets ({}) work similarly, except that one of the
|
---|
| 68 | enclosed characters \fImust\fR be included.
|
---|
| 69 | Thus square brackets indicate an option, whereas curly brackets indicate
|
---|
| 70 | that a choice must be made.
|
---|
| 71 | .sp
|
---|
| 72 | .if t .ta 0.25i 1.2i 3i
|
---|
| 73 | .if n .ta 2 10 24
|
---|
| 74 | .nf
|
---|
| 75 | .B "Data Transfer"
|
---|
| 76 | .HS
|
---|
| 77 | mov[b] dest, source ! Move word/byte from source to dest
|
---|
| 78 | pop dest ! Pop stack
|
---|
| 79 | push source ! Push stack
|
---|
| 80 | xchg[b] op1, op2 ! Exchange word/byte
|
---|
| 81 | xlat ! Translate
|
---|
| 82 | o16 ! Operate on a 16 bit object instead of 32 bit
|
---|
| 83 |
|
---|
| 84 | .B "Input/Output"
|
---|
| 85 | .HS
|
---|
| 86 | in[b] source ! Input from source I/O port
|
---|
| 87 | in[b] ! Input from DX I/O port
|
---|
| 88 | out[b] dest ! Output to dest I/O port
|
---|
| 89 | out[b] ! Output to DX I/O port
|
---|
| 90 |
|
---|
| 91 | .B "Address Object"
|
---|
| 92 | .HS
|
---|
| 93 | lds reg,source ! Load reg and DS from source
|
---|
| 94 | les reg,source ! Load reg and ES from source
|
---|
| 95 | lea reg,source ! Load effective address of source into reg
|
---|
| 96 | {cdsefg}seg ! Specify seg register for next instruction
|
---|
| 97 | a16 ! Use 16 bit addressing mode instead of 32 bit
|
---|
| 98 |
|
---|
| 99 | .B "Flag Transfer"
|
---|
| 100 | .HS
|
---|
| 101 | lahf ! Load AH from flag register
|
---|
| 102 | popf ! Pop flags
|
---|
| 103 | pushf ! Push flags
|
---|
| 104 | sahf ! Store AH in flag register
|
---|
| 105 |
|
---|
| 106 | .B "Addition"
|
---|
| 107 | .HS
|
---|
| 108 | aaa ! Adjust result of BCD addition
|
---|
| 109 | add[b] dest,source ! Add
|
---|
| 110 | adc[b] dest,source ! Add with carry
|
---|
| 111 | daa ! Decimal Adjust after addition
|
---|
| 112 | inc[b] dest ! Increment by 1
|
---|
| 113 |
|
---|
| 114 | .B "Subtraction"
|
---|
| 115 | .HS
|
---|
| 116 | aas ! Adjust result of BCD subtraction
|
---|
| 117 | sub[b] dest,source ! Subtract
|
---|
| 118 | sbb[b] dest,source ! Subtract with borrow from dest
|
---|
| 119 | das ! Decimal adjust after subtraction
|
---|
| 120 | dec[b] dest ! Decrement by one
|
---|
| 121 | neg[b] dest ! Negate
|
---|
| 122 | cmp[b] dest,source ! Compare
|
---|
| 123 |
|
---|
| 124 | .B "Multiplication"
|
---|
| 125 | .HS
|
---|
| 126 | aam ! Adjust result of BCD multiply
|
---|
| 127 | imul[b] source ! Signed multiply
|
---|
| 128 | mul[b] source ! Unsigned multiply
|
---|
| 129 |
|
---|
| 130 | .B "Division"
|
---|
| 131 | .HS
|
---|
| 132 | aad ! Adjust AX for BCD division
|
---|
| 133 | o16 cbw ! Sign extend AL into AH
|
---|
| 134 | o16 cwd ! Sign extend AX into DX
|
---|
| 135 | cwde ! Sign extend AX into EAX
|
---|
| 136 | cdq ! Sign extend EAX into EDX
|
---|
| 137 | idiv[b] source ! Signed divide
|
---|
| 138 | div[b] source ! Unsigned divide
|
---|
| 139 |
|
---|
| 140 | .B "Logical"
|
---|
| 141 | .HS
|
---|
| 142 | and[b] dest,source ! Logical and
|
---|
| 143 | not[b] dest ! Logical not
|
---|
| 144 | or[b] dest,source ! Logical inclusive or
|
---|
| 145 | test[b] dest,source ! Logical test
|
---|
| 146 | xor[b] dest,source ! Logical exclusive or
|
---|
| 147 |
|
---|
| 148 | .B "Shift"
|
---|
| 149 | .HS
|
---|
| 150 | sal[b]/shl[b] dest,CL ! Shift logical left
|
---|
| 151 | sar[b] dest,CL ! Shift arithmetic right
|
---|
| 152 | shr[b] dest,CL ! Shift logical right
|
---|
| 153 |
|
---|
| 154 | .B "Rotate"
|
---|
| 155 | .HS
|
---|
| 156 | rcl[b] dest,CL ! Rotate left, with carry
|
---|
| 157 | rcr[b] dest,CL ! Rotate right, with carry
|
---|
| 158 | rol[b] dest,CL ! Rotate left
|
---|
| 159 | ror[b] dest,CL ! Rotate right
|
---|
| 160 |
|
---|
| 161 | .B "String Manipulation"
|
---|
| 162 | .HS
|
---|
| 163 | cmps[b] ! Compare string element ds:esi with es:edi
|
---|
| 164 | lods[b] ! Load from ds:esi into AL, AX, or EAX
|
---|
| 165 | movs[b] ! Move from ds:esi to es:edi
|
---|
| 166 | rep ! Repeat next instruction until ECX=0
|
---|
| 167 | repe/repz ! Repeat next instruction until ECX=0 or ZF=0
|
---|
| 168 | repne/repnz ! Repeat next instruction until ECX=0 or ZF=1
|
---|
| 169 | scas[b] ! Compare es:edi with AL/AX/EAX
|
---|
| 170 | stos[b] ! Store AL/AX/EAX in es:edi
|
---|
| 171 |
|
---|
| 172 | .fi
|
---|
| 173 | .B "Control Transfer"
|
---|
| 174 | .PP
|
---|
| 175 | \fIAs\fR accepts a number of special jump opcodes that can assemble to
|
---|
| 176 | instructions with either a byte displacement, which can only reach targets
|
---|
| 177 | within \(mi126 to +129 bytes of the branch, or a 32-bit
|
---|
| 178 | displacement. The assembler automatically chooses the byte or 32-bit displacement
|
---|
| 179 | instruction.
|
---|
| 180 | .PP
|
---|
| 181 | The English translation of the opcodes should be obvious, with
|
---|
| 182 | \*(OQl(ess)\*(CQ and \*(OQg(reater)\*(CQ for signed comparisons, and
|
---|
| 183 | \*(OQb(elow)\*(CQ and \*(OQa(bove)\*(CQ for unsigned comparisons. There are
|
---|
| 184 | lots of synonyms to allow you to write "jump if not that" instead of "jump
|
---|
| 185 | if this".
|
---|
| 186 | .PP
|
---|
| 187 | The \*(OQcall\*(CQ, \*(OQjmp\*(CQ, and \*(OQret\*(CQ instructions can be
|
---|
| 188 | either intrasegment or
|
---|
| 189 | intersegment. The intersegment versions are indicated with
|
---|
| 190 | the suffix \*(OQf\*(CQ.
|
---|
| 191 |
|
---|
| 192 | .if t .ta 0.25i 1.2i 3i
|
---|
| 193 | .if n .ta 2 10 24
|
---|
| 194 | .nf
|
---|
| 195 | .B Unconditional
|
---|
| 196 | .HS
|
---|
| 197 | jmp[f] dest ! jump to dest (8 or 32-bit displacement)
|
---|
| 198 | call[f] dest ! call procedure
|
---|
| 199 | ret[f] ! return from procedure
|
---|
| 200 |
|
---|
| 201 | .B "Conditional"
|
---|
| 202 | .HS
|
---|
| 203 | ja/jnbe ! if above/not below or equal (unsigned)
|
---|
| 204 | jae/jnb/jnc ! if above or equal/not below/not carry (uns.)
|
---|
| 205 | jb/jnae/jc ! if below/not above nor equal/carry (unsigned)
|
---|
| 206 | jbe/jna ! if below or equal/not above (unsigned)
|
---|
| 207 | jg/jnle ! if greater/not less nor equal (signed)
|
---|
| 208 | jge/jnl ! if greater or equal/not less (signed)
|
---|
| 209 | jl/jnge ! if less/not greater nor equal (signed)
|
---|
| 210 | jle/jng ! if less or equal/not greater (signed)
|
---|
| 211 | je/jz ! if equal/zero
|
---|
| 212 | jne/jnz ! if not equal/not zero
|
---|
| 213 | jno ! if overflow not set
|
---|
| 214 | jo ! if overflow set
|
---|
| 215 | jnp/jpo ! if parity not set/parity odd
|
---|
| 216 | jp/jpe ! if parity set/parity even
|
---|
| 217 | jns ! if sign not set
|
---|
| 218 | js ! if sign set
|
---|
| 219 |
|
---|
| 220 | .B "Iteration Control"
|
---|
| 221 | .HS
|
---|
| 222 | jcxz dest ! jump if ECX = 0
|
---|
| 223 | loop dest ! Decrement ECX and jump if ECX != 0
|
---|
| 224 | loope/loopz dest ! Decrement ECX and jump if ECX != 0 and ZF = 1
|
---|
| 225 | loopne/loopnz dest ! Decrement ECX and jump if ECX != 0 and ZF = 0
|
---|
| 226 |
|
---|
| 227 | .B "Interrupt"
|
---|
| 228 | .HS
|
---|
| 229 | int n ! Software interrupt n
|
---|
| 230 | into ! Interrupt if overflow set
|
---|
| 231 | iretd ! Return from interrupt
|
---|
| 232 |
|
---|
| 233 | .B "Flag Operations"
|
---|
| 234 | .HS
|
---|
| 235 | clc ! Clear carry flag
|
---|
| 236 | cld ! Clear direction flag
|
---|
| 237 | cli ! Clear interrupt enable flag
|
---|
| 238 | cmc ! Complement carry flag
|
---|
| 239 | stc ! Set carry flag
|
---|
| 240 | std ! Set direction flag
|
---|
| 241 | sti ! Set interrupt enable flag
|
---|
| 242 |
|
---|
| 243 | .fi
|
---|
| 244 | .SS "Location Counter"
|
---|
| 245 | .PP
|
---|
| 246 | The special symbol \*(OQ.\*(CQ is the location counter and its value
|
---|
| 247 | is the address of the first byte of the instruction in which the symbol
|
---|
| 248 | appears and can be used in expressions.
|
---|
| 249 | .SS "Segments"
|
---|
| 250 | .PP
|
---|
| 251 | There are four different assembly segments: text, rom, data and bss.
|
---|
| 252 | Segments are declared and selected by the \fI.sect\fR pseudo-op. It is
|
---|
| 253 | customary to declare all segments at the top of an assembly file like
|
---|
| 254 | this:
|
---|
| 255 | .HS
|
---|
| 256 | ~~~.sect .text; .sect .rom; .sect .data; .sect .bss
|
---|
| 257 | .HS
|
---|
| 258 | The assembler accepts up to 16 different segments, but
|
---|
| 259 | .MX
|
---|
| 260 | expects only four to be used. Anything can in principle be assembled
|
---|
| 261 | into any segment, but the
|
---|
| 262 | .MX
|
---|
| 263 | bss segment may only contain uninitialized data.
|
---|
| 264 | Note that the \*(OQ.\*(CQ symbol refers to the location in the current
|
---|
| 265 | segment.
|
---|
| 266 | .SS "Labels"
|
---|
| 267 | .PP
|
---|
| 268 | There are two types: name and numeric. Name labels consist of a name
|
---|
| 269 | followed by a colon (:).
|
---|
| 270 | .PP
|
---|
| 271 | The numeric labels are single digits. The nearest 0: label may be
|
---|
| 272 | referenced as 0f in the forward direction, or 0b backwards.
|
---|
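The sketch below shows how numeric labels might be used. It relies only on opcodes listed earlier; the register usage and the loop count are illustrative assumptions.

```asm
	mov	ecx, 4		! iteration count
1:	inc	eax		! loop body
	loop	1b		! branch back to the nearest preceding 1:
	jcxz	1f		! branch forward to the nearest following 1:
	inc	ebx		! skipped, since ECX is 0 after the loop
1:	ret
```

Because 1b and 1f always resolve to the nearest definition in the given direction, the same digit can be reused for short-range branches without inventing unique names.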
| 273 | .SS "Statement Syntax"
|
---|
| 274 | .PP
|
---|
| 275 | Each line consists of a single statement.
|
---|
| 276 | Blank or comment lines are allowed.
|
---|
| 277 | .SS "Instruction Statements"
|
---|
| 278 | .PP
|
---|
| 279 | The most general form of an instruction is
|
---|
| 280 | .HS
|
---|
| 281 | ~~~label: opcode operand1, operand2 ! comment
|
---|
| 282 | .HS
|
---|
| 283 | .SS "Expression Semantics"
|
---|
| 284 | .PP
|
---|
| 285 | .tr ~~
|
---|
| 286 | The following operators can be used:
|
---|
| 287 | + \(mi * / & | ^ ~ << (shift left) >> (shift right) \(mi (unary minus).
|
---|
| 288 | .tr ~
|
---|
| 289 | 32-bit integer arithmetic is used.
|
---|
| 290 | Division produces a truncated quotient.
|
---|
| 291 | .SS "Addressing Modes"
|
---|
| 292 | .PP
|
---|
| 293 | Below is a list of the addressing modes supported.
|
---|
| 294 | Each one is followed by an example.
|
---|
| 295 | .HS
|
---|
| 296 | .ta 0.25i 3i
|
---|
| 297 | .nf
|
---|
| 298 | constant mov eax, 123456
|
---|
| 299 | direct access mov eax, (counter)
|
---|
| 300 | register mov eax, esi
|
---|
| 301 | indirect mov eax, (esi)
|
---|
| 302 | base + disp. mov eax, 6(ebp)
|
---|
| 303 | scaled index mov eax, (4*esi)
|
---|
| 304 | base + index mov eax, (ebp)(2*esi)
|
---|
| 305 | base + index + disp. mov eax, 10(edi)(1*esi)
|
---|
| 306 | .HS
|
---|
| 307 | .fi
|
---|
| 308 | Any of the constants or symbols may be replaced by expressions. Direct
|
---|
| 309 | access, constants and displacements may be any type of expression. A scaled
|
---|
| 310 | index with scale 1 may be written without the \*(OQ1*\*(CQ.
|
---|
| 311 | .SS "Call and Jmp"
|
---|
| 312 | .PP
|
---|
| 313 | The \*(OQcall\*(CQ and \*(OQjmp\*(CQ instructions can be interpreted
|
---|
| 314 | as a load into the instruction pointer.
|
---|
| 315 | .HS
|
---|
| 316 | .ta 0.25i 3i
|
---|
| 317 | .nf
|
---|
| 318 | call _routine ! Direct, intrasegment
|
---|
| 319 | call (subloc) ! Indirect, intrasegment
|
---|
| 320 | call 6(ebp) ! Indirect, intrasegment
|
---|
| 321 | call ebx ! Direct, intrasegment
|
---|
| 322 | call (ebx) ! Indirect, intrasegment
|
---|
| 323 | callf (subloc) ! Indirect, intersegment
|
---|
| 324 | callf seg:offs ! Direct, intersegment
|
---|
| 325 | .HS
|
---|
| 326 | .fi
|
---|
| 327 | .SP 1
|
---|
| 328 | .SS "Symbol Assignment"
|
---|
| 329 | .SP 1
|
---|
| 330 | .PP
|
---|
| 331 | Symbols can acquire values in one of two ways.
|
---|
| 332 | Using a symbol as a label sets it to \*(OQ.\*(CQ for the current
|
---|
| 333 | segment with type relocatable.
|
---|
| 334 | Alternatively, a symbol may be given a value via an assignment of the form
|
---|
| 335 | .HS
|
---|
| 336 | ~~~symbol = expression
|
---|
| 337 | .HS
|
---|
| 338 | in which the symbol is assigned the value and type of the expression.
|
---|
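A minimal sketch of symbol assignment, assuming the hypothetical names NENTRIES and TABSIZE:

```asm
NENTRIES = 16			! absolute value 16
TABSIZE = 4*NENTRIES		! value of the expression, 64
	mov	ecx, TABSIZE	! used like any other constant operand
```

Here both symbols take the type of their expressions, which are plain constants, whereas a symbol defined as a label would be relocatable.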
| 339 | .SP 1
|
---|
| 340 | .SS "Storage Allocation"
|
---|
| 341 | .SP 1
|
---|
| 342 | .PP
|
---|
| 343 | Space can be reserved for bytes, words, and longs using pseudo-ops.
|
---|
| 344 | They take one or more operands, and for each generate a value
|
---|
| 345 | whose size is a byte, word (2 bytes) or long (4 bytes). For example:
|
---|
| 346 | .HS
|
---|
| 347 | .if t .ta 0.25i 3i
|
---|
| 348 | .if n .ta 2 24
|
---|
| 349 | .data1 2, 6 ! allocate 2 bytes initialized to 2 and 6
|
---|
| 350 | .br
|
---|
| 351 | .data2 3, 0x10 ! allocate 2 words initialized to 3 and 16
|
---|
| 352 | .br
|
---|
| 353 | .data4 010 ! allocate a longword initialized to 8
|
---|
| 354 | .br
|
---|
| 355 | .space 40 ! allocates 40 bytes of zeros
|
---|
| 356 | .HS
|
---|
| 357 | allocates 50 (decimal) bytes of storage, initializing the first two
|
---|
| 358 | bytes to 2 and 6, the next two words to 3 and 16, then one longword with
|
---|
| 359 | value 8 (010 octal), and finally 40 bytes of zeros.
|
---|
| 360 | .SS "String Allocation"
|
---|
| 361 | .PP
|
---|
| 362 | The pseudo-ops \fI.ascii\fR and \fI.asciz\fR
|
---|
| 363 | take one string argument and generate the ASCII character
|
---|
| 364 | codes for the letters in the string.
|
---|
| 365 | The latter automatically terminates the string with a null (0) byte.
|
---|
| 366 | For example,
|
---|
| 367 | .HS
|
---|
| 368 | ~~~.ascii "hello"
|
---|
| 369 | .br
|
---|
| 370 | ~~~.asciz "world\en"
|
---|
| 371 | .HS
|
---|
| 372 | .SS "Alignment"
|
---|
| 373 | .PP
|
---|
| 374 | Sometimes it is necessary to force the next item to begin at a word, longword
|
---|
| 375 | or even a 16 byte address boundary.
|
---|
| 376 | The \fI.align\fR pseudo-op emits zero or more null bytes until the current location
|
---|
| 377 | is a multiple of the argument of .align.
|
---|
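For instance, a byte-sized item followed by a longword could be laid out as in the sketch below. The labels _flag and _total are hypothetical, and the segment is assumed to have been declared earlier.

```asm
	.sect .data
_flag:
	.data1	1		! one byte; the location may now be odd
	.align	4		! pad with null bytes up to the next multiple of 4
_total:
	.data4	0		! starts on a longword boundary
```

If the location is already a multiple of 4 when .align is reached, no padding is emitted.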
| 378 | .SS "Segment Control"
|
---|
| 379 | .PP
|
---|
| 380 | Every item assembled goes in one of the four segments: text, rom, data,
|
---|
| 381 | or bss. By using the \fI.sect\fR pseudo-op with argument
|
---|
| 382 | \fI.text, .rom, .data\fR or \fI.bss\fR, the programmer can force the
|
---|
| 383 | next items to go in a particular segment.
|
---|
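The following sketch switches between segments within one file. The symbol names are hypothetical, and the segments are assumed to have been declared at the top of the file as shown under "Segments".

```asm
	.sect .text
_get_count:
	mov	eax, (_count)	! code goes into the text segment
	ret
	.sect .data
_count:
	.data4	0		! initialized data goes into the data segment
	.sect .bss
_scratch:
	.space	64		! reserved space in the bss segment
```

Each .sect resumes assembly at the current location of the named segment, so a file may return to a segment as often as needed.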
| 384 | .SS "External Names"
|
---|
| 385 | .PP
|
---|
| 386 | A symbol can be given global scope by including it in a \fI.define\fR pseudo-op.
|
---|
| 387 | Multiple names may be listed, separated by commas.
|
---|
| 388 | It must be used to export symbols defined in the current program.
|
---|
| 389 | Names not defined in the current program are treated as "undefined
|
---|
| 390 | external" automatically, although it is customary to make this explicit
|
---|
| 391 | with the \fI.extern\fR pseudo-op.
|
---|
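A short sketch of exporting and importing names follows. The names _add_one and _limit are hypothetical.

```asm
	.sect .text
	.define _add_one	! export _add_one to other modules
	.extern _limit		! defined elsewhere; the declaration is optional
_add_one:
	mov	eax, (_limit)	! fetch the external variable
	inc	eax
	ret
```

Without the .define line, _add_one would stay local to this file and other modules could not call it.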
| 392 | .SS "Common"
|
---|
| 393 | .PP
|
---|
| 394 | The \fI.comm\fR pseudo-op declares storage that can be common to more than
|
---|
| 395 | one module. There are two arguments: a name and an absolute expression giving
|
---|
| 396 | the size in bytes of the area named by the symbol.
|
---|
| 397 | The type of the symbol becomes
|
---|
| 398 | external. The statement can appear in any segment.
|
---|
| 399 | If you think this has something to do with FORTRAN, you are right.
|
---|
| 400 | .SS "Examples"
|
---|
| 401 | .PP
|
---|
| 402 | In the kernel directory, there are several assembly code files that are
|
---|
| 403 | worth inspecting as examples.
|
---|
| 404 | However, note that these files are designed to first be
|
---|
| 405 | run through the C preprocessor. (The very first character is a # to signal
|
---|
| 406 | this.) Thus they contain numerous constructs
|
---|
| 407 | that are not pure assembler.
|
---|
| 408 | For true assembler examples, compile any C program provided with
|
---|
| 409 | .MX
|
---|
| 410 | using the \fB\(enS\fR flag.
|
---|
| 411 | This will result in an assembly language file with the same
|
---|
| 412 | name as the C source file, but ending with the .s suffix.
|
---|