|  | 1 | Notes on i80386 string assembly routines.               Author: Kees J. Bot | 
|---|
|  | 2 | 2 Jan 1994 | 
|---|
|  | 3 |  | 
|---|
|  | 4 | Remarks. | 
|---|
|  | 5 | All routines set up proper stack frames, so that stack traces can be | 
|---|
|  | 6 | derived from core dumps.  String routines are often the ones that | 
|---|
|  | 7 | get the bad pointer. | 
|---|
|  | 8 |  | 
|---|
|  | 9 | Flags are often not right in boundary cases (zero string length) on | 
|---|
|  | 10 | repeated string scanning or comparing instructions.  This has been | 
|---|
|  | 11 | handled in sometimes nonobvious ways. | 
|---|
|  | 12 |  | 
|---|
|  | 13 | Only the eax, edx, and ecx registers are not preserved; all other | 
|---|
|  | 14 | registers are.  This is what GCC expects.  (ACK sees ebx as scratch | 
|---|
|  | 15 | too.)  The direction flag is assumed to be wrong on entry, and is | 
|---|
|  | 16 | left clear on exit. | 
|---|
|  | 17 |  | 
|---|
|  | 18 | Assumptions. | 
|---|
|  | 19 | The average string is short, so short strings should not suffer from | 
|---|
|  | 20 | smart tricks to copy, compare, or search large strings fast.  This | 
|---|
|  | 21 | means that the routines are fast on average, but not optimal for | 
|---|
|  | 22 | long strings. | 
|---|
|  | 23 |  | 
|---|
|  | 24 | It doesn't pay to use word or longword operations on strings; the | 
|---|
|  | 25 | setup time hurts the average case. | 
|---|
|  | 26 |  | 
|---|
|  | 27 | Memory blocks are probably large and on word or longword boundaries. | 
|---|
|  | 28 |  | 
|---|
|  | 29 | No unaligned word moves are done.  Again the setup time may hurt the | 
|---|
|  | 30 | average case.  Furthermore, the author likes to enable the alignment | 
|---|
|  | 31 | check on a 486. | 
|---|
|  | 32 |  | 
|---|
|  | 33 | String routines. | 
|---|
|  | 34 | They have been implemented using byte string instructions.  The | 
|---|
|  | 35 | length of a string is usually determined first, followed by the | 
|---|
|  | 36 | actual operation. | 
|---|
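The "length first, then operate" pattern can be sketched in C.  This is
an illustration only, not the author's assembly: sketch_strcpy is a
hypothetical name, and strlen and memcpy stand in for what would
presumably be a repeated byte scan and a repeated byte move.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch of "determine the length, then operate". */
char *sketch_strcpy(char *dst, const char *src)
{
    /* Step 1: find the length, counting the terminating zero
       (stands in for a repeated byte scan). */
    size_t n = strlen(src) + 1;

    /* Step 2: do the actual operation on a known byte count
       (stands in for a repeated byte move). */
    memcpy(dst, src, n);
    return dst;
}
```

Once the count is known, the second step needs no per-byte test for the
terminator, which is what makes a repeat prefix usable.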
|  | 37 |  | 
|---|
|  | 38 | Strcmp. | 
|---|
|  | 39 | This is the only string routine that uses a loop, and not | 
|---|
|  | 40 | instructions with a repeat prefix.  The problem is that we don't know | 
|---|
|  | 41 | how long the strings are.  Scanning for the end first is wasted work | 
|---|
|  | 42 | if the strings differ in the first few bytes. | 
|---|
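A plain comparison loop of this kind might look as follows in C.  It is
a sketch under the assumption that bytes are compared as unsigned
values; sketch_strcmp is a hypothetical name, and the real routine is
written in assembly.

```c
/* Hypothetical name; compares one byte at a time, with no length scan. */
int sketch_strcmp(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;

    /* Stop at the first differing byte or at the end of s1. */
    while (*p != 0 && *p == *q) {
        p++;
        q++;
    }
    /* Zero if equal, otherwise the sign of the first difference. */
    return (int)*p - (int)*q;
}
```

The loop stops as soon as the strings differ, so an early mismatch costs
only a few iterations.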
|  | 43 |  | 
|---|
|  | 44 | Strchr. | 
|---|
|  | 45 | The character we look for is often not there, or at some distance | 
|---|
|  | 46 | from the start.  The string is scanned twice, once for the terminating | 
|---|
|  | 47 | zero and once for the character sought, in chunks of increasing length. | 
|---|
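One way to read the chunked double scan is sketched below in C.  The
initial chunk size of 16 and the doubling step are assumptions, and
scan_bytes and sketch_strchr are hypothetical names; the notes do not
give the actual sizes used.

```c
#include <stddef.h>

/* Scan at most n bytes from p for byte c, stopping at the first match.
   Stands in for a repeated byte scan instruction. */
static const char *scan_bytes(const char *p, size_t n, unsigned char c)
{
    for (; n > 0; n--, p++) {
        if ((unsigned char)*p == c)
            return p;
    }
    return NULL;
}

/* Hypothetical name; searches in chunks of increasing length. */
char *sketch_strchr(const char *s, int c)
{
    unsigned char ch = (unsigned char)c;
    size_t chunk = 16;              /* assumed initial chunk size */

    for (;;) {
        /* First scan: is the terminating zero inside this chunk? */
        const char *end = scan_bytes(s, chunk, 0);

        /* Bytes to search, including the zero if it was found. */
        size_t n = (end != NULL) ? (size_t)(end - s) + 1 : chunk;

        /* Second scan: look for the character in those n bytes. */
        const char *hit = scan_bytes(s, n, ch);
        if (hit != NULL)
            return (char *)hit;
        if (end != NULL)
            return NULL;            /* end of string, no match */

        s += chunk;                 /* move on to the next chunk */
        chunk *= 2;                 /* assumed growth: doubling */
    }
}
```

Small early chunks keep the cost low for short strings, while the
growing chunk size limits the number of scan setups on long ones.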
|  | 48 |  | 
|---|
|  | 49 | Memory routines. | 
|---|
|  | 50 | Memmove, memcpy, and memset use word or longword instructions if the | 
|---|
|  | 51 | address(es) are at word or longword boundaries.  No tricks to get | 
|---|
|  | 52 | alignment after doing a few bytes.  No unaligned operations. | 
|---|
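The alignment rule can be illustrated with a C sketch of a forward
copy.  It assumes 4-byte longwords and leaves out the word (2-byte)
case; sketch_memcpy is a hypothetical name, and the inner memcpy calls
are only a portable way to express a single longword load and store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; copies longwords only when both addresses are
   already on a 4-byte boundary, otherwise bytes. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* No attempt is made to reach alignment by copying a few bytes. */
    if ((((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        while (n >= 4) {
            uint32_t w;
            memcpy(&w, s, 4);       /* one aligned longword load */
            memcpy(d, &w, 4);       /* one aligned longword store */
            d += 4;
            s += 4;
            n -= 4;
        }
    }

    /* Trailing bytes, or the whole block if it was not aligned. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Because unaligned blocks go straight to the byte loop, no unaligned
longword access ever occurs, which is compatible with enabling the
alignment check.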