Neuromancer Codewheel – Part 2

Editorial note: This is the second of a two-part tutorial on reverse engineering executables. Today, we’ll walk through the process of analyzing a series of assembly instructions. These instructions implement the codewheel verification check from the 1989 game “Neuromancer”. We covered the process of finding these instructions last week.


Let’s dive right in, and begin to examine the code, instruction by instruction.

Stack setup

A brief note on the format of the following instructions: The first column of numbers (e.g. 13DB:5992) contains segment:offset addresses describing the instructions’ locations in memory, as determined in last week’s exercise. The second column of numbers (e.g. 55, or 8BEC) contains the machine language bytes which comprise the instructions. The remainder of a row (e.g. PUSH BP) contains the assembly language interpretation of those bytes (as 8086 real-mode machine code).

With that said, here are the first three instructions:

13DB:5992 55            PUSH    BP
13DB:5993 8BEC          MOV     BP, SP
13DB:5995 83EC0C        SUB     SP, +0C

These instructions initialize the stack frame: they save the Base Pointer from the previous frame on the stack, set the Base Pointer for the current frame to the current Stack Pointer, and allocate 12 bytes of memory on the stack, addressable as [BP-0C] through [BP-01].

File loading

13DB:5998 B87C6A        MOV     AX, 6A7C
13DB:599B 1E            PUSH    DS
13DB:599C 50            PUSH    AX
13DB:599D B84A54        MOV     AX, 544A
13DB:59A0 50            PUSH    AX
13DB:59A1 E8ACCB        CALL    2550
13DB:59A4 83C406        ADD     SP, +06

Call the function at CS:2550 with a near pointer to DS:544A, and a far pointer to DS:6A7C. (I list the near pointer first, since it’s passed at a lower memory address on the stack.) After the function call, the arguments are deleted from the stack.
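As a quick sanity check on the disassembly, the target of a near CALL can be recomputed from its machine bytes: the E8 opcode is followed by a 16-bit little-endian displacement, which is added to the offset of the next instruction, wrapping around at 64 KB. Here is a minimal Python sketch of that arithmetic (the helper name near_call_target is my own invention, not something provided by DEBUG or the game):

```python
def near_call_target(offset, code):
    """Return the target offset of a 3-byte near CALL (opcode E8) located at `offset`."""
    if len(code) != 3 or code[0] != 0xE8:
        raise ValueError("expected a 3-byte E8 rel16 CALL")
    displacement = int.from_bytes(code[1:3], "little")
    next_instruction = offset + 3
    # 16-bit wraparound: the result stays within the current code segment.
    return (next_instruction + displacement) & 0xFFFF

# The CALL at 13DB:59A1, machine bytes E8 AC CB.
print(f"{near_call_target(0x59A1, bytes.fromhex('E8ACCB')):04X}")  # 2550
```

The same calculation can be used to double-check any of the other near CALL and JMP targets in the listings that follow.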

If we use DOS DEBUG, we can find out a little more about what this function call does. If we set a breakpoint just before the function call (with the debugger command “g 13db:59a1”) we will see that, before the call, DS:544A points to the NULL-terminated string “paxcodes.txh”, and DS:6A7C points to garbage. (Use the debugger’s “d” command – e.g. “d ds:544a” – to examine the contents of memory.) If we run to the instruction following the call (with the debugger command “g 13db:59a4”) we will find that DS:6A7C now points to a series of NULL-terminated strings including words such as “Chatsubo”, “Cyberspace”, “Gemeinschaft”, and so on: the keywords found on the code wheel.

It seems reasonable to conclude that CS:2550 loads the contents of a file into memory; the filename is given by its first argument, the destination buffer by its second. (The file itself is presumably stored somewhere within the NEURO1.DAT or NEURO2.DAT files that ship with the game.)

Mystery function 1

13DB:59A7 B80100        MOV     AX, 0001
13DB:59AA 50            PUSH    AX
13DB:59AB E8D3F7        CALL    5181
13DB:59AE 83C402        ADD     SP, +02

Call the function at CS:5181 with an argument of 1, then delete the argument from the stack. Its purpose is unclear.

Mystery function 2

13DB:59B1 B80800        MOV     AX, 0008
13DB:59B4 50            PUSH    AX
13DB:59B5 50            PUSH    AX
13DB:59B6 E8F6F1        CALL    4BAF
13DB:59B9 83C404        ADD     SP, +04

Call the function at CS:4BAF with arguments of 8 and 8, then delete the arguments from the stack. The purpose of this function is not immediately obvious.

Display text

13DB:59BC B80200        MOV     AX, 0002
13DB:59BF 50            PUSH    AX
13DB:59C0 B85854        MOV     AX, 5458
13DB:59C3 50            PUSH    AX
13DB:59C4 E813F2        CALL    4BDA
13DB:59C7 83C404        ADD     SP, +04

Call the function at CS:4BDA with arguments of a near pointer to DS:5458 and the integer 2, then delete the arguments from the stack. The meaning of the “2” is obscure, but DS:5458 points (at the time of function invocation, as determined with DEBUG) to the NULL-terminated string “PAX – Public Access System”. Since this text appears at the top of the prompt screen, it seems likely that CS:4BDA prints the text given by its first argument to the screen.

Problem setup

13DB:59CA E801A1        CALL    FACE
13DB:59CD 250F00        AND     AX, 000F
13DB:59D0 8946FC        MOV     [BP-04], AX
13DB:59D3 E8F8A0        CALL    FACE
13DB:59D6 250F00        AND     AX, 000F
13DB:59D9 8946FA        MOV     [BP-06], AX
13DB:59DC E8EFA0        CALL    FACE
13DB:59DF 250F00        AND     AX, 000F
13DB:59E2 8946F8        MOV     [BP-08], AX

This code makes 3 calls to the function at CS:FACE, and stores the low 4 bits of each result in the WORDs of memory at [BP-04], [BP-06], and [BP-08]. Since CS:FACE is called 3 times, it’s probably expected to return a different value each time (otherwise, it would be more efficient to simply copy the result of one call) despite the fact that it’s called with no arguments. This suggests that CS:FACE is a pseudo-random number generator. The pRNG hypothesis is bolstered by the facts that the results of the calls to CS:FACE are clipped to one of the 16 values between 0 and 15, and that there are 16 labels on each of the inner and outer codewheel rings, and 16 codewheel windows. From context, it seems likely that this code is (pseudo-)randomly generating the PAX verification query to be posed.
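In higher-level terms, the problem setup might look something like the following Python sketch. The names are mine, and Python’s random module merely stands in for CS:FACE, whose actual algorithm we haven’t examined:

```python
import random

def generate_query(rng=random):
    """Pick three 4-bit values, as the three CALL/AND/MOV groups appear to do."""
    # rng.getrandbits(16) stands in for the 16-bit value CS:FACE leaves in AX.
    first = rng.getrandbits(16) & 0x000F   # stored at [BP-04]
    second = rng.getrandbits(16) & 0x000F  # stored at [BP-06]
    third = rng.getrandbits(16) & 0x000F   # stored at [BP-08]
    return first, second, third

print(generate_query())
```

Whatever CS:FACE returns, the AND with 0x000F guarantees that each stored value lands between 0 and 15.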

Mystery function 2 (redux)

13DB:59E5 B81800        MOV     AX, 0018
13DB:59E8 50            PUSH    AX
13DB:59E9 B86000        MOV     AX, 0060
13DB:59EC 50            PUSH    AX
13DB:59ED E8BFF1        CALL    4BAF
13DB:59F0 83C404        ADD     SP, +04

Another call to CS:4BAF, this time with arguments of 0x60, and 0x18.

Mystery function 3

13DB:59F3 B80200        MOV     AX, 0002
13DB:59F6 50            PUSH    AX
13DB:59F7 FF76FC        PUSH    [BP-04]
13DB:59FA B87C6A        MOV     AX, 6A7C
13DB:59FD 50            PUSH    AX
13DB:59FE E8BC21        CALL    7BBD
13DB:5A01 83C404        ADD     SP, +04

Call the function at CS:7BBD with arguments of a near pointer to DS:6A7C, the contents of [BP-04] (the first of the three randomly-generated numbers between 0 and 15) and the integer 2. The meaning of the “2” is obscure, but DS:6A7C points to a series of NULL-terminated strings including words such as “Chatsubo”, “Cyberspace”, “Gemeinschaft”, and so on: the keywords found on the code wheel. More particularly, the 1st 16 strings are from the outside ring of the codewheel, the next 16 from the inside ring, and the last 16 from the windows of the codewheel.

After the call to CS:7BBD, remove the 1st two arguments from the stack, but leave the last – the integer “2” – in place for the next call.

Display text (redux)

13DB:5A04 50            PUSH    AX
13DB:5A05 E8D2F1        CALL    4BDA
13DB:5A08 83C404        ADD     SP, +04

Call the “print” function at CS:4BDA with arguments of the CS:7BBD return value and the integer “2”, then delete the arguments from the stack.

Display inner ring name

13DB:5A0B B82000        MOV     AX, 0020
13DB:5A0E 50            PUSH    AX
13DB:5A0F B86000        MOV     AX, 0060
13DB:5A12 50            PUSH    AX
13DB:5A13 E899F1        CALL    4BAF
13DB:5A16 83C404        ADD     SP, +04
13DB:5A19 B80200        MOV     AX, 0002
13DB:5A1C 50            PUSH    AX
13DB:5A1D 8B46FA        MOV     AX, [BP-06]
13DB:5A20 051000        ADD     AX, 0010
13DB:5A23 50            PUSH    AX
13DB:5A24 B87C6A        MOV     AX, 6A7C
13DB:5A27 50            PUSH    AX
13DB:5A28 E89221        CALL    7BBD
13DB:5A2B 83C404        ADD     SP, +04
13DB:5A2E 50            PUSH    AX
13DB:5A2F E8A8F1        CALL    4BDA
13DB:5A32 83C404        ADD     SP, +04

This block of code repeats the 3 function calls we just saw, with subtle, but significant differences that allow us to draw some conclusions about what is going on.

First of all, the block begins with another call to CS:4BAF, this time with arguments of 0x60 and 0x20. We’ve now seen 3 calls to CS:4BAF, with arguments of (0x08, 0x08), (0x60, 0x18), and (0x60, 0x20), each of which was quickly followed by a call to the “print” function at CS:4BDA. This suggests that CS:4BAF is doing some preparatory work for the “print” function; the most obvious guess is that it positions the cursor.

An examination of the prompt screen suggests that the game is using an 8×8 pixel monospaced font. If we assume that CS:4BAF is a cursor positioning function, and that its arguments represent X and Y offsets in pixels, we’d expect the 2nd and 3rd lines to be indented 11 characters relative to the 1st line, the 2nd line to be 2 lines below the 1st, and the 3rd 1 line below the 2nd. This is, in fact, exactly what we see.
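The arithmetic behind that expectation is easy to check. This Python sketch assumes, as above, that the arguments are pixel offsets and that each character cell is 8×8 pixels; the function name is my own:

```python
FONT_SIZE = 8  # assumed glyph width and height, in pixels

def to_character_cell(x_pixels, y_pixels):
    """Convert the pixel arguments passed to CS:4BAF into a (column, row) cell."""
    return x_pixels // FONT_SIZE, y_pixels // FONT_SIZE

# The three calls to CS:4BAF seen so far.
for args in [(0x08, 0x08), (0x60, 0x18), (0x60, 0x20)]:
    print(to_character_cell(*args))
# (1, 1)
# (12, 3)
# (12, 4)
```

Columns 1 and 12 differ by 11 characters, and rows 1, 3 and 4 give the 2-line and 1-line gaps described above.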

Secondly, another call is made to CS:7BBD, but the 2nd argument is now the 2nd randomly generated number plus 16. Finally, the CS:7BBD return value is printed to the screen.

It seems likely that CS:7BBD returns the Nth NULL-terminated string from an array of character data. It also seems likely that [BP-04] is the outer ring position, [BP-06] the inner ring position, and [BP-08] the codewheel window.
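If that guess about CS:7BBD is right, its core might be sketched in Python as follows. The function name and the sample buffer are mine; the buffer holds just the three keywords quoted earlier rather than the real file contents, and the still-unexplained third argument (the “2”) is ignored:

```python
def nth_string(buffer, n):
    """Return the n-th (0-based) NUL-terminated string stored in `buffer`."""
    start = 0
    for _ in range(n):
        # Skip past one string and its terminating NUL byte.
        start = buffer.index(b"\x00", start) + 1
    end = buffer.index(b"\x00", start)
    return buffer[start:end].decode("ascii")

# Hypothetical excerpt of the data loaded at DS:6A7C.
sample = b"Chatsubo\x00Cyberspace\x00Gemeinschaft\x00"
print(nth_string(sample, 1))  # Cyberspace
```

With the full 48-string buffer, passing an index of 16 plus the inner ring position, or 32 plus the window number, would select from the second and third groups of strings, which matches the ADD AX, 0010 and ADD AX, 0020 instructions in the listings.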

Display window name

13DB:5A35 B82800        MOV     AX, 0028
13DB:5A38 50            PUSH    AX
13DB:5A39 B86000        MOV     AX, 0060
13DB:5A3C 50            PUSH    AX
13DB:5A3D E86FF1        CALL    4BAF
13DB:5A40 83C404        ADD     SP, +04
13DB:5A43 B80200        MOV     AX, 0002
13DB:5A46 50            PUSH    AX
13DB:5A47 8B46F8        MOV     AX, [BP-08]
13DB:5A4A 052000        ADD     AX, 0020
13DB:5A4D 50            PUSH    AX
13DB:5A4E B87C6A        MOV     AX, 6A7C
13DB:5A51 50            PUSH    AX
13DB:5A52 E86821        CALL    7BBD
13DB:5A55 83C404        ADD     SP, +04
13DB:5A58 50            PUSH    AX
13DB:5A59 E87EF1        CALL    4BDA
13DB:5A5C 83C404        ADD     SP, +04

The third almost-repetition of the code block. Based upon the conclusions we’ve already drawn, we can assume that these instructions print the name of the codewheel window at character position (0x0C, 0x05).

Display prompt

13DB:5A5F B83800        MOV     AX, 0038
13DB:5A62 50            PUSH    AX
13DB:5A63 B80800        MOV     AX, 0008
13DB:5A66 50            PUSH    AX
13DB:5A67 E845F1        CALL    4BAF
13DB:5A6A 83C404        ADD     SP, +04
13DB:5A6D B80200        MOV     AX, 0002
13DB:5A70 50            PUSH    AX
13DB:5A71 B87454        MOV     AX, 5474
13DB:5A74 50            PUSH    AX
13DB:5A75 E862F1        CALL    4BDA
13DB:5A78 83C404        ADD     SP, +04

This code positions the cursor at character position (0x01, 0x07), and prints the string at DS:5474 – “Enter verification code:”.

Position cursor

13DB:5A7B B83800        MOV     AX, 0038
13DB:5A7E 50            PUSH    AX
13DB:5A7F B8D000        MOV     AX, 00D0
13DB:5A82 50            PUSH    AX
13DB:5A83 E829F1        CALL    4BAF
13DB:5A86 83C404        ADD     SP, +04

This code positions the cursor at character position (0x1A, 0x07) – 1 character to the right of the string just printed.

Mystery function 4

13DB:5A89 2BC0          SUB     AX, AX
13DB:5A8B 50            PUSH    AX
13DB:5A8C B80600        MOV     AX, 0006
13DB:5A8F 50            PUSH    AX
13DB:5A90 E84405        CALL    5FD7
13DB:5A93 83C404        ADD     SP, +04

Call the function at CS:5FD7 with arguments of 6 and 0, then delete the arguments from the stack. The purpose of this function isn’t immediately obvious, but it’s perhaps suggestive that the longest number on the codewheel is only 6 digits long.

Check for input

13DB:5A96 3DFFFF        CMP     AX, FFFF
13DB:5A99 7508          JNZ     5AA3
13DB:5A9B 83FAFF        CMP     DX, -01
13DB:5A9E 7503          JNZ     5AA3
13DB:5AA0 E904FF        JMP     59A7

Test the results of the call to CS:5FD7: If AX equals 0xFFFF and DX equals -1, sign-extended (which comes to the same test for each register, really) then execution jumps back to CS:59A7, which is just after the loading of “paxcodes.txh”, and just before the problem is generated or any text is written to the screen.
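One way to read this pair of comparisons is as a single test of a 32-bit return value held in DX:AX (a convention commonly used by 16-bit DOS compilers for long integers; that this function follows it is an assumption on my part). A small Python sketch, with a made-up function name:

```python
def returned_minus_one(ax, dx):
    """True when the 32-bit value DX:AX is 0xFFFFFFFF, i.e. -1 as a signed long."""
    combined = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    return combined == 0xFFFFFFFF

print(returned_minus_one(0xFFFF, 0xFFFF))  # True: jump back to CS:59A7
print(returned_minus_one(0x0000, 0xFFFF))  # False: continue at CS:5AA3
```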

In practice, the jump back to CS:59A7 seems to occur when the user hits “Return” at the PAX verification prompt without entering any data. This lets us conclude two things:

  • CS:5FD7 is probably responsible for gathering keyboard input from the user at the prompt
  • CS:5181 is probably some sort of screen-clearing function (it’s the only still-unexplained function we’ve seen, and it seems likely that some function is clearing the screen).

Display acknowledgement

13DB:5AA3 B85800        MOV     AX, 0058
13DB:5AA6 50            PUSH    AX
13DB:5AA7 50            PUSH    AX
13DB:5AA8 E804F1        CALL    4BAF
13DB:5AAB 83C404        ADD     SP, +04
13DB:5AAE B80200        MOV     AX, 0002
13DB:5AB1 50            PUSH    AX
13DB:5AB2 B88E54        MOV     AX, 548E
13DB:5AB5 50            PUSH    AX
13DB:5AB6 E821F1        CALL    4BDA
13DB:5AB9 83C404        ADD     SP, +04

This code positions the cursor at character position (0x0B, 0x0B), and prints the string at DS:548E – “Verifying access…”.

Calculate LUT column

13DB:5ABC 8B5EF8        MOV     BX, [BP-08]
13DB:5ABF 8A875620      MOV     AL, [BX+2056]
13DB:5AC3 2AE4          SUB     AH, AH
13DB:5AC5 0346FC        ADD     AX, [BP-04]
13DB:5AC8 2B46FA        SUB     AX, [BP-06]
13DB:5ACB 250F00        AND     AX, 000F
13DB:5ACE 8946FE        MOV     [BP-02], AX

Look up the window index in the BYTE array at DS:2056, then add the outer ring index to it, and subtract the inner ring index from it. Store the low 4 bits of the result in the WORD at [BP-02]. To understand the significance of these calculations, we must consider the physical construction of the codewheel.

The codewheel consists of two paper disks, one slightly larger than the other, which are mounted on a common axis, and which may turn independently. The larger disk is divided into 16 radial slices; each slice contains a column of 8 codes and a label (“Chatsubo”, “Cyberspace”, “Gemeinschaft”, etc.) above the codes, on the disk’s perimeter.

Each code on the larger disk may be identified by a (rank, name) pair, where “name” is taken from the set of labels on the larger disk, and “rank” is a number from 0 to 7, where 0 represents the topmost/outermost code in a column. For instance, (0, “Cyberspace”) is 021655, (2, “Chatsubo”) is 44312, and (7, “Cyberspace”) is 045.

Each slice of the larger disk may also be assigned a number, from 0 to 15. In particular, we can assign each slice an index based upon the number of slices clockwise it falls from the “Chatsubo” slice. This allows us to identify each code on the larger disk with a pair of numbers. Restating the previous examples in these terms, (0, 1) is 021655, (2, 0) is 44312, and (7, 1) is 045. This sort of addressing lets us represent the set of codes on the larger disk as an 8 row, 16 column table.

The codewheel’s smaller disk is also divided into 16 radial slices, each of which is labelled with a name (“Ratz”, “Holografix”, “Larry Moe”, etc.) on the disk’s perimeter. The smaller disk is mounted in front of the larger disk, such that only the labels on the larger disk’s perimeter are visible at all times. The smaller disk also has 16 windows cut into it, which are aligned so that some of the codes on the larger disk are visible; which particular codes are visible depends upon the rotation of the smaller disk with respect to the larger disk. Each window is assigned a label (“Zion Cluster”, “Chiba City”, “Asano Computing”, etc.).

Each slice of the smaller disk may be assigned a number, from 0 to 15. In particular, we can assign each slice an index based upon the number of slices clockwise it is from the “Ratz” slice. We may also assign indices to the windows by listing the windows in each slice from perimeter to center, and by ordering the slices by increasing clockwise distance from the “Ratz” slice. This convention would assign an index of 0 to “Zion Cluster”, 1 to “Chiba City”, 2 to “Asano Computing”, and so on, through 15 for “Fuji Electric”.

Each window may be characterized by a (rank, slice) pair, describing that window’s location on the smaller disk. The “slice” value is equal to the index of the slice in which the window is found, while “rank” is equal to the rank of the codes over which the window falls. For instance, the “0” or “Zion Cluster” window has a characteristic of (2, 0), while the “2” or “Asano Computing” window has a characteristic of (0, 1).

Finally, the relative rotation of the larger and smaller disks may be summarized by the index of the outer disk slice aligned with the inner disk’s “0” or “Ratz” slice. If the “Ratz” and “Chatsubo” slices are aligned, the disk’s rotation is 0. If the “Ratz” and “Cyberspace” slices are aligned, the disk’s rotation is 1.

With all that said, we can make some sense of the preceding code. First of all, a disk rotation is represented in the verification query by one of the 16 inner and outer slice pairs that are aligned in that particular rotation. For instance, a rotation of 5 might be represented by an (outer, inner) pair of (7, 2), or (2, 13) – in terms of slice labels, these pairs would be (“Donut World”, “Larry Moe”) and (“Gemeinschaft”, “Cowboy”). To convert such a pair to a rotation, just subtract the “inner” index from the “outer” index, modulo 16. (The modulo computation addresses the fact that indices -11, 5, and 21 all refer to the same slice on a 16-segment disk.)
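The pair-to-rotation conversion is simple enough to try out directly. In this Python sketch (the function name is mine), the % operator does the wrapping; masking with 0x0F would give the same result:

```python
def rotation(outer, inner):
    """Rotation of the disks implied by an aligned (outer, inner) slice pair."""
    return (outer - inner) % 16

print(rotation(7, 2), rotation(2, 13))  # 5 5
```

Both example pairs come out as a rotation of 5, even though the second subtraction passes through -11 along the way.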

The rotation tells us which slice of the larger disk is aligned with the “0” slice of the smaller disk. If we add the slice characteristic of a particular window to this rotation (modulo 16), we will compute the index of the slice of the larger disk which lies behind that window. This computed index, combined with the window’s rank, yields the (rank, slice) pair of a code, which can ultimately be checked against user input.

Now we can understand the ASM fragment we are currently examining. The BYTE array at DS:2056 contains these 16 values:

  • 00 00 01 01 02 03 03 04 05 05 06 06 07 08 09 09

These values are the “slice” halves of the window characteristics; they describe, for each window, that window’s clockwise rotation from the “Ratz” slice of the smaller disk.

The code adds the outer ring index to, and subtracts the inner ring index from, a slice offset taken from this array; this effectively adds the disk’s rotation to the slice offset, and computes the index of the slice of the larger disk which lies behind the verification query’s window. By storing only the low 4 bits of this index, the code performs a modulo 16 operation on the computed index, ensuring it falls between 0 and 15.
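For readers who find code easier to follow than prose, here is a rough Python rendering of the same seven instructions. The names are my own; the table is the 16 bytes listed above:

```python
# Contents of the BYTE array at DS:2056, as listed above.
WINDOW_SLICE = bytes.fromhex("00 00 01 01 02 03 03 04 05 05 06 06 07 08 09 09")

def lut_column(outer, inner, window):
    """Mirror the instructions at CS:5ABC through CS:5ACE."""
    ax = WINDOW_SLICE[window]    # MOV AL, [BX+2056] / SUB AH, AH
    ax = (ax + outer) & 0xFFFF   # ADD AX, [BP-04]
    ax = (ax - inner) & 0xFFFF   # SUB AX, [BP-06]
    return ax & 0x000F           # AND AX, 000F

# Window 2 has a slice offset of 1; with a rotation of 5 it sits over slice 6.
print(lut_column(outer=7, inner=2, window=2))  # 6
```

The intermediate 16-bit masks imitate register wraparound; only the final AND affects the result.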

That’s probably too much explanation for 21 bytes of code, which is why I prefer programming to writing.

Calculate LUT row and index

13DB:5AD1 8A876620      MOV     AL, [BX+2066]
13DB:5AD5 2AE4          SUB     AH, AH
13DB:5AD7 B104          MOV     CL, 04
13DB:5AD9 D3E0          SHL     AX, CL
13DB:5ADB 0146FE        ADD     [BP-02], AX

Look up the window index in the BYTE array at DS:2066, multiply the value there by 16, and add the result to the slice index computed by the previous piece of code.

The BYTE array at DS:2066 contains these 16 values:

  • 02 05 00 07 03 01 06 04 00 07 02 04 06 01 03 05

These values are the “rank” halves of the window characteristics; they describe, for each window, the rank of the codes exposed on the larger disk by that window. When a rank is multiplied by 16 and added to a slice index, the result is an index into a 16 column table stored in row-major form.
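Again, a small Python sketch may help; the function name and the choice to pass the column in as a parameter are mine:

```python
# Contents of the BYTE array at DS:2066, as listed above.
WINDOW_RANK = bytes.fromhex("02 05 00 07 03 01 06 04 00 07 02 04 06 01 03 05")

def lut_index(column, window):
    """Combine a column (0-15) with the window's rank into a row-major index."""
    rank = WINDOW_RANK[window]   # MOV AL, [BX+2066] / SUB AH, AH
    return (rank << 4) + column  # SHL AX, CL (with CL = 4) / ADD [BP-02], AX

print(lut_index(column=1, window=2))  # 1: rank 0, column 1
print(lut_index(column=0, window=0))  # 32: rank 2, column 0
```

In (rank, slice) terms these are (0, 1) and (2, 0), the table positions of the 021655 and 44312 examples given earlier.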

Calculate LUT value

13DB:5ADE 8B5EFE        MOV     BX, [BP-02]
13DB:5AE1 D1E3          SHL     BX, 1
13DB:5AE3 8B877620      MOV     AX, [BX+2076]
13DB:5AE7 8946FE        MOV     [BP-02], AX

Use the index we just computed to retrieve a WORD from the WORD array at DS:2076, and store it in the WORD at [BP-02]. The WORD array at DS:2076 begins with these values:

  • 9BD1 23AD 97B7 ...

These don’t have any obvious relationship to the 1st three codewheel codes (115721, 021655, 113667), but let’s see what the rest of the code does …

Input processing – loop setup

13DB:5AEA 2BC0          SUB     AX, AX
13DB:5AEC 8946FC        MOV     [BP-04], AX
13DB:5AEF 8946FA        MOV     [BP-06], AX
13DB:5AF2 EB03          JMP     5AF7

Initialize a loop: Set the WORDs at [BP-04] and [BP-06] to zero, then skip the “increment” step of the loop (at CS:5AF4) by jumping to the instruction at CS:5AF7.

Input processing – loop increment

13DB:5AF4 FF46FC        INC     WORD PTR [BP-04]

The “increment” step of a loop: add 1 to the WORD at [BP-04], which is, presumably, the loop counter.

Input processing – loop test

13DB:5AF7 8B5EFC        MOV     BX, [BP-04]
13DB:5AFA 80BF84693C    CMP     BYTE PTR [BX+6984], 3C
13DB:5AFF 7412          JZ      5B13

Check if the Ith element of the BYTE array at DS:6984 (where I is the loop counter, the WORD at [BP-04]) is equal to 0x3C, the ASCII character ‘<’. If it is, exit the loop by jumping to instruction CS:5B13.

Input processing – process character

13DB:5B01 B103          MOV     CL, 03
13DB:5B03 D366FA        SHL     WORD PTR [BP-06], CL
13DB:5B06 8A878469      MOV     AL, [BX+6984]
13DB:5B0A 98            CBW
13DB:5B0B 2D3000        SUB     AX, 0030
13DB:5B0E 0146FA        ADD     [BP-06], AX

Shift the WORD at [BP-06] left by 3 bits, and then add the difference between the Ith element of the BYTE array at DS:6984 (where I is the loop counter, the WORD at [BP-04]) and 0x30, or the ASCII character ‘0’. This has the effect of building up in the WORD at [BP-06] a packed, 3-bits-per-digit representation (a sort of 3-bit binary-coded-decimal, or BCD; equivalently, the digit string read as an octal number) of a series of ASCII digits stored at DS:6984 – assuming that no digit is greater than 7, of course.

Experiment reveals (enter “g 13db:5af7” into DOS DEBUG, then access the PAX terminal, enter some text, and check memory with “d 6984”) that whatever the user enters at the PAX verification prompt is stored at DS:6984 (followed by a ‘<’ character), and it can be seen that no code on the PAX codewheel contains an ‘8’ or ‘9’ digit. (Also, no 6-digit PAX code contains a leading digit other than ‘0’ or ‘1’, which makes sense, since a 16-bit WORD has space for only five 3-bit digits, and 1 extra bit.)

When the 1st 3 codewheel codes (115721, 021655, 113667) are encoded with the algorithm implemented by this loop, they match the 1st 3 elements of the WORD array at DS:2076 (9BD1 23AD 97B7).
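That match is easy to reproduce outside the debugger. The following Python sketch mirrors the loop as I understand it (the function name is mine, and the input buffers are simply the three codes followed by the ‘<’ terminator):

```python
def encode_input(buffer):
    """Pack ASCII digits from `buffer` 3 bits at a time, stopping at '<'."""
    value = 0                                        # the WORD at [BP-06]
    i = 0                                            # the WORD at [BP-04]
    while buffer[i] != 0x3C:                         # CMP BYTE PTR [BX+6984], 3C
        value = (value << 3) & 0xFFFF                # SHL WORD PTR [BP-06], CL
        value = (value + buffer[i] - 0x30) & 0xFFFF  # SUB AX, 0030 / ADD [BP-06], AX
        i += 1
    return value

for code in (b"115721<", b"021655<", b"113667<"):
    print(f"{encode_input(code):04X}")
# 9BD1
# 23AD
# 97B7
```

For any code whose digits are all 7 or less and which fits in 16 bits, this gives the same result as reading the digits as an octal number, e.g. int("115721", 8) in Python.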

Input processing – loop!

13DB:5B11 EBE1          JMP     5AF4

Loop back to instruction CS:5AF4; process the next character.

Delay

13DB:5B13 B81400        MOV     AX, 0014
13DB:5B16 50            PUSH    AX
13DB:5B17 E816DF        CALL    3A30
13DB:5B1A 83C402        ADD     SP, +02

Call the function at CS:3A30 with an argument of 0x14, then delete the argument from the stack. Its purpose is unclear. By setting breakpoints at 13DB:5B17 and 13DB:5B1A, however, and doing a little “wristwatch benchmarking”, it seems that CS:3A30 generates the delay experienced by the user during PAX verification; it may serve no function other than giving the user the idea that the PAX system is taking a while to perform validation.

Let’s take a moment to contemplate code written specifically to make a 1989-era PC respond more slowly, in order to make a simulated 21st century computer seem more realistic.

Ok, that was fun. Let’s move on.

Cursor positioning

13DB:5B1D B85800        MOV     AX, 0058
13DB:5B20 50            PUSH    AX
13DB:5B21 50            PUSH    AX
13DB:5B22 E88AF0        CALL    4BAF
13DB:5B25 83C404        ADD     SP, +04

Position the cursor at character position (0x0B, 0x0B).

Validation test

13DB:5B28 8B46FE        MOV     AX, [BP-02]
13DB:5B2B 3946FA        CMP     [BP-06], AX
13DB:5B2E 741F          JZ      5B4F

Jump to CS:5B4F iff the value computed from the verification problem’s inner, outer, and window codewheel indexes (stored in the WORD at [BP-02]) matches that computed based upon user input (stored in the WORD at [BP-06]).

This code controls whether the function exits successfully, or loops and queries the user again.
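Putting the pieces together, the whole check might be summarized by the Python sketch below. All of the names are mine, and the code table is deliberately incomplete: it contains only the three WORDs from DS:2076 quoted in this article, so it can only answer queries that land on one of those three entries.

```python
WINDOW_SLICE = bytes.fromhex("00 00 01 01 02 03 03 04 05 05 06 06 07 08 09 09")  # DS:2056
WINDOW_RANK = bytes.fromhex("02 05 00 07 03 01 06 04 00 07 02 04 06 01 03 05")   # DS:2066
# Only the first three WORDs at DS:2076 are quoted in the article; the rest of
# the 8 x 16 table is left out rather than guessed.
KNOWN_CODES = {0: 0x9BD1, 1: 0x23AD, 2: 0x97B7}

def expected_word(outer, inner, window):
    """Value the game would leave in [BP-02] for a given query."""
    column = (WINDOW_SLICE[window] + outer - inner) & 0x0F
    index = (WINDOW_RANK[window] << 4) + column
    return KNOWN_CODES[index]  # raises KeyError outside the quoted excerpt

def encode(digits):
    """Value the input loop would leave in [BP-06] for the typed digits."""
    value = 0
    for ch in digits:
        value = ((value << 3) + (ord(ch) - ord("0"))) & 0xFFFF
    return value

def verify(outer, inner, window, typed):
    """The CMP/JZ at CS:5B2B: succeed only if the two WORDs are equal."""
    return encode(typed) == expected_word(outer, inner, window)

print(verify(0, 0, 2, "021655"))  # True
print(verify(1, 2, 2, "115721"))  # True
print(verify(0, 0, 2, "123456"))  # False
```

Window 2 (“Asano Computing”) has rank 0 and slice offset 1, which is why these example queries fall within the first few table entries.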

Failure message

13DB:5B30 B80200        MOV     AX, 0002
13DB:5B33 50            PUSH    AX
13DB:5B34 B8A254        MOV     AX, 54A2
13DB:5B37 50            PUSH    AX
13DB:5B38 E89FF0        CALL    4BDA
13DB:5B3B 83C404        ADD     SP, +04

Print the string at DS:54A2 – “ Access denied ”.

Mystery function 5

13DB:5B3E B80600        MOV     AX, 0006
13DB:5B41 50            PUSH    AX
13DB:5B42 9A6E33DB13    CALL    13DB:336E
13DB:5B47 83C402        ADD     SP, +02

Call the function at 13DB:336E with an argument of 6, then delete the argument from the stack. Its purpose is unclear.

Retry

13DB:5B4A E95AFE        JMP     59A7

Jump back to CS:59A7, which is just after the loading of “paxcodes.txh”, and just before the problem is generated or any text is written to the screen. (This jump is taken iff the earlier comparison of the WORDs at [BP-02] and [BP-06] failed.)

An orphan

13DB:5B4D EB1A          JMP     5B69

Unreachable code!

Success message

13DB:5B4F B80200        MOV     AX, 0002
13DB:5B52 50            PUSH    AX
13DB:5B53 B8B654        MOV     AX, 54B6
13DB:5B56 50            PUSH    AX
13DB:5B57 E880F0        CALL    4BDA
13DB:5B5A 83C404        ADD     SP, +04

Print the string at DS:54B6 – “ Access allowed ”.

Mystery function 5 (redux)

13DB:5B5D B80B00        MOV     AX, 000B
13DB:5B60 50            PUSH    AX
13DB:5B61 9A6E33DB13    CALL    13DB:336E
13DB:5B66 83C402        ADD     SP, +02

Call the function at 13DB:336E with an argument of 0xB, then delete the argument from the stack. Its purpose is unclear.

Stack cleanup

13DB:5B69 8BE5          MOV     SP, BP
13DB:5B6B 5D            POP     BP

Restore the caller’s stack frame; set the Stack Pointer equal to this frame’s Base Pointer, then restore the caller’s Base Pointer by popping it off the stack.

Return (at last!)

13DB:5B6C C3            RET

Return to caller.

Conclusions

What can we learn from this exercise? Well, even though we were working with an executable targeted to an older, simpler platform (a real-mode x86 processor running DOS), I think there are a few general lessons:

  • Executables can be read just like “source code”. They’re just written in a very unfriendly language. Perhaps only Perl is harder to read.
  • The first step in reading an executable is “mapping” it; finding the parts which implement the features you’re interested in.
  • Calls to known interrupts and system code can help you identify the parts of a program which are performing certain functions.
  • Data references are even more helpful, but require some understanding of how the program behaves. There is a small chicken-and-egg problem with using data references: you can’t find good places to break the program’s execution by searching for data references until you know a little bit about how the program organizes its data, and you can’t investigate run-time data organization until you can set good breakpoints. Resolve this by beginning with breakpoints on interrupts and system calls.
  • Better debugging tools (and better debugging support from the CPU) can make the process of understanding an executable much easier, but executables can be read using only the most primitive of tools.
  • If you’re stuck, making some guesses is always a good idea.
  • When trying to make sense of machine/assembly instructions, an understanding of the problem domain, and an eye for context, can be very helpful.
  • The process of understanding an executable is really just a matter of persistence, care, and diligence. They’ve given you the code; you just have to read it. (May not apply to executables for systems with DRM hardware, which you ought to avoid for that reason.)

…and last, but not least:

  • Neuromancer’s code wheel values are simply stored in a table, not derived from a formula computed at run-time. Unsurprising, perhaps, but vaguely disappointing to me.