8086 Opcode Redundancies

Looking over the opcode map for the 8086 processor, I was struck by the seemingly large number of redundant opcodes. I decided to determine exactly how many were redundant, and I was surprised to find that just over 30% of the available 1-byte opcode space was assigned to redundant, short-form opcodes. While I’m sure Intel had good reasons to burn these opcodes at the time (most likely related to code size or performance), the scale of the waste gives some indication of just how surprised Intel must have been to see this instruction set architecture live on for 31 years (and counting).

Redundancy

It can be a little tricky to define a “redundant” opcode. In principle, for instance, one can do without almost any particular opcode, given that there is usually some combination of others which can be used in its place. Therefore, I define an opcode as redundant iff:

  • Any instruction containing that opcode can be replaced with some single alternate instruction containing a different opcode. An opcode is not redundant if instructions containing it can only be replaced with multi-instruction sequences. The replacement instruction may be longer than the original instruction.
  • The operand size(s) of the replacement instruction are the same as those of the original instruction. For instance, this rule means that EB is not redundant with E9; even though any short jump can be replaced with a near jump, the operand sizes are different.
  • The opcode is not one of the “special cases” 90, C3, CB, or CC, all of which seem (to me) to have good reason to have special short forms distinct from their more verbose cousins 87, C2, CA, and CD.
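To make the first two rules concrete, here is a minimal Python sketch that encodes one instruction, ADD AL, imm8, both ways. The helper name `modrm` and the immediate value 0x12 are arbitrary choices for illustration; the field layout (2-bit mod, 3-bit reg, 3-bit r/m) is the standard ModR/M layout.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModR/M fields (2, 3, and 3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm


def hexdump(code: bytes) -> str:
    """Render machine code as space-separated uppercase hex bytes."""
    return " ".join(f"{b:02X}" for b in code)


# ADD AL, 0x12 in its short form: opcode 04 followed by the immediate.
short_form = bytes([0x04, 0x12])

# The same instruction through the general opcode 80:
# mod=11 (register operand), reg=000 (selects ADD), r/m=000 (AL).
long_form = bytes([0x80, modrm(0b11, 0b000, 0b000), 0x12])

print(hexdump(short_form))  # 04 12
print(hexdump(long_form))   # 80 C0 12
```

Both encodings operate on the same 8-bit operands, so 04 counts as redundant under the rules above, even though the replacement is one byte longer.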

The List

Here are all the redundant opcodes in the 8086 instruction set, and alternate ways to encode instructions containing them:

Opcode Redundant with
04 Opcode 80, with a ModR/M of C0
05 Opcode 81, with a ModR/M of C0
0C Opcode 80, with a ModR/M of C8
0D Opcode 81, with a ModR/M of C8
14 Opcode 80, with a ModR/M of D0
15 Opcode 81, with a ModR/M of D0
1C Opcode 80, with a ModR/M of D8
1D Opcode 81, with a ModR/M of D8
24 Opcode 80, with a ModR/M of E0
25 Opcode 81, with a ModR/M of E0
2C Opcode 80, with a ModR/M of E8
2D Opcode 81, with a ModR/M of E8
34 Opcode 80, with a ModR/M of F0
35 Opcode 81, with a ModR/M of F0
3C Opcode 80, with a ModR/M of F8
3D Opcode 81, with a ModR/M of F8
40 Opcode FF, with a ModR/M of C0
41 Opcode FF, with a ModR/M of C1
42 Opcode FF, with a ModR/M of C2
43 Opcode FF, with a ModR/M of C3
44 Opcode FF, with a ModR/M of C4
45 Opcode FF, with a ModR/M of C5
46 Opcode FF, with a ModR/M of C6
47 Opcode FF, with a ModR/M of C7
48 Opcode FF, with a ModR/M of C8
49 Opcode FF, with a ModR/M of C9
4A Opcode FF, with a ModR/M of CA
4B Opcode FF, with a ModR/M of CB
4C Opcode FF, with a ModR/M of CC
4D Opcode FF, with a ModR/M of CD
4E Opcode FF, with a ModR/M of CE
4F Opcode FF, with a ModR/M of CF
50 Opcode FF, with a ModR/M of F0
51 Opcode FF, with a ModR/M of F1
52 Opcode FF, with a ModR/M of F2
53 Opcode FF, with a ModR/M of F3
54 Opcode FF, with a ModR/M of F4
55 Opcode FF, with a ModR/M of F5
56 Opcode FF, with a ModR/M of F6
57 Opcode FF, with a ModR/M of F7
58 Opcode 8F, with a ModR/M of C0
59 Opcode 8F, with a ModR/M of C1
5A Opcode 8F, with a ModR/M of C2
5B Opcode 8F, with a ModR/M of C3
5C Opcode 8F, with a ModR/M of C4
5D Opcode 8F, with a ModR/M of C5
5E Opcode 8F, with a ModR/M of C6
5F Opcode 8F, with a ModR/M of C7
82 Opcode 80
91 Opcode 87, with a ModR/M of C8
92 Opcode 87, with a ModR/M of D0
93 Opcode 87, with a ModR/M of D8
94 Opcode 87, with a ModR/M of E0
95 Opcode 87, with a ModR/M of E8
96 Opcode 87, with a ModR/M of F0
97 Opcode 87, with a ModR/M of F8
A0 Opcode 8A, with a ModR/M of 06
A1 Opcode 8B, with a ModR/M of 06
A2 Opcode 88, with a ModR/M of 06
A3 Opcode 89, with a ModR/M of 06
A8 Opcode F6, with a ModR/M of C0
A9 Opcode F7, with a ModR/M of C0
B0 Opcode C6, with a ModR/M of C0
B1 Opcode C6, with a ModR/M of C1
B2 Opcode C6, with a ModR/M of C2
B3 Opcode C6, with a ModR/M of C3
B4 Opcode C6, with a ModR/M of C4
B5 Opcode C6, with a ModR/M of C5
B6 Opcode C6, with a ModR/M of C6
B7 Opcode C6, with a ModR/M of C7
B8 Opcode C7, with a ModR/M of C0
B9 Opcode C7, with a ModR/M of C1
BA Opcode C7, with a ModR/M of C2
BB Opcode C7, with a ModR/M of C3
BC Opcode C7, with a ModR/M of C4
BD Opcode C7, with a ModR/M of C5
BE Opcode C7, with a ModR/M of C6
BF Opcode C7, with a ModR/M of C7
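
The regularity of the table is easier to see when it is generated rather than listed. The following Python sketch rebuilds the same mapping from the ModR/M field layout. It is only an illustration, not necessarily how the list above was produced, and the function names are hypothetical.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModR/M fields (2, 3, and 3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm


def redundant_opcodes() -> dict:
    """Map each short-form opcode to its long-form 'opcode ModR/M' string."""
    table = {}
    # Accumulator-immediate ALU forms: 04/05, 0C/0D, ..., 3C/3D.
    # The operation number (ADD, OR, ADC, SBB, AND, SUB, XOR, CMP) goes
    # in the reg field of opcodes 80 (byte) and 81 (word).
    for op in range(8):
        m = modrm(0b11, op, 0)
        table[op * 8 + 4] = f"80 {m:02X}"
        table[op * 8 + 5] = f"81 {m:02X}"
    for r in range(8):
        table[0x40 + r] = f"FF {modrm(0b11, 0, r):02X}"  # INC r16
        table[0x48 + r] = f"FF {modrm(0b11, 1, r):02X}"  # DEC r16
        table[0x50 + r] = f"FF {modrm(0b11, 6, r):02X}"  # PUSH r16
        table[0x58 + r] = f"8F {modrm(0b11, 0, r):02X}"  # POP r16
        table[0xB0 + r] = f"C6 {modrm(0b11, 0, r):02X}"  # MOV r8, imm8
        table[0xB8 + r] = f"C7 {modrm(0b11, 0, r):02X}"  # MOV r16, imm16
    table[0x82] = "80"  # 82 behaves as an alias of 80
    # XCHG AX, r16 for r = CX..DI; 90 (NOP) is excluded as a special case.
    for r in range(1, 8):
        table[0x90 + r] = f"87 {modrm(0b11, r, 0):02X}"
    # MOV between the accumulator and a direct address:
    # mod=00 with r/m=110 selects a 16-bit direct address.
    for short, long_op in ((0xA0, 0x8A), (0xA1, 0x8B), (0xA2, 0x88), (0xA3, 0x89)):
        table[short] = f"{long_op:02X} {modrm(0b00, 0, 0b110):02X}"
    table[0xA8] = f"F6 {modrm(0b11, 0, 0):02X}"  # TEST AL, imm8
    table[0xA9] = f"F7 {modrm(0b11, 0, 0):02X}"  # TEST AX, imm16
    return table


if __name__ == "__main__":
    for opcode, replacement in sorted(redundant_opcodes().items()):
        print(f"{opcode:02X} -> {replacement}")
```

Calling `len(redundant_opcodes())` is a quick way to cross-check the number of rows in the list.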

Counts

There are 78 redundant opcodes, as I have defined them. They occupy 30% of the available 1-byte opcode space, and they account for 33% of the opcodes defined when the 8086 was launched. If you relax my latter two constraints you could probably find another half-dozen or so redundant opcodes. Of course, one could also begin to argue about the wisdom of including support for certain instructions (e.g. the BCD stuff, or string operations).
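
The arithmetic behind these percentages is easy to check. In the Python sketch below, the count of 78 and the 33% figure come from the text; the number of defined opcodes is back-calculated from that percentage as a rough estimate, not taken from an opcode map.

```python
REDUNDANT = 78           # redundant opcodes, as counted above
OPCODE_SPACE = 256       # every possible 1-byte opcode value
DEFINED_FRACTION = 0.33  # share of defined opcodes, as stated above


def share_of_opcode_space() -> float:
    """Fraction of the 1-byte opcode space taken by redundant opcodes."""
    return REDUNDANT / OPCODE_SPACE


def implied_defined_opcodes() -> int:
    """Rough number of defined opcodes implied by the 33% figure."""
    return round(REDUNDANT / DEFINED_FRACTION)


print(f"{share_of_opcode_space():.1%} of the 1-byte opcode space")
print(f"about {implied_defined_opcodes()} defined opcodes implied")
```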

I think it’s pretty clear that this architecture was never intended to survive so long. It’s a testament to everyone involved that it has, given that the legacy of its early days has never been particularly helpful from a programmer’s standpoint.
