MIPS instructions fall into three categories: R-type, I-type, and J-type. Not all ISAs divide their instructions this neatly, which is one reason to study MIPS as a first assembly language: the format is simple.
This is the format of the R-type instruction, when it is encoded in machine code.
| B31-26 | B25-21 | B20-16 | B15-11 | B10-6 | B5-0 |
| opcode | register s | register t | register d | shift amount | function |
The prototypical R-type instruction is:
add $rd, $rs, $rt
where $rd refers to some register d. Here d is a placeholder: to use the instruction, you must replace it with a register number between 0 and 31, inclusive. $rs and $rt are also registers.
The semantics of the instruction are:
R[d] = R[s] + R[t]
where the addition is signed addition.
You will notice that the order of the registers in the instruction is the destination register ($rd), followed by the two source registers ($rs and $rt).
However, the actual binary format (shown in the table above) stores the two source registers first, then the destination register. Thus, how the assembly language programmer uses the instruction, and how the instruction is stored in binary, do not always have to match.
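The mismatch between assembly order and binary order can be seen by pulling the fields out of an instruction word. The following is a minimal sketch, not a full disassembler: the helper names `decode_r_type` and `to_assembly_order` are made up for illustration, and the sample word 0x012A4020 is assumed to encode an add.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # B31-26
        "rs":     (word >> 21) & 0x1F,  # B25-21
        "rt":     (word >> 16) & 0x1F,  # B20-16
        "rd":     (word >> 11) & 0x1F,  # B15-11
        "shamt":  (word >> 6) & 0x1F,   # B10-6
        "funct":  word & 0x3F,          # B5-0
    }


def to_assembly_order(fields, mnemonic="add"):
    """Print the registers in assembly order: destination first."""
    return f"{mnemonic} ${fields['rd']}, ${fields['rs']}, ${fields['rt']}"


fields = decode_r_type(0x012A4020)
# Binary order is rs, rt, rd; assembly order is rd, rs, rt.
print(fields["rs"], fields["rt"], fields["rd"])
print(to_assembly_order(fields))
```

In the word, the source registers (9 and 10) come before the destination (8), while the printed assembly lists the destination first.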
Let's explain each of the fields of the R-type instruction.
Opcode is short for "operation code". The opcode is a binary encoding for the instruction. Opcodes are seen in all ISAs. In MIPS, there is an opcode for add.
The opcode in MIPS ISA is only 6 bits. Ordinarily, this means there are only 64 possible instructions. Even for a RISC ISA, which typically has few instructions, 64 is quite small. For R-type instructions, an additional 6 bits are used (B5-0) called the function. Thus, the 6 bits of the opcode and the 6 bits of the function specify the kind of instruction for R-type instructions.
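As a rough illustration of how the opcode and function fields work together, the sketch below looks up a mnemonic from both. It relies on the standard MIPS32 encoding, in which these R-type instructions share opcode 0 and are told apart by the function field. The table is deliberately partial, and the name `r_type_mnemonic` is hypothetical.

```python
# Partial table of function codes for R-type instructions (opcode 0).
R_TYPE_FUNCT = {
    0x20: "add",
    0x22: "sub",
    0x24: "and",
    0x25: "or",
    0x2A: "slt",
}


def r_type_mnemonic(word):
    """Return the mnemonic for a known R-type word, or None otherwise."""
    opcode = (word >> 26) & 0x3F  # B31-26
    funct = word & 0x3F           # B5-0
    if opcode != 0:
        return None               # not an opcode-0 R-type instruction
    return R_TYPE_FUNCT.get(funct)


print(r_type_mnemonic(0x012A4020))
print(r_type_mnemonic(0x012A4022))
```

Because the function field adds 6 more bits, a single opcode value can cover up to 64 different R-type operations.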
Register d (B15-11): This is the destination register. The destination register is the register where the result of the operation is stored.
Register s (B25-21): This is the first source register. A source register is a register that holds one of the arguments of the operation.
Register t (B20-16): This is the second source register.
Shift amount (B10-6): The number of bits to shift. Used in shift instructions.
Function (B5-0): An additional 6 bits used to specify the operation, in addition to the opcode.
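Putting the fields together, an R-type word can be packed with shifts and ORs. This is a minimal sketch under the field layout shown in the table above; `encode_r_type` is an illustrative name, and the example values (opcode 0, function 0x20 for add) come from the standard MIPS32 encoding rather than from these notes.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    widths = (("opcode", opcode, 6), ("rs", rs, 5), ("rt", rt, 5),
              ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6))
    for name, value, width in widths:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


# add $8, $9, $10: rd = 8, rs = 9, rt = 10
word = encode_r_type(opcode=0, rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:#010x}")  # 0x012a4020
```

The range check reflects the field widths: a register number above 31, for example, cannot be represented in 5 bits.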
This is the format of the I-type instruction, when it is encoded in machine code.
| B31-26 | B25-21 | B20-16 | B15-0 |
| opcode | register s | register t | immediate |
The prototypical I-type instruction looks like:
addi $rt, $rs, immed
In this case, $rt is the destination register, and $rs is the only source register. It may seem unusual that $rd is not used, and that the destination register does not occupy the same bit positions in both formats: it sits in B15-11 for R-type instructions but in B20-16 for I-type instructions. Presumably, the designers of the MIPS ISA had their reasons for not placing the destination register at a fixed location for R-type and I-type.
The semantics of the addi instruction are:
R[t] = R[s] + (IR_15)^16 IR_{15-0}
where IR refers to the instruction register, the register where the current instruction is stored. (IR_15)^16 means that bit B15 of the instruction register (which is the sign bit of the immediate value) is repeated 16 times. This is then followed by IR_{15-0}, which is the 16 bits of the immediate value.
Basically, the semantics say to sign-extend the immediate value to 32 bits, add it (using signed addition) to register R[s], and store the result in register $rt.
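The addi semantics can be sketched in a few lines. This is an illustration only: the register file is modeled as a plain Python list, results are wrapped to 32 bits, and the sketch ignores overflow detection and any special treatment of register 0. The names `sign_extend_16` and `execute_addi` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16(imm16):
    """Copy bit 15 into bits 31-16 to widen a 16-bit value to 32 bits."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:            # sign bit set: fill upper half with 1s
        return imm16 | 0xFFFF0000
    return imm16                  # sign bit clear: upper half stays 0


def execute_addi(regs, word):
    """Apply R[t] = R[s] + sign-extended immediate (no overflow check)."""
    rs = (word >> 21) & 0x1F      # B25-21
    rt = (word >> 16) & 0x1F      # B20-16
    imm = sign_extend_16(word & 0xFFFF)
    regs[rt] = (regs[rs] + imm) & MASK32


regs = [0] * 32
regs[9] = 5
execute_addi(regs, 0x2128FFFF)    # addi $8, $9, -1
print(regs[8])                    # 4
```

The immediate 0xFFFF becomes 0xFFFFFFFF after sign extension, which acts as -1 when added modulo 2^32.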
This is the format of the J-type instruction, when it is encoded in machine code.
| B31-26 | B25-0 |
| opcode | target |
The prototypical J-type instruction looks like:
j target
The semantics of the j instruction (j means jump) are:
PC <- PC_{31-28} IR_{25-0} 00
where PC is the program counter, which stores the address of the instruction being executed. You update the PC by using the upper 4 bits of the program counter, followed by the 26 bits of the target (which is the lower 26 bits of the instruction register), followed by two 0's, which creates a 32-bit address. The two 0's can be implied because every MIPS instruction is 4 bytes long and word-aligned, so the low two bits of an instruction address are always 0. The jump instruction will be explained in more detail in a future set of notes.
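The address computation described above can be sketched as follows. It follows the simplified semantics given in these notes (upper 4 bits taken from the PC as described); `jump_target` is an illustrative name, and the sample PC and instruction word are made-up values.

```python
def jump_target(pc, word):
    """Build the 32-bit jump address from the PC and a J-type word."""
    upper = pc & 0xF0000000       # keep the upper 4 bits of the PC
    target = word & 0x03FFFFFF    # lower 26 bits of the instruction
    return upper | (target << 2)  # shifting left by 2 appends two 0 bits


print(hex(jump_target(0x00400018, 0x08100000)))  # 0x400000
```

Since only 26 bits of target are stored, a jump can reach only addresses that share the upper 4 bits of the current PC.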
MIPS supports 32 integer registers. To specify a register, each register is identified with a number from 0 to 31. It takes log2(32) = 5 bits to specify one of 32 registers.
If MIPS had 64 registers, you would need 6 bits to specify a register.
The register number is specified using unsigned binary. Thus, 00000 refers to register $r0 and 11111 refers to register $r31.
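The bit-count arithmetic is easy to check. The helper below is a small sketch (the name `bits_for_registers` is made up) that computes how many bits are needed to number a given count of registers, and shows the unsigned binary form of a register number.

```python
def bits_for_registers(count):
    """Bits needed to give each of `count` registers a distinct number."""
    # The largest register number is count - 1; bit_length() gives the
    # number of bits needed to write it in unsigned binary.
    return (count - 1).bit_length()


print(bits_for_registers(32))  # 5
print(bits_for_registers(64))  # 6
print(format(31, "05b"))       # 11111
```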
Furthermore, you begin to realize what information the instructions store. For example, it's not all that obvious that immediate values are stored as part of the instruction for I-type instructions.
If you know that, for example, addi does signed addition, then you can also conclude that the immediate value is represented in two's complement (2C). Also, adding the immediate value to a 32-bit register value means first sign-extending the immediate value to 32 bits.
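However, not all I-type instructions treat the 16-bit immediate as a two's complement value. For example, the logical instructions andi, ori, and xori interpret the 16 bits as unsigned binary (UB): they zero-extend the immediate before combining it with the value stored in a 32-bit register. Despite its name, addiu (add immediate unsigned) is not one of them: it sign-extends its immediate just as addi does, and differs from addi only in that it does not trap on overflow.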
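The two ways of widening a 16-bit immediate give different 32-bit values whenever bit 15 is set. The sketch below contrasts them on the same bit pattern; the function names are illustrative and not part of any MIPS tool.

```python
def sign_extend_16(imm16):
    """Widen to 32 bits by repeating bit 15 in the upper 16 bits."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def zero_extend_16(imm16):
    """Widen to 32 bits by filling the upper 16 bits with 0s."""
    return imm16 & 0xFFFF


def as_signed_32(value):
    """Read a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


imm = 0xFFFF
print(f"{sign_extend_16(imm):#010x}", as_signed_32(sign_extend_16(imm)))
print(f"{zero_extend_16(imm):#010x}", as_signed_32(zero_extend_16(imm)))
```

The same 16 bits read as -1 after sign extension and as 65535 after zero extension, so the instruction's definition has to say which one applies.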
How did they manage that? Here's an example of such an instruction:
cisc_add $r1, $r2 # R[1] = R[1] + R[2]
One way to reduce the total number of operands is to make one operand both a source and a destination register.
Another approach is to use an implicit register.
acc_add $r2 # Acc = Acc + R[2]
For example, there may be a special register called the accumulator.
This register is not mentioned explicitly in the instruction. Instead,
it is implied by the opcode.
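The three operand styles can be compared side by side on a toy machine. This is purely illustrative: `ToyMachine` and its methods are invented for this sketch, with `cisc_add` and `acc_add` modeling the hypothetical instructions used in the examples above.

```python
class ToyMachine:
    """A toy register file with an accumulator, for comparing operand styles."""

    def __init__(self):
        self.regs = [0] * 32
        self.acc = 0

    def add3(self, d, s, t):
        # Three operands: R[d] = R[s] + R[t]
        self.regs[d] = self.regs[s] + self.regs[t]

    def cisc_add(self, d, s):
        # Two operands: R[d] is both a source and the destination.
        self.regs[d] = self.regs[d] + self.regs[s]

    def acc_add(self, s):
        # One operand: the accumulator is implied by the opcode.
        self.acc = self.acc + self.regs[s]


m = ToyMachine()
m.regs[1], m.regs[2] = 3, 4
m.add3(3, 1, 2)      # R[3] = 7, R[1] and R[2] unchanged
m.cisc_add(1, 2)     # R[1] = 7, the old value of R[1] is overwritten
m.acc_add(2)         # Acc = 4
print(m.regs[3], m.regs[1], m.acc)
```

Fewer explicit operands means fewer bits per instruction, at the cost of overwriting a source or routing everything through one register.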
Early personal computers, such as the Apple II, used ISAs with only 1 or 2 registers. Those registers took part in most instructions, so they didn't have to be specified.
As memory and memory access have become cheaper, it has become easier to devote more bits to an instruction, and to specify three operands instead of two. This makes things more convenient for the assembly language programmer.