This is a continuation of my previous post, where we built a simple bootloader in assembly language for the x86 processor and IBM PC architecture. It didn’t really do much except halt the processor, so in this post I want to get at least some text to display on the screen: the famous Hello, World! program. But before we write any code we should get familiar with the Intel x86 architecture and assembly language, starting with CPU registers.
The x86 processor has a number of so-called registers where you can store data. The amount of data that you can store varies per register. The first x86 processors could store 16 bits in the ax, bx, cx and dx general-purpose registers and in the si (source index), di (destination index), bp (base pointer) and sp (stack pointer) registers. The Intel 386 extended these to 32-bit registers called eax, ebx and so on. Modern processors like the Core 2 have rax, rbx and similar registers that can hold 64 bits. The register size is what we refer to when we say a processor is 64-bit.
If we want to store a value in a register we can use the mov instruction. The name looks and sounds like move, which is more or less what happens when it is used (conceptually at least; the value is actually copied). We call such a name a mnemonic: a short, readable name that is easier to remember than the numeric opcode it stands for (just like the hlt instruction in the previous post). Let’s see how the mov instruction works:
    mov ax, 42h
    mov bx, ax
In this example the hexadecimal value 42 is stored in ax, and the value in ax is then copied to bx. Notation matters, and it can differ per assembler. I will continue to use Netwide Assembler (NASM) syntax, with suffixes to indicate the base of numbers (h for hexadecimal). The mov instruction has two operands. The first is the target, the second is the value, which can be an actual value (line 1), the name of a register (line 2) or even a memory address where the value can be found (not shown). NASM assembles the code above into the following machine code:
    b8 42 00
    89 c3
Notice how the mov instruction has several different opcodes depending on the way it is used. The b8 opcode is for moving an immediate word (a 16-bit value in little-endian order, which is why 42h shows up as 42 00) to the ax register, and the 89 opcode is for moving the value in register ax to register bx (the c3 byte that follows encodes which registers are involved). You don’t need to remember that, just know that you can use ‘simple’ assembly notation and the assembler will figure out how to properly instruct the processor.
Memory and variables
In this example we hard-coded the values, but we can also use variables. Memory (RAM) is just a block of 8-bit (1 byte) storage locations, each having a unique address, starting at 0. To get a value stored in memory, the processor puts the address on the address bus and initiates a read to get the value back on the data bus. Communication between CPU, memory and I/O devices over an address, data and control bus, with programs and data sharing the same memory, is the basic Von Neumann architecture.
Programs and data are stored in RAM. Remember from the previous post that we can use pseudo-instructions in NASM to put arbitrary data in our program. This data will end up in RAM at some point when our program is loaded from disk. What if we want to use such data, how can we refer to it? Well, we can use effective addresses in NASM for that.
    answer dw 42h       ; Declare variable answer
    mov ax, [answer]    ; Use variable answer
NASM will analyze the source code in multiple passes and find out how many bytes in the output file are needed for all data and opcodes. Then it will know at which point in the file (offset) the value 42h is stored. It will then replace references to variables with this offset and make sure the right opcode is used for the mov instruction. There are two potential problems we need to be aware of though: we need to make sure that the processor does not interpret our data as instructions, and we need to take into account the location of our program in RAM when using offsets.
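As a rough illustration of how a name turns into an offset, the sketch below annotates each line with the offset and bytes NASM produces when the program is assumed to start at address 0. It only demonstrates the offsets and is not meant to be run as is, since the data comes first. The extra mov bx line is my own addition, to show the difference between using the address and using the value stored there.

```nasm
bits 16

answer  dw 42h              ; offset 0: bytes 42 00 (little-endian word)
        mov ax, [answer]    ; offset 2: A1 00 00, load the word stored at offset 0000h
        mov bx, answer      ; offset 5: BB 00 00, bx gets the offset itself, not the value
```

If you want to check this for your own code, NASM can write a listing file with the -l option, for example nasm -f bin example.asm -l example.lst (the file names here are placeholders).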
Instructions vs data
The BIOS loads our program into memory at a certain address and points the processor to that address (see: instruction pointer). In Bochs we can see in the output log:
Booting from 0000:7c00
What if the first byte of our program happens to be 0F4h? That is the opcode for the hlt instruction, so it would stop the processor. But 0F4h could also just be data (the decimal value 244). The following code will never execute past the first line, since it will halt the processor:
    var db 0F4h     ; Declare and initialize a variable
    mov ax, [var]
We need to make sure the processor jumps over our data and starts executing at a known point. We can use labels and the jmp instruction for that. A label is nothing more than a marker that NASM will resolve to an address. The jmp instruction instructs the processor to continue executing at the specified address.
    ; Jump to the label
    jmp code

    var db 0F4h

    code:
    mov ax, [var]
NASM will generate the opcode for jmp with a relative offset. Problem solved, as long as we remember to always structure our code this way. Depending on the environment you work in, this way of structuring code is common and might even be mandatory (the different parts of your file are called sections in that case).
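Here is a minimal sketch of what "relative offset" means in practice, again assuming the program starts at address 0. I use jmp short and a byte-sized load so that the encodings in the comments are fixed; the label names are the same as in the example above.

```nasm
bits 16

        jmp short code      ; EB 01: skip 1 byte, counted from the end of this instruction
var     db 0F4h             ; offset 2: our data byte, which is now never executed
code:   mov al, [var]       ; offset 3: A0 02 00, load the byte stored at offset 0002h
```

Because the jump distance is relative, the jmp keeps working no matter where the program ends up in memory. The memory reference in the last line is not relative, which matters once the program is loaded somewhere other than address 0.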
The program’s location in RAM
The other problem is that when effective addresses are resolved during assembly, NASM calculates addresses from the beginning of the program, which it assumes is at address 0 in RAM. But as we’ve seen in the output from Bochs, the program is actually loaded at 0x7C00. So the addresses will be wrong when our program is loaded and executed. We can see this if we debug the following program in Bochs:
    xor ax, ax
    mov ax, [var]
    jmp halt

    var dw 'A'

    halt:
    hlt

    times 510-($-$$) nop
    dw 0AA55h
When we step through the program we can see that the instruction on line 2 was assembled to the opcode a10800, which means mov ax, word ptr ds:0x8. So clearly the dereferencing of our variable is not working the way it should: the program lives at 0x7C00, so the variable is really at 0x7C08. To solve this we can simply add org 07C00h to the beginning of our program. During assembly NASM will use the org value for calculating offsets (if you step through the program you will see that the actual instruction has become mov ax, word ptr ds:0x7c08 and indeed our value 'A' is stored at that address in memory).
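One thing worth noting about the ds: prefix in the debugger output: the offset is combined with the data segment register, so org 07C00h only gives the right address when ds is 0. That was the case in my Bochs session, but as far as I know it is not guaranteed on every machine. A minimal sketch that does not rely on it, assuming we simply set ds to 0 ourselves:

```nasm
bits 16
org 7C00h

        xor ax, ax
        mov ds, ax          ; ds = 0, so [var] really means address 0000:7Cxx
        mov ax, [var]       ; loads 'A' (0041h) into ax
        jmp halt

var     dw 'A'

halt:   hlt

times 510-($-$$) nop
dw 0AA55h
```

The two extra instructions shift everything that follows, so the offset of var is no longer 0x7c08 here; NASM recalculates it for us.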
Now that we know how to solve some potential problems in our assembly programs, we can focus on the real program. Of course, good programmers are lazy (as they should be), so let’s see how we can use the BIOS to do some of our work for us.
An IBM PC offers services to programs running on the computer. These services can be triggered with the
int instruction (interrupt). When you use these instructions, the processor looks up the operand in the interrupt vector table to find an associated interrupt service routine (a handler). This is basically a callback: the handler is executed and control returns to the program. Because a context switch happens the state of your program is retained.
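To give an idea of what "looks up the operand in the interrupt vector table" means: in real mode the table starts at address 0 and has one 4-byte entry per interrupt number, holding the offset and the segment of the handler. The sketch below reads the entry for interrupt 10h by hand; the choice of bx and cx to hold the result is arbitrary.

```nasm
bits 16

        xor ax, ax
        mov es, ax              ; es = 0: the table starts at physical address 0
        mov bx, [es:10h*4]      ; offset part of the int 10h handler address
        mov cx, [es:10h*4+2]    ; segment part of the int 10h handler address
        ; int 10h pushes the flags, cs and ip, then jumps to cx:bx.
        ; The handler ends with iret, which pops them again.
```

You never have to do this lookup yourself, the int instruction does it for you. It does show that there is nothing magical about BIOS services: they are just routines whose addresses are stored in a well-known place.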
The BIOS implements several services, for instance keyboard input and video output (the VGA BIOS can implement the video services by extension). For most services there are multiple functions. To select the function you store a predefined value in the ah register before you call the interrupt. Some functions require operands to be stored in specific registers. For example, to write the character ! to screen you use the following instructions:
    mov al, '!'
    mov ah, 0Eh
    int 10h
On line 3 the video service (10h) interrupt is called. But before that, on line 2, the function teletype output is specified by storing the value 0Eh in register ah, and the operand for that function is stored in the al register on line 1. Now we can start to think about writing a message to the screen.
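The same pattern works for the other services. As a small sketch, the boot sector below combines the keyboard service (interrupt 16h, function 00h, which waits for a key and returns its ASCII code in al) with the teletype function to echo whatever you type. Setting bh to 0 to select display page 0 is my own precaution, not something the example above does.

```nasm
bits 16
org 7C00h

echo:
        mov ah, 00h         ; keyboard service, function 00h: wait for a key press
        int 16h             ; returns the ASCII code in al and the scan code in ah
        mov ah, 0Eh         ; video service, function 0Eh: teletype output
        mov bh, 0           ; display page 0 (assumption: we are on the default page)
        int 10h             ; print the character in al
        jmp echo            ; wait for the next key

times 510-($-$$) nop
dw 0AA55h
```

Note that ah has to be set again before every call, because the keyboard service overwrites it with the scan code.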
Strings and the index registers
It would be cumbersome to print a message character by character. Instead we want to specify our message in the form of a string and then have some way to print that in one go. For this purpose we can use the si index register (source index) in combination with the lodsb instruction.
Index registers can be used for indirect addressing, which means that the register holds the memory address of an operand. So if our string is stored in memory, and we have its starting address stored in an index register, we can use that register to move the value at the starting address into another register:
mov ax, [si]
Here, we don’t move the value of si itself into ax, but the value in memory at the address that is stored in si. If we increment the si register it points to the next character in our string. Because this is such a common combination of operations, the x86 instruction set has an instruction that does this in one go: lodsb. It stores the byte that si points to in al and increments si (as long as the direction flag is clear, which is the normal situation).
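Spelled out, lodsb does roughly the same as the two-instruction sequence in this sketch, assuming the direction flag is clear so that si moves forward:

```nasm
bits 16

        ; the long way
        mov al, [si]        ; load the byte at ds:si into al
        inc si              ; point si at the next byte

        ; the short way
        lodsb               ; al = byte at ds:si, then si = si + 1
```

The two are not completely identical: inc changes the processor flags and lodsb does not. For walking through a string that difference doesn't matter.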
Hello world, finally!
Now we can use all that we have learned so far to create a program in assembly that writes a message to the screen.
    org 07C00h

    mov si, msg

    print:
    lodsb
    cmp al, 0
    je halt

    mov ah, 0Eh
    int 10h
    jmp print

    halt:
    hlt

    msg db "Hello, world!", 0

    times 510-($-$$) nop
    dw 0AA55h
This program uses all the concepts we have discussed in this post. We begin with the org directive, which tells NASM to adjust its addressing to the location of our program in RAM (line 1). Then we store the starting address of a variable called msg in the source index register (line 3). We have placed the variable declaration at the end of the program so it can’t be executed accidentally (line 17). We copy the first value from memory into the low byte of the accumulator register and increment the source index register at the same time (line 6). We compare the value to 0, which marks the end of the string (line 7, the cmp instruction compares its operands), and if it is 0 we jump to the end of the program (line 8, the je instruction jumps if the operands of the previous cmp instruction were equal). If not, we set the high byte of the accumulator to 0Eh (line 10) and call the video service interrupt (line 11). This will print the current value in al. We repeat the process (line 12 effectively creates a loop) until we have read the 0 at the end of the string.
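The program works fine in Bochs, but it quietly relies on a few things the BIOS happened to set up for us. Below is a slightly more defensive sketch of the same program. The additions are my own assumptions about what could go wrong on other machines: ds might not be 0, the direction flag might be set, and hlt only sleeps until the next hardware interrupt, after which the processor would continue into our string.

```nasm
bits 16
org 7C00h

start:
        xor ax, ax
        mov ds, ax          ; ds = 0, so offsets based on org 7C00h are correct
        cld                 ; clear the direction flag: lodsb moves si forward
        mov si, msg

print:
        lodsb               ; al = next character, si = si + 1
        cmp al, 0           ; 0 marks the end of the string
        je halt
        mov ah, 0Eh         ; teletype output
        mov bh, 0           ; display page 0 (assumption: default page)
        int 10h
        jmp print

halt:
        hlt                 ; sleep until the next interrupt...
        jmp halt            ; ...and go back to sleep instead of running into msg

msg     db "Hello, world!", 0

times 510-($-$$) nop
dw 0AA55h
```

Assuming the file is called hello.asm (a placeholder name), it can be assembled with nasm -f bin hello.asm -o hello.bin and booted in Bochs the same way as before.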
Well, it took some time to arrive at our final destination, but we did cover a lot of ground. There’s much more to learn about assembly programming, and I might revisit this subject in the future, because I feel that understanding how the machine works at this low level helps to understand many high-level concepts. In the end it should help you write better code, or at least get a better understanding of the implications of your choices. And because I think it is great fun to learn!