
3. Hello ARM

In this section, you will learn to assemble a simple ARM program, and test it on a bare metal connex board emulated by Qemu.

The assembly program source file consists of a sequence of statements, one per line. Each statement has the following format.

label:    instruction         @ comment

Each of the components is optional.

The label is a convenient way to refer to the location of the instruction in memory. The label can be used wherever an address can appear, for example as an operand of the branch instruction. The label name should consist of letters, digits, _ and $.
A comment starts with an @, and the characters that appear after an @ are ignored.
The instruction could be an ARM instruction or an assembler directive. Assembler directives are commands to the assembler. Assembler directives always start with a . (period).

Here is a very simple ARM assembly program to add two numbers.

Listing 1. Adding Two Numbers

        .text
start:                       @ Label, not really required
        mov   r0, #5         @ Load register r0 with the value 5
        mov   r1, #4         @ Load register r1 with the value 4
        add   r2, r1, r0     @ Add r0 and r1 and store in r2

stop:   b stop               @ Infinite loop to stop execution

The .text assembler directive specifies that the following instructions are to be assembled into the code section, rather than the .data section. Sections are covered in detail later in the tutorial.

3.1. Building the Binary

Save the program in a file, say add.s. To assemble the file, invoke the GNU toolchain’s assembler, as, as shown in the following command.

$ arm-none-eabi-as -o add.o add.s

The -o option specifies the output filename.

[Note] Note

Cross toolchains are always prefixed with the target architecture for which they are built, to avoid name conflicts with the host toolchain. For the sake of readability, tools will be referred to without the prefix in the text.

To generate the executable file, invoke the GNU Toolchain’s linker ld, as shown in the following command.

$ arm-none-eabi-ld -Ttext=0x0 -o add.elf add.o

Here again, the -o option specifies the output filename. The -Ttext=0x0 option specifies that addresses should be assigned to the labels as if the instructions started at address 0x0. To view the addresses assigned to the various labels, the nm command can be used as shown below.

$ arm-none-eabi-nm add.elf
... clip ...
00000000 t start
0000000c t stop

Note the address assignment for the labels start and stop. The address assigned to start is 0x0, since it is the label of the first instruction. The label stop comes after 3 instructions, and each instruction is 4 bytes. Hence stop is assigned the address 12 (0xC).

Linking with a different base address for the instructions will result in a different set of addresses being assigned to the labels.

$ arm-none-eabi-ld -Ttext=0x20000000 -o add.elf add.o
$ arm-none-eabi-nm add.elf
... clip ...
20000000 t start
2000000c t stop
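The address arithmetic behind both nm outputs can be checked with a small shell sketch. The label_addr function is a hypothetical helper, not part of the toolchain. It assumes that every instruction occupies 4 bytes, as in the listing above, and takes the base address given to -Ttext and the zero-based position of the labelled instruction.

```shell
# Hypothetical helper: address of the instruction at a given position.
# $1 = base address passed to -Ttext, $2 = zero-based instruction index.
# Assumes every instruction is 4 bytes long.
label_addr() {
    printf '%08x\n' $(( $1 + $2 * 4 ))
}

start_low=$(label_addr 0x0 0)          # start is the first instruction
stop_low=$(label_addr 0x0 3)           # stop follows three instructions
start_high=$(label_addr 0x20000000 0)
stop_high=$(label_addr 0x20000000 3)

echo "base 0x0:        start=$start_low stop=$stop_low"
echo "base 0x20000000: start=$start_high stop=$stop_high"
```

Only the base changes between the two link commands; the offset of each label from the base stays the same.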

The output file created by ld is in a format called ELF. Various file formats are available for storing executable code. The ELF format works fine when you have an OS around, but since we are going to run the program on bare metal, we will have to convert it to a simpler file format called the binary format.

A file in binary format contains consecutive bytes from a specific memory address. No additional information is stored in the file. This is convenient for Flash programming tools, since all that has to be done when programming is to copy each byte in the file to consecutive addresses, starting from a specified base address in memory.

The GNU toolchain’s objcopy command can be used to convert between different object file formats. A common usage of the command is given below.

objcopy -O <output-format> <in-file> <out-file>

To convert add.elf to binary format the following command can be used.

$ arm-none-eabi-objcopy -O binary add.elf add.bin

Check the size of the file. The file will be exactly 16 bytes, since there are 4 instructions and each instruction occupies 4 bytes.

$ ls -al add.bin
-rw-r--r-- 1 vijaykumar vijaykumar 16 2008-10-03 23:56 add.bin
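To make the 16 bytes concrete, the following sketch writes the four instruction words by hand and dumps them. It is an illustration only, not a replacement for the assembler. It assumes the standard 32-bit ARM encodings of the four instructions and little-endian byte order, and the file name by_hand.bin is hypothetical.

```shell
# Work in a scratch directory so that no real build output is overwritten.
workdir=$(mktemp -d)
image="$workdir/by_hand.bin"    # hypothetical file name

# One printf per instruction; bytes are given in octal, least significant
# byte first (little-endian is an assumption of this sketch).
printf '\005\000\240\343' >  "$image"   # 0xe3a00005  mov r0, #5
printf '\004\020\240\343' >> "$image"   # 0xe3a01004  mov r1, #4
printf '\000\040\201\340' >> "$image"   # 0xe0812000  add r2, r1, r0
printf '\376\377\377\352' >> "$image"   # 0xeafffffe  b stop (branch to itself)

size=$(( $(wc -c < "$image") ))                  # size of the file in bytes
bytes=$(od -An -tx1 "$image" | tr -d ' \n')      # hex dump without spacing
echo "$size bytes: $bytes"
```

If the assumptions hold, running the same od command on add.bin should show the same byte sequence.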

3.2. Executing in Qemu

When the ARM processor is reset, it starts executing from address 0x0. On the connex board a 16MB Flash is located at address 0x0. The instructions present in the beginning of the Flash will be executed.

When qemu emulates the connex board, a file has to be specified, which will be treated as the Flash memory. The Flash file format is very simple. To get the byte from address X in the Flash, qemu reads the byte from offset X in the file. In fact, this is the same as the binary file format.

To test the program on the emulated Gumstix connex board, we first create a 16MB file representing the Flash. We use the dd command to copy 16MB of zeroes from /dev/zero to the file flash.bin. The data is copied in 4K blocks.

$ dd if=/dev/zero of=flash.bin bs=4096 count=4096

The add.bin file is then copied to the beginning of the Flash, using the following command.

$ dd if=add.bin of=flash.bin bs=4096 conv=notrunc

This is the equivalent of programming the bin file on to the Flash memory.
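The two dd steps can be sanity-checked with a short sketch. It runs in a scratch directory and uses a 16-byte stand-in file, payload.bin (a hypothetical name), in place of add.bin, so it does not depend on the cross toolchain. The variable names are likewise only for illustration.

```shell
# Scratch directory, so that an existing flash.bin is left untouched.
workdir=$(mktemp -d)
flash="$workdir/flash.bin"
payload="$workdir/payload.bin"          # hypothetical stand-in for add.bin

printf 'ABCDEFGHIJKLMNOP' > "$payload"  # 16 bytes, the same size as add.bin

# Create the 16MB Flash image: 4096 blocks of 4096 zero bytes.
dd if=/dev/zero of="$flash" bs=4096 count=4096 2>/dev/null

# Copy the payload to offset 0; conv=notrunc keeps the rest of the image.
dd if="$payload" of="$flash" bs=4096 conv=notrunc 2>/dev/null

flash_size=$(( $(wc -c < "$flash") ))

# Compare the first 16 bytes of the image with the payload.
if dd if="$flash" bs=16 count=1 2>/dev/null | cmp -s - "$payload"; then
    start_matches=yes
else
    start_matches=no
fi

echo "size=$flash_size start_matches=$start_matches"
```

Without conv=notrunc, dd would truncate the output file to the size of the input, and the image would no longer be 16MB.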

After reset, the processor will start executing from address 0x0, and the instructions from the program will get executed. The command to invoke qemu is given below.

$ qemu-system-arm -M connex -pflash flash.bin -nographic -serial /dev/null

The -M connex option specifies that the machine connex is to be emulated. The -pflash option specifies that the flash.bin file represents the Flash memory. The -nographic option specifies that simulation of a graphical display is not required. The -serial /dev/null option specifies that the serial port of the connex board is to be connected to /dev/null, so that the serial port data is discarded.

The system executes the instructions and after completion, keeps looping infinitely in the stop: b stop instruction. To view the contents of the registers, the monitor interface of qemu can be used. The monitor interface is a command line interface, through which the emulated system can be controlled and the status of the system can be viewed. When qemu is started with the above mentioned command, the monitor interface is provided in the standard I/O of qemu.

To view the contents of the registers the info registers monitor command can be used.

(qemu) info registers
R00=00000005 R01=00000004 R02=00000009 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=00000000 R14=00000000 R15=0000000c
PSR=400001d3 -Z-- A svc32

Note the value in register R02. The register contains the result of the addition and should match with the expected value of 9.

3.3. More Monitor Commands

Some useful qemu monitor commands are listed in the following table.

Command         Purpose

help            List available commands

quit            Quits the emulator

xp /fmt addr    Physical memory dump from addr

system_reset    Reset the system.

The xp command deserves more explanation. The fmt argument specifies how the memory contents are to be displayed. The syntax of fmt is <count><format><size>.

count specifies the number of data items to be dumped.
size specifies the size of each data item. b for 8 bits, h for 16 bits, w for 32 bits and g for 64 bits.
format specifies the display format. x for hex, d for signed decimal, u for unsigned decimal, o for octal, c for char and i for asm instructions.

The xp command with the i format can be used to disassemble the instructions present in memory. To disassemble the instructions located at 0x0, the xp command with the fmt specified as 4iw can be used. The 4 specifies that 4 items are to be displayed, i specifies that the items are to be printed as instructions (yes, a built-in disassembler!), and w specifies that the items are 32 bits in size. The output of the command is shown below.

(qemu) xp /4iw 0x0
0x00000000:  mov        r0, #5  ; 0x5
0x00000004:  mov        r1, #4  ; 0x4
0x00000008:  add        r2, r1, r0
0x0000000c:  b  0xc
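The way a fmt string such as 4iw splits into its parts can be sketched in shell. The describe_fmt function is a hypothetical helper written for this illustration and is not part of qemu. It assumes that all three parts are present, in the order <count><format><size> described above.

```shell
# Hypothetical helper: split an xp fmt string into count, format and size.
# Assumes the form <count><format><size>, e.g. 4iw.
describe_fmt() {
    fmt=$1
    count=$(printf '%s\n' "$fmt" | sed 's/[^0-9].*$//')   # leading digits
    rest=${fmt#"$count"}                                   # the two letters
    format=$(printf '%s\n' "$rest" | cut -c1)
    size=$(printf '%s\n' "$rest" | cut -c2)
    case $size in
        b) bits=8 ;;
        h) bits=16 ;;
        w) bits=32 ;;
        g) bits=64 ;;
        *) bits=unknown ;;
    esac
    echo "count=$count format=$format bits=$bits"
}

described=$(describe_fmt 4iw)
echo "$described"
```

For 4iw this reports 4 items, the i (instruction) format and 32-bit items, matching the explanation of the command above.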