In a multi-file program, each source file is assembled individually into an object file. The linker then combines these object files to form the final executable.
While combining the object files, the linker performs two operations: symbol resolution and relocation.
This section looks at both operations in detail.
In a single file program, while producing the object file, all references to labels are replaced by their corresponding addresses by the assembler. But in a multi-file program, if there are any references to labels defined in another file, the assembler marks these references as "unresolved". When these object files are passed to the linker, the linker determines the values for these references from the other object files, and patches the code with the correct values.
The sum of array example is split into two files, to demonstrate the symbol resolution performed by the linker. The two files will be assembled and their symbol tables examined to show the presence of unresolved references.
The file sum-sub.s contains the sum subroutine, and the file main.s invokes the subroutine with the required arguments. The source of the files is shown below.
main.s - Subroutine Invocation
            .text
            b start                 @ Skip over the data
    arr:    .byte 10, 20, 25        @ Read-only array of bytes
    eoa:                            @ Address of end of array + 1

            .align
    start:
            ldr   r0, =arr          @ r0 = &arr
            ldr   r1, =eoa          @ r1 = &eoa

            bl    sum               @ Invoke the sum subroutine

    stop:   b stop
sum-sub.s - Subroutine Definition
            @ Args
            @ r0: Start address of array
            @ r1: End address of array
            @
            @ Result
            @ r3: Sum of Array

            .global sum

    sum:    mov   r3, #0            @ r3 = 0
    loop:   ldrb  r2, [r0], #1      @ r2 = *r0++    ; Get array element
            add   r3, r2, r3        @ r3 += r2      ; Calculate sum
            cmp   r0, r1            @ if (r0 != r1) ; Check if hit end-of-array
            bne   loop              @    goto loop  ; Loop
            mov   pc, lr            @ pc = lr       ; Return when done
A word on the .global directive is in order. In C, all variables declared outside functions are visible to other files, unless explicitly declared static. In assembly, all labels are static, that is, local to the file, unless they are explicitly made visible to other files using the .global directive.
The files are assembled, and the symbol tables are dumped using the nm command.
    $ arm-none-eabi-as -o main.o main.s
    $ arm-none-eabi-as -o sum-sub.o sum-sub.s
    $ arm-none-eabi-nm main.o
    00000004 t arr
    00000007 t eoa
    00000008 t start
    00000018 t stop
             U sum
    $ arm-none-eabi-nm sum-sub.o
    00000004 t loop
    00000000 T sum
For now, focus on the letter in the second column, which specifies the symbol type. A t indicates that the symbol is defined in the text section. A U indicates that the symbol is undefined. For defined symbols, a letter in uppercase indicates that the symbol is .global.
It is evident that the symbol sum is defined in sum-sub.o and is not yet resolved in main.o. When the linker is invoked, the symbol references will be resolved and the executable will be produced.
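The lookup that the linker performs can be sketched with a toy model. The Python below is only an illustration, not how ld is implemented: it parses the nm listings shown above and matches each undefined (U) symbol against the global definitions exported by the other file. The helper names parse_nm and resolve are made up for this sketch.

```python
# Toy model of symbol resolution. Real linkers work on ELF symbol tables,
# not on nm text; the nm listings from above are reused here for clarity.
NM_OUTPUT = {
    "main.o": """\
00000004 t arr
00000007 t eoa
00000008 t start
00000018 t stop
         U sum""",
    "sum-sub.o": """\
00000004 t loop
00000000 T sum""",
}

def parse_nm(text):
    """Return a list of (name, type_letter, value_or_None) tuples."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:            # undefined symbols have no value column
            kind, name = fields
            symbols.append((name, kind, None))
        else:
            value, kind, name = fields
            symbols.append((name, kind, int(value, 16)))
    return symbols

def resolve(nm_output):
    """Map (file, symbol) for each undefined reference to its definition."""
    tables = {fname: parse_nm(text) for fname, text in nm_output.items()}

    # Pass 1: collect global definitions (uppercase type, but not U).
    exported = {}
    for fname, table in tables.items():
        for name, kind, value in table:
            if kind.isupper() and kind != "U":
                exported[name] = (fname, value)

    # Pass 2: look up every undefined reference.
    resolved = {}
    for fname, table in tables.items():
        for name, kind, _ in table:
            if kind == "U":
                if name not in exported:
                    raise KeyError(f"undefined reference to '{name}'")
                resolved[(fname, name)] = exported[name]
    return resolved

resolved = resolve(NM_OUTPUT)
print(resolved)
```

Note that the local label loop in sum-sub.o is never exported, so a reference to loop from main.o would fail in this model, just as it would at link time.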
Relocation is the process of changing addresses already assigned to labels. This also involves patching up all label references to reflect the newly assigned addresses. Primarily, relocation is performed for two reasons: section merging and section placement.
To understand the process of relocation, an understanding of the concept of sections is essential.
Code and data have different run time requirements. For example, code can be placed in read-only memory, while data might require read-write memory. It is therefore convenient if code and data are not interleaved. For this purpose, programs are divided into sections. Most programs have at least two sections, .text for code and .data for data. The assembler directives .text and .data are used to switch back and forth between the two sections.
It helps to imagine each section as a bucket. When the assembler hits a section directive, it puts the code/data following the directive in the selected bucket. Thus the code/data that belong to a particular section appear in contiguous locations. The following figures show how the assembler re-arranges data into sections.
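The bucket model can also be sketched in a few lines of Python. This is a toy illustration only: the source lines and the function name bucket_by_section are hypothetical, and a real assembler tracks bytes and offsets rather than text lines.

```python
# Toy model of an assembler sorting statements into section buckets.
# The source lines below are made up for illustration.
source = [
    ".data",
    "arr: .word 10, 20, 30",
    ".text",
    "start: mov r0, #0",
    ".data",
    "result: .word 0",
    ".text",
    "stop: b stop",
]

def bucket_by_section(lines, default=".text"):
    """Group lines by the most recent section directive seen."""
    buckets = {".text": [], ".data": []}
    current = default
    for line in lines:
        if line in buckets:             # section directive: switch buckets
            current = line
        else:                           # anything else lands in the current bucket
            buckets[current].append(line)
    return buckets

buckets = bucket_by_section(source)
for name, contents in buckets.items():
    print(name, contents)
```

Although the source interleaves code and data, each bucket ends up holding its contents contiguously and in source order.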
Now that we have an understanding of sections, let us look into the primary reasons for which relocation is performed.
When dealing with multi-file programs, sections with the same name (for example .text) appear in each file. The linker is responsible for merging sections from the input files into sections of the output file. By default, sections with the same name from each file are placed contiguously, and the label references are patched to reflect the new address.
The effects of section merging can be seen by looking at the symbol tables of the object files and of the corresponding executable file. The multi-file sum of array program can be used to illustrate section merging. The symbol tables of the object files main.o and sum-sub.o and the symbol table of the executable file sum.elf are shown below.
    $ arm-none-eabi-nm main.o
    00000004 t arr
    00000007 t eoa
    00000008 t start
    00000018 t stop
             U sum
    $ arm-none-eabi-nm sum-sub.o
    00000004 t loop ❶
    00000000 T sum
    $ arm-none-eabi-ld -Ttext=0x0 -o sum.elf main.o sum-sub.o
    $ arm-none-eabi-nm sum.elf
    ...
    00000004 t arr
    00000007 t eoa
    00000008 t start
    00000018 t stop
    00000028 t loop ❷
    00000024 T sum
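The addresses in the sum.elf listing can be reproduced with a small model of section merging. The sketch below is a toy, not the linker's algorithm: it ignores alignment, and the .text sizes are assumptions (0x24 for main.o is inferred from sum landing at 0x24 in sum.elf, and 0x18 for sum-sub.o is six 4-byte instructions).

```python
# Toy model of merging the .text sections of two object files.
# Sizes are assumptions inferred from the nm listings; alignment is ignored.
inputs = [
    # (file, size of .text in bytes, {label: offset within that file's .text})
    ("main.o", 0x24, {"arr": 0x04, "eoa": 0x07, "start": 0x08, "stop": 0x18}),
    ("sum-sub.o", 0x18, {"sum": 0x00, "loop": 0x04}),
]

def merge(inputs, base=0x0):
    """Place each input section right after the previous one."""
    merged = {}
    offset = base
    for _fname, size, labels in inputs:
        for label, value in labels.items():
            merged[label] = offset + value   # relocate label into the output section
        offset += size                       # next file starts where this one ends
    return merged

merged = merge(inputs)
for label, address in sorted(merged.items(), key=lambda item: item[1]):
    print(f"{address:08x} {label}")
```

In this model the labels from main.o keep their values because main.o comes first, while sum and loop are shifted by the size of main.o's .text section, giving 0x24 and 0x28.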
When a program is assembled, each section is assumed to start at address 0, so labels are assigned values relative to the start of the section. When the final executable is created, the section is placed at some address X, and all references to labels defined within the section are incremented by X so that they point to the new location.
The placement of each section at a particular location in memory, and the patching of all references to the labels in the section, is done by the linker.
The effects of section placement can be seen by looking at the symbol table of the object file and of the corresponding executable file. The single file sum of array program can be used to illustrate section placement. To make things clearer, we will place the .text section at address 0x100.
    $ arm-none-eabi-as -o sum.o sum.s
    $ arm-none-eabi-nm -n sum.o
    00000000 t entry ❶
    00000004 t arr
    00000007 t eoa
    00000008 t start
    00000014 t loop
    00000024 t stop
    $ arm-none-eabi-ld -Ttext=0x100 -o sum.elf sum.o ❷
    $ arm-none-eabi-nm -n sum.elf
    00000100 t entry ❸
    00000104 t arr
    00000107 t eoa
    00000108 t start
    00000114 t loop
    00000124 t stop
    ...
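❶ The addresses for labels are assigned starting from 0 within a section.
❷ When the executable is created, the linker is instructed to place the text section at address 0x100.
❸ The addresses for labels in the .text section are re-assigned starting from 0x100, and all label references are patched to reflect this.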
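Section placement amounts to adding the section's load address to every label in it. The Python below is a minimal sketch of that arithmetic, using the offsets from the sum.o listing above. The function name place is made up for this illustration, and patching of the instructions that reference the labels is not modelled.

```python
# Toy model of section placement: every label offset in .text is shifted
# by the address at which the linker places the section.
sum_o = {
    "entry": 0x00,
    "arr":   0x04,
    "eoa":   0x07,
    "start": 0x08,
    "loop":  0x14,
    "stop":  0x24,
}

def place(labels, base):
    """Return the final address of each label for a section placed at base."""
    return {label: base + offset for label, offset in labels.items()}

placed = place(sum_o, 0x100)            # corresponds to -Ttext=0x100
for label, address in sorted(placed.items(), key=lambda item: item[1]):
    print(f"{address:08x} {label}")
```

With a base of 0x100, the computed addresses match the sum.elf listing, from entry at 0x100 through stop at 0x124.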
The process of section merging and placement is shown in the following figure.