It is not possible to directly execute C code when the processor comes out of reset because, unlike assembly language, C programs need some basic prerequisites to be satisfied. This section describes the prerequisites and how to meet them.
We will use a C program that calculates the sum of an array as an example. By the end of the section, we will be able to perform the necessary setup, transfer control to the C code and execute it.
Listing 12. Sum of Array in C
static int arr[] = { 1, 10, 4, 5, 6, 7 };
static int sum;
static const int n = sizeof(arr) / sizeof(arr[0]);
int main()
{
    int i;

    for (i = 0; i < n; i++)
        sum += arr[i];
}
Before transferring control to C code, the following have to be set up correctly.
Stack
Global variables
C uses the stack for storing local (auto) variables, passing function arguments, storing return addresses, etc. So it is essential that the stack is set up correctly before transferring control to C code.
Stacks are highly flexible in the ARM architecture, since the implementation is completely left to the software. For people not familiar with the ARM architecture, an overview is provided in Appendix C, ARM Stacks.
To make sure that code generated by different compilers is
interoperable, ARM has created the Procedure Call Standard for the
ARM Architecture (AAPCS). The register to be used as the stack
pointer and the direction in which the stack grows are both dictated
by the AAPCS. According to the AAPCS, register r13 is to be used as
the stack pointer, and the stack should be full-descending.
One way of placing global variables and the stack is shown in the following diagram.
So all that has to be done in the startup code is to point r13 at
the highest RAM address, so that the stack can grow downwards (towards
lower addresses). For the connex board this can be achieved using
the following ARM instruction.
        ldr sp, =0xA4000000
Note that the assembler provides an alias sp for the r13 register.
Note: The address
When C code is compiled, the compiler places initialized global
variables in the .data section. So, just as with the assembly
programs, the .data section has to be copied from Flash to RAM.
The C language guarantees that all uninitialized global variables will
be initialized to zero. When C programs are compiled, a separate
section called .bss is used for uninitialized variables. Since the
values of these variables are all zero to start with, they do not
have to be stored in Flash. Before transferring control to C code, the
memory locations corresponding to these variables have to be
initialized to zero.
GCC places global variables marked as const in a separate section,
called .rodata. The .rodata is also used for storing string
constants.
Since the contents of the .rodata section will not be modified, they
can be placed in Flash. The linker script has to be modified to
accommodate this.
Now that we know the prerequisites, we can create the linker script and the startup code. The linker script from Listing 10, “Linker Script with Section Copy Symbols”, is modified to accommodate the following.
.bss section placement
vectors section placement
.rodata section placement
The .bss section is placed right after the .data section in RAM.
Symbols to locate the start and end of .bss are also created in the
linker script. The .rodata section is placed right after the .text
section in Flash. The following diagram shows the placement of the
various sections.
Listing 13. Linker Script for C code
SECTIONS {
        . = 0x00000000;
        .text : {
                * (vectors);
                * (.text);
        }
        .rodata : {
                * (.rodata);
        }
        flash_sdata = .;

        . = 0xA0000000;
        ram_sdata = .;
        .data : AT (flash_sdata) {
                * (.data);
        }
        ram_edata = .;
        data_size = ram_edata - ram_sdata;

        sbss = .;
        .bss : {
                * (.bss);
        }
        ebss = .;
        bss_size = ebss - sbss;
}
The startup code has the following parts:
exception vectors
code to copy .data from Flash to RAM
code to zero out .bss
code to set up the stack pointer
branch to main
Listing 14. C Startup Assembly
        .section "vectors"
reset:  b     start
undef:  b     undef
swi:    b     swi
pabt:   b     pabt
dabt:   b     dabt
        nop
irq:    b     irq
fiq:    b     fiq

        .text
start:
        @@ Copy data to RAM.
        ldr   r0, =flash_sdata
        ldr   r1, =ram_sdata
        ldr   r2, =data_size

        @@ Handle data_size == 0
        cmp   r2, #0
        beq   init_bss
copy:
        ldrb  r4, [r0], #1
        strb  r4, [r1], #1
        subs  r2, r2, #1
        bne   copy

init_bss:
        @@ Initialize .bss
        ldr   r0, =sbss
        ldr   r1, =ebss
        ldr   r2, =bss_size

        @@ Handle bss_size == 0
        cmp   r2, #0
        beq   init_stack

        mov   r4, #0
zero:
        strb  r4, [r0], #1
        subs  r2, r2, #1
        bne   zero

init_stack:
        @@ Initialize the stack pointer
        ldr   sp, =0xA4000000

        bl    main
stop:   b     stop
To compile the code, it is not necessary to invoke the assembler,
compiler and linker individually. gcc is intelligent enough to do
that for us.
As promised before, we will compile and execute the C code shown in Listing 12, “Sum of Array in C”.
$ arm-none-eabi-gcc -nostdlib -o csum.elf -T csum.lds csum.c startup.s
The -nostdlib option is used to specify that the standard C library
should not be linked in. A little extra care has to be taken when the
C library is linked in. This is discussed in Section 11, “Using the C Library”.
A dump of the symbol table will give a better picture of how things have been placed in memory.
$ arm-none-eabi-nm -n csum.elf
00000000 t reset ❶
00000004 A bss_size
00000004 t undef
00000008 t swi
0000000c t pabt
00000010 t dabt
00000018 A data_size
00000018 t irq
0000001c t fiq
00000020 T main
00000090 t start ❷
000000a0 t copy
000000b0 t init_bss
000000c4 t zero
000000d0 t init_stack
000000d8 t stop
000000f4 r n ❸
000000f8 A flash_sdata
a0000000 d arr ❹
a0000000 A ram_sdata
a0000018 A ram_edata
a0000018 A sbss
a0000018 b sum ❺
a000001c A ebss
❶ reset and the rest of the exception vectors are placed starting from 0x0.
❷ The code is placed right after the 8 exception vectors
(8 * 4 = 32 = 0x20): main is at 0x20, and the startup code, beginning
at start, follows it at 0x90.
❸ The read-only data n is placed in Flash after the code.
❹ The initialized data arr, an array of 6 integers, is placed at
the start of RAM, 0xA0000000.
❺ The uninitialized data sum is placed after the array of 6
integers (6 * 4 = 24 = 0x18).
To execute the program, convert it to .bin format, run it in QEMU,
and dump the sum variable located at 0xA0000018.
$ arm-none-eabi-objcopy -O binary csum.elf csum.bin
$ dd if=csum.bin of=flash.bin bs=4096 conv=notrunc
$ qemu-system-arm -M connex -pflash flash.bin -nographic -serial /dev/null
(qemu) xp /6dw 0xa0000000
a0000000: 1 10 4 5
a0000010: 6 7
(qemu) xp /1dw 0xa0000018
a0000018: 33