Memory Types
Almost all modern microprocessors can access two types of memory. The first is a non-volatile memory that stores the machine instructions used to implement an embedded application. In addition to the machine instructions, this area also contains constant variables, such as strings, that are used in the application. We will refer to this type of memory as FLASH. Because FLASH is non-volatile, it retains its contents when power is removed from the microprocessor, so our application does not need to re-initialize it. For our purposes, FLASH is modified only when we program the board via the JTAG interface; once programmed, it remains unchanged until the next time the microprocessor is programmed. In some situations FLASH can be updated by the application itself, but that is beyond the scope of this class.
The second memory type we will look at is SRAM (Static Random Access Memory). SRAM is a volatile memory that is used to store values computed by the application. A key difference between SRAM and FLASH is that SRAM does not retain its contents when power is removed. When the microprocessor is powered on, the contents of SRAM are uninitialized, so the application must initialize any variable that is stored in SRAM. Most variables in a high-level language fall into this category.
Memory Addressing Modes
Microprocessors can access memory using several different memory addressing modes. Here are a few of the more common addressing modes:
PC relative: The address is computed by adding an offset value encoded in the instruction to the current value of the program counter.
A given processor architecture may support only a few of these addressing modes. The ARM architecture is a RISC architecture, so it does not support direct addressing (a 32-bit address would consume an entire 32-bit instruction, leaving no room for the opcode). Instead, the Cortex-M architecture supports register indirect, PC relative, and based addressing modes.
Memory Map
The Cortex-M architecture has a 32-bit address bus. 32 address bits allow 4,294,967,296 address locations (2^32). The roughly 4 billion addresses make up what is called a memory map. On a microcontroller, only a small percentage of those 4 billion address locations will be usable by the application. The MCU used in our class has only 32KBytes of SRAM and 256KBytes of FLASH. In order to properly access the SRAM and FLASH, we need to know what addresses the FLASH and SRAM are mapped to.
Description | Address Range (start to end) |
SRAM | 0x2000.0000 to 0x2000.7FFF |
Reserved | 0x0004.0000 to 0x1FFF.FFFF |
FLASH | 0x0000.0000 to 0x0003.FFFF |
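The table can be turned into a small address check. The following is a minimal C sketch, not part of the course code: the names `region_t` and `classify_address` are made up for illustration, and only the address bounds come from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the address bounds come from the memory map table. */
typedef enum {
    REGION_FLASH,
    REGION_RESERVED,
    REGION_SRAM,
    REGION_OTHER      /* anything above the SRAM range, not covered by the table */
} region_t;

#define MAP_FLASH_BASE 0x00000000u
#define MAP_FLASH_END  0x0003FFFFu
#define MAP_SRAM_BASE  0x20000000u
#define MAP_SRAM_END   0x20007FFFu

/* The sizes quoted in the text follow directly from the address ranges. */
static_assert(MAP_FLASH_END - MAP_FLASH_BASE + 1u == 256u * 1024u, "FLASH is 256 KBytes");
static_assert(MAP_SRAM_END - MAP_SRAM_BASE + 1u == 32u * 1024u, "SRAM is 32 KBytes");

/* Return the region of the memory map that contains addr. */
region_t classify_address(uint32_t addr)
{
    if (addr <= MAP_FLASH_END) {   /* FLASH starts at address 0 */
        return REGION_FLASH;
    }
    if (addr < MAP_SRAM_BASE) {
        return REGION_RESERVED;
    }
    if (addr <= MAP_SRAM_END) {
        return REGION_SRAM;
    }
    return REGION_OTHER;
}
```

For example, `classify_address(0x20000004u)` yields `REGION_SRAM`, while `classify_address(0x00040000u)` falls in the reserved gap between FLASH and SRAM.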
Allocating Memory
Before we access data in FLASH or SRAM, we need to allocate the memory in the assembler. The type of memory allocation shown below is static memory allocation. Static memory allocation reserves a predetermined amount of memory at compile time. We simply tell the assembler how much memory we want and what type of memory we want to allocate. Statically allocated memory is used to allocate global variables in a high-level language. Once a memory location is statically allocated, it cannot be used for any other purpose in the application. This differs from dynamic memory allocation, which is a run-time memory allocation scheme in which memory is allocated based on need. When the application is done with a segment of SRAM, the SRAM is returned to the free pool and can be allocated for some other purpose. This scheme is more flexible, but it is also more complicated. Dynamic memory allocation will be covered in more detail at a later time.
Allocating Application Storage
In order for an application to be run on the MCU, space needs to be reserved in the FLASH for the machine instructions that comprise the application. The example code below can be used to define the main application code for the Keil assembler. Take a look at the comments below for a more detailed explanation of what each line does.
    export  __main                      ;(1)
    AREA    |.text|, CODE, READONLY     ;(2)
    align

;**********************
; main assembly program
;**********************
__main  PROC                            ;(3)
    ; Instructions go here
    align
    ENDP                                ;(4)
    END                                 ;(5)
- (1) The EXPORT directive exports the symbol __main to the linker. The symbol represents the address where the main routine starts. This allows other files to import the __main symbol and branch to it.
- (2) The AREA directive tells the assembler that the resulting instructions will be located in the FLASH, or CODE, section of the memory map. The section is given the name .text.
- (3) A label used to determine the address of __main. The PROC directive indicates that the machine instructions that follow are part of a procedure. Without the PROC directive, you would not be able to debug the procedure in the debugger.
- (4) Ends the procedure.
- (5) Ends the file.
Constant Variables
In addition to allocating space in FLASH for the application, we can also allocate constant variables. A constant variable is a variable whose value is known at compile time and cannot be modified while the application is running. One of the most common types of constant variable is a string constant that is displayed to the user. Examine the examples below to see how to allocate constants in uVision.
;**********************************************
; Constant Variables (FLASH) Segment
;**********************************************
    AREA    |.text|, CODE, READONLY
CONST_WORD      DCD     0xDEADBEEF      ;(1)
HWORD_CONST     DCW     0xABCD          ;(2)
BYTE_CONST      DCB     0xAB            ;(3)
STRING_CONST    DCB     "Hello ECE353"  ;(4)
    align

;**********************************************
; Code (FLASH) Segment
;**********************************************
ece353_main PROC
    B       ece353_main
    align
    ENDP
    END
- (1) DCD allocates 32 bits (4 bytes) of space and initializes the value to 0xDEADBEEF.
- (2) DCW allocates 16 bits (2 bytes) of space and initializes the value to 0xABCD.
- (3) DCB allocates 8 bits (1 byte) of space and initializes the value to 0xAB.
- (4) Allocates an array of 12 bytes and initializes the contents to "Hello ECE353". No terminating zero byte is added.
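For comparison with a high-level language, the same four constants can be written in C. This is only a rough analog, offered as a sketch: it assumes that the toolchain places `static const` objects in read-only memory, which is typical for embedded targets but is not guaranteed by the C language. The helper `constants_total_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough C analogs of the assembler directives (placement in FLASH is an assumption). */
static const uint32_t CONST_WORD  = 0xDEADBEEFu;  /* like DCD: 4 bytes */
static const uint16_t HWORD_CONST = 0xABCDu;      /* like DCW: 2 bytes */
static const uint8_t  BYTE_CONST  = 0xABu;        /* like DCB: 1 byte  */

/* Like DCB "Hello ECE353": exactly 12 bytes, with no terminating NUL. */
static const char STRING_CONST[12] = {
    'H', 'e', 'l', 'l', 'o', ' ', 'E', 'C', 'E', '3', '5', '3'
};

static_assert(sizeof(CONST_WORD) == 4, "DCD allocates 4 bytes");
static_assert(sizeof(HWORD_CONST) == 2, "DCW allocates 2 bytes");
static_assert(sizeof(BYTE_CONST) == 1, "DCB allocates 1 byte");
static_assert(sizeof(STRING_CONST) == 12, "the string occupies 12 bytes");

/* Hypothetical helper: total storage used by the four constants, ignoring padding. */
size_t constants_total_bytes(void)
{
    return sizeof(CONST_WORD) + sizeof(HWORD_CONST)
         + sizeof(BYTE_CONST) + sizeof(STRING_CONST);
}
```

The compile-time checks mirror the sizes listed above: 4 + 2 + 1 + 12 = 19 bytes before any alignment padding.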
Loading Addresses
The two most common instructions used to load an address into a register are shown below.
; (1) Normally used to access a label in FLASH
    ADR     R0, CONST_WORD

; (2) Normally used to access a label in SRAM
    LDR     R0, =(CONST_WORD)
- (1) ADR generates a PC-relative address for a label. The label must be within -4095 to +4095 bytes of the current PC. If you are running the application out of FLASH, you cannot use this instruction to access a label in SRAM, because SRAM is more than 4095 bytes away from FLASH in the memory map.
- (2) If you need to access a label in SRAM, you can use this form of the LDR instruction. It is a pseudo-instruction that creates a hidden read-only constant in FLASH. The assembler sets the value of the hidden constant to the address of the desired label. The hidden constant is then loaded using a PC-relative LDR instruction, so it must be placed within 4095 bytes of the program counter.
Loading Data
The MCU executes an LDR (Load Register) instruction to read data from either FLASH or SRAM. The LDR instruction requires that we supply an address as the second operand. So how do we know what address to use? We use the label supplied for the given piece of data. The following instruction loads the 4 bytes of data found at CONST_WORD into R1. This is the PC-relative version of the LDR instruction. Note that the data at CONST_WORD is loaded into R1, not the address of CONST_WORD.
LDR R1, CONST_WORD
If you need to load a 16-bit data value into a register, use LDRH. If you need to load an 8-bit data value into a register, use LDRB.
    LDRH    R1, HWORD_CONST
    LDRB    R2, BYTE_CONST
What happens to the upper 16-bits of R1 after the LDRH instruction? They are set to 0. The upper 24-bits of R2 would be set to 0 after the LDRB instruction.
But what if the 8-bit value in BYTE_CONST was a negative number? In this situation, we need to sign extend the 8-bit value to the entire 32-bit register. A number is sign extended by taking the sign bit (bit 7 in this case) and replicating its value in all the bit positions above it (bits 31 through 8). We can do this by adding an S to the instruction mnemonic, which gives LDRSB for bytes and LDRSH for half words.
Sign extending an 8-bit or 16-bit value from SRAM or FLASH into a register DOES NOT set any of the flags in the APSR. Only an ALU-type or MOV instruction that carries the S suffix (for example ADDS or MOVS), or a compare instruction such as CMP, can do that.
LDRSB R2, BYTE_CONST ; Sign Extend the 8-bit value to 32-bits
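The difference between zero extension and sign extension can be modeled in a few lines of C. This is an illustrative sketch, not processor code: the function names are invented here, and each function only mimics what the corresponding load instruction leaves in the destination register.

```c
#include <assert.h>
#include <stdint.h>

/* Models LDRB: the byte lands in bits 7..0 and bits 31..8 are cleared. */
uint32_t load_byte_zero_extend(uint8_t b)
{
    return (uint32_t)b;
}

/* Models LDRSB: bit 7 (the sign bit) is replicated into bits 31..8. */
uint32_t load_byte_sign_extend(uint8_t b)
{
    uint32_t r = b;
    if (b & 0x80u) {
        r |= 0xFFFFFF00u;
    }
    return r;
}

/* Models LDRSH: bit 15 (the sign bit) is replicated into bits 31..16. */
uint32_t load_half_sign_extend(uint16_t h)
{
    uint32_t r = h;
    if (h & 0x8000u) {
        r |= 0xFFFF0000u;
    }
    return r;
}

/* Worked examples using the constants defined earlier (0xAB and 0xABCD). */
void extension_examples(void)
{
    assert(load_byte_zero_extend(0xABu) == 0x000000ABu);   /* 171 as unsigned */
    assert(load_byte_sign_extend(0xABu) == 0xFFFFFFABu);   /* -85 as signed   */
    assert(load_half_sign_extend(0xABCDu) == 0xFFFFABCDu);
    assert(load_byte_sign_extend(0x7Fu) == 0x0000007Fu);   /* positive values are unchanged */
}
```

Because 0xAB has bit 7 set, the sign-extended result is 0xFFFFFFAB, while the zero-extended result is 0x000000AB.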
The ARM architecture supports load instructions with the following addressing modes.
; Register indirect
; R0 <- MEM[R1]
    LDR     R0, [R1]

; Register indirect, pre-indexed with an immediate offset
; R0 <- MEM[R1+4]
    LDR     R0, [R1, #4]

; Register indirect, pre-indexed with a register offset
; R0 <- MEM[R1+R2]
    LDR     R0, [R1, R2]

; Register indirect, pre-indexed, R1 updated
; R0 <- MEM[R1+4], R1 <- R1 + 4
    LDR     R0, [R1, #4]!

; Register indirect, post-indexed
; R0 <- MEM[R1], R1 <- R1 + 4
    LDR     R0, [R1], #4

; PC relative
; R0 <- MEM[PC ± Offset]
    LDR     R0, label_1

; Pseudo-instruction, literal insertion
; Assembler creates a hidden constant in the
; literal pool with the correct value
; R0 <- ADDR of label_1, not the value at label_1
    LDR     R0, =(label_1)
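The difference between the offset, pre-indexed with update, and post-indexed forms is easiest to see by tracking the base register. The C sketch below is a toy model under stated assumptions: `mem` is a hypothetical 8-word memory whose byte addresses start at 0, and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 8u

/* Hypothetical word memory; byte address A maps to mem[A / 4]. */
static uint32_t mem[MEM_WORDS];

/* Fill the toy memory so that the word at byte address A holds 100 + A/4. */
void init_mem(void)
{
    for (uint32_t i = 0; i < MEM_WORDS; i++) {
        mem[i] = 100u + i;
    }
}

static uint32_t read_word(uint32_t addr)
{
    assert(addr % 4u == 0u && addr / 4u < MEM_WORDS);
    return mem[addr / 4u];
}

/* LDR Rt, [Rn, #off]  : load from Rn + off, Rn is left unchanged. */
uint32_t ldr_offset(const uint32_t *rn, uint32_t off)
{
    return read_word(*rn + off);
}

/* LDR Rt, [Rn, #off]! : load from Rn + off, then Rn <- Rn + off. */
uint32_t ldr_pre_update(uint32_t *rn, uint32_t off)
{
    *rn += off;
    return read_word(*rn);
}

/* LDR Rt, [Rn], #off  : load from Rn, then Rn <- Rn + off. */
uint32_t ldr_post(uint32_t *rn, uint32_t off)
{
    uint32_t value = read_word(*rn);
    *rn += off;
    return value;
}
```

Starting with a base register of 0, `ldr_offset` with offset 4 reads the second word and leaves the base at 0; `ldr_pre_update` reads the same word but leaves the base at 4; `ldr_post` then reads from address 4 and leaves the base at 8.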
Allocating Global Variables
When an application must write to a variable, that variable must be located in SRAM. Functionally, SRAM differs from FLASH because we can modify the contents of SRAM. It is important to note that SRAM cannot be initialized at compile time. Unlike FLASH, which is non-volatile storage, SRAM is a volatile memory. Because SRAM is volatile, the ARM assembler does not generate an ‘SRAM image’. Generating an ‘SRAM image’ would be pointless: once power is removed from the processor, the initial state of any variables in SRAM would be lost, and the application would not function properly the next time the processor was powered on.
Instead, when the microprocessor is powered on, the application begins to execute from the non-volatile application image stored in FLASH. The application code is required to have an initialization routine that executes MOV and STR instructions to initialize the variables allocated in SRAM.
;****************************
; SRAM
;****************************
    AREA    SRAM, READWRITE     ; (1)
BYTE_DATA       DCB     0xAB    ; (2)
BYTE_ARRAY      SPACE   10*1    ; (3)
WORD_ARRAY      SPACE   10*4    ; (4)
    align
- (1) Directive indicating that the following data allocations take place in SRAM and can be both read and written.
- (2) Creates a label called BYTE_DATA in SRAM for 1 byte of data. The initialization has no effect. The contents of SRAM need to be set via STR instructions.
- (3) Allocates 10 bytes of data. The label BYTE_ARRAY is used to access the beginning of the array.
- (4) Allocates 10 words (40 bytes) of data. The label WORD_ARRAY is used to access the beginning of the array.
Writing to SRAM
When the MCU wants to write to SRAM, it must issue an STR instruction. The STR instruction works in a similar way to the LDR instruction. It can store 32, 16, or 8 bits from a general purpose register into an address in SRAM using a 32-bit base address. Why does the Cortex-M architecture support 8-bit and 16-bit stores into SRAM? The answer is data density. Microcontrollers typically have a limited amount of SRAM. If data can be represented using fewer bits, the programmer can make better use of the SRAM that is available. For example, the analog to digital converter on the Tiva Launchpad is a 12-bit converter. This means that every data sample we take is only 12 bits. If we store this data as a WORD (32 bits), 20 of the 32 bits will be wasted. A better approach would be to use half words (16 bits) to store the data. This would allow twice the number of measurements to be stored in the same amount of SRAM as a 32-bit store.
The examples below show how to write 32, 16, and 8 bits of data to SRAM.
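The data density argument is simple arithmetic, sketched here in C. The function names and the `ADC_MAX_12BIT` constant are assumptions made for this example; the only facts taken from the text are the 12-bit sample width and the 16-bit versus 32-bit storage sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADC_MAX_12BIT 0x0FFFu   /* largest value a 12-bit converter can produce */

/* Number of samples that fit in buffer_bytes when each sample takes sample_bytes. */
size_t samples_that_fit(size_t buffer_bytes, size_t sample_bytes)
{
    assert(sample_bytes > 0u);
    return buffer_bytes / sample_bytes;
}

/* Number of unused bits when a 12-bit sample is stored in a container of the given width. */
unsigned wasted_bits(unsigned container_bits)
{
    assert(container_bits >= 12u);
    return container_bits - 12u;
}

/* Models an STRH of one sample into a half word array. */
void store_sample(uint16_t *buffer, size_t index, uint32_t sample)
{
    assert(buffer != NULL);
    assert(sample <= ADC_MAX_12BIT);
    buffer[index] = (uint16_t)sample;
}
```

With a hypothetical 1024-byte buffer, `samples_that_fit(1024, sizeof(uint32_t))` gives 256 samples and `samples_that_fit(1024, sizeof(uint16_t))` gives 512, which is the factor of two described above. A half word still leaves 4 of its 16 bits unused, compared with 20 of 32 for a word.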
    MOV     R0, #0x20000000

    ; Stores a 32-bit value at location 0x20000000
    STR     R1, [R0]

    ; Stores a 16-bit value at location 0x20000004
    STRH    R1, [R0, #4]

    ; Stores an 8-bit value at location 0x20000006
    STRB    R2, [R0, #6]
The Cortex-M architecture supports the following store instructions.
; Register indirect
; MEM[R1] <- R0
    STR     R0, [R1]

; Register indirect, pre-indexed with an immediate offset
; MEM[R1+4] <- R0
    STR     R0, [R1, #4]

; Register indirect, pre-indexed with a register offset
; MEM[R1+R2] <- R0
    STR     R0, [R1, R2]

; Register indirect, pre-indexed, R1 updated
; MEM[R1+4] <- R0, R1 <- R1 + 4
    STR     R0, [R1, #4]!

; Register indirect, post-indexed
; MEM[R1] <- R0, R1 <- R1 + 4
    STR     R0, [R1], #4
Endianness
When you begin to examine the data in SRAM, you may be surprised by the order in which the data is stored in memory. Data can be stored in one of two ways: big endian or little endian. A big endian system stores the most significant byte in the smallest address. A little endian system stores the least significant byte in the smallest address. The TM4C123 uses little endian by default.
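To make the two byte orders concrete, the following C sketch writes a 32-bit value into a byte array in each order. It is independent of the byte order of the machine that runs it, because the bytes are placed explicitly with shifts. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian: least significant byte at the smallest address. */
void store_word_little_endian(uint8_t *mem, uint32_t value)
{
    assert(mem != NULL);
    mem[0] = (uint8_t)(value & 0xFFu);
    mem[1] = (uint8_t)((value >> 8) & 0xFFu);
    mem[2] = (uint8_t)((value >> 16) & 0xFFu);
    mem[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Big endian: most significant byte at the smallest address. */
void store_word_big_endian(uint8_t *mem, uint32_t value)
{
    assert(mem != NULL);
    mem[0] = (uint8_t)((value >> 24) & 0xFFu);
    mem[1] = (uint8_t)((value >> 16) & 0xFFu);
    mem[2] = (uint8_t)((value >> 8) & 0xFFu);
    mem[3] = (uint8_t)(value & 0xFFu);
}

/* Reassemble a word from four bytes stored in little endian order. */
uint32_t load_word_little_endian(const uint8_t *mem)
{
    assert(mem != NULL);
    return (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}
```

Storing 0xDEADBEEF in little endian order produces the bytes EF, BE, AD, DE at increasing addresses, which is the order you would expect to see for a word in a memory window on a little endian device. In big endian order the same value appears as DE, AD, BE, EF.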
Examples
Allocating an Array in FLASH
;**********************************************
; Constant Variables (FLASH) Segment
;**********************************************
    AREA    |.text|, CODE, READONLY

; Allocate an array of 8 bytes, matching the 8 elements
; read by the array examples that follow
BYTE_ARRAY      DCB     0, 1, 2, 3, 4, 5, 6, 7
    align
Allocating Array of WORDs in SRAM
Note that the values stored in the array are unknown at reset.
; This is a symbolic constant. It does not allocate any space in the
; application. The assembler replaces WORD with '4' during assembly.
WORD    EQU     4

;****************************
; SRAM
;****************************
    AREA    SRAM, READWRITE
WORD_ARRAY      SPACE   10*WORD
    align
Reading Array – Pre-Indexed
    ; Load the contents of the byte array into R1-R8.
    ; Values treated as unsigned
    ; Register indirect, pre-indexed
    ADR     R0, BYTE_ARRAY
    LDRB    R1, [R0, #0]
    LDRB    R2, [R0, #1]
    LDRB    R3, [R0, #2]
    LDRB    R4, [R0, #3]
    LDRB    R5, [R0, #4]
    LDRB    R6, [R0, #5]
    LDRB    R7, [R0, #6]
    LDRB    R8, [R0, #7]
Reading Array – Post-Indexed
    ; Load the contents of the byte array into R1-R8.
    ; Values treated as unsigned
    ; Register indirect, post-indexed
    ADR     R0, BYTE_ARRAY
    LDRB    R1, [R0], #1
    LDRB    R2, [R0], #1
    LDRB    R3, [R0], #1
    LDRB    R4, [R0], #1
    LDRB    R5, [R0], #1
    LDRB    R6, [R0], #1
    LDRB    R7, [R0], #1
    LDRB    R8, [R0], #1
Summing Array – Pre-Indexed
    ; Sum the contents of BYTE_ARRAY
    ; using a FOR loop
    ADR     R0, BYTE_ARRAY
    MOV     R1, #0              ; Initialize index
    MOV     R2, #0              ; Initialize array sum
FOR_START
    CMP     R1, #8              ; Done after 8 elements
    BEQ     FOR_END
    LDRB    R3, [R0, R1]        ; R3 <- MEM[R0 + R1], R0 unchanged
    ADD     R2, R2, R3
    ADD     R1, R1, #1
    B       FOR_START
FOR_END
Summing Array – Post-Indexed
    ; Sum the contents of BYTE_ARRAY
    ; using a FOR loop
    ADR     R0, BYTE_ARRAY
    MOV     R1, #0              ; Initialize loop counter
    MOV     R2, #0              ; Initialize array sum
FOR_START2
    CMP     R1, #8              ; Done after 8 elements
    BEQ     FOR_END2
    LDRB    R3, [R0], #1        ; R3 <- MEM[R0], R0 <- R0 + 1
    ADD     R2, R2, R3
    ADD     R1, R1, #1
    B       FOR_START2
FOR_END2
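The two summing loops correspond to two familiar C idioms: indexing from a fixed base pointer, and walking a pointer through the array. The sketch below is a C rendering for comparison only; the function names are invented, and the bytes are treated as unsigned, as LDRB does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors LDRB R3, [R0, R1]: the base stays fixed and the index advances. */
uint32_t sum_indexed(const uint8_t *base, uint32_t count)
{
    assert(base != NULL);
    uint32_t sum = 0;
    for (uint32_t i = 0; i != count; i++) {
        sum += base[i];
    }
    return sum;
}

/* Mirrors LDRB R3, [R0], #1: the pointer itself advances after each load. */
uint32_t sum_post_increment(const uint8_t *p, uint32_t count)
{
    assert(p != NULL);
    uint32_t sum = 0;
    for (uint32_t i = 0; i != count; i++) {
        sum += *p++;
    }
    return sum;
}
```

Both functions return the same result for the same input. For an 8-byte array holding 0 through 7, the sum is 28. In the post-indexed version the original base address is lost once the loop finishes, which matters if the array must be traversed again.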
Array Copy
    ; Copy 8 bytes of data from BYTE_ARRAY to
    ; SRAM_ARRAY_1, one byte at a time
    ADR     R0, BYTE_ARRAY          ; SRC address
    LDR     R1, =(SRAM_ARRAY_1)     ; DEST address
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte
    LDRB    R3, [R0], #1            ; Load 1 byte
    STRB    R3, [R1], #1            ; Store 1 byte

    ; Copy 8 bytes of data from BYTE_ARRAY to
    ; SRAM_ARRAY_2 much more efficiently,
    ; using two word transfers instead of eight byte transfers
    ADR     R0, BYTE_ARRAY          ; SRC address
    LDR     R1, =(SRAM_ARRAY_2)     ; DEST address
    LDR     R3, [R0, #0]            ; Load 4 bytes
    STR     R3, [R1, #0]            ; Store 4 bytes
    LDR     R3, [R0, #4]            ; Load 4 bytes
    STR     R3, [R1, #4]            ; Store 4 bytes
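The byte-wise and word-wise copies can be compared in C as well. This is a sketch under stated assumptions: the function names are invented, and the word-wise version uses `memcpy` on 4-byte chunks so that the C code stays well defined regardless of pointer alignment. Each chunk corresponds to one LDR and STR pair in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One load and one store per byte, like the LDRB/STRB sequence. Returns the transfer count. */
size_t copy_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t transfers = 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        transfers++;
    }
    return transfers;
}

/* One load and one store per 4-byte word, like the LDR/STR sequence. n must be a multiple of 4. */
size_t copy_words(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    assert(n % 4u == 0u);
    size_t transfers = 0;
    for (size_t i = 0; i < n; i += 4u) {
        uint32_t word;
        memcpy(&word, src + i, sizeof word);   /* stands in for LDR */
        memcpy(dst + i, &word, sizeof word);   /* stands in for STR */
        transfers++;
    }
    return transfers;
}
```

For 8 bytes, `copy_bytes` performs 8 load and store pairs and `copy_words` performs 2, while the destination contents end up identical. That ratio is the reason the second assembly version is described as more efficient.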