Arm is a RISC-based architecture that is widely used in the embedded domain. There are billions of Arm devices currently active, ranging from automotive controllers to smartphone processors, and now your laptop too, thanks to Apple. That being said, let us dive right in.
Prerequisites:
A basic understanding of C programming will come in handy. (If you don't have it, you might have to do some googling here and there.)
Where to code?
Most of us are on x86_64 machines, so we need to rely on an emulator. CPUlator is a good place to get started. Under Architecture select ARMv7, and under System select DE1-SoC.
The basics
Here are some terms you have to be familiar with. Just have a look for now; you can always scroll back.
GPRs (General Purpose Registers)
There are 16 registers available (R0, R1, R2 ... R15). We can use them for computations and for storing values. Three of them have a special purpose (R13, R14, R15), and some of the others get temporary special assignments.
Stack Pointer (R13):
SP stores the address of the top of the stack. I usually don't mess with it (although it's totally mess-able :) ).
Link Register (R14):
LR holds the return address. I will assume that you know functions in higher-level languages: a function is called from a certain context (the main function in C/C++), and once all the instructions of the called function have executed, control is transferred back to the function (or context) that called it. So we need to store the address where control has to return, and the link register stores exactly that.
Program Counter (R15):
PC holds the address of the next instruction to be fetched, so if you overwrite it you can jump between points in your program. In other words, PC is used for teleportation.
That is all for the register terminology.
Current Program Status Register:
CPSR register contains flags indicating current status of our program.
BITS      FLAG      DESCRIPTION
[31]      N         Negative condition code flag
[30]      Z         Zero condition code flag
[29]      C         Carry condition code flag
[28]      V         Overflow condition code flag
[27]      Q         Cumulative saturation bit
[26:25]   IT[1:0]   If-Then execution state bits for the Thumb IT (If-Then) instruction
[24]      J         Jazelle bit
[19:16]   GE        Greater than or Equal flags
[15:10]   IT[7:2]   If-Then execution state bits for the Thumb IT (If-Then) instruction
[9]       E         Endianness execution state bit: 0 - Little-endian, 1 - Big-endian
[8]       A         Asynchronous abort mask bit
[7]       I         IRQ mask bit
[6]       F         FIQ mask bit
[5]       T         Thumb execution state bit
[4:0]     M         Mode field
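Since the prerequisites mention C, here is a minimal C sketch of how the bit positions listed above can be picked out of a 32-bit CPSR value. The names cpsr_bit, cpsr_mode and the CPSR_* constants are made up for this illustration; they are not part of any Arm header.

```c
#include <stdint.h>

/* Bit positions of a few CPSR fields (hypothetical names). */
enum { CPSR_N = 31, CPSR_Z = 30, CPSR_C = 29, CPSR_V = 28, CPSR_T = 5 };

/* Return 1 if the given bit of a CPSR value is set, else 0. */
static unsigned cpsr_bit(uint32_t cpsr, unsigned bit)
{
    return (cpsr >> bit) & 1u;
}

/* Return the 5-bit mode field M[4:0]. */
static unsigned cpsr_mode(uint32_t cpsr)
{
    return cpsr & 0x1Fu;
}
```

For the value 0x60000010, for instance, cpsr_bit reports Z and C as set, N and V as clear, and cpsr_mode gives 0x10.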
Let's LOAD our bags and get MOVing
I am sorry for bad puns :)
Let's look at a pretty basic program and we'll learn to talk in assembly.
.global _start // EXTERNALLY ACCESSIBLE _start LABEL
_start: // STARTING POINT OF PROGRAM YOU CAN CHANGE _start TO ANYTHING
MOV R0,#12 // MOVE VALUE IMMEDIATELY AT RIGHT TO R0
MOV R1,#11 // MOVE VALUE IMMEDIATELY AT RIGHT TO R1
ADD R2,R0,R1 // ADD R0,R1 AND STORE IT IN R2 i.e, R2 = R0 + R1
Copy and paste this into your emulator (CPUlator) and click Compile and Load. Now click Step Into three times and check the register values on your left. R0 holds 12 (C in hex), R1 holds 11 (B in hex), and R2 holds the sum of R0 and R1, which is 23 (17 in hex).
If I have to read this program aloud: .global makes _start accessible outside of the current file, so it can be used as a starting point.
Syntax of label
label: instructions
So _start is analogous to the main function.
Next we move values into registers, add them, and store the result in a separate register. This is the basic structure of an assembly program. Note that in the comments I have used the word immediate; it's intentional and will become clear in the next section. Now let us see different ways to move things around.
Addressing Modes
We can move data in different ways in assembly.
- We can move a value immediately into a register
- We can move an address directly into a register
- We can move values directly between registers
- Or we can move values in an indexed way
If you are learning for academic purposes, I would recommend learning the names of the addressing modes as well.
.global _start
_start:
LDR R0,=VAL // ABSOLUTE (or DIRECT) ADDRESSING
LDR R1,[R0] // REGISTER INDIRECT ADDRESSING
MOV R2,#1 // IMMEDIATE (or LITERAL) ADDRESSING
ADD R3,R1,R2
MOV R4,R3 // REGISTER TO REGISTER (or REGISTER DIRECT) ADDRESSING
.data
VAL: .word 0x17
Final register values:
R0 -> address of VAL (the exact value depends on where the assembler places the data)
R1 -> 0x00000017
R2 -> 0x00000001
R3 -> 0x00000018
R4 -> 0x00000018
There are four modes shown in this example.
In ABSOLUTE addressing, the address of the data (VAL) is loaded into a register.
In REGISTER INDIRECT addressing, the value stored at the source address is loaded into the destination register. Here R1 is the destination register and R0 is the source register, which holds the address from which the value is fetched.
In IMMEDIATE addressing, the value immediately to the right is written into the register.
In REGISTER TO REGISTER addressing, a value is copied from one register into another.
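If pointers in C are familiar, the non-indexed modes map onto them quite directly. The following is only a rough C model of the example program, with local variables standing in for registers; addressing_demo is a made-up name.

```c
#include <stdint.h>

static const uint32_t VAL = 0x17;   /* plays the role of "VAL: .word 0x17" */

/* Mirrors the moves in the example and returns what ends up in R4. */
static uint32_t addressing_demo(void)
{
    const uint32_t *r0 = &VAL;  /* LDR R0,=VAL  : R0 holds an address           */
    uint32_t r1 = *r0;          /* LDR R1,[R0]  : fetch the value it points to  */
    uint32_t r2 = 1;            /* MOV R2,#1    : constant inside the instruction */
    uint32_t r3 = r1 + r2;      /* ADD R3,R1,R2                                 */
    uint32_t r4 = r3;           /* MOV R4,R3    : register to register          */
    return r4;
}
```

With VAL set to 0x17, the function returns 0x18.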
These are the ways to move data without any indexing, i.e., they cannot be used to walk through a block of memory holding multiple values. Before looking into indexed modes, let us have an introduction to branching statements.
Let's Jump Around a Bit
In this section we will just peep into branching; a later section can take a deeper dive.
Syntax of branching:
label:
instructions
instructions
.
.
condition check
branching statement
//PROGRAM TO ADD 10 NUMBERS
.global _start
_start:
LDR R0,=length
LDR R1,=memory_block
LDR R3,[R0]
MOV R4,#0
LOOP: //BRANCH LABEL
LDR R2,[R1],#4 //POST INCREMENT
ADD R4,R4,R2 //ADDING PREVIOUS VALUE OF R4 WITH R2
SUB R3,#1 //DECREMENTING R3
CMP R3,#0 //CHECKING IF R3 IS ZERO
BNE LOOP //IF ZERO FLAG IS NOT SET (R3 != 0) THEN LOOP
END: BAL END //HALT STATEMENT
.data
memory_block:
.word 1,2,3,4,5,6,7,8,9,10
length:
.word 10
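For comparison, here is roughly what that loop looks like in C. This is a sketch, not a translation produced by any compiler; sum_block is a made-up name, and like the assembly it checks the counter only after the first pass, so it assumes at least one element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum `length` words starting at `block`, one word per loop pass. */
static uint32_t sum_block(const uint32_t *block, size_t length)
{
    assert(block != NULL && length >= 1);
    uint32_t sum = 0;             /* MOV R4,#0                      */
    do {
        sum += *block++;          /* LDR R2,[R1],#4 ; ADD R4,R4,R2  */
        length--;                 /* SUB R3,#1                      */
    } while (length != 0);        /* CMP R3,#0 ; BNE LOOP           */
    return sum;
}
```

Called on the ten words 1 to 10, it returns 55, which is the value R4 should hold once the assembly reaches END.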
Syntax for branching statement:
B{CEF} LABEL
where CEF is the conditional execution flag.
Different CEFs:
EQ   Equal
NE   Not equal
CS   Carry set (identical to HS)
HS   Unsigned higher or same (identical to CS)
CC   Carry clear (identical to LO)
LO   Unsigned lower (identical to CC)
MI   Minus or negative result
PL   Positive or zero result
VS   Overflow
VC   No overflow
HI   Unsigned higher
LS   Unsigned lower or same
GE   Signed greater than or equal
LT   Signed less than
GT   Signed greater than
LE   Signed less than or equal
AL   Always (this is the default)
Example:
BEQ EQUAL_LABEL
BLT LESS_THAN_LABEL
Yep, lots of new stuff there :) But let's crack it.
The problem with all the other addressing modes was that we could not step through a block of memory. That is what indexed addressing is for.
Here are the different kinds of indexed addressing:
//Pre-indexed (base with displacement) / Register indirect with offset
LDR R0,[R1, #4] //VALUE AT ADDRESS R1+4 IS LOADED INTO R0; R1 IS NOT CHANGED
//Pre-indexed (auto indexing) / Register indirect pre-incrementing
LDR R0, [R1, #4]! //R1 IS FIRST INCREMENTED BY 4, THEN VALUE AT THE NEW R1 IS LOADED INTO R0
//Post-indexed (auto indexing) / Register indirect post-incrementing
LDR R0,[R1], #4 //VALUE AT R1 IS LOADED INTO R0, THEN R1 IS INCREMENTED BY 4
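The difference between the three forms is only when (and whether) the base register is updated. As a hedged illustration, the C sketch below models R1 as a pointer to 32-bit words, so adding 1 to it corresponds to adding 4 bytes. The function names are invented for this example.

```c
#include <stdint.h>

/* LDR R0,[R1,#4] : read from R1+4; R1 itself is not changed. */
static uint32_t ldr_offset(const uint32_t **r1)
{
    return *(*r1 + 1);
}

/* LDR R0,[R1,#4]! : first R1 = R1+4, then read from the new R1. */
static uint32_t ldr_pre_index(const uint32_t **r1)
{
    *r1 += 1;
    return **r1;
}

/* LDR R0,[R1],#4 : read from R1, then R1 = R1+4. */
static uint32_t ldr_post_index(const uint32_t **r1)
{
    uint32_t value = **r1;
    *r1 += 1;
    return value;
}
```

Starting with R1 pointing at the first of the words 10, 20, 30: ldr_offset yields 20 and leaves R1 alone, ldr_pre_index yields 20 and leaves R1 on the second word, and a following ldr_post_index yields 20 again and leaves R1 on the third word.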
Arithmetic and Logical operations:
.global _start
_start:
MOV R0,#10
MOV R1,#20
ADD R2,R0,R1
MUL R3,R0,R1
SUB R4,R1,R0
SUBS R5,R0,R1 //CLEARLY R0-R1 IS NEGATIVE SO WE USE SUBS TO SET
//FLAGS IN CPSR
This is quite self-explanatory.
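What SUBS does to the flags is perhaps the least obvious part. Here is a small C model of it, under the assumption that the usual Arm rules apply (C set means "no borrow", V set means signed overflow). The struct and function names are hypothetical.

```c
#include <stdint.h>

struct flags { unsigned n, z, c, v; };

/* Model of SUBS Rd,Rn,Rm: returns Rn - Rm and fills in N, Z, C, V. */
static uint32_t subs(uint32_t rn, uint32_t rm, struct flags *f)
{
    uint32_t rd = rn - rm;                    /* wraps modulo 2^32          */
    f->n = rd >> 31;                          /* top bit of the result      */
    f->z = (rd == 0);
    f->c = (rn >= rm);                        /* 1 when no borrow occurred  */
    f->v = ((rn ^ rm) & (rn ^ rd)) >> 31;     /* signed overflow            */
    return rd;
}
```

For the operands in the program (10 minus 20) the model gives 0xFFFFFFF6 with N set and Z, C, V clear.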
//Logical Instructions
.global _start
_start:
LDR R0,=0x00FF00FF //VALUES MUST FIT IN 32 BITS; LDR = IS USED BECAUSE MOV CAN ONLY ENCODE A LIMITED SET OF IMMEDIATES
LDR R1,=0xAA00AA00
AND R2,R0,R1
ORR R3,R0,R1 //OR
EOR R4,R0,R1 //EXCLUSIVE OR
MVN R5,R0 //NEGATION (BITWISE NOT)
---------------------------------------------------------------------------
//ROTATES AND SHIFTS
.global _start
_start:
MOV R0, #10
LSL R0,#1 //LOGICAL SHIFT LEFT BY 1
LSR R0,#1 //LOGICAL SHIFT RIGHT BY 1
ROR R0,#1 //ROTATE RIGHT BY 1
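C has shift operators but no rotate operator, so a rotate has to be built from two shifts. A minimal sketch, with made-up helper names and shift amounts restricted to the range where C's behaviour is well defined:

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift left and right on 32-bit values. */
static uint32_t lsl(uint32_t x, unsigned n) { assert(n < 32); return x << n; }
static uint32_t lsr(uint32_t x, unsigned n) { assert(n < 32); return x >> n; }

/* Rotate right: bits shifted out at the bottom re-enter at the top. */
static uint32_t ror(uint32_t x, unsigned n)
{
    assert(n >= 1 && n < 32);
    return (x >> n) | (x << (32 - n));
}
```

Following the program above: 10 shifted left once is 20, shifted right once is 10 again, and rotated right once is 5. Rotating an odd value such as 5 moves its lowest bit to the top, giving 0x80000002.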
Conditional Execution :
Instead of branching, we can attach a condition directly to an instruction, so that it executes only when the condition holds.
.global _start
.equ val ,0xfffffffe
_start:
MOV R0,#0xFFFFFFFE
MOV R1,#2
MOV R3,#0 // CLEAR R3 SO IT ENDS UP AS EXACTLY 0 OR 1
ADDS R2,R0,R1 // ADD WITH FLAGS
ADDCS R3,#1 //IF CARRY IS SET, ADD 1 TO R3
In the above example the addition produces a carry, so the ADDCS instruction executes and we indicate the carry by storing 1 in R3.
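In C terms, the CS suffix behaves like an if around a single statement. The sketch below models the carry-out of a 32-bit addition and then the conditional add; adds and carry_demo are hypothetical names used only here.

```c
#include <stdint.h>

/* ADDS Rd,Rn,Rm: returns the 32-bit sum and reports the carry-out in *carry. */
static uint32_t adds(uint32_t rn, uint32_t rm, unsigned *carry)
{
    uint32_t rd = rn + rm;      /* wraps modulo 2^32                        */
    *carry = (rd < rn);         /* wrapped around => carry out of bit 31    */
    return rd;
}

/* The ADDS / ADDCS pair from the example; returns the final R3. */
static uint32_t carry_demo(void)
{
    unsigned c;
    uint32_t r3 = 0;
    uint32_t r2 = adds(0xFFFFFFFEu, 2u, &c);
    (void)r2;                   /* the sum itself is not needed here        */
    if (c)                      /* the "CS" suffix: only if carry is set    */
        r3 += 1;                /* ADDCS R3,#1                              */
    return r3;
}
```

0xFFFFFFFE plus 2 does not fit in 32 bits, so the carry is set and carry_demo returns 1.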
Context Switching
Whenever we jump from one function to another, we have no control over which GPRs the callee overwrites. So we have to preserve the current state (or context). Let's have a look at how to do that.
.global _start
_start:
MOV R0,#21
MOV R1,#31
PUSH {R0,R1} //preserving the state(context)
BL ADDNUM //branch and store the address of the next instruction in the link register
POP {R0,R1} //retrieving the state
ADD R3,R0,R1
END: B END //halt here, otherwise execution would fall through into ADDNUM again
ADDNUM:
MOV R0,#31 //over writing preserved regs
MOV R1,#45
ADD R4,R0,R1
BX lr //branch using address stored in register (where register is lr in our case)
The idea is to push the register values onto the stack; when we switch context with BL, the address of the next instruction is placed in the link register.
In the subroutine we are free to overwrite the GPRs, and once we return from it we can restore them by popping the saved values from the stack.
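To see why the saved values survive, it can help to model the stack as a plain array. The C sketch below is a toy model under the assumption of a full descending stack (SP moves down on push, lower-numbered register at the lower address); all names in it are invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

static uint32_t stack_mem[STACK_WORDS];
static uint32_t *sp = stack_mem + STACK_WORDS;  /* empty stack: SP just past the end */

/* PUSH {Ra,Rb}: SP moves down two words, Ra lands at the lower address. */
static void push2(uint32_t ra, uint32_t rb)
{
    assert(sp - stack_mem >= 2);
    sp -= 2;
    sp[0] = ra;
    sp[1] = rb;
}

/* POP {Ra,Rb}: reads the two words back and moves SP up again. */
static void pop2(uint32_t *ra, uint32_t *rb)
{
    assert(stack_mem + STACK_WORDS - sp >= 2);
    *ra = sp[0];
    *rb = sp[1];
    sp += 2;
}

/* Stands in for ADDNUM: it clobbers the caller's R0 and R1. */
static uint32_t addnum(uint32_t *r0, uint32_t *r1)
{
    *r0 = 31;
    *r1 = 45;
    return *r0 + *r1;
}

/* The whole example; returns the final R3. */
static uint32_t context_demo(void)
{
    uint32_t r0 = 21, r1 = 31;
    push2(r0, r1);                    /* PUSH {R0,R1}                      */
    uint32_t r4 = addnum(&r0, &r1);   /* BL ADDNUM: r0, r1 get overwritten */
    (void)r4;
    pop2(&r0, &r1);                   /* POP {R0,R1}: back to 21 and 31    */
    return r0 + r1;                   /* ADD R3,R0,R1                      */
}
```

context_demo returns 52 (21 + 31), not 76, because the pop restores the values that were saved before the call.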
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Running some assembly on actual hardware.
For this section I'll be using my BeagleBone Black. You can follow along with a Raspberry Pi or any Arm hardware running a general-purpose operating system, or continue with the emulator.
Syscalls in arm assembly
We can make a syscall by storing the syscall ID in register R7 and then executing the SWI instruction (called SVC in newer Arm documentation). You may have noticed that if we did not use a blocking branch such as:
end:b end
our control would just wander into a deep mystical black hole.
To prevent this on hardware, we can exit the program by storing 1 in R7 and raising a software interrupt.
.global _start
_start:
MOV R0,#10
MOV R1,#12
ADD R3,R0,R1
MOV R7,#1
SWI 0 //SOFTWARE INTERRUPT TO STOP THE PROGRAM
For this example I will be using the BBB with the built-in Debian image on its eMMC.
I'll drop into a serial terminal. All you need is the same thing: access to a terminal on an Arm machine that has an assembler installed. If you have a Raspberry Pi, you can use its terminal over your fancy HDMI display or just use SSH to get a remote terminal.
BBB setup :
So the goal is to get a terminal. If you can do that on your own, you are all set. I'll show how to get access from a Linux host using picocom, which you can find in your package manager.
sudo apt install picocom -y
Now you can connect to the serial terminal using a serial-to-USB converter. I recommend this one, as its pins line up perfectly with the BBB; you just have to replace the connector with a male-to-female right-angled header.
Now plug in your USB converter and fire up picocom.
picocom -b 115200 /dev/ttyUSB0
Now power your BBB over USB and you can see the boot sequence on your terminal.
Log in with username debian and password temppwd.
You will now be logged into your home directory.
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Let's get coding
Step 1: Create a working directory
mkdir Arm && cd Arm
Step 2: Create an assembly file
touch hello.s
Step 3:Now you can write to this file using nano / vim. If you have no experience with vim feel free to use nano.
nano hello.s
Step 4: Write the following code
.global _start
_start:
MOV R0,#1 //file descriptor (0->stdin, 1->stdout, 2->stderr)
LDR R1,=message
LDR R2,=len //len of message
MOV R7,#4 //syscall number for write
SWI 0 //software interrupt
MOV R0,#0 //exit status 0
MOV R7,#1 //syscall to exit program
SWI 0
.data
message:
.ascii "hello world \n" //.ascii adds no trailing NUL, so len counts only these bytes
len = .-message
If you are using nano, press Ctrl+O and then Ctrl+X; in vim, type :wq (now you know how to exit from vim).
Step 5: Assemble it into an object file using the as command:
as hello.s -o hello.o
Step 6: Now we link the object file into an executable and run it
ld hello.o -o hello
./hello
Now you should see the hello world output on your screen :)
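For readers coming from C, the same system call is normally reached through the libc wrapper write() rather than by loading R7 by hand. The sketch below assumes a Linux or other POSIX system; say_hello is a made-up name.

```c
#include <string.h>
#include <unistd.h>

/* Same job as the assembly: ask the kernel to write the message to
   file descriptor 1 (stdout). Returns the number of bytes written,
   or -1 on error. */
static long say_hello(void)
{
    static const char message[] = "hello world \n";
    return (long)write(STDOUT_FILENO, message, strlen(message));
}
```

The three arguments line up with R0, R1 and R2 in the assembly version: file descriptor, buffer address, and length.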
Alright, goodbye then. This was a decent introduction to Arm assembly; I hope you enjoyed it.
Now you can try some hardware interfacing, sorting, etc.
Signing off