# Assembly
Assembly, also known as assembler, is a family of low-level [programming languages](programming_language.md), each closely tied to the machine code of a specific
[CPU](cpu.md) architecture; assembly is therefore not portable by definition.
Assembly is also OS-specific, since it depends on [syscalls](syscall.md) to do anything useful, and details such as the program
entrypoint differ between operating systems.
An assembly program is converted into machine code by an *assembler*. Examples include gas, the assembler of the [GNU](gnu.md)
project, which supports multiple architectures, and [NASM](https://www.nasm.us/), a popular x86-only assembler.
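
As a rough illustration of what an assembler does, the sketch below lists a few x86-64 instructions in NASM syntax together with the machine code bytes they typically assemble to. The byte values are the usual encodings for these instructions; an assembler is free to pick a different equivalent encoding, so treat the comments as illustrative.

```nasm
bits 64             ; assemble for 64-bit mode

mov eax, 1          ; B8 01 00 00 00   (opcode B8 followed by a 32-bit immediate)
xor edi, edi        ; 31 FF
syscall             ; 0F 05
ret                 ; C3
nop                 ; 90
```

Assembling such a file with `nasm -f bin` and viewing the output in a hex dump tool, or asking NASM for a listing file with `-l`, shows the emitted bytes next to each source line.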
## Examples
### Hello world (x86-64, Linux)
```nasm
default rel ; tell the assembler to use RIP-relative addressing
; instead of absolute memory addresses
section .text ; start of the section containing the program's code (.text)
global _start ; make entrypoint symbol global
_start: ; On Linux, the executable entrypoint symbol name is `_start`
mov rax, 0x1 ; sys_write
mov rdi, 0x1 ; stdout
mov rdx, hello_str_len ; length of buffer
lea rsi, [hello_str] ; address of buffer
syscall
mov rax, 0x3c ; sys_exit (60)
xor rdi, rdi ; return code, zero
syscall
section .rodata ; start of the section containing readonly data (.rodata)
hello_str: db "Hello world!", 0xa ; `db` is an assembler directive for embedding data (a string here)
hello_str_len: equ $-hello_str ; assemble-time length calculation
```
Assembling and linking:
```sh
nasm -felf64 -o hello.o hello.asm # assemble, generates an object file
cc -static-pie -nostdlib -o hello hello.o # now link to get an executable
```
### Hello world (ARM64, Linux)
```asm
.global _start
.section .text
_start:
mov x0, #1 // stdout file descriptor
adr x1, msg // address of buffer, pc relative
mov x2, msg_len // length of buffer
mov w8, #0x40 // write syscall number (64)
svc #0 // do syscall
mov x0, #39 // exit status (arbitrary; 0 conventionally means success)
mov w8, #0x5d // exit syscall number (93)
svc #0
.section .rodata
msg: .ascii "Hello, world!\n"
.equ msg_len, . - msg
```
Assembling and linking, this time with gas (the GNU assembler):
```sh
as -o hello.o hello.asm
cc -nostdlib -o hello hello.o
```
### Factorial (x86-64, Linux)
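Exits with the computed factorial as the exit code. Only the low 8 bits of an exit code are reported (0 to 255), so 5! = 120 is the largest factorial that comes through intact.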
```nasm
default rel
section .text
global _start
; Compute the factorial of `rdi` (expected to be at least 1), saving the result in `rax`
fact:
mov eax, 1 ; register holding result
.l:
mul rdi ; rax *= rdi (the high 64 bits of the product go to rdx)
dec rdi ; decrement rdi
test rdi, rdi ; check if rdi = 0
jne .l ; loop again if rdi != 0
ret
_start:
; x86-64 registers have narrower aliases: edi is the low 32 bits of rdi,
; and writing to edi zero-extends the value into the full rdi
mov edi, 5
call fact
; exit syscall
mov rdi, rax
mov eax, 0x3c
syscall
```
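
The `fact` routine above assumes its input is at least 1: with `rdi = 0` the counter would wrap around instead of stopping. A minimal sketch of a variant that returns 1 for an input of 0 could look like the following; the name `fact_safe` and the `.done` label are hypothetical and chosen only for this illustration.

```nasm
default rel
section .text
global _start

; Hypothetical variant of `fact`: also handles rdi = 0 by returning 1
fact_safe:
    mov eax, 1          ; result starts at 1, which is also 0!
    test rdi, rdi       ; is the input zero?
    jz .done            ; if so, skip the loop
.l:
    mul rdi             ; rax *= rdi (high half of the product goes to rdx)
    dec rdi             ; dec sets the zero flag once rdi reaches 0
    jnz .l              ; loop again while rdi != 0
.done:
    ret

_start:
    xor edi, edi        ; input 0 (writing edi zero-extends into rdi)
    call fact_safe
    mov edi, eax        ; exit code = result
    mov eax, 0x3c       ; sys_exit (60)
    syscall
```

Assembled and linked the same way as the hello world example, this program should exit with status 1. Because `dec` already updates the zero flag, the separate `test` inside the loop is not needed in this variant.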
TODO examples for other architectures