Assembly

Assembly, also known as assembler, is a family of low-level programming languages, each closely tied to the machine code of a specific CPU architecture, so assembly is by definition not portable across architectures. Assembly programs are usually OS-specific as well, since they depend on syscalls to do anything useful and details such as the program entrypoint differ between operating systems.

An assembly program is converted into machine code by a program called an assembler. Examples include gas, the assembler of the GNU project, which supports multiple architectures, and NASM, a popular x86-only assembler.
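As a rough illustration of that conversion, the NASM snippet below lists three x86-64 instructions with the bytes they encode to in the comments. The file name `enc.asm` used in the note afterwards is a placeholder, and another assembler may pick a different but equivalent encoding for the same instruction.

```nasm
bits 64                 ; generate 64-bit code (the flat binary format defaults to 16-bit)

	mov eax, 0x3c       ; b8 3c 00 00 00  (opcode b8, then the 32-bit immediate, little-endian)
	xor edi, edi        ; 31 ff
	syscall             ; 0f 05
```

Assembling it with `nasm -f bin -o enc.bin enc.asm` produces a raw 9-byte file, which can be inspected with a hex dump tool such as `od -An -tx1 enc.bin`.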

Examples

Hello world (x86-64, Linux)

default rel                 ; tell the assembler to use RIP-relative addressing
                            ; instead of absolute memory addresses
section .text               ; start of the section containing the program's code (.text)
global _start               ; make entrypoint symbol global

_start:                     ; On Linux, the executable entrypoint symbol name is `_start`
	mov rax, 0x1            ; sys_write
	mov rdi, 0x1            ; stdout
	mov rdx, hello_str_len  ; length of buffer
	lea rsi, [hello_str]    ; address of buffer
	syscall

	mov rax, 0x3c           ; sys_exit (60)
	xor rdi, rdi            ; return code, zero
	syscall

section .rodata                     ; start of the section containing readonly data (.rodata)
hello_str: db "Hello world!", 0xa   ; `db` is an assembler directive for embedding data (a string here)
hello_str_len: equ $-hello_str      ; assemble-time length calculation

Assembling and linking:

nasm -felf64 -o hello.o hello.asm             # assemble, generates an object file
cc -static-pie -nostdlib -o hello hello.o     # now link to get an executable

Hello world (ARM64, Linux)

.global _start
.section .text

_start:
	mov x0, #1          // stdout file descriptor
	adr x1, msg         // address of buffer, pc relative
	mov x2, msg_len     // length of buffer
	mov w8, #0x40       // sys_write (64)
	svc #0              // do syscall

	mov x0, #39         // exit code
	mov w8, #0x5d       // sys_exit (93)
	svc #0

.section .rodata
msg: .ascii "Hello, world!\n"
.equ msg_len, . - msg

This example is assembled with gas (the GNU assembler):

as -o hello.o hello.asm
cc -nostdlib -o hello hello.o

Factorial (x86-64, Linux)

Returns the calculated factorial as the exit code. The exit status is limited to 8 bits (255 max.), so only inputs up to 5 (5! = 120) give the right result.

default rel

section .text
global _start

; Compute the factorial of `rdi`, saving the result in `rax`.
; Assumes rdi >= 1; clobbers rdx (it receives the high half of the product).
fact:
	mov eax, 1			; register holding result
.l:
	mul rdi				; rdx:rax = rax * rdi
	dec rdi				; decrement rdi
	test rdi, rdi		; check if rdi = 0
	jne .l				; loop again if rdi != 0
	ret

_start:
	; x86-64 registers have narrower sub-registers; here we write to edi
	; (the low 32 bits of rdi), which also zeroes the upper 32 bits of rdi
	mov edi, 5
	call fact
	; exit syscall
	mov rdi, rax
	mov eax, 0x3c
	syscall
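To see the limit on the exit code in practice, a minimal sketch is a program that exits with 6! = 720 directly instead of calling `fact`. It is a separate illustration rather than part of the example above. Linux reports only the low 8 bits of the value passed to sys_exit, so the parent process sees 720 mod 256 = 208.

```nasm
section .text
global _start

_start:
	mov edi, 720        ; 6! = 720 = 0x2d0, too large for an 8-bit exit status
	mov eax, 0x3c       ; sys_exit (60)
	syscall             ; the parent process sees 0xd0 = 208
```

Built with the same `nasm` and `cc` commands as the x86-64 hello world, running the program and then `echo $?` in the shell prints 208.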

TODO examples for other architectures