From 32ebbacc40155bd2ae7f01b8528812271cca8ee2 Mon Sep 17 00:00:00 2001
From: tocariimaa
Date: Fri, 14 Mar 2025 00:10:44 -0300
Subject: [PATCH] articles/bytecode.md: add Lua bytecode example

---
 articles/bytecode.md | 76 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/articles/bytecode.md b/articles/bytecode.md
index 47d21f2..cedc84c 100644
--- a/articles/bytecode.md
+++ b/articles/bytecode.md
@@ -25,4 +25,80 @@ before lowering down to more machine-specific code.
 TODO more info
 
 ### Lua bytecode
+[Lua](lua.md) uses a register-based virtual machine. It was originally stack-based; the switch was
+made in version 5.0 and improved performance.
+
+#### Factorial program in bytecode form
+```lua
+function fact(n)
+    local res = 1
+    for i = 1, n do
+        -- Lua has no compound assignment operators (such as `*=`)
+        res = res * i
+    end
+    return res
+end
+
+print(fact(5))
+```
+Compile it and dump the bytecode listing with `luac` (`-p` only parses, `-l -l` prints the full listing):
+```sh
+luac -p -l -l fact.lua
+```
+
+Annotated (by me) bytecode output, from Lua 5.4:
+```
+Toplevel function:                                                     main (9 instructions at 0x58e78aa2ecc0)
+                                                                       0+ params, 3 slots, 1 upvalue, 0 locals, 2 constants, 1 function
+Varargs, 0 fixed arguments expected                                     1 [1] VARARGPREP 0
+Create a function                                                       2 [8] CLOSURE    0 0    ; 0x58e78aa2ef20
+Bind it to the name "fact" in _ENV                                      3 [1] SETTABUP   0 0 0  ; _ENV "fact"
+Get `print` function reference from _ENV, to R0                         4 [8] GETTABUP   0 0 1  ; _ENV "print"
+Get `fact` function reference from _ENV, to R1                          5 [9] GETTABUP   1 0 0  ; _ENV "fact"
+Load 5 into R2 (first argument for `fact`)                              6 [9] LOADI      2 5
+Call function at R1 with 2-1 args, results saved up to `top` ^1         7 [9] CALL       1 2 0  ; 1 in all out
+Call function at R0 with args up to `top`, no return value              8 [9] CALL       0 0 1  ; all in 0 out
+Return from toplevel (exit program)                                     9 [9] RETURN     0 1 1  ; 0 out
+                                                                       constants (2) for 0x58e78aa2ecc0:
+The function name, as a string constant                                 0  S  "fact"
+Ditto for the `print` function                                          1  S  "print"
+                                                                       locals (0) for 0x58e78aa2ecc0:
+                                                                       upvalues (1) for 0x58e78aa2ecc0:
+_ENV is the table that contains the global environment                  0  _ENV  1  0
+
+
+The `fact` function:                                                   function (10 instructions at 0x58e78aa2ef20)
+                                                                       1 param, 6 slots, 0 upvalues, 6 locals, 0 constants, 0 functions
+Load 1 into R1 (`res`)                                                  1 [2] LOADI      1 1
+Load 1 into R2, holding the for loop initial state                      2 [3] LOADI      2 1
+Move value from R0 [`n`] to the for loop limit register                 3 [3] MOVE       3 0
+For loop step value, 1 by default                                       4 [3] LOADI      4 1
+Initialize for loop ^2                                            .---  5 [3] FORPREP    2 2    ; exit to 9
+Multiply: R1 [`res`] = R1 * R5 [`i`]                              | .>  6 [5] MUL        1 1 5
+Fallback: try the `__mul` metamethod ^3                           | |   7 [5] MMBIN      1 5 8  ; __mul
+Do a for loop iteration                                           | `-  8 [3] FORLOOP    2 3    ; to 6
+Return from function with 1 value, R1 [`res`]                     `-->  9 [7] RETURN1    1
+Return with no values (redundant bytecode) ^4                          10 [8] RETURN0
+                                                                       constants (0) for 0x58e78aa2ef20:
+Local bindings for this function, with the registers:                  locals (6) for 0x58e78aa2ef20:
+                                                                        0  n            1  11
+                                                                        1  res          2  11
+                                                                        2  (for state)  5  9
+                                                                        3  (for state)  5  9
+                                                                        4  (for state)  5  9
+                                                                        5  i            6  8
+                                                                       upvalues (0) for 0x58e78aa2ef20:
+```
+Notes:
+1. `top` is a special slot in the virtual machine, originally meaning the "top" of the stack (when Lua had stack-based
+bytecode). Lua still uses a stack model for the C API.
+2. The first argument to this opcode is a base offset into the register file; `FORPREP` treats the four registers
+starting there as its operands: R(Off) = initial and internal state of the loop,
+R(Off + 1) = the loop limit, R(Off + 2) = the loop step value (1 here) and R(Off + 3) = the register containing the external
+state, `i` in this case. The second argument is the jump offset to the instruction after the end of the loop.
+3. This is not redundant: when both operands are numbers, `MUL` does the multiplication itself and skips the following `MMBIN`; otherwise execution falls through to `MMBIN`, which tries the `__mul` metamethod.
+4. This instruction is meant for functions that return nothing. Here it probably exists
+because the compiler always appends it to the end of every function and does not bother
+to check afterwards whether it is redundant, which keeps the bytecode compiler simple and fast.
+
 TODO
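
On the `MMBIN` instruction that follows `MUL` in the listing (note 3): as far as I understand Lua 5.4, `MUL` handles the plain numeric case on its own and skips the next instruction, while operands that are not numbers fall through to `MMBIN`, which looks up `__mul`. The sketch below illustrates both paths from the Lua side. The `Vec` table and the `vec` and `mul` helpers are hypothetical names made up for this example and do not appear in the article.

```lua
-- Hypothetical example type: a 2D vector stored in a table with a metatable.
local Vec = {}

local function vec(x, y)
    return setmetatable({ x = x, y = y }, Vec)
end

-- `__mul` is only consulted when the operands cannot be multiplied as numbers.
-- This sketch supports vector * number only.
Vec.__mul = function(a, b)
    return vec(a.x * b, a.y * b)
end

-- The same `a * b` expression serves both cases; in Lua 5.4 it should
-- compile to a MUL instruction followed by an MMBIN instruction.
local function mul(a, b)
    return a * b
end

local n = mul(6, 7)          -- both operands are numbers: no metamethod involved
local v = mul(vec(1, 2), 3)  -- left operand is a table: falls back to `__mul`

print(n, v.x, v.y)           -- prints 42, 3 and 6 (tab-separated)
```

Running `luac -p -l -l` on this file should show the `MUL`/`MMBIN` pair inside `mul`, the same pattern as in the `fact` listing.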