4747da3a5b Add syscall sandboxing (seccomp-bpf) (practicalswift)
Pull request description:
Add experimental syscall sandboxing using seccomp-bpf (Linux secure computing mode).
Enable filtering of system calls using seccomp-bpf: allow only explicitly allowlisted (expected) syscalls to be called.
The syscall sandboxing implemented in this PR is an experimental feature currently available only under Linux x86-64.
To enable the experimental syscall sandbox the `-sandbox=<mode>` option must be passed to `bitcoind`:
```
-sandbox=<mode>
Use the experimental syscall sandbox in the specified mode
(-sandbox=log-and-abort or -sandbox=abort). Allow only expected
syscalls to be used by bitcoind. Note that this is an
experimental new feature that may cause bitcoind to exit or crash
unexpectedly: use with caution. In the "log-and-abort" mode the
invocation of an unexpected syscall results in a debug handler
being invoked which will log the incident and terminate the
program (without executing the unexpected syscall). In the
"abort" mode the invocation of an unexpected syscall results in
the entire process being killed immediately by the kernel without
executing the unexpected syscall.
```
The allowed syscalls are defined on a per thread basis.
I've used this feature since summer 2020 and I find it to be a helpful testing/debugging addition which makes it much easier to reason about the actual capabilities required of each type of thread in Bitcoin Core.
---
Quick start guide:
```
$ ./configure
$ src/bitcoind -regtest -debug=util -sandbox=log-and-abort
…
2021-06-09T12:34:56Z Experimental syscall sandbox enabled (-sandbox=log-and-abort): bitcoind will terminate if an unexpected (not allowlisted) syscall is invoked.
…
2021-06-09T12:34:56Z Syscall filter installed for thread "addcon"
2021-06-09T12:34:56Z Syscall filter installed for thread "dnsseed"
2021-06-09T12:34:56Z Syscall filter installed for thread "net"
2021-06-09T12:34:56Z Syscall filter installed for thread "msghand"
2021-06-09T12:34:56Z Syscall filter installed for thread "opencon"
2021-06-09T12:34:56Z Syscall filter installed for thread "init"
…
# A simulated execve call to show the sandbox in action:
2021-06-09T12:34:56Z ERROR: The syscall "execve" (syscall number 59) is not allowed by the syscall sandbox in thread "msghand". Please report.
…
Aborted (core dumped)
$
```
---
[About seccomp and seccomp-bpf](https://en.wikipedia.org/wiki/Seccomp):
> In computer security, seccomp (short for secure computing mode) is a facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit(), sigreturn(), and read() and write() to already-open file descriptors. Should it attempt any other system calls, the kernel will terminate the process with SIGKILL or SIGSYS. In this sense, it does not virtualize the system's resources but isolates the process from them entirely.
>
> […]
>
> seccomp-bpf is an extension to seccomp that allows filtering of system calls using a configurable policy implemented using Berkeley Packet Filter rules. It is used by OpenSSH and vsftpd as well as the Google Chrome/Chromium web browsers on Chrome OS and Linux. (In this regard seccomp-bpf achieves similar functionality, but with more flexibility and higher performance, to the older systrace—which seems to be no longer supported for Linux.)
ACKs for top commit:
laanwj:
Code review and lightly tested ACK 4747da3a5b
Tree-SHA512: e1c28e323eb4409a46157b7cc0fc29a057ba58d1ee2de268962e2ade28ebd4421b5c2536c64a3af6e9bd3f54016600fec88d016adb49864b63edea51ad838e17
However, keep a declaration in validation to make it possible to move
smaller chunks to blockstorage without breaking compilation.
Also, expose AbortNode in the header.
Can be reviewed with --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space