SECCOMP(1)

dandb3 · May 22, 2023

pwnable


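Sandbox?
- Rather than protecting against a vulnerability itself, a sandbox is a technique that shrinks the attack surface available to an attacker.
- It works with an allow list or a deny list, permitting only the system calls, file accesses, and so on that are strictly necessary.
SECCOMP?
- Short for SECure COMPuting mode.
- One of the sandbox mechanisms.
- It restricts which system calls can be executed.
- Each system call is checked before it runs. In strict mode a disallowed system call is answered with SIGKILL; in filter mode the outcome depends on the action the filter returns.
Modes
- STRICT_MODE
  - As the name says, this is the strict mode: only the read, write, _exit, and sigreturn system calls are allowed.
- FILTER_MODE
  - The system calls to filter can be configured flexibly.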

SECCOMP implementation code

    int __secure_computing(const struct seccomp_data *sd) {
        int mode = current->seccomp.mode;
        int this_syscall;
        ...
        this_syscall = sd ? sd->nr : syscall_get_nr(current, task_pt_regs(current));
        switch (mode) {
        case SECCOMP_MODE_STRICT:
            __secure_computing_strict(this_syscall); /* may call do_exit */
            return 0;
        case SECCOMP_MODE_FILTER:
            return __seccomp_filter(this_syscall, sd, false);
        ...
        }
    }

We can see that the mode determines whether the STRICT check or the FILTER check is performed.

STRICT_MODE?

static const int mode1_syscalls[] = {
    __NR_seccomp_read,
    __NR_seccomp_write,
    __NR_seccomp_exit,
    __NR_seccomp_sigreturn,
    -1, /* negative terminated */
};
#ifdef CONFIG_COMPAT
static int mode1_syscalls_32[] = {
    __NR_seccomp_read_32,
    __NR_seccomp_write_32,
    __NR_seccomp_exit_32,
    __NR_seccomp_sigreturn_32,
    0, /* null terminated */
};
#endif
static void __secure_computing_strict(int this_syscall) {
    const int *allowed_syscalls = mode1_syscalls;
#ifdef CONFIG_COMPAT
    if (in_compat_syscall()) allowed_syscalls = get_compat_mode1_syscalls();
#endif
    do {
        if (*allowed_syscalls == this_syscall) return;
    } while (*++allowed_syscalls != -1);
#ifdef SECCOMP_DEBUG
    dump_stack();
#endif
    seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL_THREAD, true);
    do_exit(SIGKILL);
}

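We can see that it checks whether the invoked system call matches an element of the mode1_syscalls or mode1_syscalls_32 array and, if it does not, terminates the task with SIGKILL.
The arrays also confirm that only four system calls are allowed: read, write, _exit, and sigreturn.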
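The behavior described above can be observed from user space. The following is a minimal sketch, not code from the kernel or the original post: `strict_mode_probe` is a hypothetical helper that forks a child, switches it to strict mode with `prctl`, and reports how the child ended. It assumes a Linux system built with CONFIG_SECCOMP.

```c
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration.
 * Forks a child, puts it into strict mode, then makes it call syscall `nr`
 * (pass a negative number to skip the extra call).
 * Returns the signal that killed the child, 0 if the child exited cleanly,
 * or -1 if fork/prctl failed (for example, seccomp is unavailable).
 */
int strict_mode_probe(long nr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) != 0)
            syscall(SYS_exit, 2);   /* strict mode could not be enabled */
        if (nr >= 0)
            syscall(nr);            /* killed here if nr is not on the list */
        /* glibc's _exit() issues exit_group, which strict mode forbids,
         * so the plain exit system call is invoked directly. */
        syscall(SYS_exit, 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

On a system where strict mode is available, `strict_mode_probe(SYS_getpid)` should report SIGKILL, while `strict_mode_probe(-1)` should report a clean exit because the child only uses the exit system call.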

FILTER_MODE?
It can be applied in two ways.
- Using library functions
```
#include <seccomp.h>

typedef void * scmp_filter_ctx;

scmp_filter_ctx seccomp_init(uint32_t def_action);
int seccomp_reset(scmp_filter_ctx ctx, uint32_t def_action);
```
  - You must link with -lseccomp.
  - seccomp_init initializes the seccomp filter state.
  - Once the filter has been loaded and is no longer needed, seccomp_release(3) must be called.
  - The initial state depends on the def_action argument, which can take the following values:
    - SCMP_ACT_KILL : a thread that makes a system call not matched by any seccomp filter rule is terminated by the kernel with SIGSYS. The thread cannot catch the SIGSYS signal.
    - SCMP_ACT_KILL_PROCESS : like SCMP_ACT_KILL, except that the whole process, not just the thread, is terminated with SIGSYS.
    - SCMP_ACT_TRAP : like SCMP_ACT_KILL, except that the thread is not terminated; it is sent SIGSYS, which it can catch and handle.
    - SCMP_ACT_ERRNO(uint16_t errno) : the thread receives errno as the return value of the call.
    - SCMP_ACT_TRACE(uint16_t msg_num) : ... omitted (there is too much to cover)
    - SCMP_ACT_LOG : a system call that matches no rule is unaffected by the filter, but the call is logged.
    - SCMP_ACT_ALLOW : a system call that matches no rule is unaffected by the filter. (Rules that forbid specific calls still apply.)
```
int seccomp_rule_add(scmp_filter_ctx ctx, uint32_t action, int syscall, unsigned int arg_cnt, ...);
int seccomp_load(scmp_filter_ctx ctx);
```
  - This function has so many features that we will only look at it through examples.
  - ALLOW LIST : the default is deny; only what matches the filter is allowed.
```
#include <seccomp.h>
#include <stdio.h>
#include <stdlib.h>

void sandbox() {
    scmp_filter_ctx ctx;
    /* Default action: kill on any system call without a matching rule. */
    ctx = seccomp_init(SCMP_ACT_KILL);
    if (ctx == NULL) {
        printf("seccomp error\n");
        exit(0);
    }
    /* Allow only the system calls listed below. */
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(rt_sigreturn), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(open), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(openat), 0);
    /* Load the filter into the kernel. */
    seccomp_load(ctx);
}
```
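  Code taken from dreamhack.
  - seccomp_init(SCMP_ACT_KILL) : kill on any system call that matches no rule.
  - seccomp_rule_add : adds a rule that allows the given system call via SCMP_ACT_ALLOW.
  - seccomp_load : loads the configured ctx into the kernel so that the filter actually takes effect.
  - DENY LIST : the default is allow; whatever matches the filter is denied.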
```
void sandbox() {
    scmp_filter_ctx ctx;
    ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL) {
        exit(0);
    }
    seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(open), 0);
    seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(openat), 0);
    seccomp_load(ctx);
}
```
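  Similar to the above.
  - seccomp_init(SCMP_ACT_ALLOW) : allow any system call that matches no rule.
  - seccomp_rule_add : adds a rule that forbids the given system call via SCMP_ACT_KILL.
  - seccomp_load : loads the configured ctx into the kernel so that the filter actually takes effect.
  - This is the overall flow of using the library.
- BPF?
  - Berkeley Packet Filter.
  - An in-kernel virtual machine used to analyze and filter packets.
  - Besides packets, it is used for many other purposes inside the kernel.
  - It can also be used to filter system calls.
  - In fact, the seccomp library functions use BPF internally as well.
- Using BPF
  - It is managed through the prctl function.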
```
#include <sys/prctl.h>
int prctl(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5);
```
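    In general, prctl uses the option argument to select which setting to change, and the remaining arguments to adjust that setting for the calling thread or process. Only the seccomp functionality is covered here.
    - PR_SET_SECCOMP
      Sets the seccomp mode of the calling thread.
      For SECCOMP_MODE_FILTER, the caller must first run prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) unless it has CAP_SYS_ADMIN. This keeps an unprivileged process from installing a filter and then executing a set-user-ID program under it.
      - arg2 == SECCOMP_MODE_STRICT : switches to STRICT_MODE. Available only if the kernel is built with CONFIG_SECCOMP enabled.
      - arg2 == SECCOMP_MODE_FILTER : switches to FILTER_MODE. arg3 points to the BPF program structure (struct sock_fprog) that describes the detailed filtering. Available only if the kernel is built with CONFIG_SECCOMP_FILTER enabled.
- Structures that hold a BPF program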
```
/* cBPF (classic BPF) */
struct sock_filter {	/* Filter block */
    __u16	code;   /* Actual filter code */
    __u8	jt;	/* Jump true */
    __u8	jf;	/* Jump false */
    __u32	k;      /* Generic multiuse field */
};

struct sock_fprog {	/* Required for SO_ATTACH_FILTER. */
    unsigned short		len;	/* Number of filter blocks */
    struct sock_filter __user *filter;
};
```
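  - code : the instruction (opcode).
  - jt : the jump offset taken when the condition is true, counted in instructions relative to the next instruction.
  - jf : the jump offset taken when the condition is false, counted the same way.
  - k : a general-purpose field whose meaning depends on the instruction (for example, an immediate value or a load offset).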
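To make the roles of code, jt, jf, and k concrete, here is a toy evaluator. It is a sketch for illustration only, not the kernel's implementation: `toy_bpf_run` and `toy_check` are hypothetical names, and only three instruction forms are handled (absolute 32-bit load, jump-if-equal against a constant, and return of a constant).

```c
#include <linux/filter.h>   /* struct sock_filter, BPF_* opcodes, BPF_STMT, BPF_JUMP */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_* */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy evaluator (hypothetical): runs `prog` against `sd` and returns the
 * value of the first BPF_RET it reaches. Anything it does not understand
 * is treated as SECCOMP_RET_KILL. */
uint32_t toy_bpf_run(const struct sock_filter *prog, size_t len,
                     const struct seccomp_data *sd)
{
    uint32_t acc = 0;   /* the accumulator register "A" */

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *insn = &prog[pc];

        switch (insn->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            /* k is a byte offset into struct seccomp_data. */
            if ((insn->k & 3) || insn->k > sizeof(*sd) - sizeof(acc))
                return SECCOMP_RET_KILL;
            memcpy(&acc, (const char *)sd + insn->k, sizeof(acc));
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            /* k is the constant to compare with; jt/jf are counted
             * from the instruction after this one. */
            pc += (acc == insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_RET | BPF_K:
            return insn->k;   /* k is the return value */
        default:
            return SECCOMP_RET_KILL;
        }
    }
    return SECCOMP_RET_KILL;  /* ran past the last instruction */
}

/* Example program: allow syscall number `allowed`, kill everything else. */
uint32_t toy_check(int nr, int allowed)
{
    struct sock_filter prog[] = {
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (uint32_t)allowed, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    };
    struct seccomp_data sd = { .nr = nr };

    return toy_bpf_run(prog, sizeof(prog) / sizeof(prog[0]), &sd);
}
```

In `toy_check`, a match takes jt = 0 and falls through to the ALLOW return, while a mismatch takes jf = 1 and skips over it to the KILL return.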
- How to write BPF instructions.
```
#ifndef BPF_STMT
#define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
#endif
#ifndef BPF_JUMP
#define BPF_JUMP(code, k, jt, jf) { (unsigned short)(code), jt, jf, k }
#endif
```
  - As this shows, BPF_STMT and BPF_JUMP are macros that expand to a brace-enclosed initializer for a struct.
  - In practice, they are used to declare a struct sock_filter array like this:
```
struct sock_filter filter[] = {
    /* Validate architecture. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, arch_nr),
    BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ARCH_NR, 1, 0),
    BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
    /* Get system call number. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, syscall_nr),
};
```
    So each macro invocation corresponds to one element of the array.
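The snippet leaves arch_nr, syscall_nr, and ARCH_NR undefined. The sketch below fills them in with assumed definitions (offsets into struct seccomp_data and the x86-64 audit architecture constant; these are guesses at the intent, not taken from the original post) so that the array compiles, and adds a hypothetical `dump_filter` helper that prints the fields each macro produced.

```c
#include <linux/audit.h>    /* AUDIT_ARCH_X86_64 */
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_KILL */
#include <stddef.h>
#include <stdio.h>

/* Assumed definitions for the names the snippet does not define. */
#define arch_nr    (offsetof(struct seccomp_data, arch))
#define syscall_nr (offsetof(struct seccomp_data, nr))
#define ARCH_NR    AUDIT_ARCH_X86_64

static const struct sock_filter filter[] = {
    /* Validate architecture. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, arch_nr),
    BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ARCH_NR, 1, 0),
    BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
    /* Get system call number. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, syscall_nr),
};

/* Hypothetical helper: print one line per array element. */
void dump_filter(void)
{
    size_t n = sizeof(filter) / sizeof(filter[0]);

    for (size_t i = 0; i < n; i++)
        printf("[%zu] code=0x%02x jt=%u jf=%u k=0x%08x\n", i,
               (unsigned)filter[i].code, (unsigned)filter[i].jt,
               (unsigned)filter[i].jf, (unsigned)filter[i].k);
}
```

Four macro invocations give four struct sock_filter elements. BPF_STMT always stores 0 in jt and jf, while BPF_JUMP stores the two offsets it was given.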
  - So what exactly is the + operation in the first argument? Let's look at the following source.
```
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _UAPI__LINUX_BPF_COMMON_H__
#define _UAPI__LINUX_BPF_COMMON_H__

/* Instruction classes */
#define BPF_CLASS(code) ((code) & 0x07)
#define		BPF_LD		0x00
#define		BPF_LDX		0x01
#define		BPF_ST		0x02
#define		BPF_STX		0x03
#define		BPF_ALU		0x04
#define		BPF_JMP		0x05
#define		BPF_RET		0x06
#define		BPF_MISC        0x07

/* ld/ldx fields */
#define BPF_SIZE(code)  ((code) & 0x18)
#define		BPF_W		0x00 /* 32-bit */
#define		BPF_H		0x08 /* 16-bit */
#define		BPF_B		0x10 /*  8-bit */
/* eBPF		BPF_DW		0x18    64-bit */
#define BPF_MODE(code)  ((code) & 0xe0)
#define		BPF_IMM		0x00
#define		BPF_ABS		0x20
#define		BPF_IND		0x40
#define		BPF_MEM		0x60
#define		BPF_LEN		0x80
#define		BPF_MSH		0xa0

/* alu/jmp fields */
#define BPF_OP(code)    ((code) & 0xf0)
#define		BPF_ADD		0x00
#define		BPF_SUB		0x10
#define		BPF_MUL		0x20
#define		BPF_DIV		0x30
#define		BPF_OR		0x40
#define		BPF_AND		0x50
#define		BPF_LSH		0x60
#define		BPF_RSH		0x70
#define		BPF_NEG		0x80
#define		BPF_MOD		0x90
#define		BPF_XOR		0xa0

#define		BPF_JA		0x00
#define		BPF_JEQ		0x10
#define		BPF_JGT		0x20
#define		BPF_JGE		0x30
#define		BPF_JSET        0x40
#define BPF_SRC(code)   ((code) & 0x08)
#define		BPF_K		0x00
#define		BPF_X		0x08

#ifndef BPF_MAXINSNS
#define BPF_MAXINSNS 4096
#endif

#endif /* _UAPI__LINUX_BPF_COMMON_H__ */
```
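    from /include/uapi/linux/bpf_common.h
    It is rather long, but looking through it (bits are counted from the least significant bit, starting at 1):
    - instruction class : ANDed with 0x07.
      - Corresponds to bits 1, 2, 3!
    - ld/ldx fields : ANDed with 0x18 or 0xe0.
      - Corresponds to bits 4, 5 or bits 6, 7, 8!
    - alu/jmp fields : ANDed with 0xf0 or 0x08.
      - Corresponds to bits 5, 6, 7, 8 or bit 4!
    - In other words, which bits are used depends on the kind of BPF instruction.
  - How the bit fields are divided (described for eBPF; cBPF is probably not much different.)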
    - arithmetic and jump instructions
      +----------------+--------+-------------------+
      |     4 bits     | 1 bit  |      3 bits       |
      | operation code | source | instruction class |
      +----------------+--------+-------------------+
      (MSB)                                     (LSB)
    - load and store instructions
      +--------+--------+-------------------+
      | 3 bits | 2 bits |      3 bits       |
      |  mode  |  size  | instruction class |
      +--------+--------+-------------------+
      (MSB)                             (LSB)
      This is how the bits are divided.
  - Going back to the earlier array with the header file in mind,
```
struct sock_filter filter[] = {
    /* Validate architecture. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, arch_nr),
    BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ARCH_NR, 1, 0),
    BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
    /* Get system call number. */
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, syscall_nr),
};
```
    - BPF_LD + BPF_W + BPF_ABS : 0x00 (bits 1, 2, 3: instruction class) + 0x00 (bits 4, 5: ld/ldx size) + 0x20 (bits 6, 7, 8: ld/ldx mode)
    - BPF_JMP + BPF_JEQ + BPF_K : 0x05 (bits 1, 2, 3: instruction class) + 0x10 (bits 5, 6, 7, 8: alu/jmp operation) + 0x00 (bit 4: alu/jmp source)
      In both cases the operands occupy non-overlapping bit fields that together fill the 8 bits, so + here has the same effect as a bitwise OR.
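The same decomposition can be checked mechanically with the masks from bpf_common.h. This is a small sketch; `struct bpf_fields` and `bpf_split` are hypothetical names introduced here, not part of any header.

```c
#include <linux/bpf_common.h>  /* BPF_CLASS, BPF_SIZE, BPF_MODE, BPF_OP, BPF_SRC */
#include <stdint.h>

/* Hypothetical container for the sub-fields of an opcode. Each value is left
 * in place (not shifted), so it compares directly with the BPF_* constants. */
struct bpf_fields {
    uint8_t cls;   /* bits 1-3: instruction class        */
    uint8_t size;  /* bits 4-5: ld/ldx size              */
    uint8_t mode;  /* bits 6-8: ld/ldx mode              */
    uint8_t op;    /* bits 5-8: alu/jmp operation code   */
    uint8_t src;   /* bit 4:    alu/jmp source operand   */
};

/* Split an opcode with the header's masks. The size/mode pair is meaningful
 * only for load/store classes and the op/src pair only for alu/jmp classes;
 * both are computed here so the caller can pick the relevant ones. */
struct bpf_fields bpf_split(uint16_t code)
{
    struct bpf_fields f = {
        .cls  = BPF_CLASS(code),
        .size = BPF_SIZE(code),
        .mode = BPF_MODE(code),
        .op   = BPF_OP(code),
        .src  = BPF_SRC(code),
    };
    return f;
}
```

For example, `bpf_split(BPF_LD + BPF_W + BPF_ABS)` is called with 0x20 and yields class BPF_LD, size BPF_W, and mode BPF_ABS, while `bpf_split(BPF_JMP + BPF_JEQ + BPF_K)` is called with 0x15 and yields class BPF_JMP, operation BPF_JEQ, and source BPF_K.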
So why do all this?
- It was more detail than strictly needed, but we took a brief look at how the instructions in a BPF filter are filled in.
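To show where these pieces end up, here is a hedged sketch that completes the partial filter from earlier and installs it with prctl. Everything beyond the first four instructions is an assumption made for illustration: the choice of getppid as the denied call, the EPERM result, the architecture selection, and the hypothetical helpers `install_getppid_filter` and `filter_probe`. It assumes a kernel with CONFIG_SECCOMP_FILTER.

```c
#include <errno.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: only these two architectures are handled in this sketch. */
#if defined(__x86_64__)
#define ARCH_NR AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define ARCH_NR AUDIT_ARCH_AARCH64
#else
#error "add the AUDIT_ARCH_* constant for this architecture"
#endif

/* Deny-list filter for the calling thread: getppid fails with EPERM and
 * every other system call is allowed. Returns 0 on success, -1 on failure. */
int install_getppid_filter(void)
{
    struct sock_filter filter[] = {
        /* Validate architecture. */
        BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, arch)),
        BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ARCH_NR, 1, 0),
        BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
        /* Get system call number. */
        BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* getppid: fall through to ERRNO; anything else: skip to ALLOW. */
        BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog, 0, 0) != 0)
        return -1;
    return 0;
}

/* Runs the filter in a forked child so that the caller stays unfiltered.
 * Returns 1 if getppid was rejected with EPERM, 0 if it was not,
 * and -1 if the filter could not be installed or the child died. */
int filter_probe(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (install_getppid_filter() != 0)
            _exit(2);
        errno = 0;
        long r = syscall(SYS_getppid);
        _exit((r == -1 && errno == EPERM) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```

Because a seccomp filter cannot be removed once installed, the sketch applies it in a child process; `filter_probe()` returning 1 indicates that the filter took effect there.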

Continued in part 2...
