아직 포스팅을 하지는 않았지만, PostgreSQL은 여러가지 백업 관련 Extension을 이용해서 백업 복구를 실행하면 관리가 편리합니다. pg_rman, barman등을 활용해서 백업 복구를 진행한다면 간단히 PITR을 할 수 있습니다만, 이번 포스팅에서는 PostgreSQL 기본 기능을 활용해서 시점복구를 진행하고 기록해두려고 합니다.
온라인 가동 중 전체 백업을 위해서는 다음과 같이 postgresql.conf의 WAL(Write Ahead Log) 아카이브가 설정되어야합니다.
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
# sql
select pg_start_backup('HotBackup');
# bash
tar cvf -f /home/postgres/backup.tar -C $PGDATA .
# pg 접속
psql -U postgres -d postgres
create table people(
id integer not null
, name text not null
, born date not null
, phone text null
, dept text null
, create_dt date not null
);
insert into people
select series as id,
substr('김이박최정강조윤장임한오서신권황안송전홍유고문양손배조백허심', trunc(random() * 29)::integer + 1, 1) ||
substr('경규나린림라민빈서선세수소신연영예우유윤원연은온영아이재제정준지진하태후현희', trunc(random() * 38)::integer + 1, 1) ||
substr('경규나린림라민빈서선세수소신연영예우유윤원연은온영아이재제정준지진하태후현희 ', trunc(random() * 39)::integer + 1, 1) as name,
(timestamp '1970-01-01' + random() * (timestamp '2002-12-31' - timestamp '1970-01-01'))::date as born,
'010-'||LPAD(trunc(random()*9999)::text,4, '0')||'-'||LPAD(trunc(random()*9999)::text,4, '0') as phone,
substr('서울대전대구부산', trunc(random() * 4)::integer*2 + 1, 2) as dept,
(timestamp '2006-01-01 00:00:00' + random() * (timestamp '2022-12-31 00:00:00' - timestamp '2018-08-12 00:00:00'))::date as create_dt
from generate_series(1, 1000000) series;
select count(*) from people;
# sql
select pg_stop_backup();
# Drop table을 백업 이후 10분정도 흐른 뒤 실행하였으므로 backup시점 10분 내로 PITR 진행 시 복구 확인 가능
drop table people;
DROP TABLE
# sql
ag_ctl stop
# bash
rm -rf $PGDATA
tar xvf /home/postgres/backup.tar -C $PGDATA
# 가장 최근의 .backup파일을 찾습니다.
# 저의 가장 최근 .backup 파일은 아래와 같습니다.
ls -altr /archive
-rw-------. 1 agens2 agens2 332 5월 14 17:43 00000002000000000000000E.00000110.backup
cat -rw-------. 1 agens2 agens2 332 5월 14 17:43 00000002000000000000000E.00000110.backup
START WAL LOCATION: 0/70000110 (file 00000002000000000000000E)
STOP WAL LOCATION: 0/70000220 (file 00000002000000000000000E)
CHECKPOINT LOCATION: 0/70000148
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2024-05-14 17:42:34 KST
LABEL: hotbackup
START TIMELINE: 2
STOP TIME: 2024-05-14 17:43:44 KST
STOP TIMELINE: 2
postgresql.conf 파일에서 아래와 같이 restore와 recovery관련 파라미터에 내용을 기입합니다.
vi postgresql.conf
restore_command = 'cp /arhive/%f %p'
recovery_end_command = 'pg_wal_replay_resume()'
recovery_target_time = '2024-05-14 17:45:00 KST'
Database에 recovery 상태임을 알려주기 위한 용도인 recovery.signal 파일을 생성합니다.
touch $PGDATA/recovery.signal
DB를 재기동하고 복구가 되었는지 확인합니다.
ag_ctl start
[agens2@lenovo2 ~]$ psql -U agens -d sampledb
psql (14.5)
Type "help" for help.
sampledb=# select count(*) from people;
count
---------
1000000
(1 row)
현재 상황은 다음과 같습니다.
\c sampledb
select count(*) from people ;
count
---------
1000000
(1 row)
pg_basebackup을 활용하여 백업을 진행합니다.
/home/agens2 경로 밑에 basebackup을 받을 'basebackup' 디렉토리를 미리 생성해두었습니다.
[agens2@lenovo2 data]$ pg_basebackup -h localhost -p 5321 -U agens -D /home/agens2/basebackup -Fp -Xs -P -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/F0000028 on timeline 4
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_2548507"
139005/139005 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/F0000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
[agens2@lenovo2 basebackup]$ cat /home/agens2/basebackup/backup_label
START WAL LOCATION: 0/F0000028 (file 00000004000000000000001E)
CHECKPOINT LOCATION: 0/F0000060
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2024-05-16 11:02:58 KST
LABEL: pg_basebackup base backup
START TIMELINE: 4
psql -U agens -d sampledb
sampledb=# drop table people ;
DROP TABLE
sampledb=# select current_time;
current_time
--------------------
11:06:23.960842+09
(1 row)
\q
pg_ctl stop
rm -rf $PGDATA/*
cd $PGDATA
cp -r /home/agens2/basebackup/* .
vi $PGDATA/postgresql.conf
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2024-05-16 02:05:00 UTC'
[agens2@lenovo2 basebackup]$ pg_ctl start
waiting for server to start....2024-05-16 11:16:19.129 KST [2553688] LOG: redirecting log output to logging collector process
2024-05-16 11:16:19.129 KST [2553688] HINT: Future log output will appear in directory "log".
.. done
server started
cat $PGDATA/log/postgresql-2024-05-16.log
2024-05-16 11:16:19.129 KST [2553688] LOG: starting PostgreSQL 14.5 (AgensSQL 14.5.1) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit
2024-05-16 11:16:19.131 KST [2553688] LOG: listening on IPv6 address "::1", port 5321
2024-05-16 11:16:19.131 KST [2553688] LOG: listening on IPv4 address "127.0.0.1", port 5321
2024-05-16 11:16:19.157 KST [2553688] LOG: listening on Unix socket "/tmp/.s.PGSQL.5321"
2024-05-16 11:16:19.235 KST [2553690] LOG: database system was interrupted; last known up at 2024-05-16 11:02:58 KST
2024-05-16 11:16:20.121 KST [2553690] LOG: redo starts at 0/F0000028
2024-05-16 11:16:20.143 KST [2553690] LOG: consistent recovery state reached at 0/F0000100
2024-05-16 11:16:20.143 KST [2553690] LOG: redo done at 0/F0000100 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.02 s
2024-05-16 11:16:21.137 KST [2553688] LOG: database system is ready to accept connections
~
[agens2@lenovo2 ~]$ psql -U agens -d sampledb
psql (14.5)
Type "help" for help.
sampledb=# select count(*) from people;
count
---------
1000000
(1 row)
PostgreSQL 버전에 따라서 방법이 조금씩 달라지기도 합니다. PostgreSQL 12.1 이전에는 recovery.conf 파일을 생성하고 파일 내에 restore_command 및 이외의 내용을을 적어 DB를 기동시켰다면, 그 이후 버전부터는 postgresql.conf 파일 내에서 기입 후 DB를 기동하는 식으로 편리해졌습니다. 따라서 PITR을 진행하고자하는 PG의 버전을 파악하고 진행할 필요가 있겠습니다.