3.1.配置主从数据库
3.2.备库创建 pg_rewind 用户并授权
step 1.创建 rewind_user 用户
CREATE USER rewind_user LOGIN; GRANT EXECUTE ON function pg_catalog.pg_ls_dir(text, boolean, boolean) TO rewind_user; GRANT EXECUTE ON function pg_catalog.pg_stat_file(text, boolean) TO rewind_user; GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text) TO rewind_user; GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, boolean) TO rewind_user;
复制
step 2.修改 postgresql.conf 配置
wal_log_hints = on # default:off 或者initdb初始化集群时启用数据校验 full_page_writes = on # default:on
复制
3.3.回退只读从库
3.3.1.激活从库(从库)
step 1.备节点执行切换,使从库能够读写
[postgres@standby-database data]$ pg_ctl -D /opt/pg15beta1/data -l logfile promote waiting for server to promote..... done server promoted [postgres@standby-database data]$
复制
step 2.查看备库状态
[postgres@standby-database data]$ pg_controldata |grep state Database cluster state: in production [postgres@standby-database data]$
复制
3.3.2.写入从库(从库)
从库激活后产生读写,使用pg_rewind修复从库,回退到只读从库
step 1.在数据库“postgres”中进行初始化(“pgbench -i”)。
[postgres@standby-database data]$ pgbench -i dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data (client-side)... 100000 of 100000 tuples (100%) done (elapsed 0.37 s, remaining 0.00 s) vacuuming... creating primary keys... done in 1.50 s (drop tables 0.00 s, create tables 0.38 s, client-side generate 0.70 s, vacuum 0.08 s, primary keys 0.34 s). [postgres@standby-database data]$
复制
step 2.在从库中写入数据,生成wal日志
[postgres@standby-database data]$ pgbench -M prepared -v -r -P 1 -c 4 -j 4 -T 10 -p 5432 pgbench (15beta1) starting vacuum...end. starting vacuum pgbench_accounts...end. progress: 1.0 s, 73.0 tps, lat 52.092 ms stddev 44.973, 0 failed progress: 2.0 s, 71.0 tps, lat 57.346 ms stddev 48.713, 0 failed progress: 3.0 s, 102.0 tps, lat 39.707 ms stddev 20.312, 0 failed progress: 4.0 s, 88.0 tps, lat 45.349 ms stddev 33.727, 0 failed progress: 5.0 s, 73.0 tps, lat 53.625 ms stddev 47.683, 0 failed progress: 6.0 s, 87.0 tps, lat 45.796 ms stddev 31.125, 0 failed progress: 7.0 s, 96.0 tps, lat 42.844 ms stddev 20.861, 0 failed progress: 8.0 s, 96.0 tps, lat 41.207 ms stddev 22.846, 0 failed progress: 9.0 s, 100.0 tps, lat 40.096 ms stddev 20.878, 0 failed progress: 10.0 s, 72.0 tps, lat 55.707 ms stddev 46.478, 0 failed transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: prepared number of clients: 4 number of threads: 4 maximum number of tries: 1 duration: 10 s number of transactions actually processed: 862 number of failed transactions: 0 (0.000%) latency average = 46.447 ms latency stddev = 34.642 ms initial connection time = 6.326 ms tps = 86.007961 (without initial connection time) statement latencies in milliseconds and failures: 0.002 0 \set aid random(1, 100000 * :scale) 0.001 0 \set bid random(1, 1 * :scale) 0.000 0 \set tid random(1, 10 * :scale) 0.001 0 \set delta random(-5000, 5000) 0.071 0 BEGIN; 0.136 0 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; 0.084 0 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 7.026 0 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; 27.587 0 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; 0.094 0 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); 11.445 0 END; [postgres@standby-database data]$
复制
此时从库和主库已不在同一个时间线上 。
step 3.查看主备库的时间线
查看从库时间线
[postgres@standby-database ~]$ pg_controldata | grep TimeLineID Latest checkpoint's TimeLineID: 2 Latest checkpoint's PrevTimeLineID: 2 [postgres@standby-database ~]$
复制
查看主库时间线
[postgres@primary-database 20220729]$ pg_controldata | grep TimeLineID Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 [postgres@primary-database 20220729]$
复制
3.3.3.修复从库,使之继续成为当前主库的从库
step 1.查看切换点
[postgres@standby-database data]$ ll $PGDATA/pg_wal/*.history -rw-------. 1 postgres postgres 41 Jul 29 16:54 /opt/pg15beta1/data/pg_wal/00000002.history [postgres@standby-database data]$
复制
查看最新时间线的.history文件
[postgres@standby-database data]$ cat /opt/pg15beta1/data/pg_wal/00000002.history 1 0/9005F28 no recovery target specified [postgres@standby-database data]$
复制
step 2.从库激活时间开始产生的WAL必须全部在pg_wal目录中
从 000000020000000000000090 开始,所有的wal必须存在从库pg_wal目录中。如果已经覆盖了,必须从归档目录拷贝到从库pg_wal目录中,也可以直接将归档文件全部拷贝到pg_wal目录下 。
cp -R /opt/pg_backup/archive_log/20220729/* /opt/pg15beta1/data/pg_wal/
复制
step 3.停掉从库
[postgres@standby-database data]$ pg_ctl -D /opt/pg15beta1/data -l logfile stop waiting for server to shut down.... done server stopped [postgres@standby-database data]$
复制
step 4.测试修复是否能够成功
[postgres@standby-database data]$ pg_rewind -n -D /opt/pg15beta1/data --source-server="hostaddr=193.169.100.79 user=postgres port=5432" pg_rewind: servers diverged at WAL location 0/9005F28 on timeline 1 pg_rewind: rewinding from last common checkpoint at 0/8000060 on timeline 1 pg_rewind: Done! [postgres@standby-database data]$
复制
step 5.可以修复,直接修复
[postgres@standby-database ~]$ pg_rewind -D /opt/pg15beta1/data --source-server="hostaddr=193.169.100.79 user=postgres port=5432" pg_rewind: servers diverged at WAL location 0/9005F28 on timeline 1 pg_rewind: rewinding from last common checkpoint at 0/8000060 on timeline 1 pg_rewind: Done! [postgres@standby-database ~]$
复制
修复时间线,会将归档日志恢复到错误的时间线的记录。
step 6.查看从库时间线
[postgres@standby-database pg_wal]$ pg_controldata | grep TimeLineID Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 [postgres@standby-database pg_wal]$
复制
step 7.修改备库配置文件 postgresql.auto.conf,保证从库能从主库获取到日志流
$ cat postgresql.auto.conf primary_conninfo = 'user=repl passfile=''/home/postgres/.pgpass'' host=193.169.100.79 port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any' restore_command = 'DATE=`date +%Y%m%d`; DIR="/opt/pg_backup/archive_log/$DATE"; cp $DIR/%f %p' archive_cleanup_command = 'DATE=`date +%Y%m%d`; DIR="/opt/pg_backup/archive_log/$DATE"; pg_archivecleanup $DIR %r' recovery_target_timeline = 'latest'
复制
修复时间线,会将数据库的配置文件恢复到错误时间线的配置。
step 8.删除归档中错误时间线上产生的文件否则会在启动修复后的从库后,走到00000002时间线上,这是不想看到的
mkdir /opt/pg_backup/archive_log/error_tl_2 mv 00000002*00 error_tl_2
复制
step 9.创建从库标识文件
touch /opt/pg15beta1/data/standby.signal
复制
最后修改时间:2024-07-10 11:21:08
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。
文章被以下合辑收录
评论
目录