3.1 主机初始化
请注意主机初始化,主节点和从节点操作步骤一样
Host | IP | 组件 | 主机角色 |
repmgrpri | 192.168.56.224 | Postgressql12.4,repmgr,keepalive | 主节点 |
repmgrsta | 192.168.56.225 | Postgressql12.4,repmgr,keepalive | 从节点 |
repmgrwitness | 192.168.56.226 | Postgressql12.4,repmgr | 见证节点 |
3.1.1 关闭防火墙
root 用户下执行
## 关闭防火墙 systemctl stop firewalld.service ## 开机启动关闭防火墙 systemctl disable firewalld.service |
3.1.2 关闭SELinux
root 用户下执行
## 关闭SELinux sed -i '/SELINUX=/d' /etc/selinux/config echo "SELINUX=disabled" >> /etc/selinux/config ## 查看SELinux配置 cat /etc/selinux/config|grep -v ^#|grep -v '^$' |
3.1.3 配置内核参数
root 用户下执行
## 修改内核参数 echo " fs.file-max = 76724200 kernel.sem = 10000 10240000 10000 1024 kernel.shmmni = 4096 kernel.shmall = 1073741824 kernel.shmmax = 4398046511104 net.ipv4.ip_local_port_range = 9000 65500 net.core.rmem_default = 262144 net.core.wmem_default = 262144 net.core.rmem_max = 4194304 net.core.wmem_max = 4194304 fs.aio-max-nr = 40960000 vm.dirty_ratio=20 vm.dirty_background_ratio=3 vm.dirty_writeback_centisecs=100 vm.dirty_expire_centisecs=500 vm.swappiness=10 vm.min_free_kbytes=524288">>/etc/sysctl.d/99-sysctl.conf &&sysctl --system |
3.1.4 重启操作系统
root 用户下执行
## 重启操作系统,让配置生效 reboot |
3.2 安装postgres数据库
请注意数据库安装,主节点和从节点安装步骤一样
3.2.1 安装依赖包
root 用户下执行
yum install -y libicu-devellibxslt python-develglibc-devellibmpc |
3.2.2 安装rpm包
## 注意安装顺序,root用户执行 rpm -ivh postgresql12-libs-12.4-1PGDG.rhel7.x86_64.rpm rpm -ivh postgresql12-12.4-1PGDG.rhel7.x86_64.rpm rpm -ivh postgresql12-server-12.4-1PGDG.rhel7.x86_64.rpm rpm -ivh postgresql12-contrib-12.4-1PGDG.rhel7.x86_64.rpm rpm -ivh ncurses-devel-5.9-14.20130511.el7_4.x86_64.rpm rpm -ivh libedit-devel-3.0-12.20121213cvs.el7.x86_64.rpm rpm -ivh llvm5.0-* rpm -ivh devtoolset-7-* rpm -ivh llvm-toolset-7-* rpm -ivh postgresql12-devel-12.4-1PGDG.rhel7.x86_64.rpm rpm -ivhpg_repack12-1.4.6-1.rhel7.x86_64.rpm rpm -ivh python2-psycopg2-2.8.5-2.rhel7.x86_64.rpm rpm -ivh postgresql12-plpython-12.4-1PGDG.rhel7.x86_64.rpm |
3.2.3 创建数据库目录
##root用户执行 mkdir -p /pgsql/data mkdir -p /pgsql/arclog chown -R postgres:postgres /pgsql/ |
3.2.4 修改环境变量
root用户执行
echo 'export PGHOME=/usr/pgsql-12/ export PGDATA=/pgsql/data export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH export PATH=$PGHOME/bin:$PATH'>>/etc/profile |
环境变量生效
source /etc/profile |
postgres用户执行
##重请注意标红部分为修改后的值 su - postgres cat .bash_profile [ -f /etc/profile ]&& source /etc/profile PGDATA=/pgsql/data export PGDATA # If you want to customize your settings, # Use the file below. This is not overridden # by the RPMS. [ -f /var/lib/pgsql/.pgsql_profile ] && source /var/lib/pgsql/.pgsql_profile |
环境变量生效
## 重启使部分参数生效 source .bash_profile |
3.2.5 初始化数据库
postgres用户执行
initdb -k -E UTF8 --locale=C --lc-ctype=en_US.UTF8 -D /pgsql/data |
3.2.6 启动数据库
postgres用户执行
pg_ctl start -D /pgsql/data/ |
3.3 主节点配置
主节点配置的操作请都在主节点执行
3.3.1 创建复制用户
用户/密码:repmgr/repmgr
[root@pg~]#su - postgres Last login: Mon Aug 9 16:53:35 CST 2021 on pts/0 -bash-4.2$ createuser -U postgres repmgr -P -c 5 --replication Enter password for new role: Enter it again: -bash-4.2$ psql psql (12.4) Type "help" for help. postgres=# ALTER USER repmgr SET search_path TO repmgr, "$user", public; ALTER ROLE ###创建存储复制元数据的数据库 postgres=# create database repmgr owner repmgr; postgres=# alter user repmgr superuser; |
3.3.2 设置超级用户密码
postgres用户执行
-bash-4.2$ psql psql (12.4) Type "help" for help. postgres=# alter user postgres with password '123456'; ALTER ROLE |
3.3.3 配置pg_hba.conf
postgres用户执行
pg_hba.conf在/pgsql/data目录下
## 请注意192.168.56.225 为本配置文档中的从节点ip地址,请根据实际情况配置 host replication repmgr 192.168.56.225/32 md5 |
3.3.4 配置postgres.conf
postgres用户执行
请注意本示例中的内存的配置为2GB
postgres.conf在/pgsql/data目录下
listen_addresses = '*' archive_mode = on archive_command = 'test ! -f /pgsql/arclog/%f && cp %p /pgsql/arclog/%f' wal_level = replica wal_sender_timeout = 60s max_connections = 500 track_activity_query_size=4096 port = 5432 unix_socket_permissions='0700' logging_collector=on log_destination='csvlog' log_filename='postgresql-%Y-%m-%d_%H%M%S.log' log_truncate_on_rotation=on log_rotation_age=1d log_rotation_size=0 log_checkpoints=on log_lock_waits=on log_duration=on log_min_duration_statement= 250ms min_wal_size = 1GB max_wal_size = 4GB wal_keep_segments=128 wal_log_hints = on superuser_reserved_connections = 10 shared_buffers = 500MB maintenance_work_mem = 128MB wal_buffers = 16MB work_mem = 16MB Log_min_messages=warning idle_in_transaction_session_timeout=300000 max_wal_senders=10 checkpoint_completion_target = 0.9 random_page_cost = 1.1 effective_io_concurrency = 200 effective_cache_size = 1500MB default_statistics_target = 10 log_statement=mod log_line_prefix = '%t [%p] %r %d %u ' log_connections=on shared_preload_libraries = 'pg_stat_statements,auto_explain' pg_stat_statements.max = 20000 pg_stat_statements.track = all auto_explain.log_min_duration = '250ms' track_activities = on track_counts = on track_io_timing = on track_functions = all |
3.3.5 重启主节点数据库
postgres用户执行
pg_ctl restart -D /pgsql/data/ |
3.3.6 从库验证是否能够登陆到主库
请在从节点的postgres用户执行,本配置文档中主节点的ip地址为:192.168.56.224
-bash-4.2$ psql -h 192.168.56.224 -U postgres Password for user postgres: psql (12.4) Type "help" for help. postgres=# |
3.4 从节点配置
从节点配置的操作请都在从节点执行
3.4.1 停止数据库
postgres用户执行
-bash-4.2$ pg_ctl stop -D /pgsql/data/ waiting for server to shut down.... done server stopped |
3.4.2 清空从节点数据文件目录中的内容
postgres用户执行
-bash-4.2$ rm -rf /pgsql/data/* -bash-4.2$ cd /pgsql/data/ -bash-4.2$ ll total 0 |
3.4.3 从主机点获取数据
postgres用户执行
主节点ip:192.168.56.224
复制用户:repmgr
数据文件目录:/pgsql/data
-bash-4.2$ pg_basebackup -h 192.168.56.224 -p 5432 -U repmgr -Fp -Xs -Pv -R -D /pgsql/data Password: pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 0/3000060 on timeline 1 pg_basebackup: starting background WAL receiver pg_basebackup: created temporary replication slot "pg_basebackup_21167" 25312/25312 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/3000138 pg_basebackup: waiting for background process to finish streaming ... pg_basebackup: syncing data to disk ... pg_basebackup: base backup completed |
3.4.4 编辑standby.signal
postgres用户执行
Standby.signal在/pgsql/data目录下
# 添加 standby_mode = 'on' |
3.4.5 修改从节点postgres.conf
postgres用户执行
主节点ip:192.168.56.224。请根据实际情况进行修改。
其他地方请按照文档内容修改,可以直接追加到postgres.conf文件的末尾
primary_conninfo = 'host=192.168.56.224 port=5432 user=repmgr password=123456' recovery_target_timeline = 'latest' max_connections = 600 hot_standby = on max_standby_streaming_delay = 30s wal_receiver_status_interval = 10s hot_standby_feedback = on |
3.4.6 从节点启动
Postgres 启动
-bash-4.2$ pg_ctl -D /pgsql/data -l logfile start waiting for server to start...... done server started |
3.4.7 验证主从关系
请在主节点的postgres用户下执行
-bash-4.2$ psql psql (12.4) Type "help" for help. postgres=# select client_addr,sync_state from pg_stat_replication; client_addr | sync_state ----------------+------------ 192.168.56.225 | async (1 row) |
3.5 rpmgr高可用搭建
3.5.1 rpm包安装(主从都要安装)
root 用户下执行
rpm -ivh repmgr_12-5.2.1-1.rhel7.x86_64.rpm rpm -ivh repmgr_12-devel-5.2.1-1.rhel7.x86_64.rpm chown -R postgres:postgres /etc/repmgr/12/repmgr.conf |
3.5.2 建立postgres用户间互信
需要对repmgr集群中的每个节点间配置SSH互信
主库执行
请注意主库的ip地址为192.168.56.224
##切换用户 su - postgres ##生成秘钥到用户主目录下的.ssh文件夹下 ssh-keygen -t rsa ##将秘钥拷贝到备库 ssh-copy-id -i .ssh/id_rsa.pub postgres@192.168.56.225 ##验证是否授权完成 ssh 192.168.56.225 date |
示例如下
[root@repmgrsta ~]# su - postgres Last login: Tue Dec 14 10:36:46 CST 2021 on pts/0 Last failed login: Tue Dec 14 10:55:13 CST 2021 from repmgrsta on ssh:notty There was 1 failed login attempt since the last successful login. -bash-4.2$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa. Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub. The key fingerprint is: SHA256:WUXobHWrIosv4EZFd6zfOomt7kIsJmKzQ01clL3bPOs postgres@repmgrsta The key's randomart image is: +---[RSA 2048]----+ | ..o . oo | | o... +.. . | | . .. ..=.. . . | | o ...o+ . | | o o So . . | |.+..= o..+o o | |o.o= + . =o+ | | o o + o.= | | .. **E . | +----[SHA256]-----+ -bash-4.2$ ssh-copy-id -i .ssh/id_rsa.pub postgres@192.168.56.225 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa.pub" /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@192.168.56.225's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'postgres@192.168.56.225'" and check to make sure that only the key(s) you wanted were added. -bash-4.2$ ssh 192.168.56.225 date Tue Dec 14 10:56:47 CST 2021 |
从库执行
请注意主库的ip地址为192.168.56.225
##切换用户 su - postgres ##生成秘钥到用户主目录下的.ssh文件夹下 ssh-keygen -t rsa ##将秘钥拷贝到备库 ssh-copy-id -i .ssh/id_rsa.pub postgres@192.168.56.224 ##验证是否授权完成 ssh 192.168.56.224 date |
-bash-4.2$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa): /var/lib/pgsql/.ssh/id_rsa already exists. Overwrite (y/n)? -bash-4.2$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa): /var/lib/pgsql/.ssh/id_rsa already exists. Overwrite (y/n)? y Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa. Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub. The key fingerprint is: SHA256:kEpeVF1pOzT4zeqJCUAJTYxPs3xN+98wNI34Slg7Hp8 postgres@repmgrsta The key's randomart image is: +---[RSA 2048]----+ | .*oo.. o.. | | ..B. + = | | .=+o o = * o | | o o=.. o * * .| | o oS + * . | | . . B + | | . * B = | | o = E .| | | +----[SHA256]-----+ -bash-4.2$ ssh-copy-id -i .ssh/id_rsa.pub postgres@192.168.56.224 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa.pub" The authenticity of host '192.168.56.224 (192.168.56.224)' can't be established. ECDSA key fingerprint is SHA256:WNp0vV+TO4G7fUyNySTlK5jtcxXXKW6MkyWbE781iLk. ECDSA key fingerprint is MD5:9c:00:39:a3:26:60:20:ea:aa:4d:69:26:7e:ed:0c:97. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@192.168.56.224's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'postgres@192.168.56.224'" and check to make sure that only the key(s) you wanted were added. -bash-4.2$ -bash-4.2$ -bash-4.2$ ssh 192.168.56.224 date Tue Dec 14 11:07:11 CST 2021 |
3.5.3 pg_hba.conf验证配置(主从都需要配置)
添加复制用户的权限
通常,靠前的记录有比较严的连接匹配参数和比较弱的认证方法,而靠后的记录有比较松的匹配参数和比较强的认证方法
比如:
比较严格的连接匹配参数 repmgr repmgr 192.168.56.225,比较弱的认证方法trust
host repmgr repmgr 192.168.56.225/32 trust
比较松的连接匹配参数 all all 0.0.0.0,比较严的认证方法md5
host all all 0.0.0.0/0
vim /pgsql/data/pg_hba.conf # replication privilege. local replication repmgr trust local repmgr repmgr trust host replication repmgr 127.0.0.1/32 trust host replication repmgr 192.168.56.224/32 trust host replication repmgr 192.168.56.225/32 trust host repmgr repmgr 127.0.0.1/32 trust host repmgr repmgr 192.168.56.224/32 trust host repmgr repmgr 192.168.56.225/32 trust host all all 192.168.56.224/32 trust host all all 192.168.56.225/32 trust host all all 0.0.0.0/0 md5 |
重启生效
pg_ctl -D /pgsql/data/ reload |
3.5.5 检测standby
主库执行,注意host=192.168.56.225 该IP为从库的IP
-bash-4.2$ psql 'host=192.168.56.225 user=repmgr dbname=repmgr connect_timeout=20' psql (12.4) Type "help" for help. repmgr=> |
从库执行,注意host=192.168.56.224 该IP为主库的IP
-bash-4.2$ psql 'host=192.168.56.224 user=repmgr dbname=repmgr connect_timeout=20' psql (12.4) Type "help" for help. repmgr=> |
3.5.6 配置repmgr
3.5.6.1 修改主库repmgr配置(在主库执行)
配置文件默认位于/etc/repmgr/12/repmgr.conf
注意ip地址请根据实际情况进行配置。
su – postgres vim /etc/repmgr/12/repmgr.conf #大于0的唯一Integer值 node_id=224 #唯一值,用于标识本服务器 node_name=repmgrpri #连接到本机的信息 conninfo='host=192.168.56.224 user=repmgr dbname=repmgr connect_timeout=20' #postgresql的data路径 data_directory='/pgsql/data' #repmgr复制槽参数: replication_user='repmgr' replication_type='physical' use_replication_slots=yes |
3.5.6.2 注册主服务器(在主库执行)
注册主服务器
su – postgres /usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf primary register -S postgres -bash-4.2$ repmgr -f /etc/repmgr/12/repmgr.conf primary register -S postgres WARNING: following problems with command line parameters detected: --superuser ignored when executing PRIMARY REGISTER INFO: connecting to primary database... NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed NOTICE: primary node record (ID: 1) registered |
查询状态
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf cluster show

3.5.6.3修改从库repmgr配置(在从库执行)
配置文件默认位于/etc/repmgr/12/repmgr.conf
su – postgres vim /etc/repmgr/12/repmgr.conf #大于0的唯一Integer值 node_id=225 #唯一值,用于标识本服务器 node_name=repmgrsta #连接到本机的信息 conninfo='host=192.168.56.225 user=repmgr dbname=repmgr connect_timeout=20' #postgresql的data路径 data_directory='/pgsql/data' #repmgr复制槽参数: replication_user='repmgr' replication_type='physical' use_replication_slots=yes |
3.5.6注册从数据库(在从库执行)
Su - postgres
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby register
-bash-4.2$ /usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby register INFO: connecting to local node "repmgrsta" (ID: 225) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 224) INFO: standby registration complete NOTICE: standby node "repmgrsta" (ID: 225) successfully registered |
在从上查询状态
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf cluster show

异常处理
报错:
-bash-4.2$ /usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby register INFO: connecting to local node "repmgrsta" (ID: 225) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 224) WARNING: node "repmgrsta" not found in "pg_stat_replication" ERROR: local node not attached to primary node 224 HINT: specify the actual upstream node id with --upstream-node-id, or use -F/--force to continue anyway |
处理方法:
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby register -F /usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby follow -F |
3.5.7 配置Repmgrd(主从都需要执行)
su - postgres
vim /pgsql/data/postgresql.conf ##在shared_preload_libraries添加一下内容: shared_preload_libraries = 'pg_stat_statements,auto_explain,repmgr' |
重启数据库使配置生效
/usr/pgsql-12/bin/pg_ctl -D /pgsql/data/ restart |
设置repmgrd配置文件
vim /etc/repmgr/12/repmgr.conf # 设置为自动恢复模式 failover=automatic # 尝试连接到主的次数 reconnect_attempts=6 # 每次尝试连接到主的时间间隔(秒) reconnect_interval=5 # 当本机要成为主时,执行的提升命令 promote_command='/usr/pgsql-12/bin/repmgr standby promote -f /etc/repmgr/12/repmgr.conf --log-to-file' # 当出现新主时,执行的跟随命令 follow_command='/usr/pgsql-12/bin/repmgr standby follow -f /etc/repmgr/12/repmgr.conf --log-to-file --upstream-node-id=%n' log_file='/var/log/repmgr/repmgr.log' |
3.5.8日志轮换配置
root用户下执行
vim /etc/logrotate.conf 添加以下内容 /var/log/repmgr/repmgr.log { missingok compress rotate 30 daily dateext create 0600 postgres postgres } |
3.5.9 启动repmgrd(主从都需要执行)
su - postgres
/usr/pgsql-12/bin/repmgrd -f /etc/repmgr/12/repmgr.conf --pid-file /tmp/repmgrd.pid –d aemonize [2021-12-14 16:10:50] [NOTICE] redirecting logging output to "/var/log/repmgr/repmgr.log" |
查看集群状态
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf cluster show |
查看日志信息
-bash-4.2$ more /var/log/repmgr/repmgr.log [2021-12-14 16:10:50] [NOTICE] repmgrd (repmgrd 5.2.1) starting up [2021-12-14 16:10:50] [INFO] connecting to database "host=192.168.56.224 user=repmgr dbname=repmgr connect_timeout=20" INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid [2021-12-14 16:10:50] [NOTICE] starting monitoring of node "repmgrpri" (ID: 224) [2021-12-14 16:10:50] [INFO] "connection_check_type" set to "ping" [2021-12-14 16:10:50] [NOTICE] monitoring cluster primary "repmgrpri" (ID: 224) [2021-12-14 16:10:50] [INFO] child node "repmgrsta" (ID: 225) is attached |
3.6 winess见证节点
Witness 节点为见证节点,主要作用为了防止主库网络故障发生误切以及节点之间互相通信异常导致出现双主等脑裂现象
需要和repmgrd,repmgr 一起配合使用。其中见证节点需要安装一个postgresql实例。
在本文档中见证节点除了有观察集群是否正常的作用外,还起到作为归档服务器的作用,因此该对于该节点的要求是cpu,io 要求不高,存储要求比较高。
3.6.1 数据库安装
数据库安装,初始化,略,请参考3.2 |
3.6.2 创建复制用户
用户/密码:repmgr/repmgr
[root@pg~]#su - postgres Last login: Mon Aug 9 16:53:35 CST 2021 on pts/0 -bash-4.2$ createuser -U postgres repmgr -P -c 5 --replication Enter password for new role: Enter it again: -bash-4.2$ psql psql (12.4) Type "help" for help. postgres=# ALTER USER repmgr SET search_path TO repmgr, "$user", public; ALTER ROLE ###创建存储复制元数据的数据库 postgres=# create database repmgr owner repmgr; postgres=# alter user repmgr superuser; |
3.6.3 Repmgr安装
root 用户下执行
rpm -ivh repmgr_12-5.2.1-1.rhel7.x86_64.rpm rpm -ivh repmgr_12-devel-5.2.1-1.rhel7.x86_64.rpm chown -R postgres:postgres /etc/repmgr/12/repmgr.conf |
3.6.4 Postgresql.conf配置
Postgres用户执行
listen_addresses = '*' shared_preload_libraries = 'repmgr' |
3.6.5 pg_hba.conf配置
请注意主节点、从节点、见证节点库都要修改,
Postgres用户执行
vim /pgsql/data/pg_hba.conf # replication privilege. local replication repmgr trust local repmgr repmgr trust host replication repmgr 127.0.0.1/32 trust host replication repmgr 192.168.56.224/32 trust host replication repmgr 192.168.56.225/32 trust host replication repmgr 192.168.56.226/32 trust host repmgr repmgr 127.0.0.1/32 trust host repmgr repmgr 192.168.56.224/32 trust host repmgr repmgr 192.168.56.225/32 trust host repmgr repmgr 192.168.56.226/32 trust host all all 192.168.56.224/32 trust host all all 192.168.56.225/32 trust host all all 192.168.56.226/32 trust host all all 0.0.0.0/0 md5 |
3.6.6 repmgr.conf配置
基本配置与主库相同,保持 node_id、node_name、conninfo 与主库不同即可
Postgres用户执行
#大于0的唯一Integer值 node_id=226 #唯一值,用于标识本服务器 node_name=repmgrwitness #连接到本机的信息 conninfo='host=192.168.56.226 user=repmgr dbname=repmgr connect_timeout=20' #postgresql的data路径 data_directory='/pgsql/data' #repmgr复制槽参数: replication_user='repmgr' replication_type='physical' use_replication_slots=yes # 设置为自动恢复模式 failover=automatic # 尝试连接到主的次数 reconnect_attempts=6 # 每次尝试连接到主的时间间隔(秒) reconnect_interval=5 # 当本机要成为主时,执行的提升命令 promote_command='/usr/pgsql-12/bin/repmgr standby promote -f /etc/repmgr/12/repmgr.conf --log-to-file' # 当出现新主时,执行的跟随命令 follow_command='/usr/pgsql-12/bin/repmgr standby follow -f /etc/repmgr/12/repmgr.conf --log-to-file --upstream-node-id=%n' log_file='/var/log/repmgr/repmgr.log' |
3.6.7 设置日志轮询
请参考3.5.8
3.6.8 注册witness
Postgres用户执行
repmgr witness register --force -f /etc/repmgr/12/repmgr.conf -h 192.168.56.224 |
3.6.9 启动repmgrd
请参考3.5.9
3.6.10 查看状态
/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf cluster show

3.7 Keepalived (可选)
keepalived在此架构中,只作为提供VIP的工具,供上层应用程序连接,不进行故障转移操作。该方案是可选方案,如果不需要VIP,可以直接通过JDBC轮询判断具体的主库
3.7.1 安装
root用户,主从都要安装
yum -y install gcc openssl-devel libnfnetlink-devel yum install libnl* tar -zxvf keepalived-2.2.7.tar.gz -C /usr/local/ cd /usr/local/keepalived-2.2.7/ ./configure make -j 4 make install |
3.7.2 主库keepalived.conf配置
mkdir -p /etc/keepalived/{log,scripts} |
cat /etc/keepalived/keepalived.conf global_defs { router_id pg_ha --用户标识 } vrrp_script checkpg { --定义本机检测模块 script "/etc/keepalived/scripts/checkpg.sh" --检测脚本 interval 15 --15秒检测一次 fall 3 --检测失败时重试3次 rise 1 --检测 1 次成功就算成功 } vrrp_instance VI_pgusdp { state BACKUP --定义实例角色 interface enp0s3 --对外提供的网路哦接口 virtual_router_id 152 --虚拟路由ID标识,主备服务器配置中相同实例的ID必须一致,否则将出现脑裂问题 priority 100 --priority表示实例优先级。数字越大,优先级越高。主节点大,从节点小 advert_int 1 --advert_int为同步通知间隔。主备之间通信检查的时间间隔,默认为1秒 authentication { auth_type PASS auth_pass 234235 } track_script { checkpg } notify_master "/etc/keepalived/scripts/master.sh" --定义了notfiy_master脚本,当当前节点成为master时,通知脚本执行任务 notify_backup "/etc/keepalived/scripts/slave.sh" --当当前节点成为backup时,通知脚本执行任务 virtual_ipaddress { 192.168.56.231 --虚拟IP地址;可以配置多个IP,每个IP占一行 } } |
3.7.3 主库checkpg.sh配置
请根据主节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/checkpg.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.224 LOGFILE=/etc/keepalived/log/keepalived.log nc -w 3 localhost 5432 </dev/null a=`echo $?` if [ $a -eq 1 ] ;then exit 1 else SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then exit 1 fi fi |
3.7.4 主库master.sh配置
请根据主节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/master.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.224 LOGFILE=/etc/keepalived/log/keepalived.log SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE exit 1 else echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE fi |
3.7.5 主库slave.sh配置
请根据主节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/slave.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.224 LOGFILE=/etc/keepalived/log/keepalived.log SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE else echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE fi |
3.7.6 从库keepalived.conf配置
root用户下执行
mkdir -p /etc/keepalived/{log,scripts} |
priority 90 该值和主库的值不同
cat /etc/keepalived/keepalived.conf global_defs { router_id pg_ha --用户标识 } vrrp_script checkpg { --定义本机检测模块 script "/etc/keepalived/scripts/checkpg.sh" --检测脚本 interval 15 --15秒检测一次 fall 3 --检测失败时重试3次 rise 1 --检测 1 次成功就算成功 } vrrp_instance VI_pgusdp { state BACKUP --定义实例角色 interface enp0s3 --对外提供的网路哦接口 virtual_router_id 152 --虚拟路由ID标识,主备服务器配置中相同实例的ID必须一致,否则将出现脑裂问题 priority 90 --priority表示实例优先级。数字越大,优先级越高。主节点大,从节点小 advert_int 1 --advert_int为同步通知间隔。主备之间通信检查的时间间隔,默认为1秒 authentication { auth_type PASS auth_pass 234235 } track_script { checkpg } notify_master "/etc/keepalived/scripts/master.sh" --定义了notfiy_master脚本,当当前节点成为master时,通知脚本执行任务 notify_backup "/etc/keepalived/scripts/slave.sh" --当当前节点成为backup时,通知脚本执行任务 virtual_ipaddress { 192.168.56.231 --虚拟IP地址;可以配置多个IP,每个IP占一行 } } |
3.7.7 从库checkpg.sh配置
请根据从节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/checkpg.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.225 LOGFILE=/etc/keepalived/log/keepalived.log nc -w 3 localhost 5432 </dev/null a=`echo $?` if [ $a -eq 1 ] ;then exit 1 else SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then exit 1 fi fi |
3.7.8 从库master.sh配置
请根据从节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/master.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.225 LOGFILE=/etc/keepalived/log/keepalived.log SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE exit 1 else echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE fi |
3.7.9 从库slave.sh配置
请根据从节点的实际情况进行修改
[root@repmgrpri scripts]# more /etc/keepalived/scripts/slave.sh #!/bin/bash export PGDATABASE=postgres export PGPORT=5432 export PGUSER=postgres export PGBIN=/usr/pgsql-12/bin export PGDATA=/pgsql/data export PGHOST=192.168.56.225 LOGFILE=/etc/keepalived/log/keepalived.log SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();' db_role=`echo $SQL1 | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGHOST -At -w` if [ $db_role == 't' ];then echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE else echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE fi |
3.7.10 授权
root用户下执行
cd /etc/keepalived/scripts chmod u+x *.sh |
3.7.11 启动keepalived
root用户下执行
systemctl start keepalived systemctl enable keepalived systemctl status keepalived |
[root@repmgrpri scripts]# systemctl status keepalived ● keepalived.service - LVS and VRRP High Availability Monitor Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled) Active: active (running) since Tue 2022-02-15 23:52:16 CST; 49min ago Docs: man:keepalived(8) man:keepalived.conf(5) man:genhash(1) https://keepalived.org Process: 24939 ExecStart=/usr/local/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS) Main PID: 24941 (keepalived) Tasks: 2 CGroup: /system.slice/keepalived.service ├─24941 /usr/local/sbin/keepalived -D └─24942 /usr/local/sbin/keepalived -D Feb 16 00:04:10 repmgrpri Keepalived_vrrp[24942]: Sending gratuitous ARP on enp0s3 for 192.168.56.231 Feb 16 00:04:10 repmgrpri Keepalived_vrrp[24942]: Sending gratuitous ARP on enp0s3 for 192.168.56.231 Feb 16 00:04:10 repmgrpri Keepalived_vrrp[24942]: Sending gratuitous ARP on enp0s3 for 192.168.56.231 Feb 16 00:04:10 repmgrpri Keepalived_vrrp[24942]: Sending gratuitous ARP on enp0s3 for 192.168.56.231 Feb 16 00:04:10 repmgrpri Keepalived_vrrp[24942]: Sending gratuitous ARP on enp0s3 for 192.168.56.231 Feb 16 00:07:46 repmgrpri Keepalived_vrrp[24942]: Script `checkpg` now returning 1 Feb 16 00:08:16 repmgrpri Keepalived_vrrp[24942]: VRRP_Script(checkpg) failed (exited with status 1) Feb 16 00:08:16 repmgrpri Keepalived_vrrp[24942]: (VI_pgusdp) Entering FAULT STATE Feb 16 00:08:16 repmgrpri Keepalived_vrrp[24942]: (VI_pgusdp) sent 0 priority Feb 16 00:08:16 repmgrpri Keepalived_vrrp[24942]: (VI_pgusdp) removing VIPs. |
3.7.12 验证测试
从其他节点连接192.168.56.231,可以查看登陆成功。Vip 挂载成功
-bash-4.2$ psql -d repmgr -h 192.168.56.231 psql (12.4) Type "help" for help. repmgr=# |
开源去o的第四步-高可用数据库选型 https://cdn.modb.pro/db/324737
开源去o的第三步-数据库物理资源评估 https://www.modb.pro/db/246594
开源去o的第二步-数据库规划设计 https://www.modb.pro/db/242694
开源去o的第一步-关系数据库选型 https://www.modb.pro/db/239808




