暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

记SUSE系统安装postgresql+keepalived高可用遇到的问题


应客户需求搭建postgresql一主两从架构,并配置高可用

安装客户要求版本(用户已创建,相关依赖软件已安装,防火墙及hosts等文件系统组已配置)

    tar -xvf postgresql-12.12.tar.gz -C /home/postgres/
    cd /home/postgres/postgresql-12.12
    mkdir /opt/pg1212
    mkdir /data/
    ./configure --prefix=/opt/pg1212
    make && make install
    chown -R postgres:users /data/
    chown -R postgres:users /opt/pg1212

复制

主库初始化

    su - postgres
    /opt/pg1212/bin/initdb -D /data

复制

配置环境变量

    export LD_LIBRARY_PATH=/opt/pg1212/lib
    export PGDATA=/data/
    export PGHOME=/opt/pg1212/
    export PATH=$PATH:/opt/1212/bin/

复制

修改参数文件

    vim /data/postgresql.conf
    listen_addresses = '*'
    max_connections = 1000
    shared_buffers = 4GB
    dynamic_shared_memory_type = posix
    wal_level = replica
    max_wal_size = 1GB
    min_wal_size = 80MB
    archive_mode = on
    archive_command = 'cp %p /data/archivedir/%f '
    wal_keep_segments = 512
    wal_sender_timeout = 60s
    track_commit_timestamp = on
    hot_standby = on
    max_standby_archive_delay = 30s
    max_standby_streaming_delay = 30s
    wal_receiver_status_interval = 10s
    log_destination = 'stderr'
    logging_collector = on  # Enable capturing of stderr and csvlog
    log_directory = 'log'
    log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
    log_file_mode = 0600
    log_rotation_age = 1d
    log_rotation_size = 100MB
    log_timezone = 'Asia/Shanghai'
    datestyle = 'iso, mdy'
    timezone = 'Asia/Shanghai'
    lc_messages = 'en_US.UTF-8'
    lc_monetary = 'en_US.UTF-8'
    lc_numeric = 'en_US.UTF-8'
    lc_time = 'en_US.UTF-8'

复制

启动主库,创建同步用户,修改策略

    /opt/pg1212/bin/pg_ctl -d /data/ start
    psql
    create role repuser with login replication password 'Rp#0927!';
    vim pg_hba.conf
    host    replication     repuser        192.168.7.0/24        md5

复制

安装keepalived软件

    zypper in keepalived-1.4.5-11.1.x86_64.rpm

复制

配置keepalived.conf文件(主节点)

    zypper in keepalived-1.4.5-11.1.x86_64.rpm ! Configuration File for keepalived
    global_defs {
       notification_email {
         acassen@firewall.loc
         failover@firewall.loc
         sysadmin@firewall.loc
       }
       notification_email_from Alexandre.Cassen@firewall.loc
       smtp_server 192.168.200.1
       smtp_connect_timeout 30
       router_id LVS_DEVEL
       vrrp_skip_check_adv_addr
       vrrp_garp_interval 0
       vrrp_gna_interval 0
    }
    vrrp_script check_pg_alived {
        script "/etc/keepalived/pg_check.sh"
        interval 1
        weight 3
    }
    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 62
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass xxxx
        }
         track_script {
            check_pg_alived
         }
        virtual_ipaddress {
           192.168.7.65
        }

复制

!!!这里和之前Centos、Redhat系统不太一样的是:最后不需要添加默认网关(所有节点都不添加),最开始以为是脚本的问题,经过多次验证发现是vitrual_routes那里的问题导致,添加之后在执行脚本切换时,会导致网关消失不可用,无法连接,必须重启网络服务才行。所以配置文件中取消了默认网关的设置。


配置keepalived.conf文件(从节点1)

     ! Configuration File for keepalived
    global_defs {
       notification_email {
         acassen@firewall.loc
         failover@firewall.loc
         sysadmin@firewall.loc
       }
       notification_email_from Alexandre.Cassen@firewall.loc
       smtp_server 192.168.200.1
       smtp_connect_timeout 30
       router_id LVS_DEVEL
       vrrp_skip_check_adv_addr
       vrrp_garp_interval 0
       vrrp_gna_interval 0
    }
    vrrp_script check_pg_alived {
        script "/etc/keepalived/pg_check.sh"
        interval 1
        weight 2
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 62
        priority 90
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass xxxx
        }
         track_script {
            check_pg_alived
         }
        virtual_ipaddress {
           192.168.7.65
        }

复制

配置keepalived.conf文件(从节点2)

       ! Configuration File for keepalived
    global_defs {
       notification_email {
         acassen@firewall.loc
         failover@firewall.loc
         sysadmin@firewall.loc
       }
       notification_email_from Alexandre.Cassen@firewall.loc
       smtp_server 192.168.200.1
       smtp_connect_timeout 30
       router_id LVS_DEVEL
       vrrp_skip_check_adv_addr
       vrrp_garp_interval 0
       vrrp_gna_interval 0
    }
    vrrp_script check_pg_alived {
        script "/etc/keepalived/pg_check.sh"
        interval 1
        weight 1
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 62
        priority 88
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass xxxx
        }
         track_script {
            check_pg_alived
         }
        virtual_ipaddress {
           192.168.7.65
        }

复制

配置Keepalived日志

       ! Configuration File for keepalived
    global_defs {
       notification_email {
         acassen@firewall.loc
         failover@firewall.loc
         sysadmin@firewall.loc
       }
       notification_email_from Alexandre.Cassen@firewall.loc
       smtp_server 192.168.200.1
       smtp_connect_timeout 30
       router_id LVS_DEVEL
       vrrp_skip_check_adv_addr
       vrrp_garp_interval 0
       vrrp_gna_interval 0
    }
    vrrp_script check_pg_alived {
        script "/etc/keepalived/pg_check.sh"
        interval 1
        weight 1
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 62
        priority 88
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass xxxx
        }
         track_script {
            check_pg_alived
         }
        virtual_ipaddress {
           192.168.7.65
        } cat  /etc/sysconfig/keepalived
    # Options for keepalived. See `keepalived --help' output and keepalived(8) and
    # keepalived.conf(5) man pages for a list of all options. Here are the most
    # common ones :
    #
    # --vrrp               -P    Only run with VRRP subsystem.
    # --check              -C    Only run with Health-checker subsystem.
    # --dont-release-vrrp  -V    Dont remove VRRP VIPs & VROUTEs on daemon stop.
    # --dont-release-ipvs  -I    Dont remove IPVS topology on daemon stop.
    # --dump-conf          -d    Dump the configuration data.
    # --log-detail         -D    Detailed log messages.
    # --log-facility       -S    0-7 Set local syslog facility (default=LOG_DAEMON)
    #
    KEEPALIVED_OPTIONS="-D  -d -S 0"

复制

另外SUSE系统的keepalived日志也不在/var/log/keepalived.log,目前没有找到对应的文件位置

启动keepalived服务(不设置开机自启,因为主节点权重较高,如果主节点抢占到VIP,但数据库服务未启动,会导致无法连接的情况)

       systemctl start keepalived

复制

搭建从库并启动(连接IP写VIP)

    /opt/pg1212/bin/pg_basebackup -h 192.168.7.65 -D /data/ -F p -X stream -v -P -U repuser -R
    输入密码:
    pg_ctl start

复制

查看主从状态

    postgres=# select pid,usename,application_name,client_addr,state,sync_state from pg_stat_replication ;
      pid  | usename | application_name | client_addr |   state   | sync_state
    -------+---------+------------------+-------------+-----------+------------
     20336 | repuser | walreceiver      | 192.168.7.64  | streaming | async
     24072 | repuser | walreceiver      | 192.168.7.63  | streaming | async
    postgres=# select pg_is_in_recovery();
     pg_is_in_recovery
    -------------------
     f
    (1 row)

复制

模拟主节点宕机

  postgres@suse7_62:~> pg_ctl stop

复制

查看权重较高的从节点keepalived状态

    suse7_63:/etc/keepalived/log # systemctl status keepalived
    鈼[0m keepalived.service - LVS and VRRP High Availability Monitor
       Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
       Active: active (running) since Sat 2022-10-08 23:01:59 CST; 56s ago
      Process: 13370 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
     Main PID: 13371 (keepalived)
        Tasks: 3 (limit: 512)
       CGroup: /system.slice/keepalived.service
               鈹溾攢13371 /usr/sbin/keepalived -D -d -S 0
               鈹溾攢13372 /usr/sbin/keepalived -D -d -S 0
               鈹斺攢13373 /usr/sbin/keepalived -D -d -S 0
    Oct 08 23:02:04 suse7_63 su[13557]: pam_unix(su-l:session): session opened for user postgres by (uid=0)
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: Sending gratuitous ARP on eth0 for 192.168.7.65
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.7.65
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: Sending gratuitous ARP on eth0 for 192.168.7.65
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: Sending gratuitous ARP on eth0 for 192.168.7.65
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: Sending gratuitous ARP on eth0 for 192.168.7.65
    Oct 08 23:02:08 suse7_63 Keepalived_vrrp[13373]: Sending gratuitous ARP on eth0 for 192.168.7.65
    Oct 08 23:02:19 suse7_63 Keepalived_vrrp[13373]: VRRP_Instance(VI_1) Master received advert with higher priority 100, ours 92
    Oct 08 23:02:19 suse7_63 Keepalived_vrrp[13373]: VRRP_Instance(VI_1) Entering BACKUP STATE
    Oct 08 23:02:19 suse7_63 Keepalived_vrrp[13373]: VRRP_Instance(VI_1) removing protocol VIPs.

复制

VIP自动漂移

    suse7_63:/etc/keepalived/log # ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:bd:c2:f1 brd ff:ff:ff:ff:ff:ff
        inet 192.168.7.63/22 brd 56.1.75.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet 192.168.7.65/32 scope global eth0
           valid_lft forever preferred_lft forever

复制

权重高的从节点63自动提升为主库

    suse7_63:/etc/keepalived/log # su - postgres
    postgres@suse7_63:~> psql -h 192.168.7.65 -U postgres
    psql (12.12)
    Type "help" for help.
    postgres=# select pid,usename,application_name,client_addr,state,sync_state from pg_stat_replication ;
      pid  | usename | application_name | client_addr |   state   | sync_state
    -------+---------+------------------+-------------+-----------+------------
     20336 | repuser | walreceiver      | 192.168.7.64  | streaming | async
    postgres=# select pg_is_in_recovery();
    pg_is_in_recovery
    f
    (1 row)
    postgres=# create table t1 (id int);
    CREATE TABLE
    postgres=# drop table t1 ;
    DROP TABLE

复制

数据库可以正常使用,发现异常时及时恢复原主节点,先将原主节点62添加为现有主节点63的从库后再启动keepalived服务,因权重最高,VIP漂移回原主节点62,通过检测脚本提升为主库,再将63添加为从库即可,这样可以有效避免数据丢失。

点击此处阅读原文

文章转载自开源软件联盟PostgreSQL分会,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论