
How to Reinitialize the Database in an HAC Cluster While Keeping the Original Cluster Configuration

HighGo PG Lab (瀚高PG实验室) 2022-04-09

Contents

Environment

Purpose

Details

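Environment

Platform: N/A

Version: 4.5

Purpose

In an HAC cluster environment, the current data directory sometimes has to be deleted and the database rebuilt. This procedure brings the cluster back up quickly with the existing HAC configuration, so nothing has to be reinstalled.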

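Details

1. Stop the hghac service on all nodes, delete the original data directory, and run initdb again on the primary node (the original HAC cluster configuration files stay unchanged).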

    [root@db data]# systemctl stop hghac-vip
    [root@db data]# initdb -e sm4 -c "echo *******" -D /db/hgdbdata/data

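Step 1 can be scripted. The following is a minimal sketch, not HighGo tooling: `run` and `reinit_primary` are hypothetical helper names, the data directory is moved aside rather than deleted (an assumption, so that it can be restored), and commands are only printed unless `DRY_RUN=0` is set. The `initdb` options are copied from the transcript above, including the masked password command.

```shell
# Hypothetical helper: print the command when DRY_RUN is unset or 1,
# execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper for step 1 on the primary node.
# $1: data directory (defaults to the path used in this article).
reinit_primary() {
    data_dir="${1:-/db/hgdbdata/data}"
    run systemctl stop hghac-vip
    # Assumption: keep the old directory as a backup instead of deleting it.
    run mv "$data_dir" "$data_dir.bak"
    # Options copied from the article; "echo *******" is the masked password command.
    run initdb -e sm4 -c "echo *******" -D "$data_dir"
}
```

Calling `reinit_primary` prints the three commands for review; `DRY_RUN=0 reinit_primary` executes them. The hghac service still has to be stopped on the other nodes as well.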

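2. Start the HAC service on node 1. At this point the cluster information is displayed abnormally: the member list is empty and the service exits with a failure.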

    [root@db data]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
    + Cluster: ha (7072987311974756506) +-----------+
    | Member | Host | Role | State | TL | Lag in MB |
    +--------+------+------+-------+----+-----------+
    +--------+------+------+-------+----+-----------+
    [root@db data]# systemctl status hghac-vip

    ● hghac-vip.service - hghac
       Loaded: loaded (/etc/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Fri 2022-03-18 12:16:07 CST; 3min 7s ago
      Process: 44961 ExecStart=/opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml (code=exited, status=1/FAILURE)
     Main PID: 44961 (code=exited, status=1/FAILURE)

    Mar 18 12:16:05 db systemd[1]: Started hghac.
    Mar 18 12:16:07 db systemd[1]: hghac-vip.service: main process exited, code=exited, status=1/FAILURE
    Mar 18 12:16:07 db systemd[1]: Unit hghac-vip.service entered failed state.
    Mar 18 12:16:07 db systemd[1]: hghac-vip.service failed.
    [root@db data]# systemctl start hghac-vip
    [root@db data]# systemctl status hghac-vip

    ● hghac-vip.service - hghac
       Loaded: loaded (/etc/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Fri 2022-03-18 12:19:26 CST; 2min 13s ago
      Process: 45581 ExecStart=/opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml (code=exited, status=1/FAILURE)
     Main PID: 45581 (code=exited, status=1/FAILURE)

    Mar 18 12:19:24 db systemd[1]: Started hghac.
    Mar 18 12:19:26 db systemd[1]: hghac-vip.service: main process exited, code=exited, status=1/FAILURE
    Mar 18 12:19:26 db systemd[1]: Unit hghac-vip.service entered failed state.
    Mar 18 12:19:26 db systemd[1]: hghac-vip.service failed.

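3. The HAC cluster log reports that the cluster identifier (system ID) no longer matches the original one, because the database was re-created: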

    [root@db hghalog]# pwd
    /db/hgdbdata/hghalog
    [root@db hghalog]# tail -f patroni.log
    2022-03-18 12:16:06,807 INFO: Selected new etcd server http://192.168.80.111:2379
    2022-03-18 12:16:06,828 INFO: No PostgreSQL configuration items changed, nothing to reload.
    2022-03-18 12:16:06,890 CRITICAL: system ID mismatch, node hghaca belongs to a different cluster: 7072987311974756506 != 7076286699020760566
    2022-03-18 12:19:25,967 INFO: Selected new etcd server http://192.168.80.113:2379
    2022-03-18 12:19:25,992 INFO: No PostgreSQL configuration items changed, nothing to reload.
    2022-03-18 12:19:26,063 CRITICAL: system ID mismatch, node hghaca belongs to a different cluster: 7072987311974756506 != 7076286699020760566
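To spot this condition without reading the whole log, the two identifiers can be pulled out of the CRITICAL line. This is a small sketch; `parse_sysid_mismatch` is a hypothetical helper name, and the pattern assumes the message format shown in the log excerpt above. In that excerpt the first number is the ID that `hghactl list` still displays for the cluster, and the second is the ID the cluster shows once the problem has been resolved.

```shell
# Hypothetical helper: read log text on stdin and, for every
# "system ID mismatch" line, print "<first id> <second id>".
parse_sysid_mismatch() {
    sed -n 's/.*system ID mismatch.*: \([0-9]\{1,\}\) != \([0-9]\{1,\}\).*/\1 \2/p'
}
```

For example, `parse_sysid_mismatch < /db/hgdbdata/hghalog/patroni.log | sort -u` lists each distinct pair once.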

4. After restarting the etcd and hghac services on every node, the same error is still reported.

    [root@db ~]# /opt/HighGo/tools/hghac/etcd/amd64/etcdctl endpoint status --write-out=table
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |          ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | http://192.168.80.111:2379 | ddbfd190d03ca278 |  3.4.15 |   20 kB |     false |      false |       218 |    1686066 |            1686066 |        |
    | http://192.168.80.112:2379 | 1c703f0b65f7bddb |  3.4.15 |   20 kB |     false |      false |       218 |    1686066 |            1686066 |        |
    | http://192.168.80.113:2379 | 92255e8f5c9ebfcd |  3.4.15 |   20 kB |      true |      false |       218 |    1686066 |            1686066 |        |
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    [root@db ~]# systemctl start hghac-vip
    [root@db ~]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
    + Cluster: ha (7072987311974756506) +-----------+
    | Member | Host | Role | State | TL | Lag in MB |
    +--------+------+------+-------+----+-----------+
    +--------+------+------+-------+----+-----------+
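The status table above shows that etcd itself is healthy (one endpoint is the leader), so the failure is not caused by etcd being down. A script can make the same check by parsing the table. The sketch below assumes the column order printed by `etcdctl endpoint status --write-out=table` as shown above; `etcd_leader_endpoint` is a hypothetical helper name.

```shell
# Hypothetical helper: read the etcdctl status table on stdin and print the
# endpoint of every row whose IS LEADER column is "true".
etcd_leader_endpoint() {
    awk -F'|' '
        NF > 6 {
            endpoint = $2; is_leader = $6
            gsub(/ /, "", endpoint)
            gsub(/ /, "", is_leader)
            if (is_leader == "true") print endpoint
        }
    '
}
```

Piping the `etcdctl endpoint status --write-out=table` output into `etcd_leader_endpoint` should print exactly one endpoint when the etcd cluster has elected a leader.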

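5. Root cause: the etcd data files still record the old cluster information, including the previous system ID, so the etcd data has to be regenerated.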

    [root@db etcd]# pwd
    /opt/HighGo/tools/etcd
    [root@db etcd]# ls
    hgdw1.etcd
    [root@db etcd]# mv hgdw1.etcd hgdw1.etcd.bak    <-- rename or delete this directory on all nodes
    [root@db etcd]# systemctl stop etcd
    [root@db etcd]# systemctl start etcd
    [root@db etcd]# pwd
    /opt/HighGo/tools/etcd
    [root@db etcd]# ll
    total 0
    drwx------  3 root root 20 Mar 18 12:34 hgdw1.etcd        <-- restarting etcd regenerates this directory and all files under it
    drwx------. 3 root root 20 Mar 18 12:27 hgdw1.etcd.bak

    [root@db etcd]# /opt/HighGo/tools/hghac/etcd/amd64/etcdctl endpoint status --write-out=table
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |          ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | http://192.168.80.111:2379 | ddbfd190d03ca278 |  3.4.15 |   20 kB |      true |      false |         2 |          8 |                  8 |        |
    | http://192.168.80.112:2379 | 1c703f0b65f7bddb |  3.4.15 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
    | http://192.168.80.113:2379 | 92255e8f5c9ebfcd |  3.4.15 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
    +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

    [root@db etcd]# systemctl start hghac-vip    <-- start HAC now; the cluster information is displayed correctly
    [root@db etcd]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
    + Cluster: ha (7076286699020760566) ----+---------+----+-----------+-----------------+
    | Member | Host                | Role   | State   | TL | Lag in MB | Pending restart |
    +--------+---------------------+--------+---------+----+-----------+-----------------+
    | hghaca | 192.168.80.111:5866 | Leader | running |  2 |           | *               |
    +--------+---------------------+--------+---------+----+-----------+-----------------+
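Because the rename in step 5 has to be repeated on every node, a small helper that keeps a timestamped backup can reduce mistakes. This is a sketch only: `backup_etcd_dir` is a hypothetical name, and stopping etcd before the rename is an assumption of this sketch (the transcript above renames the directory first and restarts etcd afterwards). The directory name `hgdw1.etcd` comes from this article and may differ on other nodes.

```shell
# Hypothetical helper: rename an etcd data directory to
# <dir>.bak.<timestamp> and print the new name.
# $1: path to the etcd data directory.
backup_etcd_dir() {
    dir="${1%/}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && printf '%s\n' "$backup"
}

# Possible use on each node (assumed order, run as root):
#   systemctl stop etcd
#   backup_etcd_dir /opt/HighGo/tools/etcd/hgdw1.etcd
#   systemctl start etcd
```

Keeping the backup rather than deleting the directory leaves a way back if etcd does not come up as expected.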

Start HAC on the other nodes. The result is as follows:

    [root@db etcd]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
    + Cluster: ha (7076286699020760566) -----+---------+----+-----------+-----------------+
    | Member | Host                | Role    | State   | TL | Lag in MB | Pending restart |
    +--------+---------------------+---------+---------+----+-----------+-----------------+
    | hghaca | 192.168.80.111:5866 | Leader  | running |  2 |           | *               |
    | hghacb | 192.168.80.112:5866 | Replica | running |  2 |         0 | *               |
    | hghacc | 192.168.80.113:5866 | Replica | running |  2 |         0 | *               |
    +--------+---------------------+---------+---------+----+-----------+-----------------+
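The final state can also be verified in a script instead of by eye. The sketch below parses the `hghactl list` table and succeeds only when there is at least one member, exactly one Leader, and every member is in the `running` state. `cluster_healthy` is a hypothetical helper name, and the column order is assumed to match the output shown above.

```shell
# Hypothetical helper: read the `hghactl ... list` table on stdin.
# Exit status 0 means: at least one member, exactly one Leader,
# and every member in state "running".
cluster_healthy() {
    awk -F'|' '
        NF > 5 {
            member = $2; role = $4; state = $5
            gsub(/ /, "", member); gsub(/ /, "", role); gsub(/ /, "", state)
            if (member == "Member") next      # skip the header row
            members++
            if (role == "Leader") leaders++
            if (state != "running") bad++
        }
        END { exit !(members > 0 && leaders == 1 && bad == 0) }
    '
}
```

For example, `/opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list | cluster_healthy && echo "cluster OK"`. The empty member table from step 2 makes the check fail, which is the intended behavior.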

