
Deploying MHA with MySQL 5.6

真菌的博客 2021-04-20

MHA Deployment

Semi-Synchronous Replication

Requirements

  • MySQL 5.5 or later

  • The have_dynamic_loading variable is YES

  • Asynchronous replication is already in place

Concepts

  • Asynchronous replication:

    This is MySQL's default replication mode. After executing a transaction committed by the client, the master returns to the client immediately, without caring whether any slave has received and applied it. This creates a problem: if the master crashes, transactions already committed on it may not have reached any slave, and forcibly promoting a slave to master can leave the new master with incomplete data.

  • Semi-synchronous replication:

    The master does not return to the client immediately after executing the transaction; it waits until at least one slave has received the events and written them to its relay log. Compared with asynchronous replication, semi-synchronous replication improves data safety, but it also introduces some latency, at minimum one TCP/IP round trip. Semi-synchronous replication is therefore best used on low-latency networks.
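The trade-off above is bounded by rpl_semi_sync_master_timeout (in milliseconds, server default 10000): if no slave acknowledges within that window, the master silently falls back to asynchronous replication until a slave catches up again. A minimal my.cnf sketch; the 1000 ms value is only illustrative:

```ini
# master my.cnf fragment: fall back to async after 1 s without a slave ACK
# (1000 ms is an illustrative value; the server default is 10000 ms)
rpl_semi_sync_master_timeout = 1000
```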

Enabling Semi-Synchronous Replication

1. Load the plugins

The user needs the SUPER privilege.

Master:
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

Slave:
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

2. Verify that the plugins loaded successfully

mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME LIKE '%semi%';

3. Enable semi-synchronous replication

Master:
mysql> SET GLOBAL rpl_semi_sync_master_enabled = 1;

Slave:
mysql> SET GLOBAL rpl_semi_sync_slave_enabled = 1;

Alternatively, specify them in the configuration file:

Master:

plugin-load=rpl_semi_sync_master=semisync_master.so   
rpl_semi_sync_master_enabled=1

Slave:

plugin-load=rpl_semi_sync_slave=semisync_slave.so
rpl_semi_sync_slave_enabled=1

In some HA architectures, both plugins are loaded on every node so that semi-synchronous replication keeps working after the master role moves:

plugin-load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl-semi-sync-master-enabled = 1
rpl-semi-sync-slave-enabled = 1

4. Restart the I/O thread on the slaves

mysql> STOP SLAVE IO_THREAD;

mysql> START SLAVE IO_THREAD;

5. Verify that semi-synchronous replication is running

Master: mysql> show status like 'Rpl_semi_sync_master_status';

Slave: mysql> show status like 'Rpl_semi_sync_slave_status';
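The check above can be scripted for monitoring. A minimal sketch, assuming a local mysql client with credentials available (e.g. in ~/.my.cnf); semi_sync_on is a hypothetical helper, not part of MySQL or MHA:

```shell
# Decide from SHOW STATUS output whether semi-sync is active on the master.
# semi_sync_on is a hypothetical helper: it expects tab-separated
# "Variable_name<TAB>Value" lines (mysql -N -B output) on stdin.
semi_sync_on() {
  grep -q "^Rpl_semi_sync_master_status[[:space:]]ON$"
}

# On the master (assumes client credentials are configured):
if mysql -N -B -e "SHOW STATUS LIKE 'Rpl_semi_sync_master_status'" | semi_sync_on; then
  echo "semi-synchronous replication: ACTIVE"
else
  echo "semi-synchronous replication: NOT active"
fi
```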

6. Enable GTID

Add the following to my.cnf:

gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
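After a restart with these settings, GTID mode can be confirmed from any client. A small sketch; gtid_ok is a hypothetical helper and the mysql credentials are assumed to be configured:

```shell
# Hypothetical helper: true only when the server reports gtid_mode = ON.
gtid_ok() {
  [ "$1" = "ON" ]
}

mode=$(mysql -N -B -e 'SELECT @@gtid_mode' 2>/dev/null || true)
if gtid_ok "${mode:-OFF}"; then
  echo "GTID replication is enabled"
else
  echo "GTID replication is NOT enabled"
fi
```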


Setting Up MHA

I. Set up passwordless SSH

1. Generate a key pair on the Master

cd /root/.ssh
ssh-keygen -t dsa -P '' -f id_dsa    # generate the key pair
cat id_dsa.pub >> authorized_keys    # authorize the key locally




2. After the Master's key is generated, generate keys the same way on the candidate master and the Slave

3. Build a shared authorized_keys

On the Master (172.16.150.179):

cd /root/.ssh
scp id_dsa.pub root@172.16.150.180:/root/.ssh/id_dsa.pub.179
scp id_dsa.pub root@172.16.150.181:/root/.ssh/id_dsa.pub.179

Transfer the id_dsa.pub files from 180 and 181 to 179 in the same way, then append them:

cat id_dsa.pub.180 >> authorized_keys
cat id_dsa.pub.181 >> authorized_keys

4. Verify that all three machines can SSH to one another without a password
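Step 4 can be scripted as below; check_ssh is a hypothetical helper, and BatchMode=yes makes ssh fail immediately instead of prompting when key authentication is not working:

```shell
# Verify passwordless SSH from this node to all three cluster nodes.
hosts="172.16.150.179 172.16.150.180 172.16.150.181"

# Hypothetical helper: succeed only if key-based login works (no password prompt).
check_ssh() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

for h in $hosts; do
  if check_ssh "$h"; then
    echo "$h: passwordless SSH OK"
  else
    echo "$h: passwordless SSH FAILED"
  fi
done
```

Run it on each of the three nodes in turn; every line should report OK before moving on to the MHA installation.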

II. Install MHA

1. Install MHA Node

  • Install the dependencies first

    yum install perl-DBD-MySQL -y
    yum install perl-CPAN -y

  • Install MHA Node on all nodes

    tar zxvf mha4mysql-node-0.56.tar.gz
    cd mha4mysql-node-0.56
    perl Makefile.PL
    make && make install


2. Install MHA Manager (on 179)

  • Install the MHA Manager dependency modules

    yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y

  • Install the Manager

    tar zxvf mha4mysql-manager-0.56.tar.gz   
    cd mha4mysql-manager-0.56
    perl Makefile.PL    
    make && make install

  • Copy the sample scripts

    cd /tmp/mha4mysql-manager-0.56/samples/scripts/
    cp * /usr/local/bin/


3. Configure MHA

  • Create the directory and copy the sample configuration

    mkdir -p /etc/masterha
    cp /tmp/mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha/
    vim /etc/masterha/app1.cnf

  • The configuration file:

    [server default]
    manager_workdir=/var/log/masterha/app1
    manager_log=/var/log/masterha/app1/manager.log
    master_binlog_dir=/mysql/binlog
    master_ip_failover_script=/usr/local/bin/master_ip_failover
    master_ip_online_change_script=/usr/local/bin/master_ip_online_change
    password=123456
    user=root
    ping_interval=1
    remote_workdir=/tmp
    repl_password=123456
    repl_user=root
    report_script=/usr/local/bin/send_report
    secondary_check_script=/usr/local/bin/masterha_secondary_check -s 172.16.150.179 -s 172.16.150.180
    ssh_user=root

    [server1]
    hostname=172.16.150.179
    candidate_master=1

    [server2]
    hostname=172.16.150.180
    candidate_master=1

    [server3]
    hostname=172.16.150.181

  • On every Slave node, disable automatic relay log purging

    mysql> set global relay_log_purge=0;
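With relay_log_purge=0, relay logs are kept so MHA can reuse them during recovery, but they must then be removed some other way. MHA Node ships a purge_relay_logs tool for exactly this; a typical cron entry on each slave might look like the following (the credentials and log path are placeholders):

```
# crontab fragment: purge relay logs every 4 hours with MHA Node's tool
0 */4 * * * /usr/local/bin/purge_relay_logs --user=root --password=xxx --disable_relay_log_purge --workdir=/tmp >> /var/log/masterha/purge_relay_logs.log 2>&1
```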

  • Edit the failover script to configure the VIP switchover

    vim /usr/local/bin/master_ip_failover

    #!/usr/bin/env perl

    use strict;
    use warnings FATAL => 'all';

    use Getopt::Long;

    my (
    $command, $ssh_user, $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip, $new_master_port
    );

    my $vip = '172.16.159.101/24';
    my $key = '1';
    my $ssh_start_vip = "/sbin/ifconfig bond0:$key $vip";
    my $arp = "/usr/sbin/arping -I bond0 -s 172.16.159.101 -b -c 1 172.16.159.254 ";
    my $ssh_stop_vip = "/sbin/ifconfig bond0:$key down";

    GetOptions(
    'command=s' => \$command,
    'ssh_user=s' => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s' => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s' => \$new_master_host,
    'new_master_ip=s' => \$new_master_ip,
    'new_master_port=i' => \$new_master_port,
    );

    exit &main();

    sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

    my $exit_code = 1;
    eval {
    print "Disabling the VIP on the old master - $orig_master_host \n";
    &stop_vip();
    $exit_code = 0;
    };
    if ($@) {
    warn "Got Error: $@\n";
    exit $exit_code;
    }
    exit $exit_code;
    }
    elsif ( $command eq "start" ) {

    my $exit_code = 10;
    eval {
    print "Enabling the VIP - $vip on the new master - $new_master_host \n";
    &start_vip();
    $exit_code = 0;
    };
    if ($@) {
    warn $@;
    exit $exit_code;
    }
    exit $exit_code;
    }
    elsif ( $command eq "status" ) {
    print "Checking the Status of the script.. OK \n";
    exit 0;
    }
    else {
    &usage();
    exit 1;
    }
    }

    sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    `ssh $ssh_user\@$new_master_host \" $arp \"`;
    }
    sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
    }

    sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
    }

4. Verify basic MHA functionality

  • Check the SSH configuration

    masterha_check_ssh --conf=/etc/masterha/app1.cnf

    Expected output:

    [root@localhost conf]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
    Thu Oct 19 10:15:11 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Thu Oct 19 10:15:11 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
    Thu Oct 19 10:15:11 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
    Thu Oct 19 10:15:11 2017 - [info] Starting SSH connection tests..
    Thu Oct 19 10:15:12 2017 - [debug]
    Thu Oct 19 10:15:11 2017 - [debug] Connecting via SSH from root@172.16.150.179(172.16.150.179:22) to root@172.16.150.180(172.16.150.180:22)..
    Thu Oct 19 10:15:12 2017 - [debug] ok.
    Thu Oct 19 10:15:12 2017 - [debug] Connecting via SSH from root@172.16.150.179(172.16.150.179:22) to root@172.16.150.181(172.16.150.181:22)..
    Thu Oct 19 10:15:12 2017 - [debug] ok.
    Thu Oct 19 10:15:12 2017 - [debug]
    Thu Oct 19 10:15:12 2017 - [debug] Connecting via SSH from root@172.16.150.180(172.16.150.180:22) to root@172.16.150.179(172.16.150.179:22)..
    Thu Oct 19 10:15:12 2017 - [debug] ok.
    Thu Oct 19 10:15:12 2017 - [debug] Connecting via SSH from root@172.16.150.180(172.16.150.180:22) to root@172.16.150.181(172.16.150.181:22)..
    Thu Oct 19 10:15:12 2017 - [debug] ok.
    Thu Oct 19 10:15:13 2017 - [debug]
    Thu Oct 19 10:15:12 2017 - [debug] Connecting via SSH from root@172.16.150.181(172.16.150.181:22) to root@172.16.150.179(172.16.150.179:22)..
    Thu Oct 19 10:15:13 2017 - [debug] ok.
    Thu Oct 19 10:15:13 2017 - [debug] Connecting via SSH from root@172.16.150.181(172.16.150.181:22) to root@172.16.150.180(172.16.150.180:22)..
    Thu Oct 19 10:15:13 2017 - [debug] ok.
    Thu Oct 19 10:15:13 2017 - [info] All SSH connection tests passed successfully.

  • Check the overall replication environment

    masterha_check_repl --conf=/etc/masterha/app1.cnf

    Expected output:

    [root@localhost conf]# masterha_check_repl --conf=/etc/masterha/app1.cnf
    Thu Oct 19 10:18:17 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Thu Oct 19 10:18:17 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
    Thu Oct 19 10:18:17 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
    Thu Oct 19 10:18:17 2017 - [info] MHA::MasterMonitor version 0.56.
    Thu Oct 19 10:18:18 2017 - [info] GTID failover mode = 1
    Thu Oct 19 10:18:18 2017 - [info] Dead Servers:
    Thu Oct 19 10:18:18 2017 - [info] Alive Servers:
    Thu Oct 19 10:18:18 2017 - [info] 172.16.150.179(172.16.150.179:3306)
    Thu Oct 19 10:18:18 2017 - [info] 172.16.150.180(172.16.150.180:3306)
    Thu Oct 19 10:18:18 2017 - [info] 172.16.150.181(172.16.150.181:3306)
    Thu Oct 19 10:18:18 2017 - [info] Alive Slaves:
    Thu Oct 19 10:18:18 2017 - [info] 172.16.150.180(172.16.150.180:3306) Version=5.6.24-log (oldest major version between slaves) log-bin:enabled
    Thu Oct 19 10:18:18 2017 - [info] GTID ON
    Thu Oct 19 10:18:18 2017 - [info] Replicating from 172.16.150.179(172.16.150.179:3306)
    Thu Oct 19 10:18:18 2017 - [info] Primary candidate for the new Master (candidate_master is set)
    Thu Oct 19 10:18:18 2017 - [info] 172.16.150.181(172.16.150.181:3306) Version=5.6.24-log (oldest major version between slaves) log-bin:enabled
    Thu Oct 19 10:18:18 2017 - [info] GTID ON
    Thu Oct 19 10:18:18 2017 - [info] Replicating from 172.16.150.179(172.16.150.179:3306)
    Thu Oct 19 10:18:18 2017 - [info] Current Alive Master: 172.16.150.179(172.16.150.179:3306)
    Thu Oct 19 10:18:18 2017 - [info] Checking slave configurations..
    Thu Oct 19 10:18:18 2017 - [info] read_only=1 is not set on slave 172.16.150.180(172.16.150.180:3306).
    Thu Oct 19 10:18:18 2017 - [info] read_only=1 is not set on slave 172.16.150.181(172.16.150.181:3306).
    Thu Oct 19 10:18:18 2017 - [info] Checking replication filtering settings..
    Thu Oct 19 10:18:18 2017 - [info] binlog_do_db= , binlog_ignore_db=
    Thu Oct 19 10:18:18 2017 - [info] Replication filtering check ok.
    Thu Oct 19 10:18:18 2017 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
    Thu Oct 19 10:18:18 2017 - [info] Checking SSH publickey authentication settings on the current master..
    Thu Oct 19 10:18:18 2017 - [info] HealthCheck: SSH to 172.16.150.179 is reachable.
    Thu Oct 19 10:18:18 2017 - [info]
    172.16.150.179(172.16.150.179:3306) (current master)
    +--172.16.150.180(172.16.150.180:3306)
    +--172.16.150.181(172.16.150.181:3306)

    Thu Oct 19 10:18:18 2017 - [info] Checking replication health on 172.16.150.180..
    Thu Oct 19 10:18:18 2017 - [info] ok.
    Thu Oct 19 10:18:18 2017 - [info] Checking replication health on 172.16.150.181..
    Thu Oct 19 10:18:18 2017 - [info] ok.
    Thu Oct 19 10:18:18 2017 - [info] Checking master_ip_failover_script status:
    Thu Oct 19 10:18:18 2017 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.150.179 --orig_master_ip=172.16.150.179 --orig_master_port=3306


    IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.16.150.220/24===

    Checking the Status of the script.. OK
    Thu Oct 19 10:18:18 2017 - [info] OK.
    Thu Oct 19 10:18:18 2017 - [warning] shutdown_script is not defined.
    Thu Oct 19 10:18:18 2017 - [info] Got exit code 0 (Not master dead).

    MySQL Replication Health is OK.

5. Start MHA

  • Start the MHA Manager:

    nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

  • Check the MHA Manager status:

    masterha_check_status --conf=/etc/masterha/app1.cnf
    Returns:
    app1 (pid:2816) is running(0:PING_OK), master:172.16.150.179

  • View the log:

    tail -f -n 100 /var/log/masterha/app1/manager.log

  • Stop MHA:

    masterha_stop --conf=/etc/masterha/app1.cnf

6. Online Switchover

  • Stop the MHA Manager

    masterha_stop --conf=/etc/masterha/app1.cnf

  • Perform the switchover manually

    masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.16.150.179 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=1000

  • Conditions that must be met for an online switchover:

    • The I/O thread is running on every slave;

    • The SQL thread is running on every slave;

    • Seconds_Behind_Master in the SHOW SLAVE STATUS output is less than or equal to running_updates_limit seconds on every slave (if not specified during the switchover, running_updates_limit defaults to 1 second);

    • On the master, the SHOW PROCESSLIST output shows no update statement that has been running longer than running_updates_limit seconds.
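The Seconds_Behind_Master condition can be pre-checked before attempting the switchover. A sketch to run on a slave, assuming a local mysql client; lag_ok is a hypothetical helper, and a NULL value (broken replication) must also fail the check:

```shell
# Pre-check a slave's lag against the intended --running_updates_limit.
# lag_ok is a hypothetical helper; NULL, empty, or non-numeric input fails.
lag_ok() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # NULL / empty / non-numeric: not OK
  esac
  [ "$1" -le "$2" ]
}

lag=$(mysql -e 'SHOW SLAVE STATUS\G' 2>/dev/null |
      awk -F': ' '/Seconds_Behind_Master/ {print $2}' || true)
if lag_ok "${lag:-NULL}" 1000; then
  echo "slave lag is within the switchover window"
else
  echo "slave is NOT ready for switchover"
fi
```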

Recovery Steps

1. Restart the crashed database

2. Rejoin the failed machine to the cluster as a new slave

  • Find the position from which replication should resume:

    grep 'All other slaves should start replication from here' /var/log/masterha/app1/manager.log

  • Run the recovered CHANGE MASTER ... statement on the new slave:

    CHANGE MASTER TO MASTER_HOST='172.16.150.180', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='root', MASTER_PASSWORD='xxx';

  • Start replication:

    [root@localhost][two]> CHANGE MASTER TO MASTER_HOST='172.16.150.180', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='root', MASTER_PASSWORD='123456';
    Query OK, 0 rows affected, 2 warnings (0.59 sec)

    [root@localhost][two]> start slave;
    Query OK, 0 rows affected (0.18 sec)

  • Verify replication:

    [root@localhost][two]> show slave status\G

  • Edit the configuration file on the manager node:

    vim /etc/masterha/app1.cnf

    # Add this section back (MHA removes a failed server's section from the file during failover; restore it before restarting the manager)
    [server1]
    hostname=172.16.150.179
    candidate_master=1

3. Restore the original master-slave roles (optional)

  • Verify the MHA replication status:

    masterha_check_status --conf=/etc/masterha/app1.cnf

  • Switch the master back manually

    masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.16.150.179 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=1000

