有时候因为一些业务场景的需要,我们可能需要做MySQL的级联复制结构,这种结构是在从库下面再挂从库,形成级联复制A-->B-->C的结构本篇文章不对具体大家级联复制的过程做详细讲解,只说一说在级联复制中遇到的问题帮助大家躲开这些坑和快速定位问题。
有这样一个场景,现在有级联复制A-->B-->C,后来因为业务调整,通过MHA切换成B-->C结构,后来C实例重启后,发现C节点复制中断,经过排查C重新指向了A实例,通过show slave status可以看到具体信息,此时查看mysql.slave_master_info表(如果不知道此表用途的,请自行脑补一下),发现有两条记录,一条记录为以前A实例的信息,另一条是B实例的信息,所以实例C重启后,自动和A建立复制关系(即第一条记录),而不是和B建立复制关系,改变了原本的复制关系,这时候即使在C上执行reset slave或reset slave all;也是无效的,所以这时候有两种方法可以解决这个问题:
在C上执行delete from mysql.slave_master_info where xxxx=A,把A那条记录删除,然后执行change master to master_host=B,xxxxxx后,start slave
或者执行change master to master_host=A,然后执行change master to master_host=B,xxxxx后,start slave
下面是模拟整个实验的过程:
现在结构
172.16.3.37 master
172.16.3.39 slave
172.16.3.38 单独节点
级联复制结构
172.16.3.37-->172.16.3.39-->172.16.3.38
最终结构
172.16.3.39-->172.16.3.38
从172.16.3.39导出整个实例,并导入到172.16.3.38节点上恢复成一个单独实例,查询mysql.slave_master_info表,发现只有一条记录172.16.3.37
localhost.test1>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name: mysql-bin.000002
Master_log_pos: 521
Host: 172.16.3.37
User_name: xxxx
User_password: xxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: a1c2cb85-53c4-11e6-b518-0050569b7494
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
1 row in set (0.00 sec)
172.16.3.38和172.16.3.39建立复制关系,执行
change master to master_host='172.16.3.39',master_user='xxxxx',master_password='xxxxxx',master_port=7777,master_auto_position=1 ;
发现mysql.slave_master_info有两条记录,并且刚刚change的主库172.16.3.39是第二条记录,间接主库172.16.3.37是第一条记录。也就是说后change的主库会顺序在表mysql.slave_master_info的记录中添加,如下:
localhost.test1>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name: mysql-bin.000002
Master_log_pos: 521
Host: 172.16.3.37
User_name: xxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: a1c2cb85-53c4-11e6-b518-0050569b7494
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
*************************** 2. row ***************************
Number_of_lines: 23
Master_log_name:
Master_log_pos: 4
Host: 172.16.3.39
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: 40d547fb-53ca-11e6-b53c-0050569b2b52
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
2 rows in set (0.00 sec)
在172.16.3.39上检查mysql.slave_master_info发现有一条记录172.16.3.37,如下:
localhost.mysql>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name:
Master_log_pos: 4
Host: 172.16.3.37
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: a1c2cb85-53c4-11e6-b518-0050569b7494
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
1 row in set (0.00 sec)
MHA切换,把172.16.3.39提升为主库master,然后在172.16.3.38上查询如下,发现还是有两条记录
localhost.test1>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name: mysql-bin.000002
Master_log_pos: 521
Host: 172.16.3.37
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: a1c2cb85-53c4-11e6-b518-0050569b7494
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
*************************** 2. row ***************************
Number_of_lines: 23
Master_log_name: mysql-bin.000002
Master_log_pos: 231
Host: 172.16.3.39
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid: 40d547fb-53ca-11e6-b53c-0050569b2b52
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
2 rows in set (0.01 sec)
重启172.16.3.38实例,查询mysql.slave_master_info后,发现还是两条记录与上面一样,但是重启后查询show slave status状态,发现主库竟然变成了172.16.3.37,如下
localhost.(none)>show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 172.16.3.37
Master_User: xxxxx
Master_Port: 7777
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 231
Relay_Log_File: mysql-relay.000003
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 231
Relay_Log_Space: 704
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
Master_UUID: a1c2cb85-53c4-11e6-b518-0050569b7494
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 40d547fb-53ca-11e6-b53c-0050569b2b52:1-4,
a1c2cb85-53c4-11e6-b518-0050569b7494:1-2
Auto_Position: 1
1 row in set (0.00 sec)
此时就是遇到的坑,复制程序自动获取了mysql.slave_master_info表中的第一条记录建立了主从复制关系,如果想恢复重启前的复制关系,有两种方法可以彻底解决这个问题
在C上执行delete from mysql.slave_master_info where xxxx=A,把A那条记录删除,然后执行change master to master_host=B,xxxxxx后,start slave,此操作省略
或者执行change master to master_host=A,然后执行change master to master_host=B,xxxxx后,start slave,操作如下
localhost.(none)>change master to master_host='172.16.3.37';
Query OK, 0 rows affected (0.01 sec)
localhost.(none)>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name:
Master_log_pos: 4
Host: 172.16.3.37
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid:
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
1 row in set (0.00 sec)
localhost.(none)>change master to master_host='172.16.3.39',master_user='xxxxx',master_password='xxxxx',master_port=7777,master_auto_position=1
-> ;
Query OK, 0 rows affected, 2 warnings (0.03 sec)
localhost.(none)>select * from mysql.slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 23
Master_log_name:
Master_log_pos: 4
Host: 172.16.3.39
User_name: xxxxx
User_password: xxxxx
Port: 7777
Connect_retry: 60
Enabled_ssl: 0
Ssl_ca:
Ssl_capath:
Ssl_cert:
Ssl_cipher:
Ssl_key:
Ssl_verify_server_cert: 0
Heartbeat: 15
Bind:
Ignored_server_ids: 0
Uuid:
Retry_count: 86400
Ssl_crl:
Ssl_crlpath:
Enabled_auto_position: 1
1 row in set (0.00 sec)