📢 随着业务持续发展,数据量和并发量达到一定程度,传统数据库可能面临性能、可扩展性、可用性等问题。Apache ShardingSphere 是一款分布式的数据库生态系统, 可以将任意数据库转换为分布式数据库,并通过数据分片、弹性伸缩、加密等能力对原有数据库进行增强。本篇就为大家讲解 ShardingSphere 的数据迁移功能,并通过实战讲解搭建分布式数据库的整个流程。
数据迁移功能介绍
MySQL PostgreSQL openGauss
| 数据库 | 版本支持 | 环境要求 | 权限要求 |
|---|---|---|---|
| MySQL | 5.1.15 ~ 8.0.x | my.cnf 配置 log-bin=binlog binlog-format=row binlog-row-image=full | GRANT REPLICATION SLAVE,REPLICATION CLIENT ON .TO ${usernamet}@${host} |
| PostgreSQL | 9.4 以上版本 | postgresql.conf 配置 wal_level = logical max_wal_senders = 10 max_replication_slots = 10 max_connections = 600 | pg_hba.conf 配置 host all ${username} 0.0.0.0/0 md5 |
| openGauss | 2.0.1 ~ 3.1.x | postgresgl.conf 配置 wal_level = logical max_wal_senders = 10 max_replication_slots = 10 max_connections = 600 wal_sender_timeout = 0 | pg_hba.conf 配置 host all ${username} 0.0.0.0/0 md5 |
数据迁移相关概念

数据迁移流程

数据迁移阶段

搭建分布式数据库实战
环境准备
ShardingSphere 版本
MySQL数据库
注册中心
部署架构

数据库权限配置

GRANT CREATE, ALTER, DROP, SELECT, INSERT, UPDATE, DELETE, INDEX ON target_ds_0.* TO target_user;
GRANT CREATE, ALTER, DROP, SELECT, INSERT, UPDATE, DELETE, INDEX ON target_ds_1.* TO target_user;
操作步骤
初始化源端数据
CREATE TABLE t_user
(
id int auto_increment,
username varchar(64) null,
password varchar(255) null,
mobile varchar(64) null,
constraint t_user_pk primary key (id)
);
INSERT INTO t_user (id, username, password, mobile) VALUES (1, 'jack', '123456', '13111111111');
INSERT INTO t_user (id, username, password, mobile) VALUES (2, 'rose', '234567', '13111111112');
INSERT INTO t_user (id, username, password, mobile) VALUES (3, 'mike', 'aaa123', '13111111113');
INSERT INTO t_user (id, username, password, mobile) VALUES (4, 'bob', 'aaabbb', '13111111114');
初始化 ShardingSphere Proxy 规则
mysql> create database sharding_db;
Query OK, 0 rows affected (0.12 sec)
mysql> use sharding_db
Database changed
使用 DistSQL 添加存储单元
REGISTER STORAGE UNIT target_ds_0 (
URL="jdbc:mysql://localhost:3306/target_ds_0?serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true",
USER="target_user",
PASSWORD="root"
);
REGISTER STORAGE UNIT target_ds_1 (
URL="jdbc:mysql://localhost:3306/target_ds_1?serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true",
USER="target_user",
PASSWORD="root"
);
REGISTER STORAGE UNIT read_ds_0 (
URL="jdbc:mysql://localhost:3308/target_ds_0?serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true",
USER="target_user",
PASSWORD="root"
);
REGISTER STORAGE UNIT read_ds_1 (
URL="jdbc:mysql://localhost:3308/target_ds_1?serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true",
USER="target_user",
PASSWORD="root"
);
初始化规则定义
CREATE READWRITE_SPLITTING RULE rw_ds_0 (
WRITE_STORAGE_UNIT=target_ds_0,
READ_STORAGE_UNITS(read_ds_0),
TYPE(NAME="random")
);
CREATE READWRITE_SPLITTING RULE rw_ds_1 (
WRITE_STORAGE_UNIT=target_ds_1,
READ_STORAGE_UNITS(read_ds_1),
TYPE(NAME="random")
);
CREATE SHARDING TABLE RULE t_user(
STORAGE_UNITS(rw_ds_0, rw_ds_1),
SHARDING_COLUMN=id,
TYPE(NAME="hash_mod",PROPERTIES("sharding-count"="4")),
KEY_GENERATE_STRATEGY(COLUMN=id,TYPE(NAME="snowflake"))
);
CREATE ENCRYPT RULE t_user (
COLUMNS((NAME=password,CIPHER=password_cipher,ENCRYPT_ALGORITHM(TYPE(NAME='AES',PROPERTIES('aes-key-value'='123456abc'))))
));
开始数据迁移
1. 添加迁移数据源
迁移的数据源是外部的,需要通过 DistSQL 注册
REGISTER MIGRATION SOURCE STORAGE UNIT source_ds (
URL="jdbc:mysql://${source_database_url:port}/source_ds?serverTimezone=UTC&useSSL=false",
USER="root",
PASSWORD="root"
);
执行迁移命令
MIGRATE TABLE source_ds.t_user INTO sharding_db.t_user;
mysql> MIGRATE TABLE source_ds.t_user INTO sharding_db.t_user;
Query OK, 0 rows affected (1.06 sec)
查看迁移进度
mysql> SHOW MIGRATION LIST;
+--------------------------------------------+------------------+----------------+--------+---------------------+-----------+
| id | tables | job_item_count | active | create_time | stop_time |
+--------------------------------------------+------------------+----------------+--------+---------------------+-----------+
| j0102p000041c4912117c302e9facd92f9a74a478c | source_ds.t_user | 1 | true | 2023-06-24 09:44:51 | NULL |
+--------------------------------------------+------------------+----------------+--------+---------------------+-----------+
1 row in set (0.03 sec)
id:jobId,唯一标识 tables:迁移的表名 job_item_count:迁移 job 的分片数 active:job 是否在运行中 create_time:job 创建时间 stop_time:job 停止时间
mysql> SHOW MIGRATION STATUS j0102p000041c4912117c302e9facd92f9a74a478c;
+------+-------------+------------------+--------------------------+--------+-------------------------+-------------------------------+--------------------------+---------------+
| item | data_source | tables | status | active | processed_records_count | inventory_finished_percentage | incremental_idle_seconds | error_message |
+------+-------------+------------------+--------------------------+--------+-------------------------+-------------------------------+--------------------------+---------------+
| 0 | source_ds | source_ds.t_user | EXECUTE_INCREMENTAL_TASK | true | 4 | 100 | 321 | |
+------+-------------+------------------+--------------------------+--------+-------------------------+-------------------------------+--------------------------+---------------+
1 row in set (0.04 sec)
item:job分片任务编号 data_source:迁移数据源名称 tables:迁移表名 status:job状态,具体可以参考 JobStatus active:job 是否在运行中 processed_records_count:已经处理的数据量 inventory_finished_percentage:全量完成百分比 incremental_idle_seconds:增量空闲时间 error_message:错误信息
验证迁移前后的数据一致性
数据迁移进入到增量迁移的阶段 需要一定时间的业务只读窗口期
mysql> CHECK MIGRATION j0102p000041c4912117c302e9facd92f9a74a478c BY TYPE (NAME='DATA_MATCH');
Query OK, 0 rows affected (0.48 sec)
mysql> SHOW MIGRATION CHECK STATUS j0102p000041c4912117c302e9facd92f9a74a478c;
+------------------+--------+---------------------+---------------------+-------------------+-------------------------+-------------------------+------------------+---------------+
| tables | result | check_failed_tables | finished_percentage | remaining_seconds | check_begin_time | check_end_time | duration_seconds | error_message |
+------------------+--------+---------------------+---------------------+-------------------+-------------------------+-------------------------+------------------+---------------+
| source_ds.t_user | true | | 100 | 0 | 2023-06-24 10:05:28.483 | 2023-06-24 10:05:29.821 | 1 | |
+------------------+--------+---------------------+---------------------+-------------------+-------------------------+-------------------------+------------------+---------------+
1 row in set (0.06 sec)
tables:表名 result:校验结果,true 表示通过,false 表示存在不一致 check_failed_tables:校验失败的表 finished_percentage:进度 remaining_seconds:剩余时间(估算) check_begin_time:校验开始时间 check_end_time:校验结束时间
mysql> SHOW MIGRATION CHECK ALGORITHMS;
+-------------+--------------------------------------------------------------+----------------------------+
| type | supported_database_types | description |
+-------------+--------------------------------------------------------------+----------------------------+
| CRC32_MATCH | MySQL,MariaDB,H2 | Match CRC32 of records. |
| DATA_MATCH | SQL92,MySQL,MariaDB,PostgreSQL,openGauss,Oracle,SQLServer,H2 | Match raw data of records. |
+-------------+--------------------------------------------------------------+----------------------------+
2 rows in set (0.10 sec)
CRC32_MATCH:循环冗余校验,通过校验码来判断是否存在数据不一致,效率快,但是不支持断点续传,且只支持MySQL DATA_MATCH:逐行挨个比对数据,效率稍慢但是支持断点续传和异构数据库
目标端开启数据加密的情况需要使用 DATA_MATCH,因为底层的字段名称或者是值会存在不一致。异构迁移也需要使用 DATA_MATCH。
提交迁移作业
mysql> COMMIT MIGRATION j0102p000041c4912117c302e9facd92f9a74a478c;
Query OK, 0 rows affected (2.79 sec)
总结
🔗 参考
[1] ShardingSphere-数据迁移 官网文档:https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-proxy/migration/usage/
[2] GitHub issue 列表:https://github.com/apache/shardingsphere/issues
[3] 中文社区:https://community.sphere-ex.com/




