
ESM, the Elasticsearch Data Migration Tool, Releases 0.5.0

弹性搜索 2020-12-23

ESM has been updated to v0.5.0. In tests on a 3-node cluster, online Elasticsearch data import and export reached about ten million documents per minute.


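Features:

  • Added buffer_count to control memory usage and avoid OOM caused by heavy concurrency

  • Added a compression option that supports GZIP compression of HTTP request traffic

Improvements:

  • Improved performance: buffers are reused and FasthttpClient is used, which raises throughput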
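
To get a rough feel for why GZIP compression of HTTP traffic can help with log-like payloads, the sketch below compares the raw and gzipped size of a small bulk-style NDJSON body. The sample documents and the helper names `make_payload` and `byte_count` are made up for illustration and are not part of ESM; actual savings depend on the data.

```shell
#!/bin/sh
# Build a hypothetical NDJSON payload of N similar log documents.
# The field names and values are illustrative only.
make_payload() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    printf '{"index":{}}\n{"clientip":"10.0.0.%d","request":"GET /index.html","response":200}\n' "$((i % 250))"
    i=$((i + 1))
  done
}

# Count the bytes arriving on stdin; tr strips the padding some wc builds add.
byte_count() { wc -c | tr -d ' '; }

raw_bytes=$(make_payload 1000 | byte_count)
gz_bytes=$(make_payload 1000 | gzip -c | byte_count)

echo "raw=${raw_bytes} bytes, gzip=${gz_bytes} bytes"
```

Repetitive access-log documents like these tend to compress well, which is the case where trading some CPU for less network traffic is most likely to pay off.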


Download:

https://github.com/medcl/esm/releases/tag/v0.5.0


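Performance data:

On a 3-node cluster (3 × c5d.4xlarge, 16 cores, 32 GB, 10 Gbps), importing and exporting ten million NGINX access log entries with ESM took only 55 seconds. Source and target were the same cluster in this test, so throughput may be even higher when exporting to a separate, independent cluster.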
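
As a quick sanity check of the headline figure, the arithmetic can be reproduced with shell integer math. The numbers are the ones shown in the run log in this post (10064570 documents scrolled in 55 seconds); the helper names `docs_per_second` and `docs_per_minute` are only illustrative.

```shell
#!/bin/sh
# Integer throughput helpers: $1 is the document count, $2 the elapsed seconds.
docs_per_second() { echo $(( $1 / $2 )); }
docs_per_minute() { echo $(( $1 * 60 / $2 )); }

docs=10064570   # from "Scroll 10064570 / 10064570" in the run log
secs=55         # elapsed time reported by the progress bar

echo "docs/s:   $(docs_per_second "$docs" "$secs")"
echo "docs/min: $(docs_per_minute "$docs" "$secs")"
```

This works out to roughly 183,000 documents per second, or about 11 million per minute, which is consistent with the "ten million per minute" claim.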


./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123  --regenerate_id --repeat_times=5
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500


[12-19 06:29:40] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
[12-19 06:31:20] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [============================================] 100.00% 55s
Bulk 10062602 / 10064570 [=============================================] 99.98% 55s
[12-19 06:32:15] [INF] [main.go:537,main] data migration finished.

Last weekend I gave an introduction to using ESM.

Video of the event:

The complete demo script is as follows:

# Generate test data
./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id --repeat_times=5
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123  --regenerate_id  -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500


# Migrate the data
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 
# Set the number of workers
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5 


# Adjust the buffer
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5 --buffer_count=1000000 


# Increase the number of slices
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 


# Optimize the index


DELETE _template/logs


GET _template/logs
PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "12",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "30s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}

DELETE logs122
PUT logs122
GET logs122


POST logs122/_refresh


GET logs122/_search


GET _cat/indices




# Enable compression
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress true


# Auto-generate IDs
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress true --regenerate_id


# Go through the gateway
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id




root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
[12-19 06:10:53] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [===============================================================================================================] 100.00% 1m5s
Bulk 10062580 / 10064570 [=================================================================================================================] 99.98% 1m5s
[12-19 06:11:58] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
[12-19 06:14:29] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [===============================================================================================================] 100.00% 1m4s
Bulk 10062586 / 10064570 [=================================================================================================================] 99.98% 1m4s
[12-19 06:15:33] [INF] [main.go:537,main] data migration finished.


PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "12",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "90s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}


root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
Scroll 1367104 / 1 [=======================================================================================================] 136710400.00% 2562047h47m16s
[12-19 06:17:46] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [================================================================================================================] 100.00% 56s
Bulk 10062603 / 10064570 [==================================================================================================================] 99.98% 56s
[12-19 06:18:42] [INF] [main.go:537,main] data migration finished.


PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "24",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "90s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}


[12-19 06:29:40] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
Scroll 1367104 / 1 [=======================================================================================================] 136710400.00% 2562047h47m16s
logs1kw
[12-19 06:31:20] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [================================================================================================================] 100.00% 55s
Bulk 10062602 / 10064570 [==================================================================================================================] 99.98% 55s
[12-19 06:32:15] [INF] [main.go:537,main] data migration finished.
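
The tuning steps in the script above change only a handful of values between runs (workers, bulk size, slice count, buffer count). One way to keep such experiments tidy is a small dry-run helper that prints the command line instead of executing it. This is only a sketch: it uses flags that already appear in the demo script, while the function name `build_esm_cmd`, the variable names, and the placeholder credentials are assumptions made for illustration.

```shell
#!/bin/sh
# Print (do not run) an esm command line for one tuning combination.
# Arguments: workers, bulk size, slice count, buffer count.
# Only flags that already appear in the demo script are emitted.
build_esm_cmd() {
  workers="$1"; bulk="$2"; slices="$3"; buffer="$4"
  printf './esm -s %s -d %s -x %s -y %s -m %s -n %s -w %s --sliced_scroll_size=%s -b %s --buffer_count=%s --regenerate_id\n' \
    "$ES_URL" "$ES_URL" "$SRC_INDEX" "$DST_INDEX" "$AUTH" "$AUTH" \
    "$workers" "$slices" "$bulk" "$buffer"
}

ES_URL="https://localhost:8000"   # gateway endpoint used in the demo
SRC_INDEX="logs1kw"
DST_INDEX="logs122"
AUTH="user:password"              # placeholder credentials

# The two combinations that appear in the timed runs of the demo.
build_esm_cmd 20 5 60 1000000
build_esm_cmd 40 5 60 2000000
```

Printing the command first makes it easy to review or log each combination before launching a run that takes about a minute against a live cluster.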


Reposted from 弹性搜索.
