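ESM v0.5.0 has been released. In a test against a 3-node cluster, online Elasticsearch data import/export reached roughly ten million documents per minute.
Features:
Added buffer_count to control memory usage and avoid OOM caused by heavy concurrency
Added a compression option that supports GZIP compression of HTTP request traffic
Improvements:
Better performance: buffers are reused and FasthttpClient is used, which increases throughput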
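To illustrate the idea behind buffer_count, here is a minimal Python sketch of a bounded buffer sitting between a reader and a writer. It is not ESM's actual implementation (ESM is written in Go); the names `reader`, `writer`, `run`, and `BUFFER_COUNT` are hypothetical. The point is only that a capped buffer makes a fast reader block instead of piling documents up in memory.

```python
import queue
import threading

BUFFER_COUNT = 1000  # hypothetical cap on documents held in memory
SENTINEL = None      # marks the end of the stream


def reader(buf, total):
    """Produce `total` fake documents; put() blocks while the buffer is full."""
    for i in range(total):
        buf.put({"_id": i})
    buf.put(SENTINEL)


def writer(buf, stats):
    """Drain the buffer, recording the largest queue size observed."""
    while True:
        stats["max_seen"] = max(stats["max_seen"], buf.qsize())
        doc = buf.get()
        if doc is SENTINEL:
            break
        stats["written"] += 1


def run(total=10_000, buffer_count=BUFFER_COUNT):
    """Move `total` documents through a buffer capped at `buffer_count`."""
    buf = queue.Queue(maxsize=buffer_count)
    stats = {"written": 0, "max_seen": 0}
    t_reader = threading.Thread(target=reader, args=(buf, total))
    t_writer = threading.Thread(target=writer, args=(buf, stats))
    t_reader.start()
    t_writer.start()
    t_reader.join()
    t_writer.join()
    return stats


stats = run(total=5000, buffer_count=100)
print(stats["written"], stats["max_seen"])
```

However fast the reader is, the number of buffered documents never exceeds the cap, so memory use stays bounded at the cost of the reader occasionally waiting.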
Download:
https://github.com/medcl/esm/releases/tag/v0.5.0
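Performance data:
On a 3-node cluster (3 * c5d.4xlarge, 16C, 32GB, 10Gbps), using ESM to export and import ten million NGINX access-log documents took only 55 seconds. Exporting to a separate, independent cluster may be even faster.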
./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id --repeat_times=5
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
[12-19 06:29:40] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
[12-19 06:31:20] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [============================================] 100.00% 55s
Bulk 10062602 / 10064570 [=============================================] 99.98% 55s
[12-19 06:32:15] [INF] [main.go:537,main] data migration finished.
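As a quick check of the headline number, the arithmetic behind "ten million per minute" can be reproduced from the log above. This is a small Python sketch; the helper name `throughput` is only for illustration, and the figures are copied from the Scroll and Bulk lines of the run.

```python
def throughput(docs, seconds):
    """Return (docs per second, docs per minute) for a migration run."""
    per_second = docs / seconds
    return per_second, per_second * 60


scrolled = 10_064_570  # "Scroll" total from the run above
bulked = 10_062_602    # "Bulk" count shown when the run finished
elapsed = 55           # seconds, from the progress bar

per_sec, per_min = throughput(scrolled, elapsed)
print(f"{per_sec:,.0f} docs/s, {per_min / 1e6:.2f} million docs/min")
# prints: 182,992 docs/s, 10.98 million docs/min
print(f"bulk progress: {bulked / scrolled:.2%}")
# prints: bulk progress: 99.98%
```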
Last weekend I gave an introduction to using ESM:
The video of the event is below:
The complete demo script follows:
# Generate test data
./esm -s https://localhost:8000 -d https://localhost:8000 -x kibana_sample_data_logs -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id --repeat_times=5
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test -y logs-test1 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs-test1 -y logs-test -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=20 -b 20 --repeat_times=500
# Migrate the data
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123
# Set the number of workers
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5
# Tune the buffer
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 -b 5 --buffer_count=1000000
# Increase the number of slices
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000
# Optimize the index
DELETE _template/logs
GET _template/logs
PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "12",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "30s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}
DELETE logs122
PUT logs122
GET logs122
POST logs122/_refresh
GET logs122/_search
GET _cat/indices
# Enable compression
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress true
# Regenerate IDs automatically
./esm -s https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -d https://ip-172-31-8-52.ap-northeast-1.compute.internal:9200 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress true --regenerate_id
# Go through the gateway
./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
[12-19 06:10:53] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [===============================================================================================================] 100.00% 1m5s
Bulk 10062580 / 10064570 [=================================================================================================================] 99.98% 1m5s
[12-19 06:11:58] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --regenerate_id
[12-19 06:14:29] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [===============================================================================================================] 100.00% 1m4s
Bulk 10062586 / 10064570 [=================================================================================================================] 99.98% 1m4s
[12-19 06:15:33] [INF] [main.go:537,main] data migration finished.
PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "12",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "90s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
Scroll 1367104 / 1 [=======================================================================================================] 136710400.00% 2562047h47m16s
[12-19 06:17:46] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [================================================================================================================] 100.00% 56s
Bulk 10062603 / 10064570 [==================================================================================================================] 99.98% 56s
[12-19 06:18:42] [INF] [main.go:537,main] data migration finished.
PUT _template/logs
{
  "order": 0,
  "index_patterns": [
    "logs*"
  ],
  "settings": {
    "codec": "default",
    "index": {
      "number_of_shards": "24",
      "number_of_replicas": "0",
      "refresh_interval": "-1",
      "translog.sync_interval": "90s",
      "translog.durability": "async",
      "translog.flush_threshold_size": "10g"
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings": {
          "mapping": {
            "ignore_above": 256,
            "type": "keyword"
          },
          "match_mapping_type": "string"
        }
      }
    ]
  },
  "aliases": {}
}
[12-19 06:29:40] [INF] [main.go:537,main] data migration finished.
root@ip-172-31-13-181:/tmp# ./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 -w 40 --sliced_scroll_size=60 -b 5 --buffer_count=2000000 --regenerate_id
Scroll 1367104 / 1 [=======================================================================================================] 136710400.00% 2562047h47m16s
logs1kw
[12-19 06:31:20] [INF] [main.go:506,main] start data migration..
Scroll 10064570 / 10064570 [================================================================================================================] 100.00% 55s
Bulk 10062602 / 10064570 [==================================================================================================================] 99.98% 55s
[12-19 06:32:15] [INF] [main.go:537,main] data migration finished.
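The compression step in the script trades CPU for network bandwidth by gzip-compressing HTTP request bodies. The Python sketch below shows roughly what that means for a bulk request; it is an illustration under assumptions, not ESM's code. The helper `build_bulk_body` and the sample documents are made up, and it assumes the receiving Elasticsearch node accepts bodies sent with a `Content-Encoding: gzip` header.

```python
import gzip
import json


def build_bulk_body(docs, index):
    """Serialize docs into the newline-delimited format used by the _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return ("\n".join(lines) + "\n").encode("utf-8")


# Made-up access-log-like documents, for illustration only.
docs = [
    {
        "clientip": "10.0.0.%d" % (i % 250),
        "request": "/index.html",
        "response": 200,
        "agent": "Mozilla/5.0",
    }
    for i in range(1000)
]

raw = build_bulk_body(docs, "logs-test")
compressed = gzip.compress(raw)

# A client would send `compressed` with the header "Content-Encoding: gzip".
print("raw bytes:", len(raw))
print("gzip bytes:", len(compressed))
```

Log data tends to be repetitive, so it usually compresses well. Whether that helps end to end depends on whether the network or the CPU is the bottleneck, which is worth measuring on your own cluster.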
This article is reposted from 弹性搜索.