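I have recently been learning how to test SNMP alerting on Linux and tune how the alert messages are received. The alerts are sent through Prometheus + Alertmanager + SNMP notifier.
Downloading SNMP notifier
1. Precompiled binaries: https://github.com/maxwo/snmp_notifier/releases
2. Docker images: https://hub.docker.com/r/maxwo/snmp-notifier
3. Compiling the binary: git clone https://github.com/maxwo/snmp_notifier.git
Installing SNMP notifier by compiling the binary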
git clone https://github.com/maxwo/snmp_notifier.git
cd snmp_notifier
make build
./snmp_notifier
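Fixing the git clone error: Failed connect to github.com:443; Connection refused
Cause: DNS resolution for some GitHub domains is poisoned, so the lookup does not return the correct IP address for the domain name.
Finding the real IP: open https://www.ipaddress.com/, enter the domain that cannot be reached, and note the IP address it reports (140.82.113.3 in this case).
sudo vim /etc/hosts
# append this line at the end of the file
140.82.113.3 github.com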
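Before retrying git clone, it can help to confirm that the new entry is really in place. The following is a minimal sketch, assuming a POSIX awk is available; hosts_lookup is a hypothetical helper name, not part of any tool, and it only reads the file it is given.

```shell
#!/usr/bin/env bash
# hosts_lookup FILE HOSTNAME
# Print the first IP address that a hosts-format FILE maps to HOSTNAME.
# (Hypothetical helper for illustration only.)
hosts_lookup() {
  local file="$1" host="$2"
  awk -v h="$host" '
    { sub(/#.*/, "") }                # strip comments, including whole-line ones
    { for (i = 2; i <= NF; i++)       # fields 2..NF are the host names
        if ($i == h) { print $1; exit } }
  ' "$file"
}
```

Running hosts_lookup /etc/hosts github.com should print the address that was added. getent hosts github.com additionally shows what the system resolver returns for the name.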
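Fixing an error that often appears while make build compiles the Go code: dial tcp 172.217.163.49:443: connect: connection refused. Cause: the connection is blocked by the firewall. Set a Go module proxy directly on the command line: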
go env -w GOPROXY=https://goproxy.cn
Prometheus alert rule file
groups:
- name: node_rule
  rules:
  - alert: Server Status
    expr: up == 0
    for: 10s
    labels:
      severity: critical
      service: node
    annotations:
      summary: "Instance Down"
      description: "Server {{$labels.instance}}: Instance Down"
      ip: "{{$labels.ip}}"
- name: MogDB_Rule
  rules:
  - alert: MogDB Status
    expr: pg_up == 0
    for: 10s
    labels:
      severity: critical
      service: MogDB
    annotations:
      summary: "MogDB Database Down"
      description: "Database {{$labels.instance}}: MogDB Database Down"
      ip: "{{$labels.ip}}"
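A YAML mistake in the rule file keeps Prometheus from loading it, so it can be worth validating the file before reloading. The following is a minimal sketch, assuming promtool (shipped with Prometheus) is on the PATH; check_rules is a hypothetical wrapper name, and the file path passed to it is whatever the rule file is actually called.

```shell
#!/usr/bin/env bash
# check_rules FILE
# Validate a Prometheus rule file with "promtool check rules".
# Returns promtool's exit status, or 127 if promtool is not installed.
check_rules() {
  if command -v promtool >/dev/null 2>&1; then
    promtool check rules "$1"
  else
    echo "promtool not found; cannot check $1" >&2
    return 127
  fi
}
```

For example, check_rules rules.yml (with rules.yml standing in for the real path) should report the number of rules found when the file is valid.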
Alertmanager configuration file
The Alertmanager should be configured with the SNMP notifier as the alert receiver:
route:
  group_wait: 10s
  group_interval: 30s
  repeat_interval: 1m
  receiver: 'snmp_notifier'
receivers:
- name: 'snmp_notifier'
  webhook_configs:
  - url: http://snmp.notifier.service:9464/alerts
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
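To exercise the Alertmanager to SNMP notifier path without waiting for a real outage, a test alert can be posted to Alertmanager's v2 API. The following is a minimal sketch: the function names, the ALERTMANAGER_URL variable and the "test alert" summary are assumptions made for illustration, and http://localhost:9093 is only Alertmanager's default listen address, which may differ in a given setup.

```shell
#!/usr/bin/env bash
# test_alert_json ALERTNAME INSTANCE SEVERITY
# Print a one-alert JSON payload for POST /api/v2/alerts.
# (Arguments must not contain double quotes; this is a sketch, not a JSON encoder.)
test_alert_json() {
  printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"%s"},"annotations":{"summary":"test alert"}}]\n' "$1" "$2" "$3"
}

# send_test_alert ALERTNAME INSTANCE SEVERITY
# POST the payload to Alertmanager. ALERTMANAGER_URL is an assumed
# environment variable; it falls back to the default listen address.
send_test_alert() {
  local base="${ALERTMANAGER_URL:-http://localhost:9093}"
  test_alert_json "$@" |
    curl -sS -X POST -H 'Content-Type: application/json' \
      --data @- "$base/api/v2/alerts"
}
```

For example, send_test_alert "SNMP Test" 129db critical should, after the configured group_wait, be forwarded to the snmp_notifier webhook and show up as a trap on the receiving side.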
Starting SNMP notifier
$ ./snmp_notifier --help
$ nohup ./snmp_notifier --web.listen-address=snmp.notifier.service:9464 --snmp.trap-description-template description-template.tpl --log.level=debug > snmp.log 2>&1 &
Simple example
Traps include 3 fields:
- a trap unique ID;
- the alert/trap status;
- a description of the alerts.
$ snmptrapd -m ALL -m +SNMP-NOTIFIER-MIB -f -Of -Lo -c scripts/snmptrapd.conf
Agent Address: 0.0.0.0
Agent Hostname: localhost
Date: 16 - 42 - 48 - 7 - 11 - 4461579
Enterprise OID: .
Trap Type: Cold Start
Trap Sub-Type: 0
Community/Infosec Context: TRAP2, SNMP v2c, community public
Uptime: 0
Description: Cold Start
PDU Attribute/Value Pair Array:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (697200) 1:56:12.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
.iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[alertname=Server Status,instance=129db,job=129db,service=node,severity=critical]"
.iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "critical"
.iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: critical
- Alert: Server Status
Summary: Instance Down
Description: Server 129db: Instance Down
ip:"
.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "129db"
--------------
Agent Address: 0.0.0.0
Agent Hostname: localhost
Date: 16 - 42 - 48 - 7 - 11 - 4461579
Enterprise OID: .
Trap Type: Cold Start
Trap Sub-Type: 0
Community/Infosec Context: TRAP2, SNMP v2c, community public
Uptime: 0
Description: Cold Start
PDU Attribute/Value Pair Array:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (703200) 1:57:12.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
.iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[alertname=Server Status,instance=129db,job=129db,service=node,severity=critical]"
.iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "info"
.iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: OK"
.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "129db"
--------------
# The two traps above are the alerts sent when the server went down and when it recovered
Agent Address: 0.0.0.0
Agent Hostname: localhost
Date: 16 - 42 - 48 - 7 - 11 - 4461579
Enterprise OID: .
Trap Type: Cold Start
Trap Sub-Type: 0
Community/Infosec Context: TRAP2, SNMP v2c, community public
Uptime: 0
Description: Cold Start
PDU Attribute/Value Pair Array:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (784500) 2:10:45.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
.iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[alertname=MogDB Status,instance=129db,job=129db,server=192.168.134.129:5432,service=MogDB,severity=critical]"
.iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "critical"
.iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: critical
- Alert: MogDB Status
Summary: MogDB Database Down
Description: Database 129db: MogDB Database Down
ip:"
.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "192.168.134.129"
--------------
Agent Address: 0.0.0.0
Agent Hostname: localhost
Date: 16 - 42 - 48 - 7 - 11 - 4461579
Enterprise OID: .
Trap Type: Cold Start
Trap Sub-Type: 0
Community/Infosec Context: TRAP2, SNMP v2c, community public
Uptime: 0
Description: Cold Start
PDU Attribute/Value Pair Array:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (793500) 2:12:15.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
.iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[alertname=MogDB Status,instance=129db,job=129db,server=192.168.134.129:5432,service=MogDB,severity=critical]"
.iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "info"
.iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: OK"
.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "192.168.134.129"
--------------
# The two traps above are the alerts sent when MogDB went down and when it recovered
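When reading the trap output, two values are worth decoding: sysUpTime is reported in TimeTicks (hundredths of a second), and the trap ID (varbind .1) packs the alert labels between square brackets. The following is a minimal sketch of both; the function names are hypothetical, and the label parser assumes that label values contain no commas or brackets.

```shell
#!/usr/bin/env bash
# timeticks_to_hms TICKS
# Convert SNMP TimeTicks (hundredths of a second) to H:MM:SS.CC,
# the layout snmptrapd prints above for uptimes under one day.
timeticks_to_hms() {
  local t="$1"
  printf '%d:%02d:%02d.%02d\n' \
    $(( t / 360000 )) $(( t / 6000 % 60 )) $(( t / 100 % 60 )) $(( t % 100 ))
}

# trap_labels TRAP_ID
# Print one key=value label per line from an ID such as
# 1.3.6.1.4.1.98789.0.1[alertname=Server Status,instance=129db]
trap_labels() {
  printf '%s\n' "$1" |
    sed -e 's/^[^[]*\[//' -e 's/]$//' |   # drop the OID prefix and both brackets
    tr ',' '\n'                            # one label per line
}
```

For example, timeticks_to_hms 697200 prints 1:56:12.00, matching the first trap above, and trap_labels applied to that trap's ID lists alertname, instance, job, service and severity on separate lines.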
Last modified: 2022-02-22 09:17:05