Introduction
Grafana is an open-source data visualization tool written in Go. It can be used for data monitoring and statistics, and it includes alerting. Many companies currently use Grafana, such as PayPal, eBay, and Intel.
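Features
Alert notifications
Enabling alerting
In Grafana, only Graph panels support alert notifications. Grafana supports many notification channels, such as Email, Teams, and DingTalk.
Enable alerting in grafana.ini: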
#################################### Alerting ############################
[alerting]
# Disable alerting engine & UI features
enabled = true # enable
# Makes it possible to turn off alert rule execution but alerting UI is visible
execute_alerts = true # enable
# Default setting for new alert rules. Defaults to categorize error and timeouts as alerting. (alerting, keep_state)
;error_or_timeout = alerting
# Default setting for how Grafana handles nodata or null values in alerting. (alerting, no_data, keep_state, ok)
;nodata_or_nullvalues = no_data
# Alert notifications can include images, but rendering many images at the same time can overload the server
# This limit will protect the server from render overloading and make sure notifications are sent out quickly
;concurrent_render_limit = 5
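As a quick sanity check, the two switches above can be read back with a short script. This is only a sketch: it uses Python's standard configparser, which is not the parser Grafana itself uses, and SAMPLE_INI and alerting_enabled are hypothetical names introduced for illustration. In practice you would pass in the text of your own grafana.ini.

```python
import configparser

# Sample text standing in for the [alerting] section of grafana.ini (assumption).
SAMPLE_INI = """
[alerting]
enabled = true # enable
execute_alerts = true # enable
;concurrent_render_limit = 5
"""


def alerting_enabled(ini_text: str) -> bool:
    """Return True when both alerting switches are set to true."""
    parser = configparser.ConfigParser(
        comment_prefixes=("#", ";"),         # whole-line comments
        inline_comment_prefixes=("#", ";"),  # trailing comments after a value
        interpolation=None,                  # treat % characters literally
    )
    parser.read_string(ini_text)
    if not parser.has_section("alerting"):
        return False
    section = parser["alerting"]
    return (section.getboolean("enabled", fallback=False)
            and section.getboolean("execute_alerts", fallback=False))


print(alerting_enabled(SAMPLE_INI))  # True
```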
Email notifications
SMTP server configuration
To send email notifications, first configure the mail server and related settings in grafana.ini:
#################################### SMTP / Emailing ##########################
[smtp]
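enabled = true # whether SMTP is enabled
host = # outgoing mail server address; you can find it in your mail provider's setup guide
user = your email address
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = the password generated when you enabled the SMTP service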
;cert_file =
;key_file =
skip_verify = true
from_address = your email address
from_name = Grafana
# EHLO identity in SMTP dialog (defaults to instance_name)
;ehlo_identity = dashboard.example.com
[emails]
;welcome_email_on_sign_up = false
Example:
#################################### SMTP / Emailing #####################
[smtp]
enabled = true
host = smtp.163.com:25
user = 495804928@163.com
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = **********
cert_file =
key_file =
skip_verify = true
from_address = 495804928@163.com
from_name = Grafana
ehlo_identity = 163.com
After changing the configuration, remember to restart the Grafana service.
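Before relying on Grafana to deliver mail, it can help to confirm that the SMTP host, user, and password actually work. The sketch below uses Python's standard smtplib and is independent of Grafana. The helper names (split_host_port, build_test_message, send_test_message) and the example.com addresses are assumptions for illustration only.

```python
import smtplib
from email.message import EmailMessage


def split_host_port(host_setting: str, default_port: int = 25):
    """Split a 'host:port' value such as 'smtp.163.com:25'."""
    host, sep, port = host_setting.rpartition(":")
    if not sep:
        # No colon present: treat the whole value as the host name.
        return host_setting, default_port
    return host, int(port)


def build_test_message(from_name: str, from_address: str, to_address: str) -> EmailMessage:
    """Build a plain-text test message using from_name and from_address."""
    msg = EmailMessage()
    msg["From"] = f"{from_name} <{from_address}>"
    msg["To"] = to_address
    msg["Subject"] = "Grafana SMTP settings check"
    msg.set_content("If you can read this, the SMTP settings accept logins.")
    return msg


def send_test_message(host_setting: str, user: str, password: str, msg: EmailMessage) -> None:
    """Log in to the SMTP server and send msg (needs network access)."""
    host, port = split_host_port(host_setting)
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.ehlo()
        if server.has_extn("starttls"):
            server.starttls()
            server.ehlo()
        server.login(user, password)
        server.send_message(msg)


print(split_host_port("smtp.163.com:25"))  # ('smtp.163.com', 25)
```

To try it against a real server, build a message with build_test_message("Grafana", "you@example.com", "you@example.com") and pass it to send_test_message together with the same host, user, and password values you put in the [smtp] section.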
Detailed configuration
For detailed email and DingTalk configuration, see:
https://yq.aliyun.com/articles/255169
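Key points
1. How do you resolve "Template variables are not supported in alert queries"?
Alert queries (here against a Prometheus data source) do not support template variables, so how do we alert on metrics in a panel that uses many variables? The steps are as follows:
1. Add a query without variables to the panel's Queries. In the screenshot, query C cannot be used for alerting but D can, because C contains a variable.
2. Then set it to Disable query so that the chart does not display this metric's data (the part marked in the screenshot).
3. Then select this query in the panel's Alert settings, for example query(D,15s,now) in the screenshot. As an aside, Evaluate every is how often the alert rule is checked. When an anomaly is detected the state turns yellow, and if it has not recovered within the time configured in For, it turns red and the alert fires.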
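The interaction between Evaluate every and For described in step 3 can be pictured with a small simulation. This is a simplified model written for illustration, not Grafana's actual implementation. The function name simulate, the state names, and the sample values are all assumptions. Here "pending" stands for the yellow state and "alerting" for the red one.

```python
def simulate(samples, threshold, evaluate_every_s, for_s):
    """Return the rule state after each evaluation: 'ok', 'pending' or 'alerting'."""
    states = []
    breach_started = None  # time at which the current breach began
    for i, value in enumerate(samples):
        now = i * evaluate_every_s
        if value > threshold:
            if breach_started is None:
                breach_started = now
            # Only fire once the breach has lasted for the whole For window.
            if now - breach_started >= for_s:
                states.append("alerting")
            else:
                states.append("pending")
        else:
            breach_started = None
            states.append("ok")
    return states


# One sample per evaluation, checked every 15 s with For set to 30 s.
print(simulate([0.2, 0.9, 0.95, 0.97, 0.3], threshold=0.8,
               evaluate_every_s=15, for_s=30))
# ['ok', 'pending', 'pending', 'alerting', 'ok']
```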
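Note: always configure a Legend for the monitored metric. If DingTalk is sent a raw series name such as process_cpu_usage{application="server2",instance="localhost:9754",job="server2"}, it may complain about the characters. It is recommended to define the Legend as, for example, App: {{application}}, instance: {{instance}}, so that the alert message tells you which instance of which application has the problem.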
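To see what a Legend template does to a series name, the sketch below imitates the {{label}} substitution in plain Python. It is a rough stand-in for illustration, not Grafana's own code. The function name render_legend is hypothetical, and the label values are taken from the series name in the note above.

```python
import re


def render_legend(template: str, labels: dict) -> str:
    """Replace each {{name}} in template with the matching label value."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched so mistakes stay visible.
        return labels.get(name, match.group(0))
    return re.sub(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}", substitute, template)


labels = {"application": "server2", "instance": "localhost:9754", "job": "server2"}
print(render_legend("App: {{application}}, instance: {{instance}}", labels))
# App: server2, instance: localhost:9754
```

The rendered text contains no braces or quotation marks, which is the reason the note recommends a Legend over the raw series name.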