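Alertmanager is configured through command-line flags and a configuration file. The command-line flags configure immutable system parameters, while the configuration file defines inhibition rules, notification routing, and notification receivers.
The visual editor can help build the routing tree.
To see all available command-line flags, run alertmanager -h.
Alertmanager can reload its configuration at runtime. If the new configuration is not well-formed, the changes are not applied and an error is logged. A reload is triggered by sending SIGHUP to the process or by sending an HTTP POST request to the /-/reload endpoint.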
1. Configuration file
To specify which configuration file to load, use the --config.file flag:
./alertmanager --config.file=simple.yml
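The file is written in YAML format, defined by the scheme described below. Brackets indicate that a parameter is optional. For non-list parameters, the value is set to the specified default.
Generic placeholders are defined as follows: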
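- <duration>: a duration matching the regular expression [0-9]+(ms|[smhdwy])
- <labelname>: a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*
- <labelvalue>: a string of Unicode characters
- <filepath>: a valid path in the current working directory
- <boolean>: a boolean that can take the values true or false
- <string>: a regular string
- <secret>: a regular string that is a secret, such as a password
- <tmpl_string>: a string that is template-expanded before use
- <tmpl_secret>: a string that is template-expanded before use and that is a secret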
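Other placeholders are specified separately.
A valid example file can be found here.
The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for other configuration sections.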
global:
  # ResolveTimeout is the time after which an alert is declared resolved
  # if it has not been updated.
  [ resolve_timeout: <duration> | default = 5m ]
  # The default SMTP From header field.
  [ smtp_from: <tmpl_string> ]
  # The default SMTP smarthost used for sending emails, including port number.
  # Port number usually is 25, or 587 for SMTP over TLS (sometimes referred to as STARTTLS).
  # Example: smtp.example.org:587
  [ smtp_smarthost: <string> ]
  # The default hostname to identify to the SMTP server.
  [ smtp_hello: <string> | default = "localhost" ]
  [ smtp_auth_username: <string> ]
  # SMTP Auth using LOGIN and PLAIN.
  [ smtp_auth_password: <secret> ]
  # SMTP Auth using PLAIN.
  [ smtp_auth_identity: <string> ]
  # SMTP Auth using CRAM-MD5.
  [ smtp_auth_secret: <secret> ]
  # The default SMTP TLS requirement.
  [ smtp_require_tls: <boolean> | default = true ]
  # The API URL to use for Slack notifications.
  [ slack_api_url: <secret> ]
  [ victorops_api_key: <secret> ]
  [ victorops_api_url: <string> | default = "https://alert.victorops.com/integrations/generic/20131114/alert/" ]
  [ pagerduty_url: <string> | default = "https://events.pagerduty.com/v2/enqueue" ]
  [ opsgenie_api_key: <secret> ]
  [ opsgenie_api_url: <string> | default = "https://api.opsgenie.com/" ]
  [ hipchat_api_url: <string> | default = "https://api.hipchat.com/" ]
  [ hipchat_auth_token: <secret> ]
  [ wechat_api_url: <string> | default = "https://qyapi.weixin.qq.com/cgi-bin/" ]
  [ wechat_api_secret: <secret> ]
  [ wechat_api_corp_id: <string> ]
  # The default HTTP client configuration
  [ http_config: <http_config> ]

# Files from which custom notification template definitions are read.
# The last component may use a wildcard matcher, e.g. 'templates/*.tmpl'.
templates:
  [ - <filepath> ... ]

# The root node of the routing tree.
route: <route>

# A list of notification receivers.
receivers:
  - <receiver> ...

# A list of inhibition rules.
inhibit_rules:
  [ - <inhibit_rule> ... ]
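As a minimal sketch of how these top-level sections fit together, the file below sets a few global SMTP defaults, a root route, and one email receiver. The host name, the addresses, and the receiver name 'team-mail' are placeholders assumed for illustration; they do not come from the text above.

```yaml
# Hypothetical minimal configuration; every host and address is a placeholder.
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.org:587'
  smtp_from: 'alertmanager@example.org'

# The root route has no matchers, so every alert enters here.
route:
  receiver: 'team-mail'

receivers:
  - name: 'team-mail'
    email_configs:
      # from and smarthost fall back to the global SMTP settings above.
      - to: 'oncall@example.org'
```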
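2. <route>
A route block defines a node in the routing tree and its children. Its optional configuration parameters are inherited from its parent node if they are not set.
Every alert enters the routing tree at the configured top-level route, which must match all alerts (i.e. it has no configured matchers). The alert then traverses the child nodes. If continue is set to false, traversal stops after the first matching child. If continue is true on a matching node, the alert continues matching against subsequent siblings. If an alert matches none of a node's children (no child matches, or there are no children), it is handled according to the configuration parameters of the current node.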
[ receiver: <string> ]
# The labels by which incoming alerts are grouped together. For example,
# multiple alerts coming in for cluster=A and alertname=LatencyHigh would
# be batched into a single group.
#
# To aggregate by all possible labels use the special value '...' as the sole label name, for example:
# group_by: ['...']
# This effectively disables aggregation entirely, passing through all
# alerts as-is. This is unlikely to be what you want, unless you have
# a very low alert volume or your upstream notification system performs
# its own grouping.
[ group_by: '[' <labelname>, ... ']' ]
# Whether an alert should continue matching subsequent sibling nodes.
[ continue: <boolean> | default = false ]
# A set of equality matchers an alert has to fulfill to match the node.
match:
  [ <labelname>: <labelvalue>, ... ]
# A set of regex-matchers an alert has to fulfill to match the node.
match_re:
  [ <labelname>: <regex>, ... ]
# How long to initially wait to send a notification for a group
# of alerts. Allows to wait for an inhibiting alert to arrive or collect
# more initial alerts for the same group. (Usually ~0s to few minutes.)
[ group_wait: <duration> | default = 30s ]
# How long to wait before sending a notification about new alerts that
# are added to a group of alerts for which an initial notification has
# already been sent. (Usually ~5m or more.)
[ group_interval: <duration> | default = 5m ]
# How long to wait before sending a notification again if it has already
# been sent successfully for an alert. (Usually ~3h or more).
[ repeat_interval: <duration> | default = 4h ]
# Zero or more child routes.
routes:
  [ - <route> ... ]
Example:
# The root route with all parameters, which are inherited by the child
# routes if they are not overwritten.
route:
  receiver: 'default-receiver'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [cluster, alertname]
  # All alerts that do not match the following child routes
  # will remain at the root node and be dispatched to 'default-receiver'.
  routes:
  # All alerts with service=mysql or service=cassandra
  # are dispatched to the database pager.
  - receiver: 'database-pager'
    group_wait: 10s
    match_re:
      service: mysql|cassandra
  # All alerts with the team=frontend label match this sub-route.
  # They are grouped by product and environment rather than cluster
  # and alertname.
  - receiver: 'frontend-pager'
    group_by: [product, environment]
    match:
      team: frontend
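The effect of continue can be sketched with two sibling routes. In this hypothetical fragment (the receiver names and the severity label are assumptions for illustration), a critical alert carrying team=frontend is delivered to 'audit-log' and, because continue is true on that node, is then also matched against the next sibling and delivered to 'frontend-pager'.

```yaml
route:
  receiver: 'default-receiver'
  routes:
    # Matches critical alerts, then keeps evaluating the siblings below.
    - receiver: 'audit-log'
      continue: true
      match:
        severity: critical
    # Reached by team=frontend alerts, including those already matched above.
    - receiver: 'frontend-pager'
      match:
        team: frontend
```

With continue left at its default of false on the first route, such an alert would stop at 'audit-log' and never reach 'frontend-pager'.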
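3. <inhibit_rule>
An inhibition rule mutes an alert (the target) matching one set of matchers when an alert (the source) exists that matches another set of matchers. The target and source alerts must have the same label values for the label names in the equal list.
Semantically, a missing label and a label with an empty value are the same thing. Therefore, if all the label names listed in equal are missing from both the source and the target alert, the inhibition rule applies.
To prevent an alert from inhibiting itself, an inhibition rule never inhibits an alert that matches both the target and the source side of the rule. However, we recommend choosing target and source matchers so that an alert never matches both sides. That is much easier to reason about and does not trigger this special case.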
# Matchers that have to be fulfilled in the alerts to be muted.
target_match:
  [ <labelname>: <labelvalue>, ... ]
target_match_re:
  [ <labelname>: <regex>, ... ]
# Matchers for which one or more alerts have to exist for the
# inhibition to take effect.
source_match:
  [ <labelname>: <labelvalue>, ... ]
source_match_re:
  [ <labelname>: <regex>, ... ]
# Labels that must have an equal value in the source and target
# alert for the inhibition to take effect.
[ equal: '[' <labelname>, ... ']' ]
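A common way to apply such a rule, sketched here under the assumption that alerts carry a severity label with the values 'critical' and 'warning', is to mute warnings while a matching critical alert is firing. Because no alert can have both severity values at once, no alert matches both sides of the rule.

```yaml
inhibit_rules:
  # Source: a firing critical alert.
  - source_match:
      severity: 'critical'
    # Target: the warning-level alerts to be muted.
    target_match:
      severity: 'warning'
    # Inhibit only when source and target agree on these labels.
    equal: ['alertname', 'cluster']
```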
4. <http_config>
http_config allows configuring the HTTP client that a receiver uses to communicate with HTTP-based API services.
# Note that `basic_auth`, `bearer_token` and `bearer_token_file` options are
# mutually exclusive.
# Sets the `Authorization` header with the configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]
  [ password_file: <string> ]
# Sets the `Authorization` header with the configured bearer token.
[ bearer_token: <secret> ]
# Sets the `Authorization` header with the bearer token read from the configured file.
[ bearer_token_file: <filepath> ]
# Configures the TLS settings.
tls_config:
  [ <tls_config> ]
# Optional proxy URL.
[ proxy_url: <string> ]
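The sketch below shows one way these options might be combined on a webhook receiver. The receiver name, URLs, and file paths are hypothetical. Only basic_auth is used, since it cannot be combined with bearer_token or bearer_token_file.

```yaml
receivers:
  - name: 'internal-hook'
    webhook_configs:
      - url: 'https://hooks.example.org/alerts'
        http_config:
          # password_file is used instead of password; the two are mutually exclusive.
          basic_auth:
            username: 'alertmanager'
            password_file: 'secrets/hook-password'
          # Validate the server certificate against a private CA.
          tls_config:
            ca_file: 'certs/internal-ca.pem'
          proxy_url: 'http://proxy.example.org:3128'
```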
5. <tls_config>
tls_config allows configuring TLS connections.
# CA certificate to validate the server certificate with.
[ ca_file: <filepath> ]
# Certificate and key files for client cert authentication to the server.
[ cert_file: <filepath> ]
[ key_file: <filepath> ]
# ServerName extension to indicate the name of the server.
# http://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]
# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> | default = false ]
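6. <receiver>
A receiver is a named configuration of one or more notification integrations.
We are not actively adding new receivers; we recommend implementing custom notification integrations via the webhook receiver.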
# The unique name of the receiver.
name: <string>
# Configurations for several notification integrations.
email_configs:
  [ - <email_config>, ... ]
hipchat_configs:
  [ - <hipchat_config>, ... ]
pagerduty_configs:
  [ - <pagerduty_config>, ... ]
pushover_configs:
  [ - <pushover_config>, ... ]
slack_configs:
  [ - <slack_config>, ... ]
opsgenie_configs:
  [ - <opsgenie_config>, ... ]
webhook_configs:
  [ - <webhook_config>, ... ]
victorops_configs:
  [ - <victorops_config>, ... ]
wechat_configs:
  [ - <wechat_config>, ... ]
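A single receiver can fan out to several integrations at once. The fragment below is a sketch with a made-up receiver name, address, and channel. It assumes the SMTP settings and slack_api_url are already defined in the global section.

```yaml
receivers:
  - name: 'ops-team'
    # Every notification routed to 'ops-team' goes to both integrations.
    email_configs:
      - to: 'ops@example.org'
    slack_configs:
      - channel: '#ops-alerts'
        send_resolved: true
```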
7. <email_config>
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = false ]
# The email address to send notifications to.
to: <tmpl_string>
# The sender address.
[ from: <tmpl_string> | default = global.smtp_from ]
# The SMTP host through which emails are sent.
[ smarthost: <string> | default = global.smtp_smarthost ]
# The hostname to identify to the SMTP server.
[ hello: <string> | default = global.smtp_hello ]
# SMTP authentication information.
[ auth_username: <string> | default = global.smtp_auth_username ]
[ auth_password: <secret> | default = global.smtp_auth_password ]
[ auth_secret: <secret> | default = global.smtp_auth_secret ]
[ auth_identity: <string> | default = global.smtp_auth_identity ]
# The SMTP TLS requirement.
[ require_tls: <boolean> | default = global.smtp_require_tls ]
# TLS configuration.
tls_config:
  [ <tls_config> ]
# The HTML body of the email notification.
[ html: <tmpl_string> | default = '{{ template "email.default.html" . }}' ]
# The text body of the email notification.
[ text: <tmpl_string> ]
# Further headers email header key/value pairs. Overrides any headers
# previously set by the notification implementation.
[ headers: { <string>: <tmpl_string>, ... } ]
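As an illustration of the per-receiver overrides, the following sketch sets its own SMTP host and credentials and customizes the Subject header with a template. All hosts, addresses, and the password are placeholders. The Subject template is an assumption about what a reader might want, not a documented default.

```yaml
email_configs:
  - to: 'oncall@example.org'
    from: 'alertmanager@example.org'
    smarthost: 'smtp.example.org:587'
    auth_username: 'alertmanager'
    auth_password: 'example-password'  # placeholder secret
    send_resolved: true
    tls_config:
      server_name: 'smtp.example.org'
    # Overrides the Subject header set by the notification implementation.
    headers:
      Subject: '[{{ .Status }}] {{ .CommonLabels.alertname }}'
```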
8. <hipchat_config>
HipChat notifications use a Build Your Own integration.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = false ]
# The HipChat Room ID.
room_id: <tmpl_string>
# The auth token.
[ auth_token: <secret> | default = global.hipchat_auth_token ]
# The URL to send API requests to.
[ api_url: <string> | default = global.hipchat_api_url ]
# See https://www.hipchat.com/docs/apiv2/method/send_room_notification
# A label to be shown in addition to the sender's name.
[ from: <tmpl_string> | default = '{{ template "hipchat.default.from" . }}' ]
# The message body.
[ message: <tmpl_string> | default = '{{ template "hipchat.default.message" . }}' ]
# Whether this message should trigger a user notification.
[ notify: <boolean> | default = false ]
# Determines how the message is treated by the alertmanager and rendered inside HipChat. Valid values are 'text' and 'html'.
[ message_format: <string> | default = 'text' ]
# Background color for message.
[ color: <tmpl_string> | default = '{{ if eq .Status "firing" }}red{{ else }}green{{ end }}' ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
9. <pagerduty_config>
PagerDuty notifications are sent via the PagerDuty API. PagerDuty provides documentation on how to set up the integration.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = true ]
# The following two options are mutually exclusive.
# The PagerDuty integration key (when using PagerDuty integration type `Events API v2`).
routing_key: <tmpl_secret>
# The PagerDuty integration key (when using PagerDuty integration type `Prometheus`).
service_key: <tmpl_secret>
# The URL to send API requests to
[ url: <string> | default = global.pagerduty_url ]
# The client identification of the Alertmanager.
[ client: <tmpl_string> | default = '{{ template "pagerduty.default.client" . }}' ]
# A backlink to the sender of the notification.
[ client_url: <tmpl_string> | default = '{{ template "pagerduty.default.clientURL" . }}' ]
# A description of the incident.
[ description: <tmpl_string> | default = '{{ template "pagerduty.default.description" .}}' ]
# Severity of the incident.
[ severity: <tmpl_string> | default = 'error' ]
# A set of arbitrary key/value pairs that provide further detail
# about the incident.
[ details: { <string>: <tmpl_string>, ... } | default = {
  firing:       '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
  resolved:     '{{ template "pagerduty.default.instances" .Alerts.Resolved }}'
  num_firing:   '{{ .Alerts.Firing | len }}'
  num_resolved: '{{ .Alerts.Resolved | len }}'
} ]
# Images to attach to the incident.
images:
  [ <image_config> ... ]
# Links to attach to the incident.
links:
  [ <link_config> ... ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
9.1 <image_config>
These fields are documented in the PagerDuty API documentation.
source: <tmpl_string>
alt: <tmpl_string>
text: <tmpl_string>
9.2 <link_config>
These fields are documented in the PagerDuty API documentation.
href: <tmpl_string>
text: <tmpl_string>
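A PagerDuty entry might look like the sketch below. The integration key and runbook URL are placeholders. Deriving the incident severity from a severity label is an assumption about how the alerts are labeled. Only routing_key is set, since it is mutually exclusive with service_key.

```yaml
pagerduty_configs:
  - routing_key: 'example-integration-key'  # placeholder for an Events API v2 key
    # Map an assumed 'severity' label onto the incident severity.
    severity: '{{ if eq .CommonLabels.severity "critical" }}critical{{ else }}warning{{ end }}'
    links:
      - href: 'https://runbooks.example.org/{{ .CommonLabels.alertname }}'
        text: 'Runbook'
```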
10. <pushover_config>
Pushover notifications are sent via the Pushover API.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = true ]
# The recipient user’s user key.
user_key: <secret>
# Your registered application’s API token, see https://pushover.net/apps
token: <secret>
# Notification title.
[ title: <tmpl_string> | default = '{{ template "pushover.default.title" . }}' ]
# Notification message.
[ message: <tmpl_string> | default = '{{ template "pushover.default.message" . }}' ]
# A supplementary URL shown alongside the message.
[ url: <tmpl_string> | default = '{{ template "pushover.default.url" . }}' ]
# Priority, see https://pushover.net/api#priority
[ priority: <tmpl_string> | default = '{{ if eq .Status "firing" }}2{{ else }}0{{ end }}' ]
# How often the Pushover servers will send the same notification to the user.
# Must be at least 30 seconds.
[ retry: <duration> | default = 1m ]
# How long your notification will continue to be retried for, unless the user
# acknowledges the notification.
[ expire: <duration> | default = 1h ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
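The retry and expire settings interact with the default priority of 2 for firing alerts. A sketch with placeholder credentials, which re-sends every two minutes for up to half an hour unless the user acknowledges the notification:

```yaml
pushover_configs:
  - user_key: 'example-user-key'   # placeholder
    token: 'example-app-token'     # placeholder
    # Priority is left at its default (2 while firing, 0 when resolved).
    retry: 2m
    expire: 30m
```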
11. <slack_config>
Slack notifications are sent via Slack webhooks. The notification contains an attachment.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = false ]
# The Slack webhook URL.
[ api_url: <secret> | default = global.slack_api_url ]
# The channel or user to send notifications to.
channel: <tmpl_string>
# API request data as defined by the Slack webhook API.
[ icon_emoji: <tmpl_string> ]
[ icon_url: <tmpl_string> ]
[ link_names: <boolean> | default = false ]
[ username: <tmpl_string> | default = '{{ template "slack.default.username" . }}' ]
# The following parameters define the attachment.
actions:
  [ <action_config> ... ]
[ callback_id: <tmpl_string> | default = '{{ template "slack.default.callbackid" . }}' ]
[ color: <tmpl_string> | default = '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}' ]
[ fallback: <tmpl_string> | default = '{{ template "slack.default.fallback" . }}' ]
fields:
  [ <field_config> ... ]
[ footer: <tmpl_string> | default = '{{ template "slack.default.footer" . }}' ]
[ pretext: <tmpl_string> | default = '{{ template "slack.default.pretext" . }}' ]
[ short_fields: <boolean> | default = false ]
[ text: <tmpl_string> | default = '{{ template "slack.default.text" . }}' ]
[ title: <tmpl_string> | default = '{{ template "slack.default.title" . }}' ]
[ title_link: <tmpl_string> | default = '{{ template "slack.default.titlelink" . }}' ]
[ image_url: <tmpl_string> ]
[ thumb_url: <tmpl_string> ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
11.1 <action_config>
These fields are documented in the Slack API documentation.
type: <tmpl_string>
text: <tmpl_string>
url: <tmpl_string>
[ style: <tmpl_string> | default = '' ]
11.2 <field_config>
These fields are documented in the Slack API documentation.
title: <tmpl_string>
value: <tmpl_string>
[ short: <boolean> | default = slack_config.short_fields ]
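To show how the attachment parameters, fields, and actions fit together, here is a sketch of a Slack entry. The webhook URL and channel are placeholders, and the cluster and severity labels are assumed to exist on the alerts.

```yaml
slack_configs:
  - api_url: 'https://hooks.slack.com/services/EXAMPLE/WEBHOOK/PATH'  # placeholder
    channel: '#alerts'
    send_resolved: true
    title: '{{ .CommonLabels.alertname }} ({{ .Status }})'
    # Fields are rendered short unless a field overrides it.
    short_fields: true
    fields:
      - title: 'Cluster'
        value: '{{ .CommonLabels.cluster }}'
      - title: 'Severity'
        value: '{{ .CommonLabels.severity }}'
        short: false
    actions:
      - type: 'button'
        text: 'Open Alertmanager'
        url: '{{ .ExternalURL }}'
```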
12. <opsgenie_config>
OpsGenie notifications are sent via the OpsGenie API.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = true ]
# The API key to use when talking to the OpsGenie API.
[ api_key: <secret> | default = global.opsgenie_api_key ]
# The host to send OpsGenie API requests to.
[ api_url: <string> | default = global.opsgenie_api_url ]
# Alert text limited to 130 characters.
[ message: <tmpl_string> ]
# A description of the incident.
[ description: <tmpl_string> | default = '{{ template "opsgenie.default.description" . }}' ]
# A backlink to the sender of the notification.
[ source: <tmpl_string> | default = '{{ template "opsgenie.default.source" . }}' ]
# A set of arbitrary key/value pairs that provide further detail
# about the incident.
[ details: { <string>: <tmpl_string>, ... } ]
# Comma separated list of team responsible for notifications.
[ teams: <tmpl_string> ]
# Comma separated list of tags attached to the notifications.
[ tags: <tmpl_string> ]
# Additional alert note.
[ note: <tmpl_string> ]
# Priority level of alert. Possible values are P1, P2, P3, P4, and P5.
[ priority: <tmpl_string> ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
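A possible OpsGenie entry, with a placeholder API key and made-up team names. Choosing the priority from a severity label is an assumption about how the alerts are labeled.

```yaml
opsgenie_configs:
  - api_key: 'example-api-key'  # placeholder
    # Keep the message short; it is limited to 130 characters.
    message: '{{ .CommonLabels.alertname }}'
    teams: 'platform,database'
    tags: 'prometheus,{{ .CommonLabels.severity }}'
    priority: '{{ if eq .CommonLabels.severity "critical" }}P1{{ else }}P3{{ end }}'
```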
13. <victorops_config>
VictorOps notifications are sent via the VictorOps API.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = true ]
# The API key to use when talking to the VictorOps API.
[ api_key: <secret> | default = global.victorops_api_key ]
# The VictorOps API URL.
[ api_url: <string> | default = global.victorops_api_url ]
# A key used to map the alert to a team.
routing_key: <tmpl_string>
# Describes the behavior of the alert (CRITICAL, WARNING, INFO).
[ message_type: <tmpl_string> | default = 'CRITICAL' ]
# Contains summary of the alerted problem.
[ entity_display_name: <tmpl_string> | default = '{{ template "victorops.default.entity_display_name" . }}' ]
# Contains long explanation of the alerted problem.
[ state_message: <tmpl_string> | default = '{{ template "victorops.default.state_message" . }}' ]
# The monitoring tool the state message is from.
[ monitoring_tool: <tmpl_string> | default = '{{ template "victorops.default.monitoring_tool" . }}' ]
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
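A short VictorOps sketch with a placeholder API key and a hypothetical routing key. It downgrades the message type for alerts that are not labeled critical, assuming a severity label is present.

```yaml
victorops_configs:
  - api_key: 'example-api-key'  # placeholder
    routing_key: 'platform'     # hypothetical team routing key
    message_type: '{{ if eq .CommonLabels.severity "critical" }}CRITICAL{{ else }}WARNING{{ end }}'
```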
14. <webhook_config>
The webhook receiver allows configuring a generic receiver.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = true ]
# The endpoint to send HTTP POST requests to.
url: <string>
# The HTTP client's configuration.
[ http_config: <http_config> | default = global.http_config ]
Alertmanager sends HTTP POST requests to the configured endpoint in the following JSON format:
{
  "version": "4",
  "groupKey": <string>,    // key identifying the group of alerts (e.g. to deduplicate)
  "status": "<resolved|firing>",
  "receiver": <string>,
  "groupLabels": <object>,
  "commonLabels": <object>,
  "commonAnnotations": <object>,
  "externalURL": <string>,  // backlink to the Alertmanager.
  "alerts": [
    {
      "status": "<resolved|firing>",
      "labels": <object>,
      "annotations": <object>,
      "startsAt": "<rfc3339>",
      "endsAt": "<rfc3339>",
      "generatorURL": <string> // identifies the entity that caused the alert
    },
    ...
  ]
}
There is a list of integrations that build on this feature.
15. <wechat_config>
WeChat notifications are sent via the WeChat API.
# Whether or not to notify about resolved alerts.
[ send_resolved: <boolean> | default = false ]
# The API key to use when talking to the WeChat API.
[ api_secret: <secret> | default = global.wechat_api_secret ]
# The WeChat API URL.
[ api_url: <string> | default = global.wechat_api_url ]
# The corp id for authentication.
[ corp_id: <string> | default = global.wechat_api_corp_id ]
# API request data as defined by the WeChat API.
[ message: <tmpl_string> | default = '{{ template "wechat.default.message" . }}' ]
[ agent_id: <string> | default = '{{ template "wechat.default.agent_id" . }}' ]
[ to_user: <string> | default = '{{ template "wechat.default.to_user" . }}' ]
[ to_party: <string> | default = '{{ template "wechat.default.to_party" . }}' ]
[ to_tag: <string> | default = '{{ template "wechat.default.to_tag" . }}' ]
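A WeChat entry could be sketched as follows. The corp ID, secret, agent ID, and party ID are all placeholders for values issued by a WeChat Work account.

```yaml
wechat_configs:
  - corp_id: 'example-corp-id'   # placeholder
    api_secret: 'example-secret' # placeholder
    agent_id: '1000002'          # hypothetical application ID
    to_party: '2'                # hypothetical department ID
    send_resolved: true
```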
16. Links
Prometheus website: https://prometheus.io/
My GitHub: https://github.com/Alrights/prometheus