Kube Prometheus project
https://github.com/coreos/kube-prometheus
Helm chart for the project
https://github.com/helm/charts/blob/master/stable/prometheus-operator
Prometheus official website
https://prometheus.io/
Prometheus Operator project
https://github.com/coreos/prometheus-operator/
A deployment example
https://github.com/coreos/kube-prometheus/blob/master/examples/example-app/
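What is Prometheus Operator
Prometheus Operator is a monitoring and alerting tool that runs on Kubernetes. Deploying it does not require creating or editing Prometheus configuration files; everything is done by creating Prometheus's own resource objects. Changes to the monitoring configuration take effect in near real time.
Prometheus Operator custom resources (CustomResourceDefinitions, CRDs)
- Prometheus: defines a deployment of the Prometheus monitoring system.
- ServiceMonitor: monitors a set of Services. Each Service must expose metrics for Prometheus to collect.
- PodMonitor: monitors a set of Pods.
- PrometheusRule: a Prometheus rule file, which contains the alerting rules.
- Alertmanager: defines a deployment of Alertmanager.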
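Each of these kinds is created like any other Kubernetes object. As a minimal sketch, an Alertmanager resource might look like the following. The name example and the replica count are illustrative assumptions, not values taken from kube-prometheus.

```yaml
# Minimal Alertmanager custom resource (illustrative sketch).
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example      # hypothetical name
spec:
  replicas: 3        # assumed value; several replicas give high availability
```

Once applied with kubectl, the operator should pick up the object and run Alertmanager accordingly. The other kinds are illustrated in the sections that follow.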
QuickStart
Download the kube-prometheus project.
git clone https://github.com/coreos/kube-prometheus.git
cd kube-prometheus
Run:
# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
kubectl create -f manifests/setup
# The following command waits until the setup step has finished
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/
Removing Kube Prometheus
Run:
kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
Accessing the dashboards
The dashboards can be reached with port forwarding.
Accessing Prometheus
$ kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
Accessing Grafana
$ kubectl --namespace monitoring port-forward svc/grafana 3000
Accessing Alertmanager
$ kubectl --namespace monitoring port-forward svc/alertmanager-main 9093
Each service is then reachable on localhost at the forwarded port.
Note: to reach the services through an address other than localhost, add the --address parameter. For example:
$ kubectl --namespace monitoring port-forward --address 0.0.0.0 svc/prometheus-k8s 9090
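Deploying Prometheus Operator manually
The steps above use Kube Prometheus, which bundles a set of preconfigured Prometheus Operator resource objects so that everything can be installed in one go.
Prometheus Operator can also be deployed by hand.
Installing Prometheus Operator
- Clone the Prometheus Operator project with Git
git clone https://github.com/coreos/prometheus-operator.git
cd prometheus-operator
- Run the following command to create the prometheus-operator objects and the related CRDs
kubectl apply -f bundle.yaml
- Enable RBAC rules for the Prometheus resource object
Create a ServiceAccount: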
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
Create a ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
Create a ClusterRoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
- Create the Prometheus resource object
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  podMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
serviceMonitorSelector and podMonitorSelector determine which ServiceMonitors and PodMonitors take effect. An empty selector ({}) selects all objects.
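To make the selector behaviour concrete, the sketch below is a variant of the object above that uses empty selectors to pick up every ServiceMonitor and PodMonitor. The name prometheus-all is a hypothetical placeholder. It relies on the usual Kubernetes label-selector convention, which the operator's API documentation also states: an empty selector ({}) matches everything, whereas a selector that is left out entirely matches nothing.

```yaml
# Sketch: a Prometheus object that selects all ServiceMonitors and PodMonitors.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus-all             # hypothetical name
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector: {}       # empty selector: match every ServiceMonitor
  podMonitorSelector: {}           # empty selector: match every PodMonitor
  resources:
    requests:
      memory: 400Mi
```

Because no namespace selectors are set here, only monitors in the same namespace as this Prometheus object would be expected to be considered.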
- Deploy your own application.
Here is an example.
Create a Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
We assume here that the metrics are exposed on port 8080.
Next, create a Service through which the metrics are accessed.
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
- Create a ServiceMonitor
In this step Prometheus needs to read the metrics exposed by the Service created in the previous step. This is done with a ServiceMonitor.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
Notes:
- The selector here must match the Service created in the previous step.
- The port under endpoints can only be the name of a named port in the Service; a port number cannot be used.
- Make sure that the serviceMonitorSelector and serviceMonitorNamespaceSelector of the Prometheus object match the ServiceMonitor created in this step.
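The last note can be illustrated with a sketch of a Prometheus object that restricts both which ServiceMonitors are selected and which namespaces they are read from. The namespace label monitoring: enabled is a hypothetical example, not something defined by the operator.

```yaml
# Sketch: select ServiceMonitors by label and by namespace label.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  # Only ServiceMonitors labelled team=frontend are picked up ...
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  # ... and only from namespaces that carry this (hypothetical) label.
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring: enabled
```

For this to match, the namespace holding the ServiceMonitor would itself need the label, which could be added with kubectl label namespace.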
- Expose the Prometheus port
This step is only needed if the Prometheus port should be reachable from outside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
This creates a Service of type NodePort.
The Prometheus resource object
The Prometheus resource object acts as the configuration hub for the whole Prometheus deployment.
A Prometheus resource object is described as follows:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  creationTimestamp: "2020-02-12T04:38:38Z"
  generation: 1
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
  resourceVersion: "3745"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/monitoring/prometheuses/k8s
  uid: 3d66375e-b8fb-453b-bcd2-a9ef1fd75387
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.15.2
Four of these settings limit the scope of monitoring:
- podMonitorNamespaceSelector: which namespaces are scanned for PodMonitors. If empty ({}), all namespaces are scanned.
- serviceMonitorNamespaceSelector: which namespaces are scanned for ServiceMonitors. If empty ({}), all namespaces are scanned.
- podMonitorSelector: a selector that determines which PodMonitors are scanned. If empty ({}), all PodMonitors are scanned.
- serviceMonitorSelector: a selector that determines which ServiceMonitors are scanned. If empty ({}), all ServiceMonitors are scanned.
In addition there is ruleSelector: only PrometheusRules that match this selector are loaded. So if we use the default Prometheus configuration, any PrometheusRules we create ourselves need the following two labels:
prometheus: k8s
role: alert-rules
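As a sketch of what such a custom rule might look like, the PrometheusRule below carries both labels and alerts when no healthy example-app target is being scraped. The rule name, alert name, duration and severity are hypothetical. It also assumes that the scrape job is named after the Service (example-app) and that the rule lives in the monitoring namespace, alongside the Prometheus object shown above.

```yaml
# Sketch: a custom alerting rule that the default ruleSelector would match.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-rules        # hypothetical name
  namespace: monitoring          # assumed: same namespace as the Prometheus object
  labels:
    prometheus: k8s              # required by ruleSelector
    role: alert-rules            # required by ruleSelector
spec:
  groups:
  - name: example-app.rules
    rules:
    - alert: ExampleAppDown      # hypothetical alert name
      expr: |
        absent(up{job="example-app"} == 1)
      for: 5m
      labels:
        severity: warning
      annotations:
        message: No healthy example-app target has been scraped for 5 minutes.
```

If either label is missing, the rule would not match the ruleSelector and would be ignored.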
Configuring remote storage for Prometheus
In production, Prometheus monitoring data needs to be persisted to a database.
InfluxDB is recommended here: of the backends supported by the adapter used below, it is the only one that supports both reading and writing.
Installing InfluxDB
InfluxDB official website: https://www.influxdata.com/
Download and install it, then start the service.
# Start InfluxDB
systemctl start influxdb
# Open the InfluxDB shell
influx
Create a database named prometheus:
curl -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE prometheus"
Building and running the remote storage adapter
To use InfluxDB as remote storage, Prometheus needs a remote_storage_adapter. remote_storage_adapter supports Graphite, InfluxDB and OpenTSDB; of these, InfluxDB supports both read and write modes.
After cloning the source code with Git, build it with the go build command.
Then run the remote storage adapter:
./remote_storage_adapter --influxdb-url=http://localhost:8086/ --influxdb.database=prometheus --influxdb.retention-policy=autogen
Note: 8086 is the default InfluxDB port, and the database used here is named prometheus.
Configuring the Prometheus resource object
The relevant configuration options are:
- remoteRead: the URL to read data from
- remoteWrite: the URL to write data to
Edit the manifest of the Prometheus resource object and add:
spec:
  remoteRead:
  - url: "http://localhost:9201/read"
  remoteWrite:
  - url: "http://localhost:9201/write"
Note: 9201 is the port that remote_storage_adapter listens on by default.
PS: the equivalent settings in a native Prometheus configuration file are:
# Remote write configuration (for Graphite, OpenTSDB, or InfluxDB).
remote_write:
  - url: "http://localhost:9201/write"
# Remote read configuration (for InfluxDB only at the moment).
remote_read:
  - url: "http://localhost:9201/read"
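One caveat: inside a cluster, localhost in these URLs refers to the Prometheus pod itself, so they only work if the adapter runs in that same pod. As a sketch, assuming the adapter instead runs on a host outside the cluster, a Service without a selector plus a manually maintained Endpoints object can give it a stable in-cluster name. The names and the namespace are hypothetical, and 192.0.2.10 is a placeholder address.

```yaml
# Sketch: give an external remote_storage_adapter an in-cluster DNS name.
apiVersion: v1
kind: Service
metadata:
  name: remote-storage-adapter     # hypothetical name
  namespace: monitoring            # assumed namespace
spec:
  ports:
  - name: http
    port: 9201
    targetPort: 9201
---
# The Endpoints object must share the Service's name and namespace.
apiVersion: v1
kind: Endpoints
metadata:
  name: remote-storage-adapter
  namespace: monitoring
subsets:
- addresses:
  - ip: 192.0.2.10                 # placeholder: address of the adapter host
  ports:
  - name: http
    port: 9201
```

With this in place, the remoteWrite and remoteRead URLs could point at http://remote-storage-adapter.monitoring.svc:9201/write and http://remote-storage-adapter.monitoring.svc:9201/read.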
The ServiceMonitor resource object
A ServiceMonitor configures Prometheus to read metrics from a Service.
First define a Service that specifies the port on which the metrics are exposed.
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
The metrics are exposed on port 8080 of the pods behind this Service.
Then create a ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
The port field here must be a named port.
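Besides port, an endpoint entry accepts further scrape settings. The sketch below adds path and interval to the example above. The name example-app-tuned is hypothetical, and the values /metrics and 30s are illustrative assumptions rather than recommendations from the original article.

```yaml
# Sketch: a ServiceMonitor endpoint with an explicit path and scrape interval.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app-tuned    # hypothetical name
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web                # name of the Service port, not a number
    path: /metrics           # assumed metrics path
    interval: 30s            # assumed scrape interval
```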
PodMonitor
A PodMonitor configures Prometheus to read metrics directly from Pods.
Note: the effect of these configuration options has not been fully worked out yet, so only some of them are listed here.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor
  namespace: default
  labels:
    app: example
spec:
  podMetricsEndpoints:
  selector:
  podTargetLabels:
  sampleLimit:
  jobLabel:
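As a hedged sketch of how the main fields are commonly filled in, the PodMonitor below scrapes the named container port web on pods labelled app: example-app, reusing the labels from the example Deployment earlier in this article. The team: frontend label is an assumption chosen to match the podMonitorSelector of the manually created Prometheus object.

```yaml
# Sketch: a PodMonitor that scrapes the example-app pods directly.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend           # assumed to match podMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app       # pods to scrape
  podMetricsEndpoints:
  - port: web                # name of the container port
```

Unlike a ServiceMonitor, this needs no Service in front of the pods.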
PrometheusRule
A PrometheusRule is used to configure alerting rules.
An example:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
spec:
  groups:
  - name: k8s.rules
    rules:
    - alert: KubeletDown
      annotations:
        message: Kubelet has disappeared from Prometheus target discovery.
      expr: |
        absent(up{job="kubelet"} == 1)
      for: 15m
      labels:
        severity: critical
Using Prometheus with Ingress
Besides exposing the Prometheus service outside the cluster with a NodePort, we can also expose it through an Ingress.
The Ingress configuration looks like this:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: monitoring
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: "/$1"
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: prometheus
          servicePort: 9090
        path: /prometheus/(.*)
This Ingress maps /prometheus/ to the Service named prometheus. Prometheus server can now be reached at http://hostname/prometheus/. There is one problem, though: the page's static resources fail to load.
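For reference, the networking.k8s.io/v1beta1 Ingress API was removed in Kubernetes 1.22. On newer clusters, the same Ingress might be written against networking.k8s.io/v1 roughly as follows. This is a sketch: pathType: ImplementationSpecific is an assumption chosen because the path is a regular expression interpreted by the NGINX ingress controller.

```yaml
# Sketch: the same Ingress expressed with the networking.k8s.io/v1 API.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: "/$1"
spec:
  rules:
  - http:
      paths:
      - path: /prometheus/(.*)
        pathType: ImplementationSpecific   # required in v1; assumed value for a regex path
        backend:
          service:
            name: prometheus
            port:
              number: 9090
```

The behaviour, including the static resource problem, is the same as with the v1beta1 manifest.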
To fix the static resource loading problem, Prometheus server needs to be configured with a context path.
The Prometheus object has an externalUrl option that covers the context path. It must be set to the full externally exposed URL, as shown below:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: main
spec:
  replicas: 2
  version: v2.15.2
  externalUrl: http://hostname/prometheus/
  resources:
    requests:
      memory: 400Mi
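One possible refinement, offered as an assumption rather than something taken from the original setup: because the Ingress rewrites /prometheus/... to /..., requests arrive at Prometheus without the prefix, while Prometheus by default serves its handlers under the path of externalUrl. If pages return 404 after setting externalUrl, setting the routePrefix field of the Prometheus spec to / may help, as sketched below.

```yaml
# Sketch: externalUrl for generated links, routePrefix for the paths actually served.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: main
spec:
  replicas: 2
  version: v2.15.2
  externalUrl: http://hostname/prometheus/   # URL as seen by browsers
  routePrefix: /                             # assumption: the Ingress strips /prometheus
  resources:
    requests:
      memory: 400Mi
```

If the Ingress instead forwards the path unchanged, routePrefix can be left unset so that it follows externalUrl.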
For more detailed usage, see:
https://coreos.com/operators/prometheus/docs/latest/user-guides/exposing-prometheus-and-alertmanager.html
Usage example
https://github.com/coreos/kube-prometheus/blob/master/examples/example-app/