Prometheus is a great piece of software: the standout monitoring product of the cloud-native era, usually paired with Grafana, and the default monitoring choice for internet companies today.
1. Installing Prometheus
There are two main installation approaches: plain YAML manifests and the Operator. Starting from raw YAML makes it easier to understand the details (the Operator ultimately generates YAML manifests anyway). A few things need to be covered:
- access permissions (RBAC)
- the configuration files
- the storage volume
First, the configuration for access permissions:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - watch
  - list
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs:
  - get
  - watch
  - list
- nonResourceURLs: ["/metrics"]
  verbs:
  - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: smac
  labels:
    app: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: smac
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io
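A quick sanity check for the binding is to impersonate the ServiceAccount, e.g. kubectl auth can-i list pods --as=system:serviceaccount:smac:prometheus should print yes once the manifests above are applied.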
The ConfigMaps holding the configuration files:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: smac
  labels:
    app: prometheus
data:
  cpu-usage.rule: |
    # omitted here for brevity
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-conf
  namespace: smac
  labels:
    app: prometheus
data:
  prometheus.yml: |-
    # omitted here for brevity
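Since the full file is omitted above, here is a minimal sketch of what the prometheus.yml data could contain, assuming a 15s scrape interval, a self-scrape job, and rule files loaded from the path where prometheus-rules is mounted later in the Deployment:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
- /etc/prometheus/rules/*.rule   # matches the prometheus-rules mount below
scrape_configs:
- job_name: "prometheus"
  static_configs:
  - targets: ["localhost:9090"]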
Now the storage volume. A StorageClass is recommended; the official documentation advises against NFS, which can lead to data loss in extreme cases. The configuration is as follows:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus-pvc
  namespace: smac
  labels:
    app: prometheus
  annotations:
    volume.beta.kubernetes.io/storage-class: "local"
  finalizers:
  - kubernetes.io/pvc-protection
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
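The "local" StorageClass referenced by the annotation is assumed to already exist in the cluster. If you provision local PersistentVolumes statically, a sketch of such a class could look like the following (the no-provisioner setup is an assumption; substitute whatever provisioner your cluster actually offers):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local
provisioner: kubernetes.io/no-provisioner # PVs are created by hand, no dynamic provisioning
volumeBindingMode: WaitForFirstConsumer   # delay binding until the pod is scheduled
Whether the claim actually binds can be checked afterwards with kubectl get pvc -n smac.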
After that comes the usual Deployment and Service configuration:
kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: smac
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      securityContext:
        runAsUser: 0
      containers:
      - name: prometheus
        image: prom/prometheus:v2.29.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /prometheus
          name: prometheus-data-volume
        - name: prometheus-conf-volume # note: do not use subPath here, it breaks ConfigMap hot reloading
          mountPath: /etc/prometheus
        - name: prometheus-rules-volume
          mountPath: /etc/prometheus/rules
        ports:
        - containerPort: 9090
          protocol: TCP
      volumes:
      - name: prometheus-data-volume
        persistentVolumeClaim:
          claimName: prometheus-pvc # must match the PVC defined above
      - name: prometheus-conf-volume
        configMap:
          name: prometheus-conf
      - name: prometheus-rules-volume
        configMap:
          name: prometheus-rules
---
# Service
kind: Service
apiVersion: v1
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: prometheus
  name: prometheus-service
  namespace: smac
spec:
  ports:
  - port: 9090
    targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
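At this point everything can be applied and tested. Assuming all the manifests sit in one directory, kubectl apply -f . creates the resources, and the UI is reachable either through the NodePort assigned to prometheus-service or via kubectl -n smac port-forward svc/prometheus-service 9090:9090 followed by opening http://localhost:9090.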
2. Hot-Reloading the Configuration
Next, let's add a job to Prometheus. Edit prometheus.yml in the ConfigMap and add the following:
scrape_configs:
...
- job_name: "demo-service"
  metrics_path: "/actuator/prometheus"
  static_configs:
  - targets: ["10.233.97.135:8080"]
Hmm, that doesn't seem to take effect. Do we really need to restart? Is there a hot-reload mechanism?
After some searching, the conclusion is:
Prometheus does support hot reloading: enable it at startup with the --web.enable-lifecycle flag, after which curl -X POST http://localhost:9090/-/reload triggers a configuration reload.
So the container arguments are adjusted as follows:
containers:
- name: prometheus
  args:
  - '--config.file=/etc/prometheus/prometheus.yml'
  - '--web.enable-lifecycle'
After redeploying with this change, editing the ConfigMap and running the curl command above works. Still, triggering the reload by hand is tedious; can it happen automatically? Another round of searching turned up a neat little tool, configmap-reload, so let's wire it in right away:
containers:
- name: prometheus
  image: prom/prometheus:v2.29.1
  args:
  - '--config.file=/etc/prometheus/prometheus.yml'
  - '--web.enable-lifecycle'
  ...
- name: prometheus-configmap-reloader
  image: 'jimmidyson/configmap-reload:v0.3.0'
  args:
  - '--webhook-url=http://localhost:9090/-/reload'
  - '--volume-dir=/etc/prometheus' # volume-dir must match the volume mount below exactly
  volumeMounts:
  # note: do not use subPath here, it breaks ConfigMap hot reloading
  - name: prometheus-conf-volume
    mountPath: /etc/prometheus
After this change, the pod gains a second container named prometheus-configmap-reloader. Trying again: a short while after editing the ConfigMap (ten seconds or so in my case; the kubelet only syncs ConfigMap volumes periodically), the new configuration took effect and 'demo-service' showed up under Targets.
Note: a ConfigMap mounted via subPath will never be updated automatically.
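If a change does not seem to propagate, the sidecar's log tells you whether it noticed the ConfigMap update and called the reload webhook; kubectl -n smac logs deploy/prometheus -c prometheus-configmap-reloader is a quick way to look.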
3. Service Discovery for Jobs
In the job we added above, targets is a hard-coded ip:port, which is clearly impractical in Kubernetes; the targets must be discovered dynamically. Fortunately Prometheus already supports this, by polling pod information through the apiserver. With kubernetes_sd_configs you can discover all kinds of resources; below is an example configuration for pods:
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
This is the official example; for the full list of labels and their meanings, see the official documentation: Prometheus#Configuration#kubernetes_sd_config.
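For this job to pick a pod up at all, the pod has to opt in through annotations. As a hypothetical example, the pod template of a Spring Boot service's Deployment could carry something like the following (the path and port values are illustrative):
# pod template fragment (hypothetical demo-service Deployment)
template:
  metadata:
    labels:
      app: demo-service
    annotations:
      prometheus.io/scrape: "true"                # kept by the first relabel rule
      prometheus.io/path: "/actuator/prometheus"  # copied into __metrics_path__
      prometheus.io/port: "8080"                  # rewritten into __address__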
My real-world configuration looks like this:
scrape_configs:
- job_name: "demo-service"
  metrics_path: "/actuator/prometheus"
  static_configs:
  - targets: ["demo-service:8080"]
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_app]
    regex: demo-service
    action: keep
  - source_labels: [__meta_kubernetes_pod_label_app]
    action: replace
    target_label: application
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
The effect of this configuration: out of all pods, keep only those with the label app=demo-service, replace the targets defined above with each pod's address and port, and add an application=demo-service label to the scraped series.
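A quick way to verify that discovery works is to run the query up{application="demo-service"} in the Prometheus UI: each matching pod should appear as its own series with value 1, and new replicas show up automatically after a scrape interval or two.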
That basically covers everyday usage. Up next are more advanced topics such as high availability (HA), external storage, and hands-on PromQL. Stay tuned!