# Introduction to Prometheus (on Kubernetes)

This article does not cover Alertmanager or remote storage; I will add those in a follow-up when time permits.

## 1. Introduction to Prometheus

* Prometheus is an open-source systems monitoring tool. Based on configured jobs, it periodically scrapes (pulls) metrics from specified targets over HTTP(S). Targets can be specified statically or via automatic service discovery. Prometheus stores the scraped metrics in local or remote storage.

* Prometheus collects metrics with a pull model. Compared with push, pull allows centralized configuration and makes it easy to build separate monitoring systems for different viewpoints.

* Prometheus joined the CNCF in 2016, the second hosted project after Kubernetes.

### 1.1 Architecture

Prometheus Server: the core component; it scrapes and stores time series data and provides a query interface.

Jobs/Exporters: the clients; they collect metrics and expose them over HTTP at /metrics. Many applications now support Prometheus natively and serve /metrics out of the box. For things that do not provide /metrics themselves (such as the operating system), use an existing exporter or write your own to expose /metrics.

Pushgateway: designed for push-style workloads; short-lived jobs periodically push their metrics to the Pushgateway, and Prometheus Server then pulls from the Pushgateway.

Alertmanager: the alerting component; it reacts according to the configured rules, for example by sending email.

Web UI: Prometheus ships with a simple built-in web console for querying metrics and inspecting configuration and service discovery. In practice, Grafana is usually used for viewing metrics and building dashboards, with Prometheus as a Grafana data source.

1.2舔庶、數(shù)據(jù)結(jié)構(gòu)

Prometheus按照時間序列存儲指標(biāo),每一個指標(biāo)都由Notation + Samples組成:

*Notation:通常有指標(biāo)名稱與一組label組成:

<metric name>{<label name>=<label value>, ...}

*Samples:樣品陈醒,通常包含一個64位的浮點值和一個毫秒級的時間戳
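For example, the following made-up series uses the metric name `http_requests_total` with two labels; together they identify one time series:

```
http_requests_total{method="POST", handler="/messages"}
```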

2惕橙、安裝部署

2.1、環(huán)境清單

系統(tǒng)環(huán)境

```
root@master:~# uname -a
Linux master 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@master:~# kubectl get nodes -o wide
NAME      STATUS    ROLES     AGE       VERSION           EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
master    Ready     master    11d       v1.9.0+coreos.0   <none>        Ubuntu 16.04.2 LTS   4.4.0-62-generic   docker://17.12.0-ce
node1     Ready     <none>    11d       v1.9.0+coreos.0   <none>        Ubuntu 16.04.2 LTS   4.4.0-62-generic   docker://17.12.0-ce
node2     Ready     <none>    11d       v1.9.0+coreos.0   <none>        Ubuntu 16.04.2 LTS   4.4.0-62-generic   docker://17.12.0-ce
node3     Ready     <none>    11d       v1.9.0+coreos.0   <none>        Ubuntu 16.04.2 LTS   4.4.0-62-generic   docker://17.12.0-ce
root@master:~# kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE   IP                NODE
kube-system   calico-node-64btj                       1/1       Running   0          11d   192.168.115.213   node3
kube-system   calico-node-8wqtc                       1/1       Running   0          11d   192.168.115.211   node1
kube-system   calico-node-hrmql                       1/1       Running   0          11d   192.168.115.210   master
kube-system   calico-node-wvgtc                       1/1       Running   0          11d   192.168.115.212   node2
kube-system   kube-apiserver-master                   1/1       Running   0          11d   192.168.115.210   master
kube-system   kube-controller-manager-master          1/1       Running   0          11d   192.168.115.210   master
kube-system   kube-dns-7d9c4d7876-wxss9               3/3       Running   0          11d   10.233.75.2       node2
kube-system   kube-dns-7d9c4d7876-xbxbg               3/3       Running   0          11d   10.233.102.129    node1
kube-system   kube-proxy-gprzq                        1/1       Running   0          11d   192.168.115.211   node1
kube-system   kube-proxy-k9gpk                        1/1       Running   0          11d   192.168.115.213   node3
kube-system   kube-proxy-kwl5c                        1/1       Running   0          11d   192.168.115.212   node2
kube-system   kube-proxy-plxpc                        1/1       Running   0          11d   192.168.115.210   master
kube-system   kube-scheduler-master                   1/1       Running   0          11d   192.168.115.210   master
kube-system   kube-state-metrics-868cf44b5f-g8qfj     2/2       Running   0          6d    10.233.102.157    node1
kube-system   kubedns-autoscaler-564b455d77-7rm9g     1/1       Running   0          11d   10.233.75.1       node2
kube-system   kubernetes-dashboard-767994d8b8-wmzs7   1/1       Running   0          11d   10.233.75.3       node2
kube-system   nginx-proxy-node1                       1/1       Running   0          11d   192.168.115.211   node1
kube-system   nginx-proxy-node2                       1/1       Running   0          11d   192.168.115.212   node2
kube-system   nginx-proxy-node3                       1/1       Running   0          11d   192.168.115.213   node3
kube-system   tiller-deploy-f9b69765d-lvw8k           1/1       Running   0          11d   10.233.71.5       node3
root@master:~# showmount -e
Export list for master:
/nfs *
```


Create the namespace:

```
root@master:~/kubernetes/prometheus# cat namespace.yml
---
apiVersion: v1
kind: Namespace
metadata:
  name: ns-monitor
  labels:
    name: ns-monitor
root@master:~/kubernetes/prometheus# kubectl apply -f namespace.yml
```


2.2钉跷、部署node-exporter

node-exporter.yml文件內(nèi)容

```yaml
---
kind: DaemonSet
apiVersion: apps/v1beta2
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: ns-monitor
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: 192.168.101.88:5000/prom/node-exporter:v0.15.2
          ports:
            - containerPort: 9100
              protocol: TCP
      hostNetwork: true
      hostPID: true
      tolerations:
        - effect: NoSchedule
          operator: Exists
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: node-exporter
  name: node-exporter-service
  namespace: ns-monitor
spec:
  ports:
    - port: 9100
      targetPort: 9100
  selector:
    app: node-exporter
  clusterIP: None
```

Deploy node-exporter:

```
root@master:~/kubernetes/prometheus# kubectl apply -f node-exporter.yml
root@master:~/kubernetes/prometheus# kubectl get pods -n ns-monitor -o wide
NAME                  READY     STATUS    RESTARTS   AGE   IP                NODE
node-exporter-br7wz   1/1       Running   0          3h    192.168.115.210   master
node-exporter-jzc6f   1/1       Running   0          3h    192.168.115.212   node2
node-exporter-t9s2f   1/1       Running   0          3h    192.168.115.213   node3
node-exporter-trh52   1/1       Running   0          3h    192.168.115.211   node1
```


node-exporter collects physical metrics (memory, CPU, and so on) from each node in the Kubernetes cluster. It could be installed directly on every host; here we deploy it as a DaemonSet so it runs on every node, with hostNetwork: true and hostPID: true so it can read the node's physical metrics.

The tolerations make it run a pod on the master node as well; by default, my cluster does not schedule workloads on the master.

Check node-exporter metrics:

Point a browser at port 9100 on any node.
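What you see at /metrics is the Prometheus text exposition format: one sample per line, plus `# HELP`/`# TYPE` comments. As a rough sketch (the sample lines below are made up, and real exposition output has escaping rules and optional timestamps that this ignores), each sample line splits into a metric name, labels, and a value:

```python
import re

# Minimal parser for simple sample lines of the Prometheus text format.
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                     r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')

def parse_line(line):
    """Parse one sample line into (name, labels-dict, float value), or None."""
    line = line.strip()
    if line.startswith('#'):
        return None  # HELP/TYPE comment lines carry no sample
    m = LINE_RE.match(line)
    if not m:
        return None
    labels = {}
    if m.group('labels'):
        for pair in m.group('labels').split(','):
            k, v = pair.split('=', 1)
            labels[k.strip()] = v.strip().strip('"')
    return m.group('name'), labels, float(m.group('value'))

# Example lines resembling node-exporter output:
print(parse_line('node_cpu{cpu="cpu0",mode="idle"} 329909.28'))
print(parse_line('node_memory_MemFree 1.0349568e+09'))
```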

### 2.3 Deploy Prometheus

Contents of prometheus.yml:

```yaml
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs:
      - get
      - watch
      - list
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - watch
      - list
  - nonResourceURLs: ["/metrics"]
    verbs:
      - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: ns-monitor
  labels:
    app: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: ns-monitor
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-conf
  namespace: ns-monitor
  labels:
    app: prometheus
data:
  prometheus.yml: |-
    # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'grafana'
        static_configs:
          - targets:
              - 'grafana-service.ns-monitor:3000'

      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
        - role: endpoints
        # Default to scraping over https. If required, just disable this or change to
        # `http`.
        scheme: https
        # This TLS & bearer token file config is used to connect to the actual scrape
        # endpoints for cluster components. This is separate to discovery auth
        # configuration because discovery & scraping are two separate concerns in
        # Prometheus. The discovery auth config is automatic if Prometheus runs inside
        # the cluster. Otherwise, more config options have to be provided within the
        # <kubernetes_sd_config>.
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          # If your node certificates are self-signed or use a different CA to the
          # master CA, then disable certificate verification below. Note that
          # certificate verification is an integral part of a secure infrastructure
          # so this should only be disabled in a controlled environment. You can
          # disable certificate verification by uncommenting the line below.
          #
          # insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        # Keep only the default/kubernetes service endpoints for the https port. This
        # will add targets for each API server which Kubernetes adds an endpoint to
        # the default/kubernetes service.
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https

      # Scrape config for nodes (kubelet).
      #
      # Rather than connecting directly to the node, the scrape is proxied though the
      # Kubernetes apiserver.  This means it will work if Prometheus is running out of
      # cluster, or can't connect to nodes for some other reason (e.g. because of
      # firewalling).
      - job_name: 'kubernetes-nodes'
        # Default to scraping over https. If required, just disable this or change to
        # `http`.
        scheme: https
        # This TLS & bearer token file config is used to connect to the actual scrape
        # endpoints for cluster components. This is separate to discovery auth
        # configuration because discovery & scraping are two separate concerns in
        # Prometheus. The discovery auth config is automatic if Prometheus runs inside
        # the cluster. Otherwise, more config options have to be provided within the
        # <kubernetes_sd_config>.
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics

      # Scrape config for Kubelet cAdvisor.
      #
      # This is required for Kubernetes 1.7.3 and later, where cAdvisor metrics
      # (those whose names begin with 'container_') have been removed from the
      # Kubelet metrics endpoint.  This job scrapes the cAdvisor endpoint to
      # retrieve those metrics.
      #
      # In Kubernetes 1.7.0-1.7.2, these metrics are only exposed on the cAdvisor
      # HTTP endpoint; use "replacement: /api/v1/nodes/${1}:4194/proxy/metrics"
      # in that case (and ensure cAdvisor's HTTP server hasn't been disabled with
      # the --cadvisor-port=0 Kubelet flag).
      #
      # This job is not necessary and should be removed in Kubernetes 1.6 and
      # earlier versions, or it will cause the metrics to be scraped twice.
      - job_name: 'kubernetes-cadvisor'
        # Default to scraping over https. If required, just disable this or change to
        # `http`.
        scheme: https
        # This TLS & bearer token file config is used to connect to the actual scrape
        # endpoints for cluster components. This is separate to discovery auth
        # configuration because discovery & scraping are two separate concerns in
        # Prometheus. The discovery auth config is automatic if Prometheus runs inside
        # the cluster. Otherwise, more config options have to be provided within the
        # <kubernetes_sd_config>.
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

      # Scrape config for service endpoints.
      #
      # The relabeling allows the actual service scrape endpoint to be configured
      # via the following annotations:
      #
      # * `prometheus.io/scrape`: Only scrape services that have a value of `true`
      # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
      # to set this to `https` & most likely set the `tls_config` of the scrape config.
      # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
      # * `prometheus.io/port`: If the metrics are exposed on a different port to the
      # service then set this appropriately.
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

      # Example scrape config for probing services via the Blackbox Exporter.
      #
      # The relabeling allows the actual service scrape endpoint to be configured
      # via the following annotations:
      #
      # * `prometheus.io/probe`: Only probe services that have a value of `true`
      - job_name: 'kubernetes-services'
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
        - role: service
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
          action: keep
          regex: true
        - source_labels: [__address__]
          target_label: __param_target
        - target_label: __address__
          replacement: blackbox-exporter.example.com:9115
        - source_labels: [__param_target]
          target_label: instance
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          target_label: kubernetes_name

      # Example scrape config for probing ingresses via the Blackbox Exporter.
      #
      # The relabeling allows the actual ingress scrape endpoint to be configured
      # via the following annotations:
      #
      # * `prometheus.io/probe`: Only probe services that have a value of `true`
      - job_name: 'kubernetes-ingresses'
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
          - role: ingress
        relabel_configs:
          - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
            regex: (.+);(.+);(.+)
            replacement: ${1}://${2}${3}
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox-exporter.example.com:9115
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_ingress_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_ingress_name]
            target_label: kubernetes_name

      # Example scrape config for pods
      #
      # The relabeling allows the actual pod scrape endpoint to be configured via the
      # following annotations:
      #
      # * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
      # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
      # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the
      # pod's declared ports (default is a port-free target if none are declared).
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: ns-monitor
  labels:
    app: prometheus
data:
  cpu-usage.rule: |
    groups:
      - name: NodeCPUUsage
        rules:
          - alert: NodeCPUUsage
            expr: (100 - (avg by (instance) (irate(node_cpu{name="node-exporter",mode="idle"}[5m])) * 100)) > 75
            for: 2m
            labels:
              severity: "page"
            annotations:
              summary: "{{$labels.instance}}: High CPU usage detected"
              description: "{{$labels.instance}}: CPU usage is above 75% (current value is: {{ $value }})"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "prometheus-data-pv"
  labels:
    name: prometheus-data-pv
    release: stable
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /nfs/prometheus/data
    server: 192.168.115.210
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data-pvc
  namespace: ns-monitor
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      name: prometheus-data-pv
      release: stable
---
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: ns-monitor
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      securityContext:
        runAsUser: 65534
        fsGroup: 65534
      containers:
        - name: prometheus
          image: 192.168.101.88:5000/prom/prometheus:v2.2.1
          volumeMounts:
            - mountPath: /prometheus
              name: prometheus-data-volume
            - mountPath: /etc/prometheus/prometheus.yml
              name: prometheus-conf-volume
              subPath: prometheus.yml
            - mountPath: /etc/prometheus/rules
              name: prometheus-rules-volume
          ports:
            - containerPort: 9090
              protocol: TCP
      volumes:
        - name: prometheus-data-volume
          persistentVolumeClaim:
            claimName: prometheus-data-pvc
        - name: prometheus-conf-volume
          configMap:
            name: prometheus-conf
        - name: prometheus-rules-volume
          configMap:
            name: prometheus-rules
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
---
kind: Service
apiVersion: v1
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: prometheus
  name: prometheus-service
  namespace: ns-monitor
spec:
  ports:
    - port: 9090
      targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
```
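The NodeCPUUsage alert expression above derives CPU usage from the idle-mode counter: the per-instance average idle rate becomes a percentage and is subtracted from 100. The arithmetic, sketched in Python with made-up per-core idle rates:

```python
def cpu_usage_percent(idle_rates):
    """idle_rates: per-core fractions of time spent idle over the window,
    roughly what irate(node_cpu{mode="idle"}[5m]) yields for each CPU."""
    avg_idle = sum(idle_rates) / len(idle_rates)  # avg by (instance)
    return 100 - avg_idle * 100

usage = cpu_usage_percent([0.10, 0.30])  # two cores, mostly busy
print(usage)       # 80.0
print(usage > 75)  # True -> NodeCPUUsage fires once sustained for 2m
```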


Notes:

1. In a Kubernetes environment with RBAC enabled, create a ServiceAccount for Prometheus and grant it the required permissions.

2. Prometheus uses local storage by default, under /prometheus; a PVC is set up for it.

3. A ConfigMap provides the prometheus.yml configuration file, mounted at the default path /etc/prometheus/prometheus.yml.

For the /etc/prometheus/prometheus.yml configuration, see the official documentation.

For scraping Kubernetes metrics, see the official examples.

For relabel_configs, see the official documentation.

4. Prometheus is deployed as a Deployment with a corresponding Service, exposed via NodePort.

Special note:

When mounting prometheus-data-volume, the mount point is owned by root by default and other users cannot write to it, while Prometheus runs as nobody:nogroup by default. Mounting /prometheus as-is therefore makes Prometheus fail to start. The fix:

```yaml
      serviceAccountName: prometheus
      securityContext:
        runAsUser: 65534
        fsGroup: 65534
      containers:
```

The UID and GID of nobody:nogroup are both 65534; you can confirm this in /etc/passwd inside the container.
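As an aside on the relabel rules in the configuration above: the `kubernetes-service-endpoints` and `kubernetes-pods` jobs rewrite `__address__` with the regex `([^:]+)(?::\d+)?;(\d+)` and replacement `$1:$2`. Prometheus joins the source labels with `;` and anchors the regex against the whole string; a rough Python sketch of that one step:

```python
import re

# Same pattern as the relabel rule: host, an optional existing port, then
# the port from the prometheus.io/port annotation after the ';' separator.
RELABEL_RE = re.compile(r'([^:]+)(?::\d+)?;(\d+)')

def rewrite_address(address, annotated_port):
    """Mimic the replace action: drop any existing port, append the annotated one."""
    joined = f'{address};{annotated_port}'        # how Prometheus joins source_labels
    m = RELABEL_RE.fullmatch(joined)              # relabel regexes are fully anchored
    return f'{m.group(1)}:{m.group(2)}' if m else address

print(rewrite_address('10.233.75.2:8080', '9100'))  # -> 10.233.75.2:9100
print(rewrite_address('10.233.75.2', '9100'))       # -> 10.233.75.2:9100
```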

Deploy Prometheus:

```
root@master:~/kubernetes/prometheus# kubectl apply -f prometheus.yml
root@master:~/kubernetes/prometheus# kubectl get pods -n ns-monitor -o wide
NAME                         READY     STATUS    RESTARTS   AGE   IP                NODE
node-exporter-br7wz          1/1       Running   0          6h    192.168.115.210   master
node-exporter-jzc6f          1/1       Running   0          6h    192.168.115.212   node2
node-exporter-t9s2f          1/1       Running   0          6h    192.168.115.213   node3
node-exporter-trh52          1/1       Running   0          6h    192.168.115.211   node1
prometheus-985cd7c77-766sc   1/1       Running   0          20m   10.233.71.47      node3
```


Check the Prometheus Web UI:

Point a browser at the NodePort of the Prometheus Service.

Check the targets.

Check service discovery:

Prometheus processes metrics according to the relabel_configs in /etc/prometheus/prometheus.yml, for example dropping or replacing labels.

Prometheus's own metrics:

Visit /metrics in a browser.

### 2.4 Deploy Grafana

Contents of grafana.yml:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "grafana-data-pv"
  labels:
    name: grafana-data-pv
    release: stable
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /nfs/grafana/data
    server: 192.168.115.210
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-data-pvc
  namespace: ns-monitor
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      name: grafana-data-pv
      release: stable
---
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: ns-monitor
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: 192.168.101.88:5000/grafana/grafana:5.0.4
          env:
            - name: GF_AUTH_BASIC_ENABLED
              value: "true"
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: "false"
          readinessProbe:
            httpGet:
              path: /login
              port: 3000
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-data-volume
          ports:
            - containerPort: 3000
              protocol: TCP
      volumes:
        - name: grafana-data-volume
          persistentVolumeClaim:
            claimName: grafana-data-pvc
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: grafana
  name: grafana-service
  namespace: ns-monitor
spec:
  ports:
    - port: 3000
      targetPort: 3000
  selector:
    app: grafana
  type: NodePort
```


Notes:

1. Grafana data is stored on NFS; basic authentication is enabled and anonymous access is disabled.

Deploy Grafana:

```
root@master:~/kubernetes/prometheus# kubectl apply -f grafana.yml
root@master:~/kubernetes/prometheus# kubectl get pods -n ns-monitor -o wide
NAME                         READY     STATUS    RESTARTS   AGE   IP                NODE
grafana-55494b59d6-6k4km     1/1       Running   0          2d    10.233.71.0       node3
node-exporter-br7wz          1/1       Running   0          6h    192.168.115.210   master
node-exporter-jzc6f          1/1       Running   0          6h    192.168.115.212   node2
node-exporter-t9s2f          1/1       Running   0          6h    192.168.115.213   node3
node-exporter-trh52          1/1       Running   0          6h    192.168.115.211   node1
prometheus-985cd7c77-766sc   1/1       Running   0          20m   10.233.71.47      node3
```


Configure Grafana:

Log in to Grafana. Since the service is exposed via NodePort, look up the port on the Service; the default credentials are admin/admin.

```
root@master:~/kubernetes/prometheus# kubectl get svc -n ns-monitor
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
grafana-service         NodePort    10.233.13.130   <none>        3000:32712/TCP   2d
node-exporter-service   ClusterIP   None            <none>        9100/TCP         6h
prometheus-service      NodePort    10.233.57.158   <none>        9090:32014/TCP   26m
```


After logging in, follow Grafana's setup wizard.

Add Prometheus as a data source and import the built-in Prometheus and Grafana dashboards.

Import a Kubernetes dashboard template (download link in the appendix below).

View the dashboards.

Every panel in a dashboard can be edited, saved, and rolled back.

If the instance drop-down does not display correctly, open Settings ~ Variables and edit the Regex of the $instance variable; clearing it entirely is fine.

Adding data sources, importing dashboards, installing plugins and so on can also be baked into the grafana.yml file, but that setup is fiddlier; this article sticks to the UI-based steps and leaves provisioning for later.
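For reference, Grafana 5 supports file-based provisioning of data sources. A minimal sketch (the file name is arbitrary, and the URL assumes the prometheus-service in the ns-monitor namespace from this deployment) that could be mounted under /etc/grafana/provisioning/datasources/ in the Grafana container:

```yaml
# datasource.yml — provisioned Prometheus data source (sketch)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-service.ns-monitor:9090
    isDefault: true
```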

3瑰妄、參考資料

https://prometheus.io/docs/

http://docs.grafana.org/

https://github.com/prometheus/prometheus/tree/release-2.2/documentation/examples

https://github.com/giantswarm/kubernetes-prometheus

https://github.com/zalando-incubator/kubernetes-on-aws/pull/861

http://yunlzheng.github.io/2018/01/17/prometheus-sd-and-relabel/

4陷嘴、附件下載

Kubernetes的Grafana監(jiān)控模版:https://pan.baidu.com/s/1y7HDQCPXy9LCAzA01uzIBQ

---------------------

Author: 迷途的攻城獅 (798570156)

Source: CSDN

Original: https://blog.csdn.net/chenleiking/article/details/80009529

Copyright notice: this is an original post by the author; please include a link to the original when reposting.
