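Background and monitoring goals
Our team's platform uses a containerized microservice architecture deployed on Linux virtual machines. At runtime we had essentially no visibility into the state of the containers or the servers, so whenever something went wrong we had to log in to the server to investigate.
What we want to see through monitoring:
- Server CPU, memory, disk, and network usage
- The running state of each container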
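Monitoring components needed
- Prometheus: scrapes and stores the metrics; it needs little introduction and is genuinely great
- cAdvisor: monitors the containers
- node-exporter: monitors the server
- Grafana: custom visualization dashboards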
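Monitoring deployment
All four components are deployed as containers. For convenience, the Docker containers run as root; otherwise Prometheus and Grafana may run into permission problems on their mounted directories. Prometheus also scrapes cAdvisor and node-exporter without any authentication.
Prometheus
The configuration file prometheus.yml is placed in /data/prom on the host: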
global:
  scrape_interval: 10s     # Scrape targets every 10 seconds. The default is every 1 minute.
  evaluation_interval: 10s # Evaluate rules every 10 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# A scrape configuration with one job covering both exporters on the host.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'.
  - job_name: "node-34"
    static_configs:
      # 58080 is the host port published for cAdvisor, 59100 the one published for node-exporter.
      - targets: ['192.168.3.34:58080', '192.168.3.34:59100']
        labels:
          Node: 192.168.3.34:9100
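Before starting the container, it can help to double-check what this file will actually scrape. The sketch below is an illustration, not part of the original setup: `list_targets` and `check_config` are hypothetical helper names, and `check_config` assumes the file is named prometheus.yml (the name the image loads by default) and that the `promtool` binary is available at /bin/promtool inside the prom/prometheus image.

```shell
# List every host:port that appears on a "targets:" line of a Prometheus config.
# $1 = path to the config file on the host.
list_targets() {
    grep -E '^[[:space:]]*-[[:space:]]*targets:' "$1" \
        | grep -oE '[A-Za-z0-9_.-]+:[0-9]+'
}

# Validate the config with promtool from the same image (requires Docker).
# Assumption: the config lives in /data/prom and is named prometheus.yml.
check_config() {
    docker run --rm \
        -v /data/prom:/etc/prometheus \
        --entrypoint /bin/promtool \
        prom/prometheus:v2.30.3 \
        check config /etc/prometheus/prometheus.yml
}
```

Calling `list_targets` on the config file prints one host:port per line, which makes a mistyped port easy to spot against the ports published by the exporter containers. `check_config` only reports syntax and semantic errors in the file; it does not contact the targets.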
Start Prometheus with docker run:
docker run \
-d -u root \
-p 9090:9090 \
-v /data/prom:/etc/prometheus \
-v "/etc/localtime:/etc/localtime" \
--name=prometheus \
prom/prometheus:v2.30.3
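Once Prometheus and the two exporters described in the following sections are running, the targets endpoint of the Prometheus HTTP API (`/api/v1/targets`) reports whether each scrape succeeds. The helper below is a minimal sketch: `count_up_targets` is a hypothetical name, and it counts `"health":"up"` entries with a plain text match rather than a real JSON parser.

```shell
# Count the targets that Prometheus currently reports as healthy.
# Reads the JSON body of /api/v1/targets from stdin.
count_up_targets() {
    grep -oE '"health"[[:space:]]*:[[:space:]]*"up"' \
        | wc -l \
        | tr -d '[:space:]'
}

# Usage (assumes Prometheus is reachable on the host port published above):
#   curl -s http://192.168.3.34:9090/api/v1/targets | count_up_targets
```

With the configuration above, which defines two targets, a result lower than 2 suggests that one of the exporters is not reachable from the Prometheus container.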
cAdvisor
The official image is hosted on gcr.io, which is blocked from our network, so we use a cAdvisor image from Docker Hub instead:
docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=58080:8080 \
--detach=true \
--name=cadvisor \
--privileged \
--device=/dev/kmsg \
peytonyip/cadvisor:v0.39.2
node-exporter
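# The --path.* flags point node-exporter at the host directories mounted below;
# without them the mounts are not used.
docker run -d -p 59100:9100 \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
-v "/etc/localtime:/etc/localtime" \
--name=node-exporter \
prom/node-exporter \
--path.procfs=/host/proc \
--path.sysfs=/host/sys \
--path.rootfs=/rootfs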
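With both exporters running, a quick way to confirm they serve data is to fetch their /metrics pages and look for a known metric. The sketch below assumes the host ports published above (58080 for cAdvisor, 59100 for node-exporter); `metric_lines` is a hypothetical helper name, and the metric names in the usage comments are only examples of metrics these exporters normally expose.

```shell
# Print the sample lines of one metric from Prometheus text exposition format.
# Reads the /metrics output from stdin; $1 = exact metric name.
# The trailing "(\{| )" keeps node_load1 from also matching node_load15.
metric_lines() {
    grep -E "^$1(\{| )"
}

# Usage (requires curl and network access to the host):
#   curl -s http://192.168.3.34:59100/metrics | metric_lines node_load1
#   curl -s http://192.168.3.34:58080/metrics | metric_lines container_last_seen
```

If a command prints nothing, either the exporter is not reachable on that port or it does not expose the metric that was asked for.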
Grafana
docker run \
-d -u root --name=grafana \
-p 3000:3000 \
-v "/etc/localtime:/etc/localtime" \
-v /data/grafana:/var/lib/grafana \
grafana/grafana-enterprise:8.2.1
Grafana dashboards
Configure the Prometheus data source
Set the IP address and port of Prometheus in the data source settings
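The data source can also be created without the web UI, through Grafana's HTTP API (`POST /api/datasources`). The following is a rough sketch under several assumptions: `datasource_payload` and `create_datasource` are hypothetical helper names, the data source is simply named "Prometheus", and the credentials in the usage comment are Grafana's initial admin login, which should be replaced by whatever the instance actually uses.

```shell
# Build the JSON body for a Prometheus data source.
# $1 = URL of the Prometheus server as seen from Grafana.
datasource_payload() {
    printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1"
}

# Send the payload to Grafana (requires curl).
# $1 = Grafana base URL, $2 = user:password, $3 = Prometheus URL.
create_datasource() {
    datasource_payload "$3" | curl -s -X POST \
        -H 'Content-Type: application/json' \
        -u "$2" \
        --data-binary @- \
        "$1/api/datasources"
}

# Usage (assumes Grafana and Prometheus run on the same host as above):
#   create_datasource http://192.168.3.34:3000 admin:admin http://192.168.3.34:9090
```

Setting `access` to `proxy` means the Grafana server, not the browser, queries Prometheus, so the URL must be reachable from inside the Grafana container.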
Grafana's official site offers many dashboard templates that can be imported and used as they are