2022-03-20 Day 107: Kubernetes, Prometheus

Force-delete the monitoring namespace

kubectl get namespace  monitoring -o json \
            | tr -d "\n" | sed "s/\"finalizers\": \[[^]]\+\]/\"finalizers\": []/" \
            | kubectl replace --raw /api/v1/namespaces/monitoring/finalize -f -
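The pipeline above empties the namespace's finalizers array and POSTs the object back to the finalize subresource. The same transformation can be sketched offline in Python on a sample object (the real JSON comes from kubectl get namespace monitoring -o json):

```python
import json

# A no-sed sketch of what the pipeline sends to /finalize:
# take the namespace object and empty its finalizers list.
ns = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {"name": "monitoring"},
    "spec": {"finalizers": ["kubernetes"]},
}
ns["spec"]["finalizers"] = []   # same effect as the sed expression
payload = json.dumps(ns)        # body piped into kubectl replace --raw ... -f -
print(payload)
```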

Error 1

[root@master ~]# kubectl get nodes
The connection to the server 10.0.0.10:6443 was refused - did you specify the right host or port?
Solution: note that kubeadm init is not the right fix here; its preflight errors below show that the control-plane manifests and ports already exist. By the end the API server is reachable again and kubectl get node succeeds.

[root@master ~]# kubectl get nodes
The connection to the server 10.0.0.10:6443 was refused - did you specify the right host or port?
[root@master ~]# kubeadm init
I0320 21:32:41.509087    5017 version.go:252] remote version is much newer: v1.23.5; falling back to: stable-1.19
W0320 21:32:42.543039    5017 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.16
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
    [ERROR Port-6443]: Port 6443 is in use
    [ERROR Port-10259]: Port 10259 is in use
    [ERROR Port-10257]: Port 10257 is in use
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
    [ERROR Port-10250]: Port 10250 is in use
    [ERROR Port-2379]: Port 2379 is in use
    [ERROR Port-2380]: Port 2380 is in use
    [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
[root@master ~]# kubectl get node
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   9d    v1.19.3
node1    Ready    node     9d    v1.19.3
node2    Ready    node     9d    v1.19.3


Installing and deploying Prometheus
1. ConfigMap: stores the configuration file
2. RBAC: grants permissions
3. PV/PVC: provides the storage location
4. Deployment: runs Prometheus
5. Service: exposes it inside the cluster
6. Ingress: makes the web UI reachable

First, create a prom namespace.
01prom-na.yaml

[root@master ~/k8s_yml/prom/prom]# cat 01prom-na.yaml 
apiVersion: v1
kind: Namespace
metadata:
  name: prom

1. First, configure the ConfigMap resource.
02prom-cm.yml

[root@master ~/k8s_yml/prom/prom]# cat 02prom-cm.yml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prom
data:
  prometheus.yml: |
    global:                 # global configuration
      scrape_interval: 15s  # how often to scrape targets
      scrape_timeout: 15s   # per-scrape timeout

    alerting:
      alertmanagers:
      - static_configs:     
        - targets: ["alertmanager:9093"]  
    rule_files:
    - /etc/prometheus/rules.yml


    scrape_configs:                  # scrape configuration
    - job_name: 'prometheus'         # job name
      static_configs:                # static, hard-coded targets
      - targets: ['localhost:9090']  # IP:port to scrape; Prometheus monitors itself here
    - job_name: 'coredns'
      static_configs:
      - targets: ['10.2.0.4:9153','10.2.0.5:9153']
    - job_name: 'mysql'
      static_configs:
      - targets: ['mysql-svc.default:9104']

    - job_name: 'nodes'
      kubernetes_sd_configs:            # Kubernetes service discovery
      - role: node                      # discover objects of type node
      relabel_configs:
      - action: replace
        source_labels: ['__address__']  # source label to rewrite
        regex: '(.*):10250'             # captures e.g. (10.0.0.10):10250
        replacement: '${1}:9100'        # result: 10.0.0.10:9100
        target_label: __address__       # write the result back over __address__
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)


    - job_name: 'kubelet'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)


    - job_name: 'cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)   
        replacement: $1
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.*)
        replacement: /metrics/cadvisor
        target_label: __metrics_path__


    - job_name: 'apiservers-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_label_component]
        regex: apiserver
        action: keep

    - job_name: 'pods'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name


  rules.yml: |
    groups:
    - name: test-node-mem
      rules:
      - alert: NodeMemoryUsage
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 20
        for: 2m
        labels:
          team: node
        annotations:
          summary: "{{$labels.instance}}: High Memory usage detected"
          description: "{{$labels.instance}}: Memory usage is above 20% (current value is: {{ $value }})"
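Three details of the ConfigMap above can be sanity-checked offline with Python (the addresses and byte counts below are made up for illustration):

```python
import re

# 1) 'nodes' job: rewrite the discovered kubelet address (port 10250)
#    to the node-exporter port 9100, as the relabel rule does.
node_addr = re.sub(r"^(.*):10250$", r"\1:9100", "10.0.0.10:10250")

# 2) 'pods' job: join __address__ and the prometheus.io/port annotation
#    (';' is the default separator Prometheus puts between source labels).
pod_addr = re.sub(r"^([^:]+)(?::\d+)?;(\d+)$", r"\1:\2", "10.2.4.74:8080;9104")

# 3) rules.yml: the NodeMemoryUsage expression with sample byte counts.
total, free, buffers, cached = 4e9, 1e9, 0.5e9, 1.5e9
mem_pct = (total - (free + buffers + cached)) / total * 100

print(node_addr)  # 10.0.0.10:9100
print(pod_addr)   # 10.2.4.74:9104
print(mem_pct)    # 25.0, which would fire the >20% alert
```

Note that Prometheus anchors its relabel regexes implicitly; the explicit ^ and $ here reproduce that behavior.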
 

Create the RBAC resources, since Prometheus needs to read objects across namespaces.
03prom-rbac.yml

apiVersion: v1
kind: ServiceAccount    # identity the pod runs as
metadata:
  name: prometheus
  namespace: prom
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole # cluster-scoped, so it works across namespaces
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prom

4. Create the PV/PVC.
04prom-pv-pvc.yml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-local
  labels:
    app: prometheus
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node2
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: prom
spec:
  selector:
    matchLabels:
      app: prometheus
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: local-storage

Create the Deployment.
05prom-dp.yml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: prom
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus # use the ServiceAccount created in the RBAC step
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-data
      - name: config-volume  
        configMap:
          name: prometheus-config  
      initContainers:        # init container: fix data-dir ownership
      - name: fix-permissions
        image: busybox
        volumeMounts:
        - name: data
          mountPath: /prometheus
        command: [chown, -R, "nobody:nobody", /prometheus]  
      containers:
      - name: prometheus
        image: prom/prometheus:v2.24.1
        resources:
          requests:
            cpu: 100m
            memory: 512Mi
          limits:
            cpu: 200m
            memory: 512Mi
        ports:
        - name: http
          containerPort: 9090            
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"  #指定配置文件
        - "--storage.tsdb.path=/prometheus" #tsdb數(shù)據(jù)庫(kù)保留路徑
        - "--storage.tsdb.retention.time=24h" #數(shù)據(jù)保留時(shí)間,默認(rèn)15天
        - "--web.enable-admin-api"  #控制對(duì)admin http api的訪問(wèn)
        - "--web.enable-lifecycle" #支持熱更新,直接執(zhí)行l(wèi)ocalhost:9090/-/reload立即生效
        volumeMounts:
        - name: config-volume
          mountPath: "/etc/prometheus"
        - name: data
          mountPath: "/prometheus"

Configure the Service resource.
06prom-svc.yml

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: prom
  labels:
    app: prometheus
spec:
  selector:
    app: prometheus
  ports:
    - name: web
      port: 9090
      targetPort: http

Configure the Ingress resource.
07prom-ingress.yml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  namespace: prom
  labels:
    app: prometheus
spec:
  rules:
  - host: prom.k8s.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: prometheus
            port:
              number: 9090

Note: an ingress controller must already be installed in the cluster.



Configure hosts-file name resolution on your workstation.
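For example, in /etc/hosts (or C:\Windows\System32\drivers\etc\hosts on Windows), point the two ingress hostnames at a node where the ingress controller listens. The 10.0.0.10 address here is only an assumption based on the master IP seen earlier in this post:

```
10.0.0.10  prom.k8s.com grafana.k8s.com
```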



Apply the YAML files in numbered order.



Remember to create the data directory on node1 and node2 (the PV's local path is /data/k8s/prometheus, and its nodeAffinity pins it to node2).
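A sketch of that step; PREFIX stands in for the real root (/) so this can run without privileges, and on the actual nodes you would drop it:

```shell
#!/bin/sh
# Create the Prometheus data directory. The PV's local path is
# /data/k8s/prometheus; creating it on every schedulable node does no harm.
PREFIX="${PREFIX:-./demo-root}"   # stand-in for / in this sketch
mkdir -p "$PREFIX/data/k8s/prometheus"
# The fix-permissions initContainer chowns the mount to nobody:nobody,
# so no host-side chown is required here.
ls -ld "$PREFIX/data/k8s/prometheus"
```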




1. Find CoreDNS's metrics port:
kubectl -n kube-system describe cm coredns

2. Find the CoreDNS pod IPs:
kubectl -n kube-system get pod -o wide

3. Hit a CoreDNS pod's metrics endpoint:
curl 10.2.0.2:9153/metrics

4. Edit the Prometheus config file and add the static scrape config:

    - job_name: 'coredns'
      static_configs:
      - targets: ['10.2.0.2:9153', '10.2.1.61:9153']

5. Update the ConfigMap:
kubectl apply -f 02prom-cm.yml

6. Hot-reload the Prometheus config:
kubectl -n prom get pod -o wide
curl -X POST "http://10.2.4.74:9090/-/reload"

7. In the web UI, check that the coredns targets now report data:
Status --> Targets --> coredns
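The curl in step 3 returns the Prometheus text exposition format. A minimal parser sketch (the sample lines and values below are made up for illustration):

```python
# Minimal parse of the Prometheus text format returned by
# curl 10.2.0.2:9153/metrics (sample payload, made-up values).
SAMPLE = """\
# HELP coredns_dns_requests_total Counter of DNS requests.
# TYPE coredns_dns_requests_total counter
coredns_dns_requests_total{proto="udp",zone="."} 42
coredns_cache_hits_total{type="success"} 7
"""

def parse_metrics(text):
    """Map 'name{labels}' -> float value, skipping HELP/TYPE comments."""
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        key, _, value = line.rpartition(" ")
        out[key] = float(value)
    return out

metrics = parse_metrics(SAMPLE)
print(metrics)
```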

Create the node-exporter

cat >prom-node-exporter.yaml<<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: prom
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.1.1
        args:
        - --web.listen-address=$(HOSTIP):9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)           # mount points excluded from disk metrics
        - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$        # filesystem types excluded from metrics
        ports:
        - containerPort: 9100
        env:
        - name: HOSTIP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        resources:
          requests:
            cpu: 150m
            memory: 180Mi
          limits:
            cpu: 150m
            memory: 180Mi
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
        volumeMounts:
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: root
          mountPath: /host/root
          mountPropagation: HostToContainer
          readOnly: true
      tolerations:
      - operator: "Exists"
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: dev
        hostPath:
          path: /dev
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /
EOF
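The two ignore regexes passed to node-exporter above can be exercised offline (the sample mount points are chosen for illustration):

```python
import re

# The regexes from --collector.filesystem.ignored-mount-points and
# --collector.filesystem.ignored-fs-types, copied from the args above.
MOUNT_RE = re.compile(r"^/(dev|proc|sys|var/lib/docker/.+)($|/)")
FS_RE = re.compile(
    r"^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl"
    r"|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs"
    r"|sysfs|tracefs)$"
)

mounts = ["/proc", "/sys/fs/cgroup", "/data/k8s/prometheus",
          "/var/lib/docker/overlay2"]
ignored = [m for m in mounts if MOUNT_RE.match(m)]
print(ignored)  # /data/k8s/prometheus is the only one still reported
```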

Create Grafana

cat >grafana.yml<<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: prom
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      volumes:
      - name: storage
        hostPath:
          path: /data/k8s/grafana/
      nodeSelector:
        kubernetes.io/hostname: node2
      securityContext:
        runAsUser: 0
      containers:
      - name: grafana
        image: grafana/grafana:7.4.3
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          name: grafana
        env:
        - name: GF_SECURITY_ADMIN_USER
          value: admin
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: admin
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources:
          limits:
            cpu: 150m
            memory: 512Mi
          requests:
            cpu: 150m
            memory: 512Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: prom
spec:
  ports:
    - port: 3000
  selector:
    app: grafana
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: prom
  labels:
    app: grafana
spec:
  rules:
  - host: grafana.k8s.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: grafana
            port:
              number: 3000
EOF

Create prom.sh

[root@master ~/k8s_yml/prom/prom]# cat prom.sh 
#! /bin/bash
curl -X POST "http://10.1.9.237:9090/-/reload"

Copyright belongs to the author. For reprints or content collaboration, please contact the author.