I'm LEE, a.k.a. Lao Li, a veteran who has been knocking around the IT industry for 16 years.
Background
Our Knative-based application management and release platform recently went live. With the tooling platform in place, monitoring and alerting is the next essential piece, and once that is there, application-level alerting follows naturally.
Working through the monitoring and alerting docs for the official Knative Serving module, I found the solution they offer to be extremely cumbersome. Perhaps the starting point is simply different: they lean toward standing up an entirely new system. But with Kubernetes as widespread as it is now, is there really a cluster out there that doesn't already run Prometheus/Thanos? I'd rather solve the monitoring problem with a simpler approach than pile on complexity.
While I'm at it: the Grafana dashboards that Knative officially provides are also not great to use; they don't really match what you need in practice.
Prerequisites
Tip: we replaced Thanos with VictoriaMetrics, because Thanos really did not hold up under heavy query and write volumes, so we ended up on VictoriaMetrics.
These are the versions running on our platform:
- Kubernetes: 1.23
- Istio: 1.13
- Knative: 1.5
- Grafana: 8.3.3
- VictoriaMetrics: 1.79
Hands-on
Since we are going to monitor the Knative Serving control plane our own way, the official Knative docs are of little reference value here.
Monitoring the control plane
A basic Knative installation consists of the following components (roughly what `kubectl get pods -n knative-serving` shows):
NAME READY STATUS RESTARTS AGE
activator-58b96bdb7d-nf6hf 1/1 Running 0 30d
autoscaler-75c4975cd8-bg2nt 1/1 Running 0 30d
controller-66475c8469-d5w2h 1/1 Running 0 30d
domain-mapping-68768c5ddc-999ng 1/1 Running 0 30d
domainmapping-webhook-d4bbcb544-bjtfz 1/1 Running 0 30d
net-istio-controller-689d984c59-4vtdx 1/1 Running 0 27d
net-istio-webhook-74f9465d86-jtj72 1/1 Running 0 27d
webhook-996d56c7-ms6js 1/1 Running 0 30d
So we can tailor a metrics-scraping setup to these components. Before scraping, though, let's take a quick look at the configuration inside the Deployment.
Using activator as the example:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
labels:
app.kubernetes.io/component: activator
app.kubernetes.io/name: knative-serving
app.kubernetes.io/version: 1.5.0
name: activator
namespace: knative-serving
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: activator
role: activator
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
creationTimestamp: null
labels:
app: activator
app.kubernetes.io/component: activator
app.kubernetes.io/name: knative-serving
app.kubernetes.io/version: 1.5.0
role: activator
spec:
containers:
- env:
- name: GOGC
value: "500"
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: SYSTEM_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: CONFIG_LOGGING_NAME
value: config-logging
- name: CONFIG_OBSERVABILITY_NAME
value: config-observability
- name: METRICS_DOMAIN
value: knative.dev/internal/serving
image: knative-serving/activator:1.5.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 12
httpGet:
httpHeaders:
- name: k-kubelet-probe
value: activator
path: /
port: 8012
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: activator
ports:
- containerPort: 9090
name: metrics ## this is the one: port 9090 serves the metrics endpoint
protocol: TCP
- containerPort: 8008
name: profiling
protocol: TCP
- containerPort: 8012
name: http1
protocol: TCP
- containerPort: 8013
name: h2c
protocol: TCP
readinessProbe:
failureThreshold: 5
httpGet:
httpHeaders:
- name: k-kubelet-probe
value: activator
path: /
port: 8012
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: "1"
memory: 600Mi
requests:
cpu: 300m
memory: 60Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: controller
serviceAccountName: controller
terminationGracePeriodSeconds: 600
Reading the activator Deployment tells us that port 9090 (named metrics; we will use that name later) is where the metrics are exposed. I then went through the other Knative Serving components and catalogued their metrics endpoints in the table below:
| Component | Port | Port name | Description |
|---|---|---|---|
| activator | 9090 | metrics | Request buffer and a key traffic-forwarding component in Knative; it holds HTTP requests while an application scales from 0->1 or 1->0. |
| autoscaler | 9090 | metrics | Scaling controller and the key component that decides how many application Pod replicas Knative runs, based on data reported by queue-proxy and activator. |
| controller | 9090 | metrics | The Knative controller; it reconciles all public Knative objects and the autoscaling CRDs, creating Configurations and Routes when a Knative Service is applied to the Kubernetes API. |
| webhook | 9090 | metrics | The hook layer and a key bridge between the Knative control plane and Kubernetes; it intercepts Kubernetes API calls and all CRD inserts and updates, sets defaults, rejects inconsistent or invalid objects, and validates and mutates Kubernetes API calls. |
These four modules are the ones in the table that actually matter to the business, so writing the scrape jobs for them is straightforward. Here is what that looks like on the VictoriaMetrics stack:
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: controller-monitor
namespace: knative-serving
spec:
namespaceSelector:
matchNames:
- knative-serving
podMetricsEndpoints:
- path: /metrics
scheme: http
targetPort: metrics # the port name of the 9090 listener mentioned above
selector:
matchLabels:
app: controller
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: autoscaler-monitor
namespace: knative-serving
spec:
namespaceSelector:
matchNames:
- knative-serving
podMetricsEndpoints:
- path: /metrics
scheme: http
targetPort: metrics # the port name of the 9090 listener mentioned above
selector:
matchLabels:
app: autoscaler
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: activator-monitor
namespace: knative-serving
spec:
namespaceSelector:
matchNames:
- knative-serving
podMetricsEndpoints:
- path: /metrics
scheme: http
targetPort: metrics # the port name of the 9090 listener mentioned above
selector:
matchLabels:
app: activator
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: webhook-monitor
namespace: knative-serving
spec:
namespaceSelector:
matchNames:
- knative-serving
podMetricsEndpoints:
- path: /metrics
scheme: http
targetPort: metrics # the port name of the 9090 listener mentioned above
selector:
matchLabels:
app: webhook
I wrote four PodScrape jobs to monitor the control-plane Pods' metrics; the data is collected into VictoriaMetrics automatically, which makes it easy to build Grafana dashboards later.
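Since the whole point is alerting, it is worth sketching what alert rules on top of these scrapes could look like. Below is a minimal sketch using the VictoriaMetrics operator's VMRule CRD (it mirrors PrometheusRule): the autoscaler metric names (panic_mode, desired_pods, actual_pods) and labels are taken from Knative Serving's exporters as I understand them, and the thresholds, durations, and the up-series matcher are assumptions to adjust against your own /metrics output.

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  name: knative-serving-control-plane-alerts
  namespace: knative-serving
spec:
  groups:
    - name: knative-serving-control-plane
      rules:
        # A control-plane scrape target stopped answering on its metrics port.
        - alert: KnativeControlPlaneTargetDown
          expr: up{namespace="knative-serving"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Knative control-plane pod {{ $labels.pod }} is not exposing metrics"
        # The autoscaler kept one or more revisions in panic mode for a sustained period.
        - alert: KnativeAutoscalerPanicMode
          expr: sum(panic_mode) by (namespace_name, revision_name) > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Revision {{ $labels.revision_name }} has been in panic mode for 10 minutes"
        # The autoscaler wants more pods than are actually running.
        - alert: KnativeRevisionUnderScaled
          expr: sum(desired_pods - actual_pods) by (namespace_name, revision_name) > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Revision {{ $labels.revision_name }} is running fewer pods than desired"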
Monitoring deployed applications
Following the same pattern, before scraping a deployed application's metrics we again take a quick look at the configuration, this time at the Pod the Deployment creates. Using test-app-18 as the example:
apiVersion: v1
kind: Pod
metadata:
annotations:
autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
autoscaling.knative.dev/initial-scale: "1"
autoscaling.knative.dev/max-scale: "6"
autoscaling.knative.dev/metric: rps
autoscaling.knative.dev/min-scale: "1"
autoscaling.knative.dev/target: "60"
kubernetes.io/limit-ranger: "LimitRanger plugin set: ephemeral-storage request
for container app; ephemeral-storage limit for container app; ephemeral-storage
request for container queue-proxy; ephemeral-storage limit for container queue-proxy"
serving.knative.dev/creator: system:serviceaccount:default:oms-admin
creationTimestamp: "2022-08-04T07:05:03Z"
generateName: test-app-18-ac403-deployment-988b7b66f-
labels:
k_type: knative # this matters: this label is how we tell a Knative application Pod apart from an ordinary Pod
app: test-app-18
app_id: test-app-18
pod-template-hash: 988b7b66f
service.istio.io/canonical-name: test-app-18
service.istio.io/canonical-revision: test-app-18-ac403
serving.knative.dev/configuration: test-app-18
serving.knative.dev/configurationGeneration: "4"
serving.knative.dev/configurationUID: d896cd40-ce9c-4027-9229-4af9f2aa5630
serving.knative.dev/revision: test-app-18-ac403
serving.knative.dev/revisionUID: 1b3dc38f-5aed-4252-a07b-aefc32f7f9f9
serving.knative.dev/service: test-app-18
serving.knative.dev/serviceUID: 0af741d0-a74f-44dd-ab6e-458a5d3743a2
name: test-app-18-ac403-deployment-988b7b66f-tlw27
namespace: knative-apps
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: test-app-18-ac403-deployment-988b7b66f
uid: 429f5cc4-20e6-4f85-a985-4da1de578844
resourceVersion: "755594194"
uid: c632adba-5c66-4c0b-ac31-67c8c231b591
spec:
containers:
- env:
- name: PORT
value: "8080"
- name: K_REVISION
value: test-app-18-ac403
- name: K_CONFIGURATION
value: test-app-18
- name: K_SERVICE
value: test-app-18
image: knative-apps/fn_test-app-18_qa@sha256:e86ed5117e91b4d11f9e169526d734981deb31c99744d65cb6a6debf9262d97f
imagePullPolicy: IfNotPresent
lifecycle:
preStop:
httpGet:
path: /wait-for-drain
port: 8022
scheme: HTTP
livenessProbe:
failureThreshold: 3
httpGet:
httpHeaders:
- name: K-Kubelet-Probe
value: queue
path: /ping
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: app
ports:
- containerPort: 8080
name: user-port
protocol: TCP
resources:
limits:
cpu: "2"
ephemeral-storage: 7Gi
memory: 4Gi
requests:
cpu: 200m
ephemeral-storage: 256Mi
memory: 409Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-jp8zk
readOnly: true
- env:
- name: SERVING_NAMESPACE
value: knative-apps
- name: SERVING_SERVICE
value: test-app-18
- name: SERVING_CONFIGURATION
value: test-app-18
- name: SERVING_REVISION
value: test-app-18-ac403
- name: QUEUE_SERVING_PORT
value: "8012"
- name: QUEUE_SERVING_TLS_PORT
value: "8112"
- name: CONTAINER_CONCURRENCY
value: "0"
- name: REVISION_TIMEOUT_SECONDS
value: "10"
- name: SERVING_POD
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SERVING_POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: SERVING_LOGGING_CONFIG
- name: SERVING_LOGGING_LEVEL
- name: SERVING_REQUEST_LOG_TEMPLATE
value: '{"httpRequest": {"requestMethod": "{{.Request.Method}}", "requestUrl":
"{{js .Request.RequestURI}}", "requestSize": "{{.Request.ContentLength}}",
"status": {{.Response.Code}}, "responseSize": "{{.Response.Size}}", "userAgent":
"{{js .Request.UserAgent}}", "remoteIp": "{{js .Request.RemoteAddr}}", "serverIp":
"{{.Revision.PodIP}}", "referer": "{{js .Request.Referer}}", "latency": "{{.Response.Latency}}s",
"protocol": "{{.Request.Proto}}"}, "traceId": "{{index .Request.Header "X-B3-Traceid"}}"}'
- name: SERVING_ENABLE_REQUEST_LOG
value: "false"
- name: SERVING_REQUEST_METRICS_BACKEND
value: prometheus
- name: TRACING_CONFIG_BACKEND
value: none
- name: TRACING_CONFIG_ZIPKIN_ENDPOINT
- name: TRACING_CONFIG_DEBUG
value: "false"
- name: TRACING_CONFIG_SAMPLE_RATE
value: "0.1"
- name: USER_PORT
value: "8080"
- name: SYSTEM_NAMESPACE
value: knative-serving
- name: METRICS_DOMAIN
value: knative.dev/internal/serving
- name: SERVING_READINESS_PROBE
value: '{"httpGet":{"path":"/ping","port":8080,"host":"127.0.0.1","scheme":"HTTP","httpHeaders":[{"name":"K-Kubelet-Probe","value":"queue"}]},"successThreshold":1}'
- name: ENABLE_PROFILING
value: "false"
- name: SERVING_ENABLE_PROBE_REQUEST_LOG
value: "false"
- name: METRICS_COLLECTOR_ADDRESS
- name: CONCURRENCY_STATE_ENDPOINT
- name: CONCURRENCY_STATE_TOKEN_PATH
value: /var/run/secrets/tokens/state-token
- name: HOST_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
- name: ENABLE_HTTP2_AUTO_DETECTION
value: "false"
image: knative-serving/queue:1.5.0
imagePullPolicy: IfNotPresent
name: queue-proxy
ports:
- containerPort: 8022
name: http-queueadm
protocol: TCP
- containerPort: 9090
name: http-autometric
protocol: TCP
- containerPort: 9091
name: http-usermetric # this is the one: port 9091 serves the metrics endpoint; since all application traffic is forwarded through queue-proxy, this is the best place to measure it
protocol: TCP
- containerPort: 8012
name: queue-port
protocol: TCP
- containerPort: 8112
name: https-port
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
httpHeaders:
- name: K-Network-Probe
value: queue
path: /
port: 8012
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
ephemeral-storage: 7Gi
requests:
cpu: 25m
ephemeral-storage: 256Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-jp8zk
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: false
imagePullSecrets:
- name: key.key
nodeName: 10.11.96.79
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 10
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
volumes:
- name: kube-api-access-jp8zk
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
Reading the test-app-18 Pod spec tells us that port 9091 (named http-usermetric; we will use that name later) is where the metrics are exposed. I then wrote a similar but generic job that can scrape the traffic metrics of Knative application Pods in any Namespace (the challenge here: the application Namespace is not fixed, so the job has to cover every Namespace).
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: custom-apps-monitor
namespace: knative-serving
spec:
namespaceSelector:
any: true # match any Namespace
podMetricsEndpoints:
- path: /metrics
scheme: http
targetPort: http-usermetric
selector:
matchLabels:
k_type: knative # match the label that marks a Pod as a Knative application
This single generic PodScrape job monitors the application Pods' metrics; the data is collected into VictoriaMetrics automatically, which makes it easy to build Grafana dashboards later.
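With per-application request metrics flowing in from queue-proxy, application-level alerting boils down to writing rules on top of them. Another minimal VMRule sketch: the metric names revision_app_request_count and revision_app_request_latencies_bucket and the response_code_class / revision_name labels are what Knative's queue-proxy exposes with the prometheus backend as far as I know, and the 5% error ratio, the 1000 ms p95 threshold (assuming the latency distribution is in milliseconds), and the windows are assumptions; verify everything against your own 9091 /metrics output.

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  name: knative-apps-alerts
  namespace: knative-serving
spec:
  groups:
    - name: knative-apps
      rules:
        # More than 5% of the requests reaching a revision returned 5xx over the last 5 minutes.
        - alert: KnativeAppHighErrorRate
          expr: |
            sum(rate(revision_app_request_count{response_code_class="5xx"}[5m])) by (namespace_name, revision_name)
              /
            sum(rate(revision_app_request_count[5m])) by (namespace_name, revision_name)
              > 0.05
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Revision {{ $labels.revision_name }} 5xx ratio is above 5%"
        # The p95 latency of a revision stayed above 1000 ms for 10 minutes.
        - alert: KnativeAppHighLatency
          expr: |
            histogram_quantile(0.95,
              sum(rate(revision_app_request_latencies_bucket[5m])) by (le, namespace_name, revision_name)
            ) > 1000
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Revision {{ $labels.revision_name }} p95 latency is above 1s"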
End result
After hooking this up to Grafana, I skipped the Knative community dashboard templates as well, since many of them turned out not to be that useful. In the end I built a custom dashboard that is actually meaningful for us.
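For completeness: before any custom dashboard can query this data, Grafana only needs VictoriaMetrics registered as a Prometheus-compatible data source. A minimal provisioning sketch; the vmselect URL is an assumption and should be replaced with your own vmselect or vmsingle address.

# Grafana datasource provisioning file, e.g. /etc/grafana/provisioning/datasources/victoriametrics.yaml
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus   # VictoriaMetrics speaks the Prometheus query API
    access: proxy
    url: http://vmselect-cluster.monitoring.svc:8481/select/0/prometheus  # assumption: adjust to your setup
    isDefault: true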