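Contents
Overview
Taints and tolerations
Pod scheduling order
Example 1: Schedule Pods onto the master by tolerating the master:NoSchedule taint
Example 2: Add a taint with the NoExecute effect to a node to evict all of its Pods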
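Overview:
A taint is a key-value attribute defined on a node. It makes the node refuse to run a Pod unless that Pod has a toleration that accepts the taint. A toleration is a key-value attribute defined on a Pod that declares which node taints the Pod can tolerate. The scheduler places a Pod only on nodes whose taints the Pod tolerates, as shown in the figure.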
Whether a Pod can be scheduled onto a node depends on two factors:
- whether the node has any taints
- if the node has taints, whether the Pod tolerates them
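Taints and tolerations
Taints are defined in a node's spec (nodeSpec) and tolerations in a Pod's spec (podSpec). Both are key-value data, and both additionally carry an effect marker. The syntax is key=value:effect. The key and value follow usage and format rules similar to those of resource labels, and the effect defines how strongly Pods are repelled. There are three effects: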
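- NoSchedule
New Pods that do not tolerate this taint cannot be scheduled onto the node. This is a hard constraint. Pods already running on the node are unaffected.
- PreferNoSchedule
The soft version of NoSchedule: new Pods that do not tolerate this taint should preferably not be scheduled onto the node, but the node still accepts them when no other node is available. Pods already running on the node are unaffected.
- NoExecute
New Pods that do not tolerate this taint cannot be scheduled onto the node. This is a hard constraint, and Pods already on the node are evicted when a change to the node's taints or to the Pod's tolerations means they no longer match.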
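As a rough sketch of how the key=value:effect form maps onto a node's spec, the taint dedicated=special-user:NoSchedule (the sample key and value used in the kubectl taint --help text) would appear roughly as below. The node name foo is only a placeholder, and in practice the taint is normally set with kubectl taint rather than by editing the Node object.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: foo                  # placeholder node name, not from this cluster
spec:
  taints:
  - key: dedicated           # the "key" part of key=value:effect
    value: special-user      # the "value" part (optional)
    effect: NoSchedule       # one of NoSchedule, PreferNoSchedule, NoExecute
```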
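A toleration defined on a Pod supports two operators. Equal is an equality comparison: the toleration and the taint must match exactly on key, value, and effect. Exists is an existence check: the key and effect must match exactly, and the value field of the toleration must be left empty.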
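To make the two operators concrete, here is a minimal sketch of a Pod that carries one toleration of each kind. The Pod name, the busybox image, and the dedicated=special-user taint are illustrative assumptions. Only the node-role.kubernetes.io/master key comes from the cluster shown in this post.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo                     # hypothetical name
spec:
  tolerations:
  - key: dedicated                          # Equal: key, value and effect must all match
    operator: Equal
    value: special-user
    effect: NoSchedule
  - key: node-role.kubernetes.io/master     # Exists: key and effect must match, no value given
    operator: Exists
    effect: NoSchedule
  containers:
  - name: demo
    image: busybox                          # assumed image, any image would do
    command: ["sleep", "3600"]
```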
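Pod scheduling order
A node can carry multiple taints and a Pod can have multiple tolerations. The matching check between them follows this logic:
- First, handle every taint that has a matching toleration.
- Among the taints left unmatched, if any uses the NoSchedule effect, the Pod is not scheduled onto the node.
- Among the unmatched taints, if none uses NoSchedule but at least one uses PreferNoSchedule, the scheduler tries to avoid placing the Pod on the node.
- If at least one unmatched taint uses the NoExecute effect, the node evicts the Pod immediately if it is already running there, or the Pod is not scheduled onto the node. In addition, even when a toleration matches a NoExecute taint, if the toleration also sets tolerationSeconds to define a time limit, the Pod is evicted from the node once that limit is exceeded.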
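The tolerationSeconds case in the last rule can be sketched as follows. The example assumes a node tainted with diskfull=true:NoExecute, the same taint that Example 2 applies. The Pod name, the image, and the 300-second limit are illustrative assumptions. With this toleration the Pod would be allowed to stay on the tainted node for roughly 300 seconds after the taint appears, and would then be evicted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: noexecute-grace-demo     # hypothetical name
spec:
  tolerations:
  - key: diskfull
    operator: Equal
    value: "true"                # quoted: the toleration value is a string, not a boolean
    effect: NoExecute
    tolerationSeconds: 300       # assumed limit; only meaningful with the NoExecute effect
  containers:
  - name: demo
    image: busybox               # assumed image
    command: ["sleep", "3600"]
```

Leaving tolerationSeconds out would mean the Pod tolerates the taint indefinitely.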
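In a Kubernetes cluster deployed with kubeadm, the master node is tainted automatically to keep away Pods that do not tolerate the taint. Pods that users create by hand without a toleration for this taint are therefore not scheduled onto the master.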
Example 1: Schedule Pods onto the master by tolerating the master:NoSchedule taint
[root@k8s-master Scheduler]# kubectl describe node k8s-master.org   # view the master's taints and their effect
...
Taints: node-role.kubernetes.io/master:NoSchedule
Unschedulable: false
[root@k8s-master Scheduler]# cat tolerations-daemonset-demo.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-demo
  namespace: default
  labels:
    app: prometheus
    component: node-exporter
spec:
  selector:
    matchLabels:
      app: prometheus
      component: node-exporter
  template:
    metadata:
      name: prometheus-node-exporter
      labels:
        app: prometheus
        component: node-exporter
    spec:
      tolerations:                              # tolerate the master's NoSchedule taint
      - key: node-role.kubernetes.io/master     # taint key
        effect: NoSchedule                      # taint effect
        operator: Exists                        # the key only needs to exist
      containers:
      - image: prom/node-exporter:latest
        name: prometheus-node-exporter
        ports:
        - name: prom-node-exp
          containerPort: 9100
          hostPort: 9100
[root@k8s-master Scheduler]# kubectl apply -f tolerations-daemonset-demo.yaml
[root@k8s-master Scheduler]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-demo-7fgnd 2/2 Running 0 5m15s 10.244.91.106 k8s-node2.org <none> <none>
daemonset-demo-dmd47 2/2 Running 0 5m15s 10.244.70.105 k8s-node1.org <none> <none>
daemonset-demo-jhzwf 2/2 Running 0 5m15s 10.244.42.29 k8s-node3.org <none> <none>
daemonset-demo-rcjmv 2/2 Running 0 5m15s 10.244.59.16 k8s-master.org <none> <none>
Example 2: Add a taint with the NoExecute effect to a node to evict all of its Pods
[root@k8s-master Scheduler]# kubectl taint --help
Update the taints on one or more nodes.
* A taint consists of a key, value, and effect. As an argument here, it is expressed as key=value:effect.
* The key must begin with a letter or number, and may contain letters, numbers, hyphens, dots, and underscores, up to
253 characters.
* Optionally, the key can begin with a DNS subdomain prefix and a single '/', like example.com/my-app
* The value is optional. If given, it must begin with a letter or number, and may contain letters, numbers, hyphens,
dots, and underscores, up to 63 characters.
* The effect must be NoSchedule, PreferNoSchedule or NoExecute.
* Currently taint can only apply to node.
Examples:
# Update node 'foo' with a taint with key 'dedicated' and value 'special-user' and effect 'NoSchedule'.
# If a taint with that key and effect already exists, its value is replaced as specified.
kubectl taint nodes foo dedicated=special-user:NoSchedule
# Remove from node 'foo' the taint with key 'dedicated' and effect 'NoSchedule' if one exists.
kubectl taint nodes foo dedicated:NoSchedule-
# Remove from node 'foo' all the taints with key 'dedicated'
kubectl taint nodes foo dedicated-
# Add a taint with key 'dedicated' on nodes having label mylabel=X
kubectl taint node -l myLabel=X dedicated=foo:PreferNoSchedule
# Add to node 'foo' a taint with key 'bar' and no value
kubectl taint nodes foo bar:NoSchedule
[root@k8s-master Scheduler]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-demo-7ghhd 1/1 Running 0 23m 192.168.113.35 k8s-node1 <none> <none>
daemonset-demo-cjxd5 1/1 Running 0 23m 192.168.12.35 k8s-node2 <none> <none>
daemonset-demo-lhng4 1/1 Running 0 23m 192.168.237.4 k8s-master <none> <none>
daemonset-demo-x5nhg 1/1 Running 0 23m 192.168.51.54 k8s-node3 <none> <none>
pod-antiaffinity-required-697f7d764d-69vx4 0/1 Pending 0 8s <none> <none> <none> <none>
pod-antiaffinity-required-697f7d764d-7cxp2 1/1 Running 0 8s 192.168.51.55 k8s-node3 <none> <none>
pod-antiaffinity-required-697f7d764d-rpb5r 1/1 Running 0 8s 192.168.12.36 k8s-node2 <none> <none>
pod-antiaffinity-required-697f7d764d-vf2x8 1/1 Running 0 8s 192.168.113.36 k8s-node1 <none> <none>
- Taint k8s-node3 with the NoExecute effect to evict all Pods on that node
[root@k8s-master Scheduler]# kubectl taint node k8s-node3 diskfull=true:NoExecute
node/k8s-node3 tainted
[root@k8s-master Scheduler]# kubectl describe node k8s-node3
...
CreationTimestamp: Sun, 29 Aug 2021 22:45:43 +0800
Taints: diskfull=true:NoExecute
- All Pods on k8s-node3 have been evicted. Because the Pod anti-affinity rule allows only one Pod of this kind per node, the replacement Pod stays Pending instead of being created on another node
[root@k8s-master Scheduler]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-demo-7ghhd 1/1 Running 0 31m 192.168.113.35 k8s-node1 <none> <none>
daemonset-demo-cjxd5 1/1 Running 0 31m 192.168.12.35 k8s-node2 <none> <none>
daemonset-demo-lhng4 1/1 Running 0 31m 192.168.237.4 k8s-master <none> <none>
pod-antiaffinity-required-697f7d764d-69vx4 0/1 Pending 0 7m45s <none> <none> <none> <none>
pod-antiaffinity-required-697f7d764d-l86td 0/1 Pending 0 6m5s <none> <none> <none> <none>
pod-antiaffinity-required-697f7d764d-rpb5r 1/1 Running 0 7m45s 192.168.12.36 k8s-node2 <none> <none>
pod-antiaffinity-required-697f7d764d-vf2x8 1/1 Running 0 7m45s 192.168.113.36 k8s-node1 <none> <none>
- Remove the taint; Pods are created on k8s-node3 again
[root@k8s-master Scheduler]# kubectl taint node k8s-node3 diskfull-
node/k8s-node3 untainted
[root@k8s-master Scheduler]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-demo-7ghhd 1/1 Running 0 34m 192.168.113.35 k8s-node1 <none> <none>
daemonset-demo-cjxd5 1/1 Running 0 34m 192.168.12.35 k8s-node2 <none> <none>
daemonset-demo-lhng4 1/1 Running 0 34m 192.168.237.4 k8s-master <none> <none>
daemonset-demo-m6g26 0/1 ContainerCreating 0 4s <none> k8s-node3 <none> <none>
pod-antiaffinity-required-697f7d764d-69vx4 0/1 ContainerCreating 0 10m <none> k8s-node3 <none> <none>
pod-antiaffinity-required-697f7d764d-l86td 0/1 Pending 0 9m1s <none> <none> <none> <none>
pod-antiaffinity-required-697f7d764d-rpb5r 1/1 Running 0 10m 192.168.12.36 k8s-node2 <none> <none>
pod-antiaffinity-required-697f7d764d-vf2x8 1/1 Running 0 10m 192.168.113.36 k8s-node1 <none> <none>