Background
Kubernetes provides the Probe mechanism for checking the health of containers in a Pod. This article combines the official documentation and the source code to understand how probes are used and implemented, along with the corresponding best practices.
Container Probes
A probe is a periodic check performed by the kubelet against a container, used to determine whether the container is alive and whether it is ready to serve traffic.

Based on their purpose, probes used by the kubelet fall into two categories:
- `livenessProbe`: indicates whether the container is in a healthy running state.
  - If the probe result is `Failure`, the kubelet kills the container and decides whether to restart it according to its restart policy.
  - If the container does not define a liveness probe, the result defaults to `Success`.
  - An initial delay (`initialDelaySeconds`) should be set to control when probing starts, so that a container with a long startup time does not get killed and restarted in an endless loop.
- `readinessProbe`: indicates whether the container is ready to serve requests.
  - If the probe result is `Failure`, the endpoints controller removes the Pod's IP from the endpoint list of every matching Service.
  - Until the initial delay has elapsed, the readiness probe defaults to `Failure`.
  - If the container does not define a readiness probe, the result defaults to `Success`.
A probe works by invoking a handler implemented for the container. Three handler types are available:
- `ExecAction`: runs a specified command inside the container; the check succeeds (`Success`) if the command exits with status `0`, and fails (`Failure`) otherwise.
- `TCPSocketAction`: attempts a TCP connection to the container's `IP:port`; the check succeeds if the port is open, and fails otherwise.
- `HTTPGetAction`: issues an HTTP GET request to a path on the container's `IP:port`; the check succeeds if the HTTP response status code is at least `200` and below `400`, and fails otherwise.
Based on the handler's outcome, each probe yields one of three results:

- `Success`: the handler returned success;
- `Failure`: the handler returned failure;
- `Unknown`: the handler could not be executed.
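To make the handler types concrete, here is a sketch of how they are declared under a container spec. The port number and the `/healthz` path are made-up examples, not values from this article:

```yaml
# Hypothetical container fragment combining two handler types.
livenessProbe:
  httpGet:            # HTTPGetAction: succeeds on 200 <= status < 400
    path: /healthz    # assumed health endpoint
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
readinessProbe:
  tcpSocket:          # TCPSocketAction: succeeds if the port accepts a connection
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```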
When should liveness or readiness probes be used?

After reading the descriptions above, a natural question arises: must every container define a liveness probe to detect whether it is alive, and a readiness probe to detect whether it can serve?

The answer is no.
- Regarding container liveness:
  - Problems that the container's own lifecycle management already solves do not need a liveness probe. For example, when the container's PID 1 process exits on error, the kubelet reconciles based on the container state and the Pod's `restartPolicy`.
  - Use a custom liveness probe when you want the kubelet to judge liveness by some custom criterion rather than the container's own state. For example, if the container's PID 1 is a long-lived init process and you want to judge liveness by a flask process that the init process starts, a custom liveness probe lets you kill the container when the flask process fails to start and reconcile according to `restartPolicy`.
- Regarding container readiness:
  - Use a readiness probe when you need a mechanism that distinguishes "the container has started" from "the container can serve". For example, an application may start successfully but need a long initialization (say, pulling a large amount of initial data) before it can serve; judging service availability by liveness alone is not enough. The container is only added to the endpoints, and begins serving traffic, once the readiness probe succeeds.
  - Use a readiness probe (distinct from the liveness probe) when you want a live container to be treated as "under maintenance" based on some condition: the kubelet automatically removes it from the endpoints and stops routing traffic to it (the container is up, but the service is being maintained).
  - Problems that the container lifecycle already solves do not need a readiness probe either. For example, when a Pod is deleted it is marked unready regardless of whether a readiness probe exists or what it returns.
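For the slow-initialization case above, a readiness probe might look like the following sketch; the `/ready` endpoint and the timing values are assumptions for illustration:

```yaml
readinessProbe:
  httpGet:
    path: /ready            # hypothetical endpoint that returns 200 only after init finishes
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 30      # tolerate a long warm-up (~150s) before giving up
```

Until this probe succeeds, the Pod stays out of the Service endpoints even though the container is running.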
Probes in practice

`exec-liveness.yaml`:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
Observe the Pod's status:
root@kmaster135:/home/chenjiaxi01/yaml/pods/probe# kubectl describe pod liveness-exec
Name: liveness-exec
...
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m19s default-scheduler Successfully assigned default/liveness-exec to dnode136
Normal Killing 2m2s kubelet, dnode136 Killing container with id docker://liveness:Container failed liveness probe.. Container will be killed and recreated.
Warning Failed 107s kubelet, dnode136 Failed to pull image "k8s.gcr.io/busybox": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/busybox/manifests/latest: dial tcp [2404:6800:4008:c06::52]:443: connect: network is unreachable
Warning Failed 107s kubelet, dnode136 Error: ErrImagePull
Normal BackOff 106s kubelet, dnode136 Back-off pulling image "k8s.gcr.io/busybox"
Warning Failed 106s kubelet, dnode136 Error: ImagePullBackOff
Normal Pulling 93s (x3 over 4m8s) kubelet, dnode136 pulling image "k8s.gcr.io/busybox"
Normal Pulled 72s (x2 over 3m18s) kubelet, dnode136 Successfully pulled image "k8s.gcr.io/busybox"
Normal Created 72s (x2 over 3m17s) kubelet, dnode136 Created container
Normal Started 72s (x2 over 3m17s) kubelet, dnode136 Started container
Warning Unhealthy 27s (x6 over 2m42s) kubelet, dnode136 Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
After about 30 seconds, the liveness probe starts failing; the kubelet kills the container and restarts it according to the default `restartPolicy=Always`.

One issue stands out: the node already has the image, yet the kubelet still tries to pull it remotely. The cause is `imagePullPolicy: Always`; to skip the pull when a local copy of the image exists, set `imagePullPolicy: IfNotPresent`.
Code implementation

Code version: release-1.12

- Data structures in the Kubelet

pkg/kubelet/kubelet.go
// Kubelet is the main kubelet implementation.
type Kubelet struct {
kubeletConfiguration componentconfig.KubeletConfiguration
...
// Handles container probing.
probeManager prober.Manager
// Manages container health check results.
livenessManager proberesults.Manager
...
}
- Initialization

pkg/kubelet/kubelet.go
// NewMainKubelet instantiates a new Kubelet object along with all the required internal modules.
// No initialization of Kubelet and its modules should happen here.
func NewMainKubelet(kubeCfg *componentconfig.KubeletConfiguration, kubeDeps *KubeletDeps, crOptions *options.ContainerRuntimeOptions, standaloneMode bool, hostnameOverride, nodeIP, providerID string) (*Kubelet, error) {
...
klet := &Kubelet{
hostname: hostname,
nodeName: nodeName,
kubeClient: kubeDeps.KubeClient,
...
}
...
klet.probeManager = prober.NewManager(
klet.statusManager,
klet.livenessManager,
klet.runner,
containerRefManager,
kubeDeps.Recorder)
...
}
- Startup

pkg/kubelet/kubelet.go
// Run starts the kubelet reacting to config updates
func (kl *Kubelet) Run(updates <-chan kubetypes.PodUpdate) {
...
// Start component sync loops.
kl.statusManager.Start()
kl.probeManager.Start()
...
}
- Usage:

When a Pod is created: pkg/kubelet/kubelet.go
// HandlePodAdditions is the callback in SyncHandler for pods being added from
// a config source.
func (kl *Kubelet) HandlePodAdditions(pods []*v1.Pod) {
start := kl.clock.Now()
sort.Sort(sliceutils.PodsByCreationTime(pods))
for _, pod := range pods {
existingPods := kl.podManager.GetPods()
// Always add the pod to the pod manager. Kubelet relies on the pod
// manager as the source of truth for the desired state. If a pod does
// not exist in the pod manager, it means that it has been deleted in
// the apiserver and no action (other than cleanup) is required.
kl.podManager.AddPod(pod)
...
kl.probeManager.AddPod(pod)
}
}
When a Pod is deleted: pkg/kubelet/kubelet.go
// HandlePodRemoves is the callback in the SyncHandler interface for pods
// being removed from a config source.
func (kl *Kubelet) HandlePodRemoves(pods []*v1.Pod) {
	start := kl.clock.Now()
	for _, pod := range pods {
		kl.podManager.DeletePod(pod)
		...
		kl.probeManager.RemovePod(pod)
	}
}
- The `prober.Manager` interface

pkg/kubelet/prober/prober_manager.go
// Manager manages pod probing. It creates a probe "worker" for every container that specifies a
// probe (AddPod). The worker periodically probes its assigned container and caches the results. The
// manager use the cached probe results to set the appropriate Ready state in the PodStatus when
// requested (UpdatePodStatus). Updating probe parameters is not currently supported.
// TODO: Move liveness probing out of the runtime, to here.
type Manager interface {
// AddPod creates new probe workers for every container probe. This should be called for every
// pod created.
AddPod(pod *v1.Pod)
// RemovePod handles cleaning up the removed pod state, including terminating probe workers and
// deleting cached results.
RemovePod(pod *v1.Pod)
// CleanupPods handles cleaning up pods which should no longer be running.
// It takes a list of "active pods" which should not be cleaned up.
CleanupPods(activePods []*v1.Pod)
// UpdatePodStatus modifies the given PodStatus with the appropriate Ready state for each
// container based on container running status, cached probe results and worker states.
UpdatePodStatus(types.UID, *v1.PodStatus)
// Start starts the Manager sync loops.
Start()
}
`prober.Manager` is responsible for managing Pod probing and exposes five methods:

- `AddPod(pod *v1.Pod)`: called when a Pod is created; creates a new probe worker for every container probe;
- `RemovePod(pod *v1.Pod)`: cleans up the probe state of a removed Pod, terminating its probe workers and deleting cached results;
- `CleanupPods(activePods []*v1.Pod)`: cleans up Pods that should no longer be running (how does this differ from and relate to `RemovePod`?);
- `UpdatePodStatus(types.UID, *v1.PodStatus)`: updates the PodStatus based on container running status, cached probe results, and worker states;
- `Start()`: starts the Manager's sync loops.
Putting these five methods together: when a Pod is created, the Manager creates a probe worker for each container probe via `AddPod`; each worker periodically performs its assigned probe and caches the result. Based on the cached results, the Manager updates the `Ready` state in `PodStatus` via `UpdatePodStatus` when requested. When the Pod is deleted, the workers are reclaimed via `RemovePod`.

How should the comment "// TODO: Move liveness probing out of the runtime, to here." be understood?
- The implementation of the interface: `prober.manager`
type manager struct {
// Map of active workers for probes
workers map[probeKey]*worker
// Lock for accessing & mutating workers
workerLock sync.RWMutex
// The statusManager cache provides pod IP and container IDs for probing.
statusManager status.Manager
// readinessManager manages the results of readiness probes
readinessManager results.Manager
// livenessManager manages the results of liveness probes
livenessManager results.Manager
// prober executes the probe actions.
prober *prober
}
`prober.manager` contains the following fields:

- `workers`: maps each probe to its worker;
- `workerLock`: must be held when accessing or mutating `workers`;
- `statusManager`: provides Pod IPs and container IDs for probing;
- `readinessManager`: stores readiness probe results;
- `livenessManager`: stores liveness probe results;
- `prober`: executes the actual probe actions.
- worker: the main probing logic

A worker object encapsulates the probing task for a single probe.

Its data structure is as follows:

pkg/kubelet/prober/worker.go:37
// worker handles the periodic probing of its assigned container. Each worker has a go-routine
// associated with it which runs the probe loop until the container permanently terminates, or the
// stop channel is closed. The worker uses the probe Manager's statusManager to get up-to-date
// container IDs.
type worker struct {
// Channel for stopping the probe.
stopCh chan struct{}
// The pod containing this probe (read-only)
pod *v1.Pod
// The container to probe (read-only)
container v1.Container
// Describes the probe configuration (read-only)
spec *v1.Probe
// The type of the worker.
probeType probeType
// The probe value during the initial delay.
initialValue results.Result
// Where to store this workers results.
resultsManager results.Manager
probeManager *manager
// The last known container ID for this worker.
containerID kubecontainer.ContainerID
// The last probe result for this worker.
lastResult results.Result
// How many times in a row the probe has returned the same result.
resultRun int
// If set, skip probing.
onHold bool
// proberResultsMetricLabels holds the labels attached to this worker
// for the ProberResults metric.
proberResultsMetricLabels prometheus.Labels
}
Its methods are:

- `newWorker`: initializes a worker bound to one container liveness/readiness probing task, based on the given `probeType` and other parameters;
- `run`: runs the worker's `doProbe` periodically, at the interval given by `Probe.PeriodSeconds`, until a stop signal is received;
- `stop`: sends the stop signal, terminating the worker;
- `doProbe`: performs the actual probe and returns the result as `true`/`false`.
Let's focus on the implementation of `doProbe`:
// doProbe probes the container once and records the result.
// Returns whether the worker should continue.
func (w *worker) doProbe() (keepGoing bool) {
	defer func() { recover() }() // Actually eat panics (HandleCrash takes care of logging)
	defer runtime.HandleCrash(func(_ interface{}) { keepGoing = true })
	... // defensive checks: skip cases that should not be probed, e.g. the Pod or container no longer exists
	// TODO: in order for exec probes to correctly handle downward API env, we must be able to reconstruct
	// the full container environment here, OR we must make a call to the CRI in order to get those environment
	// values from the running container.
	result, err := w.probeManager.prober.probe(w.probeType, w.pod, status, w.container, w.containerID)
	if err != nil {
		// Prober error, throw away the result.
		return true
	}
	... // decide whether to report success based on the probe result and the configuration (e.g. retry thresholds)
}
`doProbe` classifies the container's situation to decide whether probing should happen at all, then processes the probe result to decide whether to report success (`true`).
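The threshold handling applied to probe results (a new result only takes effect after it has been observed enough times in a row, per `SuccessThreshold`/`FailureThreshold`) can be sketched as follows; `tracker` is a made-up type for illustration, not kubelet code:

```go
package main

import "fmt"

// Result mirrors the probe outcomes described above.
type Result string

const (
	Success Result = "Success"
	Failure Result = "Failure"
)

// tracker sketches the consecutive-result counting: the cached result
// only flips after the new result has repeated up to its threshold.
type tracker struct {
	last      Result // last raw probe result
	run       int    // how many times in a row it occurred
	successTh int
	failureTh int
	cached    Result // the result reported to the rest of the kubelet
}

func (t *tracker) observe(r Result) Result {
	if r == t.last {
		t.run++
	} else {
		t.last, t.run = r, 1
	}
	if (r == Success && t.run >= t.successTh) ||
		(r == Failure && t.run >= t.failureTh) {
		t.cached = r
	}
	return t.cached
}

func main() {
	t := &tracker{last: Success, successTh: 1, failureTh: 3, cached: Success}
	fmt.Println(t.observe(Failure)) // Success: only 1 consecutive failure
	fmt.Println(t.observe(Failure)) // Success: 2 consecutive failures
	fmt.Println(t.observe(Failure)) // Failure: threshold of 3 reached
}
```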
Next we follow `w.probeManager.prober.probe`, which supports the three probe types `exec`, `tcp`, and `httpGet`. The implementation is at pkg/kubelet/prober/prober.go:81:
// probe probes the container.
func (pb *prober) probe(probeType probeType, pod *v1.Pod, status v1.PodStatus, container v1.Container, containerID kubecontainer.ContainerID) (results.Result, error) {
var probeSpec *v1.Probe
switch probeType {
case readiness:
probeSpec = container.ReadinessProbe
case liveness:
probeSpec = container.LivenessProbe
default:
return results.Failure, fmt.Errorf("Unknown probe type: %q", probeType)
}
...
result, output, err := pb.runProbeWithRetries(probeType, probeSpec, pod, status, container, containerID, maxProbeRetries)
...
}
`runProbeWithRetries` wraps the retry logic and eventually calls down to `runProbe`, which implements the concrete probing flow for each probe type. Given our problem background, we mainly care about the HTTPGet implementation, with two questions:

- Can the user specify the Host for HTTPGet?
- If the user does not specify it, what is the default Host (the ClusterIP, perhaps)?
pkg/kubelet/prober/prober.go:147
func (pb *prober) runProbe(probeType probeType, p *v1.Probe, pod *v1.Pod, status v1.PodStatus, container v1.Container, containerID kubecontainer.ContainerID) (probe.Result, string, error) {
timeout := time.Duration(p.TimeoutSeconds) * time.Second
if p.Exec != nil {
glog.V(4).Infof("Exec-Probe Pod: %v, Container: %v, Command: %v", pod, container, p.Exec.Command)
command := kubecontainer.ExpandContainerCommandOnlyStatic(p.Exec.Command, container.Env)
return pb.exec.Probe(pb.newExecInContainer(container, containerID, command, timeout))
}
if p.HTTPGet != nil {
scheme := strings.ToLower(string(p.HTTPGet.Scheme))
// 1. The user may specify the Host for HTTPGet;
// 2. if not specified, the Host defaults to the Pod IP.
host := p.HTTPGet.Host
if host == "" {
host = status.PodIP
}
port, err := extractPort(p.HTTPGet.Port, container)
if err != nil {
return probe.Unknown, "", err
}
path := p.HTTPGet.Path
glog.V(4).Infof("HTTP-Probe Host: %v://%v, Port: %v, Path: %v", scheme, host, port, path)
url := formatURL(scheme, host, port, path)
headers := buildHeader(p.HTTPGet.HTTPHeaders)
glog.V(4).Infof("HTTP-Probe Headers: %v", headers)
if probeType == liveness {
return pb.livenessHttp.Probe(url, headers, timeout)
} else { // readiness
return pb.readinessHttp.Probe(url, headers, timeout)
}
}
if p.TCPSocket != nil {
port, err := extractPort(p.TCPSocket.Port, container)
if err != nil {
return probe.Unknown, "", err
}
host := p.TCPSocket.Host
if host == "" {
host = status.PodIP
}
glog.V(4).Infof("TCP-Probe Host: %v, Port: %v, Timeout: %v", host, port, timeout)
return pb.tcp.Probe(host, port, timeout)
}
glog.Warningf("Failed to find probe builder for container: %v", container)
return probe.Unknown, "", fmt.Errorf("Missing probe handler for %s:%s", format.Pod(pod), container.Name)
}
Tracing further down the call chain leads to `DoHTTPProbe`: pkg/probe/http/http.go:66
// DoHTTPProbe checks if a GET request to the url succeeds.
// If the HTTP response code is successful (i.e. 400 > code >= 200), it returns Success.
// If the HTTP response code is unsuccessful or HTTP communication fails, it returns Failure.
// This is exported because some other packages may want to do direct HTTP probes.
func DoHTTPProbe(url *url.URL, headers http.Header, client HTTPGetInterface) (probe.Result, string, error) {
req, err := http.NewRequest("GET", url.String(), nil)
...
if headers.Get("Host") != "" {
req.Host = headers.Get("Host")
}
res, err := client.Do(req)
if err != nil {
// Convert errors into failures to catch timeouts.
return probe.Failure, err.Error(), nil
}
defer res.Body.Close()
...
if res.StatusCode >= http.StatusOK && res.StatusCode < http.StatusBadRequest {
glog.V(4).Infof("Probe succeeded for %s, Response: %v", url.String(), *res)
return probe.Success, body, nil
}
glog.V(4).Infof("Probe failed for %s with request headers %v, response body: %v", url.String(), headers, body)
return probe.Failure, fmt.Sprintf("HTTP probe failed with statuscode: %d", res.StatusCode), nil
}
The probe boils down to sending an HTTP request; this completes the walkthrough of the HTTPGet probe flow.

Other notes

Understanding select as a concurrency-control primitive
// run periodically probes the container.
func (w *worker) run() {
probeTickerPeriod := time.Duration(w.spec.PeriodSeconds) * time.Second
// If kubelet restarted the probes could be started in rapid succession.
// Let the worker wait for a random portion of tickerPeriod before probing.
time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod)))
probeTicker := time.NewTicker(probeTickerPeriod)
defer func() {
// Clean up.
probeTicker.Stop()
if !w.containerID.IsEmpty() {
w.resultsManager.Remove(w.containerID)
}
w.probeManager.removeWorker(w.pod.UID, w.container.Name, w.probeType)
ProberResults.Delete(w.proberResultsMetricLabels)
}()
probeLoop:
for w.doProbe() {
// Wait for next probe tick.
select {
case <-w.stopCh:
break probeLoop
case <-probeTicker.C:
// continue
}
}
}
The `probeLoop` label usage was not obvious to me, so here is a small sample:
package main

import (
	"fmt"
	"time"
)

func main() {
stopCh := make(chan int)
ticker := time.NewTicker(1 * time.Second)
go func() {
time.Sleep(3 * time.Second)
stopCh <- 0
fmt.Println("Send to stopCh")
}()
testLoop:
for {
select {
case <-stopCh:
fmt.Println("Receive from stopCh, break")
break testLoop
case <-ticker.C:
fmt.Println("Running...")
// continue
}
}
fmt.Println("Done")
}
- The label just names the loop; without it, `break` would only exit the `select` statement, not the whole loop;
- the `time.Ticker` pattern is worth learning: it drives a periodic task until some stop signal arrives;
- `for {}` is an infinite loop, equivalent to Python's `while True`.
The worker.stop pattern
pkg/kubelet/prober/worker.go:147
// stop stops the probe worker. The worker handles cleanup and removes itself from its manager.
// It is safe to call stop multiple times.
func (w *worker) stop() {
select {
case w.stopCh <- struct{}{}:
default: // Non-blocking.
}
}
How does this differ from writing it as follows?
func (w *worker) stop() {
w.stopCh <- struct{}{}
}
With the non-blocking form, if the channel is already full, the goroutine calling stop does not block; even if upper layers call it repeatedly, no error occurs. The stop operation is effectively idempotent, which improves robustness.

A sample:
package main

import (
	"fmt"
	"time"
)

var stopCh = make(chan struct{}, 1)
func nonblockingStop() {
select {
case stopCh <- struct{}{}:
fmt.Println("Write to stopCh... Break")
default:
fmt.Println("Cannot write to stopCh... Running")
// non-blocking
}
}
func stop() {
stopCh <- struct{}{}
}
func looping() {
testLoop:
for {
select {
case <-stopCh:
fmt.Println("Receive End Signal...Done")
break testLoop
default:
fmt.Println("Cannot Receive End Signal...Done")
time.Sleep(500 * time.Millisecond)
}
}
}
func main() {
// make stop blocked
go looping()
time.Sleep(time.Second)
for i := 0; i <= 2; i++ {
//stop()
nonblockingStop()
}
time.Sleep(3 * time.Second)
}
Calling stop() three times deadlocks, but nonblockingStop() does not.