Kubernetes GarbageCollector source code analysis (Part 3)

Kubernetes version: 1.13.2

This continues the previous two parts:

Kubernetes GarbageCollector Controller source code analysis (Part 1)
Kubernetes GarbageCollector Controller source code analysis (Part 2)

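Main steps

The GarbageCollector Controller source code can be divided into the following parts:

  1. The monitors act as producers and put changed resources into the graphChanges queue; meanwhile the restMapper periodically checks the resource types in the cluster and refreshes the monitors.
  2. runProcessGraphChanges takes changed items from the graphChanges queue and, depending on the situation, puts them into the attemptToDelete queue.
  3. runProcessGraphChanges takes changed items from the graphChanges queue and, depending on the situation, puts them into the attemptToOrphan queue.
  4. runAttemptToDeleteWorker takes items from the attemptToDelete queue and tries to delete garbage resources.
  5. runAttemptToOrphanWorker takes items from the attemptToOrphan queue and processes the resources to be orphaned.

The previous part covered steps 2 and 3; this part covers steps 4 and 5.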

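Finalizers

Before reading the code below, it helps to understand finalizers.

An object's finalizers represent logic that must run before the object is deleted. An object can only be deleted once its finalizers field is empty. Finalizers provide a general-purpose API: they are used not only to block cascading deletion but also to hook in logic before an object is deleted: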

type ObjectMeta struct {
    // ...
    Finalizers []string
}

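Finalizers run before the object is deleted. Each time a finalizer completes successfully, it removes itself from the Finalizers array; once the last finalizer has been removed, the API server deletes the object.

By default, deleting an object also deletes all of its dependents. In some cases, however, we only want to delete the object itself without triggering a complex cascading deletion. For this the garbage collection mechanism introduces the OrphanFinalizer, which is added to or removed from the Finalizers array before the object is deleted.

The finalizer watches for update events on the object and removes the object from the OwnerReferences array of all of its dependents; at the same time it removes any stale OwnerReferences from the dependents and then removes the OrphanFinalizer from the Finalizers array.

With the OrphanFinalizer we can delete a Kubernetes object while keeping all of its dependents, which gives users a more flexible way to retain and delete objects.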
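To make the finalizer rule concrete, here is a minimal sketch in plain Go. The `object` type and the `removeFinalizer` and `canBeRemoved` helpers are hypothetical names made up for this illustration and are not part of Kubernetes; only the two finalizer strings ("orphan" and "foregroundDeletion") match the real constants.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Kubernetes object; only the
// fields needed for this illustration are modelled.
type object struct {
	Name       string
	Deleting   bool // stands in for a non-nil deletionTimestamp
	Finalizers []string
}

// removeFinalizer returns the finalizer list without the named entry.
func removeFinalizer(finalizers []string, name string) []string {
	kept := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	return kept
}

// canBeRemoved reports whether the object may finally be removed:
// it must be marked for deletion and have no finalizers left.
func canBeRemoved(o object) bool {
	return o.Deleting && len(o.Finalizers) == 0
}

func main() {
	o := object{Name: "demo", Deleting: true, Finalizers: []string{"orphan", "foregroundDeletion"}}
	fmt.Println("removable with two finalizers:", canBeRemoved(o))

	o.Finalizers = removeFinalizer(o.Finalizers, "orphan")
	fmt.Println("removable with one finalizer:", canBeRemoved(o))

	o.Finalizers = removeFinalizer(o.Finalizers, "foregroundDeletion")
	fmt.Println("removable with no finalizers:", canBeRemoved(o))
}
```

Only the last call reports that the object can be removed, which mirrors the rule that the API server deletes the object once the last finalizer is gone.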

It is also worth reading the "Garbage Collection" page in the official documentation.

The attemptToDelete queue

In $GOPATH\src\k8s.io\kubernetes\pkg\controller\garbagecollector\garbagecollector.go:

func (gc *GarbageCollector) runAttemptToDeleteWorker() {
    for gc.attemptToDeleteWorker() {
    }
}

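The worker takes a resource from the attemptToDelete queue and calls gc.attemptToDeleteItem(n) to process it. If an error occurs, the item is added back to the attemptToDelete queue with rate limiting (AddRateLimited).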

func (gc *GarbageCollector) attemptToDeleteWorker() bool {
    // take the next resource to attempt deletion on from the queue
    item, quit := gc.attemptToDelete.Get()
    gc.workerLock.RLock()
    defer gc.workerLock.RUnlock()
    if quit {
        return false
    }
    defer gc.attemptToDelete.Done(item)
    n, ok := item.(*node)
    if !ok {
        utilruntime.HandleError(fmt.Errorf("expect *node, got %#v", item))
        return true
    }
    err := gc.attemptToDeleteItem(n)
    if err != nil {
        if _, ok := err.(*restMappingError); ok {
            // There are at least two ways this can happen:
            // 1. The reference is to an object of a custom type that has not yet been
            //    recognized by gc.restMapper (this is a transient error).
            // 2. The reference is to an invalid group/version. We don't currently
            //    have a way to distinguish this from a valid type we will recognize
            //    after the next discovery sync.
            // For now, record the error and retry.
            klog.V(5).Infof("error syncing item %s: %v", n, err)
        } else {
            utilruntime.HandleError(fmt.Errorf("error syncing item %s: %v", n, err))
        }
        // retry if garbage collection of an object failed.
        gc.attemptToDelete.AddRateLimited(item)
    } else if !n.isObserved() {
        // requeue if item hasn't been observed via an informer event yet.
        // otherwise a virtual node for an item added AND removed during watch reestablishment can get stuck in the graph and never removed.
        // see https://issue.k8s.io/56121
        klog.V(5).Infof("item %s hasn't been observed via informer yet", n.identity)
        gc.attemptToDelete.AddRateLimited(item)
    }
    return true
}

The key method is attemptToDeleteItem:

func (gc *GarbageCollector) attemptToDeleteItem(item *node) error {
    klog.V(2).Infof("processing item %s", item.identity)
    // "being deleted" is an one-way trip to the final deletion. We'll just wait for the final deletion, and then process the object's dependents.
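    // The item is marked as being deleted (its deletionTimestamp is non-nil) and it is not deleting its dependents
    // (as the previous part showed, deletingDependents is only set to true when the item is deleted in foreground mode).
    // So an item that is being deleted in Orphan or Background mode returns immediately.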
    if item.isBeingDeleted() && !item.isDeletingDependents() {
        klog.V(5).Infof("processing item %s returned at once, because its DeletionTimestamp is non-nil", item.identity)
        return nil
    }
    // TODO: It's only necessary to talk to the API server if this is a
    // "virtual" node. The local graph could lag behind the real status, but in
    // practice, the difference is small.
    // fetch the object from the API server using the identity stored in item
    latest, err := gc.getObject(item.identity)
    switch {
    case errors.IsNotFound(err):
        // the GraphBuilder can add "virtual" node for an owner that doesn't
        // exist yet, so we need to enqueue a virtual Delete event to remove
        // the virtual node from GraphBuilder.uidToNode.
        klog.V(5).Infof("item %v not found, generating a virtual delete event", item.identity)
        gc.dependencyGraphBuilder.enqueueVirtualDeleteEvent(item.identity)
        // since we're manually inserting a delete event to remove this node,
        // we don't need to keep tracking it as a virtual node and requeueing in attemptToDelete
        item.markObserved()
        return nil
    case err != nil:
        return err
    }

    // the UID does not match
    if latest.GetUID() != item.identity.UID {
        klog.V(5).Infof("UID doesn't match, item %v not found, generating a virtual delete event", item.identity)
        gc.dependencyGraphBuilder.enqueueVirtualDeleteEvent(item.identity)
        // since we're manually inserting a delete event to remove this node,
        // we don't need to keep tracking it as a virtual node and requeueing in attemptToDelete
        item.markObserved()
        return nil
    }

    // TODO: attemptToOrphanWorker() routine is similar. Consider merging
    // attemptToOrphanWorker() into attemptToDeleteItem() as well.
    // the item is deleting its dependents, so process those dependents
    if item.isDeletingDependents() {
        return gc.processDeletingDependentsItem(item)
    }

    // compute if we should delete the item
    // read metadata.ownerReferences from the object
    ownerReferences := latest.GetOwnerReferences()
    if len(ownerReferences) == 0 {
        // nothing to do for objects without owners
        klog.V(2).Infof("object %s's doesn't have an owner, continue on next item", item.identity)
        return nil
    }

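    // solid: the owner exists and is not waiting for its dependents (not being deleted, or without the foregroundDeletion finalizer)
    // dangling: the owner does not exist
    // waitingForDependentsDeletion: the owner exists, its deletionTimestamp is non-nil and it has the foregroundDeletion finalizer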
    solid, dangling, waitingForDependentsDeletion, err := gc.classifyReferences(item, ownerReferences)
    if err != nil {
        return err
    }
    klog.V(5).Infof("classify references of %s.\nsolid: %#v\ndangling: %#v\nwaitingForDependentsDeletion: %#v\n", item.identity, solid, dangling, waitingForDependentsDeletion)

    switch {
    // the item has at least one owner that exists and is not waiting for its dependents to be deleted
    case len(solid) != 0:
        klog.V(2).Infof("object %#v has at least one existing owner: %#v, will not garbage collect", solid, item.identity)
        if len(dangling) == 0 && len(waitingForDependentsDeletion) == 0 {
            return nil
        }
        klog.V(2).Infof("remove dangling references %#v and waiting references %#v for object %s", dangling, waitingForDependentsDeletion, item.identity)
        // waitingForDependentsDeletion needs to be deleted from the
        // ownerReferences, otherwise the referenced objects will be stuck with
        // the FinalizerDeletingDependents and never get deleted.
        // UIDs of the owners whose references must be removed
        ownerUIDs := append(ownerRefsToUIDs(dangling), ownerRefsToUIDs(waitingForDependentsDeletion)...)
        // build the patch body
        patch := deleteOwnerRefStrategicMergePatch(item.identity.UID, ownerUIDs...)
        // send the patch request
        _, err = gc.patch(item, patch, func(n *node) ([]byte, error) {
            return gc.deleteOwnerRefJSONMergePatch(n, ownerUIDs...)
        })
        return err
    // some owner of the item is being deleted in foreground mode, and the item has dependents
    case len(waitingForDependentsDeletion) != 0 && item.dependentsLength() != 0:
        deps := item.getDependents()
        // iterate over the item's dependents
        for _, dep := range deps {
            if dep.isDeletingDependents() {
                // this circle detection has false positives, we need to
                // apply a more rigorous detection if this turns out to be a
                // problem.
                // there are multiple workers run attemptToDeleteItem in
                // parallel, the circle detection can fail in a race condition.
                klog.V(2).Infof("processing object %s, some of its owners and its dependent [%s] have FinalizerDeletingDependents, to prevent potential cycle, its ownerReferences are going to be modified to be non-blocking, then the object is going to be deleted with Foreground", item.identity, dep.identity)
                // build a patch that unsets BlockOwnerDeletion on all of the item's ownerReferences, so that the item does not block deletion of its owners
                patch, err := item.unblockOwnerReferencesStrategicMergePatch()
                if err != nil {
                    return err
                }
                // apply the patch
                if _, err := gc.patch(item, patch, gc.unblockOwnerReferencesJSONMergePatch); err != nil {
                    return err
                }
                break
            }
        }
        klog.V(2).Infof("at least one owner of object %s has FinalizerDeletingDependents, and the object itself has dependents, so it is going to be deleted in Foreground", item.identity)
        // the deletion event will be observed by the graphBuilder, so the item
        // will be processed again in processDeletingDependentsItem. If it
        // doesn't have dependents, the function will remove the
        // FinalizerDeletingDependents from the item, resulting in the final
        // deletion of the item.
        policy := metav1.DeletePropagationForeground
        return gc.deleteObject(item.identity, &policy)
    default:
        // item doesn't have any solid owner, so it needs to be garbage
        // collected. Also, none of item's owners is waiting for the deletion of
        // the dependents, so set propagationPolicy based on existing finalizers.
        var policy metav1.DeletionPropagation
        switch {
        case hasOrphanFinalizer(latest):
            // if an existing orphan finalizer is already on the object, honor it.
            policy = metav1.DeletePropagationOrphan
        case hasDeleteDependentsFinalizer(latest):
            // if an existing foreground finalizer is already on the object, honor it.
            policy = metav1.DeletePropagationForeground
        default:
            // otherwise, default to background.
            policy = metav1.DeletePropagationBackground
        }
        klog.V(2).Infof("delete object %s with propagation policy %s", item.identity, policy)
        // delete the object, which no longer has a solid owner
        return gc.deleteObject(item.identity, &policy)
    }
}

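In summary, attemptToDeleteItem does the following:
1. If the item is being deleted in Orphan or Background mode, return immediately.
2. If the item is being deleted in foreground mode, call processDeletingDependentsItem to handle the dependents that block its deletion, putting them into the attemptToDelete queue.
3. Get the item's owner references and call classifyReferences to split them into three sets: solid (owners that exist and are either not being deleted or do not have the foregroundDeletion finalizer), dangling (owners that no longer exist) and waitingForDependentsDeletion (owners whose deletionTimestamp is non-nil and that have the foregroundDeletion finalizer).
4. First switch case: solid is not empty, i.e. the item has an owner that is not being deleted. If both dangling and waitingForDependentsDeletion are empty, return immediately; otherwise merge the UIDs of the two sets and send a patch request that removes the corresponding ownerReferences from the item.
5. Second switch case: waitingForDependentsDeletion is not empty and the item has dependents. Every owner of the item is either gone or being deleted in foreground mode. If one of the item's dependents is itself deleting its dependents, patch the item so that it no longer blocks the deletion of its owners; finally the item is deleted in foreground mode.
6. Third switch case: none of the above applies, so the item is deleted with a propagation policy derived from its own finalizers, defaulting to Background.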
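The branches listed above can be restated as a small pure function. This is only a simplified sketch: the `input` struct and the `decide` function are hypothetical names invented for this illustration, the API server lookup and UID check are left out, and the returned strings are just labels rather than anything the real controller produces.

```go
package main

import "fmt"

// input is a hypothetical summary of what attemptToDeleteItem knows about
// an item after classifying its owners.
type input struct {
	BeingDeleted, DeletingDependents           bool
	Owners                                     int // number of ownerReferences on the object
	Solid, Dangling, Waiting                   int // sizes of the three owner sets
	Dependents                                 int // number of dependents of the item
	HasOrphanFinalizer, HasForegroundFinalizer bool
}

// decide mirrors the order of the checks described in the list above.
func decide(in input) string {
	switch {
	case in.BeingDeleted && !in.DeletingDependents:
		return "skip: already being deleted"
	case in.DeletingDependents:
		return "process blocking dependents"
	case in.Owners == 0:
		return "skip: no owners"
	case in.Solid > 0 && in.Dangling == 0 && in.Waiting == 0:
		return "keep"
	case in.Solid > 0:
		return "keep: patch away dangling and waiting ownerReferences"
	case in.Waiting > 0 && in.Dependents > 0:
		return "delete: Foreground"
	case in.HasOrphanFinalizer:
		return "delete: Orphan"
	case in.HasForegroundFinalizer:
		return "delete: Foreground"
	default:
		return "delete: Background"
	}
}

func main() {
	// An item whose only owner cannot be found is garbage collected in
	// the background.
	fmt.Println(decide(input{Owners: 1, Dangling: 1}))
	// An item with one live owner and one missing owner is kept, but the
	// stale reference is patched away.
	fmt.Println(decide(input{Owners: 2, Solid: 1, Dangling: 1}))
}
```

The second example is the case that matters for the incident described later in this article: a single dangling owner is enough to delete an item when no solid owner is left.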


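In more detail, processDeletingDependentsItem collects the item's dependents whose ownerReference to the item has BlockOwnerDeletion set to true. If there are none, it removes the foregroundDeletion finalizer from the item. Otherwise it iterates over them and adds every dependent dep that has not yet started deleting its own dependents to the attemptToDelete queue.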

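// processDeletingDependentsItem processes an item that is waiting for its dependents to be deleted
func (gc *GarbageCollector) processDeletingDependentsItem(item *node) error {
    // dependents that block the deletion of item
    blockingDependents := item.blockingDependents()
    // if no dependent blocks the deletion, remove the foregroundDeletion finalizer from item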
    if len(blockingDependents) == 0 {
        klog.V(2).Infof("remove DeleteDependents finalizer for item %s", item.identity)
        return gc.removeFinalizer(item, metav1.FinalizerDeleteDependents)
    }
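    // iterate over the dependents that block the deletion of item
    for _, dep := range blockingDependents {
        // if dep has not started deleting its own dependents, add dep to the attemptToDelete queue
        if !dep.isDeletingDependents() {
            klog.V(2).Infof("adding %s to attemptToDelete, because its owner %s is deletingDependents", dep.identity, item.identity)
            // add the dependent to the attemptToDelete queue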
            gc.attemptToDelete.Add(dep)
        }
    }
    return nil
}

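The gc.classifyReferences(item, ownerReferences) method iterates over the item's owner list. It calls isDangling and adds owners that no longer exist to the dangling list; owners that are being deleted and have the foregroundDeletion finalizer go into the waitingForDependentsDeletion list; owners that are not being deleted, or that do not have the foregroundDeletion finalizer, go into the solid list.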

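// classifyReferences splits latestReferences into three categories:
// solid: the owner exists and is not waitingForDependentsDeletion
// dangling: the owner does not exist
// waitingForDependentsDeletion: the owner exists, its deletionTimestamp is non-nil and it has FinalizerDeletingDependents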
func (gc *GarbageCollector) classifyReferences(item *node, latestReferences []metav1.OwnerReference) (
    solid, dangling, waitingForDependentsDeletion []metav1.OwnerReference, err error) {
    // iterate over the owners of this node
    for _, reference := range latestReferences {
        // check whether the owner exists; isDangling == true means it does not. On error the item ends up being re-added to the attemptToDelete queue via AddRateLimited
        isDangling, owner, err := gc.isDangling(reference, item)
        if err != nil {
            return nil, nil, nil, err
        }
        // collect owners that do not exist in the dangling slice
        if isDangling {
            dangling = append(dangling, reference)
            continue
        }

        // the owner exists; get its accessor
        ownerAccessor, err := meta.Accessor(owner)
        if err != nil {
            return nil, nil, nil, err
        }
        // the owner is being deleted and has the foregroundDeletion finalizer
        if ownerAccessor.GetDeletionTimestamp() != nil && hasDeleteDependentsFinalizer(ownerAccessor) {
            // the owner is waiting for its dependents to be deleted; collect it
            waitingForDependentsDeletion = append(waitingForDependentsDeletion, reference)
        } else {
            // the owner is not being deleted, or does not have the foregroundDeletion finalizer
            solid = append(solid, reference)
        }
    }
    return solid, dangling, waitingForDependentsDeletion, nil
}
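For readers who want to experiment with the three categories without a cluster, the classification can be modelled roughly as follows. The `owner` type and the `classify` function are hypothetical and only imitate the branching of classifyReferences; the real method queries the API server through isDangling.

```go
package main

import "fmt"

// owner is a hypothetical, flattened view of what the garbage collector
// learns about one ownerReference.
type owner struct {
	Name                   string
	Exists                 bool // false when the lookup returned NotFound or a different UID
	Deleting               bool // deletionTimestamp is set
	HasForegroundFinalizer bool // the foregroundDeletion finalizer is present
}

// classify sorts owners into the three buckets used by the collector.
func classify(owners []owner) (solid, dangling, waiting []string) {
	for _, o := range owners {
		switch {
		case !o.Exists:
			dangling = append(dangling, o.Name)
		case o.Deleting && o.HasForegroundFinalizer:
			waiting = append(waiting, o.Name)
		default:
			solid = append(solid, o.Name)
		}
	}
	return solid, dangling, waiting
}

func main() {
	solid, dangling, waiting := classify([]owner{
		{Name: "live", Exists: true},
		{Name: "gone"},
		{Name: "foreground", Exists: true, Deleting: true, HasForegroundFinalizer: true},
		{Name: "background", Exists: true, Deleting: true},
	})
	fmt.Println("solid:", solid)
	fmt.Println("dangling:", dangling)
	fmt.Println("waitingForDependentsDeletion:", waiting)
}
```

Note that an owner that is being deleted without the foregroundDeletion finalizer still counts as solid, exactly as in the else branch of the real code.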

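The gc.isDangling(reference, item) method first checks the absentOwnerCache by owner UID to see whether the owner is already known to be absent. If the cache has no entry, it builds a request from the fields of the ownerReference and asks the apiserver whether the owner object can be found. If the object is not found, or is found but its UID does not match, the UID is added to the absentOwnerCache and the method returns true (dangling).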

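// isDangling checks whether a reference points to an object that does not exist. If isDangling looks up the referenced object at the API server, it also returns its latest state.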
func (gc *GarbageCollector) isDangling(reference metav1.OwnerReference, item *node) (
    dangling bool, owner *unstructured.Unstructured, err error) {
    if gc.absentOwnerCache.Has(reference.UID) {
        klog.V(5).Infof("according to the absentOwnerCache, object %s's owner %s/%s, %s does not exist", item.identity.UID, reference.APIVersion, reference.Kind, reference.Name)
        return true, nil, nil
    }
    // TODO: we need to verify the reference resource is supported by the
    // system. If it's not a valid resource, the garbage collector should i)
    // ignore the reference when decide if the object should be deleted, and
    // ii) should update the object to remove such references. This is to
    // prevent objects having references to an old resource from being
    // deleted during a cluster upgrade.
    resource, namespaced, err := gc.apiResource(reference.APIVersion, reference.Kind)
    if err != nil {
        return false, nil, err
    }

    // TODO: It's only necessary to talk to the API server if the owner node
    // is a "virtual" node. The local graph could lag behind the real
    // status, but in practice, the difference is small.
    owner, err = gc.dynamicClient.Resource(resource).Namespace(resourceDefaultNamespace(namespaced, item.identity.Namespace)).Get(reference.Name, metav1.GetOptions{})
    switch {
    case errors.IsNotFound(err):
        gc.absentOwnerCache.Add(reference.UID)
        klog.V(5).Infof("object %s's owner %s/%s, %s is not found", item.identity.UID, reference.APIVersion, reference.Kind, reference.Name)
        return true, nil, nil
    case err != nil:
        return false, nil, err
    }

    if owner.GetUID() != reference.UID {
        klog.V(5).Infof("object %s's owner %s/%s, %s is not found, UID mismatch", item.identity.UID, reference.APIVersion, reference.Kind, reference.Name)
        gc.absentOwnerCache.Add(reference.UID)
        return true, nil, nil
    }
    return false, owner, nil
}

The attemptToOrphan queue

In the code:

func (gc *GarbageCollector) runAttemptToOrphanWorker() {
    for gc.attemptToOrphanWorker() {
    }
}

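An endless loop keeps taking items from the attemptToOrphan queue and calls gc.orphanDependents(owner.identity, dependents) to remove the item's ownerReference from each of its dependents. If an error occurs along the way, the item is added back to the attemptToOrphan queue with rate limiting. Finally the orphan finalizer is removed from the item.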

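// attemptToOrphanWorker takes a node from attemptToOrphan, finds its dependents using the graph maintained by the GC,
// removes it from the OwnerReferences of those dependents, and finally updates the item to remove the orphan finalizer.
// If any of these steps fails, the node is added back to attemptToOrphan.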
func (gc *GarbageCollector) attemptToOrphanWorker() bool {
    item, quit := gc.attemptToOrphan.Get()
    gc.workerLock.RLock()
    defer gc.workerLock.RUnlock()
    if quit {
        return false
    }
    defer gc.attemptToOrphan.Done(item)
    owner, ok := item.(*node)
    if !ok {
        utilruntime.HandleError(fmt.Errorf("expect *node, got %#v", item))
        return true
    }
    // we don't need to lock each element, because they never get updated
    owner.dependentsLock.RLock()
    dependents := make([]*node, 0, len(owner.dependents))
    for dependent := range owner.dependents {
        dependents = append(dependents, dependent)
    }
    owner.dependentsLock.RUnlock()
    // orphan the dependents
    err := gc.orphanDependents(owner.identity, dependents)
    if err != nil {
        utilruntime.HandleError(fmt.Errorf("orphanDependents for %s failed with %v", owner.identity, err))
        gc.attemptToOrphan.AddRateLimited(item)
        return true
    }
    // update the owner, remove "orphaningFinalizer" from its finalizers list
    err = gc.removeFinalizer(owner, metav1.FinalizerOrphanDependents)
    if err != nil {
        utilruntime.HandleError(fmt.Errorf("removeOrphanFinalizer for %s failed with %v", owner.identity, err))
        gc.attemptToOrphan.AddRateLimited(item)
    }
    return true
}

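The gc.orphanDependents(owner.identity, dependents) method iterates over the item's dependents and concurrently sends patch requests that remove from each dependent the ownerReference whose UID matches the item. Errors are written to the errCh channel, and at the end an aggregated error is returned to the caller: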

// dependents are copies of pointers to the owner's dependents, they don't need to be locked.
func (gc *GarbageCollector) orphanDependents(owner objectReference, dependents []*node) error {
    errCh := make(chan error, len(dependents))
    wg := sync.WaitGroup{}
    wg.Add(len(dependents))
    for i := range dependents {
        go func(dependent *node) {
            defer wg.Done()
            // the dependent.identity.UID is used as precondition
            patch := deleteOwnerRefStrategicMergePatch(dependent.identity.UID, owner.UID)
            _, err := gc.patch(dependent, patch, func(n *node) ([]byte, error) {
                return gc.deleteOwnerRefJSONMergePatch(n, owner.UID)
            })
            // note that if the target ownerReference doesn't exist in the
            // dependent, strategic merge patch will NOT return an error.
            if err != nil && !errors.IsNotFound(err) {
                errCh <- fmt.Errorf("orphaning %s failed, %v", dependent.identity, err)
            }
        }(dependents[i])
    }
    wg.Wait()
    close(errCh)

    var errorsSlice []error
    for e := range errCh {
        errorsSlice = append(errorsSlice, e)
    }

    if len(errorsSlice) != 0 {
        return fmt.Errorf("failed to orphan dependents of owner %s, got errors: %s", owner, utilerrors.NewAggregate(errorsSlice).Error())
    }
    klog.V(5).Infof("successfully updated all dependents of owner %s", owner)
    return nil
}
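The concurrency pattern used by orphanDependents (one goroutine per dependent, a buffered error channel, and a WaitGroup) can be tried in isolation. The sketch below is a standalone imitation using only the standard library; `patchAll` and `failOn` are hypothetical helpers, and the patch callback is a stand-in for the real API call.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// patchAll runs patch concurrently for every name and collects the errors.
// The channel is buffered with one slot per goroutine, so a failing
// goroutine never blocks on send even though nobody reads until the end.
func patchAll(names []string, patch func(string) error) []error {
	errCh := make(chan error, len(names))
	var wg sync.WaitGroup
	wg.Add(len(names))
	for _, name := range names {
		go func(name string) {
			defer wg.Done()
			if err := patch(name); err != nil {
				errCh <- fmt.Errorf("orphaning %s failed: %w", name, err)
			}
		}(name)
	}
	wg.Wait()
	close(errCh) // safe: all senders have finished

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errs
}

// failOn returns a fake patch function that fails for exactly one name.
func failOn(bad string) func(string) error {
	return func(name string) error {
		if name == bad {
			return errors.New("conflict")
		}
		return nil
	}
}

func main() {
	errs := patchAll([]string{"a", "b", "c"}, failOn("b"))
	fmt.Println("failures:", len(errs))
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Sizing the channel to the number of goroutines is what allows the function to wait for all of them first and drain the errors afterwards.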

The deleteOwnerRefStrategicMergePatch method builds the patch body. The same method is also called in the first switch case of the attemptToDelete processing loop.

func deleteOwnerRefStrategicMergePatch(dependentUID types.UID, ownerUIDs ...types.UID) []byte {
    var pieces []string
    // one delete directive per owner UID to remove
    for _, ownerUID := range ownerUIDs {
        pieces = append(pieces, fmt.Sprintf(`{"$patch":"delete","uid":"%s"}`, ownerUID))
    }
    // assemble the patch body
    patch := fmt.Sprintf(`{"metadata":{"ownerReferences":[%s],"uid":"%s"}}`, strings.Join(pieces, ","), dependentUID)
    return []byte(patch)
}
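To see what the resulting patch looks like, the function can be copied into a standalone program. In this sketch `buildPatch` is a renamed copy that takes plain strings instead of types.UID so that it compiles without the Kubernetes packages; the UID values are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// buildPatch follows the same string-building steps as
// deleteOwnerRefStrategicMergePatch, with string UIDs for illustration.
func buildPatch(dependentUID string, ownerUIDs ...string) string {
	var pieces []string
	for _, ownerUID := range ownerUIDs {
		pieces = append(pieces, fmt.Sprintf(`{"$patch":"delete","uid":"%s"}`, ownerUID))
	}
	return fmt.Sprintf(`{"metadata":{"ownerReferences":[%s],"uid":"%s"}}`, strings.Join(pieces, ","), dependentUID)
}

func main() {
	fmt.Println(buildPatch("dep-uid", "owner-a", "owner-b"))
	// Output:
	// {"metadata":{"ownerReferences":[{"$patch":"delete","uid":"owner-a"},{"$patch":"delete","uid":"owner-b"}],"uid":"dep-uid"}}
}
```

The top-level "uid" field carries the dependent's own UID; the comment in orphanDependents describes it as a precondition for the patch.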

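Back to the original motivation

After the redis middleware was containerized, a redis cluster deployed in the test environment was unexpectedly deleted after the Kubernetes apiserver restarted (including the redis exporter statefulset and the redis statefulset).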



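Locating the cause

The problem was reproduced several times in the development environment. After the apiserver restarted, the redis operator logs showed no sign that the operator had actively deleted the redis cluster (redis statefulset) or the monitoring instance (redis exporter). We then turned to the kube-controller-manager logs, set its log level to --v=5, reproduced the problem again, and eventually found the following entries in the kube-controller-manager log: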


[Screenshot: kube-controller-manager log output]

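As the log shows, when the garbage collector processes the redis exporter statefulset, it finds ownerReferences on it and looks up the owner, the redisCluster object redis-0826, in the exporter's namespace (monitoring). The redisCluster object redis-0826 actually lives in the kube-system namespace, so the lookup in monitoring returns 404 Not Found, and the garbage collector records the owner's UID in the absentOwnerCache as absent.
Because the owner of the redis exporter statefulset appears not to exist, the gc treats the statefulset as garbage and deletes it. Likewise, when it later processes the redis statefulset, it finds in the cache that the owner does not exist, so it collects and deletes that as well.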
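The effect of the namespace mismatch combined with the cache can be reproduced with a toy model. Everything in this sketch is hypothetical (the `collector` type, the in-memory `objects` map, the UID "uid-1"); it only imitates the two properties of isDangling that matter here: the owner is looked up in the dependent's namespace, and a miss is cached by UID.

```go
package main

import "fmt"

// key identifies a namespaced object in the toy API server.
type key struct{ Namespace, Name string }

// collector is a hypothetical miniature of the garbage collector state.
type collector struct {
	objects      map[key]string  // toy API server: object key -> UID
	absentOwners map[string]bool // toy absentOwnerCache, keyed by owner UID
}

// isDangling looks the owner up in the dependent's namespace and caches
// the owner UID as absent when the lookup fails or the UID differs.
func (c *collector) isDangling(dependentNamespace, ownerName, ownerUID string) bool {
	if c.absentOwners[ownerUID] {
		return true
	}
	uid, found := c.objects[key{Namespace: dependentNamespace, Name: ownerName}]
	if !found || uid != ownerUID {
		c.absentOwners[ownerUID] = true
		return true
	}
	return false
}

func newCollector() *collector {
	return &collector{
		objects:      map[key]string{{Namespace: "kube-system", Name: "redis-0826"}: "uid-1"},
		absentOwners: map[string]bool{},
	}
}

func main() {
	c := newCollector()
	// The exporter lives in "monitoring", so its owner is not found there.
	fmt.Println("exporter owner dangling:", c.isDangling("monitoring", "redis-0826", "uid-1"))
	// The redis statefulset lives next to its owner, but the cache now
	// claims the owner is absent.
	fmt.Println("redis owner dangling:", c.isDangling("kube-system", "redis-0826", "uid-1"))

	// With a fresh cache, the same-namespace lookup succeeds.
	fresh := newCollector()
	fmt.Println("redis owner dangling (fresh cache):", fresh.isDangling("kube-system", "redis-0826", "uid-1"))
}
```

The first two calls both report a dangling owner, while the same-namespace lookup against a fresh cache does not, which matches the explanation above of why the redis statefulset was deleted as well.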



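After reproducing the failure many times, we found that it reproduces with some probability when kube-controller-manager restarts. (When the apiserver restarts, kube-controller-manager also restarts itself after repeatedly failing to connect to the apiserver.) It is probabilistic because it depends on the order in which the garbage collector adds resource objects to the attemptToDelete queue: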


[Screenshot from the original post]

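If the exporter statefulset in the monitoring namespace is synced first and the redis statefulset in the kube-system namespace afterwards, the failure occurs; in the opposite order it does not. This depends on the order in which the garbage collector lists all resources in the cluster (listwatch) at startup.
The failure does not occur while the apiserver and kube-controller-manager are running normally; the reason can be seen in the following logic in the garbage collector source code: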


[Screenshots: garbage collector source code]

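The garbage collector maintains an owner/dependent relationship graph. When controller-manager starts, the nodes do not yet exist in this graph, so processing takes the first case of the switch shown above. Once the graph has been built, processing takes the second case, which only adds a resource object to the attemptToDelete queue when its owners change. That is why the failure does not occur while all components are running normally.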

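The graph is exposed at the endpoints below; the IP and port are those of controller-manager. The output can be redirected to a tmp.dot file (to be rendered with dot.exe):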

curl http://127.0.0.1:10252/debug/controllers/garbagecollector/graph

curl "http://127.0.0.1:10252/debug/controllers/garbagecollector/graph?uid=11211212edsaddkqedmk12"

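Then, using the Graphviz visualization tool, go to its bin directory and run the following command to generate an svg file that can be opened in a browser. See the Graphviz and dot documentation for usage details.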

dot -Tsvg -o graph2.svg tmp.dot

Solution

When the redis operator creates a redis cluster, place the exporter in the same namespace as redis.

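Reflections

1. The failure was mainly caused by a cross-namespace owner reference. When relying on garbage collection, follow the guidance on the official Kubernetes website as closely as possible.
As stated there, owner references are by design not allowed across namespaces, which means:

1) Namespace-scoped dependents can only specify owners in the same namespace, or cluster-scoped owners.

2) Cluster-scoped dependents can only specify cluster-scoped owners, not namespace-scoped owners.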
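An operator could guard against this mistake before setting an ownerReference. The check below is a minimal sketch of the two rules above; the `ref` type and `validOwner` function are hypothetical and not part of any Kubernetes library, and an empty namespace is used here to mean "cluster-scoped".

```go
package main

import "fmt"

// ref is a hypothetical reference to an object; an empty Namespace means
// the object is cluster-scoped.
type ref struct{ Namespace, Name string }

// validOwner reports whether dependent may name owner in its
// ownerReferences according to the two scoping rules.
func validOwner(dependent, owner ref) bool {
	if owner.Namespace == "" {
		// Cluster-scoped owners may be referenced by any dependent.
		return true
	}
	// A namespaced owner requires a dependent in the same namespace; a
	// cluster-scoped dependent (empty namespace) never matches.
	return dependent.Namespace == owner.Namespace
}

func main() {
	redisCluster := ref{Namespace: "kube-system", Name: "redis-0826"}
	exporter := ref{Namespace: "monitoring", Name: "redis-exporter"}
	redisSts := ref{Namespace: "kube-system", Name: "redis"}

	fmt.Println("exporter -> redisCluster valid:", validOwner(exporter, redisCluster))
	fmt.Println("redis statefulset -> redisCluster valid:", validOwner(redisSts, redisCluster))
}
```

With the names from the incident, the exporter in monitoring fails the check while the redis statefulset in kube-system passes it.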



References

Official garbage collection documentation:

https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/

A detailed look at how the Kubernetes garbage collector is implemented:

https://draveness.me/kubernetes-garbage-collector#


