Related: what to do when Pods show CrashLoopBackOff after Calico is deployed?
Link: http://www.reibang.com/p/87a01ec9964c
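Environment preparation
As the title suggests, we need a clean Kubernetes cluster before we start. "Clean" here means a cluster that no network plugin has touched yet. I prepared the following three nodes: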
IP            Role            OS
10.0.1.111    Master, Node    Ubuntu 18.04
10.0.1.112    Master, Node    Ubuntu 18.04
10.0.1.113    Node            Ubuntu 18.04
All nodes run Ubuntu 18.04. I use an Ansible Role to quickly create a clean Kubernetes cluster from binaries.
Deploying Kubernetes from binaries with Ansible
Link: http://www.reibang.com/p/85edca636ddc
Edit the host inventory hosts.yaml as follows:
all:
  vars:
    ansible_user: root
    ansible_ssh_pass: root1234
    ansible_sudo_pass: root1234
    is_mutil_master: yes
    virtual_ip: 10.0.1.110
    virtual_ip_device: ens33
    service_net: 10.0.0.0/24
    pod_net: 10.244.0.0/16
    proxy_master_port: 7443
    install_dir: /opt/apps/
    package_dir: /opt/packages/
    tls_dir: /opt/k8s_tls
    ntp_host: ntp1.aliyun.com
    have_network: yes
    replace_repo: yes
    docker_registry_mirrors: https://7hsct51i.mirror.aliyuncs.com
    kubelet_bootstrap_token: 8fba966b6e3b5d182960a30f6cb94428
    pause_image: registry.cn-shenzhen.aliyuncs.com/zze/pause:3.2
    dashboard_port: 30001
    dashboard_token_file: dashboard_token.txt
    ingress_controller_type: nginx
  hosts:
    10.0.1.111:
      hostname: k8s-master1
      master: yes
      node: yes
      etcd: yes
      proxy_master: yes
      proxy_priority: 110
    10.0.1.112:
      hostname: k8s-master2
      master: yes
      node: yes
      etcd: yes
      proxy_master: yes
      proxy_priority: 100
    10.0.1.113:
      hostname: k8s-node1
      etcd: yes
      node: yes
      ingress: yes
This configuration builds a three-node Kubernetes cluster with 2 Masters and 3 Nodes. Run the Playbook:
$ ansible-playbook -i hosts.yml run.yml --skip-tag=deploy_manifests
...
TASK [start_service : Sign the certificates requested by Kubelet - sign certificates (2/2)] ***************************************************************************************************************************************************************************
skipping: [10.0.1.112]
skipping: [10.0.1.113]
changed: [10.0.1.111]
PLAY RECAP *********************************************************************************************************************************************************************************************************************
10.0.1.111 : ok=88 changed=46 unreachable=0 failed=0 skipped=20 rescued=0 ignored=0
10.0.1.112 : ok=68 changed=32 unreachable=0 failed=0 skipped=15 rescued=0 ignored=0
10.0.1.113 : ok=48 changed=20 unreachable=0 failed=0 skipped=35 rescued=0 ignored=0
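By default, this Ansible Role also installs the network plugin, CoreDNS, Dashboard and other add-ons after deploying the basic Kubernetes components. The --skip-tag=deploy_manifests option used above skips those steps.
Because this Ansible Role uses Flannel as its CNI plugin by default, some Flannel binaries are preinstalled. Remove them on every node: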
$ rm -f /opt/apps/cni/bin/*
At this point a clean Kubernetes cluster is in place. Its nodes look like this:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master1 NotReady <none> 38m v1.19.3
k8s-master2 NotReady <none> 38m v1.19.3
k8s-node1 NotReady <none> 38m v1.19.3
Because no network plugin is installed yet, the nodes are in the NotReady state. They will become Ready once the Calico deployment steps below are complete.
Calico deployment
Download the manifest from the official site:
$ wget https://docs.projectcalico.org/manifests/calico-etcd.yaml
Only the modified parts are listed below:
$ vim calico-etcd.yaml
...
  # Content wrapped in backticks means: run the command and paste its output here
  # etcd certificate private key
  etcd-key: `cat /opt/k8s_tls/etcd/server-key.pem | base64 -w 0`
  # etcd certificate
  etcd-cert: `cat /opt/k8s_tls/etcd/server.pem | base64 -w 0`
  # etcd CA certificate
  etcd-ca: `cat /opt/k8s_tls/etcd/ca.pem | base64 -w 0`
...
  # etcd cluster endpoints
  etcd_endpoints: "https://10.0.1.111:2379,https://10.0.1.112:2379,https://10.0.1.113:2379"
  etcd_ca: "/calico-secrets/etcd-ca"
  etcd_cert: "/calico-secrets/etcd-cert"
  etcd_key: "/calico-secrets/etcd-key"
...
            # Disable IPIP mode
            - name: CALICO_IPV4POOL_IPIP
              value: "Never"
            # Set the Pod IP range; this value should match the pod_net variable in hosts.yaml
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
...
        # Change the host directory that receives the CNI plugin binaries; /opt/apps matches install_dir in hosts.yaml
        - name: cni-bin-dir
          hostPath:
            path: /opt/apps/cni/bin
        # Change the CNI configuration directory to the manually specified one; /opt/apps matches install_dir in hosts.yaml
        - name: cni-net-dir
          hostPath:
            path: /opt/apps/cni/conf
        # Change the CNI log directory to the manually specified one; /opt/apps matches install_dir in hosts.yaml
        - name: cni-log-dir
          hostPath:
            path: /opt/apps/cni/log
        # Change the mount mode of this volume to 0440; it appears in two places
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0440
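The backtick substitutions for the three certificate fields can also be scripted instead of pasted by hand. The following is a minimal sketch, not part of the original procedure. It assumes the downloaded manifest still contains the commented placeholder lines `# etcd-key: null`, `# etcd-cert: null` and `# etcd-ca: null` in its Secret, and that GNU `base64` and `sed` are available. `fill_etcd_secret` is a hypothetical helper name.

```shell
# Sketch: inject base64-encoded etcd TLS material into calico-etcd.yaml.
# Assumption: the manifest contains the placeholder lines
#   "# etcd-key: null", "# etcd-cert: null", "# etcd-ca: null".
# fill_etcd_secret is a hypothetical helper, not something Calico ships.
fill_etcd_secret() {
  local manifest="$1" tls_dir="$2"
  local key cert ca
  # Encode each PEM file on a single line, as the Secret's data fields require.
  key=$(base64 -w 0 < "${tls_dir}/server-key.pem") || return 1
  cert=$(base64 -w 0 < "${tls_dir}/server.pem") || return 1
  ca=$(base64 -w 0 < "${tls_dir}/ca.pem") || return 1
  # base64 output never contains '|' or '&', so '|' is a safe sed delimiter.
  sed -i \
    -e "s|# etcd-key: null|etcd-key: ${key}|" \
    -e "s|# etcd-cert: null|etcd-cert: ${cert}|" \
    -e "s|# etcd-ca: null|etcd-ca: ${ca}|" \
    "$manifest"
}
```

With the paths used in this article, the call would be `fill_etcd_secret calico-etcd.yaml /opt/k8s_tls/etcd`. If the placeholders are worded differently in your copy of the manifest, the substitutions silently match nothing, so check the Secret afterwards.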
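The images referenced by this manifest are hosted abroad, so I downloaded them and pushed them to an Aliyun registry. Run the following to replace the image references: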
$ sed -i 's#docker.io/calico/cni:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-cni:v3.18.0#g;s#docker.io/calico/pod2daemon-flexvol:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-pod2daemon-flexvol:v3.18.0#g;s#docker.io/calico/node:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-node:v3.18.0#g;s#docker.io/calico/kube-controllers:v3.18.0#registry.cn-shenzhen.aliyuncs.com/zze/calico-kube-controllers:v3.18.0#g' calico-etcd.yaml
Note: at the time of writing, the images in the downloaded YAML are at version v3.18.0.
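Since the sed expression only matches the exact v3.18.0 image names, it helps to confirm which images the manifest references before and after the replacement. A small sketch follows; `list_images` is a hypothetical helper that only reads the file.

```shell
# Sketch: print each distinct image referenced in a manifest, one per line.
# list_images is a hypothetical helper name used for illustration.
list_images() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:' "$1" \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//' \
    | sort -u
}
```

Running `list_images calico-etcd.yaml` after the sed step should show no remaining docker.io entries. If any do remain, the downloaded manifest most likely carries a different version tag than the one in the sed expression.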
Apply the modified manifest:
$ kubectl apply -f calico-etcd.yaml
secret/calico-etcd-secrets created
configmap/calico-config created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
poddisruptionbudget.policy/calico-kube-controllers created
After a short wait, the following Pods start in the kube-system namespace:
$ kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-79678fdb96-5w4kl 1/1 Running 0 16m 10.0.1.111 k8s-master1 <none> <none>
calico-node-hsm8s 1/1 Running 0 16m 10.0.1.112 k8s-master2 <none> <none>
calico-node-qnm9r 1/1 Running 0 16m 10.0.1.113 k8s-node1 <none> <none>
calico-node-t4cjq 1/1 Running 0 16m 10.0.1.111 k8s-master1 <none> <none>
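Rather than polling by hand, the wait can be scripted. The sketch below uses `kubectl rollout status` and `kubectl wait`, and takes the resource names from the apply output above. `wait_for_calico` and the default timeout of 180s are assumptions made for illustration.

```shell
# Sketch: block until Calico is rolled out and all nodes report Ready.
# wait_for_calico and the 180s default are illustrative choices.
wait_for_calico() {
  local timeout="${1:-180s}"
  # DaemonSet and Deployment names come from the kubectl apply output.
  kubectl -n kube-system rollout status daemonset/calico-node --timeout="$timeout" &&
    kubectl -n kube-system rollout status deployment/calico-kube-controllers --timeout="$timeout" &&
    kubectl wait --for=condition=Ready node --all --timeout="$timeout"
}
```

Calling `wait_for_calico 300s` returns non-zero if any of the three checks times out, which makes it usable as a gate in a larger script.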
To test cross-host Pod communication, apply the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: test
  name: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: test
  strategy: {}
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - image: busybox:latest
        command: ['sleep','3000']
        name: busybox
Once applied, the following three Pods are created:
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-c4f594994-nl2ks 1/1 Running 0 2m49s 10.244.0.5 k8s-node1 <none> <none>
test-c4f594994-s48pl 1/1 Running 0 2m49s 10.244.2.2 k8s-master1 <none> <none>
test-c4f594994-wv6nj 1/1 Running 0 2m49s 10.244.1.2 k8s-master2 <none> <none>
You can also inspect the routes managed by Calico on each Node.
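One way to look at those routes is sketched below. It assumes Calico's BIRD daemon is programming them, in which case they carry the `bird` protocol label in the kernel routing table. It also assumes key-based SSH access as root to each node. `show_calico_routes` is a hypothetical helper name.

```shell
# Sketch: print the BIRD-installed routes on each node passed as an argument.
# Assumptions: routes are labelled "proto bird"; root SSH works without a prompt.
show_calico_routes() {
  local node
  for node in "$@"; do
    echo "== ${node} =="
    ssh "root@${node}" ip route show proto bird
  done
}
```

For the cluster in this article the call would be `show_calico_routes 10.0.1.111 10.0.1.112 10.0.1.113`. With IPIP disabled, routes to other nodes' Pod blocks are expected to point at the peer node's address over the physical interface rather than a tunnel device.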
Enter any one of the Pods and ping the other two:
$ kubectl exec -it test-c4f594994-nl2ks -- sh
/ # ping 10.244.2.2
PING 10.244.2.2 (10.244.2.2): 56 data bytes
64 bytes from 10.244.2.2: seq=0 ttl=62 time=0.474 ms
^C
--- 10.244.2.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.474/0.474/0.474 ms
/ # ping 10.244.1.2
PING 10.244.1.2 (10.244.1.2): 56 data bytes
64 bytes from 10.244.1.2: seq=0 ttl=62 time=0.321 ms
^C
--- 10.244.1.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.321/0.321/0.321 ms
The Pods can reach each other, which shows that Calico is working properly in the Kubernetes cluster.
Official reference: https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises
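The manual ping check can be repeated for every test Pod with a short loop. This sketch looks up the Pod IPs through the `app=test` label from the Deployment above and pings each one from a Pod you name. `ping_test_pods` is a hypothetical helper, and the `-c 1 -W 2` options are an illustrative choice.

```shell
# Sketch: ping every app=test Pod IP from one source Pod and report the result.
# ping_test_pods is a hypothetical helper; it returns 1 if any ping fails.
ping_test_pods() {
  local src="$1" ips ip failed=0
  ips=$(kubectl get pod -l app=test -o jsonpath='{.items[*].status.podIP}') || return 1
  for ip in $ips; do
    # One echo request with a 2 second timeout, run inside the source Pod.
    if kubectl exec "$src" -- ping -c 1 -W 2 "$ip" >/dev/null; then
      echo "ok   ${ip}"
    else
      echo "FAIL ${ip}"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `ping_test_pods test-c4f594994-nl2ks` pings all three Pod IPs, including the source Pod's own address.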
Since finishing this article, I have extended the Ansible Role used above to support Calico. If you want Calico in your Kubernetes cluster, you can deploy it in one step with my Ansible Role. Link: http://www.reibang.com/p/85edca636ddc