1. Download the metrics-server source code
git clone https://github.com/kubernetes-incubator/metrics-server.git
2. Check which images it depends on
$ cd metrics-server/deploy/1.8+
$ grep 'image:' *
metrics-server-deployment.yaml: image: k8s.gcr.io/metrics-server-amd64:v0.3.3
If the gcr.io image cannot be pulled, replace the image in metrics-server-deployment.yaml with registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3:
$ sed -i 's#image: .*#image: registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3#' metrics-server-deployment.yaml
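The substitution can be tried on a scratch copy before touching the real manifest. This is a minimal sketch that assumes GNU sed (`-i` without a backup suffix). The `replace_image` helper and the temporary file are illustrative only and are not part of metrics-server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite every "image:" line in a manifest.
# Only the text from "image:" onward is replaced, so the YAML indentation is kept.
# '#' is used as the sed delimiter so slashes in the image name need no escaping.
replace_image() {
  local file="$1" image="$2"
  sed -i "s#image: .*#image: ${image}#" "$file"
}

# Scratch file that mimics the relevant lines of metrics-server-deployment.yaml.
manifest="$(mktemp)"
printf '      - name: metrics-server\n        image: k8s.gcr.io/metrics-server-amd64:v0.3.3\n' > "$manifest"

replace_image "$manifest" "registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3"

# Show the rewritten line for a visual check.
grep 'image:' "$manifest"
```

Running `grep 'image:' *` again in deploy/1.8+ after the real edit gives the same kind of check on the actual file.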
3. Install metrics-server
$ cd ../..   # back to the metrics-server repository root (step 2 left the shell in deploy/1.8+)
$ kubectl create -f deploy/1.8+/
After a short while, the metrics-server pod is up and running:
$ kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-54957b58f4-dnntx 1/1 Running 0 21s
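Instead of polling `get pods` by hand, the wait can be scripted with `kubectl rollout status`. This is a sketch only: the `wait_for_metrics_server` name and the `KUBECTL` override variable are assumptions made for illustration, and the Deployment name `metrics-server` is inferred from the pod name above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block until the metrics-server Deployment has rolled out,
# or fail once the timeout (first argument, default 120s) expires.
# KUBECTL is an illustrative override so a wrapper or stub can be substituted.
wait_for_metrics_server() {
  "${KUBECTL:-kubectl}" -n kube-system rollout status deployment/metrics-server --timeout="${1:-120s}"
}

# Usage (needs access to the cluster):
#   wait_for_metrics_server 60s
```

A successful rollout only means the pod is ready. It does not prove that metrics are being collected, which is what the next step checks.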
4. Verify that the installation succeeded
$ kubectl top node
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
The output above shows that metrics-server is not working yet, even though its pod is Running. Check the metrics-server logs:
$ kubectl logs metrics-server-54957b58f4-dnntx -n kube-system
E1005 11:58:15.654250 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:mesos-test2: unable to fetch metrics from Kubelet mesos-test2 (mesos-test2): Get https://mesos-test2:10250/stats/summary/: dial tcp: lookup mesos-test2 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:k8s-slave20: unable to fetch metrics from Kubelet k8s-slave20 (k8s-slave20): Get https://k8s-slave20:10250/stats/summary/: dial tcp: lookup k8s-slave20 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:mesos-test1: unable to fetch metrics from Kubelet mesos-test1 (mesos-test1): Get https://mesos-test1:10250/stats/summary/: dial tcp: lookup mesos-test1 on 10.96.0.10:53: no such host]
The log shows that metrics-server addresses each kubelet on port 10250 by hostname. This is a standalone Kubernetes demo environment in which the node names were only added to the nodes' /etc/hosts files and there is no internal DNS server, so the DNS server that the metrics-server pod queries (10.96.0.10:53 in the log) cannot resolve them.
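To see at a glance which nodes are affected, the failing hostnames can be pulled out of the log. The sketch below relies only on the "lookup <host> on <server>: no such host" wording visible in the log line above. The `unresolved_hosts` function name is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read metrics-server log text on stdin and print each
# hostname that failed DNS resolution, one per line, de-duplicated.
unresolved_hosts() {
  grep -o 'lookup [^ ]* on [^ ]*: no such host' | awk '{print $2}' | sort -u
}

# Usage (needs access to the cluster):
#   kubectl logs metrics-server-54957b58f4-dnntx -n kube-system | unresolved_hosts
```

`kubectl get nodes -o wide` lists the INTERNAL-IP of each node, which is the address metrics-server can use instead of the hostname.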
Solution:
- Delete the metrics-server pod
kubectl delete pods metrics-server-54957b58f4-dnntx -n kube-system
- Edit metrics-server-deployment.yaml, add the command configuration below, and then redeploy metrics-server.
imagePullPolicy: Always
command:
- /metrics-server
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
volumeMounts:
- name: tmp-dir
  mountPath: /tmp
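The redeploy step is not spelled out above, so here is one possible way to do it, offered as a sketch. It first checks that both flags made it into the manifest and then uses `kubectl apply` to update the existing Deployment. The function names and the `KUBECTL` override variable are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the manifest contains both kubelet flags.
has_kubelet_flags() {
  grep -q -- '--kubelet-preferred-address-types=InternalIP' "$1" &&
    grep -q -- '--kubelet-insecure-tls' "$1"
}

# Hypothetical redeploy wrapper: refuse to apply a manifest that lacks the flags.
# KUBECTL is an illustrative override so a wrapper or stub can be substituted.
redeploy_metrics_server() {
  has_kubelet_flags "$1" || { echo "kubelet flags missing in $1" >&2; return 1; }
  "${KUBECTL:-kubectl}" apply -f "$1"
}

# Usage (from the repository root, needs access to the cluster):
#   redeploy_metrics_server deploy/1.8+/metrics-server-deployment.yaml
#   kubectl top node
```

Note that `--kubelet-insecure-tls` turns off verification of the kubelet serving certificates. That is acceptable for a demo environment like this one but is worth revisiting on a production cluster.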