前言
需求如下鼻种,Kuberneters 中部署的pod訪問Cluster集群外的server(后續(xù)成為remote server)時缕探,期望可以在remote server中查看到pod的源IP。
環(huán)境如下
- kubernetes: v1.12.5
- calico: v3.7.5
- OS: Redhat7.6
工作原理
本次配置中Kubernetes集群搭建中選用的網(wǎng)絡插件為cailico搅轿。pod在網(wǎng)訪問remote server時助泽,會在node節(jié)點先匹配策略抖拦,將pod的源IP 經(jīng)過SNAT映射為node節(jié)點的物理IP 拗踢,此時remote server看到的請求IP則是node節(jié)點的IP券膀,無法獲取到pod的源IP芹彬。
所以需要在calico中添加配置,設置nat-outgoing參數(shù)為false,pod在對外訪問時不做nat映射译红,通過邊界路由實現(xiàn)訪問remote server刨沦。(注:需要在remote server上添加訪問pod的訪問路由梧田,否則remote server無法回包)
具體配置如下
calico配置文件calico.yaml中,kind: DaemonSet 的配置添加
- name: CALICO_IPV4POOL_NAT_OUTGOING
value: "false"
具體如下:
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
annotations:
# This, along with the CriticalAddonsOnly toleration below,
# marks the pod as a critical add-on, ensuring it gets
# priority scheduling and that its resources are reserved
# if it ever gets evicted.
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
initContainers:
# This container performs upgrade from host-local IPAM to calico-ipam.
# It can be deleted if this is a fresh installation, or if you have already
# upgraded to use calico-ipam.
- name: upgrade-ipam
image: calico/cni:v3.7.5
command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
env:
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
volumeMounts:
- mountPath: /var/lib/cni/networks
name: host-local-net-dir
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: calico/cni:v3.7.5
command: ["/install-cni.sh"]
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# Set the hostname based on the k8s node name.
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: calico/node:v3.7.5
env:
# Use Kubernetes API as the backing datastore.
- name: DATASTORE_TYPE
value: "kubernetes"
# Wait for the datastore.
- name: WAIT_FOR_DATASTORE
value: "true"
# Set based on the k8s node name.
- name: NODENAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "off"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
value: "10.96.0.0/16"
- name: CALICO_IPV4POOL_NAT_OUTGOING
value: "false"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
# Set Felix logging to "info"
- name: FELIX_LOGSEVERITYSCREEN
value: "info"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
livenessProbe:
httpGet:
path: /liveness
port: 9099
host: localhost
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
readinessProbe:
exec:
command:
- /bin/calico-node
- -bird-ready
- -felix-ready
periodSeconds: 10
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Mount in the directory for host-local IPAM allocations. This is
# used when upgrading from host-local to calico-ipam, and can be removed
# if not using the upgrade-ipam init container.
- name: host-local-net-dir
hostPath:
path: /var/lib/cni/networks
啟動calico后用node訪問remote server可看到pod的源IP
實現(xiàn)效果
Pod IP 內網(wǎng)地址:10.96.164.131
node IP : 10.0.5.203
remote server: 10.0.5.204
remote server 未在kubernetes集群中
-
pod訪問遠程服務器10.0.5.204的8081,地址不可達,因為remote server上未添加返回pod的路由
-
remote server 10.0.5.204上添加路由
含義為10.0.5.204訪問10.90.164.131這個pod時,由于網(wǎng)段不同信粮,使用10.0.5.203作為網(wǎng)關轉發(fā)访娶。
-
再次從pod發(fā)送訪問請求,可查看訪問成功
-
remote server tcpdump查看pod源IP,可查看到源IP為10.90.164.131
Tips
- 路由策略可以添加直接主機的乱顾,也可以添加指向子網(wǎng)的
route add -net 10.0.X.0/24 gw 10.0.X.1
- 啟動pod時間,默認dns策略會自動加載node節(jié)點的/etc/resolv.conf中的search域扭粱。由于nat已經(jīng)被禁用,原來node可訪問的dns search域則會在pod中不可達背伴。造成pod查詢域超時徐紧。建議在kubernetes搭建時配置kubelet不繼承node節(jié)點的/etc/resolv.conf