Cluster Planning
Operating System Requirements
IBM POWER9: RHEL-ALT 7.5 with the "Minimal" installation option and the latest packages from the Extras channel.
IBM POWER8: RHEL 7.5 with the "Minimal" installation option and the latest packages from the Extras channel.
Masters:
Minimum 4 vCPU (additional are strongly recommended).
Minimum 16 GB RAM (additional memory is strongly recommended, especially if etcd is co-located on masters).
Minimum 40 GB hard disk space for the file system containing /var/.
Minimum 1 GB hard disk space for the file system containing /usr/local/bin/.
Minimum 1 GB hard disk space for the file system containing the system's temporary directory.
Masters with a co-located etcd require a minimum of 4 cores. Two-core systems do not work.
Nodes:
NetworkManager 1.0 or later.
1 vCPU.
Minimum 8 GB RAM.
Minimum 15 GB hard disk space for the file system containing /var/.
Minimum 1 GB hard disk space for the file system containing /usr/local/bin/.
Minimum 1 GB hard disk space for the file system containing the system's temporary directory.
An additional minimum 15 GB unallocated space per system running containers for Docker's storage back end; see Configuring Docker Storage. Additional space might be required, depending on the size and number of containers that run on the node.
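For example, a minimal docker-storage-setup configuration that points Docker at that unallocated space; /dev/vdb is a placeholder here, substitute whatever spare block device each node actually has:

# /etc/sysconfig/docker-storage-setup
DEVS=/dev/vdb
VG=docker-vg

Running docker-storage-setup after Docker is installed (step 7) then creates the volume group and, by default, a devicemapper thin pool on that device.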
Lab Cluster
Master: 172.XX.XX.175
Nodes:  172.XX.XX.182
        172.XX.XX.183
Procedure
1 Enable Security-Enhanced Linux (SELinux) on all of the nodes
        a. vi /etc/selinux/config
                set SELINUX=enforcing and SELINUXTYPE=targeted
        b. touch /.autorelabel; reboot
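After the relabel and reboot, you can confirm SELinux is enforcing on each node:
# getenforce
Enforcing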
2 Ensure host access
Set up passwordless SSH login from the master to each node.
2.1 Generate an SSH key on the host you run the installation playbook on:
# ssh-keygen
2.2 Distribute the key to the other cluster hosts. You can use a bash loop:
# for host in master.openshift.example.com \
    node1.openshift.example.com \
    node2.openshift.example.com; \
    do ssh-copy-id -i ~/.ssh/id_rsa.pub $host; \
    done
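A quick sanity check: each ssh below should print the remote hostname without prompting for a password:
# for host in master.openshift.example.com \
    node1.openshift.example.com \
    node2.openshift.example.com; \
    do ssh $host hostname; \
    done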
3 Update the NIC configuration
        In /etc/sysconfig/network-scripts/ifcfg-ethxx:
                a. Make sure that: NM_CONTROLLED=yes
                b. Add the following entries:
                        DNS1=
                        DNS2=
                        DOMAIN=
                (You can get the DNS values from /etc/sysconfig/network-scripts/ifcfg-bootnet and /etc/resolv.conf.)
                If neither file has a value, set DNS1= to the machine's own IP address.
                (You can get the DOMAIN value with this command: domainname -d)
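Putting it together, the relevant part of the ifcfg file looks like this; the DNS addresses and domain below are hypothetical illustrations, substitute the values you recovered above:

NM_CONTROLLED=yes
DNS1=192.0.2.53
DNS2=192.0.2.54
DOMAIN=openshift.example.com

Restart networking (systemctl restart network) for the change to take effect.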
4 Configure /etc/hosts on every machine
[root@node1 network-scripts]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.xx.xx.175  master.openshift.example.com
172.xx.xx.182  node1.openshift.example.com
172.xx.xx.183  node2.openshift.example.com
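A quick check that every entry resolves as expected (getent reads /etc/hosts):
# for host in master node1 node2; do getent hosts $host.openshift.example.com; done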
5 Configure a yum proxy
If the machines cannot reach the Internet directly, configure a proxy server:
vi /etc/yum.conf
set proxy=http://xx.xx.xx.xx:xxxx
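A minimal sketch of the resulting /etc/yum.conf, with a placeholder proxy address; yum also supports proxy_username and proxy_password if the proxy requires authentication:

[main]
proxy=http://proxy.example.com:3128
#proxy_username=<user>
#proxy_password=<password>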
6 Register hosts (requires a Red Hat subscription)
Run on every machine:
# subscription-manager register --username=<user_name> --password=<password>
# subscription-manager refresh
# subscription-manager list --available --matches '*OpenShift*'
# subscription-manager attach --pool=<pool_id>
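To confirm the subscription attached correctly:
# subscription-manager status
# subscription-manager list --consumed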
6.1 Enable the yum repositories
For on-premise installations on IBM POWER8 servers, run the following command:
# subscription-manager repos \
--enable="rhel-7-for-power-le-rpms" \
--enable="rhel-7-for-power-le-extras-rpms" \
--enable="rhel-7-for-power-le-optional-rpms" \
--enable="rhel-7-server-ansible-2.6-for-power-le-rpms" \
--enable="rhel-7-for-power-le-ose-3.11-rpms" \
--enable="rhel-7-for-power-le-fast-datapath-rpms" \
--enable="rhel-7-server-for-power-le-rhscl-rpms"
For on-premise installations on IBM POWER9 servers, run the following command:
# subscription-manager repos \
    --enable="rhel-7-for-power-9-rpms" \
    --enable="rhel-7-for-power-9-extras-rpms" \
    --enable="rhel-7-for-power-9-optional-rpms" \
    --enable="rhel-7-server-ansible-2.6-for-power-9-rpms" \
    --enable="rhel-7-server-for-power-9-rhscl-rpms" \
    --enable="rhel-7-for-power-9-ose-3.11-rpms"
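As a sanity check, list the enabled repositories and confirm they match the set above:
# yum repolist enabled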
7 Install base packages
7.1 Run on every machine
# yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct
# yum -y update
# reboot
# yum install atomic-openshift-excluder-3.11.141*
Now install a container engine:
To install CRI-O:
# yum -y install cri-o
To install Docker:
# yum -y install docker
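Whichever engine you choose, enable and start it before running the installation playbooks (shown for Docker; use crio if you installed CRI-O):
# systemctl enable docker
# systemctl start docker
# docker info | grep -i 'storage driver'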
7.2 Run on the master
# yum -y install openshift-ansible
# yum install atomic-openshift atomic-openshift-clients atomic-openshift-hyperkube atomic-openshift-node flannel glusterfs-fuse  (optional; this command can be skipped)
# yum install cockpit-docker cockpit-kubernetes
7.3 Run on the nodes
# yum install atomic-openshift atomic-openshift-node flannel glusterfs-fuse  (optional; this command can be skipped)
8 Install OpenShift (run on the master node)
8.1 Pre-installation checks
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> playbooks/prerequisites.yml
8.2 Run the installation
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> playbooks/deploy_cluster.yml
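Once deploy_cluster.yml finishes, a quick way to confirm the cluster came up is to check that every node registers as Ready:
# oc get nodes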
9 Example inventory_file (1 master + 2 nodes)
[root@master openshift-ansible]# ls
ansible.cfg? host.311? inventory? playbooks? roles
[root@master openshift-ansible]# cat host.311
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd
# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root
openshift_deployment_type=openshift-enterprise
# If ansible_ssh_user is not root, ansible_become must be set to true
#ansible_become=true
openshift_master_default_subdomain=master.openshift.example.com
debug_level=2
# default selectors for router and registry services
# openshift_router_selector='node-role.kubernetes.io/infra=true'
# openshift_registry_selector='node-role.kubernetes.io/infra=true'
# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_htpasswd_users={'my-rhel-icp-admin': '$apr1$6eO/grkf$9jRafb0tw/2KQEAejT8Lc.'}
# supposedly encrypted password of: S3cure-icp-wordP*s?
openshift_disable_check=memory_availability,disk_availability,docker_image_availability
openshift_master_cluster_hostname=master.openshift.example.com
openshift_master_cluster_public_hostname=master.openshift.example.com
# false
#ansible_service_broker_install=false
#openshift_enable_service_catalog=false
#template_service_broker_install=false
#openshift_logging_install_logging=false
# registry passwd
oreg_url=registry.redhat.io/openshift3/ose-${component}:${version}
oreg_auth_user=****@xxx
oreg_auth_password=*******
openshift_http_proxy=http://xxx.xxx.xxx.xxx:3130
#openshift_https_proxy=https://xx.xxx.xxx.xxx:3130
openshift_no_proxy=".openshift.example.com"
# docker config
openshift_docker_additional_registries=registry.redhat.io
#openshift_docker_insecure_registries
#openshift_docker_blocked_registries
openshift_docker_options="--log-driver json-file --log-opt max-size=1M --log-opt max-file=3"
# openshift_cluster_monitoring_operator_install=false
# openshift_metrics_install_metrics=true
# openshift_enable_unsupported_configurations=True
#openshift_logging_es_nodeselector='node-role.kubernetes.io/infra: "true"'
#openshift_logging_kibana_nodeselector='node-role.kubernetes.io/infra: "true"'
# host group for masters
[masters]
master.openshift.example.com  openshift_public_hostname="master.openshift.example.com"
# host group for etcd
[etcd]
master.openshift.example.com  openshift_public_hostname="master.openshift.example.com"
# host group for nodes, includes region info
[nodes]
master.openshift.example.com openshift_public_hostname="master.openshift.example.com" openshift_node_group_name='node-config-master-infra'
node[1:2].openshift.example.com openshift_public_hostname="node[1:2].openshift.example.com" openshift_node_group_name='node-config-compute'
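The hash used in openshift_master_htpasswd_users above can be generated with htpasswd from the httpd-tools package; the user name below matches the inventory, and '<password>' is whatever password you choose:
# yum -y install httpd-tools
# htpasswd -nb my-rhel-icp-admin '<password>'
my-rhel-icp-admin:$apr1$...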
10 Errors that may occur during installation
10.1 If openshift_cluster_monitoring_operator_install is enabled, the master must be given openshift_node_group_name='node-config-master-infra'.
See https://github.com/vorburger/opendaylight-coe-kubernetes-openshift/issues/5
10.2 When a proxy is configured, no_proxy must be set as well.
See https://github.com/openshift/openshift-ansible/issues/11365
10.3 https://github.com/openshift/openshift-ansible/issues/10427
10.3.1 "FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created" (#10427)
The file /etc/sysconfig/network-scripts/ifcfg-eth0 (CentOS) contains the flag NM_CONTROLLED=no.
10.3.2 "FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created" (#10427)
I had the same issue; what I did was:
Add NM_CONTROLLED=yes to ifcfg-eth0 on all my nodes.
Verify my pods with $ oc get pods --all-namespaces
$ oc describe pod cluster-monitoring-operator-WXYZ-ASDF -n openshift-monitoring ==> near the end of the output I could see the reason the pod did not start:
Warning? FailedCreatePodSandBox? 1h? ? ? ? ? ? ? ? ? kubelet, infra-openshift-nuuptech? Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "70719b9ee2bb9c54fc1d866a6134b229b3c1c151148c9558ea0a4ef8cb66526a" network for pod "cluster-monitoring-operator-67579f5cb5-gxmwc": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-67579f5cb5-gxmwc_openshift-monitoring" network:failed to find plugin "bridge" in path [/opt/cni/bin], failed to clean up sandbox container "70719b9ee2bb9c54fc1d866a6134b229b3c1c151148c9558ea0a4ef8cb66526a" network for pod "cluster-monitoring-operator-67579f5cb5-gxmwc": NetworkPlugin cni failed to teardown pod "cluster-monitoring-operator-67579f5cb5-gxmwc_openshift-monitoring" network: failed to find plugin "bridge" in path [/opt/cni/bin]]
Searching for the highlighted error (failed to find plugin "bridge" in path [/opt/cni/bin]) leads to the following solution:
$ ls -l /etc/cni/net.d ==> Normally the only file should be 80-openshift-network.conf, but I had 3 files:
$ ls -l /etc/cni/net.d
-rw-r--r--. 1 root root 294 Mar 12 16:46 100-crio-bridge.conf
-rw-r--r--. 1 root root? 54 Mar 12 16:46 200-loopback.conf
-rw-r--r--. 1 root root? 83 May 15 16:15 80-openshift-network.conf
Red Hat suggests deleting the extra files and keeping only 80-openshift-network.conf, but I only moved 100-crio-bridge.conf and 200-loopback.conf to another directory. After doing that, I rebooted all my nodes, ran playbooks/openshift-monitoring/config.yml again on the master node, and it worked.
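The workaround as a sketch, using the file names from the listing above (the backup directory is arbitrary):
# mkdir -p /root/cni-backup
# mv /etc/cni/net.d/100-crio-bridge.conf /etc/cni/net.d/200-loopback.conf /root/cni-backup/
# reboot
Then, on the master:
# cd /usr/share/ansible/openshift-ansible
# ansible-playbook -i <inventory_file> playbooks/openshift-monitoring/config.yml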
11 Create a login user after installation
Because admin cannot log in directly, a user must be created.
11.1 Create user dev with password dev using htpasswd
htpasswd -b /etc/origin/master/htpasswd dev dev
11.2 Grant the dev user the cluster-admin role so it can access all projects in the cluster
# oc login -u system:admin
# oc adm policy add-cluster-role-to-user cluster-admin dev
[root@master openshift-ansible]# oc get clusterrolebindings |grep dev
cluster-admin-0                 /cluster-admin                 dev
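To verify the new user works end to end:
# oc login -u dev -p dev https://master.openshift.example.com:8443
# oc get projects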
11.3 Browse to https://master.openshift.example.com:8443
Log in with username dev and password dev.
12 Uninstall OpenShift
ansible-playbook -i host.311 /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml