By Brad Johnson, Lead DevOps Engineer
Continuing from 'Creating an OpenShift Cluster in AWS with Windows Worker Nodes (Part I)', in this section we will install the OpenShift cluster. We will use a public Route53 domain name for our install.
If you wish to create a private cluster, you will need a bit more setup. See the following pages for more information on creating a private cluster that does not require public DNS. The first page has the Red Hat solution for the install-config program not supporting private clusters, and it contains an install-config YAML file to use in place of the install-config command.
https://access.redhat.com/solutions/5158831
https://access.redhat.com/sites/default/files/attachments/aws-internal-install-config.yml
This page has more info on the install process and limitations of private clusters:
https://docs.openshift.com/container-platform/4.5/installing/installing_aws/installing-aws-private.html
First, create the install-config YAML file and back it up, since it is consumed during manifest creation.
Note: from here on, all commands are run from the openshift_windows_cluster directory unless otherwise stated.
$ mkdir ~/openshift_windows_cluster && cd ~/openshift_windows_cluster
$ openshift-install create install-config
? Platform aws
INFO Credentials loaded from the "default" profile in file "/home/ec2-user/.aws/credentials"
? Region us-east-2
? Base Domain example.com
? Cluster Name win-test-cluster
? Pull Secret [? for help] (Paste your Pull Secret from the Red Hat web site or text file you downloaded)
$ sed -i 's/OpenShiftSDN/OVNKubernetes/g' install-config.yaml
$ cp -p install-config.yaml install-config.yaml.backup
Now we can create the manifest files and set up the OVN CNI settings:
$ openshift-install create manifests
INFO Credentials loaded from the "default" profile in file "/home/ec2-user/.aws/credentials"
INFO Consuming Install Config from target directory
$ cp -p manifests/cluster-network-02-config.yml manifests/cluster-network-03-config.yml
$ vi manifests/cluster-network-03-config.yml
The important settings to change in this file are apiVersion and defaultNetwork. Make sure the hybrid cluster network CIDR does not overlap with the cluster network CIDR. If you are following this guide exactly, you can use our network config file as-is.
Here are the contents of our manifests/cluster-network-03-config.yml file:
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  creationTimestamp: null
  name: cluster
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  externalIP:
    policy: {}
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
  defaultNetwork:
    type: OVNKubernetes
    ovnKubernetesConfig:
      hybridOverlayConfig:
        hybridClusterNetwork:
        - cidr: 10.132.0.0/14
          hostPrefix: 23
status: {}
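The no-overlap requirement above can be checked mechanically before you commit the manifest. Here is a minimal sketch using Python's standard ipaddress module; the CIDR values are the ones from our config above, so substitute your own if you changed them:

```python
import ipaddress

# CIDRs from the network config above; substitute your own values.
cluster_net = ipaddress.ip_network("10.128.0.0/14")  # clusterNetwork
hybrid_net = ipaddress.ip_network("10.132.0.0/14")   # hybridClusterNetwork
service_net = ipaddress.ip_network("172.30.0.0/16")  # serviceNetwork

# The hybrid overlay CIDR must not overlap the cluster or service CIDRs.
for name, net in [("clusterNetwork", cluster_net), ("serviceNetwork", service_net)]:
    if hybrid_net.overlaps(net):
        raise SystemExit(f"hybridClusterNetwork overlaps {name}: {net}")

print("no overlap between hybrid overlay and cluster/service networks")
```

With the values above, 10.128.0.0/14 covers 10.128.x–10.131.x and 10.132.0.0/14 covers 10.132.x–10.135.x, so the check passes.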
Creation of the Cluster
With those files in place we can now create the cluster. Take a coffee break, this will take around 30 minutes to complete.
$ openshift-install create cluster
INFO Consuming Openshift Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Master Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Common Manifests from target directory
INFO Credentials loaded from the "default" profile in file "/home/ec2-user/.aws/credentials"
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s for the Kubernetes API at https://api.win-test-cluster.example.com:6443...
INFO API v1.18.3+5302882 up
INFO Waiting up to 40m0s for bootstrapping to complete...
INFO Destroying the bootstrap resources...
INFO Waiting up to 30m0s for the cluster at https://api.win-test-cluster.example.com:6443 to initialize...
I1015 22:40:12.502855 1042 trace.go:116] Trace[1959950141]: "Reflector ListAndWatch" name:k8s.io/client-go/tools/watch/informerwatcher.go:146 (started: 2020-10-15
22:39:55.810110164 +0000 UTC m=+886.539985514) (total time: 16.692708687s):
Trace[1959950141]: [16.692655552s] [16.692655552s] Objects listed
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/ec2-user/openshift_windows_cluster/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.win-test-cluster.example.com
INFO Login to the console with user: "kubeadmin", and password: "XXXXX-XXXXX-XXXXX-XXXXX"
INFO Time elapsed: 30m48s
Now you can run the export command and start using oc commands.
$ export KUBECONFIG=/home/ec2-user/openshift_windows_cluster/auth/kubeconfig
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-128-115.us-east-2.compute.internal Ready master 1h v1.18.3+970c1b3
ip-10-0-150-141.us-east-2.compute.internal Ready worker 1h v1.18.3+970c1b3
ip-10-0-161-110.us-east-2.compute.internal Ready worker 1h v1.18.3+970c1b3
ip-10-0-186-69.us-east-2.compute.internal Ready master 1h v1.18.3+970c1b3
ip-10-0-201-57.us-east-2.compute.internal Ready master 1h v1.18.3+970c1b3
ip-10-0-220-129.us-east-2.compute.internal Ready worker 1h v1.18.3+970c1b3
$ oc version
Client Version: 4.5.14
Server Version: 4.5.14
Kubernetes Version: v1.18.3+5302882
To verify you have the proper network running you can run this command:
$ oc get network.operator cluster -o yaml
Look at the spec section of the yaml output. It should look like this.
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  defaultNetwork:
    ovnKubernetesConfig:
      hybridOverlayConfig:
        hybridClusterNetwork:
        - cidr: 10.132.0.0/14
          hostPrefix: 23
    type: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
Bootstrapping the Windows Worker Nodes
If you already have an SSH key pair in AWS you can use that; if not, you can generate a new one with the steps below. Note that you cannot use a key with a passphrase for Windows machines.
$ ssh-keygen -t rsa -b 4096 -N "" -C "example-key" -f ~/.ssh/example-key
$ aws --region us-east-2 ec2 import-key-pair --key-name "example-key" --public-key-material file://$HOME/.ssh/example-key.pub
Now we need to download the Windows node bootstrapper (wni) and create our Windows node. This will take about 5 minutes to run.
See this page for the latest releases: https://github.com/openshift/windows-machine-config-bootstrapper/releases
See this page for more info on wni: https://github.com/openshift/windows-machine-config-bootstrapper/tree/master/tools/windows-node-installer
Note: due to a bug in the Intel 82599 network adapter used in most Intel-based instances, which causes issues with overlay networks, we suggest using AMD-based instances such as m5a.large.
$ wget https://github.com/openshift/windows-machine-config-bootstrapper/releases/download/v4.5.2-alpha/wni -O ~/bin/wni
$ chmod +x ~/bin/wni && mkdir windowsnodeinstaller
$ wni aws create --kubeconfig $KUBECONFIG --credentials ~/.aws/credentials --credential-account default --instance-type m5a.large --ssh-key example-key --private-key ~/.ssh/example-key --dir ./windowsnodeinstaller/
2020/10/16 20:05:13 kubeconfig source: /home/ec2-user/openshift_windows_cluster/auth/kubeconfig
2020/10/16 20:05:14 Added rule with port 5986 to the security groups of your local IP
2020/10/16 20:05:14 Added rule with port 22 to the security groups of your local IP
2020/10/16 20:05:14 Added rule with port 3389 to the security groups of your local IP
2020/10/16 20:05:14 Using existing Security Group: sg-0123456789012345
2020/10/16 20:09:41 External IP: 4.138.182.84
2020/10/16 20:09:41 Internal IP: 10.0.42.50
After creating the node, we can get the login info and run Ansible to finish node setup.
See this page for more information: https://github.com/openshift/windows-machine-config-bootstrapper/tree/master/tools/ansible
Get the Windows node instance ID from the JSON file, then retrieve the Windows Administrator password. This password can also be used for RDP.
$ cat windowsnodeinstaller/windows-node-installer.json
{"InstanceIDs":["i-0123456789012345"],"SecurityGroupIDs":["sg-0123456789012345"]}
$ aws ec2 get-password-data --instance-id i-0123456789012345 --priv-launch-key ~/.ssh/example-key
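If you prefer to script this step instead of copying the instance ID by hand, the JSON file is easy to parse. A minimal sketch; the inline sample string mirrors the windows-node-installer.json content shown above, and in practice you would load the file itself:

```python
import json

# Sample content matching the windows-node-installer.json format shown above.
# In practice: data = json.load(open("windowsnodeinstaller/windows-node-installer.json"))
sample = '{"InstanceIDs":["i-0123456789012345"],"SecurityGroupIDs":["sg-0123456789012345"]}'
data = json.loads(sample)

# Pull out the first instance ID and security group ID for use in later commands.
instance_id = data["InstanceIDs"][0]
sg_id = data["SecurityGroupIDs"][0]
print(instance_id)  # i-0123456789012345
print(sg_id)        # sg-0123456789012345
```

The instance_id value is what you pass to aws ec2 get-password-data, and sg_id is the security group wni created or reused.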
Ansible Windows Node Finalization
Now we need to create an Ansible inventory file.
$ vi inventory.ini
Your file should look like the following, with your Windows node password and node address. Be sure to put the password in single quotes, set cluster_address to match the name of your cluster, and set private_ip to match your node.
[win]
4.138.182.84 ansible_password='YOURWINDOWSNODEPASSWORDHERE' private_ip=10.0.42.50
[win:vars]
ansible_user=Administrator
cluster_address=win-test-cluster.example.com
ansible_connection=winrm
ansible_ssh_port=5986
ansible_winrm_server_cert_validation=ignore
Verify Ansible connectivity with this command and look for SUCCESS in the output:
$ ansible win -i inventory.ini -m win_ping
4.138.182.84 | SUCCESS => {
"changed": false,
"ping": "pong"
}
Clone the Windows Machine Config Bootstrapper repo and run the ansible playbook against the node:
$ git clone https://github.com/openshift/windows-machine-config-bootstrapper.git
$ ansible-playbook -v -i inventory.ini windows-machine-config-bootstrapper/tools/ansible/tasks/wsu/main.yaml
This will produce a lot of output and take 10 minutes or so. At the end you should see the Play Recap; as long as it shows 'failed=0', everything should be good.
To check that the node is up and working in the cluster, run this command:
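If you are automating this run, the Play Recap line can be checked programmatically rather than by eye. A small sketch; the sample recap line below is illustrative (the counter names failed= and unreachable= are standard Ansible recap output, but your counts and host address will differ):

```python
import re

# A sample Ansible Play Recap line; yours will show your node's address and counts.
recap = "4.138.182.84 : ok=42 changed=18 unreachable=0 failed=0 skipped=5"

# Extract the key=value counters and treat any nonzero failed/unreachable as an error.
counts = dict(re.findall(r"(\w+)=(\d+)", recap))
if int(counts.get("failed", 0)) or int(counts.get("unreachable", 0)):
    raise SystemExit("Ansible run had failures; inspect the playbook output")
print("playbook run clean: failed=0")
```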
$ oc get nodes -o wide -l kubernetes.io/os=windows
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-42-50.us-east-2.compute.internal Ready worker 29m v1.18.3 10.0.42.50 4.138.182.84 Windows Server 2019 Datacenter 10.0.17763.1518 docker://19.3.12
At this point you can use RDP to connect to the Windows worker node as the Administrator user with the password you retrieved earlier. Just add the Windows worker node to a security group allowing RDP, then open a connection. After logging in, start a PowerShell session with admin rights and run 'docker ps'.
Deploy a Windows sample application:
$ oc create -f https://raw.githubusercontent.com/keyvatech/blog_files/master/kubernetes_windows_web_server.yaml -n default
You can check it is running in OpenShift with this command:
$ oc rollout status deployment win-webserver -n default
deployment "win-webserver" successfully rolled out
On the Windows node, the docker ps output should look like this:
PS C:\Users\Administrator> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
09c8bbd2a7e8 mcr.microsoft.com/windows/servercore "powershell.exe -com…" 13 minutes ago Up 13 minutes k8s_windowswebserver_win-webserver-85b49f8677-cgqkq_default_01fe28db-5ae7-4ead-8e84-5d9d5cd2cb01_0
52d42f33de9d mcr.microsoft.com/k8s/core/pause:1.2.0 "cmd /S /C 'cmd /c p…" 16 minutes ago Up 16 minutes k8s_POD_win-webserver-85b49f8677-cgqkq_default_01fe28db-5ae7-4ead-8e84-5d9d5cd2cb01_0
If you have any issues, try waiting 15 minutes and then redeploying with one of the following commands:
$ oc rollout restart deployment/win-webserver
$ oc rollout retry deployment/win-webserver
To look at the logs for the container:
$ oc get pods
NAME READY STATUS RESTARTS AGE
win-webserver-564d75c5f7-l4kk2 1/1 Running 0 96s
$ oc logs win-webserver-564d75c5f7-l4kk2
Listening at http://*:80/
After the application is up and running, DNS can take up to 5 minutes to populate, so if this doesn't work at first, try again. Check that the service is up by getting the external IP for the service and curling it.
$ oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 172.30.0.1 <none> 443/TCP 23h
openshift ExternalName <none> kubernetes.default.svc.cluster.local <none> 23h
win-webserver LoadBalancer 172.30.88.146 a038a9aa4571f4a7cafaf15ebf7ae270-23672059.us-east-2.elb.amazonaws.com 80:32601/TCP 35m
$ curl a038a9aa4571f4a7cafaf15ebf7ae270-23672059.us-east-2.elb.amazonaws.com
<html><body><H1>Windows Container Web Server</H1></body></html>
Deleting the Cluster
If you're all done and want to tear down the cluster, here are the commands:
$ wni aws destroy --kubeconfig $KUBECONFIG --credentials ~/.aws/credentials --credential-account default --dir ./windowsnodeinstaller/
$ openshift-install destroy cluster
If you have any questions about the steps documented here, or have any feedback or requests, please let us know at [email protected].
Brad is an expert in automation using Ansible, Python and pexpect to develop custom solutions and automate the things that “can’t be automated”. Prior to Keyva, Brad worked at Cray R&D for 6 years and led automation efforts across their XC supercomputer development environment. Brad has a passion for learning new technology, technical problem solving and helping others.
Like what you read? Follow Brad on LinkedIn at: https://www.linkedin.com/in/bradejohnson/