Azure Self-managed Kubernetes High Availability for Open5gs [part 1]
- Self-managed Kubernetes High Availability in Azure
- This guide is not for people looking for a fully automated command to bring up a Kubernetes cluster
- Azure VMs are used because Open5GS needs SCTP and AKS nodes do not support it (it failed in AKS); if Azure later supports SCTP on AKS nodes, AKS is probably the better choice.
- Calico, Containerd, Kubernetes v1.22.x, ROOK, CEPH, HELM, Istio, Open5Gs, Rancher, nginx
- The results of this tutorial should not be viewed as production ready, and may receive limited support from the community, but don’t let that stop you from learning!
[Part 1] Self-managed Kubernetes High Availability in Azure
Azure Ideal Topology
Because we only have 5 VMs for this simulation instead of the ideal 6, we use one of the controllers as a worker as well; the topology is below. We can still run 3 etcd members and 3 Rook Ceph hosts in the cluster, which gives us quorum (with 3 members, quorum is 2, so one node can fail) for HA, and the procedure should be similar :).
Azure Topology Used as an Alternative
Azure Login
az login
az group list
Networking
az network vnet create -g my-resource-group \
-n kubernetes-vnet \
--address-prefix 10.240.0.0/24 \
--subnet-name kubernetes-subnet

az network nsg create -g my-resource-group -n kubernetes-nsg

az network vnet subnet update -g my-resource-group \
-n kubernetes-subnet \
--vnet-name kubernetes-vnet \
--network-security-group kubernetes-nsg

az network nsg rule create -g my-resource-group \
-n kubernetes-allow-ssh \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 22 \
--direction inbound \
--nsg-name kubernetes-nsg \
--protocol tcp \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 1000

az network nsg rule create -g my-resource-group \
-n kubernetes-allow-api-server \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 6443 \
--direction inbound \
--nsg-name kubernetes-nsg \
--protocol tcp \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 1001

az network nsg rule list -g my-resource-group --nsg-name kubernetes-nsg --query "[].{Name:name, \
Direction:direction, Priority:priority, Port:destinationPortRange}" -o table
Kubernetes Public IP Address
Allocate a static IP address that will be attached to the external load balancer fronting the Kubernetes API Servers:
az network lb create -g my-resource-group \
-n kubernetes-lb \
--backend-pool-name kubernetes-lb-pool \
--public-ip-address kubernetes-pip \
--public-ip-address-allocation static

az network public-ip list --query="[?name=='kubernetes-pip'].{ResourceGroup:resourceGroup, \
Region:location,Allocation:publicIpAllocationMethod,IP:ipAddress}" -o table

ResourceGroup      Region    Allocation    IP
-----------------  --------  ------------  -------------
my-resource-group  eastus    Static        52.186.18.198
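If you prefer to allocate the static public IP yourself before creating the load balancer (instead of letting az network lb create allocate kubernetes-pip for you), a minimal sketch would be:

# Optional: pre-create the static public IP referenced by the load balancer
az network public-ip create -g my-resource-group \
-n kubernetes-pip \
--allocation-method Static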
Compute Instances
Different zones can be used to make the nodes more distributed.
Kubernetes Controllers
az vm availability-set create -g my-resource-group -n controller-as

for i in 0 1 2; do
echo "[Controller ${i}] Creating public IP..."
az network public-ip create -n controller-${i}-pip -g my-resource-group > /dev/null

echo "[Controller ${i}] Creating NIC..."
az network nic create -g my-resource-group \
-n controller-${i}-nic \
--private-ip-address 10.240.0.1${i} \
--public-ip-address controller-${i}-pip \
--vnet kubernetes-vnet \
--subnet kubernetes-subnet \
--ip-forwarding \
--lb-name kubernetes-lb \
--lb-address-pools kubernetes-lb-pool > /dev/null

echo "[Controller ${i}] Creating VM..."
az vm create -g my-resource-group \
-n controller-${i} \
--image UbuntuLTS \
--size Standard_B2ms \
--nics controller-${i}-nic \
--availability-set controller-as \
--nsg '' \
--admin-username 'kuberoot' \
--generate-ssh-keys > /dev/null
done
Kubernetes Workers
az vm availability-set create -g my-resource-group -n worker-as

for i in 0 1; do
echo "[Worker ${i}] Creating public IP..."
az network public-ip create -n worker-${i}-pip -g my-resource-group > /dev/null

echo "[Worker ${i}] Creating NIC..."
az network nic create -g my-resource-group \
-n worker-${i}-nic \
--private-ip-address 10.240.0.2${i} \
--public-ip-address worker-${i}-pip \
--vnet kubernetes-vnet \
--subnet kubernetes-subnet \
--ip-forwarding > /dev/null

echo "[Worker ${i}] Creating VM..."
az vm create -g my-resource-group \
-n worker-${i} \
--image UbuntuLTS \
--size Standard_B2ms \
--nics worker-${i}-nic \
--tags pod-cidr=10.200.${i}.0/24 \
--availability-set worker-as \
--nsg '' \
--generate-ssh-keys \
--admin-username 'kuberoot' > /dev/null
done
Add data disks (on worker-0, worker-1, and controller-2) for the storage cluster using ROOK with CEPH
az vm disk attach \
-g my-resource-group \
--vm-name worker-0 \
--name myDataDisk1 \
--new \
--size-gb 50

az vm disk attach \
-g my-resource-group \
--vm-name worker-1 \
--name myDataDisk2 \
--new \
--size-gb 50

az vm disk attach \
-g my-resource-group \
--vm-name controller-2 \
--name myDataDisk3 \
--new \
--size-gb 50
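To double-check that each new data disk is attached where expected, a quick optional query from the Azure CLI (the fields come from the standard az vm show output):

for vm in worker-0 worker-1 controller-2; do
echo "${vm}:"
az vm show -g my-resource-group -n ${vm} \
--query "storageProfile.dataDisks[].{Name:name, SizeGb:diskSizeGb, Lun:lun}" -o table
done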
List the Compute Instances
az vm list -d -g my-resource-group -o table
Name ResourceGroup PowerState PublicIps Fqdns Location Zones
------------ ----------------------------- ------------ -------------- ------- ---------- -------
controller-0 my-resource-group VM running 20.106.131.198 eastus
controller-1 my-resource-group VM running 52.170.133.102 eastus
controller-2 my-resource-group VM running 137.117.45.116 eastus
worker-0 my-resource-group VM running 20.124.98.81 eastus
worker-1 my-resource-group VM running 20.124.97.30 eastus
Install Kubernetes on All Nodes
SSH to all Nodes
ssh kuberoot@20.106.131.198
ssh kuberoot@52.170.133.102
ssh kuberoot@137.117.45.116
ssh kuberoot@20.124.98.81
ssh kuberoot@20.124.97.30
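Since the steps below are identical on every node, a small convenience loop over the public IPs above can be used to sanity check SSH connectivity first (purely optional):

for ip in 20.106.131.198 52.170.133.102 137.117.45.116 20.124.98.81 20.124.97.30; do
ssh kuberoot@${ip} 'hostname && uname -r'
done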
Install containerd packages
Load the overlay and br_netfilter kernel modules.
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter
Set these system configurations for Kubernetes networking
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
Apply settings
sudo sysctl --system
Install containerd
sudo apt-get update && sudo apt-get install -y containerd

sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd
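Optionally, you can align containerd with the systemd cgroup driver that kubeadm v1.22 configures for the kubelet; the default config generated above usually contains SystemdCgroup = false, and flipping it is a one-liner (treat this as an optional tweak; the guide also works without it in many setups):

# Optional: use the systemd cgroup driver for runc in containerd
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd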
Disable SWAP
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
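To confirm swap is really off (the kubelet will not start with swap enabled by default):

# Should print nothing when swap is disabled
sudo swapon --show
free -h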
Install dependency packages
sudo apt update && sudo apt-get install -y apt-transport-https curl

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
Add the Kubernetes repo
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt update
Install kubectl, kubelet, & kubeadm packages
sudo apt-get install -y kubelet=1.22.1-00 kubeadm=1.22.1-00 kubectl=1.22.1-00

sudo apt-mark hold kubelet kubeadm kubectl
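A quick check that the pinned versions were installed and held:

kubeadm version -o short
kubectl version --client --short
kubelet --version
apt-mark showhold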
Exit back to the Azure CLI
exit
exit
The Kubernetes Frontend Load Balancer
In this section you will provision an external load balancer to front the Kubernetes API Servers. The static IP address will be attached to the resulting load balancer.
The compute instances created in this tutorial will not have permission to complete this section. Run the following commands from the same machine used to create the compute instances.
Create the load balancer health probe as a prerequisite for the LB rule that follows.
az network lb probe create -g my-resource-group \
--lb-name kubernetes-lb \
--name kubernetes-apiserver-probe \
--port 6443 \
--protocol tcp
Create the external load balancer network resources:
az network lb rule create -g my-resource-group \
-n kubernetes-apiserver-rule \
--protocol tcp \
--lb-name kubernetes-lb \
--frontend-ip-name LoadBalancerFrontEnd \
--frontend-port 6443 \
--backend-pool-name kubernetes-lb-pool \
--backend-port 6443 \
--probe-name kubernetes-apiserver-probe
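An optional sanity check that the probe and rule exist:

az network lb probe list -g my-resource-group --lb-name kubernetes-lb -o table
az network lb rule list -g my-resource-group --lb-name kubernetes-lb -o table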
Only Controller-0 node
SSH to controller-0 Nodes
ssh kuberoot@20.106.131.198
Create kubeadm config
Set “controlPlaneEndpoint” to the external IP allocated in the Kubernetes Public IP Address step.
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "52.186.18.198:6443"
networking:
  podSubnet: "10.244.0.0/16"
EOF
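If you did not note the external IP down, it can be read back from the machine running the Azure CLI (kubernetes-pip is the public IP created for the load balancer earlier):

az network public-ip show -g my-resource-group -n kubernetes-pip \
--query ipAddress -o tsv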
Initialize the Cluster
sudo kubeadm init --config=kubeadm-config.yaml --upload-certs
Initialize Result
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538 \
--control-plane --certificate-key 89aa63f203b2e6f958eddc5b09df641cb860038222e0f54f7fc20aa6401cbecd

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.Then you can join any number of worker nodes by running the following on each as root:kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538
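Note that the bootstrap token above expires after about 24 hours. If you join nodes later, you can generate a fresh join command on an existing control-plane node, and re-upload the control-plane certificates as the output above mentions:

# Print a fresh worker join command with a new token
sudo kubeadm token create --print-join-command

# Re-upload control-plane certificates and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs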
Make the current user able to use kubectl commands
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install Calico Networking (CNI) specific for Azure
wget https://docs.projectcalico.org/manifests/calico.yaml
sed -i "s/calico_backend: \"bird\"/calico_backend: \"none\"/g" calico.yaml
sed -i "s/\- \-bird\-live/#\- \-bird\-live/g" calico.yaml
sed -i "s/\- \-bird\-ready/#\- \-bird\-ready/g" calico.yaml
kubectl apply -f calico.yaml
kubectl get pods -n kube-system
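It can take a few minutes for the Calico pods to become Ready; one way to wait for them (the k8s-app=calico-node label comes from the standard calico.yaml manifest):

kubectl -n kube-system wait --for=condition=Ready pods -l k8s-app=calico-node --timeout=300s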
exit
exit
All Master nodes except controller-0
SSH to controller-1 and controller-2 Nodes
ssh kuberoot@52.170.133.102
ssh kuberoot@137.117.45.116
Join master nodes
sudo kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538 \
--control-plane --certificate-key 89aa63f203b2e6f958eddc5b09df641cb860038222e0f54f7fc20aa6401cbecd
Make the current user able to use kubectl commands
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

exit
SSH to controller-0 Nodes
ssh kuberoot@20.106.131.198
Verify all master nodes have joined
kuberoot@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 9m36s v1.22.1
controller-1 Ready control-plane,master 2m34s v1.22.1
controller-2 Ready control-plane,master 58s v1.22.1
exit
exit
All Worker nodes
SSH to all workers Nodes
ssh kuberoot@20.124.98.81
ssh kuberoot@20.124.97.30
Join worker nodes
sudo kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538
Check the disks and enable SCTP on the workers
sudo lsblk -f
sudo modprobe sctp

exit
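Optionally, before exiting, persist the sctp module so it is loaded again after a reboot, following the same modules-load.d pattern used for the containerd modules:

cat <<EOF | sudo tee /etc/modules-load.d/sctp.conf
sctp
EOF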
Verify all nodes have joined
SSH to controller-0 Nodes
ssh kuberoot@20.106.131.198
Verify that all nodes, including the workers, have joined
kuberoot@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 22m v1.22.1
controller-1 Ready control-plane,master 15m v1.22.1
controller-2 Ready control-plane,master 13m v1.22.1
worker-0 Ready <none> 2m5s v1.22.1
worker-1 Ready <none> 37s v1.22.1
Make controller-2 schedulable as a worker
kubectl describe node | egrep -i taint

kubectl taint nodes controller-2 node-role.kubernetes.io/master-

kubectl describe node | egrep -i taint
kuberoot@controller-0:~$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-958545d87-rwgm2 1/1 Running 0 57m
kube-system calico-node-5bh46 1/1 Running 0 37m
kube-system calico-node-8mwvn 1/1 Running 0 52m
kube-system calico-node-9jxz7 1/1 Running 0 57m
kube-system calico-node-kr6c2 1/1 Running 0 39m
kube-system calico-node-rt2bf 1/1 Running 0 51m
kube-system coredns-78fcd69978-vtz52 1/1 Running 0 59m
kube-system coredns-78fcd69978-xtqzh 1/1 Running 0 59m
kube-system etcd-controller-0 1/1 Running 0 59m
kube-system etcd-controller-1 1/1 Running 0 52m
kube-system etcd-controller-2 1/1 Running 0 51m
kube-system kube-apiserver-controller-0 1/1 Running 0 59m
kube-system kube-apiserver-controller-1 1/1 Running 0 52m
kube-system kube-apiserver-controller-2 1/1 Running 0 51m
kube-system kube-controller-manager-controller-0 1/1 Running 1 (52m ago) 59m
kube-system kube-controller-manager-controller-1 1/1 Running 0 52m
kube-system kube-controller-manager-controller-2 1/1 Running 0 51m
kube-system kube-proxy-cjjh8 1/1 Running 0 51m
kube-system kube-proxy-flvjp 1/1 Running 0 39m
kube-system kube-proxy-h7ms8 1/1 Running 0 59m
kube-system kube-proxy-jq7wj 1/1 Running 0 52m
kube-system kube-proxy-wj2c4 1/1 Running 0 37m
kube-system kube-scheduler-controller-0 1/1 Running 1 (52m ago) 59m
kube-system kube-scheduler-controller-1 1/1 Running 0 52m
kube-system kube-scheduler-controller-2 1/1 Running 0 51m
Enable IP Forwarding on each host and add User Defined Routes (UDR) in Azure
This is related to running Calico on Azure; it is needed for CoreDNS to work.
az network route-table create -g my-resource-group -n kubernetes-routes

az network route-table route list -g my-resource-group --route-table-name kubernetes-routes -o table

for i in 0 1; do
az network route-table route create -g my-resource-group \
-n kubernetes-route-10-200-${i}-0-24 \
--route-table-name kubernetes-routes \
--address-prefix 10.200.${i}.0/24 \
--next-hop-ip-address 10.240.0.2${i} \
--next-hop-type VirtualAppliance
done

az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-0 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.192.64/26 \
--next-hop-ip-address 10.240.0.10 \
--next-hop-type VirtualAppliance
az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-1 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.166.128/26 \
--next-hop-ip-address 10.240.0.11 \
--next-hop-type VirtualAppliance
az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-2 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.27.192/26 \
--next-hop-ip-address 10.240.0.12 \
--next-hop-type VirtualAppliance

az network route-table route create -g my-resource-group \
-n kubernetes-route-worker-0 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.43.0/26 \
--next-hop-ip-address 10.240.0.20 \
--next-hop-type VirtualAppliance

az network route-table route create -g my-resource-group \
-n kubernetes-route-worker-1 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.226.64/26 \
--next-hop-ip-address 10.240.0.21 \
--next-hop-type VirtualAppliance
az network vnet subnet update -g my-resource-group -n kubernetes-subnet --vnet-name kubernetes-vnet --route-table kubernetes-routes

az network route-table route list -g my-resource-group --route-table-name kubernetes-routes -o table
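The IP forwarding part of this step was already handled when the NICs were created with --ip-forwarding; an optional way to confirm it per NIC (using grep so we do not depend on the exact output field casing):

for nic in controller-0-nic controller-1-nic controller-2-nic worker-0-nic worker-1-nic; do
echo "${nic}:"
az network nic show -g my-resource-group -n ${nic} -o json | grep -i ipforwarding
done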
We are done!!!
In the next part we will deploy Istio, Rancher, and ROOK CEPH.
Really appreciate the info from the references below; if you have any questions, just ask :)
Enjoy!!!
References:
- https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure
- https://projectcalico.docs.tigera.io/reference/public-cloud/azure#about-calico-on-azure
- https://stackoverflow.com/questions/60222243/calico-k8s-on-azure-cant-access-pods
- https://assyafii.com/docs/install-kubernetes-cluster-multi-master-ha/