Azure Self-managed Kubernetes High Availability for Open5GS [Part 1]

Indo's lab
Feb 11, 2022 · 11 min read


  • Self-managed Kubernetes High Availability in Azure
  • This guide is not for people looking for a fully automated command to bring up a Kubernetes cluster
  • Azure VMs are used because Open5GS needs SCTP; AKS nodes do not support SCTP and our attempt on AKS failed. If Azure later adds SCTP support to AKS nodes, AKS is probably the better choice. (A sample SCTP Service manifest follows this list.)
  • Calico, Containerd, Kubernetes v1.22.x, ROOK, CEPH, HELM, Istio, Open5Gs, Rancher, nginx
  • The results of this tutorial should not be viewed as production ready, and may receive limited support from the community, but don’t let that stop you from learning!
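To make the SCTP requirement concrete, below is a hypothetical Service of the kind an Open5GS AMF exposes for NGAP over SCTP (port 38412). The name and label are illustrative only and are not part of this guide's manifests; apply something like it only after the cluster built in this guide is up, and note that it relies on the sctp kernel module being loaded on the workers (done later in this guide).

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: amf-ngap              # illustrative name only
spec:
  selector:
    app: open5gs-amf          # illustrative label only
  ports:
    - name: ngap
      protocol: SCTP          # the protocol AKS nodes could not provide
      port: 38412
      targetPort: 38412
EOF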

[Part 1] Self-managed Kubernetes High Availability in Azure

Diagram source: kubernetes.io

Azure Ideal Topology

Because we only have 5 VMs for this simulation instead of the ideal 6, we use one of the controllers as a worker as well. With the topology below we still get 3 etcd members and 3 Rook Ceph hosts, which is enough for quorum and therefore HA, and the procedure stays essentially the same :).

Azure Topology Use As Alternative

Azure Login

az login
az group list

Networking

az network vnet create -g my-resource-group \
-n kubernetes-vnet \
--address-prefix 10.240.0.0/24 \
--subnet-name kubernetes-subnet
az network nsg create -g my-resource-group -n kubernetes-nsg
az network vnet subnet update -g my-resource-group \
-n kubernetes-subnet \
--vnet-name kubernetes-vnet \
--network-security-group kubernetes-nsg
az network nsg rule create -g my-resource-group \
-n kubernetes-allow-ssh \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 22 \
--direction inbound \
--nsg-name kubernetes-nsg \
--protocol tcp \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 1000
az network nsg rule create -g my-resource-group \
-n kubernetes-allow-api-server \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 6443 \
--direction inbound \
--nsg-name kubernetes-nsg \
--protocol tcp \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 1001
az network nsg rule list -g my-resource-group --nsg-name kubernetes-nsg --query "[].{Name:name, \
Direction:direction, Priority:priority, Port:destinationPortRange}" -o table

Kubernetes Public IP Address

Allocate a static IP address that will be attached to the external load balancer fronting the Kubernetes API Servers:

az network lb create -g my-resource-group \
-n kubernetes-lb \
--backend-pool-name kubernetes-lb-pool \
--public-ip-address kubernetes-pip \
--public-ip-address-allocation static
az network public-ip list --query="[?name=='kubernetes-pip'].{ResourceGroup:resourceGroup, \
Region:location,Allocation:publicIpAllocationMethod,IP:ipAddress}" -o table
The output lists the resource group, region, allocation method, and the static IP assigned to kubernetes-pip (52.186.18.198 in this walkthrough). Note it down; it becomes the controlPlaneEndpoint later.

Compute Instances

Different availability zones can be used to spread the nodes further apart.
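If your region supports availability zones, a zonal spread is an alternative to the availability sets used below (the two cannot be combined on a single VM). A minimal sketch for the controllers, assuming zones 1-3 are available and the NICs from the loop below have already been created:

# Sketch only: replaces --availability-set with --zone in the VM creation step.
for i in 0 1 2; do
  az vm create -g my-resource-group \
    -n controller-${i} \
    --image UbuntuLTS \
    --size Standard_B2ms \
    --nics controller-${i}-nic \
    --zone $((i + 1)) \
    --nsg '' \
    --admin-username 'kuberoot' \
    --generate-ssh-keys > /dev/null
done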

Kubernetes Controllers

az vm availability-set create -g my-resource-group -n controller-as
for i in 0 1 2; do
echo "[Controller ${i}] Creating public IP..."
az network public-ip create -n controller-${i}-pip -g my-resource-group > /dev/null
echo "[Controller ${i}] Creating NIC..."
az network nic create -g my-resource-group \
-n controller-${i}-nic \
--private-ip-address 10.240.0.1${i} \
--public-ip-address controller-${i}-pip \
--vnet kubernetes-vnet \
--subnet kubernetes-subnet \
--ip-forwarding \
--lb-name kubernetes-lb \
--lb-address-pools kubernetes-lb-pool > /dev/null
echo "[Controller ${i}] Creating VM..."
az vm create -g my-resource-group \
-n controller-${i} \
--image UbuntuLTS \
--size Standard_B2ms \
--nics controller-${i}-nic \
--availability-set controller-as \
--nsg '' \
--admin-username 'kuberoot' \
--generate-ssh-keys > /dev/null
done

Kubernetes Workers

az vm availability-set create -g my-resource-group -n worker-as
for i in 0 1; do
echo "[Worker ${i}] Creating public IP..."
az network public-ip create -n worker-${i}-pip -g my-resource-group > /dev/null
echo "[Worker ${i}] Creating NIC..."
az network nic create -g my-resource-group \
-n worker-${i}-nic \
--private-ip-address 10.240.0.2${i} \
--public-ip-address worker-${i}-pip \
--vnet kubernetes-vnet \
--subnet kubernetes-subnet \
--ip-forwarding > /dev/null
echo "[Worker ${i}] Creating VM..."
az vm create -g my-resource-group \
-n worker-${i} \
--image UbuntuLTS \
--size Standard_B2ms \
--nics worker-${i}-nic \
--tags pod-cidr=10.200.${i}.0/24 \
--availability-set worker-as \
--nsg '' \
--generate-ssh-keys \
--admin-username 'kuberoot' > /dev/null
done

Attach data disks for the storage cluster (Rook with Ceph)

az vm disk attach \
-g my-resource-group \
--vm-name worker-0 \
--name myDataDisk1 \
--new \
--size-gb 50
az vm disk attach \
-g my-resource-group \
--vm-name worker-1 \
--name myDataDisk2 \
--new \
--size-gb 50
az vm disk attach \
-g my-resource-group \
--vm-name controller-2 \
--name myDataDisk3 \
--new \
--size-gb 50

Compute Instance List

az vm list -d -g my-resource-group -o table
Name ResourceGroup PowerState PublicIps Fqdns Location Zones
------------ ----------------------------- ------------ -------------- ------- ---------- -------
controller-0 my-resource-group VM running 20.106.131.198 eastus
controller-1 my-resource-group VM running 52.170.133.102 eastus
controller-2 my-resource-group VM running 137.117.45.116 eastus
worker-0 my-resource-group VM running 20.124.98.81 eastus
worker-1 my-resource-group VM running 20.124.97.30 eastus

Install Kubernetes on All Nodes

SSH to all Nodes

ssh kuberoot@20.106.131.198  
ssh kuberoot@52.170.133.102
ssh kuberoot@137.117.45.116
ssh kuberoot@20.124.98.81
ssh kuberoot@20.124.97.30

Install and configure containerd

Load the overlay and br_netfilter kernel modules.

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf 
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

Set these system configurations for Kubernetes networking

cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf 
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

Apply settings

sudo sysctl --system
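Optional sanity check that the values took effect (each should print 1):

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.bridge.bridge-nf-call-ip6tables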

Install containerd

sudo apt-get update && sudo apt-get install -y containerd
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd
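Optionally, align containerd's runc cgroup driver with the kubelet. This is an extra tweak on top of the guide, assuming the default config.toml generated above: recent kubeadm releases default the kubelet to the systemd cgroup driver, while containerd's generated config ships with SystemdCgroup = false.

# Optional: switch runc to the systemd cgroup driver in the generated config.
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd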

Disable SWAP

sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
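Quick check that swap is really off (swapon prints nothing when no swap is active):

swapon --show
free -h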

Install dependency packages

sudo apt update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

Add the Kubernetes apt repository

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt update

Install kubectl, kubelet, & kubeadm packages

sudo apt-get install -y kubelet=1.22.1-00 kubeadm=1.22.1-00 kubectl=1.22.1-00
sudo apt-mark hold kubelet kubeadm kubectl
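Optionally confirm the pinned versions and the hold:

kubeadm version -o short
kubectl version --client
apt-mark showhold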

Exit back to the Azure CLI machine

exit

The Kubernetes Frontend Load Balancer

In this section you will provision an external load balancer to front the Kubernetes API Servers. The static IP address will be attached to the resulting load balancer.

The compute instances created in this tutorial will not have permission to complete this section. Run the following commands from the same machine used to create the compute instances.

Create the load balancer health probe as a prerequisite for the LB rule that follows.

az network lb probe create -g my-resource-group \
--lb-name kubernetes-lb \
--name kubernetes-apiserver-probe \
--port 6443 \
--protocol tcp

Create the external load balancer network resources:

az network lb rule create -g my-resource-group \
-n kubernetes-apiserver-rule \
--protocol tcp \
--lb-name kubernetes-lb \
--frontend-ip-name LoadBalancerFrontEnd \
--frontend-port 6443 \
--backend-pool-name kubernetes-lb-pool \
--backend-port 6443 \
--probe-name kubernetes-apiserver-probe
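Optionally confirm the probe and rule exist:

az network lb probe list -g my-resource-group --lb-name kubernetes-lb -o table
az network lb rule list -g my-resource-group --lb-name kubernetes-lb -o table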

On the controller-0 node only

SSH to the controller-0 node

ssh kuberoot@20.106.131.198

Create kubeadm config

Set controlPlaneEndpoint to the external load balancer IP allocated earlier in the Kubernetes Public IP Address step (52.186.18.198 in this walkthrough).

cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "52.186.18.198:6443"
networking:
  podSubnet: "10.244.0.0/16"
EOF

Initialize the Cluster

sudo kubeadm init --config=kubeadm-config.yaml --upload-certs

Initialization output

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538 \
--control-plane --certificate-key 89aa63f203b2e6f958eddc5b09df641cb860038222e0f54f7fc20aa6401cbecd
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538

Allow the current user to run kubectl commands

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install Calico networking (CNI), adjusted for Azure

Azure's network fabric does not peer BGP with the nodes, so we switch the Calico backend to "none" and disable the BIRD liveness/readiness checks; pod routing is handled later with Azure User Defined Routes.

wget https://docs.projectcalico.org/manifests/calico.yaml
sed -i "s/calico_backend: \"bird\"/calico_backend: \"none\"/g" calico.yaml
sed -i "s/\- \-bird\-live/#\- \-bird\-live/g" calico.yaml
sed -i "s/\- \-bird\-ready/#\- \-bird\-ready/g" calico.yaml
kubectl apply -f calico.yaml
kubectl get pods -n kube-system

exit

exit

All Master nodes except controller-0

SSH to controller-1 and controller-2 Nodes

ssh kuberoot@52.170.133.102  
ssh kuberoot@137.117.45.116

Join master nodes

sudo kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538 \
--control-plane --certificate-key 89aa63f203b2e6f958eddc5b09df641cb860038222e0f54f7fc20aa6401cbecd

Allow the current user to run kubectl commands

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
exit

SSH to the controller-0 node

ssh kuberoot@20.106.131.198

Verify that all master nodes have joined

kuberoot@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 9m36s v1.22.1
controller-1 Ready control-plane,master 2m34s v1.22.1
controller-2 Ready control-plane,master 58s v1.22.1

exit

exit

All Worker nodes

SSH to all worker nodes

ssh kuberoot@20.124.98.81    
ssh kuberoot@20.124.97.30

Join worker nodes

sudo kubeadm join 52.186.18.198:6443 --token d25t07.xn631slhm9gvx8nn \
--discovery-token-ca-cert-hash sha256:0cfc712d3aa46df3d58318c56e16c4ddec71b113da8683eb87e770086aa64538

Check the disks and enable SCTP on the workers

sudo lsblk -f
sudo modprobe sctp
exit
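modprobe does not survive a reboot. Before exiting each worker, you can persist the module with the same modules-load.d pattern used earlier for containerd (a small addition that is not in the original steps):

cat <<EOF | sudo tee /etc/modules-load.d/sctp.conf
sctp
EOF
lsmod | grep sctp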

Verify that all nodes have joined

SSH to the controller-0 node

ssh kuberoot@20.106.131.198

Verify that all nodes, masters and workers, have joined

kuberoot@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 22m v1.22.1
controller-1 Ready control-plane,master 15m v1.22.1
controller-2 Ready control-plane,master 13m v1.22.1
worker-0 Ready <none> 2m5s v1.22.1
worker-1 Ready <none> 37s v1.22.1

Make controller-2 schedulable as a worker

kubectl describe node | egrep -i taint
kubectl taint nodes controller-2 node-role.kubernetes.io/master-
kubectl describe node | egrep -i taint

kuberoot@controller-0:~$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-958545d87-rwgm2 1/1 Running 0 57m
kube-system calico-node-5bh46 1/1 Running 0 37m
kube-system calico-node-8mwvn 1/1 Running 0 52m
kube-system calico-node-9jxz7 1/1 Running 0 57m
kube-system calico-node-kr6c2 1/1 Running 0 39m
kube-system calico-node-rt2bf 1/1 Running 0 51m
kube-system coredns-78fcd69978-vtz52 1/1 Running 0 59m
kube-system coredns-78fcd69978-xtqzh 1/1 Running 0 59m
kube-system etcd-controller-0 1/1 Running 0 59m
kube-system etcd-controller-1 1/1 Running 0 52m
kube-system etcd-controller-2 1/1 Running 0 51m
kube-system kube-apiserver-controller-0 1/1 Running 0 59m
kube-system kube-apiserver-controller-1 1/1 Running 0 52m
kube-system kube-apiserver-controller-2 1/1 Running 0 51m
kube-system kube-controller-manager-controller-0 1/1 Running 1 (52m ago) 59m
kube-system kube-controller-manager-controller-1 1/1 Running 0 52m
kube-system kube-controller-manager-controller-2 1/1 Running 0 51m
kube-system kube-proxy-cjjh8 1/1 Running 0 51m
kube-system kube-proxy-flvjp 1/1 Running 0 39m
kube-system kube-proxy-h7ms8 1/1 Running 0 59m
kube-system kube-proxy-jq7wj 1/1 Running 0 52m
kube-system kube-proxy-wj2c4 1/1 Running 0 37m
kube-system kube-scheduler-controller-0 1/1 Running 1 (52m ago) 59m
kube-system kube-scheduler-controller-1 1/1 Running 0 52m
kube-system kube-scheduler-controller-2 1/1 Running 0 51m
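Optionally, and purely as a cosmetic convenience that nothing later in this guide depends on, give controller-2 a worker role label so kubectl get nodes reports it as a worker:

kubectl label node controller-2 node-role.kubernetes.io/worker=
kubectl get nodes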

Enable IP forwarding on each host and configure User Defined Routes (UDR) in Azure

These routes are required for Calico on Azure; without them, cross-node pod traffic, and therefore CoreDNS, does not work.
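The /26 prefixes in the routes below are the per-node Calico IPAM blocks and will differ in your cluster. One way to look them up, assuming calicoctl is installed on controller-0 (or falling back to the Calico CRDs):

calicoctl ipam show --show-blocks
kubectl get ipamblocks.crd.projectcalico.org -o custom-columns='CIDR:.spec.cidr,NODE:.spec.affinity'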

az network route-table create -g my-resource-group -n kubernetes-routes
az network route-table route list -g my-resource-group --route-table-name kubernetes-routes -o table
for i in 0 1; do
az network route-table route create -g my-resource-group \
-n kubernetes-route-10-200-${i}-0-24 \
--route-table-name kubernetes-routes \
--address-prefix 10.200.${i}.0/24 \
--next-hop-ip-address 10.240.0.2${i} \
--next-hop-type VirtualAppliance
done
az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-0 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.192.64/26 \
--next-hop-ip-address 10.240.0.10 \
--next-hop-type VirtualAppliance

az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-1 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.166.128/26 \
--next-hop-ip-address 10.240.0.11 \
--next-hop-type VirtualAppliance

az network route-table route create -g my-resource-group \
-n kubernetes-route-controller-2 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.27.192/26 \
--next-hop-ip-address 10.240.0.12 \
--next-hop-type VirtualAppliance
az network route-table route create -g my-resource-group \
-n kubernetes-route-worker-0 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.43.0/26 \
--next-hop-ip-address 10.240.0.20 \
--next-hop-type VirtualAppliance
az network route-table route create -g my-resource-group \
-n kubernetes-route-worker-1 \
--route-table-name kubernetes-routes \
--address-prefix 10.244.226.64/26 \
--next-hop-ip-address 10.240.0.21 \
--next-hop-type VirtualAppliance

az network vnet subnet update -g my-resource-group -n kubernetes-subnet --vnet-name kubernetes-vnet --route-table kubernetes-routes
az network route-table route list -g my-resource-group --route-table-name kubernetes-routes -o table
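With the route table attached to the subnet, cross-node pod traffic and CoreDNS should now work. A quick sanity check from controller-0, assuming a throwaway busybox pod is acceptable in the cluster:

# The pod is removed automatically once the command returns.
kubectl run dnstest --rm -it --restart=Never --image=busybox:1.28 -- nslookup kubernetes.default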

We are done!!!

In the next step we will deploy istio, rancher and ROOK CEPH.

https://indoslab.medium.com/3-steps-creating-self-managed-kubernetes-high-availability-in-azure-for-open5gs-part-2-a13c011faa2a

Much appreciation for the information in the references below; if you have any questions, just ask :)

Enjoy!!!

References:
