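Because of the cost and limited control that come with GCP and AWS, this study environment was built locally on Ubuntu 20.04 running in VMware.
Every setting outside the master node section must be applied to all nodes.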
https://phoenixnap.com/kb/swap-memory
Swap memory, also known as swap space, is a section of a computer's hard disk or SSD that the operating system (OS) uses to store inactive data from Random Access Memory (RAM). This allows the OS to run even when RAM is full, preventing system slowdowns or crashes.
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
Turn swap off as shown above before continuing with the Kubernetes installation; by default the kubelet refuses to start while swap is enabled.
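The sed command above prefixes every swap line in /etc/fstab with `#`, including lines that are already commented, so running it twice (it appears again later in this post) stacks up `#` characters. The following is a minimal sketch of an idempotent variant plus a check for active swap. The function names `comment_swap_entries` and `swap_is_active` are hypothetical helpers introduced here for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not from any tool.

# Read fstab-style text on stdin and comment out active swap entries.
# Lines that already start with '#' are left untouched, so the filter
# can be applied repeatedly without stacking '#' characters.
comment_swap_entries() {
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/'
}

# Succeed if the swaps table lists at least one active swap device.
# $1 defaults to /proc/swaps; its first line is a header and is skipped.
swap_is_active() {
  local swaps_file="${1:-/proc/swaps}"
  tail -n +2 "$swaps_file" | grep -q .
}
```

To apply the same expression in place, something like `sudo sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab` should work, and `swap_is_active || echo "swap is off"` confirms the result of `swapoff -a`.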
# Using Docker Repository
sudo apt update
sudo apt install -y ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list
# Install containerd
sudo apt update
sudo apt install -y containerd.io
# sudo systemctl status containerd # press Ctrl + C to exit
# Containerd configuration for Kubernetes
cat <<EOF | sudo tee -a /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
EOF
sudo sed -i 's/^disabled_plugins \=/\#disabled_plugins \=/g' /etc/containerd/config.toml
sudo systemctl restart containerd
# Check that the socket exists.
ls /var/run/containerd/containerd.sock
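Because the heredoc above appends to config.toml with `tee -a`, running it a second time would define the same TOML table twice, which containerd is likely to reject. A small sketch of two guard checks follows; `has_systemd_cgroup` and `cri_is_disabled` are hypothetical helper names, and the patterns assume the one-setting-per-line layout used above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a containerd config file.

# Succeed if the file already contains an uncommented SystemdCgroup = true.
has_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}

# Succeed if the CRI plugin is still listed in an uncommented
# disabled_plugins line (the state the sed command above removes).
cri_is_disabled() {
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}
```

For example, `has_systemd_cgroup /etc/containerd/config.toml || echo "append the runc options"` tells you whether the append step is still needed, and `cri_is_disabled /etc/containerd/config.toml` should fail once the sed command has been applied.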
cat <<EOF > kube_install.sh
# 1. Update the apt package index and install the packages needed to use the Kubernetes apt repository.
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
# 2. Download the Google Cloud public signing key.
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
# 3. Add the Kubernetes apt repository.
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
# 4. Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions.
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
EOF
sudo bash kube_install.sh
https://en.wikipedia.org/wiki/Netfilter
Netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers. Netfilter offers various functions and operations for packet filtering, network address translation, and port translation, which provide the functionality required for directing packets through a network and prohibiting packets from reaching sensitive locations within a network.
sudo -i
modprobe br_netfilter
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
exit
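Values written to /proc with `echo` and modules loaded with `modprobe` last only until the next boot, so the reboot in the following step undoes them. That may be why reconfiguring the netfilter bridge shows up as a fix in the troubleshooting notes later on. Below is a minimal sketch that renders persistent equivalents. The function names and the `k8s.conf` file names are assumptions; any `.conf` name under those directories should work.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that print persistent config for the settings above.

# Kernel modules to load at boot, one per line.
render_k8s_modules() {
  printf '%s\n' br_netfilter
}

# sysctl settings equivalent to the two echo commands above.
render_k8s_sysctl() {
  printf '%s\n' \
    'net.ipv4.ip_forward = 1' \
    'net.bridge.bridge-nf-call-iptables = 1'
}

# Usage (commented out because it writes system files):
#   render_k8s_modules | sudo tee /etc/modules-load.d/k8s.conf
#   render_k8s_sysctl  | sudo tee /etc/sysctl.d/k8s.conf
#   sudo modprobe br_netfilter && sudo sysctl --system
```

After a reboot, `sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables` should report 1 for both keys if the files were picked up.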
sudo hostnamectl set-hostname master
sudo reboot
Change the hostname to make node management and communication easier.
sudo kubeadm init
# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Deploy the pod network
curl -LO https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz
cilium install
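Nodes normally report NotReady until a pod network is running, so it helps to confirm the result after `cilium install`. Here is a small sketch that counts nodes that are not yet Ready from `kubectl get nodes --no-headers` output. `count_not_ready` is a hypothetical helper name, and it assumes the default column layout where STATUS is the second column.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` on stdin
# and prints how many nodes have a STATUS that does not begin with
# "Ready" (so "Ready,SchedulingDisabled" still counts as ready).
count_not_ready() {
  awk '$2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Usage:
#   kubectl get nodes --no-headers | count_not_ready
```

A result of 0 means every node is Ready. The Cilium CLI also provides `cilium status --wait`, which blocks until the Cilium components report ready.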
# swapoff
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# netfilter bridge configure
sudo -i
modprobe br_netfilter
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
exit
# master node
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Unable to connect to the server: tls: failed to verify certificate: x509: certificate is valid for
This can be resolved as shown below. On macOS or other operating systems, refer to the link.
export KUBECONFIG=/etc/kubernetes/kubelet.conf
Error from server (Forbidden): namespaces is forbidden:
User "system:node:master.example.com" cannot list resource "namespaces" in API group "" at the cluster scope
This happens because the current kubectl context is not the admin context; it can be resolved as follows.
# config view
kubectl config view
# Export the admin config
export KUBECONFIG=/etc/kubernetes/admin.conf
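The Forbidden error above shows what goes wrong when KUBECONFIG points at kubelet.conf: kubectl then acts as the node identity (`system:node:...`) instead of the cluster admin. As a sketch of one way to avoid exporting the wrong file by hand, the helper below picks the first readable kubeconfig from a list of candidates. `select_kubeconfig` is a hypothetical name, and the candidate order is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first readable file among the
# arguments, or fail if none of them is readable.
select_kubeconfig() {
  local candidate
  for candidate in "$@"; do
    if [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage:
#   export KUBECONFIG="$(select_kubeconfig "$HOME/.kube/config" /etc/kubernetes/admin.conf)"
```

Listing `$HOME/.kube/config` first prefers the per-user copy made during setup, and admin.conf (readable only by root by default) serves as the fallback.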
$ kubectl get pod
The connection to the server 192.168.75.128:6443 was refused - did you specify the right host or port?
The error above can be resolved by turning swap off and configuring the netfilter bridge.
GPG error: https://packages.cloud.google.com/apt kubernetes-xenial InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY
To resolve this error, download the public key again with the commands below.
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://dl.k8s.io/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update -y
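The apt.kubernetes.io repository used above has since been deprecated in favor of the community-owned pkgs.k8s.io repositories, so the commands above may fail on a fresh machine. The following sketch builds the newer source line. The minor version (`v1.29` in the usage lines) is an assumption to replace with the release you want, and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Keyring location used in the upstream pkgs.k8s.io instructions.
KEYRING=/etc/apt/keyrings/kubernetes-apt-keyring.gpg

# Print the repository base URL for a Kubernetes minor version.
# $1: minor version such as v1.29 (an assumption; choose your own).
k8s_repo_url() {
  printf 'https://pkgs.k8s.io/core:/stable:/%s/deb/' "$1"
}

# Print the apt source line for that minor version.
k8s_apt_source_line() {
  printf 'deb [signed-by=%s] %s /\n' "$KEYRING" "$(k8s_repo_url "$1")"
}

# Usage (commented out because it changes system files):
#   sudo mkdir -p -m 755 /etc/apt/keyrings
#   curl -fsSL "$(k8s_repo_url v1.29)Release.key" | sudo gpg --dearmor -o "$KEYRING"
#   k8s_apt_source_line v1.29 | sudo tee /etc/apt/sources.list.d/kubernetes.list
#   sudo apt-get update
```

Each minor version has its own repository, so upgrading to a newer minor release means regenerating the source line with the new version string.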
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
Reset kubeadm, reconfigure the netfilter bridge, and then run join again to resolve this.
sudo kubeadm reset
accepts at most 1 arg(s), received 3
To see the stack trace of this error execute with --v=5 or higher
Issue a new token on the master node and run the join command with administrator privileges to resolve this.
sudo kubeadm token create --print-join-command
[preflight] Running pre-flight checks
When join is run as above, it hangs at the preflight check.
Disable the firewall or allow port 6443:
sudo ufw disable
sudo ufw allow 6443
Then, among the VM's several Ethernet adapters (eth1), set the kubeadm apiserver address to the IP of enp08s to resolve the issue.
sudo kubeadm init --apiserver-advertise-address=<enp08s IPv4 addr>
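To avoid copying the address by hand, the IPv4 address of an adapter can be read from `ip -4 -o addr show`. This is a minimal sketch. `parse_ipv4` and `iface_ipv4` are hypothetical helper names, and the interface name is a parameter you supply.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for looking up an interface's IPv4 address.

# Extract the first IPv4 address from `ip -4 -o addr show` output on
# stdin. In the one-line (-o) format the third field is "inet" and the
# fourth is ADDRESS/PREFIX.
parse_ipv4() {
  awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Print the IPv4 address of the interface named in $1.
iface_ipv4() {
  ip -4 -o addr show dev "$1" | parse_ipv4
}

# Usage:
#   sudo kubeadm init --apiserver-advertise-address="$(iface_ipv4 <interface>)"
```

Running `ip -4 -o addr show` without a device lists every adapter, which helps identify the one on the network shared with the worker nodes.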
https://github.com/kubernetes/kubernetes/issues/90345
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             2 errors, 2 warnings
 \__/¯¯\__/    Operator:           1 errors
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled
Deployment             cilium-operator    Desired: 1, Unavailable: 1/1
DaemonSet              cilium             Desired: 3, Unavailable: 3/3
Containers:            cilium             Running: 1, Pending: 2
                       cilium-operator    Running: 1
Cluster Pods:          0/0 managed by Cilium
Helm chart version:    1.14.2
Image versions         cilium             quay.io/cilium/cilium:v1.14.2@sha256:6263f3a3d5d63b267b538298dbeb5ae87da3efacf09a2c620446c873ba807d35: 3
                       cilium-operator    quay.io/cilium/operator-generic:v1.14.2@sha256:52f70250dea22e506959439a7c4ea31b10fe8375db62f5c27ab746e3a2af866d: 1
Errors:                cilium-operator    cilium-operator    1 pods of Deployment cilium-operator are not ready
                       cilium             cilium             3 pods of DaemonSet cilium are not ready
                       cilium             cilium-g7nr6       unable to retrieve cilium status: command terminated with exit code 1
Warnings:              cilium             cilium-qlx45       pod is pending
                       cilium             cilium-t8q2f       pod is pending
The Cilium version and k8sServiceHost are the suspected cause; the commands below resolve the problem.
# Run sysdump
cilium sysdump
# Uninstall, then reinstall with an explicit version and k8sServiceHost
cilium uninstall
cilium install --version "v1.13.0-rc5" \
--helm-set kubeProxyReplacement=strict \
--helm-set k8sServiceHost=<current k8s host IP address> \
--helm-set k8sServicePort=6443 \
--helm-set ingressController.enabled=true \
--helm-set ingressController.loadbalancerMode=shared \
--helm-set bgpControlPlane.enabled=true
Errors:
cilium cilium-ccrgh unable to retrieve cilium status: unable to upgrade connection: pod does not exist
cilium cilium-r8sz9 unable to retrieve cilium status: unable to upgrade connection: pod does not exist
Add the following line to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
Environment="KUBELET_EXTRA_ARGS=--node-ip=_YOUR_NODE_IP"
Afterwards, delete all nodes with kubectl delete cn --all (cn is the short name for CiliumNode objects) and reinstall Cilium to resolve the problem.
https://github.com/cilium/cilium/issues/18670
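A change to a systemd drop-in such as 10-kubeadm.conf only takes effect after systemd reloads its units and the kubelet restarts, a step the notes above leave implicit. The sketch below renders the Environment line for a given node IP and lists the reload commands. `kubelet_node_ip_line` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the drop-in line that passes --node-ip
# to the kubelet. $1 is the node's IP address.
kubelet_node_ip_line() {
  printf 'Environment="KUBELET_EXTRA_ARGS=--node-ip=%s"\n' "$1"
}

# After adding the printed line to the drop-in file, reload and restart
# (commented out because it affects running services):
#   sudo systemctl daemon-reload
#   sudo systemctl restart kubelet
```

Once the kubelet is back, `kubectl get nodes -o wide` shows the INTERNAL-IP column, which should match the address you passed in.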