Fixing a Kubernetes Certificate Expiration Issue

sang yun Lee · September 20, 2024


- Issue

Today, when I tried to run a Kubernetes command, I got the error below.

$ kubectl get node
E0920 22:26:27.495184   25796 memcache.go:265] couldn't get current server API group list: Get "": dial tcp connect: connection refused
The connection to the server was refused - did you specify the right host or port?

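My first guess was that kube-apiserver was down, but journalctl had no entries for it. On a kubeadm cluster kube-apiserver runs as a static pod rather than a systemd unit, so this empty output is expected and says nothing about its state.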

$ sudo journalctl -u kube-apiserver
-- Logs begin at Sun 2024-09-15 20:54:44 KST, end at Fri 2024-09-20 22:29:42 KST. --
-- No entries --

Checking the container instead, the logs below suggest that the certificates had expired, so kube-apiserver could not run normally and kept exiting.

$ sudo crictl ps -a | grep kube-apiserver
a9ab410eded16       19b9246d37c8b       About a minute ago   Exited              kube-apiserver              811                 a917030e3d193       kube-apiserver-com

$ sudo crictl logs a9ab410eded16
W0920 13:31:21.680393       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
  "Addr": "",
  "ServerName": "",
  "Attributes": null,
  "BalancerAttributes": null,
  "Type": 0,
  "Metadata": null
}. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-09-20T13:31:21Z is after 2024-09-20T03:21:59Z"
E0920 13:31:24.410777       1 run.go:74] "command failed" err="context deadline exceeded"

$ openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt
notBefore=Sep 21 03:16:57 2023 GMT
notAfter=Sep 20 03:21:58 2024 GMT

$ sudo kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

admin.conf                        Sep 20, 2024 03:22 UTC   <invalid>       ca                      no      
apiserver                         Sep 20, 2025 13:39 UTC   364d            ca                      no      
!MISSING! apiserver-etcd-client                                                                    
apiserver-kubelet-client          Sep 20, 2024 03:21 UTC   <invalid>       ca                      no      
controller-manager.conf           Sep 20, 2024 03:22 UTC   <invalid>       ca                      no      
etcd-healthcheck-client           Sep 20, 2024 03:22 UTC   <invalid>       etcd-ca                 no      
etcd-peer                         Sep 20, 2024 03:22 UTC   <invalid>       etcd-ca                 no      
etcd-server                       Sep 20, 2024 03:21 UTC   <invalid>       etcd-ca                 no      
front-proxy-client                Sep 20, 2024 03:21 UTC   <invalid>       front-proxy-ca          no      
scheduler.conf                    Sep 20, 2024 03:22 UTC   <invalid>       ca                      no      

ca                      Sep 18, 2033 03:21 UTC   8y              no      
etcd-ca                 Sep 18, 2033 03:21 UTC   8y              no      
front-proxy-ca          Sep 18, 2033 03:21 UTC   8y              no
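
The openssl check above looks at a single file. As a rough sketch, the same check can be wrapped in a helper and run over every certificate in the directory. The function names cert_status and check_pki_dir are made up for this example, and the date parsing assumes GNU date (date -d).

```shell
#!/usr/bin/env bash
# cert_status NOT_AFTER [NOW_EPOCH]
# NOT_AFTER is the text openssl prints after "notAfter=".
# NOW_EPOCH defaults to the current time. Assumes GNU date.
cert_status() {
  local not_after="$1"
  local now="${2:-$(date -u +%s)}"
  local expiry
  expiry="$(date -u -d "$not_after" +%s)" || return 1
  if [ "$expiry" -le "$now" ]; then
    echo "expired"
  else
    echo "valid for $(( (expiry - now) / 86400 )) more days"
  fi
}

# check_pki_dir [DIR]
# Hypothetical helper: print the status of every *.crt file under DIR.
check_pki_dir() {
  local dir="${1:-/etc/kubernetes/pki}" crt end
  find "$dir" -name '*.crt' | sort | while read -r crt; do
    end="$(openssl x509 -noout -enddate -in "$crt" | cut -d= -f2)"
    printf '%s: %s\n' "$crt" "$(cert_status "$end")"
  done
}
```

Calling check_pki_dir with no argument scans /etc/kubernetes/pki, including the etcd subdirectory. It does not cover the client certificates embedded in the kubeconfig files, which kubeadm certs check-expiration lists separately.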

- Resolution

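# Generate certificates (as the output shows, files already on disk are reused and only missing ones are generated)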
$ sudo kubeadm init phase certs all
I0920 22:56:40.672223   30625 version.go:256] remote version is much newer: v1.31.0; falling back to: stable-1.28
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Using the existing "sa" key

# Move the old files aside (kubeadm init phase certs reuses any certificate it finds on disk, so a file is only regenerated if it is missing when the command runs)
$ sudo mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.bak
$ sudo mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.bak
$ sudo mv /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/apiserver-etcd-client.crt.bak
$ sudo mv /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.key.bak
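
Renaming files one by one is easy to get wrong. An alternative is to back up the whole directory first and then let kubeadm renew everything in one go. This is not what was run in this post. The backup_dir helper and the /root/k8s-backup path are made up for illustration, while kubeadm certs renew all is the renewal subcommand kubeadm itself provides.

```shell
#!/usr/bin/env bash
# backup_dir SRC DEST_PARENT
# Copy SRC to DEST_PARENT/<basename of SRC>.<timestamp> and print the new path.
# Illustrative helper, not part of kubeadm.
backup_dir() {
  local src="$1" parent="$2" dest
  dest="$parent/$(basename "$src").$(date +%Y%m%d%H%M%S)"
  mkdir -p "$parent" && cp -a "$src" "$dest" && echo "$dest"
}

# Intended use on a control-plane node, as root (assumed, untested here):
#   backup_dir /etc/kubernetes /root/k8s-backup
#   kubeadm certs renew all
#   kubeadm certs check-expiration
```

Unlike kubeadm init phase certs all, the renew subcommand replaces existing certificates regardless of their expiry date, and it also renews the client certificates embedded in admin.conf, controller-manager.conf and scheduler.conf.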

# It still did not work, so I tried a bunch of things... I am not sure whether this command actually helped.
$ sudo kubeadm init phase kubeconfig all

# Copy the config
$ sudo cp /etc/kubernetes/admin.conf /home/[username]/.kube/config
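
One thing to watch with the copy above: a file written by sudo cp can end up owned by root, which is why the kubeadm init output suggests following it with sudo chown $(id -u):$(id -g) $HOME/.kube/config. A small sketch that avoids the ownership problem by writing the file as the current user is shown here. The write_kubeconfig name is made up for this example.

```shell
#!/usr/bin/env bash
# write_kubeconfig DEST
# Read a kubeconfig from stdin and write it to DEST with mode 600,
# owned by whoever runs the function. Illustrative helper.
write_kubeconfig() {
  local dest="$1"
  mkdir -p "$(dirname "$dest")" || return 1
  ( umask 077 && cat > "$dest" ) || return 1
  chmod 600 "$dest"
}

# Intended use (assumed):
#   sudo cat /etc/kubernetes/admin.conf | write_kubeconfig "$HOME/.kube/config"
```

If an earlier sudo cp already left a root-owned file at the destination, the write fails until that file is removed or its ownership is fixed.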

# Check that it works
$ kubectl get node

# How to look at the logs if it does not work
$ sudo crictl ps | grep kube-apiserver

$ sudo crictl logs a2650fe07fc7f -f
# Errors while it was still failing
E0920 14:47:44.447360       1 authentication.go:70] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z, verifying certificate SN=5751791429713888989, SKID=, AKID=95:58:3C:70:E1:8B:20:61:2B:A6:80:4E:93:52:2D:F6:D0:38:74:02 failed: x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z]"
E0920 14:47:44.448531       1 authentication.go:70] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z, verifying certificate SN=5751791429713888989, SKID=, AKID=95:58:3C:70:E1:8B:20:61:2B:A6:80:4E:93:52:2D:F6:D0:38:74:02 failed: x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z]"
# After this point it suddenly went back to normal
I0920 14:48:03.755701       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
I0920 14:48:03.760819       1 handler.go:232] Adding GroupVersion projectcalico.org v3 to ResourceManager
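
Control-plane components only read their certificates at startup, so they need a restart after the files change. The kubeadm documentation describes restarting static pods by moving their manifests out of /etc/kubernetes/manifests, waiting about 20 seconds for the kubelet to stop the pods, and moving them back. A minimal sketch of that procedure follows. The restart_static_pods function and the hold directory are made up for this example, and nothing in the post confirms that this is what caused the recovery in the log above.

```shell
#!/usr/bin/env bash
# restart_static_pods MANIFEST_DIR HOLD_DIR [WAIT_SECONDS]
# Move every *.yaml manifest to HOLD_DIR, wait, then move them back.
# HOLD_DIR should be empty (or not exist yet) and sit on the same filesystem.
restart_static_pods() {
  local manifests="$1" hold="$2" wait="${3:-20}" f
  mkdir -p "$hold" || return 1
  for f in "$manifests"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$hold"/ || return 1
  done
  # Give the kubelet time to notice the missing manifests and stop the pods.
  sleep "$wait"
  for f in "$hold"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$manifests"/ || return 1
  done
}

# Intended use on a control-plane node, as root (assumed):
#   restart_static_pods /etc/kubernetes/manifests /root/manifests-hold 20
```

This takes the whole control plane on that node down for the wait period, so it is something to run deliberately rather than in a loop.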

- Afterword

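Even after the certificates were renewed, errors kept appearing that made it look as if api-server was still using the old certificates. Restarting it did not help at first, and then it suddenly started working for no obvious reason. I still need to look into the exact sequence of events.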
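
One possible explanation, offered as a guess and not something verified in this post: the "Unable to authenticate the request" lines are logged when a client presents a certificate that fails verification, so they may point at clients (kubectl, controller-manager, scheduler) that were still using kubeconfig files with expired embedded certificates. The 03:22:01Z expiry in those errors is at least close to the 03:22 UTC that check-expiration reported for the kubeconfig files. A small sketch for inspecting the certificate embedded in a kubeconfig is shown here. The kubeconfig_client_cert name is made up, and the code assumes the file stores the certificate inline as client-certificate-data and that base64 -d is available.

```shell
#!/usr/bin/env bash
# kubeconfig_client_cert FILE
# Print the decoded value of the first client-certificate-data field in FILE.
# Illustrative helper; assumes the certificate is embedded, not referenced by path.
kubeconfig_client_cert() {
  awk '$1 == "client-certificate-data:" { print $2; exit }' "$1" | base64 -d
}

# Intended use (assumed):
#   kubeconfig_client_cert "$HOME/.kube/config" | openssl x509 -noout -enddate
#   sudo cat /etc/kubernetes/controller-manager.conf > /tmp/cm.conf   # then inspect /tmp/cm.conf
```

If the notAfter date printed for a kubeconfig is still the old one, that file was not regenerated and the component using it will keep being rejected until it is renewed and the component restarted.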
