๐Ÿณโ€๐ŸŒˆ์ฟ ๋ฒ„๋„คํ‹ฐ์Šค ์Šคํ„ฐ๋”” PKOS 5์ฃผ์ฐจ ํ”„๋กœ๋ฉ”ํ…Œ์šฐ์Šค / ๊ทธ๋ผํŒŒ๋‚˜

Burst · February 24, 2023

Monitoring

Monitoring is one of the essential parts of building and operating a service.
We monitor in order to check the overall state of a service (resource usage such as CPU, memory, and disk, plus network status), to analyze it, and to identify failures.

Goal

In this week 5 session, the plan is to study the basic Kubernetes monitoring tools, Prometheus, which is used as the standard, and Grafana, the visualization tool.

Basic Kubernetes monitoring tools

A separate metrics-server Pod is installed, and it collects the Pod resource metrics that are gathered through cAdvisor.

With kops, which this study uses, the metrics server can be enabled by adding it to the cluster spec.

kops edit cluster
-----
spec:
  certManager:
    enabled: true
  awsLoadBalancerController:
    enabled: true
  externalDns:
    provider: external-dns
  # [metrics server settings]
  metricsServer:
    enabled: true
  kubeProxy:
    metricsBindAddress: 0.0.0.0
-----
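
Saving the edited spec does not change the running cluster by itself; the change still has to be applied. A minimal sketch of that step, assuming the kops state store is already configured in the environment and the cluster name is in $KOPS_CLUSTER_NAME (the variable used later in this post). apply_metrics_server is a hypothetical helper name, not a kops command, and the rolling update may be unnecessary if kops reports no changes that require it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the edited cluster spec, roll the nodes if needed,
# then confirm that the metrics API served by metrics-server is registered.
apply_metrics_server() {
  local cluster="$1"
  kops update cluster --name "$cluster" --yes &&
    kops rolling-update cluster --name "$cluster" --yes &&
    kubectl get apiservice v1beta1.metrics.k8s.io
}
```

Usage: apply_metrics_server "$KOPS_CLUSTER_NAME". Once the APIService reports as available, kubectl top should start returning data.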
  • top
    • Similar to the Linux top command; shows the CPU and memory usage of the cluster's nodes and pods
    • kubectl top nodes, kubectl top pods

      The --sort-by option sorts the output by cpu or memory, highest usage first
  • df-pv
    • Shows information about Persistent Volumes (PVs)
    • Install via krew: kubectl krew install df-pv
    • Shows disk usage without having to exec into a Pod and run df -h
  • k9s
    • Terminal-based Kubernetes monitoring tool
    • Gives an overall view of the cluster, such as resource usage and the total number of Pods
    • Event messages cover at most the last hour, so anything older cannot be viewed
    • Details to be added after installing and trying it
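
The top and df-pv commands above can be wrapped into two small helpers. This is only a sketch: top_consumers and pv_usage are hypothetical names, and it assumes metrics-server is running and the df-pv plugin has already been installed through krew.

```shell
#!/usr/bin/env bash
# Show node usage, then pods in all namespaces sorted by the given
# resource ("cpu" or "memory"; defaults to cpu).
top_consumers() {
  local resource="${1:-cpu}"
  kubectl top nodes
  kubectl top pods --all-namespaces --sort-by="$resource"
}

# Show capacity and usage of PersistentVolumes through the df-pv plugin
# (installed with: kubectl krew install df-pv).
pv_usage() {
  kubectl df-pv
}
```

Usage: top_consumers memory lists the heaviest memory consumers first; pv_usage replaces the exec-into-the-Pod-and-run-df routine.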

Prometheus

Monitoring in a Kubernetes environment

  • Monitoring node and container resource usage
  • Cluster monitoring
  • Application monitoring

Let's take a look at Prometheus, the de facto standard among monitoring solutions for Kubernetes.

  • Open source
    • Free to use, and used as the de facto standard
  • Service discovery
    • In Kubernetes, where Pods scale out and in dynamically, Pod creation and deletion are picked up automatically
  • Pull model
    • Instead of the push model, where an agent installed on each monitored target sends data to a central server, the central Prometheus server fetches the data from the targets itself
    • Because Pods are created and deleted dynamically in Kubernetes, the pull model is more efficient than the push model
  • Exporters for a wide range of applications
    • Provides metrics that integrate with many different applications
  • Rich label support
    • Like Kubernetes labels, Prometheus supports a variety of labels, so users can filter down to only the metrics they want to query and use
  • Its own query language (PromQL, Prometheus Query Language)
  • Uses a time-series database (TSDB)
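
To make the pull model and label filtering more concrete: a scrape target simply exposes plain-text metrics over HTTP, and Prometheus fetches them on a schedule. The sketch below works offline on a made-up sample of that text format; the metric name, labels, and values are hypothetical, and filter_by_label is only a rough stand-in for what a PromQL selector such as http_requests_total{app="web"} does (it matches on plain text, so it is not a real label parser).

```shell
#!/usr/bin/env bash
# Hypothetical sample of what a target returns on its /metrics endpoint
# (Prometheus text exposition format). Names and values are made up.
sample_metrics() {
  cat <<'EOF'
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{app="web",code="200"} 1027
http_requests_total{app="web",code="500"} 3
http_requests_total{app="api",code="200"} 412
EOF
}

# Keep only the samples of one metric that carry the given label pair.
filter_by_label() {
  local metric="$1" label="$2" value="$3"
  grep -E "^${metric}\{" | grep -F "${label}=\"${value}\""
}

sample_metrics | filter_by_label http_requests_total app web
```

Running it prints the two app="web" samples (code="200" and code="500") and drops the app="api" one, which is the kind of narrowing that labels make possible.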

[Overall architecture]

Installation

Install the Prometheus stack using helm

# ๋„ค์ž„์ŠคํŽ˜์ด์Šค ์ƒ์„ฑ
kubectl create ns monitoring

# Look up the ARN of the certificate in the region in use
CERT_ARN=`aws acm list-certificates --query 'CertificateSummaryList[].CertificateArn[]' --output text`
echo "alb.ingress.kubernetes.io/certificate-arn: $CERT_ARN"

# Add the Prometheus helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
# ์ธ์ฆ์„œ ์ •๋ณด ๋ฐ ํด๋Ÿฌ์Šคํ„ฐ๋ช…์„ ๋ณ€์ˆ˜๋กœ ์ง€์ •ํ•˜์—ฌ ์ €์žฅ
cat <<EOT > ~/monitor-values.yaml
alertmanager:
  ingress:
    enabled: true
    ingressClassName: alb

    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
      alb.ingress.kubernetes.io/certificate-arn: $CERT_ARN
      alb.ingress.kubernetes.io/success-codes: 200-399
      alb.ingress.kubernetes.io/group.name: "monitoring"

    hosts:
      - alertmanager.$KOPS_CLUSTER_NAME

    paths:
      - /*


grafana:
  defaultDashboardsTimezone: Asia/Seoul
  adminPassword: prom-operator

  ingress:
    enabled: true
    ingressClassName: alb

    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
      alb.ingress.kubernetes.io/certificate-arn: $CERT_ARN
      alb.ingress.kubernetes.io/success-codes: 200-399
      alb.ingress.kubernetes.io/group.name: "monitoring"

    hosts:
      - grafana.$KOPS_CLUSTER_NAME

    paths:
      - /*

prometheus:
  ingress:
    enabled: true
    ingressClassName: alb

    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
      alb.ingress.kubernetes.io/certificate-arn: $CERT_ARN
      alb.ingress.kubernetes.io/success-codes: 200-399
      alb.ingress.kubernetes.io/group.name: "monitoring"

    hosts:
      - prometheus.$KOPS_CLUSTER_NAME

    paths:
      - /*

  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    retention: 5d
    retentionSize: "10GiB"
EOT

# ๋ฐฐํฌ
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --version 45.0.0 -f monitor-values.yaml --namespace monitoring
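
Before opening the web UIs, it can help to confirm that the release and its resources actually came up. A minimal sketch, assuming the monitoring namespace created above; check_monitoring_stack is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List the helm release, then the Pods, Services, and Ingresses it created.
# The namespace defaults to "monitoring", as used in this post.
check_monitoring_stack() {
  local ns="${1:-monitoring}"
  helm list --namespace "$ns"
  kubectl get pods,services,ingress --namespace "$ns"
}
```

Usage: check_monitoring_stack. The Ingress entries should eventually show an ALB address once the load balancer has been provisioned.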

Once the helm chart is deployed and the load balancer has been set up, Prometheus, Grafana, and Alertmanager are all ready to use right away with very little effort. (Sweet!)

The collected metrics can be inspected and configured on the Prometheus web page, but we check the metrics through Grafana, which offers better visualization!
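
The same queries typed into the Prometheus web page can also be sent to its HTTP API, which is what Grafana does behind the scenes. A minimal sketch, assuming the Prometheus ingress host from the values file above (prometheus.$KOPS_CLUSTER_NAME) is reachable; promql_query is a hypothetical helper name, and the example expressions assume the node-exporter metrics that the stack collects.

```shell
#!/usr/bin/env bash
# Run an instant PromQL query against the Prometheus HTTP API.
#   $1: Prometheus base URL
#   $2: PromQL expression
# The response is JSON; pipe it to a JSON formatter if one is available.
promql_query() {
  local base_url="$1" expr="$2"
  curl -sG "${base_url}/api/v1/query" --data-urlencode "query=${expr}"
}
```

Usage: promql_query "https://prometheus.$KOPS_CLUSTER_NAME" 'up' shows which scrape targets are alive, and promql_query "https://prometheus.$KOPS_CLUSTER_NAME" 'sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))' gives per-node CPU usage over the last five minutes.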

On top of that, applying the many publicly shared dashboards gives you an efficient and good-looking monitoring dashboard.

[Assignments]

Assignment 1. Let's add several dashboards to Grafana!

  • Node Exporter Full dashboard applied
  • kube-state-metrics-v2 dashboard applied
  • 1 Kubernetes All-in-one Cluster Monitoring KR dashboard applied

Assignment 2. Deploy an Nginx web pod, check its metrics in the Prometheus web UI, and add an nginx web server dashboard to Grafana

  • nginx metrics shown in the Prometheus web UI

  • Checking the Prometheus targets
  • Dashboard 12708 (Nginx web) applied
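
The post does not record how the Nginx pod was deployed, so the following is only one possible way to do it, not necessarily the one used here. It assumes the Bitnami nginx chart with its metrics exporter and ServiceMonitor options turned on; deploy_nginx_with_metrics and the default file name nginx-values.yaml are hypothetical. The ServiceMonitor can be picked up without extra labels because the values file above sets serviceMonitorSelectorNilUsesHelmValues: false.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a small values file that enables the nginx
# metrics exporter plus a ServiceMonitor in the monitoring namespace,
# then install the Bitnami nginx chart with it.
deploy_nginx_with_metrics() {
  local values_file="${1:-nginx-values.yaml}"
  printf '%s\n' \
    'metrics:' \
    '  enabled: true' \
    '  serviceMonitor:' \
    '    enabled: true' \
    '    namespace: monitoring' > "$values_file"
  helm repo add bitnami https://charts.bitnami.com/bitnami &&
    helm repo update &&
    helm install nginx bitnami/nginx -f "$values_file"
}
```

Usage: deploy_nginx_with_metrics. After the next scrape, the nginx target should appear on the Prometheus targets page, and a metric such as nginx_connections_active should be queryable.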

Wrap-up

In this session we studied the basic monitoring tools for Kubernetes and Prometheus, which is used as the standard.
Prometheus is typically integrated with Grafana and Alertmanager, which makes it easy to set up visualization, alerting, and so on.
Installing the Prometheus stack also set everything up in one go, which made it convenient to use.
Next time, the plan is to study Alertmanager and logging systems.
