Organizing EFK, Day 6 - EFK DaemonSet

놀고 싶은데, 왜 다들 공부하는거야 · April 4, 2025


Fluentd daemonset

Before deploying fluentd as a DaemonSet, there is one thing to check. On Kubernetes, fluentd runs as a DaemonSet with one pod per node, and each pod mounts the node's /var/lib/docker/containers directory as a hostPath volume.

That path is mounted because Docker writes container log files there by default. If the container log files on your nodes are stored somewhere else, you have to change either the DaemonSet volumes or the Docker configuration file.

vi /etc/docker/daemon.json

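This is the Docker daemon configuration file on the host. If data-root is already set in it, Docker stores the container log files under <data-root>/containers instead of /var/lib/docker/containers. In that case, either point the DaemonSet volume at that directory, or edit daemon.json so that the logs end up where the volume expects them. To have the logs written to /var/lib/docker/containers again, set data-root back to /var/lib/docker, or simply delete the data-root entry since that path is the default, and then restart Docker.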

systemctl daemon-reload
systemctl restart docker
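
To double-check which directory the DaemonSet volume should point at, a small helper like the following may be useful. It is only a sketch: the function name container_log_dir is made up for this post, and it assumes the default json-file logging driver, which keeps logs under the containers directory of the Docker root.

```shell
# Print the directory where Docker keeps container log files,
# given the Docker root directory (data-root).
# Falls back to the default root /var/lib/docker when no argument is given.
container_log_dir() {
  local root="${1:-/var/lib/docker}"
  # Strip one trailing slash so we never print a double slash.
  printf '%s/containers\n' "${root%/}"
}

# Usage on a node with Docker installed (not executed here):
#   container_log_dir "$(docker info --format '{{ .DockerRootDir }}')"
```

If the printed path is not /var/lib/docker/containers, that is the path to use for the hostPath volume.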

Before starting fluentd, bring up elasticsearch and kibana.

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.17.14
docker run  -p 5601:5601 -e "ELASTICSEARCH_HOSTS=http://${elasticsearch-host}:9200" docker.elastic.co/kibana/kibana:7.17.14

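The elasticsearch container publishes port 9200 (HTTP API) and port 9300 (transport) on the host. Kibana talks to the HTTP API on 9200, so ELASTICSEARCH_HOSTS must contain the IP of the host where elasticsearch runs. Note that ${elasticsearch-host} in the command above is a placeholder to replace by hand, not a shell variable. The value to define is:
1. elasticsearch-host: the host where the elasticsearch container is deployed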
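
Before wiring kibana and fluentd to it, it can help to confirm that elasticsearch answers on the HTTP port. The sketch below only builds the URL of the cluster health API; es_health_url is a hypothetical helper name, and 192.0.2.10 in the usage comment is a documentation address standing in for your own host.

```shell
# Build the URL of the Elasticsearch cluster health endpoint.
# $1 = host (required), $2 = HTTP port (defaults to 9200).
es_health_url() {
  local host="$1" port="${2:-9200}"
  printf 'http://%s:%s/_cluster/health?pretty\n' "$host" "$port"
}

# Usage (not executed here; replace the address with your elasticsearch host):
#   curl -s "$(es_health_url 192.0.2.10)"
```

A JSON response with a status field indicates that the HTTP API is reachable from where the command is run.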

Now fluentd can be deployed as a DaemonSet. In short, a DaemonSet runs one pod on every node of the Kubernetes cluster. Specific nodes can also be excluded so that no pod is created on them.

With fluentd running as a DaemonSet, the fluentd pod on each node collects the logs of the pods running on that node and ships them to elasticsearch or to a configured path.
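
The "one pod per node" property can be checked once the DaemonSet is running. The following is a rough sketch: pods_per_node is a made-up helper that counts how often each node name appears on standard input, and the kubectl command in the comment assumes the fluentd namespace and the k8s-app=fluentd-logging label used later in this post.

```shell
# Read one node name per line on stdin and print "<node>=<pod count>".
pods_per_node() {
  sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl -n fluentd get pods -l k8s-app=fluentd-logging \
#     -o custom-columns=NODE:.spec.nodeName --no-headers | pods_per_node
```

Every node that should run fluentd is expected to show a count of 1; a missing node usually points at a taint without a matching toleration.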

                                                     |--elasticsearch--|
-----------------master node-----------------        |                 |
|                                           |        |                 |
|   --pod--  --pod--    ---fluentd---       |        |  |------------| |
|   |     |  |     |    |           |       | -log-> |  |unified logs| |
|   -------  -------    -------------       |        |  |------------| |
---------------------------------------------        |                 |
                                                     |                 |
-----------------worker node-----------------        |                 |
|                                           |        |                 |
|   --pod--  --pod--    ---fluentd---       |        |                 |
|   |     |  |     |    |           |       | -log-> |                 |
|   -------  -------    -------------       |        |                 |
---------------------------------------------        |                 |
                                                     |                 |
-----------------worker node-----------------        |                 |
|                                           |        |                 |
|   --pod--  --pod--    ---fluentd---       |        |                 |
|   |     |  |     |    |           |       | -log-> |                 |
|   -------  -------    -------------       |        |                 |
---------------------------------------------        |                 |
                                                     |-----------------|

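In other words, pod logs are stored under /var/log on each node, and fluentd reads them and sends them to elasticsearch. Note that Logstash itself is not involved: logstash_format true only makes fluentd write to Logstash-style indices named logstash-YYYY.MM.DD, which is how the logs from every node end up in one unified set of indices.

First, create the ConfigMap that the fluentd DaemonSet will use. The ConfigMap is how fluent.conf is passed to the fluentd DaemonSet.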

config

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: fluentd
data:
  fluent.conf: |- 
    <source>
      @type tail
      @id in_tail_container_logs
      path "/var/log/containers/*_fluentd_*.log"
      exclude_path ["/var/log/containers/kube-apiserver*"]
      pos_file "/var/log/fluentd-containers.log.pos"
      tag "**"
      read_from_head true
      format json
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      <parse>
        time_format %Y-%m-%dT%H:%M:%S.%NZ
        @type json
        time_type string
      </parse>
    </source>
    <filter **>
      @type record_transformer
      <record>
        tag ${tag}
      </record>
    </filter>
    <match **>
      @type elasticsearch
      host ${elasticsearch-host}
      port ${elasticsearch-port}
      logstash_format true
      index_name logstash
      type_name fluentd
    </match>

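fluent.conf needs the IP and port of the host where elasticsearch is deployed. The values to define are:
1. elasticsearch-host: the IP of the host where elasticsearch is deployed
2. elasticsearch-port: the port on that host where elasticsearch listens

The source is the set of log files in /var/log/containers. This directory is used as the source path because its file names make it easy to tell logs apart by pod name and namespace: they follow the form <pod-name>_<namespace>_<container-name>-<container-id>.log, so the *_fluentd_*.log pattern in the config above matches only pods in the fluentd namespace and has to be widened to collect from other namespaces. There is one thing to watch out for, though. Go to /var/log/containers and run ls -al.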
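
Since ${elasticsearch-host} and ${elasticsearch-port} are placeholders rather than something fluentd substitutes on its own, one way to fill them in is a small sed filter. This is a sketch under that assumption; render_fluent_conf and the file names in the usage comment are hypothetical.

```shell
# Replace the ${elasticsearch-host} and ${elasticsearch-port} placeholders
# in the text read from stdin. $1 = host, $2 = port.
render_fluent_conf() {
  local host="$1" port="$2"
  # [$] matches a literal dollar sign; braces are literal in basic regex.
  sed -e "s|[\$]{elasticsearch-host}|${host}|g" \
      -e "s|[\$]{elasticsearch-port}|${port}|g"
}

# Usage (file names are examples only, not executed here):
#   render_fluent_conf 192.0.2.10 9200 < fluentd-config.tmpl.yaml > fluentd-config.yaml
```

Values containing a | character would break the sed expression, which is acceptable for IP addresses and port numbers.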

ls -al

lrwxrwxrwx  1 root root     121 10월 24 15:37 pod-2_default_update-desired-replicas-e7441a7abf695757a1766956acb668f3b3176dd147019d7f18684515e57fddc9.log -> /var/log/pods/pod-2_42de12ae-5944-4610-9f9d-3a813fa64425/update-desired-replicas/0.log

As the listing shows, these files are symbolic links. The fluentd DaemonSet therefore also has to mount the directory that holds the link targets. Otherwise the container sees only dangling links, cannot read the original files, and no logs are collected.
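
The file name in the listing also shows how pod name, namespace, and container name are encoded. The sketch below splits such a name with plain parameter expansion; parse_container_log_name is a made-up helper, and it assumes the <pod>_<namespace>_<container>-<container-id>.log layout seen in the listing.

```shell
# Split a /var/log/containers file name into "pod namespace container".
# Relies on pod, namespace and container names never containing "_".
parse_container_log_name() {
  local base="${1##*/}"                 # drop any leading directory
  base="${base%.log}"                   # drop the .log extension
  local pod="${base%%_*}"               # text before the first "_"
  local rest="${base#*_}"               # text after the first "_"
  local namespace="${rest%%_*}"         # text before the next "_"
  local container_and_id="${rest#*_}"   # "<container>-<container-id>"
  local container="${container_and_id%-*}"  # drop the trailing "-<id>"
  printf '%s %s %s\n' "$pod" "$namespace" "$container"
}
```

Applied to the file from the listing, this yields pod-2, default, and update-desired-replicas.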

[warn]: #0 [in_tail_container_logs] /var/log/containers/kube-proxy-v476t_kube-system_kube-proxy-76fa0d2039eccfb278edb824780e1c12c20b4e1ff985894a83d0ce557e5a6f85.log unreadable. It is excluded and would be examined next time.

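Most warnings like this happen because only the directory containing the symbolic links was mounted, while the directory with the link targets was not, so the container cannot reach the original files.

Note: for Kubernetes pod logs, it is better to read from /var/log/pods than from /var/log/containers. /var/log/containers picks up container logs in addition to Kubernetes logs, and because everything there is a symbolic link, it can cause unexpected problems.

Next, fluentd needs RBAC role bindings so that it can access the Kubernetes cluster.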
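
To see which links fluentd would be unable to follow, something like the following can be run inside the fluentd container. It is a minimal sketch; list_unreadable_links is a hypothetical helper, not part of fluentd.

```shell
# Print every *.log symlink in a directory whose target is missing or
# unreadable from the current environment, as "<link> -> <target>".
list_unreadable_links() {
  local dir="$1" f
  for f in "$dir"/*.log; do
    # -L: the entry is a symlink; ! -r: the target cannot be read from here.
    if [ -L "$f" ] && [ ! -r "$f" ]; then
      printf '%s -> %s\n' "$f" "$(readlink "$f")"
    fi
  done
}

# Usage inside the fluentd container (not executed here):
#   list_unreadable_links /var/log/containers
```

The directories that appear on the right-hand side are the ones that still need to be mounted as volumes.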

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: fluentd

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: fluentd
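
After applying these objects, the binding can be verified by impersonating the service account. The sketch below only builds the impersonation subject; sa_subject is a made-up helper, and the kubectl commands in the comment assume the fluentd namespace and service account names from the manifests above.

```shell
# Build the user name Kubernetes uses for a service account.
# $1 = namespace, $2 = service account name.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Usage against a live cluster (not executed here):
#   kubectl auth can-i list pods --all-namespaces --as="$(sa_subject fluentd fluentd)"
#   kubectl auth can-i watch namespaces --as="$(sa_subject fluentd fluentd)"
```

Both commands should answer yes once the ClusterRoleBinding is in place.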

Finally, create the fluentd DaemonSet.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: fluentd
  labels:
    k8s-app: fluentd-logging
    version: v1
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
      version: v1
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
          - name: K8S_NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          #- name:  FLUENT_ELASTICSEARCH_HOST
          #  value: "xx.xxx.xxx.xxx"
          #- name:  FLUENT_ELASTICSEARCH_PORT
          #  value: "xxxx"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          # Option to configure elasticsearch plugin with self signed certs
          # ================================================================
          - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
            value: "true"
          # Option to configure elasticsearch plugin with tls
          # ================================================================
          - name: FLUENT_ELASTICSEARCH_SSL_VERSION
            value: "TLSv1_2"
          # X-Pack Authentication
          # =====================
          - name: FLUENT_ELASTICSEARCH_USER
            value: "elastic"
          - name: FLUENT_ELASTICSEARCH_PASSWORD
            value: "changeme"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainerlogdirectory2
          mountPath: /var/lib/docker/containers
          readOnly: true
        # When actual pod logs in /var/log/pods, the following lines should be used.
        - name: dockercontainerlogdirectory
          mountPath: /var/log/pods
          readOnly: true
        - name: config
          mountPath: /fluentd/etc/fluent.conf
          subPath: fluent.conf
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      # When actual pod logs in /var/lib/docker/containers, the following lines should be used.
      # - name: dockercontainerlogdirectory
      #   hostPath:
      #     path: /var/lib/docker/containers
      # When actual pod logs in /var/log/pods, the following lines should be used.
      - name: dockercontainerlogdirectory
        hostPath:
          path: /var/log/pods
      - name: dockercontainerlogdirectory2
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluentd-config

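This connects the fluentd DaemonSet to elasticsearch. The env section has environment variables such as FLUENT_ELASTICSEARCH_HOST, but because I wrote everything into the ConfigMap, they have no practical effect here. The elasticsearch-host and elasticsearch-port values in the fluent.conf defined in the ConfigMap above take their place.

Putting everything into a single file gives the following.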

fluentd-elasticsearch.yaml

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: fluentd
data:
  fluent.conf: |- 
    <source>
      @type tail
      @id in_tail_container_logs
      path "/var/log/containers/*_fluentd_*.log"
      exclude_path ["/var/log/containers/kube-apiserver*"]
      pos_file "/var/log/fluentd-containers.log.pos"
      tag "**"
      read_from_head true
      format json
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      <parse>
        time_format %Y-%m-%dT%H:%M:%S.%NZ
        @type json
        time_type string
      </parse>
    </source>
    <filter **>
      @type record_transformer
      <record>
        tag ${tag}
      </record>
    </filter>
    <match **>
      @type elasticsearch
      host ${elasticsearch-host}
      port ${elasticsearch-port}
      logstash_format true
      index_name logstash
      type_name fluentd
    </match>

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: fluentd

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: fluentd
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: fluentd
  labels:
    k8s-app: fluentd-logging
    version: v1
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
      version: v1
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
          - name: K8S_NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          #- name:  FLUENT_ELASTICSEARCH_HOST
          #  value: "xxx.xxx.xxx.xxx"
          #- name:  FLUENT_ELASTICSEARCH_PORT
          #  value: "xxxx"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          # Option to configure elasticsearch plugin with self signed certs
          # ================================================================
          - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
            value: "true"
          # Option to configure elasticsearch plugin with tls
          # ================================================================
          - name: FLUENT_ELASTICSEARCH_SSL_VERSION
            value: "TLSv1_2"
          # X-Pack Authentication
          # =====================
          - name: FLUENT_ELASTICSEARCH_USER
            value: "elastic"
          - name: FLUENT_ELASTICSEARCH_PASSWORD
            value: "changeme"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainerlogdirectory2
          mountPath: /var/lib/docker/containers
          readOnly: true
        # When actual pod logs in /var/log/pods, the following lines should be used.
        - name: dockercontainerlogdirectory
          mountPath: /var/log/pods
          readOnly: true
        - name: config
          mountPath: /fluentd/etc/fluent.conf
          subPath: fluent.conf
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      # When actual pod logs in /var/lib/docker/containers, the following lines should be used.
      # - name: dockercontainerlogdirectory
      #   hostPath:
      #     path: /var/lib/docker/containers
      # When actual pod logs in /var/log/pods, the following lines should be used.
      - name: dockercontainerlogdirectory
        hostPath:
          path: /var/log/pods
      - name: dockercontainerlogdirectory2
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluentd-config

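Now let's run fluentd. The manifests assume that the fluentd namespace already exists; if it does not, create it first with kubectl create namespace fluentd.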

kubectl create -f fluentd-elasticsearch.yaml
configmap/fluentd-config created
serviceaccount/fluentd created
clusterrole.rbac.authorization.k8s.io/fluentd created
clusterrolebinding.rbac.authorization.k8s.io/fluentd created
daemonset.apps/fluentd created
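
Created does not yet mean running, so it is worth comparing how many pods the DaemonSet wants with how many are ready. The following is a sketch; daemonset_ready is a hypothetical helper, while the status fields in the usage comment are the ones the DaemonSet object reports.

```shell
# Succeed when a DaemonSet has at least one desired pod and all are ready.
# $1 = desiredNumberScheduled, $2 = numberReady.
daemonset_ready() {
  [ -n "$1" ] && [ "$1" -gt 0 ] && [ "$1" = "$2" ]
}

# Usage against a live cluster (not executed here):
#   set -- $(kubectl -n fluentd get daemonset fluentd \
#     -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')
#   daemonset_ready "$1" "$2" && echo "fluentd is ready on all nodes"
```

kubectl -n fluentd rollout status daemonset/fluentd gives a similar answer interactively.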

Everything was created. Let's use kibana to check that the logs are being processed and collected in elasticsearch.

On the host where kibana is deployed, open the link below to reach the index pattern management page.

http://{host-ip}:5601/app/management/kibana/indexPatterns

On that page, click Create index pattern.
Next, define the index pattern. Our indices are prefixed with logstash, so enter logstash-* and select the timestamp field (@timestamp when logstash_format is used). When done, click Create index pattern.
Once it is created, a page titled logstash-* appears. Now let's use this index pattern to check that the logs of our Kubernetes cluster are being collected.
Open the menu at the top left to find the Discover tab. Click it to see the logs delivered to Elasticsearch as a histogram.
Next to Add filter on the left, the index pattern we created, logstash-*, should be selected. If it is not, switch to logstash-*.

Clicking a bar in the graph shows the logs in a table right below it.
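
The same check can be done without kibana by asking elasticsearch which indices exist. The sketch below computes the index name for a given day; logstash_index_name is a made-up helper, and it assumes the plugin defaults of a logstash prefix and a UTC-based YYYY.MM.DD date suffix.

```shell
# Print the daily index name for logstash_format true.
# $1 = prefix (default: logstash), $2 = day as YYYY-MM-DD (default: today, UTC).
logstash_index_name() {
  local prefix="${1:-logstash}" day="${2:-$(date -u +%Y-%m-%d)}"
  printf '%s-%s\n' "$prefix" "$(printf '%s' "$day" | sed 's/-/./g')"
}

# Usage (not executed here; replace <elasticsearch-host> with your host):
#   curl -s "http://<elasticsearch-host>:9200/_cat/indices/logstash-*?v"
#   curl -s "http://<elasticsearch-host>:9200/$(logstash_index_name)/_count?pretty"
```

A growing document count for today's index suggests that fluentd is delivering logs.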
