The 'curl' command only works intermittently

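I am trying to install and configure a highly available RKE2 (Rancher Kubernetes Engine 2) cluster. My architecture consists of 4 VMs: one hosts DNS and the load balancer, one is a server node that is already configured and running, and the other two are used as joining nodes.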

The logs on the agent node show the following:

    Jul 27 13:10:22 ha-rancher-2 rke2[31465]: time="2023-07-27T13:10:22Z" level=fatal msg="starting kubernetes: preparing server: failed to get CA certs: https://rancher.inwi.priv:9345/cacerts: 503 Service Unavailable"
    Jul 27 13:10:22 ha-rancher-2 systemd[1]: rke2-server.service: main process exited, code=exited, status=1/FAILURE
    Jul 27 13:10:22 ha-rancher-2 systemd[1]: Failed to start Rancher Kubernetes Engine v2 (server).
    Jul 27 13:10:22 ha-rancher-2 systemd[1]: Unit rke2-server.service entered failed state.
    Jul 27 13:10:22 ha-rancher-2 systemd[1]: rke2-server.service failed.
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: rke2-server.service holdoff time over, scheduling restart.
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: Stopped Rancher Kubernetes Engine v2 (server).
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: Starting Rancher Kubernetes Engine v2 (server)...
    Jul 27 13:10:28 ha-rancher-2 sh[31479]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
    Jul 27 13:10:28 ha-rancher-2 sh[31479]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
    Jul 27 13:10:28 ha-rancher-2 rke2[31485]: time="2023-07-27T13:10:28Z" level=warning msg="not running in CIS mode"
    Jul 27 13:10:28 ha-rancher-2 rke2[31485]: time="2023-07-27T13:10:28Z" level=info msg="Starting rke2 v1.24.15+rke2r1 (8cf3a75d5ccd6e2aa0a99cdf869426f1decd970d)"
    Jul 27 13:10:28 ha-rancher-2 rke2[31485]: time="2023-07-27T13:10:28Z" level=info msg="Managed etcd cluster not yet initialized"
    Jul 27 13:10:28 ha-rancher-2 rke2[31485]: time="2023-07-27T13:10:28Z" level=fatal msg="starting kubernetes: preparing server: failed to validate server configuration: CA cert validation failed: https://rancher.inwi.priv:9345/cacerts: 503 Service Unavailable"
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: rke2-server.service: main process exited, code=exited, status=1/FAILURE
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: Failed to start Rancher Kubernetes Engine v2 (server).
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: Unit rke2-server.service entered failed state.
    Jul 27 13:10:28 ha-rancher-2 systemd[1]: rke2-server.service failed.

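I tried to investigate the problem, but strangely the `curl` command only works intermittently, which confuses me. Two consecutive requests to the same URL return 200 and then 503: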

    [root@HA-Rancher-2 ~]# curl -vks https://rancher.inwi.priv:9345/cacerts
    * About to connect() to rancher.inwi.priv port 9345 (#0)
    *   Trying 172.20.10.210...
    * Connected to rancher.inwi.priv (172.20.10.210) port 9345 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * skipping SSL peer certificate verification
    * NSS: client certificate not found (nickname not specified)
    * SSL connection using TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
    * Server certificate:
    *       subject: CN=rke2,O=rke2
    *       start date: Jul 25 19:12:23 2023 GMT
    *       expire date: Jul 25 22:54:08 2024 GMT
    *       common name: rke2
    *       issuer: CN=rke2-server-ca@1690312343
    > GET /cacerts HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: rancher.inwi.priv:9345
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Content-Type: text/plain
    < Date: Thu, 27 Jul 2023 14:34:58 GMT
    < Content-Length: 570
    <
    -----BEGIN CERTIFICATE-----
    MIIBeTCCAR+gAwIBAgIBADAKBggqhkjOPQQDAjAkMSIwIAYDVQQDDBlya2UyLXNl
    cnZlci1jYUAxNjkwMzEyMzQzMB4XDTIzMDcyNTE5MTIyM1oXDTMzMDcyMjE5MTIy
    M1owJDEiMCAGA1UEAwwZcmtlMi1zZXJ2ZXItY2FAMTY5MDMxMjM0MzBZMBMGByqG
    SM49AgEGCCqGSM49AwEHA0IABJdeIAgxOwLhgv7IH4hloybTf...
    -----END CERTIFICATE-----
    * Connection #0 to host rancher.inwi.priv left intact
    [root@HA-Rancher-2 ~]# curl -vks https://rancher.inwi.priv:9345/cacerts
    * About to connect() to rancher.inwi.priv port 9345 (#0)
    *   Trying 172.20.10.210...
    * Connected to rancher.inwi.priv (172.20.10.210) port 9345 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * skipping SSL peer certificate verification
    * NSS: client certificate not found (nickname not specified)
    * SSL connection using TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
    * Server certificate:
    *       subject: CN=rke2,O=rke2
    *       start date: Jul 25 19:12:23 2023 GMT
    *       expire date: Jul 25 22:54:20 2024 GMT
    *       common name: rke2
    *       issuer: CN=rke2-server-ca@1690312343
    > GET /cacerts HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: rancher.inwi.priv:9345
    > Accept: */*
    >
    < HTTP/1.1 503 Service Unavailable
    < Content-Type: text/plain; charset=utf-8
    < X-Content-Type-Options: nosniff
    < Date: Thu, 27 Jul 2023 14:35:00 GMT
    < Content-Length: 9
    <
    starting
    * Connection #0 to host rancher.inwi.priv left intact
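
To put a number on how often the endpoint fails, something like the following rough sketch could be run from the joining node, assuming Python 3 is available there. `probe` and `tally` are hypothetical helper names of my own, not part of RKE2; TLS verification is skipped in the same way as `curl -k`.

```python
import ssl
import urllib.error
import urllib.request
from collections import Counter

# Endpoint taken from the curl output above.
URL = "https://rancher.inwi.priv:9345/cacerts"


def probe(url, timeout=5.0):
    """Return the HTTP status code of one GET, without TLS verification (like curl -k)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code              # e.g. 503 when the body is "starting"


def tally(fetch, url, attempts=20):
    """Call fetch(url) `attempts` times and count the status codes that come back."""
    return Counter(fetch(url) for _ in range(attempts))


# Usage on the node (performs real requests):
#   print(dict(tally(probe, URL)))
```

If the counts split roughly evenly between 200 and 503, that would be consistent with requests being spread across backends that are in different states, though this sketch alone cannot confirm that.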

On the LB VM, `netstat` shows that connections on the ports in use (e.g. port 9345) never reach the `ESTABLISHED` state; they stay in `TIME_WAIT` for more than 2 minutes.

    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    tcp        0      0 rancher:48916           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:48898           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:48958           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:56558           172.20.10.14:9345       TIME_WAIT
    tcp        0      0 rancher:48988           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:ssh             172.20.10.200:44748     ESTABLISHED
    tcp        0      0 rancher:48978           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:56568           172.20.10.14:9345       TIME_WAIT
    tcp        0      0 rancher:9345            172.20.10.13:40856      TIME_WAIT
    tcp        0      0 rancher:9345            172.20.10.13:40892      TIME_WAIT
    tcp        0      0 rancher:56538           172.20.10.14:9345       TIME_WAIT
    tcp        0      0 rancher:48924           172.20.10.11:9345       TIME_WAIT
    tcp        0      0 rancher:56526           172.20.10.14:9345       TIME_WAIT
    udp        0      0 rancher:34185           8.8.8.8:domain          ESTABLISHED
    udp        0      0 rancher:41489           8.8.4.4:domain          ESTABLISHED
    udp        0      0 rancher:47731           8.8.8.8:domain          ESTABLISHED
    udp        0      0 rancher:40760           8.8.8.8:domain          ESTABLISHED
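
To make the `netstat` listing easier to read, the port 9345 connections can be grouped by peer address and state. This is only a sketch, assuming the six-column layout shown above; `count_by_peer_and_state` is a hypothetical helper, and `SAMPLE` is a few lines copied from the output.

```python
from collections import Counter

# A few lines copied from the netstat output above.
SAMPLE = """\
tcp        0      0 rancher:48916           172.20.10.11:9345       TIME_WAIT
tcp        0      0 rancher:56558           172.20.10.14:9345       TIME_WAIT
tcp        0      0 rancher:ssh             172.20.10.200:44748     ESTABLISHED
tcp        0      0 rancher:9345            172.20.10.13:40856      TIME_WAIT
"""


def count_by_peer_and_state(netstat_text, port="9345"):
    """Count TCP connections that involve `port`, keyed by (foreign host, state)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only TCP rows with the expected six columns; skips headers and UDP.
        if len(fields) != 6 or fields[0] != "tcp":
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        local_port = local.rsplit(":", 1)[-1]
        foreign_host, foreign_port = foreign.rsplit(":", 1)
        if port in (local_port, foreign_port):
            counts[(foreign_host, state)] += 1
    return counts


for (host, state), n in sorted(count_by_peer_and_state(SAMPLE).items()):
    print(host, state, n)
```

In the full listing, the LB has outgoing connections to both 172.20.10.11 and 172.20.10.14 on port 9345, plus incoming ones from 172.20.10.13, so a per-peer breakdown like this may help show which backends the load balancer is forwarding to.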
