Ubuntu/Debian server running Docker loses its network connection

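I have spent months trying to solve this problem and I am at my wit's end. I have a home media server that uses Docker to run my containers. Everything is defined in a docker-compose file. The box itself is assigned a static IP by my network (eero, in this case). I run docker-compose up -d and leave it at that.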

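Anywhere between a day and a week later (it is inconsistent), the system loses its network connection. My current network setup is modem --> eero --> network switch --> server. The only way I can reconnect to the server is to restart it; only then does the network return to normal. I originally had this problem on Debian (on both 9 and 10), but I changed OS because a friend of mine ran Ubuntu without issues. I switched to Ubuntu Server (20) and the same problem occurs. I looked at https://github.com/moby/moby/issues/36153 as a possible root cause, but adding the suggested file did not seem to change anything.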

My next thought was that this could be a hardware issue, so I switched from the onboard Ethernet to a USB-C Ethernet adapter. That seemed to work for 3 days, but then I ran into the same problem.

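At this point I do not know how to narrow the problem down. I have looked through syslog, but nothing stands out. I have checked the container logs, and none of the containers show any problems. On Debian I was using Network Manager, whereas on Ubuntu I am using systemd-networkd. Both have this problem.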

My Ubuntu version is Ubuntu 20.04 LTS (GNU/Linux 5.4.0-37-generic x86_64)

Below is my hardware information, in case it helps.

H/W path              Device           Class          Description
=================================================================
                                       system         System Product Name (SKU)
/0                                     bus            PRIME X370-PRO
/0/0                                   memory         64KiB BIOS
/0/2c                                  memory         16GiB System Memory
/0/2c/0                                memory         8GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2133 MHz (0.5 ns)
/0/2c/1                                memory         [empty]
/0/2c/2                                memory         8GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2133 MHz (0.5 ns)
/0/2c/3                                memory         [empty]
/0/2e                                  memory         576KiB L1 cache
/0/2f                                  memory         3MiB L2 cache
/0/30                                  memory         16MiB L3 cache
/0/31                                  processor      AMD Ryzen 5 1600 Six-Core Processor
/0/100                                 bridge         Family 17h (Models 00h-0fh) Root Complex
/0/100/0.2                             generic        Family 17h (Models 00h-0fh) I/O Memory Management Unit
/0/100/1.3                             bridge         Family 17h (Models 00h-0fh) PCIe GPP Bridge
/0/100/1.3/0                           bus            X370 Series Chipset USB 3.1 xHCI Controller
/0/100/1.3/0/0        usb1             bus            xHCI Host Controller
/0/100/1.3/0/0/7                       generic        Belkin USB-C LAN
/0/100/1.3/0/1        usb2             bus            xHCI Host Controller
/0/100/1.3/0.1        scsi0            storage        X370 Series Chipset SATA Controller
/0/100/1.3/0.1/0      /dev/sda         disk           120GB SanDisk SDSSDA12
/0/100/1.3/0.1/0/1    /dev/sda1        volume         511MiB Windows FAT volume
/0/100/1.3/0.1/0/2    /dev/sda2        volume         111GiB EXT4 volume
/0/100/1.3/0.1/1      /dev/sdb         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/2      /dev/sdc         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/3      /dev/sdd         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/4      /dev/sde         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/5      /dev/sdf         disk           3TB Hitachi HUS72403
/0/100/1.3/0.2                         bridge         X370 Series Chipset PCIe Upstream Port
/0/100/1.3/0.2/0                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/2                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/3                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/4                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/4/0                     bus            ASM1142 USB 3.1 Host Controller
/0/100/1.3/0.2/4/0/0  usb3             bus            xHCI Host Controller
/0/100/1.3/0.2/4/0/1  usb4             bus            xHCI Host Controller
/0/100/1.3/0.2/6                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/6/0    enp7s0           network        I211 Gigabit Network Connection
/0/100/1.3/0.2/7                       bridge         300 Series Chipset PCIe Port
/0/100/3.2                             bridge         Family 17h (Models 00h-0fh) PCIe GPP Bridge
/0/100/3.2/0                           display        GP107 [GeForce GTX 1050]
/0/100/3.2/0.1                         multimedia     GP107GL High Definition Audio Controller
/0/100/7.1                             bridge         Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
/0/100/7.1/0                           generic        Zeppelin/Raven/Raven2 PCIe Dummy Function
/0/100/7.1/0.2                         generic        Family 17h (Models 00h-0fh) Platform Security Processor
/0/100/7.1/0.3                         bus            Family 17h (Models 00h-0fh) USB 3.0 Host Controller
/0/100/7.1/0.3/0      usb5             bus            xHCI Host Controller
/0/100/7.1/0.3/1      usb6             bus            xHCI Host Controller
/0/100/8.1                             bridge         Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
/0/100/8.1/0                           generic        Zeppelin/Renoir PCIe Dummy Function
/0/100/8.1/0.2                         storage        FCH SATA Controller [AHCI mode]
/0/100/8.1/0.3                         multimedia     Family 17h (Models 00h-0fh) HD Audio Controller
/0/100/14                              bus            FCH SMBus Controller
/0/100/14.3                            bridge         FCH LPC Bridge
/0/101                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/102                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/103                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/104                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/105                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/106                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/107                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
/0/108                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
/0/109                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
/0/10a                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
/0/10b                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
/0/10c                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
/0/10d                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
/0/10e                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
/0/1                                   system         PnP device PNP0c01
/0/2                                   system         PnP device PNP0b00
/0/3                                   system         PnP device PNP0c02
/0/4                                   communication  PnP device PNP0501
/0/5                                   system         PnP device PNP0c02
/1                    br-10d6cc4b0f64  network        Ethernet interface
/2                    veth80c7cea      network        Ethernet interface
/3                    enx302303052de3  network        Ethernet interface
/4                    vethf4fd33e      network        Ethernet interface
/5                    vethab1d028      network        Ethernet interface
/6                    vethb9ac1e0      network        Ethernet interface
/7                    veth00d454b      network        Ethernet interface
/8                    docker0          network        Ethernet interface

This is my docker-compose file. My current Docker version is Docker version 19.03.11, build dd360c7 and my docker-compose version is docker-compose version 1.26.0, build d4451659.

version: "3.7"

services:
  plex:
    image: plexinc/pms-docker
    container_name: plex
    volumes:
      - /mnt/plex/config:/config
      - /mnt/plex/Movies:/data/movies
      - /mnt/plex/Shows:/data/tvshows
      - /mnt/plex/transcode:/data/transcode
    ports:
      - 32400:32400/tcp
      - 3005:3005/tcp
      - 8324:8324/tcp
      - 32469:32469/tcp
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
      - VERSION=latest
      - TZ=America/Los_Angeles
  homebridge:
    image: oznu/homebridge:latest
    container_name: homebridge
    restart: unless-stopped
    network_mode: host
    environment:
      - TZ=America/Los_Angeles
      - PGID=1000
      - PUID=1000
      - HOMEBRIDGE_CONFIG_UI=1
      - HOMEBRIDGE_CONFIG_UI_PORT=8008
    volumes:
      - /mnt/homebridge:/homebridge
  nzbget:
    image: linuxserver/nzbget:latest
    container_name: nzbget
    volumes:
      - /mnt/nzbget/config:/config
      - /mnt/nzbget/downloads:/downloads
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 6789:6789
  sonarr:
    image: linuxserver/sonarr:latest
    container_name: sonarr
    restart: unless-stopped
    depends_on:
      - nzbget
    volumes:
      - /mnt/sonarr/config:/config
      - /mnt/nzbget/downloads:/downloads
      - /mnt/plex/Shows:/tv
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 8989:8989
  radarr:
    image: linuxserver/radarr:latest
    container_name: radarr
    restart: unless-stopped
    depends_on:
      - nzbget
    volumes:
      - /mnt/radarr/config:/config
      - /mnt/nzbget/downloads:/downloads
      - /mnt/plex/Movies:/movies
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 7878:7878
  tautulli:
    image: linuxserver/tautulli:latest
    container_name: tautulli
    depends_on:
      - plex
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - GUID=1000
    volumes:
      - /mnt/tautulli/config:/config
      - /mnt/tautulli/logs:/logs:ro
    ports:
      - 8181:8181

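If I have missed anything, let me know and I will provide more information.

Edit:

I also tried updating the Realtek driver to the latest version last night to see whether it was the culprit. This is what journalctl shows: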

Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: HC died; cleaning up
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx timeout
Jun 14 01:17:25 phoenix kernel: usb 1-7: USB disconnect, device number 2
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Get ether addr fail
Jun 14 01:17:25 phoenix systemd-networkd[933]: enx302303052de3: Link DOWN
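
One way to read an excerpt like this is to group the kernel messages by driver and find the first line where the USB host controller is declared dead. The sketch below does that for a few of the lines above; the regular expression and the function name summarize are illustrative assumptions, not part of any tool mentioned here. In this excerpt the xhci_hcd controller at 0000:01:00.0 is reported dead before the USB disconnect, which suggests the r8152 errors may be a consequence of the controller failure rather than a driver fault (Tx status -108 corresponds to ESHUTDOWN on Linux).

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> <host> kernel: <driver> <device>: <message>"
KERNEL_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?P<driver>\S+) (?P<device>.+?): (?P<msg>.*)$"
)

# A subset of the journalctl lines quoted above.
SAMPLE = [
    "Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.",
    "Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead",
    "Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108",
    "Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: HC died; cleaning up",
    "Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx timeout",
    "Jun 14 01:17:25 phoenix kernel: usb 1-7: USB disconnect, device number 2",
    "Jun 14 01:17:25 phoenix systemd-networkd[933]: enx302303052de3: Link DOWN",
]


def summarize(lines):
    """Count kernel messages per driver and find the first controller-death line."""
    per_driver = Counter()
    first_death = None
    for line in lines:
        m = KERNEL_LINE.match(line.strip())
        if not m:
            continue  # not a kernel line in the assumed layout
        per_driver[m["driver"]] += 1
        is_death = "assume dead" in m["msg"] or "HC died" in m["msg"]
        if first_death is None and m["driver"] == "xhci_hcd" and is_death:
            first_death = (m["ts"], m["device"])
    return per_driver, first_death


counts, first_death = summarize(SAMPLE)
print(dict(counts))
print(first_death)
```

To run it against a real log, the lines could be read from a saved journalctl dump instead of the SAMPLE list.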

I followed https://www.pcsuggest.com/install-rtl8153-driver-linux/. But things seem to have fallen apart again this morning, so I am not sure it helped.

Edit 2:

It looks like snap might be causing Docker to fail or restart?

Jun 24 05:01:47 phoenix docker.dockerd[998]: failed to start containerd: timeout waiting for containerd to start
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Main process exited, code=exited, status=1/FAILURE
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Failed with result 'exit-code'.
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Scheduled restart job, restart counter is at 1.
Jun 24 05:01:47 phoenix systemd[1]: Stopped Service for snap application docker.dockerd.
Jun 24 05:01:47 phoenix systemd[1]: Started Service for snap application docker.dockerd.

After that I can clearly see an IP reassignment being triggered, and that takes my box offline.

Edit 3:

This is part of the iplog.

[2020-07-05T00:24:28.507613] Deleted dev vetha537571 lladdr 02:42:ac:13:00:02 STALE
[2020-07-05T00:24:29.019491] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:24:29.019674] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:24:32.603688] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 STALE
[2020-07-05T00:24:59.227481] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 STALE
[2020-07-05T00:25:01.275258] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 STALE
[2020-07-05T00:25:30.715499] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 PROBE
[2020-07-05T00:25:30.715641] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 REACHABLE
[2020-07-05T00:25:34.299181] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:25:38.139499] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 STALE
[2020-07-05T00:25:38.139586] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 STALE
[2020-07-05T00:25:39.931537] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:25:39.931823] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:25:47.099314] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 PROBE
[2020-07-05T00:25:47.099401] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 PROBE
[2020-07-05T00:25:47.101034] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 REACHABLE
[2020-07-05T00:25:47.102485] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 REACHABLE
[2020-07-05T00:25:57.595220] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 PROBE
[2020-07-05T00:25:57.595308] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 REACHABLE
[2020-07-05T00:25:58.363503] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 PROBE
[2020-07-05T00:25:58.363730] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 REACHABLE
[2020-07-05T00:26:00.667505] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 STALE
[2020-07-05T00:26:12.955465] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:26:19.099249] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:26:19.099393] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:26:29.339502] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 STALE
[2020-07-05T00:26:29.339583] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 STALE
[2020-07-05T00:26:37.531222] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 STALE
[2020-07-05T00:26:37.531304] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 STALE
[2020-07-05T00:26:47.003597] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 PROBE
[2020-07-05T00:26:47.003678] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 PROBE
[2020-07-05T00:26:47.005742] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 REACHABLE
[2020-07-05T00:26:47.007351] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 REACHABLE
[2020-07-05T00:27:00.827525] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 PROBE
[2020-07-05T00:27:00.827816] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 REACHABLE
[2020-07-05T00:27:12.859480] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:27:19.003172] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
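
For anyone reading a neighbor log like this one: the repeating STALE, PROBE, REACHABLE cycle is ordinary neighbor-cache behaviour, so the interesting entries are probes that never get an answer. The sketch below is a minimal parser written for the line layout shown above; the regular expression and the function name unanswered_probes are assumptions made for illustration. Note that the excerpt is cut off right after the final PROBE for 192.168.7.1, so that pending entry reflects the truncation and is not evidence of a failure by itself.

```python
import re
from datetime import datetime

# Assumed layout: "[<iso timestamp>] [Deleted ][<ip> ]dev <iface> lladdr <mac> <STATE>"
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<deleted>Deleted )?(?:(?P<ip>\d+\.\d+\.\d+\.\d+) )?"
    r"dev (?P<dev>\S+) lladdr (?P<mac>\S+) (?P<state>[A-Z]+)$"
)

# The tail of the log quoted above, plus its first line.
SAMPLE = [
    "[2020-07-05T00:24:28.507613] Deleted dev vetha537571 lladdr 02:42:ac:13:00:02 STALE",
    "[2020-07-05T00:27:00.827525] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 PROBE",
    "[2020-07-05T00:27:00.827816] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 REACHABLE",
    "[2020-07-05T00:27:12.859480] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE",
    "[2020-07-05T00:27:19.003172] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE",
]


def unanswered_probes(lines):
    """Return {(ip, dev): probe_time} for probes with no later REACHABLE in the input."""
    pending = {}
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m or m["ip"] is None:
            continue  # skip unparsable lines and entries without an IP address
        key = (m["ip"], m["dev"])
        if m["state"] == "PROBE":
            # Remember the earliest probe that is still waiting for a reply.
            pending.setdefault(key, datetime.fromisoformat(m["ts"]))
        elif m["state"] == "REACHABLE":
            pending.pop(key, None)
    return pending


pending = unanswered_probes(SAMPLE)
for (ip, dev), since in pending.items():
    print(f"{ip} on {dev}: probe sent at {since.isoformat()} with no reply in the excerpt")
```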

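Answer 1

I had to dig deeper to find the answer to this. In the end I hooked the machine up to a monitor and left it running, and the next time the network connection dropped I found that the CPU had locked up.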
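
If attaching a monitor is not practical, a rough alternative is a heartbeat file: a small process appends a timestamp to disk every few seconds, and after the next outage the file shows whether the whole machine froze (the heartbeats stop) or only the network dropped (the heartbeats continue). The sketch below is an illustration under that assumption, not what I actually ran; the function names, the 10-second interval, and the tolerance factor are all hypothetical choices.

```python
import os
import time
from datetime import datetime, timedelta


def run_heartbeat(path, interval_s=10):
    """Append an ISO timestamp to path every interval_s seconds (runs until killed)."""
    with open(path, "a") as f:
        while True:
            f.write(datetime.now().isoformat() + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the line to disk so it survives a hard lockup
            time.sleep(interval_s)


def find_gaps(timestamps, interval_s=10, tolerance=3):
    """Return (last_seen, next_seen) pairs where heartbeats paused for too long."""
    limit = timedelta(seconds=interval_s * tolerance)
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Demo with made-up timestamps: three regular beats, then silence until a reboot.
t0 = datetime(2020, 7, 5, 0, 0, 0)
beats = [t0 + timedelta(seconds=10 * i) for i in range(3)]
beats.append(t0 + timedelta(minutes=30))

gaps = find_gaps(beats)
for last_seen, next_seen in gaps:
    print(f"no heartbeat between {last_seen.isoformat()} and {next_seen.isoformat()}")
```

In practice run_heartbeat would be started in the background with a log path of your choosing, and find_gaps would be fed the timestamps parsed back from that file after the reboot.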

Some quick searching suggested that this might be a Ryzen CPU power state issue: https://askubuntu.com/a/1259021

Based on that answer, I disabled the C6 power state by following this guide: https://forum.manjaro.org/t/fix-ryzen-lockups-lated-to-low-system-usage/39723

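My uptime is now close to 3 days without any issues. I am currently on Wi-Fi, but I plan to switch the machine back to a wired connection. I will update in a month with how the uptime has held up since then. Hopefully this helps the next person with a similar problem.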

2022 edit: The original link is dead. Here is an archive.org link: https://web.archive.org/web/20200417190251/https://forum.manjaro.org/t/fix-ryzen-lockups-lated-to-low-system-usage/39723
