Random loss of network connection: "eno1: PCIe link lost, device now detached"


I am running Arch Linux with kernel 6.6.8-arch1-1, and I lose my internet connection at random, roughly every few days. I don't know of any way to restore the network connection other than rebooting. (Restarting systemd-networkd does not restore it.)
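One thing that could be tried before rebooting is to remove the NIC from the PCI bus through sysfs and then ask the kernel to rescan the bus. The sketch below is only an illustration of that idea: the function names are made up for this example, the address 0000:08:00.0 is taken from the igc messages in the log, and there is no guarantee the device comes back if the PCIe link is really down. Writing to these sysfs files requires root.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci")  # standard sysfs location for the PCI bus


def pci_reset_paths(address, root=SYSFS_PCI):
    """Return the sysfs files used to remove one PCI device and rescan the bus."""
    return root / "devices" / address / "remove", root / "rescan"


def remove_and_rescan(address, root=SYSFS_PCI):
    """Detach the device if it is still listed, then trigger a bus rescan."""
    remove_file, rescan_file = pci_reset_paths(address, root)
    if remove_file.exists():
        # Writing "1" asks the kernel to unbind the driver and drop the device.
        remove_file.write_text("1")
    # Writing "1" asks the kernel to rediscover devices on all PCI buses.
    rescan_file.write_text("1")


# Example call (as root), using the address reported by igc:
# remove_and_rescan("0000:08:00.0")
```

If the interface reappears after the rescan, restarting systemd-networkd afterwards would presumably be needed to configure it again.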

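The problem keeps recurring, but I don't fully understand what the dmesg log is telling me.

It looks like the kernel is disabling the module that controls the Ethernet port, or perhaps the module has crashed.

I tried updating the BIOS firmware, but that did not fix the problem.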

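Is this caused by a fault in the motherboard itself, or by some experimental package in use on my Arch system? Can I roll back a specific package version with the package manager? I'm not sure what else I can do to troubleshoot this. If what I have posted below is not enough, can I start logging somewhere to capture more detail?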
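As a starting point for collecting more detail, the kernel log can be scanned for the link-loss message so that each occurrence is recorded with its timestamp. This is a minimal sketch: the regular expression is written against the exact igc message shown in the log, and the helper name `find_link_loss` is made up for this example.

```python
import re

# Matches the igc message from the log, with or without a journal prefix.
LINK_LOST = re.compile(
    r"igc (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?P<iface>\S+): PCIe link lost"
)


def find_link_loss(lines):
    """Yield (pci_address, interface, full_line) for each link-loss message."""
    for line in lines:
        match = LINK_LOST.search(line)
        if match:
            yield match["pci"], match["iface"], line.rstrip("\n")


# Example: pass it sys.stdin and pipe in the kernel messages of the previous
# boot, e.g. from `journalctl -k -b -1`.
```

Logs from earlier boots are only available if the journal is stored persistently, so that may need to be enabled first.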

Jan 06 11:30:35 nix kernel: igc 0000:08:00.0 eno1: PCIe link lost, device now detached
Jan 06 11:30:35 nix kernel: ------------[ cut here ]------------
Jan 06 11:30:35 nix kernel: igc: Failed to read reg 0xc030!
Jan 06 11:30:35 nix kernel: WARNING: CPU: 6 PID: 1259 at drivers/net/ethernet/intel/igc/igc_main.c:6641 igc_rd32+0x8d/0xa0 [igc]
Jan 06 11:30:35 nix kernel: Modules linked in: snd_seq_dummy snd_seq btusb btrtl btintel btbcm btmtk bluetooth ecdh_generic mousedev vfat fat joydev intel_rapl_msr intel_rapl_common edac_mce_amd amdgpu kvm_amd kvm mt7921e mt7921_common mt792x_lib irqbypass snd_hda_codec_hdmi mt76_connac_lib crct10dif_pclmul crc32_pclmul polyval_clmulni mt76 drm_exec polyval_generic gf128mul snd_hda_intel amdxcp snd_usb_audio ghash_clmulni_intel drm_buddy snd_intel_dspcfg sha512_ssse3 mac80211 gpu_sched asus_nb_wmi snd_intel_sdw_acpi sha256_ssse3 eeepc_wmi i2c_algo_bit snd_usbmidi_lib sha1_ssse3 asus_wmi snd_hda_codec snd_ump drm_suballoc_helper ledtrig_audio aesni_intel snd_rawmidi drm_ttm_helper sparse_keymap snd_hda_core libarc4 ttm snd_seq_device crypto_simd platform_profile mc snd_hwdep i8042 cryptd cfg80211 razermouse(OE) usbhid drm_display_helper snd_pcm serio wmi_bmof cec snd_timer rapl sp5100_tco pcspkr video snd soundcore k10temp rfkill ccp igc i2c_piix4 gpio_amdpt wmi gpio_generic mac_hid i2c_dev crypto_user dm_mod fuse loop nfnetlink
Jan 06 11:30:35 nix kernel:  ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nvme crc32c_intel xhci_pci nvme_core xhci_pci_renesas nvme_common
Jan 06 11:30:35 nix kernel: CPU: 6 PID: 1259 Comm: Qt bearer threa Tainted: G           OE      6.6.8-arch1-1 #1 2ffcc416f976199fcae9446e8159d64f5aa7b1db
Jan 06 11:30:35 nix kernel: Hardware name: ASUS System Product Name/ROG STRIX B650E-E GAMING WIFI, BIOS 1813 10/13/2023
Jan 06 11:30:35 nix kernel: RIP: 0010:igc_rd32+0x8d/0xa0 [igc]
Jan 06 11:30:35 nix kernel: Code: 48 c7 c6 58 09 56 c0 e8 b1 ca 1e ca 48 8b bb 28 ff ff ff e8 05 e3 dd c9 84 c0 74 bc 89 ee 48 c7 c7 80 09 56 c0 e8 c3 62 77 c9 <0f> 0b eb aa b8 ff ff ff ff e9 15 34 47 ca 0f 1f 44 00 00 90 90 90
Jan 06 11:30:35 nix kernel: RSP: 0018:ffffc900062d7568 EFLAGS: 00010282
Jan 06 11:30:35 nix kernel: RAX: 0000000000000000 RBX: ffff88810d626cb8 RCX: 0000000000000027
Jan 06 11:30:35 nix kernel: RDX: ffff88901e5a16c8 RSI: 0000000000000001 RDI: ffff88901e5a16c0
Jan 06 11:30:35 nix kernel: RBP: 000000000000c030 R08: 0000000000000000 R09: ffffc900062d73f0
Jan 06 11:30:35 nix kernel: R10: 0000000000000003 R11: ffffffff8baca428 R12: ffff88810d626000
Jan 06 11:30:35 nix kernel: R13: 0000000000000000 R14: ffff888482c98d40 R15: 000000000000c030
Jan 06 11:30:35 nix kernel: FS:  00007f654a7fc6c0(0000) GS:ffff88901e580000(0000) knlGS:0000000000000000
Jan 06 11:30:35 nix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 06 11:30:35 nix kernel: CR2: 00000007b92e02d0 CR3: 00000001054c4000 CR4: 0000000000f50ee0
Jan 06 11:30:35 nix kernel: PKRU: 55555554
Jan 06 11:30:35 nix kernel: Call Trace:
Jan 06 11:30:35 nix kernel:  <TASK>
Jan 06 11:30:35 nix kernel:  ? igc_rd32+0x8d/0xa0 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  ? __warn+0x81/0x130
Jan 06 11:30:35 nix kernel:  ? igc_rd32+0x8d/0xa0 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  ? report_bug+0x171/0x1a0
Jan 06 11:30:35 nix kernel:  ? prb_read_valid+0x1b/0x30
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? handle_bug+0x3c/0x80
Jan 06 11:30:35 nix kernel:  ? exc_invalid_op+0x17/0x70
Jan 06 11:30:35 nix kernel:  ? asm_exc_invalid_op+0x1a/0x20
Jan 06 11:30:35 nix kernel:  ? igc_rd32+0x8d/0xa0 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  ? igc_rd32+0x8d/0xa0 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  igc_update_stats+0x8a/0x6d0 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  igc_get_stats64+0x85/0x90 [igc 4cdf20728952a10500352267157deb5b852fffac]
Jan 06 11:30:35 nix kernel:  dev_get_stats+0x60/0x110
Jan 06 11:30:35 nix kernel:  rtnl_fill_stats+0x3b/0x130
Jan 06 11:30:35 nix kernel:  rtnl_fill_ifinfo+0x868/0x1530
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  rtnl_dump_ifinfo+0x55f/0x670
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? __alloc_skb+0xde/0x1a0
Jan 06 11:30:35 nix kernel:  netlink_dump+0x126/0x320
Jan 06 11:30:35 nix kernel:  __netlink_dump_start+0x1d6/0x290
Jan 06 11:30:35 nix kernel:  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
Jan 06 11:30:35 nix kernel:  rtnetlink_rcv_msg+0x277/0x3c0
Jan 06 11:30:35 nix kernel:  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
Jan 06 11:30:35 nix kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
Jan 06 11:30:35 nix kernel:  netlink_rcv_skb+0x58/0x110
Jan 06 11:30:35 nix kernel:  netlink_unicast+0x1a3/0x290
Jan 06 11:30:35 nix kernel:  netlink_sendmsg+0x254/0x4d0
Jan 06 11:30:35 nix kernel:  __sys_sendto+0x1f6/0x200
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  __x64_sys_sendto+0x24/0x30
Jan 06 11:30:35 nix kernel:  do_syscall_64+0x5d/0x90
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? __fget_light+0x99/0x100
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? __sys_setsockopt+0x129/0x1d0
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
Jan 06 11:30:35 nix kernel:  ? srso_alias_return_thunk+0x5/0x7f
Jan 06 11:30:35 nix kernel:  ? do_syscall_64+0x6c/0x90
Jan 06 11:30:35 nix kernel:  ? do_syscall_64+0x6c/0x90
Jan 06 11:30:35 nix kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Jan 06 11:30:35 nix kernel: RIP: 0033:0x7f65cc7969ec
Jan 06 11:30:35 nix kernel: Code: 89 4c 24 1c e8 a5 63 f7 ff 44 8b 54 24 1c 8b 3c 24 45 31 c9 89 c5 48 8b 54 24 10 48 8b 74 24 08 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 04 24 e8 f1 63 f7 ff 48 8b 04
Jan 06 11:30:35 nix kernel: RSP: 002b:00007f654a7faea0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Jan 06 11:30:35 nix kernel: RAX: ffffffffffffffda RBX: 0000000000000023 RCX: 00007f65cc7969ec
Jan 06 11:30:35 nix kernel: RDX: 0000000000000020 RSI: 00007f654a7faf50 RDI: 0000000000000023
Jan 06 11:30:35 nix kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jan 06 11:30:35 nix kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f654a7fb0e0
Jan 06 11:30:35 nix kernel: R13: 00007f654a7faf20 R14: 0000000000000001 R15: 00007f654a7faf50
Jan 06 11:30:35 nix kernel:  </TASK>
Jan 06 11:30:35 nix kernel: ---[ end trace 0000000000000000 ]---

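In case this was specific to Arch, I switched to Linux Mint. Unfortunately, the same thing happened again, and the dmesg output looks very similar. I think this is a problem specific to this motherboard. I don't know how to report a potential driver bug. I suppose I will have to pick a different motherboard.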

[49705.450735] igc 0000:08:00.0 eno1: PCIe link lost, device now detached
[49705.450744] ------------[ cut here ]------------
[49705.450745] igc: Failed to read reg 0xc030!
[49705.450788] WARNING: CPU: 1 PID: 20682 at drivers/net/ethernet/intel/igc/igc_main.c:6583 igc_rd32+0xa4/0xc0 [igc]
[49705.450801] Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc cmac algif_hash algif_skcipher af_alg bnep binfmt_misc zfs(PO) spl(O) intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_usb_audio snd_intel_sdw_acpi snd_hda_codec snd_usbmidi_lib snd_ump snd_hda_core mc snd_hwdep mt7921e mt7921_common snd_pcm btusb btrtl mt76_connac_lib edac_mce_amd snd_seq_midi btbcm snd_seq_midi_event mt76 btintel snd_rawmidi kvm_amd btmtk nls_iso8859_1 mac80211 snd_seq bluetooth kvm snd_seq_device snd_timer irqbypass ecdh_generic cfg80211 rapl ecc joydev input_leds asus_nb_wmi eeepc_wmi snd wmi_bmof k10temp ccp libarc4 soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c dm_mirror dm_region_hash dm_log amdgpu amdxcp iommu_v2 drm_buddy gpu_sched
[49705.450885]  i2c_algo_bit drm_suballoc_helper hid_generic drm_ttm_helper ttm drm_display_helper crct10dif_pclmul cec usbhid crc32_pclmul polyval_clmulni rc_core polyval_generic hid ghash_clmulni_intel mfd_aaeon drm_kms_helper asus_wmi aesni_intel ledtrig_audio crypto_simd sparse_keymap platform_profile cryptd nvme ahci i2c_piix4 xhci_pci drm xhci_pci_renesas libahci nvme_core igc nvme_common video wmi gpio_amdpt
[49705.450918] CPU: 1 PID: 20682 Comm: kworker/1:1 Tainted: P           O       6.5.0-17-generic #17~22.04.1-Ubuntu
[49705.450921] Hardware name: ASUS System Product Name/ROG STRIX B650E-E GAMING WIFI, BIOS 1813 10/13/2023
[49705.450923] Workqueue: events igc_watchdog_task [igc]
[49705.450931] RIP: 0010:igc_rd32+0xa4/0xc0 [igc]
[49705.450938] Code: c7 c6 40 a7 4a c0 e8 5b 88 78 ce 48 8b bb 28 ff ff ff e8 0f 67 27 ce 84 c0 74 b4 44 89 e6 48 c7 c7 68 a7 4a c0 e8 bc 16 a7 cd <0f> 0b eb a1 b8 ff ff ff ff 31 d2 31 f6 31 ff e9 b8 13 ab ce 0f 1f
[49705.450940] RSP: 0018:ffffb7ccce62bd98 EFLAGS: 00010246
[49705.450942] RAX: 0000000000000000 RBX: ffff9880a3512cb8 RCX: 0000000000000000
[49705.450943] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[49705.450945] RBP: ffffb7ccce62bdb0 R08: 0000000000000000 R09: 0000000000000000
[49705.450946] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000c030
[49705.450947] R13: ffff9880a3512000 R14: 0000000000000000 R15: ffff9881a7274d40
[49705.450949] FS:  0000000000000000(0000) GS:ffff988f9e440000(0000) knlGS:0000000000000000
[49705.450950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[49705.450952] CR2: 000033ffda940000 CR3: 0000000e06e3a000 CR4: 0000000000750ee0
[49705.450954] PKRU: 55555554
[49705.450955] Call Trace:
[49705.450957]  <TASK>
[49705.450959]  ? show_regs+0x6d/0x80
[49705.450965]  ? __warn+0x89/0x160
[49705.450969]  ? igc_rd32+0xa4/0xc0 [igc]
[49705.450976]  ? report_bug+0x17e/0x1b0
[49705.450980]  ? handle_bug+0x46/0x90
[49705.450985]  ? exc_invalid_op+0x18/0x80
[49705.450988]  ? asm_exc_invalid_op+0x1b/0x20
[49705.450993]  ? igc_rd32+0xa4/0xc0 [igc]
[49705.451000]  ? igc_rd32+0xa4/0xc0 [igc]
[49705.451006]  igc_update_stats+0xab/0x770 [igc]
[49705.451013]  igc_watchdog_task+0xa1/0x540 [igc]
[49705.451019]  ? __schedule+0x2d4/0x750
[49705.451023]  process_one_work+0x23d/0x450
[49705.451027]  worker_thread+0x50/0x3f0
[49705.451030]  ? srso_alias_return_thunk+0x5/0x7f
[49705.451034]  ? __pfx_worker_thread+0x10/0x10
[49705.451036]  kthread+0xef/0x120
[49705.451039]  ? __pfx_kthread+0x10/0x10
[49705.451042]  ret_from_fork+0x44/0x70
[49705.451046]  ? __pfx_kthread+0x10/0x10
[49705.451049]  ret_from_fork_asm+0x1b/0x30
[49705.451054]  </TASK>
[49705.451055] ---[ end trace 0000000000000000 ]---
