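Some time ago I converted an old desktop into a Debian server, and it ran flawlessly for half a year.
Then I decided to move the machine somewhere with a better internet connection and add hard drives to turn it into a proper storage server (a homebrew NAS, say).
Since then, the server has been crashing randomly. Sometimes it takes more than a month to crash, sometimes only a day. Lately it has been crashing every 2 to 3 days.
Judging by dmesg, the cause of the crash seems to be different each time. I have no idea what is causing the crashes.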
Setup
- CPU: Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
- Motherboard: MSI MS-7821/Z87-G45 GAMING
- The machine runs Debian Stretch on Linux 4.9.0-8-amd64
- Kdump is installed
- System installed on a Samsung SSD 840 PRO (128GB)
- Five 8TB Western Digital Red HDDs for storage
- The HDDs were originally configured with mdadm for software RAID5, but are now managed by ZFS using raidz2
- Running Apache2 (with nextcloud) and transmission-daemon
Information
- dmesg.201904090640
- dmesg.201904111340
- dmesg.201904140557
- dmesg.201904172335
- dmesg.201904260559
- dmesg.201904270957
- dmesg.201904272249
dmesg.201904140557
[230866.137537] PANIC: double fault, error_code: 0x0
[230866.137548] PANIC: double fault, error_code: 0x0
[230866.137550] CPU: 2 PID: 25608 Comm: apache2 Tainted: P IO 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[230866.137551] Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.1 05/03/2013
[230866.137551] task: ffff8d7d1eabe0c0 task.stack: ffffa02483d5c000
[230866.137555] RIP: 0010:[<ffffffffad8192fa>] [<ffffffffad8192fa>] syscall_return_via_sysret+0x3e/0x4d
[230866.137556] RSP: 0018:ffffa02483d5ff50 EFLAGS: 00010002
[230866.137556] RAX: 0000000510035080 RBX: 0000000000000000 RCX: 00007fec9d79eacf
[230866.137557] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[230866.137557] RBP: 0000000000000000 R08: 00007fec6461ee20 R09: 0000000000000000
[230866.137558] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[230866.137558] R13: 0000000000000000 R14: 00007fec6461ee20 R15: 0000000000000000
[230866.137559] FS: 00007fec6461f700(0000) GS:ffff8d7e9fb00000(0000) knlGS:0000000000000000
[230866.137560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[230866.137560] CR2: ffffa02483d5ff48 CR3: 0000000510034000 CR4: 0000000000160670
[230866.137561] Stack:
[230866.137563] 0000000000000000 0000000000000000 00007fec6461ee20 0000000000000000
[230866.137564] 0000000000000000 0000000000000000 0000000000000000 0000000000000293
[230866.137565] 0000000000000000 0000000000000000 00007fec6461ee20 0000000000000000
[230866.137565] Call Trace:
[230866.137580] Code: 50 48 8b 54 24 60 48 8b 74 24 68 48 8b 7c 24 70 50 90 0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 0f 22 d8 <58> 48 8b a4 24 98 00 00 00 0f 01 f8 48 0f 07 50 90 0f 20 d8 65
[230866.137580] Kernel panic - not syncing: Machine halted.
[230866.137581] CPU: 2 PID: 25608 Comm: apache2 Tainted: P IO 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[230866.137582] Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.1 05/03/2013
[230866.137583] 0000000000000000 ffffffffad534524 ffff8d7e9fb07f00 ffff8d7e9fb07f18
[230866.137584] ffffffffad380ecd ffffffff00000008 ffff8d7e9fb07f28 ffff8d7e9fb07ec0
[230866.137585] 88dd6d6a799c212f 00000000000000c8 0000000000000092 0000000000000000
[230866.137585] Call Trace:
[230866.137589] <#DF>
[230866.137589] [<ffffffffad534524>] ? dump_stack+0x5c/0x78
[230866.137591] [<ffffffffad380ecd>] ? panic+0xe4/0x23f
[230866.137592] [<ffffffffad258ac9>] ? df_debug+0x29/0x30
[230866.137594] [<ffffffffad227b0f>] ? do_double_fault+0x9f/0x130
[230866.137595] [<ffffffffad81a038>] ? double_fault+0x28/0x30
[230866.137596] [<ffffffffad8192fa>] ? syscall_return_via_sysret+0x3e/0x4d
dmesg.201904172335
[322137.449206] general protection fault: 0000 [#1] SMP
[322137.464088] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_nat xt_tcpudp veth ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc xt_multiport iptable_filter wireguard(O) ip6_udp_tunnel udp_tunnel overlay nls_ascii nls_cp437 vfat fat snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic zfs(PO) intel_rapl zunicode(PO) x86_pkg_temp_thermal zavl(PO) intel_powerclamp zcommon(PO) znvpair(PO) snd_hda_intel kvm_intel spl(O) kvm i915 snd_hda_codec irqbypass snd_hda_core snd_hwdep snd_pcm crct10dif_pclmul crc32_pclmul iTCO_wdt ghash_clmulni_intel drm_kms_helper intel_cstate mei_me iTCO_vendor_support snd_timer drm intel_uncore snd
[322137.678356] soundcore evdev i2c_algo_bit mxm_wmi mei efi_pstore intel_rapl_perf lpc_ich sg shpchp serio_raw mfd_core pcspkr efivars wmi intel_smartconnect video button nfsd auth_rpcgss oid_registry nfs_acl lockd grace nct6775 hwmon_vid coretemp sunrpc efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_generic usbhid hid dm_mod sd_mod xhci_pci ahci ehci_pci xhci_hcd ehci_hcd crc32c_intel libahci libata aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper psmouse cryptd scsi_mod i2c_i801 i2c_smbus alx usbcore mdio thermal usb_common fan
[322137.867812] CPU: 2 PID: 2034 Comm: transmission-da Tainted: P IO 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[322137.898560] Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.1 05/03/2013
[322137.922267] task: ffff9d0366de8040 task.stack: ffffb6ca48838000
[322137.940254] RIP: 0010:[<ffffffffc0dc49e2>] [<ffffffffc0dc49e2>] zio_create+0x52/0x470 [zfs]
[322137.965860] RSP: 0018:ffffb6ca4883b970 EFLAGS: 00010282
[322137.982034] RAX: fbff9cff4e756040 RBX: fbff9cff4e756040 RCX: fbff9cff4e756040
[322138.003667] RDX: 0000000000000000 RSI: 0000000002404200 RDI: fbff9cff4e756048
[322138.025297] RBP: ffff9d03710ec680 R08: 000039c6a0245fd0 R09: 0000000000000002
[322138.046929] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb6ca4883bb30
[322138.068560] R13: 0000000000000001 R14: 00000000000f99d1 R15: ffff9cff040b1a10
[322138.090191] FS: 00007fee5e413700(0000) GS:ffff9d039fb00000(0000) knlGS:0000000000000000
[322138.114681] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[322138.132151] CR2: 000056466d3a1060 CR3: 00000005e6e22000 CR4: 0000000000160670
[322138.153783] Stack:
[322138.160066] 0000000000004000 ffff9cfebc544000 ffff9d0373c44000 ffff9d03710ec680
[322138.182681] ffffffffc0d1eae0 ffff9cff040b1a10 ffff9cfebc544000 0000000000004000
[322138.205299] ffff9d0373c44000 ffffffffc0dc551c ffffffffc0d1eae0 ffff9d027d98eaa8
[322138.227918] Call Trace:
[322138.235528] [<ffffffffc0d1eae0>] ? arc_hdr_destroy+0x1e0/0x1e0 [zfs]
[322138.255086] [<ffffffffc0dc551c>] ? zio_read+0xcc/0xe0 [zfs]
[322138.272293] [<ffffffffc0d1eae0>] ? arc_hdr_destroy+0x1e0/0x1e0 [zfs]
[322138.291847] [<ffffffffc0d21eb0>] ? arc_read+0x520/0xa30 [zfs]
[322138.309576] [<ffffffffc0d28b8e>] ? dbuf_read+0x29e/0x7d0 [zfs]
[322138.327569] [<ffffffffc0d294f8>] ? __dbuf_hold_impl+0x438/0x4d0 [zfs]
[322138.347379] [<ffffffffc0d295fb>] ? dbuf_hold_impl+0x6b/0x90 [zfs]
[322138.366147] [<ffffffffc0d298fb>] ? dbuf_hold+0x2b/0x60 [zfs]
[322138.383622] [<ffffffffc0d30799>] ? dmu_buf_hold_array_by_dnode+0xf9/0x460 [zfs]
[322138.406034] [<ffffffffc0d313d0>] ? dmu_read_uio_dnode+0x50/0xf0 [zfs]
[322138.426487] [<ffffffffc0d323cd>] ? dmu_read_uio_dbuf+0x3d/0x60 [zfs]
[322138.446691] [<ffffffffc0db0b97>] ? zfs_read+0x127/0x3b0 [zfs]
[322138.465045] [<ffffffffc0dcae24>] ? zpl_read_common_iovec+0x84/0xd0 [zfs]
[322138.486274] [<ffffffffc0dcb8e1>] ? zpl_iter_read+0xa1/0xe0 [zfs]
[322138.505406] [<ffffffff8ae0aacd>] ? new_sync_read+0xdd/0x130
[322138.523175] [<ffffffff8ae0b261>] ? vfs_read+0x91/0x130
[322138.539686] [<ffffffff8ae0c8f0>] ? SyS_pread64+0x90/0xb0
[322138.556649] [<ffffffff8ac03b7d>] ? do_syscall_64+0x8d/0xf0
[322138.574196] [<ffffffff8b21924e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[322138.595828] Code: 10 31 f6 4c 89 44 24 08 4c 89 0c 24 4c 8b a4 24 88 00 00 00 44 8b ac 24 90 00 00 00 e8 68 02 f4 ff 48 8d 78 08 48 89 c1 48 89 c3 <48> c7 00 00 00 00 00 48 c7 80 30 04 00 00 00 00 00 00 31 c0 48
[322138.656162] RIP [<ffffffffc0dc49e2>] zio_create+0x52/0x470 [zfs]
[322138.675286] RSP <ffffb6ca4883b970>
dmesg.201904260559
[72133.666580] general protection fault: 0000 [#1] SMP
[72133.681200] Modules linked in: xt_nat xt_tcpudp veth ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter overlay wireguard(O) ip6_udp_tunnel udp_tunnel nls_ascii nls_cp437 vfat fat snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp zfs(PO) zunicode(PO) kvm_intel snd_hda_codec_realtek kvm zavl(PO) snd_hda_codec_generic irqbypass crct10dif_pclmul zcommon(PO) crc32_pclmul snd_hda_intel znvpair(PO) i915 snd_hda_codec spl(O) ghash_clmulni_intel intel_cstate snd_hda_core snd_hwdep snd_pcm intel_uncore iTCO_wdt efi_pstore iTCO_vendor_support drm_kms_helper snd_timer drm
[72133.895207] mxm_wmi intel_rapl_perf mei_me sg snd serio_raw mei i2c_algo_bit lpc_ich pcspkr soundcore mfd_core evdev efivars shpchp wmi video intel_smartconnect button nct6775 hwmon_vid coretemp nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_generic dm_mod usbhid hid sd_mod ahci libahci ehci_pci xhci_pci xhci_hcd ehci_hcd crc32c_intel libata aesni_intel psmouse aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd i2c_i801 scsi_mod i2c_smbus alx mdio usbcore usb_common fan thermal
[72134.084709] CPU: 3 PID: 4246 Comm: java Tainted: P IO 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[72134.112335] Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.1 05/03/2013
[72134.135784] task: ffff8dbb009d7100 task.stack: ffffb42103b38000
[72134.153510] RIP: 0010:[<ffffffffa9eea7a8>] [<ffffffffa9eea7a8>] hrtimer_active+0x28/0x50
[72134.178049] RSP: 0018:ffffb42103b3be28 EFLAGS: 00010046
[72134.193962] RAX: 0000000000000000 RBX: ffff8dbb00c3c600 RCX: 0000000000000023
[72134.215337] RDX: fffd8dbb1fb94c00 RSI: 0000000000000008 RDI: ffff8dbb00c3c600
[72134.236710] RBP: 0000000000000000 R08: ffffffffaaa3eee0 R09: ffff8dbac7341380
[72134.258082] R10: 0000000000000013 R11: ffff8dbb01041b38 R12: ffff8dbb00c3c600
[72134.279452] R13: ffffb42103b3bec0 R14: 0000000000000000 R15: 0000000000000000
[72134.300824] FS: 00007fd2336ce700(0000) GS:ffff8dbb1fb80000(0000) knlGS:0000000000000000
[72134.325054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[72134.342261] CR2: 00007f36d94688a0 CR3: 00000005f211e000 CR4: 0000000000160670
[72134.363633] Stack:
[72134.369656] ffffffffa9eeac77 0000000000000000 8a7c0674a85ffec5 ffff8dbb00c3c688
[72134.392008] ffffb42103b3beb0 ffff8dbb00c3c600 ffffffffaa057b59 00007fd24811c410
[72134.414343] ffffb42103b3bee0 ffff8dbb01041b00 0000000000000001 8a7c0674a85ffec5
[72134.436702] Call Trace:
[72134.444039] [<ffffffffa9eeac77>] ? hrtimer_try_to_cancel+0x27/0x110
[72134.463080] [<ffffffffaa057b59>] ? do_timerfd_settime+0x119/0x430
[72134.481590] [<ffffffffaa058127>] ? SyS_timerfd_settime+0x57/0xb0
[72134.499837] [<ffffffffa9e03b7d>] ? do_syscall_64+0x8d/0xf0
[72134.516529] [<ffffffffaa41924e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[72134.537380] Code: 00 00 00 0f 1f 44 00 00 48 8b 57 30 eb 1d 80 7f 38 00 75 32 48 3b 78 08 74 2c 39 50 04 75 e9 48 8b 57 30 48 8b 0a 48 39 c8 74 21 <48> 8b 02 8b 50 04 f6 c2 01 74 d8 f3 90 8b 50 04 f6 c2 01 75 f6
[72134.596590] RIP [<ffffffffa9eea7a8>] hrtimer_active+0x28/0x50
[72134.614098] RSP <ffffb42103b3be28>
dmesg.201904270957
[100366.341655] general protection fault: 0000 [#1] SMP
[100366.356517] Modules linked in: veth xt_nat xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter overlay wireguard(O) ip6_udp_tunnel udp_tunnel nls_ascii nls_cp437 vfat fat snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel zfs(PO) zunicode(PO) kvm zavl(PO) irqbypass zcommon(PO) crct10dif_pclmul znvpair(PO) crc32_pclmul spl(O) ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic i915 intel_cstate iTCO_wdt iTCO_vendor_support snd_hda_intel intel_uncore mxm_wmi evdev serio_raw efi_pstore intel_rapl_perf snd_hda_codec pcspkr snd_hda_core
[100366.570669] snd_hwdep drm_kms_helper mei_me sg snd_pcm lpc_ich snd_timer drm snd mfd_core mei i2c_algo_bit soundcore shpchp intel_smartconnect wmi efivars video button nct6775 hwmon_vid coretemp nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_generic dm_mod usbhid hid sd_mod ahci libahci libata xhci_pci crc32c_intel aesni_intel ehci_pci psmouse aes_x86_64 glue_helper i2c_i801 lrw xhci_hcd ehci_hcd gf128mul i2c_smbus ablk_helper cryptd usbcore alx scsi_mod mdio usb_common fan thermal
[100366.760030] CPU: 3 PID: 28567 Comm: apache2 Tainted: P IO 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[100366.788960] Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.1 05/03/2013
[100366.812667] task: ffff8c41b1eb4100 task.stack: ffffac678f30c000
[100366.830659] RIP: 0010:[<ffffffff8549800a>] [<ffffffff8549800a>] __task_pid_nr_ns+0x3a/0x90
[100366.855979] RSP: 0018:ffffac678f30fcc8 EFLAGS: 00010282
[100366.872152] RAX: 0000000000000508 RBX: ffff8c4292b7ba40 RCX: 0000000000000001
[100366.893787] RDX: ffffffff86045d20 RSI: 0000000000000004 RDI: f7ff8c428aaa95c8
[100366.915418] RBP: ffffac678f30ff30 R08: 0000000000000000 R09: 0000000000000000
[100366.937052] R10: 0000000000000000 R11: 0000000000000000 R12: ffffac678f30fd78
[100366.958683] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
[100366.980317] FS: 00007f29e0c20700(0000) GS:ffff8c445fb80000(0000) knlGS:0000000000000000
[100367.004809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[100367.022279] CR2: 00007f773f92a1f8 CR3: 00000002575ee000 CR4: 0000000000160670
[100367.043913] Stack:
[100367.050195] ffffffff8569cb93 00007f29e0c1fe20 0000000000000000 0000000000000000
[100367.072811] 0000000000000000 ffffffff8608b548 ffff8c400bc4ef80 ffff8c4292b7bb08
[100367.095407] ffffac678f30fd20 00000000000b0008 0000000000000000 ffffac678f30fd20
[100367.118027] Call Trace:
[100367.125627] [<ffffffff8569cb93>] ? SYSC_semtimedop+0x3b3/0xc50
[100367.143623] [<ffffffff8552bd04>] ? __seccomp_filter+0x74/0x270
[100367.161615] [<ffffffff8542f1f0>] ? recalibrate_cpu_khz+0x10/0x10
[100367.180130] [<ffffffff854f01dc>] ? ktime_get_ts64+0x4c/0xf0
[100367.197342] [<ffffffff85620bbf>] ? poll_select_copy_remaining+0xdf/0x150
[100367.217934] [<ffffffff85403337>] ? syscall_trace_enter+0x117/0x2c0
[100367.236964] [<ffffffff85403b7d>] ? do_syscall_64+0x8d/0xf0
[100367.253918] [<ffffffff85a1924e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[100367.275029] Code: 00 00 00 74 4e 85 f6 b8 08 05 00 00 74 1a 83 fe 04 74 0e 89 f6 48 8d 04 76 48 8d 04 c5 08 05 00 00 48 8b bf d0 04 00 00 48 01 c7 <48> 8b 0f 48 85 c9 74 20 8b b2 30 08 00 00 31 c0 3b 71 04 77 0d
[100367.334428] RIP [<ffffffff8549800a>] __task_pid_nr_ns+0x3a/0x90
[100367.352738] RSP <ffffac678f30fcc8>
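To compare the dumps above at a glance, a minimal sketch along these lines could pull the fault type, the process, and the faulting symbol out of each file. The regular expressions are assumptions fitted to the Linux 4.9 oops layout shown in this post (other kernel versions print these lines differently), and crash_signature is a hypothetical helper name, not part of any tool.

```python
import re

# Patterns fitted to the Linux 4.9 oops layout seen in the dumps above (assumption).
FAULT_RE = re.compile(r"\]\s*(PANIC: double fault|general protection fault)")
COMM_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\]\s+"
    r"(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def crash_signature(dmesg_text):
    """Return fault type, CPU, process name, and faulting symbol of the first oops."""
    fault = FAULT_RE.search(dmesg_text)
    comm = COMM_RE.search(dmesg_text)
    rip = RIP_RE.search(dmesg_text)
    return {
        "fault": fault.group(1) if fault else None,
        "cpu": int(comm.group(1)) if comm else None,
        "comm": comm.group(3) if comm else None,
        "symbol": rip.group(1) if rip else None,
        # No "[module]" suffix on the RIP line means the symbol is in the core kernel.
        "module": (rip.group(2) or "kernel") if rip else None,
    }
```

Running it over each dmesg file should make it easy to see that the faulting symbols differ from crash to crash and that only one of them lies inside the zfs module.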
Command output
# uname -a
Linux example.com 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux
# lsmod
Module Size Used by
ipt_REJECT 16384 6
nf_reject_ipv4 16384 1 ipt_REJECT
veth 16384 0
xt_nat 16384 1
xt_tcpudp 16384 3
ipt_MASQUERADE 16384 2
nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE
nf_conntrack_netlink 36864 0
nfnetlink 16384 2 nf_conntrack_netlink
xfrm_user 36864 1
xfrm_algo 16384 1 xfrm_user
iptable_nat 16384 1
nf_conntrack_ipv4 16384 2
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_nat_ipv4 16384 1 iptable_nat
xt_addrtype 16384 2
xt_conntrack 16384 1
nf_nat 24576 3 xt_nat,nf_nat_masquerade_ipv4,nf_nat_ipv4
nf_conntrack 114688 6 nf_conntrack_ipv4,nf_conntrack_netlink,nf_nat_masquerade_ipv4,xt_conntrack,nf_nat_ipv4,nf_nat
br_netfilter 24576 0
bridge 135168 1 br_netfilter
stp 16384 1 bridge
llc 16384 2 bridge,stp
xt_multiport 16384 1
iptable_filter 16384 1
wireguard 217088 0
ip6_udp_tunnel 16384 1 wireguard
udp_tunnel 16384 1 wireguard
overlay 49152 1
nls_ascii 16384 1
nls_cp437 20480 1
vfat 20480 1
fat 69632 1 vfat
snd_hda_codec_hdmi 49152 1
intel_rapl 20480 0
x86_pkg_temp_thermal 16384 0
intel_powerclamp 16384 0
kvm_intel 200704 0
kvm 598016 1 kvm_intel
zfs 2707456 8
irqbypass 16384 1 kvm
crct10dif_pclmul 16384 0
zunicode 331776 1 zfs
crc32_pclmul 16384 0
zavl 16384 1 zfs
ghash_clmulni_intel 16384 0
zcommon 53248 1 zfs
intel_cstate 16384 0
znvpair 90112 2 zcommon,zfs
snd_hda_codec_realtek 90112 1
snd_hda_codec_generic 69632 1 snd_hda_codec_realtek
snd_hda_intel 36864 0
i915 1257472 2
snd_hda_codec 135168 4 snd_hda_intel,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_codec_realtek
drm_kms_helper 155648 1 i915
intel_uncore 118784 0
spl 98304 3 znvpair,zcommon,zfs
snd_hda_core 90112 5 snd_hda_intel,snd_hda_codec,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_codec_realtek
iTCO_wdt 16384 0
mei_me 36864 0
efi_pstore 16384 0
snd_hwdep 16384 1 snd_hda_codec
mxm_wmi 16384 0
iTCO_vendor_support 16384 1 iTCO_wdt
evdev 24576 2
drm 360448 3 i915,drm_kms_helper
snd_pcm 110592 4 snd_hda_intel,snd_hda_codec,snd_hda_core,snd_hda_codec_hdmi
snd_timer 32768 1 snd_pcm
intel_rapl_perf 16384 0
efivars 20480 1 efi_pstore
serio_raw 16384 0
lpc_ich 24576 0
sg 32768 0
snd 86016 8 snd_hda_intel,snd_hwdep,snd_hda_codec,snd_timer,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_codec_realtek,snd_pcm
pcspkr 16384 0
mei 102400 1 mei_me
i2c_algo_bit 16384 1 i915
soundcore 16384 1 snd
mfd_core 16384 1 lpc_ich
shpchp 36864 0
wmi 16384 1 mxm_wmi
intel_smartconnect 16384 0
video 40960 1 i915
button 16384 1 i915
nfsd 331776 13
auth_rpcgss 61440 1 nfsd
oid_registry 16384 1 auth_rpcgss
nfs_acl 16384 1 nfsd
lockd 90112 1 nfsd
grace 16384 2 nfsd,lockd
sunrpc 344064 18 auth_rpcgss,nfsd,nfs_acl,lockd
nct6775 57344 0
hwmon_vid 16384 1 nct6775
coretemp 16384 0
efivarfs 16384 1
ip_tables 24576 2 iptable_filter,iptable_nat
x_tables 36864 9 xt_multiport,ipt_REJECT,xt_nat,ip_tables,iptable_filter,xt_tcpudp,ipt_MASQUERADE,xt_addrtype,xt_conntrack
autofs4 40960 3
ext4 585728 2
crc16 16384 1 ext4
jbd2 106496 1 ext4
fscrypto 28672 1 ext4
ecb 16384 0
mbcache 16384 3 ext4
raid10 49152 0
raid456 106496 0
async_raid6_recov 20480 1 raid456
async_memcpy 16384 2 raid456,async_raid6_recov
async_pq 16384 2 raid456,async_raid6_recov
async_xor 16384 3 async_pq,raid456,async_raid6_recov
async_tx 16384 5 async_xor,async_pq,raid456,async_memcpy,async_raid6_recov
xor 24576 1 async_xor
raid6_pq 110592 3 async_pq,raid456,async_raid6_recov
libcrc32c 16384 1 raid456
crc32c_generic 16384 0
raid1 36864 0
raid0 20480 0
multipath 16384 0
linear 16384 0
md_mod 135168 6 raid1,raid10,multipath,linear,raid0,raid456
hid_generic 16384 0
usbhid 53248 0
hid 122880 2 hid_generic,usbhid
dm_mod 118784 6
sd_mod 49152 14
ehci_pci 16384 0
xhci_pci 16384 0
xhci_hcd 188416 1 xhci_pci
ahci 40960 8
ehci_hcd 81920 1 ehci_pci
crc32c_intel 24576 5
libahci 32768 1 ahci
aesni_intel 167936 1
aes_x86_64 20480 1 aesni_intel
libata 249856 2 ahci,libahci
glue_helper 16384 1 aesni_intel
lrw 16384 1 aesni_intel
usbcore 253952 6 usbhid,ehci_hcd,xhci_pci,xhci_hcd,ehci_pci
gf128mul 16384 1 lrw
ablk_helper 16384 1 aesni_intel
i2c_i801 24576 0
cryptd 24576 3 ablk_helper,ghash_clmulni_intel,aesni_intel
psmouse 135168 0
i2c_smbus 16384 1 i2c_i801
alx 45056 0
scsi_mod 225280 3 sd_mod,libata,sg
mdio 16384 1 alx
usb_common 16384 1 usbcore
fan 16384 0
thermal 20480 0
Edit
I ran memtest86 (the original version from memtest86.com) before and after reseating the RAM modules.
According to the memtest log, no errors were found.
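Edit
Reseating the RAM modules had no effect, so I explored new hypotheses.
I checked for electrical interference, but there was no correlation between the crash times and motor usage.
I also checked for a correlation between disk access and the crashes. Crashes can happen even when disk activity is low, but they happen much sooner under certain disk activity. For example, if I read all the disks in parallel (cat /dev/sdX > /dev/null), the machine can crash within an hour.
However, the SMART data shows nothing wrong. Here is the output of smartctl -a /dev/sdb
(the other disks look the same).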
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 132 132 054 Pre-fail Offline - 112
3 Spin_Up_Time 0x0007 160 160 024 Pre-fail Always - 401 (Average 420)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 40
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 15
9 Power_On_Hours 0x0012 099 099 000 Old_age Always - 7274
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 35
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 260
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 260
194 Temperature_Celsius 0x0002 224 224 000 Old_age Always - 29 (Min/Max 10/46)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
So the crashes are somehow related to the disks, but I don't know how.
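For reference, the parallel read described above (one cat /dev/sdX > /dev/null per disk) could be reproduced in a controlled way with a sketch like the following. The function names are hypothetical, the list of devices has to be supplied by the caller, and reading raw block devices normally requires root. It only reads and never writes.

```python
import threading


def read_to_end(path, chunk_size=1 << 20):
    """Read a file or block device sequentially and return the number of bytes read."""
    total = 0
    # buffering=0 gives an unbuffered raw file object, so each read goes to the OS.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_all_parallel(paths):
    """Read every path in its own thread; map each path to bytes read or the OSError."""
    results = {}

    def worker(path):
        try:
            results[path] = read_to_end(path)
        except OSError as exc:
            results[path] = exc

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling read_all_parallel with the five storage disks would load them all at once, while calling it with a single disk at a time might show whether one particular drive, cable, or port is enough to trigger a crash.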
Answer 1
Looking at your logs, the kernel is tainted, that is, running in an unsupported state:
Tainted: P IO
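The list of taint flags can be found in the kernel documentation. The P and O flags indicate kernel modules that have a license incompatible with the GPL and that were built out of tree; here that means ZFS and its related modules. One of the log fragments you provided shows a general protection fault inside the ZFS module, but the others occur elsewhere in the kernel. Also, GPFs and double faults are raised by the processor itself, so the module may not be at fault at all.
What worries me more is the I taint flag. The I flag means "workaround for bug in platform firmware applied". It indicates a potentially serious problem in the system's UEFI/BIOS firmware, which could be what is causing the faults. Did you update the BIOS before this started? Was this flag already set before you upgraded the hardware?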
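As an illustration of how the flags map to bits, here is a minimal sketch that decodes the numeric value exposed in /proc/sys/kernel/tainted, or a flag string as printed in an oops. The bit order follows the kernel's tainted-kernels documentation; the short descriptions are paraphrased, and the function names are my own.

```python
# Letter and meaning for each taint bit, bit 0 first.
TAINT_BITS = [
    ("P", "proprietary module was loaded"),
    ("F", "module was force loaded"),
    ("S", "SMP kernel on unsupported hardware / out-of-spec system"),
    ("R", "module was force unloaded"),
    ("M", "processor reported a machine check exception"),
    ("B", "bad page referenced or unexpected page flags"),
    ("U", "taint requested by userspace"),
    ("D", "kernel died recently (OOPS or BUG)"),
    ("A", "ACPI table overridden by user"),
    ("W", "kernel issued a warning"),
    ("C", "staging driver was loaded"),
    ("I", "workaround for bug in platform firmware applied"),
    ("O", "externally built (out-of-tree) module was loaded"),
    ("E", "unsigned module was loaded"),
    ("L", "soft lockup occurred"),
    ("K", "kernel has been live patched"),
]


def decode_taint(value):
    """Return (letter, meaning) for every bit set in the numeric taint value."""
    return [
        (letter, meaning)
        for bit, (letter, meaning) in enumerate(TAINT_BITS)
        if value & (1 << bit)
    ]


def taint_value(flags):
    """Convert an oops flag string such as 'P IO' to its bitmask."""
    letters = [letter for letter, _ in TAINT_BITS]
    value = 0
    for ch in flags:
        if ch in (" ", "G"):  # padding, or 'G' printed when bit 0 is clear
            continue
        value |= 1 << letters.index(ch)
    return value
```

On the affected machine, decode_taint(int(open("/proc/sys/kernel/tainted").read())) right after boot would show whether the I flag is already set before any crash happens.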
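Unfortunately, the links to the full logs no longer work, so I can't offer more specific help. The full logs might give details about the firmware bug the system is working around, as well as other possible indicators of what is failing.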