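I am running Debian 9 with Linux 4.9.0-16-amd64 on a Lenovo T440s. It was stable until recently, but it has started hanging several times a day. Since no upgrades have been made, I suspect the hangs are caused by hardware.
/var/log/syslog contains errors like the following (not immediately before a hang):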
Jul 4 12:46:39 dumaty kernel: [ 2345.071294] ------------[ cut here ]------------
Jul 4 12:46:39 dumaty kernel: [ 2345.071314] WARNING: CPU: 2 PID: 366 at /build/linux-hrcSIZ/linux-4.9.272/drivers/net/wireless/intel/iwlwifi/mvm/rs.c:1212 iwl_mvm_rs_tx_status+0x159/0x1950 [iwlmvm]
Jul 4 12:46:39 dumaty kernel: [ 2345.071315] Modules linked in: ctr ccm binfmt_misc rfcomm fuse cmac bnep iTCO_wdt iTCO_vendor_support intel_rapl arc4 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi kvm iwlmvm irqbypass intel_cstate mac80211 joydev evdev intel_uncore pcspkr intel_rapl_perf snd_hda_codec_realtek rtsx_pci_ms serio_raw iwlwifi sg hid_multitouch snd_hda_codec_generic memstick uvcvideo lpc_ich cfg80211 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core btusb cdc_mbim btrtl cdc_wdm snd_hda_intel videodev btbcm shpchp btintel snd_hda_codec i915 media cdc_ncm cdc_acm snd_hda_core bluetooth usbnet drm_kms_helper mii snd_hwdep drm mei_me snd_pcm snd_timer mei i2c_algo_bit thinkpad_acpi wmi nvram snd soundcore ac rfkill battery video button parport_pc ppdev lp parport ip_tables x_tables
Jul 4 12:46:39 dumaty kernel: [ 2345.071369] autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache algif_skcipher af_alg usbhid hid dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel rtsx_pci_sdmmc mmc_core aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse ahci libahci i2c_i801 i2c_smbus libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core usb_common thermal
Jul 4 12:46:39 dumaty kernel: [ 2345.071401] CPU: 2 PID: 366 Comm: irq/47-iwlwifi Not tainted 4.9.0-16-amd64 #1 Debian 4.9.272-1
Jul 4 12:46:39 dumaty kernel: [ 2345.071402] Hardware name: LENOVO 20ARA0YL00/20ARA0YL00, BIOS GJET77WW (2.27 ) 05/20/2014
Jul 4 12:46:39 dumaty kernel: [ 2345.071403] 0000000000000000 ffffffffae213377 0000000000000000 0000000000000000
Jul 4 12:46:39 dumaty kernel: [ 2345.071406] ffffffffadc7aa2b ffff9cf649fb0900 0000000000000005 ffff9cf64a4c1568
Jul 4 12:46:39 dumaty kernel: [ 2345.071409] 00000000ffffffea 000000000d9afcfb ffff9cf580809a28 ffffffffc0b479e9
Jul 4 12:46:39 dumaty kernel: [ 2345.071411] Call Trace:
Jul 4 12:46:39 dumaty kernel: [ 2345.071417] [<ffffffffae213377>] ? dump_stack+0x66/0x81
Jul 4 12:46:39 dumaty kernel: [ 2345.071421] [<ffffffffadc7aa2b>] ? __warn+0xcb/0xf0
Jul 4 12:46:39 dumaty kernel: [ 2345.071429] [<ffffffffc0b479e9>] ? iwl_mvm_rs_tx_status+0x159/0x1950 [iwlmvm]
Jul 4 12:46:39 dumaty kernel: [ 2345.071432] [<ffffffffadcb768e>] ? find_busiest_group+0x3e/0x4d0
Jul 4 12:46:39 dumaty kernel: [ 2345.071436] [<ffffffffadce86b4>] ? lock_timer_base+0x74/0x90
Jul 4 12:46:39 dumaty kernel: [ 2345.071453] [<ffffffffc0d68162>] ? ieee80211_tx_status+0x3b2/0x8b0 [mac80211]
Jul 4 12:46:39 dumaty kernel: [ 2345.071459] [<ffffffffc0b3b8d6>] ? iwl_mvm_rx_tx_cmd+0x296/0x770 [iwlmvm]
Jul 4 12:46:39 dumaty kernel: [ 2345.071462] [<ffffffffae2224a5>] ? __switch_to_asm+0x35/0x70
Jul 4 12:46:39 dumaty kernel: [ 2345.071468] [<ffffffffc0cd3832>] ? iwl_pcie_rx_handle+0x2d2/0x840 [iwlwifi]
Jul 4 12:46:39 dumaty kernel: [ 2345.071473] [<ffffffffc0cd4e51>] ? iwl_pcie_irq_handler+0x181/0x730 [iwlwifi]
Jul 4 12:46:39 dumaty kernel: [ 2345.071475] [<ffffffffadcd7190>] ? irq_finalize_oneshot.part.36+0xf0/0xf0
Jul 4 12:46:39 dumaty kernel: [ 2345.071477] [<ffffffffadcd71b1>] ? irq_thread_fn+0x21/0x60
Jul 4 12:46:39 dumaty kernel: [ 2345.071479] [<ffffffffadcd79b6>] ? irq_thread+0x136/0x1c0
Jul 4 12:46:39 dumaty kernel: [ 2345.071481] [<ffffffffae21d4d1>] ? __schedule+0x241/0x6f0
Jul 4 12:46:39 dumaty kernel: [ 2345.071483] [<ffffffffadcbdb0f>] ? __wake_up_common+0x4f/0x90
Jul 4 12:46:39 dumaty kernel: [ 2345.071485] [<ffffffffadcd7280>] ? irq_forced_thread_fn+0x90/0x90
Jul 4 12:46:39 dumaty kernel: [ 2345.071487] [<ffffffffadcd7880>] ? irq_thread_check_affinity+0xd0/0xd0
Jul 4 12:46:39 dumaty kernel: [ 2345.071490] [<ffffffffadc9af29>] ? kthread+0xd9/0xf0
Jul 4 12:46:39 dumaty kernel: [ 2345.071493] [<ffffffffae2224b1>] ? __switch_to_asm+0x41/0x70
Jul 4 12:46:39 dumaty kernel: [ 2345.071496] [<ffffffffadc9ae50>] ? kthread_park+0x60/0x60
Jul 4 12:46:39 dumaty kernel: [ 2345.071498] [<ffffffffae222537>] ? ret_from_fork+0x57/0x70
Jul 4 12:46:39 dumaty kernel: [ 2345.071499] ---[ end trace e62295838fbe3e4e ]---
Later, another error occurred. I remember seeing other swap_free errors before.
Jul 4 15:11:21 dumaty kernel: [11027.163548] swap_free: Unused swap file entry 3ffff8c9d3f8a
Jul 4 15:11:21 dumaty kernel: [11027.163554] BUG: Bad page map in process CompositorTileW pte:e6c580ea2a pmd:24ca96067
Jul 4 15:11:21 dumaty kernel: [11027.163557] addr:000055f7a8fc0000 vm_flags:08100073 anon_vma:ffff9cf5bb2a9e10 mapping: (null) index:55f7a8fc0
Jul 4 15:11:21 dumaty kernel: [11027.163559] file: (null) fault: (null) mmap: (null) readpage: (null)
Jul 4 15:11:21 dumaty kernel: [11027.163563] CPU: 3 PID: 6137 Comm: CompositorTileW Tainted: G W 4.9.0-16-amd64 #1 Debian 4.9.272-1
Jul 4 15:11:21 dumaty kernel: [11027.163564] Hardware name: LENOVO 20ARA0YL00/20ARA0YL00, BIOS GJET77WW (2.27 ) 05/20/2014
Jul 4 15:11:21 dumaty kernel: [11027.163565] 0000000000000000 ffffffffae213377 000055f7a8fc0000 ffff9cf55af3b0c8
Jul 4 15:11:21 dumaty kernel: [11027.163568] ffffffffaddb7c31 000055f7a9034000 0000000000000000 0000000000000000
Jul 4 15:11:21 dumaty kernel: [11027.163571] 000055f7a8fc0000 ffff9cf58ca96e00 000000e6c580ea2a ffffb7c4839a3c38
Jul 4 15:11:21 dumaty kernel: [11027.163573] Call Trace:
Jul 4 15:11:21 dumaty kernel: [11027.163580] [<ffffffffae213377>] ? dump_stack+0x66/0x81
Jul 4 15:11:21 dumaty kernel: [11027.163582] [<ffffffffaddb7c31>] ? print_bad_pte+0x1d1/0x2a0
Jul 4 15:11:21 dumaty kernel: [11027.163584] [<ffffffffaddba434>] ? unmap_page_range+0x5d4/0x9d0
Jul 4 15:11:21 dumaty kernel: [11027.163586] [<ffffffffaddbabfc>] ? unmap_vmas+0x4c/0xa0
Jul 4 15:11:21 dumaty kernel: [11027.163589] [<ffffffffaddc3b9f>] ? exit_mmap+0x8f/0x140
Jul 4 15:11:21 dumaty kernel: [11027.163593] [<ffffffffadc77604>] ? mmput+0x54/0x100
Jul 4 15:11:21 dumaty kernel: [11027.163594] [<ffffffffadc7f1be>] ? do_exit+0x27e/0xb60
Jul 4 15:11:21 dumaty kernel: [11027.163596] [<ffffffffadc7fb1a>] ? do_group_exit+0x3a/0xa0
Jul 4 15:11:21 dumaty kernel: [11027.163599] [<ffffffffadc8abe1>] ? get_signal+0x161/0x850
Jul 4 15:11:21 dumaty kernel: [11027.163602] [<ffffffffadcfea0f>] ? do_futex+0x14f/0xba0
Jul 4 15:11:21 dumaty kernel: [11027.163605] [<ffffffffadc26486>] ? do_signal+0x36/0x690
Jul 4 15:11:21 dumaty kernel: [11027.163607] [<ffffffffadd2d5a4>] ? __seccomp_filter+0x74/0x270
Jul 4 15:11:21 dumaty kernel: [11027.163610] [<ffffffffadcff4df>] ? SyS_futex+0x7f/0x160
Jul 4 15:11:21 dumaty kernel: [11027.163613] [<ffffffffadc03721>] ? exit_to_usermode_loop+0x71/0xb0
Jul 4 15:11:21 dumaty kernel: [11027.163615] [<ffffffffadc03bd9>] ? do_syscall_64+0xe9/0x100
Jul 4 15:11:21 dumaty kernel: [11027.163619] [<ffffffffae22238e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jul 4 15:11:21 dumaty kernel: [11027.163620] Disabling lock debugging due to kernel taint
Jul 4 15:11:21 dumaty kernel: [11027.165144] BUG: Bad rss-counter state mm:ffff9cf5bb398000 idx:2 val:-1
Later:
Jul 4 16:03:38 dumaty kernel: [14164.368364] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
Jul 4 16:03:38 dumaty kernel: [14164.368412] IP: [<ffffffffadf61e1f>] swiotlb_unmap_sg_attrs+0x1f/0x50
Jul 4 16:03:38 dumaty kernel: [14164.368447] PGD 0
Jul 4 16:03:38 dumaty kernel: [14164.368457]
Jul 4 16:03:38 dumaty kernel: [14164.368467] Oops: 0000 [#2] SMP
Jul 4 16:03:38 dumaty kernel: [14164.368483] Modules linked in: ctr ccm binfmt_misc rfcomm fuse cmac bnep iTCO_wdt iTCO_vendor_support intel_rapl arc4 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi kvm iwlmvm irqbypass intel_cstate mac80211 joydev evdev intel_uncore pcspkr intel_rapl_perf snd_hda_codec_realtek rtsx_pci_ms serio_raw iwlwifi sg hid_multitouch snd_hda_codec_generic memstick uvcvideo lpc_ich cfg80211 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core btusb cdc_mbim btrtl cdc_wdm snd_hda_intel videodev btbcm shpchp btintel snd_hda_codec i915 media cdc_ncm cdc_acm snd_hda_core bluetooth usbnet drm_kms_helper mii snd_hwdep drm mei_me snd_pcm snd_timer mei i2c_algo_bit thinkpad_acpi wmi nvram snd soundcore ac rfkill battery video button parport_pc ppdev lp parport ip_tables x_tables
Jul 4 16:03:38 dumaty kernel: [14164.368930] autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache algif_skcipher af_alg usbhid hid dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel rtsx_pci_sdmmc mmc_core aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse ahci libahci i2c_i801 i2c_smbus libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core usb_common thermal
Jul 4 16:03:38 dumaty kernel: [14164.369092] CPU: 2 PID: 1819 Comm: chrome Tainted: G B D W 4.9.0-16-amd64 #1 Debian 4.9.272-1
Jul 4 16:03:38 dumaty kernel: [14164.369126] Hardware name: LENOVO 20ARA0YL00/20ARA0YL00, BIOS GJET77WW (2.27 ) 05/20/2014
Jul 4 16:03:38 dumaty kernel: [14164.369158] task: ffff9cf5e8136100 task.stack: ffffb7c482578000
Jul 4 16:03:38 dumaty kernel: [14164.369191] RIP: 0010:[<ffffffffadf61e1f>] [<ffffffffadf61e1f>] swiotlb_unmap_sg_attrs+0x1f/0x50
Jul 4 16:03:38 dumaty kernel: [14164.369230] RSP: 0018:ffffb7c48257bc70 EFLAGS: 00010212
Jul 4 16:03:38 dumaty kernel: [14164.369257] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jul 4 16:03:38 dumaty kernel: [14164.369294] RDX: 0000000000001000 RSI: 0000000080eed000 RDI: ffff9cf36fd09400
Jul 4 16:03:38 dumaty kernel: [14164.369328] RBP: 0000000000000021 R08: 0000000000000000 R09: 000000000000ffff
Jul 4 16:03:38 dumaty kernel: [14164.369357] R10: ffff9cf62fd12a20 R11: ffff9cf5e1bbf738 R12: 0000000000000000
Jul 4 16:03:38 dumaty kernel: [14164.370826] R13: 0000000000000040 R14: ffff9cf64f8a40a0 R15: ffff9cf64b600000
Jul 4 16:03:38 dumaty kernel: [14164.372260] FS: 00007fac918be000(0000) GS:ffff9cf65e280000(0000) knlGS:0000000000000000
Jul 4 16:03:38 dumaty kernel: [14164.373705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 4 16:03:38 dumaty kernel: [14164.375100] CR2: 0000000000000018 CR3: 00000002a82f2000 CR4: 0000000000160670
Jul 4 16:03:38 dumaty kernel: [14164.376543] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 4 16:03:38 dumaty kernel: [14164.377938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 4 16:03:38 dumaty kernel: [14164.379216] Stack:
Jul 4 16:03:38 dumaty kernel: [14164.380610] ffff9cf649f5d300 0000000000000000 ffffffffc09d6da0 ffff9cf649f5d300
Jul 4 16:03:38 dumaty kernel: [14164.382141] ffffffffc06c73b8 ffffffffc094b02e ffff9cf649f5d300 0000000000000000
Jul 4 16:03:38 dumaty kernel: [14164.383372] ffffffffc09d6da0 ffff9cf64b600000 ffffffffc06c73b8 ffff9cf64b600000
Jul 4 16:03:38 dumaty kernel: [14164.384541] Call Trace:
Jul 4 16:03:38 dumaty kernel: [14164.385717] [<ffffffffc094b02e>] ? i915_gem_object_put_pages_gtt+0x3e/0x260 [i915]
Jul 4 16:03:38 dumaty kernel: [14164.386885] [<ffffffffc09490e2>] ? i915_gem_object_put_pages+0x72/0xf0 [i915]
Jul 4 16:03:38 dumaty kernel: [14164.388043] [<ffffffffc094de9c>] ? i915_gem_free_object+0xcc/0x280 [i915]
Jul 4 16:03:38 dumaty kernel: [14164.389419] [<ffffffffc06a48c6>] ? drm_gem_object_unreference_unlocked+0x76/0x80 [drm]
Jul 4 16:03:38 dumaty kernel: [14164.391121] [<ffffffffc06a49e1>] ? drm_gem_object_release_handle+0x51/0x90 [drm]
Jul 4 16:03:38 dumaty kernel: [14164.393000] [<ffffffffc06a4a79>] ? drm_gem_handle_delete+0x59/0x80 [drm]
Jul 4 16:03:38 dumaty kernel: [14164.394899] [<ffffffffc06a5c2a>] ? drm_ioctl+0x1fa/0x470 [drm]
Jul 4 16:03:38 dumaty kernel: [14164.396774] [<ffffffffc06a5150>] ? drm_gem_handle_create+0x40/0x40 [drm]
Jul 4 16:03:38 dumaty kernel: [14164.398721] [<ffffffffade2a5b6>] ? current_time+0x36/0x70
Jul 4 16:03:38 dumaty kernel: [14164.400573] [<ffffffffadda43ec>] ? shmem_truncate_range+0x1c/0x40
Jul 4 16:03:38 dumaty kernel: [14164.402625] [<ffffffffadd2d5a4>] ? __seccomp_filter+0x74/0x270
Jul 4 16:03:38 dumaty kernel: [14164.404488] [<ffffffffade220e2>] ? do_vfs_ioctl+0xa2/0x620
Jul 4 16:03:38 dumaty kernel: [14164.406324] [<ffffffffadc03337>] ? syscall_trace_enter+0x117/0x2c0
Jul 4 16:03:38 dumaty kernel: [14164.408169] [<ffffffffade226d4>] ? SyS_ioctl+0x74/0x80
Jul 4 16:03:38 dumaty kernel: [14164.410000] [<ffffffffadc03b7d>] ? do_syscall_64+0x8d/0x100
Jul 4 16:03:38 dumaty kernel: [14164.411822] [<ffffffffae22238e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jul 4 16:03:38 dumaty kernel: [14164.413687] Code: 40 00 66 2e 0f 1f 84 00 00 00 00 00 83 f9 03 74 48 41 56 41 55 49 89 fe 41 54 55 31 ed 85 d2 53 41 89 d5 48 89 f3 41 89 cc 7e 25 <8b> 53 18 48 8b 73 10 44 89 e1 4c 89 f7 83 c5 01 e8 9c ff ff ff
Jul 4 16:03:38 dumaty kernel: [14164.415765] RIP [<ffffffffadf61e1f>] swiotlb_unmap_sg_attrs+0x1f/0x50
Jul 4 16:03:38 dumaty kernel: [14164.417692] RSP <ffffb7c48257bc70>
Jul 4 16:03:38 dumaty kernel: [14164.419631] CR2: 0000000000000018
Jul 4 16:03:38 dumaty kernel: [14164.421605] ---[ end trace e62295838fbe3e50 ]---
Jul 4 16:04:00 dumaty kernel: [14186.946643] GpuWatchdog[1835]: segfault at 0 ip 00005564adf60a02 sp 00007fac7f8656f0 error 6 in chrome[5564a96c6000+7bf3000]
Jul 4 16:04:52 dumaty kernel: [14238.504806] BUG: unable to handle kernel paging request at 000000030ea51897
Jul 4 16:04:52 dumaty kernel: [14238.507190] IP: [<ffffffffadc98962>] __task_pid_nr_ns+0x42/0x90
Jul 4 16:04:52 dumaty kernel: [14238.509452] PGD 0
Jul 4 16:04:52 dumaty kernel: [14238.509464]
Jul 4 16:04:52 dumaty kernel: [14238.511711] Oops: 0000 [#3] SMP
Jul 4 16:04:52 dumaty kernel: [14238.513959] Modules linked in: ctr ccm binfmt_misc rfcomm fuse cmac bnep iTCO_wdt iTCO_vendor_support intel_rapl arc4 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi kvm iwlmvm irqbypass intel_cstate mac80211 joydev evdev intel_uncore pcspkr intel_rapl_perf snd_hda_codec_realtek rtsx_pci_ms serio_raw iwlwifi sg hid_multitouch snd_hda_codec_generic memstick uvcvideo lpc_ich cfg80211 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core btusb cdc_mbim btrtl cdc_wdm snd_hda_intel videodev btbcm shpchp btintel snd_hda_codec i915 media cdc_ncm cdc_acm snd_hda_core bluetooth usbnet drm_kms_helper mii snd_hwdep drm mei_me snd_pcm snd_timer mei i2c_algo_bit thinkpad_acpi wmi nvram snd soundcore ac rfkill battery video button parport_pc ppdev lp parport ip_tables x_tables
Jul 4 16:04:52 dumaty kernel: [14238.521475] autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache algif_skcipher af_alg usbhid hid dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel rtsx_pci_sdmmc mmc_core aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse ahci libahci i2c_i801 i2c_smbus libata xhci_pci scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci e1000e mfd_core ptp usbcore pps_core usb_common thermal
Jul 4 16:04:52 dumaty kernel: [14238.529141] CPU: 2 PID: 8675 Comm: top Tainted: G B D W 4.9.0-16-amd64 #1 Debian 4.9.272-1
Jul 4 16:04:52 dumaty kernel: [14238.531734] Hardware name: LENOVO 20ARA0YL00/20ARA0YL00, BIOS GJET77WW (2.27 ) 05/20/2014
Jul 4 16:04:52 dumaty kernel: [14238.534341] task: ffff9cf60fe45100 task.stack: ffffb7c488158000
Jul 4 16:04:52 dumaty kernel: [14238.537059] RIP: 0010:[<ffffffffadc98962>] [<ffffffffadc98962>] __task_pid_nr_ns+0x42/0x90
Jul 4 16:04:52 dumaty kernel: [14238.539736] RSP: 0018:ffffb7c48815bd78 EFLAGS: 00010286
Jul 4 16:04:52 dumaty kernel: [14238.542405] RAX: 0000000000000508 RBX: ffff9cf64a93de40 RCX: ffff9cf64d816e00
Jul 4 16:04:52 dumaty kernel: [14238.545097] RDX: 000000030ea51067 RSI: 0000000000000004 RDI: ffff9cf6354d1588
Jul 4 16:04:52 dumaty kernel: [14238.547772] RBP: ffff9cf64d816e00 R08: 000000000000044c R09: 0000000000000000
Jul 4 16:04:52 dumaty kernel: [14238.550439] R10: 0000000000000007 R11: ffff9cf64aa452a6 R12: ffff9cf6354d1080
Jul 4 16:04:52 dumaty kernel: [14238.553111] R13: ffffffffae61bb79 R14: 0000000000000066 R15: ffff9cf64da0c840
Jul 4 16:04:52 dumaty kernel: [14238.555792] FS: 00007fa93fea2280(0000) GS:ffff9cf65e280000(0000) knlGS:0000000000000000
Jul 4 16:04:52 dumaty kernel: [14238.558422] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 4 16:04:52 dumaty kernel: [14238.560971] CR2: 000000030ea51897 CR3: 000000030b9c2000 CR4: 0000000000160670
Jul 4 16:04:52 dumaty kernel: [14238.563467] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 4 16:04:52 dumaty kernel: [14238.565880] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 4 16:04:52 dumaty kernel: [14238.568210] Stack:
Jul 4 16:04:52 dumaty kernel: [14238.570452] ffffffffade856ff ffffffffae83eee0 ffffffffae845d20 ffff9cf635772800
Jul 4 16:04:52 dumaty kernel: [14238.572715] 00000000000003ff 000000000000044c 0000000000000040 ffffb7c48815bed8
Jul 4 16:04:52 dumaty kernel: [14238.574970] ffffb7c48815beec 0000000000000001 0000000000000000 0000000000000000
Jul 4 16:04:52 dumaty kernel: [14238.577206] Call Trace:
Jul 4 16:04:52 dumaty kernel: [14238.579412] [<ffffffffade856ff>] ? proc_pid_status+0x46f/0x9f0
Jul 4 16:04:52 dumaty kernel: [14238.581621] [<ffffffffaddea428>] ? __kmalloc+0x188/0x580
Jul 4 16:04:52 dumaty kernel: [14238.583821] [<ffffffffade7ff51>] ? proc_single_show+0x51/0x80
Jul 4 16:04:52 dumaty kernel: [14238.586022] [<ffffffffade34326>] ? seq_read+0x106/0x400
Jul 4 16:04:52 dumaty kernel: [14238.588217] [<ffffffffade0d6e1>] ? vfs_read+0x91/0x130
Jul 4 16:04:52 dumaty kernel: [14238.590405] [<ffffffffade0ebfa>] ? SyS_read+0x5a/0xd0
Jul 4 16:04:52 dumaty kernel: [14238.592578] [<ffffffffadc03b7d>] ? do_syscall_64+0x8d/0x100
Jul 4 16:04:52 dumaty kernel: [14238.594739] [<ffffffffae22238e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jul 4 16:04:52 dumaty kernel: [14238.596894] Code: 08 05 00 00 74 1a 83 fe 04 74 0e 89 f6 48 8d 04 76 48 8d 04 c5 08 05 00 00 48 8b bf d0 04 00 00 48 01 c7 48 8b 0f 48 85 c9 74 1f <8b> b2 30 08 00 00 31 c0 3b 71 04 77 0d 48 c1 e6 05 48 01 f1 48
Jul 4 16:04:52 dumaty kernel: [14238.599315] RIP [<ffffffffadc98962>] __task_pid_nr_ns+0x42/0x90
Jul 4 16:04:52 dumaty kernel: [14238.601602] RSP <ffffb7c48815bd78>
Jul 4 16:04:52 dumaty kernel: [14238.603887] CR2: 000000030ea51897
Jul 4 16:04:52 dumaty kernel: [14238.606195] ---[ end trace e62295838fbe3e51 ]---
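I have run memtest, memtest86+ and stress-ng (cpu, hdd and vm stressors) for hours without triggering a failure. Turning off swap kept things stable for almost a day. For this reason, and because smartctl -t short never seemed to complete, I have ordered a replacement SSD. The failures above occurred shortly afterwards. I think all of the crashes happened while I was watching YouTube (though there have not been many). glxgears does not trigger the fault.
Any ideas on what is causing this and how to diagnose it?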
Answer 1
You did not say whether you have an HDD or an SSD, but instead of trying a short disk self-test you can view a summary of the accumulated SMART error counters with:
sudo smartctl -A /dev/sda
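A non-zero "raw value" for Reallocated_Sector_Ct is a particular cause for concern.
Many attributes are labelled "Old_age" or even "Pre-fail". That is apparently completely normal: the label appears in the TYPE column and describes the kind of attribute, not a failure that has already happened.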
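The check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the `SAMPLE` rows are invented for the example and are not taken from the question, and the two extra attribute names in `WATCHED` are an assumption about counters worth watching that the answer does not mention. It assumes the usual ten-column table that `smartctl -A` prints for ATA devices.

```python
import re

# Reallocated_Sector_Ct is the attribute named in the answer above.
# The other two are an ASSUMPTION (commonly watched counters), not from the answer.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# One attribute row: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s+(\S+)\s+(\S+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)

# Invented sample rows in the layout of `smartctl -A` output (not real data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       21500
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Return {attribute name: {"type": ..., "raw": int or None}} for each table row."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, blank line, or anything that is not an attribute row
        raw_text = m.group(10).strip()
        first = raw_text.split()[0] if raw_text else ""
        attrs[m.group(2)] = {
            "type": m.group(7),
            # Some raw values carry extra text; only a leading integer is used.
            "raw": int(first) if first.isdigit() else None,
        }
    return attrs


def nonzero_watched(attrs, watched=WATCHED):
    """Return the watched attributes whose raw value is present and non-zero."""
    return {name: a["raw"] for name, a in attrs.items() if name in watched and a["raw"]}


print(nonzero_watched(parse_smart_attributes(SAMPLE)))
```

To use it on a real disk, save the output of the smartctl command shown above to a file and pass the file contents to `parse_smart_attributes`. An empty result only means the watched counters are zero; it does not prove the disk is healthy.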
Answer 2
This turned out to be an i915 driver problem triggered by low memory, and it was fixed by upgrading to Linux 4.19.0. I do not know why it had not happened months earlier; that is what led me to rule out a software problem entirely.
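Answer 3
I am facing the same or a very similar problem. My Thinkpad T470/i5-7200U has been running Debian 9 with kernel 4.9.0 for years. At least last year's 4.9.0-14 and 4.9.0-15 were fine (apart from occasionally failing to suspend on the first attempt). I recently upgraded the memory, which meant opening the case (where anything can go wrong), and upgraded to 4.9.0-16. Since then I get various kernel errors several times a day that kill applications or freeze the whole system. Suspending the system is also almost impossible. The errors look like this: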
Oct 15 10:19:48 tp470 kernel: [ 3214.743652] BUG: Bad page map in process chromium-browse pte:ac00000000000000 pmd:19dedf067
Oct 15 10:19:48 tp470 kernel: [ 3214.743799] BUG: Bad page map in process chromium-browse pte:1ac9625ced pmd:19dedf067
Oct 15 12:00:51 tp470 kernel: [ 9277.955817] BUG: unable to handle kernel NULL pointer dereference at 0000000000000017
Oct 15 12:12:22 tp470 kernel: [ 9968.541038] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007
Oct 15 12:26:32 tp470 kernel: [10818.769010] BUG: unable to handle kernel NULL pointer dereference at 0000000000000017
Oct 15 13:16:42 tp470 kernel: [ 2294.294541] BUG: unable to handle kernel paging request at ffff9d1ac9625ced
Oct 15 13:34:42 tp470 kernel: [ 3374.077080] BUG: Bad page map in process Privileged Cont pte:8000001ac9625ced pmd:1d2f4e067
Oct 15 13:34:42 tp470 kernel: [ 3374.087870] BUG: Bad page cache in process firefox-esr pfn:1a94c9
Oct 15 16:17:04 tp470 kernel: [13115.567010] BUG: unable to handle kernel paging request at ffff9d1ac9625d35
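My first thought was the same: damaged hardware, so I tested both RAM modules, and so on. Then I tested Debian 11 from USB and had no problems at all. I have now gone back to 4.9.0-14 and the kernel gives no trouble. So it seems that 4.9.0-16-amd64 has an internal problem with (at least) some Intel CPUs...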
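When comparing kernels like this, it can help to count which kinds of `BUG:` lines show up and how often the machine was rebooted in between. The following is a rough sketch, not a tool anyone in this thread used: the function name `summarise` and both regular expressions are made up for the example, and the sample lines are copied from the log excerpt above. It relies on the bracketed number being seconds since boot, so a drop in that number between two matches means a reboot happened.

```python
import re
from collections import Counter

# "kernel: [ 3214.743652] BUG: ..." -> seconds since boot, then the message
BUG_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(BUG: .+)$")
# Strip per-occurrence details (process name, address, mm pointer) to get the kind.
DETAIL = re.compile(r" (?:in process |at |mm:).*$")

# Lines copied from the excerpt in this answer.
SAMPLE = [
    "Oct 15 10:19:48 tp470 kernel: [ 3214.743652] BUG: Bad page map in process chromium-browse pte:ac00000000000000 pmd:19dedf067",
    "Oct 15 10:19:48 tp470 kernel: [ 3214.743799] BUG: Bad page map in process chromium-browse pte:1ac9625ced pmd:19dedf067",
    "Oct 15 12:00:51 tp470 kernel: [ 9277.955817] BUG: unable to handle kernel NULL pointer dereference at 0000000000000017",
    "Oct 15 13:16:42 tp470 kernel: [ 2294.294541] BUG: unable to handle kernel paging request at ffff9d1ac9625ced",
]


def summarise(lines):
    """Return (Counter of BUG kinds, number of reboots seen between BUG lines)."""
    kinds = Counter()
    reboots = 0
    last_uptime = None
    for line in lines:
        m = BUG_LINE.search(line.rstrip("\n"))
        if not m:
            continue  # not a kernel BUG line
        uptime = float(m.group(1))
        if last_uptime is not None and uptime < last_uptime:
            reboots += 1  # uptime went backwards, so the machine was rebooted
        last_uptime = uptime
        kinds[DETAIL.sub("", m.group(2))] += 1
    return kinds, reboots


kinds, reboots = summarise(SAMPLE)
for kind, count in kinds.most_common():
    print(count, kind)
print("reboots between BUG lines:", reboots)
```

For a real log, pass an open file instead of `SAMPLE`, for example `summarise(open("/var/log/syslog"))`. Running it once per kernel version gives a simple before/after comparison; reboots that happen without any `BUG:` line in between are not counted.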