Fedora 23 hangs on reboot/poweroff [nvidia.ko NULL pointer dereference]


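This seems to happen at random and has been going on for a long time, even though I update everything regularly.

  • Sometimes reboot and poweroff work fine, and quickly.
  • Sometimes the machine sits at a black screen for a while. If I press Esc, I can see that a stop job is running, and after 1.5 minutes the machine restarts. In these cases I can press ctrl-alt-del 7 times within 2 seconds and it reboots (though not immediately, as the message claims).
  • Sometimes it freezes before the countdown completes, and I have to pull the plug.
  • Sometimes it freezes before it even starts shutting down, leaving only a black screen.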

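Which logs can I look at to get more information (for example, is there a specific shutdown log)?

What could be the cause, and how can I get the machine to restart reliably?

I consider it my own responsibility to make sure all important programs have finished, so I would not object to a forced restart that gives processes no time to exit at all.

This question is very similar, but I usually get only a blank screen, so I assume X has already shut down.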

Edit:
This is the tail of journalctl --boot=-1 after a failed reboot. There are bug reports for some of these lines, but I have not found a solution yet.

May 10 22:55:37 localhost abrt-hook-ccpp[15180]: /var/spool/abrt is 3433423071 bytes (more than 1279MiB), deleting 'ccpp-2016-05-10-22:55:30-1387'
May 10 22:55:38 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000178
May 10 22:55:38 localhost kernel: IP: [<ffffffffa02d83f8>] _nv003139rm+0x68/0x240 [nvidia]
May 10 22:55:38 localhost kernel: PGD 0 
May 10 22:55:38 localhost kernel: Oops: 0000 [#1] SMP 
May 10 22:55:38 localhost kernel: Modules linked in: nvidia_modeset(POE) ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm fuse irqbypass snd_hda_codec_analog snd_hda_codec_generic crc32c_intel i2c_i801 snd_hda_codec_hdmi snd_usb_audio snd_usbmidi_lib nvidia(POE) snd_rawmidi lpc_ich joydev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device drm snd_pcm snd_timer snd soundcore i7core_edac i5500_temp edac_core shpchp
May 10 22:55:38 localhost kernel:  tpm_infineon asus_atk0110 acpi_cpufreq tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc hid_microsoft mxm_wmi serio_raw ata_generic pata_acpi sky2 wmi fjes
May 10 22:55:38 localhost kernel: CPU: 1 PID: 1387 Comm: kwin_x11 Tainted: P          IOE   4.4.8-300.fc23.x86_64 #1
May 10 22:55:38 localhost kernel: Hardware name: System manufacturer System Product Name/P6T DELUXE V2, BIOS 1003    03/08/2010
May 10 22:55:38 localhost kernel: task: ffff8800b1305640 ti: ffff8801ae410000 task.ti: ffff8801ae410000
May 10 22:55:38 localhost kernel: RIP: 0010:[<ffffffffa02d83f8>]  [<ffffffffa02d83f8>] _nv003139rm+0x68/0x240 [nvidia]
May 10 22:55:38 localhost kernel: RSP: 0018:ffff8801ae4137d8  EFLAGS: 00010286
May 10 22:55:38 localhost kernel: RAX: ffff8800b9ffafe8 RBX: ffff8801ad9e2008 RCX: 0000000000000000
May 10 22:55:38 localhost kernel: RDX: ffff880198d6d408 RSI: ffff8801b7bc5808 RDI: ffff8801b6694008
May 10 22:55:38 localhost kernel: RBP: ffff880197bcaca8 R08: 0000000000000001 R09: 0000000000000000
May 10 22:55:38 localhost kernel: R10: ffffea0002e8cc40 R11: ffffffffa076f040 R12: ffff880198d6d408
May 10 22:55:38 localhost kernel: R13: ffff8801b6694008 R14: 0000000000000000 R15: ffff880198d6d408
May 10 22:55:38 localhost kernel: FS:  00007f59274b6940(0000) GS:ffff8801b9240000(0000) knlGS:0000000000000000
May 10 22:55:38 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 10 22:55:38 localhost kernel: CR2: 0000000000000178 CR3: 0000000001c09000 CR4: 00000000000006e0
May 10 22:55:38 localhost kernel: Stack:
May 10 22:55:38 localhost kernel:  ffff8801b6694008 ffff880198d6d408 0000000000000000 ffff8801ad9e2008
May 10 22:55:38 localhost kernel:  ffff880197bcad20 ffffffffa066b918 ffff8801b6faedb0 ffff880198d6d408
May 10 22:55:38 localhost kernel:  ffff8801b5acce08 ffff8801b6694008 0000000000000000 ffffffffa066bb44
May 10 22:55:38 localhost kernel: Call Trace:
May 10 22:55:38 localhost kernel:  [<ffffffffa066b918>] ? _nv015306rm+0xb8/0x3d70 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa066bb44>] ? _nv015306rm+0x2e4/0x3d70 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa061a299>] ? _nv014791rm+0x439/0x920 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa061a46e>] ? _nv014791rm+0x60e/0x920 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa0560129>] ? _nv010603rm+0x1e9/0x310 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa04ab422>] ? _nv007106rm+0x352/0x380 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa04a6efb>] ? _nv007114rm+0xbb/0xf0 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa049f5d6>] ? _nv007665rm+0xb56/0x13c0 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa059939a>] ? _nv011536rm+0x20a/0x380 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa0598112>] ? _nv011559rm+0x10b2/0x1690 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa05947a1>] ? _nv011534rm+0x81/0x690 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa049f2d9>] ? _nv007665rm+0x859/0x13c0 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa049d1d8>] ? _nv007704rm+0x108/0x1750 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa0528163>] ? _nv002124rm+0x2c33/0x3cf0 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa07ab5cf>] ? _nv000818rm+0x1cf/0x270 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa079fba8>] ? rm_shutdown_adapter+0xc8/0xf0 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffff8141a500>] ? free_msi_irqs+0xc0/0x190
May 10 22:55:38 localhost kernel:  [<ffffffffa0279768>] ? nv_close_device+0x118/0x130 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa027b940>] ? nvidia_close+0xd0/0x300 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffffa027939c>] ? nvidia_frontend_close+0x2c/0x50 [nvidia]
May 10 22:55:38 localhost kernel:  [<ffffffff8122fd5c>] ? __fput+0xdc/0x1e0
May 10 22:55:38 localhost kernel:  [<ffffffff8122fe9e>] ? ____fput+0xe/0x10
May 10 22:55:38 localhost kernel:  [<ffffffff810c0b53>] ? task_work_run+0x73/0x90
May 10 22:55:38 localhost kernel:  [<ffffffff810a6d32>] ? do_exit+0x2d2/0xad0
May 10 22:55:38 localhost kernel:  [<ffffffff810a75b7>] ? do_group_exit+0x47/0xb0
May 10 22:55:38 localhost kernel:  [<ffffffff810b2a34>] ? get_signal+0x294/0x610
May 10 22:55:38 localhost kernel:  [<ffffffff81017297>] ? do_signal+0x37/0x6b0
May 10 22:55:38 localhost kernel:  [<ffffffff810b0dbe>] ? send_signal+0x3e/0x80
May 10 22:55:38 localhost kernel:  [<ffffffff817a0e4e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
May 10 22:55:38 localhost kernel:  [<ffffffff810b17ec>] ? do_send_sig_info+0x6c/0xa0
May 10 22:55:38 localhost kernel:  [<ffffffff8100320c>] ? exit_to_usermode_loop+0x8c/0xd0
May 10 22:55:38 localhost kernel:  [<ffffffff81003d41>] ? syscall_return_slowpath+0xa1/0xb0
May 10 22:55:38 localhost kernel:  [<ffffffff817a150c>] ? int_ret_from_sys_call+0x25/0x8f
May 10 22:55:38 localhost kernel: Code: 00 00 48 89 45 28 be 30 00 00 00 48 8b bb f0 0a 00 00 ff 93 88 06 00 00 4c 8b 30 48 8b 75 28 4c 89 fa 41 b8 01 00 00 00 4c 89 ef <41> 8b 8e 78 01 00 00 e8 ac 54 19 00 85 c0 ba 1f 00 00 00 75 46 
May 10 22:55:38 localhost kernel: RIP  [<ffffffffa02d83f8>] _nv003139rm+0x68/0x240 [nvidia]
May 10 22:55:38 localhost kernel:  RSP <ffff8801ae4137d8>
May 10 22:55:38 localhost kernel: CR2: 0000000000000178
May 10 22:55:38 localhost kernel: ---[ end trace b99676761802a72c ]---
May 10 22:55:38 localhost kernel: Fixing recursive fault but reboot is needed!
May 10 22:55:38 localhost systemd[1]: Received SIGCHLD from PID 1304 (kded5).
May 10 22:55:38 localhost systemd[1]: Child 1304 (kded5) died (code=killed, status=11/SEGV)
May 10 22:55:38 localhost systemd[1]: session-1.scope: Child 1304 belongs to session-1.scope
May 10 22:55:49 localhost systemd-coredump[15204]: Process 15193 (klauncher) of user 1000 dumped core.

  Stack trace of thread 15193:
  #0  0x00007fda66c30a98 raise (libc.so.6)
  #1  0x00007fda66c3269a abort (libc.so.6)
  #2  0x00007fda678ed031 _ZNK14QMessageLogger5fatalEPKcz (libQt5Core.so.5)
  #3  0x00007fda4d8e24d0 _ZN14QXcbConnectionC1EP19QXcbNativeInterfacebjPKc (libQt5XcbQpa.so.5)
  #4  0x00007fda4d8e7a91 _ZN15QXcbIntegrationC1ERK11QStringListRiPPc (libQt5XcbQpa.so.5)
  #5  0x00007fda699a55cd _ZN21QXcbIntegrationPlugin6createERK7QStringRK11QStringListRiPPc (/usr/lib64/qt5/plugins/platforms/libqxcb.so)
  #6  0x00007fda67e1159f _ZN27QPlatformIntegrationFactory6createERK7QStringRK11QStringListRiPPcS2_ (libQt5Gui.so.5)
  #7  0x00007fda67e1f933 _ZN22QGuiApplicationPrivate25createPlatformIntegrationEv (libQt5Gui.so.5)
  #8  0x00007fda67e2075d _ZN22QGuiApplicationPrivate21createEventDispatcherEv (libQt5Gui.so.5)
  #9  0x00007fda67adeb7f _ZN16QCoreApplication4initEv (libQt5Core.so.5)
  #10 0x00007fda67adec56 _ZN16QCoreApplicationC1ER23QCoreApplicationPrivate (libQt5Core.so.5)
  #11 0x00007fda67e22e9d _ZN15QGuiApplicationC2ERiPPci (libQt5Gui.so.5)
  #12 0x00007fda4e2634aa kdemain (libkdeinit5_klauncher.so)
  #13 0x000055e2014b0a8d _ZL6launchiPKcS0_S0_iS0_bS0_bS0_.constprop.27 (kdeinit5)
  #14 0x000055e2014ad691 main (kdeinit5)
  #15 0x00007fda66c1c580 __libc_start_main (libc.so.6)
  #16 0x000055e2014adb79 _start (kdeinit5)

(Nvidia driver 361.42, Linux kernel 4.4.7)

Answer 1

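Unfortunately, "nvidia.ko eats a NULL on shutdown" is not specific enough to act on. You cannot ask kernel developers to debug ring 0 code without full debugging information (see the traceback, where every frame inside the nvidia module is an opaque symbol). Only one company has that ability ;). You need to report the crash details to NVidia (I assume there is no automatic reporting facility). They may have a troubleshooting process for you to follow.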
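To make such a report easier to put together, the oops can be cut out of the journal on its own. The following is only a minimal sketch: the function name extract_oops and the output file name oops.txt are made up for illustration, and it assumes the trace is delimited by a "kernel: BUG:" line and an "end trace" marker, as in the log above.

```shell
#!/bin/sh
# Sketch: isolate a kernel oops from journal text so it can be attached
# to a bug report. extract_oops is a hypothetical helper, not part of
# systemd or the NVIDIA driver.

# Read journal text on stdin and print every line from a "kernel: BUG:"
# header through the next "---[ end trace" marker, inclusive.
extract_oops() {
    awk '
        /kernel: BUG: /     { printing = 1 }
        printing            { print }
        /---\[ end trace /  { printing = 0 }
    '
}

# Usage (kernel messages from the previous boot only):
#   journalctl --boot=-1 --dmesg --no-pager | extract_oops > oops.txt
```

The resulting file, together with the driver and kernel versions, is the kind of detail a report would need. NVIDIA's driver normally also provides an nvidia-bug-report.sh script that gathers system information, though whether it is installed depends on how the driver was packaged.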

(Or switch back to nouveau, which would stop this particular trace.)
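How to switch drivers depends on how the NVIDIA driver was installed, so that step is not shown here. After switching, though, it is worth confirming which kernel module is actually loaded. A small sketch, with gpu_driver_in_use as a hypothetical helper name, that reads lsmod output:

```shell
#!/bin/sh
# Sketch: report which GPU kernel module appears in lsmod-style output.
# gpu_driver_in_use is a hypothetical helper written for this example.

# Read lsmod output on stdin; print "nvidia", "nouveau", or "neither".
# Only exact module names in the first column are matched, so helper
# modules such as nvidia_modeset do not count on their own.
gpu_driver_in_use() {
    awk '
        $1 == "nvidia"  { found = "nvidia" }
        $1 == "nouveau" { found = "nouveau" }
        END { print (found ? found : "neither") }
    '
}

# Usage:
#   lsmod | gpu_driver_in_use
```

If this prints nouveau, the proprietary module is no longer loaded, so the nvidia frames seen in the trace above cannot occur.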
