kernel: BUG: soft lockup _raw_spin_unlock_irqrestore

The system hung and was then rebooted, and the soft lockup messages keep appearing. vmcore collection was not enabled before the reboot, so no vmcore is available. Kernel: 3.10.0-327.el7.x86_64.

Has anyone run into a similar issue before and knows what the problem is? Thanks.

Nov 14 06:25:07 localhost kernel: BUG: soft lockup - CPU#3 stuck for 37s! [xfsaild/dm-0:487]
Nov 14 06:25:07 localhost kernel: Modules linked in: fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 binfmt_misc ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter vmw_vsock_vmci_transport vsock coretemp crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ppdev vmw_balloon pcspkr sg parport_pc parport shpchp i2c_piix4 vmw_vmci ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi sd_mod crc_t10dif crct10dif_generic serio_raw crct10dif_pclmul
Nov 14 06:25:07 localhost kernel: crct10dif_common vmwgfx crc32c_intel drm_kms_helper ttm drm ata_piix vmxnet3 libata i2c_core vmw_pvscsi floppy dm_mirror dm_region_hash dm_log dm_mod
Nov 14 06:25:07 localhost kernel: CPU: 3 PID: 487 Comm: xfsaild/dm-0 Tainted: G             L ------------   3.10.0-327.el7.x86_64 #1
Nov 14 06:25:07 localhost kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
Nov 14 06:25:07 localhost kernel: task: ffff880fe4ac9700 ti: ffff880fe33f4000 task.ti: ffff880fe33f4000
Nov 14 06:25:07 localhost kernel: RIP: 0010:[<ffffffff8163ca4b>]  [<ffffffff8163ca4b>] _raw_spin_unlock_irqrestore+0x1b/0x40
Nov 14 06:25:07 localhost kernel: RSP: 0018:ffff880fe33f7b68  EFLAGS: 00000282
Nov 14 06:25:07 localhost kernel: RAX: 0000000000000000 RBX: ffff880fe33f7b30 RCX: 0000000000000200
Nov 14 06:25:07 localhost kernel: RDX: ffffc90006060000 RSI: 0000000000000282 RDI: 0000000000000282
Nov 14 06:25:07 localhost kernel: RBP: ffff880fe33f7b70 R08: 0000000000000000 R09: ffff8805e761ec00
Nov 14 06:25:07 localhost kernel: R10: ffff880fe471a000 R11: ffff880fe88db800 R12: ffff880b28fc9f00
Nov 14 06:25:07 localhost kernel: R13: 0000000000000020 R14: ffffffff8141e59f R15: ffff880fe33f7ae0
Nov 14 06:25:07 localhost kernel: FS:  0000000000000000(0000) GS:ffff88103fcc0000(0000) knlGS:0000000000000000
Nov 14 06:25:07 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 06:25:07 localhost kernel: CR2: 00007ff539c0c810 CR3: 0000000fe7528000 CR4: 00000000001407e0
Nov 14 06:25:07 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 14 06:25:07 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 14 06:25:07 localhost kernel: Stack:
Nov 14 06:25:07 localhost kernel: 0000000000000000 ffff880fe33f7be0 ffffffffa00554f7 ffff8805e761ec00
Nov 14 06:25:07 localhost kernel: ffff880fe471a000 ffffffff8141e0c0 0000000000000002 ffff880fe88dc754
Nov 14 06:25:07 localhost kernel: ffff880fe31e6a00 0000000000000282 ffff880b28fc9f80 0000000000000000
Nov 14 06:25:07 localhost kernel: Call Trace:
Nov 14 06:25:07 localhost kernel: [<ffffffffa00554f7>] pvscsi_queue+0x3b7/0x5c0 [vmw_pvscsi]
Nov 14 06:25:07 localhost kernel: [<ffffffff8141e0c0>] ? scsi_kmap_atomic_sg+0x190/0x190
Nov 14 06:25:07 localhost kernel: [<ffffffff81417b1a>] scsi_dispatch_cmd+0xaa/0x230
Nov 14 06:25:07 localhost kernel: [<ffffffff81420aa1>] scsi_request_fn+0x501/0x770
Nov 14 06:25:07 localhost kernel: [<ffffffff812c73e3>] __blk_run_queue+0x33/0x40
Nov 14 06:25:07 localhost kernel: [<ffffffff812c749a>] queue_unplugged+0x2a/0xa0
Nov 14 06:25:07 localhost kernel: [<ffffffff812cbcc5>] blk_flush_plug_list+0x185/0x230
Nov 14 06:25:07 localhost kernel: [<ffffffff812cc124>] blk_finish_plug+0x14/0x40
Nov 14 06:25:07 localhost kernel: [<ffffffffa0222a79>] __xfs_buf_delwri_submit+0x1e9/0x250 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffffa022367f>] ? xfs_buf_delwri_submit_nowait+0x2f/0x50 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffffa024e470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffffa022367f>] xfs_buf_delwri_submit_nowait+0x2f/0x50 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffffa024e6b0>] xfsaild+0x240/0x5e0 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffffa024e470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 14 06:25:07 localhost kernel: [<ffffffff810a5aef>] kthread+0xcf/0xe0
Nov 14 06:25:07 localhost kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Nov 14 06:25:07 localhost kernel: [<ffffffff81645858>] ret_from_fork+0x58/0x90
Nov 14 06:25:07 localhost kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Nov 14 06:25:07 localhost kernel: Code: 08 e8 aa 72 a4 ff 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 f3 0f 1f 44 00 00 66 83 07 02 48 89 df 57 9d <0f> 1f 44 00 00 5b 5d c3 0f 1f 44 00 00 8b 37 f0 66 83 07 02 f6

Answer 1

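The soft lockup appears to be related to the handling of disk I/O requests. On a physical hardware system, you would check the SMART data and any other available disk health information to rule out a hardware problem.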

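However, this appears to be a VMware virtual machine, so the first thing to check is the virtualization host's statistics: is the host or its storage overloaded by all the virtual machines running on it? That could cause long delays in responding to I/O requests. If such a delay lasts more than about 30 seconds, you will start getting these soft lockup notifications, even though the root cause may be that the host does not have enough CPU capacity or storage I/O bandwidth to satisfy all of its virtual machines.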
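To line these lockups up against the host's statistics, it may help to first pull every soft lockup event out of the guest's logs and see which tasks and CPUs are affected and for how long. Below is a minimal sketch, assuming the messages have the same format as the line in the question; the function names `parse_soft_lockups` and `summarize` are made up for illustration and are not part of any existing tool.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
#   kernel: BUG: soft lockup - CPU#3 stuck for 37s! [xfsaild/dm-0:487]
# (format assumed from the log in the question)
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(lines):
    """Return one dict per soft lockup message found in the given lines."""
    events = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "cpu": int(m["cpu"]),      # CPU that was stuck
                "secs": int(m["secs"]),    # how long it was reported stuck
                "comm": m["comm"],         # task name, e.g. xfsaild/dm-0
                "pid": int(m["pid"]),
            })
    return events


def summarize(events):
    """Count soft lockup events per task name."""
    return Counter(e["comm"] for e in events)
```

You could feed it the lines of a saved log file, for example `parse_soft_lockups(open(path))`, where `path` is wherever your kernel messages are stored (that location depends on your logging setup). If most events name I/O-related tasks such as `xfsaild`, that would be consistent with slow storage on the host, but it is only a hint and not proof.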