Linux I/O deadlock after kernel upgrade

Since upgrading from 4.19 I have been running Linux kernel 5.4.35+ (compiled from the Debian "vanilla kernel"). Since then, after a few days (2-3 days), my md RAID 0 on hpsa hangs and the RAID goes read-only / denies I/O.

The SMART statistics show no fatal or critical errors.

I also apply the 6 hpsahba patches, which can be found on GitHub.

The relevant part of the syslog is below. The full syslog can be found on Pastebin.

Apr 30 15:58:31 srv381 kernel: [544209.588021] sd 0:0:10:0: [sdj] tag#173 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:31 srv381 kernel: [544209.588026] sd 0:0:10:0: [sdj] tag#173 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:31 srv381 kernel: [544209.588028] sd 0:0:10:0: [sdj] tag#173 Add. Sense: Record not found
Apr 30 15:58:31 srv381 kernel: [544209.588032] sd 0:0:10:0: [sdj] tag#173 CDB: Write(16) 8a 00 00 00 00 01 91 28 00 00 00 00 01 30 00 00
Apr 30 15:58:31 srv381 kernel: [544209.588035] blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) flags 0x100000 phys_seg 5 prio class 0
Apr 30 15:58:42 srv381 kernel: [544220.603519] sd 0:0:10:0: [sdj] tag#179 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:42 srv381 kernel: [544220.603523] sd 0:0:10:0: [sdj] tag#179 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:42 srv381 kernel: [544220.603527] sd 0:0:10:0: [sdj] tag#179 Add. Sense: Unrecovered read error
Apr 30 15:58:42 srv381 kernel: [544220.603530] sd 0:0:10:0: [sdj] tag#179 CDB: Read(16) 88 00 00 00 00 00 4a d1 69 b0 00 00 02 50 00 00
Apr 30 15:58:42 srv381 kernel: [544220.603533] blk_update_request: critical medium error, dev sdj, sector 1255238064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0
Apr 30 15:59:05 srv381 kernel: [544243.400236] XFS (md0p2): writeback error on sector 6730284320
Apr 30 15:59:41 srv381 kernel: [544279.528345] sd 0:0:10:0: [sdj] tag#143 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:59:41 srv381 kernel: [544279.528352] sd 0:0:10:0: [sdj] tag#143 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:59:41 srv381 kernel: [544279.528354] sd 0:0:10:0: [sdj] tag#143 Add. Sense: Record not found
Apr 30 15:59:41 srv381 kernel: [544279.528358] sd 0:0:10:0: [sdj] tag#143 CDB: Write(16) 8a 00 00 00 00 01 91 2c c2 c8 00 00 01 38 00 00
Apr 30 15:59:41 srv381 kernel: [544279.528361] blk_update_request: critical medium error, dev sdj, sector 6730597064 op 0x1:(WRITE) flags 0x100000 phys_seg 20 prio class 0
Apr 30 15:59:41 srv381 kernel: [544279.557380] XFS (md0p2): writeback error on sector 6730597056
Apr 30 16:00:19 srv381 kernel: [544317.433932] hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical  Direct-Access     ATA      TP04000GB        PHYS DRV SSDSmartPathCap- En- Exp=1
Apr 30 16:00:24 srv381 kernel: [544322.470747] hpsa 0000:05:00.0: waiting 2 secs for device to become ready.
Apr 30 16:00:26 srv381 kernel: [544324.497534] hpsa 0000:05:00.0: waiting 4 secs for device to become ready.
Apr 30 16:00:30 srv381 kernel: [544328.529549] hpsa 0000:05:00.0: waiting 8 secs for device to become ready.
Apr 30 16:00:38 srv381 kernel: [544336.721590] hpsa 0000:05:00.0: waiting 16 secs for device to become ready.
Apr 30 16:00:54 srv381 kernel: [544352.849662] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:01:27 srv381 kernel: [544385.617802] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:00 srv381 kernel: [544418.386133] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:32 srv381 kernel: [544451.154095] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:55 srv381 kernel: [544473.682061] INFO: task jbd2/sda2-8:270 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682101]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682164] jbd2/sda2-8     D    0   270      2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.682166] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682176]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682178]  ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682179]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682181]  io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682182]  bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682184]  __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682186]  out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682190]  ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682198]  jbd2_journal_commit_transaction+0x107c/0x1930 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682203]  ? try_to_del_timer_sync+0x4f/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682208]  kjournald2+0xb7/0x280 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682210]  ? finish_wait+0x80/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682213]  kthread+0xf9/0x130
Apr 30 16:02:55 srv381 kernel: [544473.682217]  ? commit_timeout+0x10/0x10 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682219]  ? kthread_park+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682222]  ret_from_fork+0x35/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682228] INFO: task rs:main Q:Reg:917 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682261]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682288] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682323] rs:main Q:Reg   D    0   917      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682325] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682328]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682329]  ? _cond_resched+0x15/0x30
Apr 30 16:02:55 srv381 kernel: [544473.682331]  ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682332]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682334]  io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682335]  bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682337]  __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682338]  out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682340]  ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682345]  do_get_write_access+0x297/0x3e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682350]  jbd2_journal_get_write_access+0x5c/0x80 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682372]  __ext4_journal_get_write_access+0x37/0x80 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682385]  ? ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682398]  ext4_reserve_inode_write+0x93/0xc0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682412]  ext4_mark_inode_dirty+0x51/0x1d0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682416]  ? jbd2__journal_start+0xdc/0x1e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682429]  ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682432]  __mark_inode_dirty+0x262/0x380
Apr 30 16:02:55 srv381 kernel: [544473.682435]  generic_update_time+0x9d/0xc0
Apr 30 16:02:55 srv381 kernel: [544473.682437]  file_update_time+0xeb/0x140
Apr 30 16:02:55 srv381 kernel: [544473.682439]  __generic_file_write_iter+0x96/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682452]  ext4_file_write_iter+0xb6/0x360 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682456]  new_sync_write+0x12d/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682459]  vfs_write+0xb6/0x1a0
Apr 30 16:02:55 srv381 kernel: [544473.682461]  ksys_write+0x5f/0xe0
Apr 30 16:02:55 srv381 kernel: [544473.682465]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682467]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682469] RIP: 0033:0x7ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682474] Code: Bad RIP value.
Apr 30 16:02:55 srv381 kernel: [544473.682475] RSP: 002b:00007ffa64936860 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682477] RAX: ffffffffffffffda RBX: 00007ffa5c06b7a0 RCX: 00007ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682478] RDX: 000000000000006d RSI: 00007ffa5c06b7a0 RDI: 000000000000000c
Apr 30 16:02:55 srv381 kernel: [544473.682479] RBP: 00007ffa5c004ea0 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682480] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffa5c00a120
Apr 30 16:02:55 srv381 kernel: [544473.682481] R13: 000000000000006d R14: 0000000000000000 R15: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682491] INFO: task deluged:10450 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682523]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682550] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682585] deluged         D    0 10450      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682587] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682590]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682592]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682595]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682649]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682684]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682719]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682724]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682726]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682727]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682731]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682733]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682734]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682737]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682739]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682741] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682743] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682744] RSP: 002b:00007fc0d5b510f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682745] RAX: ffffffffffffffda RBX: 00007fc0d5b51190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682746] RDX: 0000000000000001 RSI: 00007fc0d5b51190 RDI: 0000000000001696
Apr 30 16:02:55 srv381 kernel: [544473.682747] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682748] R10: 0000000004dbadbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682749] R13: 0000000000001696 R14: 0000000000000001 R15: 0000000004dbadbe
Apr 30 16:02:55 srv381 kernel: [544473.682751] INFO: task deluged:10452 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682783]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682845] deluged         D    0 10452      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682847] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682849]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682853]  ? enqueue_task_fair+0x8c/0x4c0
Apr 30 16:02:55 srv381 kernel: [544473.682854]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682856]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682894]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682928]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682963]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682967]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682969]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682970]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682973]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682975]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682976]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682979]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682981]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682982] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682984] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682985] RSP: 002b:00007fc0d491f0f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682986] RAX: ffffffffffffffda RBX: 00007fc0d491f190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682987] RDX: 0000000000000001 RSI: 00007fc0d491f190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.682988] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682989] R10: 0000000005e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682990] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000005e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.682992] INFO: task deluged:10454 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683024]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683086] deluged         D    0 10454      1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.683088] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683090]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683092]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683094]  rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.683131]  ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683166]  xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683201]  xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683204]  do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.683206]  do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.683208]  vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.683210]  ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.683212]  ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.683214]  do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.683216]  do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.683218]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.683220] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683221] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.683222] RSP: 002b:00007fc0cf7f70f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.683224] RAX: ffffffffffffffda RBX: 00007fc0cf7f7190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683225] RDX: 0000000000000001 RSI: 00007fc0cf7f7190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.683226] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.683226] R10: 0000000002e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.683227] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000002e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.683235] INFO: task kworker/2:2:21309 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683268]       Tainted: G          I E     5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683330] kworker/2:2     D    0 21309      2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.683370] Workqueue: xfs-sync/md0p2 xfs_log_worker [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683372] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683375]  ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683376]  schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683385]  md_flush_request+0xa8/0x1b0 [md_mod]

Answer 1

There are no SMART errors, but the disk sdj is reporting errors when it is actually used, and this appears to be affecting the RAID volume md0p2.
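To check which devices the medium errors come from, the log can be tallied per device. This is only a minimal sketch, assuming the kernel messages have the `blk_update_request: critical medium error, dev ..., sector ...` shape shown in the question; the function name `tally_medium_errors` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) ...
MEDIUM_ERROR = re.compile(
    r"critical medium error, dev (?P<dev>\w+), "
    r"sector (?P<sector>\d+) op 0x[0-9a-f]+:\((?P<op>\w+)\)"
)

def tally_medium_errors(lines):
    """Group failing sectors by (device, operation). Hypothetical helper."""
    errors = defaultdict(list)
    for line in lines:
        m = MEDIUM_ERROR.search(line)
        if m:
            errors[(m["dev"], m["op"])].append(int(m["sector"]))
    return dict(errors)

if __name__ == "__main__":
    # Three lines copied from the syslog in the question.
    sample = [
        "blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) flags 0x100000 phys_seg 5 prio class 0",
        "blk_update_request: critical medium error, dev sdj, sector 1255238064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0",
        "blk_update_request: critical medium error, dev sdj, sector 6730597064 op 0x1:(WRITE) flags 0x100000 phys_seg 20 prio class 0",
    ]
    for (dev, op), sectors in sorted(tally_medium_errors(sample).items()):
        print(dev, op, sectors)
```

In the excerpt from the question, every medium error is on sdj: two failed writes and one failed read.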

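After the message

hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical  Direct-Access     ATA      TP04000GB        PHYS DRV SSDSmartPathCap- En- Exp=1

the disk in question seems to have stopped responding entirely. Because these are writeback errors, the kernel had already cached the write operations and "promised" the userspace applications that the data would be written to disk. Now it turns out that the writes cannot actually be completed, and with RAID 0 there is no way to recover other than waiting for the disk to become responsive again. The only other option would be to knowingly lose data, and that is something the kernel will not do on its own.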
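The "promise" can be seen from the application side. In the sketch below, `durable_write` is a hypothetical helper, not part of any library. A buffered `os.write` typically returns as soon as the data is in the page cache; a later writeback failure is only reported to the application if it calls `os.fsync` and checks the result.

```python
import os

def durable_write(path, data):
    """Write data and ask the kernel to push it to stable storage.

    Returns None on success, or the errno reported by write()/fsync().
    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            # Usually succeeds once the data is in the page cache,
            # long before it reaches the disk.
            written += os.write(fd, data[written:])
        # Blocks until the kernel has written the data back,
        # and raises OSError (e.g. EIO) if writeback failed.
        os.fsync(fd)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
    return None

if __name__ == "__main__":
    status = durable_write("example.bin", b"payload")
    print("ok" if status is None else f"failed with errno {status}")
```

Note that when the device stops answering altogether, as here, such a call may not return an error at all; the blocked tasks in the log are in uninterruptible sleep (state D) waiting for the I/O to finish.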

At Apr 30 16:00:19 the kernel issued a reset command to the disk in an attempt to recover from the errors, but the disk apparently never completed it.
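The reset and the waiting that follows can be read off the hpsa messages. A minimal sketch, assuming the `resetting physical` and `waiting N secs for device to become ready` message formats shown in the question; the function name `reset_wait_summary` is hypothetical.

```python
import re

# hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical ...
RESET = re.compile(r"hpsa \S+ scsi (?P<addr>\d+(?::\d+)*): resetting")
# hpsa 0000:05:00.0: waiting 2 secs for device to become ready.
WAIT = re.compile(r"hpsa \S+ waiting (?P<secs>\d+) secs for device to become ready")

def reset_wait_summary(lines):
    """Return (reset SCSI addresses, number of waits, total seconds waited)."""
    resets = []
    waits = []
    for line in lines:
        m = RESET.search(line)
        if m:
            resets.append(m["addr"])
            continue
        m = WAIT.search(line)
        if m:
            waits.append(int(m["secs"]))
    return resets, len(waits), sum(waits)

if __name__ == "__main__":
    # Lines copied from the syslog in the question (first three waits only).
    sample = [
        "hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical  Direct-Access     ATA      TP04000GB        PHYS DRV SSDSmartPathCap- En- Exp=1",
        "hpsa 0000:05:00.0: waiting 2 secs for device to become ready.",
        "hpsa 0000:05:00.0: waiting 4 secs for device to become ready.",
        "hpsa 0000:05:00.0: waiting 8 secs for device to become ready.",
    ]
    print(reset_wait_summary(sample))
```

Run over the full excerpt in the question, this finds one reset (0:0:10:0, which is sdj) followed by eight waits adding up to 158 seconds, with no sign of the device becoming ready before the hung-task warnings start.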

Based on the syslog, I am ready to declare the disk dead. Time of death: around Apr 30 16:00:24.

If power-cycling brings the disk back, I would back up its contents as soon as possible, before doing anything else.
