Disk errors, blk_update_request: I/O error

After rebuilding a zfs pool, I'm having trouble accessing all of the disks that were recently mounted in the server. When I try to partition them with parted, I get the following error: Error: Input/output error during write

Judging by dmesg and smartctl, the disks appear to be in trouble. Is it possible that they were damaged in some way during the array rebuild? It seems very unlikely that they would all fail at the same time.

The smartctl self-tests pass, but I can't create a partition table.

Let me know if you need more information.

dmesg:

[   16.439672] blk_update_request: I/O error, dev sdb, sector 8 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   16.439686] ata2: EH complete
[   16.686094] ata2.00: exception Emask 0x10 SAct 0x70000 SErr 0x0 action 0x6 frozen
[   16.686103] ata2.00: irq_stat 0x08000000, interface fatal error
[   16.686107] ata2.00: failed command: READ FPDMA QUEUED
[   16.686110] ata2.00: cmd 60/08:80:10:00:00/00:00:00:00:00/40 tag 16 ncq dma 4096 in
                        res 40/00:00:10:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   16.686118] ata2.00: status: { DRDY }
[   16.686121] ata2.00: failed command: READ FPDMA QUEUED
[   16.686123] ata2.00: cmd 60/10:88:28:00:00/00:00:00:00:00/40 tag 17 ncq dma 8192 in
                        res 40/00:00:10:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   16.686130] ata2.00: status: { DRDY }
[   16.686132] ata2.00: failed command: READ FPDMA QUEUED
[   16.686133] ata2.00: cmd 60/30:90:48:00:00/00:00:00:00:00/40 tag 18 ncq dma 24576 in
                        res 40/00:00:10:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   16.686139] ata2.00: status: { DRDY }
[   16.686142] ata2: hard resetting link
[   17.162142] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   17.209854] ata2.00: configured for UDMA/133
[   17.209865] sd 1:0:0:0: [sdb] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[   17.209867] sd 1:0:0:0: [sdb] tag#16 Sense Key : Illegal Request [current] 
[   17.209869] sd 1:0:0:0: [sdb] tag#16 Add. Sense: Unaligned write command
[   17.209871] sd 1:0:0:0: [sdb] tag#16 CDB: Read(16) 88 00 00 00 00 00 00 00 00 10 00 00 00 08 00 00
[   17.209871] blk_update_request: I/O error, dev sdb, sector 16 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   17.209883] sd 1:0:0:0: [sdb] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[   17.209884] sd 1:0:0:0: [sdb] tag#17 Sense Key : Illegal Request [current] 
[   17.209885] sd 1:0:0:0: [sdb] tag#17 Add. Sense: Unaligned write command
[   17.209886] sd 1:0:0:0: [sdb] tag#17 CDB: Read(16) 88 00 00 00 00 00 00 00 00 28 00 00 00 10 00 00
[   17.209887] blk_update_request: I/O error, dev sdb, sector 40 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[   17.209890] sd 1:0:0:0: [sdb] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[   17.209891] sd 1:0:0:0: [sdb] tag#18 Sense Key : Illegal Request [current] 
[   17.209892] sd 1:0:0:0: [sdb] tag#18 Add. Sense: Unaligned write command
[   17.209893] sd 1:0:0:0: [sdb] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 00 48 00 00 00 30 00 00
[   17.209893] blk_update_request: I/O error, dev sdb, sector 72 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
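
As a reading aid for the log above, here is a minimal sketch that decodes the three Read(16) CDBs to show which blocks the failed reads targeted. It assumes the standard READ(16) layout (opcode 0x88, an 8-byte big-endian LBA in bytes 2-9, a 4-byte transfer length in bytes 10-13) and 512-byte logical sectors; the function name decode_read16 is only illustrative.

```python
from __future__ import annotations


def decode_read16(cdb_hex: str) -> tuple[int, int]:
    """Return (lba, block_count) from a 16-byte SCSI READ(16) CDB given as hex."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    count = int.from_bytes(cdb[10:14], "big")   # bytes 10-13: transfer length in blocks
    return lba, count


# CDBs copied from the dmesg lines for tag#16, tag#17 and tag#18.
CDBS = [
    "88 00 00 00 00 00 00 00 00 10 00 00 00 08 00 00",
    "88 00 00 00 00 00 00 00 00 28 00 00 00 10 00 00",
    "88 00 00 00 00 00 00 00 00 48 00 00 00 30 00 00",
]

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

for cdb in CDBS:
    lba, count = decode_read16(cdb)
    print(f"LBA {lba}, {count} blocks, {count * SECTOR_BYTES} bytes")
```

Under those assumptions the decoded LBAs (16, 40, 72) match the sectors in the blk_update_request lines, and the sizes match the "ncq dma 4096/8192/24576" values, so every failed read is a small request near the start of the disk.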

smartctl -a output:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (   87) seconds.
Offline data collection
capabilities:            (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (1332) minutes.
SCT capabilities:          (0x003d) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   144   144   054    Pre-fail  Offline      -       55
  3 Spin_Up_Time            0x0007   166   166   024    Pre-fail  Always       -       396 (Average 396)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   140   140   020    Pre-fail  Offline      -       15
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       434
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       24
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       123
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       123
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 25/37)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       138
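
To compare the same counters across all affected disks, a small sketch like the following could pull selected raw values out of saved smartctl -a attribute lines. The choice of attribute IDs (5, 197 and 198 as surface-related counters, 199 as the interface CRC counter) is my own selection for illustration, and parse_attr is a hypothetical helper, not part of smartmontools.

```python
# Attribute IDs picked for illustration (an assumption, not a smartctl feature).
WATCHED_IDS = {5, 197, 198, 199}


def parse_attr(line: str) -> dict:
    """Parse one row of the 'Vendor Specific SMART Attributes' table."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        # The first token of RAW_VALUE; extras like '(Average 396)' are ignored.
        "raw": int(fields[9]),
    }


# Rows copied from the table above.
ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       138
"""

for row in ROWS.splitlines():
    attr = parse_attr(row)
    if attr["id"] in WATCHED_IDS:
        print(f'{attr["id"]:>3} {attr["name"]:<24} raw={attr["raw"]}')
```

For this disk the three surface-related counters are 0, while UDMA_CRC_Error_Count is 138, the same number as the ATA Error Count reported in the error log.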

SMART Error Log Version: 1
ATA Error Count: 138 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 138 occurred at disk power-on lifetime: 428 hours (17 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 a0 00 00 00 40 08   1d+10:52:46.156  WRITE FPDMA QUEUED
  61 28 98 d8 ff ff 40 08   1d+10:52:46.156  WRITE FPDMA QUEUED
  47 00 01 12 00 00 a0 08   1d+10:52:46.148  READ LOG DMA EXT
  47 00 01 00 00 00 a0 08   1d+10:52:46.148  READ LOG DMA EXT
  47 00 01 13 00 00 a0 08   1d+10:52:46.147  READ LOG DMA EXT

Error 137 occurred at disk power-on lifetime: 428 hours (17 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 18 00 00 00 40 08   1d+10:52:45.602  WRITE FPDMA QUEUED
  61 28 20 d8 ff ff 40 08   1d+10:52:45.602  WRITE FPDMA QUEUED
  47 00 01 12 00 00 a0 08   1d+10:52:45.601  READ LOG DMA EXT
  47 00 01 00 00 00 a0 08   1d+10:52:45.600  READ LOG DMA EXT
  47 00 01 13 00 00 a0 08   1d+10:52:45.599  READ LOG DMA EXT

Error 136 occurred at disk power-on lifetime: 428 hours (17 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 a8 00 00 00 40 08   1d+10:52:45.070  WRITE FPDMA QUEUED
  61 28 a0 d8 ff ff 40 08   1d+10:52:45.069  WRITE FPDMA QUEUED
  47 00 01 12 00 00 a0 08   1d+10:52:45.068  READ LOG DMA EXT
  47 00 01 00 00 00 a0 08   1d+10:52:45.068  READ LOG DMA EXT
  47 00 01 13 00 00 a0 08   1d+10:52:45.066  READ LOG DMA EXT

Error 135 occurred at disk power-on lifetime: 428 hours (17 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 68 00 00 00 40 08   1d+10:52:44.540  WRITE FPDMA QUEUED
  61 28 70 d8 ff ff 40 08   1d+10:52:44.539  WRITE FPDMA QUEUED
  47 00 01 12 00 00 a0 08   1d+10:52:44.538  READ LOG DMA EXT
  47 00 01 00 00 00 a0 08   1d+10:52:44.538  READ LOG DMA EXT
  47 00 01 13 00 00 a0 08   1d+10:52:44.530  READ LOG DMA EXT

Error 134 occurred at disk power-on lifetime: 428 hours (17 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 43 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 58 00 00 00 40 08   1d+10:52:44.004  WRITE FPDMA QUEUED
  61 28 50 d8 ff ff 40 08   1d+10:52:44.003  WRITE FPDMA QUEUED
  47 00 01 12 00 00 a0 08   1d+10:52:44.002  READ LOG DMA EXT
  47 00 01 00 00 00 a0 08   1d+10:52:44.002  READ LOG DMA EXT
  47 00 01 13 00 00 a0 08   1d+10:52:44.000  READ LOG DMA EXT
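
All five logged errors carry ER = 0x84, which smartctl prints as "ICRC, ABRT". As a sketch of how that label follows from the register value, the snippet below decodes a few ATA error-register bits. The table is deliberately partial and lists only bits I am confident about; the names ERROR_BITS and decode_error_register are illustrative.

```python
# Partial map of ATA error-register bits (assumption: not a complete list).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error during transfer
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # ID (sector address) not found
    0x04: "ABRT",  # command aborted
}


def decode_error_register(er: int) -> list:
    """Return the names of the known bits that are set in the ER value."""
    return [name for bit, name in ERROR_BITS.items() if er & bit]


print(decode_error_register(0x84))  # ['ICRC', 'ABRT']
```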

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       433         -
# 2  Short offline       Completed without error       00%       428         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
