Forcing failover to a hot spare in a degraded ZFS pool

2024-6-9

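I have a simple 5x1TB RAIDZ1 configuration (tank? pool? vdev?) with a global spare assigned. One of the five drives in the array is shown with state FAULTED (corrupted data), and the spare drive is shown as AVAIL. There does not seem to be any mechanism by which the DEGRADED array fails over to the spare on its own. So how do I force the failover?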

I have read many forum posts in various places that discuss using detach on the drive, using replace with the spare, physically removing the drive and moving the spare into the same slot, and so on.

The replace command tells me that the drive cannot be replaced because the spare is already in a replacing or spare configuration, and that I should try detach.

The detach command tells me that it is only applicable to mirror and replacing vdevs.

There is no indication that the spare disk is being used to rebuild the array.

I do not want to start physically moving drives around, whether they are current array members or the hot spare. I do not want to disturb anything.

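I also do not want to take the array offline, restart the server, or anything like that. This system is designed to recover transparently without such steps, and I would like to know how to make it do so. The data is backed up, so I have free rein.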

Linux kernel: 3.10.0-1160

ZFS version: 5

Update:

Output of replace:

[root@localhost ~]# zpool replace <name> 4896358983234274072 ata-WDC_WD10EFRX-68PJCN0_WD-<serial>
cannot replace 4896358983234274072 with ata-WDC_WD10EFRX-68PJCN0_WD-<serial>: already in replacing/spare config; wait for completion or use 'zpool detach'

Output of detach:

[root@localhost ~]# zpool detach <name> 4896358983234274072
cannot detach 4896358983234274072: only applicable to mirror and replacing vdevs

ZFS version:

[root@localhost ~]# zfs upgrade
This system is currently running ZFS filesystem version 5.

All filesystems are formatted with the current version.

[root@localhost ~]# modinfo zfs | grep version
version:        0.8.2-1
rhelversion:    7.9
srcversion:     29C160FF878154256C93164
vermagic:       3.10.0-1160.49.1.el7.x86_64 SMP mod_unload modversions

zpool status:

[root@localhost ~]# zpool status <name>
  pool: <name>
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0h18m with 0 errors on Mon Apr  4 13:29:39 2022
config:

        NAME                                               STATE     READ WRITE CKSUM
        <name>                                                 DEGRADED     0     0     0
          raidz1-0                                         DEGRADED     0     0     0
            pci-0000:01:00.0-sas-0x443322110c000000-lun-0  ONLINE       0     0     0
            ata-WDC_WD10EFRX-68FYTN0_WD-<serial>       ONLINE       0     0     0
            pci-0000:01:00.0-sas-0x4433221109000000-lun-0  ONLINE       0     0     0
            4896358983234274072                            FAULTED      0     0     0  corrupted data
            pci-0000:01:00.0-sas-0x443322110b000000-lun-0  ONLINE       0     0     0
        spares
          ata-WDC_WD10EFRX-68PJCN0_WD-<serial>         AVAIL
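
To check whether a spare has actually been pulled in, a status listing like the one above can be summarized mechanically. The following is only a rough sketch, not an official tool: the function name summarize_spares is made up for illustration. It assumes the layout shown above, where section headers such as "spares" are single-word lines inside the config listing, and it only looks for the FAULTED and UNAVAIL states. A spare that ZFS has activated would normally be reported as INUSE rather than AVAIL.

```shell
#!/bin/sh
# Hypothetical helper: summarize faulted devices and hot spares from
# `zpool status` text supplied on stdin. It never modifies the pool.
summarize_spares() {
    awk '
        # The device listing starts after "config:" and ends at "errors:".
        $1 == "config:" { in_config = 1; next }
        $1 == "errors:" { in_config = 0; in_spares = 0 }

        # Assumption: single-word lines in the listing are section
        # headers (spares, logs, cache); track whether we are in spares.
        in_config && NF == 1 { in_spares = ($1 == "spares"); next }

        # Pool member rows: NAME STATE READ WRITE CKSUM [note]
        in_config && !in_spares && ($2 == "FAULTED" || $2 == "UNAVAIL") {
            print "faulted " $1
        }

        # Spare rows: NAME STATE [note]; STATE is AVAIL or INUSE.
        in_config && in_spares && NF >= 2 { print "spare " $1 " " $2 }
    '
}
```

With the function loaded into the current shell, running zpool status <name> | summarize_spares prints one line per faulted device and one line per spare, which makes it easy to see at a glance that the spare is still idle.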

Update 2:

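Restarting the server allowed the replace operation to go through without interruption or problems. I am now looking at updating ZFS and the kernel, and I want to ensure the safe operation of the existing array, which was built on the older system.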
