I am running Ceph storage in a lab and have only one server, so I am trying to install all services (MON, OSD, MDS, and so on) on a single machine.
I created two disks using loop devices. (The server has SSD disks, so performance is very good.)
root@ceph2# losetup -a
/dev/loop1: [64769]:26869770 (/root/100G-2.img)
/dev/loop0: [64769]:26869769 (/root/100G-1.img)
This is what my ceph -s output looks like:
root@ceph2# ceph -s
  cluster:
    id:     1106ae5c-e5bf-4316-8185-3e559d246ac5
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            Reduced data availability: 65 pgs inactive
            Degraded data redundancy: 65 pgs undersized
  services:
    mon: 1 daemons, quorum ceph2 (age 8m)
    mgr: ceph2(active, since 9m)
    mds: 1/1 daemons up
    osd: 2 osds: 2 up (since 20m), 2 in (since 38m)
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 0 objects, 0 B
    usage:   11 MiB used, 198 GiB / 198 GiB avail
    pgs:     100.000% pgs not active
             65 undersized+peered
I don't know where the MDS slow metadata IO warning comes from, and the MDS stays in the creating state:
root@ceph2# ceph mds stat
cephfs:1 {0=ceph2=up:creating}
Here is the health detail:
root@ceph2# ceph health detail
HEALTH_WARN 1 MDSs report slow metadata IOs; Reduced data availability: 65 pgs inactive; Degraded data redundancy: 65 pgs undersized
[WRN] MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs
mds.ceph2(mds.0): 31 slow metadata IOs are blocked > 30 secs, oldest blocked for 864 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 65 pgs inactive
pg 1.0 is stuck inactive for 22m, current state undersized+peered, last acting [1]
pg 2.0 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.1 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.2 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.3 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.4 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.5 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.6 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.7 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.8 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.c is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.d is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.e is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.f is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.10 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.11 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.12 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.13 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.14 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.15 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.16 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.17 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 2.18 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.19 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.1a is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 2.1b is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.0 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.1 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.2 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.3 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.4 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.5 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.6 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.7 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.9 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.c is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.d is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.e is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.f is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.10 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.11 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.12 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.13 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.14 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.15 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.16 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.17 is stuck inactive for 14m, current state undersized+peered, last acting [0]
pg 3.18 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.19 is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.1a is stuck inactive for 14m, current state undersized+peered, last acting [1]
pg 3.1b is stuck inactive for 14m, current state undersized+peered, last acting [0]
[WRN] PG_DEGRADED: Degraded data redundancy: 65 pgs undersized
pg 1.0 is stuck undersized for 22m, current state undersized+peered, last acting [1]
pg 2.0 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.1 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.2 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.3 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.4 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.5 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.6 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.7 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.8 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.c is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.d is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.e is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.f is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.10 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.11 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.12 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.13 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.14 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.15 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.16 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.17 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 2.18 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.19 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.1a is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 2.1b is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.0 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.1 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.2 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.3 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.4 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.5 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.6 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.7 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.9 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.c is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.d is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.e is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.f is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.10 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.11 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.12 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.13 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.14 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.15 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.16 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.17 is stuck undersized for 14m, current state undersized+peered, last acting [0]
pg 3.18 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.19 is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.1a is stuck undersized for 14m, current state undersized+peered, last acting [1]
pg 3.1b is stuck undersized for 14m, current state undersized+peered, last acting [0]
What could be wrong here? Do you think it is because I have only one server and only two OSDs?
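Answer 1
The MDS reports slow metadata IOs because it cannot reach any PG: all of your PGs are "inactive". Once you bring the PGs up, the warning will eventually go away. By default each pool has a size (replica count) of 3, which can never be satisfied with only two OSDs. In addition, the default CRUSH rule uses the host as the failure domain, so on a single host each PG can be placed on only one OSD; that is why every PG in your output lists a single OSD in its acting set. You will need to change osd_crush_chooseleaf_type to 0 so that the OSD, not the host, becomes the failure domain. Then change the pool size to 2 so that every PG fits on your two OSDs. Note, however, that a pool size of 2 is recommended only for testing; it is not recommended for production use if you value your data.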
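On a cluster that already exists, it may be simpler to create a new CRUSH rule with osd as the failure domain and assign it to each pool, because as far as I know osd_crush_chooseleaf_type is only consulted when the initial CRUSH map is generated. The following is a minimal dry-run sketch of that approach, not a tested procedure. The rule name replicated_osd and the helper functions are hypothetical names picked for illustration, and the CRUSH root is assumed to be the stock default root. By default the helpers only print the commands; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Dry-run sketch: prints the ceph commands unless APPLY=1 is set.
# "replicated_osd" is a hypothetical rule name chosen for this example.
RULE="${RULE:-replicated_osd}"

# Run the given command when APPLY=1, otherwise just print it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create a replicated CRUSH rule whose failure domain is "osd" instead of
# "host". Assumes the CRUSH root is named "default".
create_osd_rule() {
    run ceph osd crush rule create-replicated "$RULE" default osd
}

# Point one pool at the new rule and reduce it to two replicas.
fix_pool() {
    pool="$1"
    run ceph osd pool set "$pool" crush_rule "$RULE"
    run ceph osd pool set "$pool" size 2
}
```

To use it, call create_osd_rule once and then fix_pool for every pool name printed by ceph osd pool ls. Afterwards, ceph osd pool ls detail should show the new size and crush_rule for each pool, and the PGs should leave the undersized+peered state once both OSDs appear in their acting sets.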