AIX rootvg — Mirror Disk Replacement Runbook

DISK FAILURE: rootvg
VG:          rootvg
VG Type:     Mirrored (2-copy)
Good Disk:   hdisk0 (active)
Failed Disk: hdisk1 (failed)
Other VGs:   datavg (2× PV), unaffected
Spare PVs:   None
Phase 1 Identify the Failed Disk
01 Identify failed PV and confirm stale PP count CRITICAL
# List all PVs and their VG membership
lspv
# Show PV states in rootvg: the failed disk is typically "missing" or "removed"
lsvg -p rootvg
# Check VG state and note the STALE PPs count
lsvg rootvg
# Confirm which LVs/PPs are on each disk (may return an error for hdisk1 if the disk is dead)
lspv -l hdisk0
lspv -l hdisk1
# Check for stale PPs explicitly
lsvg rootvg | grep -i stale

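A failed mirror member typically appears with a PV STATE of missing (or removed) in lsvg -p rootvg output; lspv on its own may still list the disk under rootvg. lsvg rootvg will usually report a non-zero STALE PPs count, which can grow up to the LP count of all mirrored LVs as the surviving copy is written to. Note the PVID of the failed disk from lspv: you may need it if ODM manipulation is required, and reducevg accepts a PVID in place of the disk name when the device is no longer available.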
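The stale count can be pulled out of the lsvg report for use in scripts. The following is a minimal sketch, assuming the usual "STALE PPs:" label in lsvg output; the function name stale_pps and the sample line are illustrative, not taken from a live system.

```shell
# Print the STALE PPs value found in `lsvg <vg>` output read from stdin.
# stale_pps is a hypothetical helper name.
stale_pps() {
    sed -n 's/.*STALE PPs:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Illustrative sample line only; on a live system: lsvg rootvg | stale_pps
sample='STALE PVs:          1                        STALE PPs:      37'
printf '%s\n' "$sample" | stale_pps
```

A non-zero result at this stage is expected; the same helper is useful later to confirm the count has returned to 0.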

Phase 2 Verify Boot Integrity
02 Confirm surviving disk is in bootlist — correct if necessary CRITICAL
# Check current bootlist: failed disk must NOT be first
bootlist -m normal -o
bootlist -m service -o
# If failed disk appears, correct both bootlists immediately
bootlist -m normal hdisk0
bootlist -m service hdisk0
# Confirm BLV (hd5) exists and is healthy on good disk
lslv hd5
lspv -l hdisk0 | grep hd5

Do not proceed until the surviving disk (hdisk0) is confirmed first in the bootlist. If the failed disk is the first or only boot device at the next reboot, the system may fail to boot.
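The bootlist check can be scripted as a guard before any later step. This is a sketch only, assuming bootlist -m normal -o prints one device per line with the device name in the first field; first_boot_is is a hypothetical helper and the sample lines are illustrative.

```shell
# Succeeds only if the first device in `bootlist -m <mode> -o` output (stdin)
# is the disk named in $1. first_boot_is is a hypothetical helper name.
first_boot_is() {
    want=$1
    first=$(awk 'NF { print $1; exit }')
    [ "$first" = "$want" ]
}

# Illustrative sample; on a live system: bootlist -m normal -o | first_boot_is hdisk0
if printf 'hdisk0 blv=hd5 pathid=0\nhdisk1 blv=hd5 pathid=0\n' | first_boot_is hdisk0; then
    echo "hdisk0 is first in the bootlist"
else
    echo "bootlist needs correcting"
fi
```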

Phase 3 Handle Stale PPs
03 Assess stale PP state and attempt sync (if disk partially accessible) CAUTION
# Confirm stale PP count: will be non-zero on a failed mirror member
lsvg rootvg
# Attempt sync only if disk is intermittently accessible (likely to fail if truly dead)
syncvg -v rootvg
# Re-check: if STALE PPs remain, disk is unrecoverable, proceed to Phase 4
lsvg rootvg | grep -i stale

syncvg will almost certainly fail against a truly dead disk. Its value here is in confirming the disk is unrecoverable before proceeding with forced removal. If STALE PPs drop to 0, you have a degraded-but-recoverable disk — treat as Option B in the next step.
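The decision described above reduces to a single comparison on the stale count after the sync attempt. A minimal sketch, with choose_removal_option as a hypothetical name:

```shell
# Map the STALE PPs count observed after the syncvg attempt to a removal path:
# 0 means the disk still responds (Option B), anything else means dead (Option A).
# choose_removal_option is a hypothetical helper name.
choose_removal_option() {
    stale_after_sync=$1
    if [ "$stale_after_sync" -eq 0 ]; then
        echo B
    else
        echo A
    fi
}

choose_removal_option 37
```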

Phase 4 Remove Failed Disk from rootvg
04 Remove mirror — Option A: disk is dead / missing DISK DEAD
Option A — Disk dead / missing
Disk is unresponsive. lsvg -p rootvg shows the PV as "missing" and syncvg fails. Use reducevg with the -d flag to force removal of the stale PPs.
Option B — Disk degraded / accessible
Disk responds intermittently. Use unmirrorvg to remove the mirror copies held on the failing disk before reducevg.
Option A: Force removal (dead disk)
# -d deallocates the (stale) partitions still allocated on hdisk1; expect a confirmation prompt
reducevg -d rootvg hdisk1
# If reducevg still refuses (disk ODM entry blocking), remove from ODM as a last resort
chpv -vr hdisk1
odmdelete -o CuAt -q "name=hdisk1"
odmdelete -o CuDv -q "name=hdisk1"
# Reboot after ODM deletion; rootvg will no longer reference the missing disk

Option B: Graceful removal (degraded disk)
# Unmirror: removes the mirror copies held on hdisk1, leaving the copies on hdisk0
unmirrorvg rootvg hdisk1
# Confirm no LPs remain on hdisk1
lspv -l hdisk1
# Now safe to remove from VG
reducevg rootvg hdisk1

reducevg -d vs unmirrorvg: Use unmirrorvg when the disk responds; it removes the mirror copies on hdisk1 cleanly, so a plain reducevg then succeeds. Use reducevg -d only when the disk is confirmed dead and stale PPs cannot be resolved. The -d flag bypasses the allocated-PPs guard and discards the stale mirror copies.
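Because both paths are destructive, it can help to print the planned commands for review before running anything. The sketch below only echoes commands and never executes them; removal_plan is a hypothetical helper, and the command sequences are the ones given in this step.

```shell
# Print (do not run) the removal commands for the chosen option.
# removal_plan is a hypothetical dry-run helper.
removal_plan() {
    option=$1
    vg=$2
    disk=$3
    case $option in
        A)  echo "reducevg -d $vg $disk" ;;
        B)  echo "unmirrorvg $vg $disk"
            echo "lspv -l $disk"
            echo "reducevg $vg $disk" ;;
        *)  echo "usage: removal_plan A|B vg disk" >&2
            return 2 ;;
    esac
}

removal_plan B rootvg hdisk1
```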

Phase 5 Post-Removal Verification
05 Verify rootvg is clean — no stale PPs, single-copy LVs VERIFY
# STALE PPs must be 0; LP count should equal PP count (1:1, no mirror)
lsvg rootvg
# Each LV should show LPs == PPs (no mirror copies remaining)
lsvg -l rootvg
# hdisk1 should now show None or no VG association
lspv
# Reconfirm bootlist is clean
bootlist -m normal -o

At this point rootvg is running single-copy on hdisk0 only. The system is operational but unprotected — complete the replacement and re-mirror as soon as possible.
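The LPs-versus-PPs check can be automated by reading the lsvg -l rootvg listing. This is a sketch under the assumption that the listing has two header lines followed by rows whose third and fourth fields are LPs and PPs; lvs_not_at_copies is a hypothetical name and the sample rows are illustrative.

```shell
# From `lsvg -l <vg>` output on stdin, print every LV whose PP count is not
# <copies> times its LP count. Pass 1 for a single-copy VG, 2 for a mirrored one.
# lvs_not_at_copies is a hypothetical helper name.
lvs_not_at_copies() {
    awk -v copies="$1" 'NR > 2 && NF >= 4 && $4 != $3 * copies { print $1 }'
}

# Illustrative sample: hd6 still carries a second copy.
lvs_not_at_copies 1 <<'EOF'
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     8       16      2    open/syncd    N/A
hd4                 jfs2       4       4       1    open/syncd    /
EOF
```

Empty output means every LV is at the expected copy count.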

Phase 6 Physical Disk Replacement
06 Unconfigure, swap disk, rediscover HARDWARE
Cold swap (no hot-plug)
# Schedule maintenance window and halt cleanly
shutdown -h now

Hot swap (SAS backplane / hot-plug capable)
# Unconfigure the device before physical removal
rmdev -dl hdisk1
# Alternatively use the diagnostics menu for the hot-plug task:
# diag → Task Selection → Hot Plug Task → SCSI/FC Hot Plug Manager
diag

After physical insertion: rediscover new disk
# Probe for new hardware
cfgmgr -v
# Confirm new disk appears with no PVID / no VG association
lspv
# Verify new disk size is >= hdisk0 before proceeding
lspv hdisk0                     # note PP SIZE and TOTAL PPs
getconf DISK_SIZE /dev/hdisk0   # size in MB
getconf DISK_SIZE /dev/hdisk1   # must be equal or larger (lspv hdisk1 only reports once the disk is in a VG)

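Size requirement: The replacement disk must be equal to or larger than hdisk0. PP size is a property of the volume group, so what matters is that the new disk provides at least as many PPs as are in use on hdisk0; if it does not, the re-mirror will run out of free partitions. The replacement disk should present with no existing PVID. If it carries one from a previous system, clear it with chdev -l hdisk1 -a pv=clear before adding it to rootvg.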
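The size check is simple integer arithmetic on the disk size in MB (for example as reported by getconf DISK_SIZE /dev/hdiskN) and the VG's PP size. The sketch below uses hypothetical helper names and made-up figures; the real usable PP count is slightly lower because LVM reserves space on the disk for its own metadata.

```shell
# Whole PPs that fit on a disk of <size_mb> given the VG's <pp_size_mb>.
# pps_on_disk and can_hold_mirror are hypothetical helper names.
pps_on_disk() {
    echo $(( $1 / $2 ))
}

# Succeeds if a disk of <new_disk_mb> can hold <used_pps> partitions of <pp_size_mb>.
can_hold_mirror() {
    used_pps=$1
    new_disk_mb=$2
    pp_size_mb=$3
    [ "$(pps_on_disk "$new_disk_mb" "$pp_size_mb")" -ge "$used_pps" ]
}

# Made-up figures: 292 PPs in use on hdisk0, 140013 MB replacement, 128 MB PPs.
if can_hold_mirror 292 140013 128; then
    echo "replacement is large enough"
else
    echo "replacement is too small"
fi
```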

Phase 7 Re-mirror rootvg
07 extendvg → mirrorvg → monitor sync REMIRROR
# Add new disk to rootvg
extendvg rootvg hdisk1
# Confirm both disks are now members
lsvg -p rootvg
# Mirror all LVs in rootvg onto hdisk1 (includes hd5 BLV); add -S to return at once and sync in the background
mirrorvg rootvg hdisk1
# Monitor sync progress and wait until STALE PPs = 0 before proceeding
lsvg rootvg | grep -i stale
# Loop to watch sync progress (Ctrl+C to exit when 0)
while true; do lsvg rootvg | grep STALE; sleep 10; done

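By default mirrorvg synchronises the new copies before it returns, so the command itself may run for a long time; with the -S flag it returns immediately and synchronises in the background. In the default case, run the monitoring commands from a second session. Do not reboot or interrupt power during this phase. On heavily loaded systems or a large rootvg, sync may take 10–30+ minutes. The system remains fully operational throughout.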
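A polling loop that exits on its own is easier to embed in a script than one stopped with Ctrl+C. The following is a sketch, assuming the probe command prints lsvg-style output containing a "STALE PPs:" field; wait_for_sync and its argument order are hypothetical.

```shell
# Poll until the probe command reports STALE PPs = 0.
#   $1  command that prints `lsvg <vg>` style output (word-split on purpose)
#   $2  seconds to sleep between polls
#   $3  maximum number of polls before giving up
# Returns 0 once the count is 0, 1 if the poll limit is reached.
# wait_for_sync is a hypothetical helper name.
wait_for_sync() {
    probe=$1
    interval=$2
    max_polls=$3
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        stale=$($probe | sed -n 's/.*STALE PPs:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
        [ "$stale" = "0" ] && return 0
        polls=$((polls + 1))
        sleep "$interval"
    done
    return 1
}

# Assumed usage on a live system (10 s interval, one hour limit):
#   wait_for_sync "lsvg rootvg" 10 360 && echo "mirror is in sync"
```

The poll limit is a design choice: a sync that never completes usually points to a problem with the new disk, and a bounded loop surfaces that instead of hanging.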

Phase 8 Bootlist and BLV Finalisation
08 Write boot record, update bootlist, final verification FINALISE
# Confirm STALE PPs = 0 before this step
lsvg rootvg | grep STALE
# Write boot record to both disks
bosboot -a -d /dev/hdisk0
bosboot -a -d /dev/hdisk1
# Update bootlist: both disks in normal and service mode
bootlist -m normal hdisk0 hdisk1
bootlist -m service hdisk0 hdisk1
# Confirm bootlist
bootlist -m normal -o
bootlist -m service -o
# Final mirror state: each LV should show PPs = 2x LPs
lsvg -l rootvg
# Confirm hd5 (BLV) is mirrored to new disk
lspv -l hdisk1 | grep hd5

bosboot is mandatory on the new disk. Without it, the disk holds mirrored JFS/JFS2 data but no AIX boot record, so the system cannot boot from it if hdisk0 fails. Always run bosboot -a -d on every disk in the bootlist after a mirror rebuild.
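To avoid missing a disk, the bosboot commands can be derived from the bootlist itself. This sketch only prints the commands for review; it assumes bootlist -m normal -o lists the device name in the first field and may repeat a disk once per path. bosboot_cmds is a hypothetical helper and the sample lines are illustrative.

```shell
# From `bootlist -m <mode> -o` output on stdin, print one bosboot command per
# distinct disk (a disk can appear once per path). Commands are printed, not run.
# bosboot_cmds is a hypothetical helper name.
bosboot_cmds() {
    awk 'NF && !seen[$1]++ { print "bosboot -a -d /dev/" $1 }'
}

# Illustrative sample; on a live system: bootlist -m normal -o | bosboot_cmds
printf 'hdisk0 blv=hd5 pathid=0\nhdisk0 blv=hd5 pathid=1\nhdisk1 blv=hd5 pathid=0\n' | bosboot_cmds
```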

Quick Reference Step Summary
Command                           Purpose                                Notes
lspv / lsvg rootvg                Identify failed disk, stale PP count   Failed disk shows "missing" (lsvg -p rootvg)
bootlist -m normal -o             Verify surviving disk is boot primary  Fix before any other action
syncvg -v rootvg                  Confirm disk is unrecoverable          Will fail if disk is dead
unmirrorvg rootvg hdisk1          Remove mirror copies (degraded disk)   Option B path
reducevg -d rootvg hdisk1         Force remove dead disk + stale PPs     Option A path; -d bypasses guard
lsvg rootvg (STALE=0)             Confirm clean single-copy VG           Must be 0 before swap
rmdev -dl hdisk1                  Unconfigure before hot-swap            Skip for cold swap
cfgmgr -v                         Rediscover replacement disk            New disk must have no PVID
extendvg rootvg hdisk1            Add replacement disk to rootvg         Disk must be >= hdisk0 size
mirrorvg rootvg hdisk1            Rebuild mirror, syncs all LVs          Do not interrupt the sync
bosboot -a -d /dev/hdiskX         Write boot record to new disk          Mandatory; run on both disks
bootlist -m normal hdisk0 hdisk1  Register both disks in bootlist        Run for normal and service mode
⚠ Key Notes