AIX 7.1 TL/SP Patching

ALT_DISK_COPY AIX 7.1 · in-place
Runbook ID: RB-PATCH-AIX-001
Operation: In-place TL / SP update
Method: install_all_updates
OS: AIX 7.1
Scope: Single LPAR · non-clustered
Rollback: alt_disk_copy clone
Window (SP): ~90 min
Window (TL): 2 to 3 hr
Privileges: root / sudo-to-root
Spare disk: ≥ rootvg size

Source level: 7100-05-11-2246 (AIX 7.1 TL5 SP11, running on hdisk0)
Update command: install_all_updates -Yc
Target level: 7100-05-12-2320 (AIX 7.1 TL5 SP12, final 7.1 SP)
Bundle: 7100-05-12-2320-FP.tar

Worked example: All commands below assume hdisk0 = live rootvg, hdisk1 = unassigned spare for alt_disk_copy, staging dir /var/update_staging/7100-05-12. Sub in your own values — package names will differ per release.

Out of scope: NIM-driven mass updates · PowerHA/HACMP-managed nodes · major-version migrations (7.1 → 7.2 / 7.2 → 7.3) — those are migration installations, see the companion nimadm and boot-media migration runbooks.

Phase 1 Pre-change planning · T-7 to T-2 days

Window sizing: ~90 min for an SP, 2 to 3 hr for a TL. Network or SAN-attached storage operations (mksysb to NFS, alt_disk_copy onto a SAN LUN) can extend this materially, so pad accordingly.

Phase 2 Pre-flight checks · T-0, before window start

Capture everything: Wrap the whole pre-flight in script /tmp/preflight_$(date +%Y%m%d).log so the output lands in the change record automatically.

01 Capture current OS level and hardware identity BASELINE TARGET LPAR
oslevel -s                 # expect: 7100-05-11-2246
oslevel -r                 # expect: 7100-05
instfix -i | grep ML       # all prior MLs/TLs cleanly applied?
uname -a
prtconf | head -30         # CPU/memory/serial for change record

Pin the source level into the change record. instfix -i | grep ML must show every prior ML as complete — anything missing means a half-applied prior update that needs cleaning up first.
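A minimal sketch of pinning the level programmatically. parse_oslevel and level_matches are hypothetical helper names, not AIX commands, and the sketch assumes the four-part release-TL-SP-build string that oslevel -s prints.

```shell
# parse_oslevel LEVEL: split an "oslevel -s" string such as 7100-05-11-2246
# into its four dash-separated parts. Fails if the string is not four parts.
parse_oslevel() {
    echo "$1" | awk -F- '
        NF == 4 { printf "release=%s tl=%s sp=%s build=%s\n", $1, $2, $3, $4; ok = 1 }
        END { exit (ok ? 0 : 1) }'
}

# level_matches ACTUAL EXPECTED: succeed only when the running level equals
# the level pinned in the change record.
level_matches() {
    [ "$1" = "$2" ]
}

# Assumed usage on the LPAR:
#   level_matches "$(oslevel -s)" 7100-05-11-2246 || echo "unexpected source level"
```

Comparing the full string rather than just the TL catches an LPAR that is already at a different SP than the change record assumes.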

02 Verify fileset consistency CRITICAL TARGET LPAR
lppchk -v    # MUST return clean (no output)
lppchk -c    # checksum verify (slower, optional)

Stop condition: If lppchk -v reports broken or missing prerequisites, do not proceed. The most common cause is a previous half-applied update — clean it up with installp -C before patching.

03 Confirm disk space in rootvg filesystems CAUTION TARGET LPAR
df -g / /usr /var /opt /tmp /home /admin
| Filesystem | Min free (SP) | Notes |
|---|---|---|
| / | 100 MB | Bootloader, /etc, ODM |
| /usr | 2 GB | Most filesets land here |
| /var | 500 MB | Logs, lpp metadata, /var/adm/ras |
| /tmp | 1 GB | installp scratch |
| /opt | 500 MB | Optional product trees |

For a TL update, double these. /usr is the one that matters most: running it dry mid-install leaves filesets in a broken state. That is recoverable (extend /usr with chfs -a size=+1G /usr, then re-run install_all_updates), but only if you catch it.
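The thresholds can be checked mechanically rather than by eye. The sketch below is illustrative only: check_free_space is a hypothetical helper, and it assumes the AIX df -g column order (Filesystem, GB blocks, Free, %Used, Iused, %Iused, Mounted on).

```shell
# check_free_space [MULT]: read "df -g" output on stdin and compare the Free
# column (GB) against the SP minimums for each filesystem.
# MULT scales the minimums: 1 for an SP (default), 2 for a TL.
check_free_space() {
    awk -v mult="${1:-1}" '
        BEGIN {
            need["/"] = 0.1; need["/usr"] = 2; need["/var"] = 0.5
            need["/tmp"] = 1; need["/opt"] = 0.5
        }
        NR > 1 && ($NF in need) {
            want = need[$NF] * mult
            if ($3 + 0 < want) {
                printf "LOW %s free=%sG need=%sG\n", $NF, $3, want
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }'
}

# Assumed usage on the LPAR:
#   df -g / /usr /var /opt /tmp | check_free_space 2 || echo "extend before patching"
```

A non-zero exit status means at least one filesystem is under its minimum, which makes the function easy to use as a gate in a pre-flight script.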

04 Confirm rootvg is healthy and spare disk is free VALIDATE TARGET LPAR
lsvg rootvg       # check FREE PPs
lsvg -p rootvg    # all PVs active?
lsvg -l rootvg    # any open/syncd issues?
lspv              # confirm spare disk (hdisk1) is free

✓ Verify: hdisk1 shows None in the VG column of lspv. Every LV in lsvg -l rootvg should be open/syncd — anything in stale state needs syncvg before you clone.
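Both checks can be scripted. This is a sketch under stated assumptions: disk_is_free and stale_lvs are hypothetical helpers, lspv is assumed to print the volume group in its third column, and lsvg -l is assumed to print two header lines with LV STATE as the sixth column.

```shell
# disk_is_free DISK: read "lspv" output on stdin; succeed only if DISK is
# listed and its volume-group column (third field) is None.
disk_is_free() {
    awk -v d="$1" '
        $1 == d { found = 1; if ($3 == "None") free = 1 }
        END { exit ((found && free) ? 0 : 1) }'
}

# stale_lvs: read "lsvg -l rootvg" output on stdin and print the name of every
# logical volume whose LV STATE column mentions "stale".
stale_lvs() {
    awk 'NR > 2 && $6 ~ /stale/ { print $1 }'
}

# Assumed usage on the LPAR:
#   lspv | disk_is_free hdisk1 || echo "hdisk1 is not free"
#   lsvg -l rootvg | stale_lvs
```

An empty result from stale_lvs is the expected state; any name it prints needs syncvg before the clone.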

05 Snapshot bootlist and error log BASELINE TARGET LPAR
bootlist -m normal -o > /tmp/bootlist_pre.txt
bootlist -m service -o >> /tmp/bootlist_pre.txt
cat /tmp/bootlist_pre.txt           # expect hdisk0 in normal list
errpt | head -30                    # any hardware errors? fix first
errpt -a > /tmp/errpt_pre.txt       # full detail, untruncated, for the post-patch comparison

Latent hardware errors: A failing path or disk that isn't bothering anyone in steady state will absolutely bite you on the post-patch reboot. Resolve hardware errors before continuing — or you'll be guessing whether the patch caused them.

06 Snapshot running services and routing BASELINE TARGET LPAR
lssrc -a > /tmp/lssrc_pre.txt
netstat -rn > /tmp/routing_pre.txt
df -g > /tmp/df_pre.txt

Known-good baseline to diff against post-patch. Any subsystem that was active before but inoperative after — or any route that disappeared — gets flagged immediately rather than weeks later.
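The subsystem comparison is easy to get wrong by eye, so here is one way it might be automated. newly_inoperative is a hypothetical helper; it assumes lssrc -a prints the subsystem name as the first field and the status as the last field of each row.

```shell
# newly_inoperative PRE POST: print every subsystem that was "active" in the
# PRE capture of "lssrc -a" and is "inoperative" in the POST capture.
newly_inoperative() {
    awk '
        FILENAME == ARGV[1] { if ($NF == "active") was_active[$1] = 1; next }
        $NF == "inoperative" && ($1 in was_active) { print $1 }' "$1" "$2"
}

# Assumed usage after the reboot:
#   lssrc -a > /tmp/lssrc_post.txt
#   newly_inoperative /tmp/lssrc_pre.txt /tmp/lssrc_post.txt
```

Subsystems that were already inoperative before the patch are deliberately ignored, so the output contains only regressions.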

Phase 3 Backups

Both layers required for production. Standard pattern: mksysb to NIM / NFS, with TSM file-level grabbing the resulting mksysb file plus all datavgs. alt_disk_copy on top of that as the immediate online rollback.

07 Take a mksysb of rootvg CRITICAL TARGET LPAR
# Verify destination has space (mksysb size ≈ used PPs in rootvg)
df -g /backup
# Take the bootable rootvg image
MKSYSB=/backup/$(hostname)_$(date +%Y%m%d_%H%M).mksysb
mksysb -ipX "$MKSYSB"
# Verify the resulting file (lsmksysb takes the backup file via -f)
ls -l "$MKSYSB"
lsmksysb -l -f "$MKSYSB" | head -40

| Flag | Purpose |
|---|---|
| -i | Regenerate /image.data (captures current LV layout) |
| -X | Auto-extend /tmp if needed during backup |
| -p | Disable software packing |

NFS-mounted destination preferred; local disk acceptable as a stopgap. Either way, confirm the file is readable and lsmksysb -l can enumerate it before continuing — a corrupt mksysb is worse than no mksysb.

08 Clone rootvg to the spare disk with alt_disk_copy CRITICAL TARGET LPAR

Primary rollback mechanism. Clones rootvg to hdisk1, leaving you with a bootable pre-patch image one bootlist away.

# Confirm hdisk1 is free
lspv | grep hdisk1
# Expected: hdisk1  00f6...  None
# Clone (10 to 30 min depending on rootvg size and disk speed)
alt_disk_copy -d hdisk1
# Verify
lspv
# Expected new line: hdisk1  ...  altinst_rootvg
# (not flagged active: the clone is left varied off, so query it by name only)
lsvg | grep altinst_rootvg

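After the clone completes, check the bootlist. By default alt_disk_copy points the normal bootlist at the new clone (the -B flag suppresses this), so verify and reset it to the live disk if needed:

bootlist -m normal -o
# Confirm hdisk0 (live rootvg). If alt_disk_copy moved it, reset:
bootlist -m normal hdisk0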
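The bootlist check above can be turned into a gate. This is a sketch, not a supported tool: first_boot_device and bootlist_ok are hypothetical helpers, and the entry format ("hdisk0 blv=hd5 pathid=0") is an assumption about what bootlist -m normal -o prints.

```shell
# first_boot_device: read "bootlist -m normal -o" output on stdin and print
# the device name from the first non-empty line.
first_boot_device() {
    awk 'NF { print $1; exit }'
}

# bootlist_ok EXPECTED: succeed when the first boot device is EXPECTED.
bootlist_ok() {
    [ "$(first_boot_device)" = "$1" ]
}

# Assumed usage on the LPAR:
#   bootlist -m normal -o | bootlist_ok hdisk0 || bootlist -m normal hdisk0
```

Running the same gate immediately before the Phase 6 reboot guards against booting the unpatched clone by accident.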

Two strategies — pick one:

A · patch live · this runbook
Patch the live rootvg on hdisk0. Clone on hdisk1 stays cold as the rollback target. Simpler and more common.
B · wake the clone · patch offline
Wake the clone with alt_rootvg_op -W, patch it offline, reboot onto patched clone. Original disk becomes the rollback. More steps, but the running system stays untouched until reboot.

This runbook follows pattern A.

Phase 4 Stage the update bundle
09 Create the staging directory CONFIG TARGET LPAR
mkdir -p /var/update_staging/7100-05-12
cd /var/update_staging/7100-05-12

Keep staging under /var — that's where the update logs land too, so the whole change record stays in one place.

10 Transfer the bundle to the LPAR TRANSFER JUMP HOST
# From a jump host via SCP
scp 7100-05-12-2320-FP.tar root@<target>:/var/update_staging/7100-05-12/
# Or via NFS-mounted update share
mount nfs-server:/aix_updates /mnt/updates
cp /mnt/updates/7100-05-12-2320-FP.tar /var/update_staging/7100-05-12/
11 Verify the SHA256 checksum before unpacking VALIDATE TARGET LPAR
csum -h SHA256 7100-05-12-2320-FP.tar

✓ Verify: Output checksum matches the value listed on Fix Central / ESS for this bundle. A mismatched bundle gets you mysterious 0503-005 errors mid-install — catch it here, not three hours into the window.
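Comparing 64 hex characters by eye is error-prone, so a small comparison helper may be worth having. checksum_matches is a hypothetical function, and the sketch assumes the checksum tool prints the hash as the first whitespace-separated field of its first output line.

```shell
# checksum_matches EXPECTED: read checksum-tool output on stdin and compare
# the first field of the first line to EXPECTED, ignoring letter case.
checksum_matches() {
    expected=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    actual=$(awk 'NR == 1 { print $1 }' | tr 'A-F' 'a-f')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Assumed usage on the LPAR (paste the published value in place of the placeholder):
#   csum -h SHA256 7100-05-12-2320-FP.tar | checksum_matches "<published hash>" \
#       || echo "checksum mismatch: do not unpack"
```

An empty expected value is treated as a failure so that a forgotten argument cannot pass the check.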

12 Extract the bundle and build the .toc CRITICAL TARGET LPAR
tar -xvf 7100-05-12-2320-FP.tar
ls -1 *.bff | head
# Expect filenames like:
#   bos.64bit.7.1.5.41.bff
#   bos.mp64.7.1.5.41.bff
#   bos.rte.7.1.5.41.bff
#   bos.rte.boot.7.1.5.41.bff
#   devices.common.IBM.fc.rte.7.1.5.40.bff
#   ... (typically 100 to 300 filesets in a full SP)
inutoc .
ls -l .toc

Critical: inutoc builds the table-of-contents file that installp reads. Without a current .toc, installp reports "No filesets on the media" and fails. Re-run inutoc after any change — adding, removing, or replacing files in the staging directory.
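One way to catch a stale .toc before the preview is to compare modification times. toc_is_current is a hypothetical helper, and the check is only a heuristic: it notices files added or replaced after the last inutoc run, but not files that were removed.

```shell
# toc_is_current DIR: succeed when DIR/.toc exists and no .bff file in DIR has
# been modified more recently than it. Otherwise print the reason and fail.
toc_is_current() {
    dir=$1
    [ -f "$dir/.toc" ] || { echo "no .toc in $dir"; return 1; }
    newer=$(find "$dir" -name '*.bff' -newer "$dir/.toc")
    [ -z "$newer" ] || { echo "newer than .toc:"; echo "$newer"; return 1; }
}

# Assumed usage on the LPAR:
#   toc_is_current /var/update_staging/7100-05-12 || inutoc /var/update_staging/7100-05-12
```

When in doubt, simply re-running inutoc is cheap and always safe.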

13 Preview what will be installed (dry run) PREVIEW TARGET LPAR
install_all_updates -d /var/update_staging/7100-05-12 -p
# -p == preview only, no changes made

✓ Review the output carefully. Confirm:

  • Target level matches expectation (7.1.5.x)
  • No FAILURES reported
  • No unexpected filesets being downgraded
  • Any REQUISITE warnings have a path to resolution
Phase 5 Apply the update
14 Final pre-apply checks VALIDATE TARGET LPAR
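date    # log start time
who     # any other users logged in?
# Any application processes still up? (ps right-aligns the UID column, and
# AIX grep needs -E for alternation, so match on an extended pattern.)
ps -ef | grep -Ev '^ *(UID|root|daemon|bin|sys) ' | head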

Application halt: If application processes (Oracle, WebSphere, custom workloads) are still running, alert the application team and have them halt the database / app stack cleanly before install_all_updates kicks off. Patching a busy LPAR works in theory and stings in practice.

15 Run install_all_updates CRITICAL TARGET LPAR
script /var/update_staging/7100-05-12/install_$(date +%Y%m%d_%H%M).log
install_all_updates \
    -d /var/update_staging/7100-05-12 \
    -Y \
    -c
# When complete: exit the script session
exit

| Flag | Purpose |
|---|---|
| -d | Source directory (must contain .toc) |
| -Y | Accept software licenses non-interactively |
| -c | Commit (versus apply-only), see below |

Apply vs commit:

APPLY · default if -c omitted
Old fileset version retained on disk; can be rejected to roll back individual filesets without rebooting. Consumes ~2× disk space in /usr.
COMMIT · -c
Old version discarded. Smaller footprint. Rollback is via alt_disk_copy / mksysb only. Recommended for SPs where you have alt_disk_copy in place anyway.

Expected runtime: 20 to 60 min for an SP, 60 to 120 min for a TL.

16 Interpret the summary table VALIDATE TARGET LPAR

install_all_updates ends with a summary table. The column to watch is Result.

| Result | Meaning |
|---|---|
| SUCCESS | Fileset installed cleanly |
| ALREADY | Target version already present (skipped, fine) |
| FAILED · BROKEN · CANCELLED | Do not reboot. Investigate first. |

If anything but SUCCESS / ALREADY appears: investigate via /var/adm/ras/install_all_updates.log and the per-fileset logs in /var/adm/ras/ before any reboot. A reboot on top of broken filesets turns a recoverable mistake into a rollback exercise.
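With a few hundred rows in the summary, a filter is more reliable than scrolling. The sketch below rests on an assumption about the layout: a header row starting with Name and ending with Result, followed by five-column rows (Name, Level, Part, Event, Result). failed_filesets is a hypothetical helper.

```shell
# failed_filesets: read an installp-style summary on stdin and print the name,
# level, and result of every row whose Result is not SUCCESS or ALREADY.
# Exits non-zero if any such row is found.
failed_filesets() {
    awk '
        $1 == "Name" && $NF == "Result" { in_table = 1; next }
        in_table && NF == 5 && $NF != "SUCCESS" && $NF != "ALREADY" {
            print $1, $2, $NF
            bad = 1
        }
        END { exit (bad ? 1 : 0) }'
}

# Assumed usage, against the session log captured by script:
#   failed_filesets < /var/update_staging/7100-05-12/install_<timestamp>.log \
#       || echo "do not reboot"
```

Treat a clean result as necessary rather than sufficient: lppchk -v after the reboot remains the authoritative consistency check.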

Phase 6 Reboot and verify
17 Reboot the LPAR REBOOT TARGET LPAR
sync; sync; sync    # flush pending disk writes
shutdown -Fr now

Have HMC console up and visible during the reboot — if the LPAR hangs at an LED code, you want to see it immediately rather than wait for the SSH timeout.

18 Post-reboot verification VALIDATE TARGET LPAR
oslevel -s
# Expect: 7100-05-12-2320
oslevel -s -l 7100-05-12-2320
# Expect: no output (no filesets below the target level)
# If filesets are listed, they were not in the bundle
instfix -i | grep ML
# Expect: "All filesets for 7100-05_AIX_ML were found."
lppchk -v
# MUST return clean
errpt | head -30
# Compare against /tmp/errpt_pre.txt: any new error classes?
df -g
# Compare against /tmp/df_pre.txt (/usr will have grown)
lssrc -a | grep -i inoperative
# Any subsystems that should be active but aren't?
bootlist -m normal -o
# Should still show hdisk0; verify and re-set with 'bootlist -m normal hdisk0' if not

✓ Acceptance criteria: oslevel -s at target · lppchk -v clean · no new errpt classes · no inoperative subsystems · bootlist still on hdisk0. Anything else triggers Phase 7.
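The "no new errpt classes" criterion can be checked with a label diff. This is a sketch assuming both captures were taken with errpt -a, whose detailed entries each carry a LABEL: line; new_error_labels is a hypothetical helper.

```shell
# new_error_labels PRE POST: print each LABEL value that appears in the POST
# "errpt -a" capture but not in the PRE capture, once per label.
new_error_labels() {
    awk '
        FILENAME == ARGV[1] { if ($1 == "LABEL:") seen[$2] = 1; next }
        $1 == "LABEL:" && !($2 in seen) && !printed[$2]++ { print $2 }' "$1" "$2"
}

# Assumed usage after the reboot:
#   errpt -a > /tmp/errpt_post.txt
#   new_error_labels /tmp/errpt_pre.txt /tmp/errpt_post.txt
```

Empty output means the patch introduced no new error classes; repeat occurrences of a label that already existed before the patch are not reported.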

Phase 7 Rollback procedure

Trigger conditions: lppchk -v reports broken filesets after reboot, or LPAR fails to boot cleanly off hdisk0.

RB.1 Rollback while system is booted ROLLBACK TARGET LPAR
# Repoint bootlist to the pre-patch clone on hdisk1
bootlist -m normal hdisk1
bootlist -m normal -o        # verify
shutdown -Fr now
# After reboot
oslevel -s                   # Expect: 7100-05-11-2246 (pre-patch level)
lspv                         # rootvg is now on hdisk1; old (patched) rootvg shows as old_rootvg on hdisk0
bootlist -m normal hdisk1    # persist hdisk1 for the next boot
bootlist -m normal -o        # confirm the bootlist stuck
RB.2 Rollback if the system will not boot ROLLBACK HMC
  1. Activate the LPAR from the HMC in SMS mode.
  2. Select boot device, choose hdisk1.
  3. Boot normally.

Then follow the RB.1 verification steps to confirm oslevel -s reports the pre-patch level and the bootlist is persisted.

RB.3 Post-rollback cleanup CLEANUP TARGET LPAR
# Once stable on the rollback disk:
# remove the failed alt_rootvg (the patched one, now on hdisk0)
alt_rootvg_op -X old_rootvg

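Preserve: Copy /var/adm/ras/install_all_updates.log and the per-fileset logs in /var/adm/ras/ somewhere safe before raising a PMR. IBM support will need them to diagnose why the patch broke the system. These logs live on the patched rootvg (old_rootvg after a rollback), so collect them before RB.3 removes it.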

Phase 8 Cleanup · T+3 to T+7 days, after burn-in
19 Clear down the alt_disk_copy clone and archive logs CLEANUP TARGET LPAR

Only after burn-in. If patching has completed successfully and nobody has raised an issue, you're good to clear down the alt_disk_copy clone — but not before.

# Free hdisk1 by removing the clone definition
alt_rootvg_op -X altinst_rootvg
lspv    # hdisk1 should now show "None" in VG column
# Archive change logs for audit (do this before deleting the staging directory,
# which holds the install_*.log session captures)
tar -cvf /backup/change_$(hostname)_$(date +%Y%m%d).tar \
    /tmp/preflight_*.log \
    /tmp/bootlist_pre.txt /tmp/errpt_pre.txt /tmp/lssrc_pre.txt \
    /tmp/routing_pre.txt /tmp/df_pre.txt \
    /var/update_staging/7100-05-12/install_*.log \
    /var/adm/ras/install_all_updates.log
# Remove staged update files
rm -rf /var/update_staging/7100-05-12
# Optional: remove the mksysb if alt_disk_copy + TSM cover you
# ls -l /backup/$(hostname)_*.mksysb

Now close the change. You earned it.

Appendix A Sample Fix Central bundle contents
A.1 Typical AIX 7.1 SP bundle layout REFERENCE TARGET LPAR

A typical AIX 7.1 SP bundle (7100-05-12-2320-FP.tar) extracts to a flat directory of .bff files plus the .toc generated by inutoc.

/var/update_staging/7100-05-12/
    .toc    (built by inutoc)
    bos.64bit.7.1.5.41.bff
    bos.acct.7.1.5.40.bff
    bos.mp64.7.1.5.41.bff
    bos.net.tcp.client.7.1.5.41.bff
    bos.net.tcp.server.7.1.5.41.bff
    bos.perf.libperfstat.7.1.5.41.bff
    bos.perf.perfstat.7.1.5.41.bff
    bos.perf.tools.7.1.5.41.bff
    bos.rte.7.1.5.41.bff
    bos.rte.boot.7.1.5.41.bff
    bos.rte.install.7.1.5.41.bff
    bos.rte.libc.7.1.5.41.bff
    bos.rte.security.7.1.5.41.bff
    clic.rte.kernext.4.10.0.4.bff
    devices.common.IBM.ethernet.rte.7.1.5.41.bff
    devices.common.IBM.fc.rte.7.1.5.40.bff
    devices.common.IBM.scsi.rte.7.1.5.41.bff
    devices.fcp.disk.rte.7.1.5.40.bff
    ... (typically 150 to 300 filesets total)

Naming convention:

<package>.<subpackage>.<V>.<R>.<M>.<F>.bff
    V = Version (7)
    R = Release (1)
    M = Modification / TL (5)
    F = Fix / SP build (41)

Not all filesets in a bundle ship at the same V.R.M.F — some sub-components update infrequently, which is why mixed .40 and .41 levels appear in the same SP. Expected, not a problem.
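To list which filesets in a staging directory sit at which level, the filename can be split on the convention above. split_bff_name is a hypothetical helper; it assumes the name always ends in four numeric components followed by .bff.

```shell
# split_bff_name FILE: print "<fileset> <V.R.M.F>" for a name such as
# bos.rte.install.7.1.5.41.bff. Prints nothing if the name does not fit.
split_bff_name() {
    printf '%s\n' "$1" |
        sed -n 's/^\(.*\)\.\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)\.bff$/\1 \2/p'
}

# Assumed usage in the staging directory:
#   for f in *.bff; do split_bff_name "$f"; done | sort -k2
```

Sorting on the second column groups the filesets by level, which makes the mixed .40 and .41 entries easy to spot.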

A.2 Common errors and remediation REFERENCE TARGET LPAR
| Symptom | Cause / Fix |
|---|---|
| installp: 0503-005 ... could not access toc | Run inutoc . in the staging dir. |
| 0504-203 No filesets on the media | Same root cause: missing or stale .toc. |
| 0503-409 ... requisite is missing | A prereq fileset isn't in the bundle. Check oslevel -s -l and download the missing fileset, or use a fuller bundle. |
| 0503-464 ... cannot install over a newer version | Fileset on disk is newer than what's in the bundle. Normally leave the newer fileset in place. Note that installp -ag means apply plus automatically install requisites (-g does not ignore them), so it is not a way to skip the check; only force an older level if you understand why. |
| lppchk -v reports BROKEN filesets post-patch | Run installp -C to clean uncommitted state, then re-run the update for affected filesets only. |
| LPAR hangs at LED 0c31 or similar on boot | Boot from rollback disk via SMS (see RB.2). File a PMR. |
| /usr fills mid-install | chfs -a size=+1G /usr, then re-run install_all_updates. installp is restartable. |
A.3 Worked example — single command sequence REFERENCE TARGET LPAR

Do not follow this blindly — the worked-example values (level numbers, hdisk names, bundle filename) will change for your actual change. This is a sequence reference, not a copy-paste recipe.

# Pre-flight
oslevel -s; lppchk -v; df -g /usr; lspv
bootlist -m normal -o > /tmp/bootlist_pre.txt
errpt -a > /tmp/errpt_pre.txt

# Backups
mksysb -ipX /backup/$(hostname)_$(date +%Y%m%d).mksysb
alt_disk_copy -d hdisk1
lspv    # confirm altinst_rootvg

# Stage
mkdir -p /var/update_staging/7100-05-12
cd /var/update_staging/7100-05-12
tar -xvf /tmp/7100-05-12-2320-FP.tar
inutoc .

# Preview, then apply
install_all_updates -d . -p
script ./install_$(date +%Y%m%d_%H%M).log
install_all_updates -d . -Yc
exit

# Reboot and verify
shutdown -Fr now
# ... wait for reboot ...
oslevel -s    # 7100-05-12-2320
lppchk -v
instfix -i | grep ML
errpt | head

# Cleanup (after burn-in, days later)
alt_rootvg_op -X altinst_rootvg
rm -rf /var/update_staging/7100-05-12
Quick Reference Command Summary
| Command | Purpose | Notes |
|---|---|---|
| oslevel -s | Current TL/SP level | Format: release-TL-SP-build |
| oslevel -s -l <level> | Filesets below a given level | Empty output = at level |
| oslevel -r | Just the TL portion | e.g. 7100-05 |
| instfix -i | Installed ML/TL inventory | grep ML for milestones |
| lslpp -L | Full fileset inventory | State + level per fileset |
| lslpp -h <fileset> | Install history for a fileset | apply/commit/reject log |
| lppchk -v | Verify fileset consistency | MUST be clean pre and post |
| mksysb -ipX <file> | Bootable rootvg backup | Phase 3, to NFS preferred |
| alt_disk_copy -d <disk> | Clone rootvg to spare disk | Phase 3, primary rollback |
| alt_rootvg_op -W -d <disk> | Wake an alt_rootvg for offline access | Pattern B alternative |
| alt_rootvg_op -S | Sleep an alt_rootvg | After offline patching |
| alt_rootvg_op -X <vg> | Remove an alt_rootvg definition | Cleanup or rollback discard |
| inutoc <dir> | Build .toc for an installp directory | Re-run after every dir change |
| install_all_updates -d <dir> -p | Preview update | Phase 4, must be clean |
| install_all_updates -d <dir> -Yc | Apply and commit | Phase 5, the real run |
| installp -C | Clean up failed/half-applied state | Use before re-running |
| installp -s | List filesets in APPLIED (uncommitted) state | Apply-mode rollback target |
| installp -c <fileset> | Commit a specific applied fileset | Promote apply → commit |
| installp -r <fileset> | Reject (roll back) an applied fileset | Apply-mode only, no reboot |
| bootlist -m normal -o | Show normal-mode bootlist | Pre/post snapshot |
| bootlist -m normal <disk> | Set normal-mode bootlist | Rollback flip in RB.1 |
| bosboot -ad /dev/<disk> | Rebuild boot image on disk | If boot-image corruption suspected |
⚠ Key Notes