Power 775 AIX Upgrade Procedure


Introduction

This document describes a process for upgrading the software and firmware on an AIX Power 775 cluster.

Step A: (Prep Work)

  1. Create a mksysb of each EMS (Primary and Backup) and of the Service Nodes so that you can quickly revert to the previous cluster level. Put these mksysb images on a backup file system. Back up your database if you do not have a recent backup. If any important data lives outside rootvg and the shared disks, back that up as well. The new binary process for dumpxCATdb/restorexCATdb is recommended. See: http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setting_Up_DB2_as_the_xCAT_DB#Backup.2FRestore_the_database_with_DB2_Commands
  2. On the Backup EMS, start the local DB2 database: su - xcatdb, then db2start.
  3. On the Backup EMS, with no shared disk mounted, update in the following order: DB2, xCAT, DFM, LL, ISNM.hdwr_svr, TEAL, AIX. DFM and xCAT must be upgraded at the same time. You may defer the ISNM.hdwr_svr upgrade until ISNM.cnm is upgraded in Step D. After the upgrade, make sure xCAT, LL, ISNM.hdwr_svr, and TEAL are stopped, and finally stop the database with db2stop.
  4. Reboot the Backup EMS. After the reboot, no xCAT, LL, hdwr_svr, or TEAL daemons should be running, and the database should be down. If any of them are running, stop them (for the database: su - xcatdb, then db2stop).
  5. On the Primary EMS, update in the following order: DB2, xCAT, DFM, LL, TEAL, AIX. ISNM.hdwr_svr can usually be upgraded at this point, but not ISNM.cnm. If LL is configured to use the database, run "perl which lldbupdate" after updating LL.
  6. Reboot the Primary EMS. This scenario will have to be reevaluated when running LL on the database: updating the DB2 PTF requires stopping all daemons that access the database, which stops LL on the Service Nodes.
  7. Update xCAT on all Service Nodes following this process: http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Updating_AIX_Software_on_xCAT_Nodes#Upgrading_xCAT_on_service_nodes (do not use the instxcat script; it is for the EMS only).
  8. If using multibos support, prepare the alternate BOS on the Service Nodes at this point (do not update the hfi driver): http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com.ibm.aix.install/doc/insgdrf/multibosutility.htm?resultof=%22%6d%75%6c%74%69%62%6f%73%22%20
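
The Backup EMS part of the prep work (steps 1 to 4) could be sketched as a small dry-run script. This is only an illustration under stated assumptions: the /backup path, the file names, and the daemon process names (LoadL_master, teal, db2sysc) are not from this page, and the mksysb -i and dumpxCATdb -b -p options should be checked against the man pages for your release. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of the Backup EMS prep work. Nothing is executed unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
BACKUP_DIR=${BACKUP_DIR:-/backup}   # hypothetical backup file system

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: system image of the EMS plus a binary DB2 dump of the xCAT database.
# File and directory names are hypothetical.
backup_ems() {
    run mksysb -i "$BACKUP_DIR/ems.mksysb"
    run dumpxCATdb -b -p "$BACKUP_DIR/xcatdb"
}

# Steps 2 and 3: start DB2 before the updates, stop it once they are done.
start_db2() { run su - xcatdb -c db2start; }
stop_db2()  { run su - xcatdb -c db2stop; }

# Step 4: read a process listing on stdin and print each daemon that is
# still present. The process names are assumptions, and the match is a
# plain substring test, so treat a hit as a prompt to look closer.
running_daemons() {
    ps_out=$(cat)
    for d in xcatd hdwr_svr LoadL_master teal db2sysc; do
        case "$ps_out" in
            *"$d"*) echo "$d" ;;
        esac
    done
}
```

After the reboot in step 4, something like `ps -ef | running_daemons` should print nothing; any name it prints is a daemon to stop before continuing.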

Step B: Update All Service Nodes (Maintenance window #1)

  • Option 1: Using updatenode
      • Shut down the compute nodes.
      • Stop xcatd on the Service Nodes (only required for a DB2 upgrade).
      • From the Primary EMS, update DB2, LL, and AIX (no new hfi driver) on the Service Nodes, in that order, using updatenode or xdsh.
      • For DB2, follow the Service Node upgrade process: http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setting_Up_DB2_as_the_xCAT_DB#Installing_Latest_Fix_Packs_on_Service_Nodes
      • Reboot the Service Nodes.
      • Start LL on the Service Nodes (using xdsh <service node> llctl start).
      • Bring up the compute nodes, if you shut them down earlier.
      • Start LL on the compute nodes (using xdsh <group> llrctl start).
  • Option 2: Using multibos
      • Shut down the compute nodes.
      • Reboot the Service Nodes to the alternate BOS.
      • Start LL on the Service Nodes (using xdsh <service node> llctl start).
      • Bring up the compute nodes, if you shut them down earlier.
      • Start LL on the compute nodes (using xdsh <group> llrctl start).
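
Both options end by restarting LoadLeveler, first on the Service Nodes and then on the compute nodes. A minimal dry-run sketch of that final restart follows. The group names service and compute are assumptions; substitute your own xCAT noderanges.

```shell
#!/bin/sh
# Dry-run sketch of the LL restart that closes Step B. Nothing runs unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
SN_GROUP=${SN_GROUP:-service}   # hypothetical xCAT group of Service Nodes
CN_GROUP=${CN_GROUP:-compute}   # hypothetical xCAT group of compute nodes

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Order in which Option 1 updates software on the Service Nodes.
sn_update_order() { echo "DB2 LL AIX"; }

# Start LL on the Service Nodes first, then on the compute nodes,
# once the compute nodes are back up.
restart_ll() {
    run xdsh "$SN_GROUP" llctl start
    run xdsh "$CN_GROUP" llrctl start
}
```

Calling restart_ll with the defaults prints the two xdsh commands in the order the procedure requires; set DRY_RUN=0 only after reviewing them.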

Step C: Build and distribute new images

Note: Jobs can keep running between the two maintenance windows, if you want.

  1. Add the new hfi drivers to the lpp_source used for the images on the EMS.
  2. Build all new images (mknimimage) for the nodes.
  3. Run mkdsklsnode -n on the EMS for all compute nodes.
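
As a rough illustration, the three steps above might look like the following dry-run script. Every resource name here (the lpp_source, the driver directory, the image names, the compute group) is hypothetical. The nim -o update call and the mknimimage -t/-i and mkdsklsnode -i options are an assumption about how a site would carry out these steps, so verify them against the man pages for your xCAT level.

```shell
#!/bin/sh
# Dry-run sketch of Step C. Nothing runs unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
LPP_SOURCE=${LPP_SOURCE:-compute_lpp_source}   # hypothetical NIM lpp_source
HFI_DIR=${HFI_DIR:-/install/hfi}               # hypothetical directory with the new hfi drivers
OLD_IMAGE=${OLD_IMAGE:-compute_old}            # hypothetical existing osimage
NEW_IMAGE=${NEW_IMAGE:-compute_new}            # hypothetical new osimage
CN_GROUP=${CN_GROUP:-compute}                  # hypothetical xCAT group of compute nodes

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_images() {
    # 1. Add the new hfi drivers to the lpp_source.
    run nim -o update -a packages=all -a source="$HFI_DIR" "$LPP_SOURCE"
    # 2. Build a new diskless image based on the current one.
    run mknimimage -t diskless -i "$OLD_IMAGE" "$NEW_IMAGE"
    # 3. Define the compute nodes against the new image under new NIM names.
    run mkdsklsnode -n -i "$NEW_IMAGE" "$CN_GROUP"
}
```

Because mkdsklsnode -n is run ahead of the reboot, this work fits between the maintenance windows while jobs are still running.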

Step D: Reboot Cluster (Maintenance window #2)

  1. Prepare your firmware for upgrade.
  2. Upgrade ISNM.cnm. Also upgrade ISNM.hdwr_svr on the EMS and Backup EMS, if not already upgraded.
  3. Run updatenode to update hfi drivers on the Service Nodes.
  4. Run rflash on the frames and CECs. This requires a reboot of all nodes. After the reboot, restart LL, verify that the nodes are available, and run test jobs.
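
Steps 3 and 4 of this window could be sketched as below. The group names (service, frame, cec) and the firmware directory are hypothetical. The updatenode -S and rflash -p ... --activate disruptive forms are assumptions about the invocation, so check them against the xCAT documentation for your release. The ISNM upgrade in step 2 is left as a manual step because this page does not give a command for it.

```shell
#!/bin/sh
# Dry-run sketch of the driver and firmware part of Step D. Nothing runs unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
SN_GROUP=${SN_GROUP:-service}        # hypothetical xCAT group of Service Nodes
FRAME_GROUP=${FRAME_GROUP:-frame}    # hypothetical xCAT group of frames
CEC_GROUP=${CEC_GROUP:-cec}          # hypothetical xCAT group of CECs
FW_DIR=${FW_DIR:-/install/firmware}  # hypothetical directory holding the firmware files

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

firmware_window() {
    # Step 3: push the new hfi drivers to the Service Nodes.
    run updatenode "$SN_GROUP" -S
    # Step 4: flash the frames, then the CECs. All nodes must be rebooted afterwards.
    run rflash "$FRAME_GROUP" -p "$FW_DIR" --activate disruptive
    run rflash "$CEC_GROUP" -p "$FW_DIR" --activate disruptive
}
```

Review the printed commands with DRY_RUN left at its default before running anything against the cluster.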
