DRAFT! This is a work-in-progress and is not complete!!!!
(Using NFS v4 Client Replication Failover with GPFS filesystems.)
AIX diskless nodes depend on their service nodes for many services: bootp, tftp, default gateway, name serving, NTP, etc. The most significant service is NFS, which is used to access OS files, statelite data, and paging space. This document describes how to use GPFS and NFSv4 client replication failover support to provide continuous operation of the full HPC cluster if the NFS services provided by a service node become unavailable, whether due to failure of that service node or for other reasons.
During normal cluster operation, if a compute node is no longer able to access its NFS server, it will fail over to the configured replica backup server. Since both NFS servers use GPFS to back the filesystem, the replica server can continue to serve identical data to the compute node.
Storage Setup Configuration 1
Storage Setup Configuration 2
Recommendations for the GPFS setup
Layout of the file systems on the external disks:
There are a few components that normally run on the service nodes that under certain circumstances need access to the application GPFS cluster. Since a service node can't be (directly) in 2 GPFS clusters at once, some changes in the placement or configuration of these components must be made, now that the service nodes are in their own GPFS cluster. The components that can be affected by this are:
There are a few different ways to satisfy these requirements:
Similarly, if you need to stop and restart GPFS on a service node, make sure to stop/start these services in the following order:
exportfs -ua
exportfs -a
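For example, a minimal sketch of one such stop/start sequence, assuming the default GPFS commands and the standard AIX NFS subsystem group (adjust for your environment):
# stopping: quiesce NFS before taking GPFS down
exportfs -ua
stopsrc -g nfs
mmshutdown
# starting: bring GPFS back before serving NFS again
mmstartup
mmgetstate     # wait until the node state is "active"
startsrc -g nfs
exportfs -a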
The paging space currently does not support NFSv4 client replication failover. This may cause problems if the primary service node goes down and the compute node requires paging to remain operational.
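A quick, hedged way to gauge the exposure is to check whether the compute nodes are actively paging ('compute' is the nodegroup name used in the examples below):
xdsh compute "lsps -s" | xcoll     # a non-zero %Used means the node is relying on paging space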
There is currently an issue with using NFSv4 client replication failover for read/write files, even when GPFS is ensuring that the files are identical regardless of which SN they are accessed from. A small timing window exists in which the client sends a request to update a file and the server applies the update, but crashes before it sends the acknowledgement to the client. When the client fails over to the other server (which has the updated file thanks to GPFS) and resends the update request, the client detects that the modification time it expects differs from the one the server reports. It then bails out, marking the file "dead" until the client closes and reopens the file. This is a precaution, because the NFS client has no way of verifying that this is the exact same file it updated on the other server. AIX development is sizing a configuration option that would tell the client not to mark the file dead in this case, because GPFS is ensuring the consistency of the files between the servers.
Note - we have not yet directly experienced this condition in any of our testing.
xCAT 2.7.2 including the following code updates:
Base: AIX 7.1.D (7.1.1.0)
Initial code drop of STNFS failover support:
**(AIX CMVC defect 816890)**
**HPCstnfs.111202.epkg.Z**
STNFS Patches from Duen-wen to fix hang with 'ls /etc/nfs' downloaded from ausgsa:
/usr/lib/drivers/stnfs.ext
/usr/lib/ras/autoload/stnfs64.kdb
STNFS Patch from Duen-wen to fix I/O errors from 'ls -lR /' after failover downloaded from ausgsa:
**(AIX CMVC defect 822215):**
/usr/lib/drivers/stnfs.ext
NFS Patches from Duen-wen to fix access failures to libC in /usr filesystem downloaded from ausgsa:
**(AIX CMVC defect 826634):**
/usr/lib/drivers/nfs.ext
/usr/lib/drivers/nfs.netboot.ext
NIM patch to turn off TCB-enabled during SPOT build (locally modified on EMS by Linda Mellor based on instructions from Paul Finley). This is ONLY required for sharedinstall=all (not needed for sharedinstall=sns):
**(AIX CMVC defect 824583):**
/usr/lpp/bos.sysmgt/nim/methods/c_instspot
NOTE: All STNFS/NFS defects are fixed and will be shipped in AIX 7.1.F (7.1.2, GA 5/2012). We will need to work with AIX support if efixes need to be built for a different version of AIX.
EMS:
SNs:
CNs:
domain:
network defs:
osimage for compute nodes:
**Note**: If you are starting over with a new cluster, refer to the
https://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setting_Up_an_AIX_Hierarchical_Cluster
document for details on how to install an xCAT EMS and service nodes (SNs).
Do not remove any xCAT or NIM definitions on the EMS.
Do not remove any postscripts or statelite information from the EMS.
In the following example, "compute" is the name of an xCAT node group containing all the cluster compute nodes.
_**xdsh compute "/usr/sbin/shutdown -F &"**_
The following command will remove all the NIM client definitions from both primary and backup service nodes. See the rmdsklsnode man page for additional details.
_**rmdsklsnode -V -f compute**_
The existing NIM resources need to be removed on each service node (with the original /install filesystem still in place).
In the following example, "service" is the name of the xCAT node group containing all the xCAT service nodes, and "<osimagename>" should be substituted with the actual name of an xCAT osimage object.
_**rmnimimage -V -f -d -s service <osimagename>**_
See rmnimimage for additional details.
When this command is complete it would be good to check the service nodes to make sure there are no other NIM resources still defined. For each service node (or from the EMS with 'xdsh service'), run lsnim to list any NIM resources that remain. Remove any leftover resources that are no longer needed (you should NOT remove basic NIM resources such as master, network definitions, etc.).
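For example, a hedged sketch of checking and cleaning up (the resource name is a placeholder):
xdsh service "lsnim -c resources" | xcoll      # list the NIM resources remaining on each SN
nim -o remove <resource_name>                  # run on the SN to remove an individual leftover resource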
On each service node, clean up the NFS exports.
Re-do the exports
exportfs -ua
exportfs -a (if there are any entries left in /etc/exports)
On each service node, deactivate (in whatever way you choose: rename, overmount, etc.) the local /install filesystem and activate the GPFS shared /install filesystem. Depending on how the /install filesystem was originally created, this may also require updates to /etc/filesystems.
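One possible way to do this, assuming you want to keep (rename) the local filesystem and the GPFS filesystem was created with /install as its default mount point; treat this as a sketch rather than the only supported procedure:
umount /install                     # take the local /install offline
chfs -m /install.local /install     # move the local mount point aside in /etc/filesystems
chfs -A no /install.local           # don't auto-mount the local copy at boot
mmmount /install                    # mount the shared GPFS filesystem at /install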
The contents of the backed up local /install, EXCEPT for /install/nim, must be copied back to the shared /install directory in GPFS.
Since the other contents of the /install directory should be the same on all SNs, you can just log in to one SN and use rsync to copy the files and directories to the shared file system (an example is sketched below).
_**ssh <targetSN>**_
rsync .......????...
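A hedged example of what that rsync might look like, assuming the local copy was renamed to /install.local as in the sketch above (verify the paths before running):
rsync -av --exclude=/nim /install.local/ /install/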
If your cluster was set up with NFSv3, you will need to convert all existing NIM images to NFSv4. On the EMS, for each OS image definition, run:
_**mknimimage -u <osimage_name> nfs_vers=4**_
On each service node, the AIX OS startup order has to be changed to start GPFS before NFS. Edit /etc/inittab on each service node.
_**vi /etc/inittab**_
Move the call to /etc/rc.nfs to AFTER the start of GPFS, making sure GPFS is active before starting NFS.
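As an illustration only (the exact entries and run levels in your /etc/inittab may differ), the result should have the GPFS entry ahead of the NFS entry, similar to:
mmfs:2:once:/usr/lpp/mmfs/bin/mmautoload >/dev/console 2>&1
rcnfs:23456789:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons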
On each service node, the AIX OS shutdown order has to be changed to shut down the NFS server before GPFS, so that NFS doesn't keep trying to serve files backed by GPFS. Add the following to /etc/rc.shutdown on each service node:
_**vi /etc/rc.shutdown and add:**_
_**stopsrc -s nfsd**_
_**exit 0**_
Verify that the following attributes and values are set in the xCAT site definition:
nameservers="<xcatmaster>"
domain=<domain_name> (this is required by NFSv4)
useNFSv4onAIX="yes"
sharedinstall="sns"
You could set these values using the following command:
_**chdef -t site nameservers="<xcatmaster>" domain=mycluster.com useNFSv4onAIX="yes" sharedinstall="sns"**_
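You can verify the settings afterwards, for example:
lsdef -t site -i nameservers,domain,useNFSv4onAIX,sharedinstall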
Verify that all required software and updates are installed.
If you intend to define dump resources for your compute nodes then make sure you have installed the prerequisite software. See [XCAT_AIX_Diskless_Nodes#ISCSI_dump_support] for details.
Verify that all required software and updates are installed.
You can use the updatenode command to update the SNs.
If you intend to define dump resources for your compute nodes then make sure you have installed the prerequisite software. See [XCAT_AIX_Diskless_Nodes#ISCSI_dump_support] for details.
Create node groups for each primary SN
[How_are_we_assigning_nodes_to_primary_and_backup_SNs????]
The following example assumes you are using a 'compute' nodegroup entry in your xCAT postscripts table.
_**chdef -t group compute -p postscripts=setupnfsv4replication**_
The "servicenode" attribute values must be the names of the service nodes as they are known by the EMS. The "xcatmaster" attribute value must be the name of the primary server as known by the nodes.
_**chdef -t node -o <SNgroupname> servicenode=<primarySN>,<backupSN> xcatmaster=<nodeprimarySN>**_
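For example, using the service node names from the test cluster shown later in this document (the group name and the -hf0 interface name are illustrative):
chdef -t node -o c250f10c12ap01nodes servicenode=c250f10c12ap01,c250f10c12ap17 xcatmaster=c250f10c12ap01-hf0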
- create or update the NIM installp_bundle files that you wish to use with the new osimage
- copy in any changes to the bundle files shipped with xCAT
create NIM installp_bundle resources
_These are the bundles I used:_
nim -Fo define -t installp_bundle -a location=/install/nim/installp_bundle/xCATaixCN71.bnd \
    -a server=master xCATaixCN71
nim -Fo define -t installp_bundle -a location=/install/nim/installp_bundle/xCATaixHFIdd.bnd \
    -a server=master xCATaixHFIdd
nim -Fo define -t installp_bundle -a location=/install/nim/installp_bundle/IBMhpc_base.bnd \
    -a server=master IBMhpc_base
nim -Fo define -t installp_bundle -a location=/install/nim/installp_bundle/IBMhpc_all.bnd \
    -a server=master IBMhpc_all
ex. "mknimimage -V -r -D -s 71Dsp3_lpp_source -t diskless 71Dsp3tst installp_bundle="xCATaixCN71" configdump=selective"
- add additional software, updates, ifixes, etc. to the lpp_source
- update the spot
- create a dump resource and add it to the osimage def (optional)
- specify a configdump value (optional)
ex. mknimimage -V -u 71Dsp3tst
create NIM lpp_source for image
- add all HPC software to the lpp_source
- add all efixes to the lpp_source:
# Base STNFS failover support:
cp /xcat/stnfs/HPCstnfs.111202.epkg.Z /install/nim/lpp_source/71Dtst_lpp_source/emgr/ppc
nim -Fo check 71Dtst_lpp_source
create osimage on MN (mknimimage)
"mknimimage -t diskless -r -D -s 71D_lpp_source 71Dcompute otherpkgs=”HPCstnfs.111202.epkg.Z” _<other mknimimage input as needed>>_"
_My actual command:_
mknimimage --force -V -r -s 71Dtst_lpp_source -t diskless 71Dtst \
    installp_bundle="xCATaixCN71,xCATaixHFIdd,IBMhpc_base,IBMhpc_all" \
    otherpkgs="HPCstnfs.111202.epkg.Z" synclists=/install/custom/aix/compute.synclist
Note - there is a known xCAT bug: when you use multiple installp_bundle files with mknimimage, the rpm.rte lpp MUST be listed in the first bundle file you specify to xCAT (i.e. xCATaixCN71). The rpm command is needed by later steps in mknimimage to install rpms into the image.
- add custom patches to the spot:
cp /xcat/stnfs/stnfs.ext /install/nim/spot/71Dtst/usr/lib/drivers/stnfs.ext
cp /xcat/stnfs/stnfs64.kdb /install/nim/spot/71Dtst/usr/lib/ras/autoload/stnfs64.kdb
cp /xcat/stnfs/nfs.ext /install/nim/spot/71Dtst/usr/lib/drivers/nfs.ext
cp /xcat/stnfs/nfs.netboot.ext /install/nim/spot/71Dtst/usr/lib/drivers/nfs.netboot.ext
nim -Fo check 71Dtst
- verify the efixes applied to the spot:
xcatchroot -i 71Dtst 'emgr -l'
- the statelite table statemnt entry MUST use $noderes.xcatmaster (required)
do statelite setup
- The admin must create the persistent directory in the shared filesystem on the service nodes and add it to /etc/exports. We recommend creating it in GPFS under /install (for example, /install/statelite_data), since xCAT mkdsklsnode will NFSv4 export /install with the correct replica info for you.
The statelite table should be set up so that each service node is the NFS server for its compute nodes. You should use the "$noderes.xcatmaster" substitution string instead of specifying the actual service node so that when xCAT changes the service node database values for the compute nodes during an snmove operation, this table will still have correct information. It should look something like:
#node,image,statemnt,mntopts,comments,disable
"compute",,"$noderes.xcatmaster:/install/statelite_data",,,
Reminder: if you have an entry in your litefile table for persistent AIX logs, you MUST redirect your console log to another location, especially in this environment. The NFSv4 client replication failover support logs messages during failover, and if the console log location is in a persistent directory that is actively failing over, you can hang the failover. If you have an entry in your litefile table similar to:
tabdump litefile
#image,file,options,comments,disable
:
"ALL","/var/adm/ras/","persistent","for GPFS",
be sure that you have a postscript that runs during node boot to redirect the console log:
/usr/sbin/swcons -p /tmp/conslog
(or some other local location)
For more information, see: [XCAT_AIX_Diskless_Nodes#Preserving_system_log_files]
Note: when using a shared file system across the SNs, you must run the mkdsklsnode command on the backup SNs first and then run it for the primary SNs.
ex. mkdsklsnode -V -S -b -i 71Dsp3tst c250f10c12ap02-hf0
verify - check nim setup on backup SN
run mkdsklsnode for compute nodes
- first, make sure /etc/exports on the service nodes does not contain any old entries. If it does, remove them and run 'exportfs -ua'
- mkdsklsnode -S -V -i 71Dcompute compute
notes:
- use the -S flag to set up the NFSv4 replication settings on the SNs
- the site.sharedinstall value tells us to do the primary only and to copy resources to one SN (shared file system)
ex. mkdsklsnode -V -S -p -i 71Dsp3tst c250f10c12ap02-hf0
verify - check nim setup on primary SN
verify the NFSv4 replication is exported correctly for your service node pairs:
xdsh service cat /etc/exports | xcoll
====================================
c250f10c12ap01
====================================
/install -replicas=/install@20.10.12.1:/install@20.10.12.17,vers=4,rw,noauto,root=*
====================================
c250f10c12ap17
====================================
/install -replicas=/install@20.10.12.17:/install@20.10.12.1,vers=4,rw,noauto,root=*
_**rbootseq compute hfi**_
_**rpower compute on**_
- NFSv4 replication
- dump device configuration
verify NFSv4 replication is configured correctly on the compute node:
xdsh <node> nfs4cl showfs
xdsh <node> nfs4cl showfs /usr
- simple test of NFSv4 client replication failover:
s1 = service node 1
s2 = service node 2
c1 = all compute nodes managed by s1, backup s2
c2 = all compute nodes managed by s2, backup s1
xdsh c1,c2 nfs4cl showfs | xcoll
_# should show c1 filesystems served by s1 and c2 filesystems served by s2_
xdsh s1 stopsrc -s nfsd
xdsh c1,c2 ls /usr | xcoll
xdsh c1,c2 nfs4cl showfs | xcoll
_# should show all nodes getting /usr from s2 now (depending on NFS caching, it may take additional activity on the c1 nodes to have all filesystems failover to s2)_
**TESTING NOTE**: At this point, you can restart NFS on s1. You can continue testing by shutting down NFS on s2 and watching all nodes fail over to s1. Once NFS is back up on both service nodes, over time, the clients should eventually switch back to using their primary server.
Use the node groups that were created for each primary SN to simplify the process.
_**snmove -V <targetnodegroup>**_
On the MN:
- node defs (lsdef <node>) - check the servicenode and xcatmaster values
On the nodes:
- default gateway (netstat -nr)
- dump device (sysdumpdev) - list dump target info???
- statelite tables - in root dir
- .client_data files -? in s-r/node/c-d
- /etc/xcatinfo file
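A hedged set of commands covering most of this checklist (adjust the node names as needed):
lsdef <node> -i servicenode,xcatmaster      # on the EMS
xdsh <node> netstat -nr                     # default gateway should now point at the new SN
xdsh <node> sysdumpdev -l                   # dump device / target info
xdsh <node> cat /etc/xcatinfo               # the server recorded here should now be the new SN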
_**xdsh compute "shutdown -F &"**_
_**rpower compute on**_