DRAFT: This is a work in progress and is not complete.
(Using NFS v4 Client Replication Failover with GPFS filesystems.)
AIX diskless nodes depend on their service nodes for many services: bootp, tftp, default gateway, name serving, NTP, etc. The most significant service is NFS to access OS files, statelite data, and paging space. This document describes how to use GPFS and NFSv4 client replication failover support to provide continuous operation of the full HPC cluster if the NFS services provided by a service node become unavailable, whether due to failure of that service node or for other reasons.
OPTIONAL: The EMS can also be attached to the external disks and included in the GPFS cluster. In this case, the /install filesystem will be common across the EMS and all service nodes.
NOTE: This option is not yet supported by xCAT.
Each service node will NFSv4 export the /install filesystem with its backup service node NFS server replica specified (automatically set in the /etc/exports file by the xCAT mkdsklsnode command).
During normal cluster operation, if a compute node is no longer able to access its NFS server, it will failover to the configured replica backup server. Since both NFS servers are using GPFS to back the filesystem, the replica server will be able to continue to serve the identical data to the compute node. However, there is no automatic failover capability for the dump resource - no dump capability will be available until the xCAT snmove command is run to retarget the compute node's dump device to its backup service node.
There are a few components that normally run on the service nodes that under certain circumstances need access to the application GPFS cluster. Since a service node can't be (directly) in 2 GPFS clusters at once, some changes in the placement or configuration of these components must be made, now that the service nodes are in their own GPFS cluster. The components that can be affected by this are:
Similarly, if you need to stop and restart GPFS on a service node, make sure to stop and start the services in the correct order. Before stopping GPFS, unexport the NFS filesystems:
exportfs -ua
After GPFS is back up, re-export them:
exportfs -a
The paging space currently does not support NFSv4 client replication fail over. This may cause problems if the primary service node goes down, and the compute node requires paging to remain operational. xCAT development has started preliminary testing with a prototype version of the paging space support, and some notes have been included in the process below.
There is currently an issue with using NFSv4 client replication failover for read-write files, even when GPFS is ensuring that the files are identical regardless of which SN they are accessed from. A small timing window exists in which the client sends a request to update a file and the server updates it, but crashes before it sends the acknowledgement to the client. When the client fails over to the other server (which has the updated file, thanks to GPFS) and resends the update request, the client detects that it and the server disagree on the file's modification time and bails out, marking the file "dead" until the client closes and reopens the file. This is a precaution, because the NFS client has no way of verifying that this is the exact same file it updated on the other server. AIX development is sizing a configuration option that would tell the client not to mark the file dead in this case, since GPFS is ensuring the consistency of the files between the servers.
Note - we have not yet directly experienced this condition in any of our testing.
If "site.sharedinstall=all" (currently not supported), all NIM resources on the EMS will be created directly in the GPFS filesystem, including your lpp_source and spot resources. By default, NIM resources cannot be created with associated files in a GPFS filesystem (only jfs or jfs2 filesystems are supported). To bypass this restriction, all NIM commands must be run with either the environment variable "NIM_ATTR_FORCE=yes" set, or with the 'nim -F' force flag specified directly on each command. All xCAT commands have been changed to accommodate this setting. However, it is often necessary for an admin to run NIM commands directly. When doing so, be sure to use one of these force options.
[Need_final_statement_for_prereqs]
xCAT 2.7.2 including the following code updates:
Base: AIX 7.1.D (7.1.1.0)
Initial code drop of STNFS failover support:
**(AIX CMVC defect 816890)**
**HPCstnfs.111202.epkg.Z**
STNFS Patch from Duen-wen to fix I/O errors from 'ls -lR /' after failover downloaded from ausgsa:
**(AIX CMVC defect 822215):**
/usr/lib/drivers/stnfs.ext
NFS Patches from Duen-wen to fix access failures to libC in /usr filesystem downloaded from ausgsa:
**(AIX CMVC defect 826634):**
/usr/lib/drivers/nfs.ext
/usr/lib/drivers/nfs.netboot.ext
NIM patch to turn off TCB-enabled during SPOT build (locally modified on EMS by Linda Mellor based on instructions from Paul Finley). This is ONLY required for sharedinstall=all (not needed for sharedinstall=sns):
**(AIX CMVC defect 824583):**
/usr/lpp/bos.sysmgt/nim/methods/c_instspot
From AIX Development/Service:
==> swinf -u 816890
U843487|IV11645|816890|bos|bos.net.nfs.client|bos71F onc71F pkg71F|aix|limited stnfs replication support|7.1.1.15|
U846403|IV11645|816890|bos|bos.net.nfs.client|aix71D|aix|limited stnfs replication support|7.1.1.3|7100-01-03
U846654|IV11646|816890|bos|bos.net.nfs.client|bos71H onc71H pkg71H|aix|limited stnfs replication support|7.1.2.0|
==> swinf -u 822215
U843487|IV14334|822215|bos|bos.net.nfs.client|onc71F|aix|stnfs replication will not work if the mount point is not the root of a FS.|7.1.1.15|
U846654|IV15285|822215|bos|bos.net.nfs.client|onc71H|aix|stnfs replication will not work if the mount point is not the root of a FS.|7.1.2.0|
==> swinf -u 826634
U842986|IV16857|826634|bos|bos.net.nfs.client|onc61S|aix|During file recovery, regular file becomes symlink|6.1.7.15|
U849900|IV18488|826634|bos|bos.net.nfs.client|onc61N|aix|During file recovery, regular file becomes symlink|6.1.6.19|
U845434|IV16681|826634|bos|bos.net.nfs.client|onc61V|aix|During file recovery, regular file becomes symlink|6.1.8.0|
U843487|IV16758|826634|bos|bos.net.nfs.client|onc71F|aix|During file recovery, regular file becomes symlink|7.1.1.15|
U846654|IV17125|826634|bos|bos.net.nfs.client|onc71H|aix|During file recovery, regular file becomes symlink|7.1.2.0|
816890 will be in 7100-01-03, i.e. 71D SP3.
The other two, 822215 and 826634 do not show up in 71D as of now.
NOTE: All STNFS/NFS defects are fixed and will be shipped in AIX 7.1.F (7.1.2, GA 5/2012). We will need to work with AIX support if efixes need to be built for a different version of AIX
This procedure assumes the following:
**Note**: If starting over with a new cluster then refer to the
https://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setting_Up_an_AIX_Hierarchical_Cluster
document for details on how to install an xCAT EMS and service nodes (SN).
Do not remove any xCAT or NIM information from the EMS.
Storage Setup Configuration 1
Storage Setup Configuration 2
Perform the necessary admin steps to assign the fibre channel I/O adapter slots to the selected xCAT SN octant/LPAR (the xCAT chvm command may be used to do this). The xCAT SN LPAR and serving CEC may need to be taken down to make I/O slot changes to the xCAT SN configuration.
Ensure that the SAN disks being used for the GPFS cluster are allocated to the assigned fibre channel adapters and that the disks can be seen on the target xCAT SNs.
[PUNEET/BIN_-_PLEASE_ADD_DETAIL_HERE_AS_REQUIRED]
Recommendations for the GPFS setup
Layout of the file systems on the external disks:
For now, mount the GPFS /install filesystem on a temporary mount point on the SNs, such as /install-gpfs. This will need to be changed to /install later in the process.
Since this process will remove all existing data from the local /install/nim directories on your service nodes, you may choose to make a backup copy of the /install filesystem at this time.
The contents of the local /install filesystems on your SNs will need to be copied into the new shared GPFS /install-gpfs filesystem. You should NOT copy over the /install/nim directory -- this will need to be completely re-created in order to ensure NIM is configured correctly for each SN.
Most of the xCAT data should be identical for each local SN /install directory in your cluster. This includes sub-directories such as:
/install/custom
/install/post
/install/prescripts
/install/postscripts
It will only be necessary to copy these subdirectories from one SN into /install-gpfs. Therefore, you can just log into one SN and use rsync to copy the files and directories to the shared file system.
ssh <targetSN>
rsync .......????...
You must create the directory for your persistent statelite data in the /install-gpfs filesystem. e.g. from one SN:
mkdir /install-gpfs/statelite_data
(Optional) At this time, you may choose to place an initial copy of your persistent data into the /install-gpfs filesystem. However, since the compute nodes in your cluster are currently running, they are still updating their persistent files, so you will need to resync this data again later after bringing down the cluster. Depending on the amount and stability of your persistent data, the subsequent rsync can take much less time and help reduce your cluster outage time.
Use rsync to do the initial copy from your current statelite directory. You should run this rsync from one SN at a time to copy data into the shared /install-gpfs filesystem. This will ensure that if you happen to have more than one SN that has a subdirectory for the same compute node, you will not run into collisions copying from multiple SNs at the same time. Make sure to use the rsync -u (update) option to ensure stale data from an older SN does not overwrite the data from an active SN.
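A minimal sketch of that per-SN copy (paths are examples, not the required locations):

```shell
#!/bin/sh
# Sketch: run from ONE service node at a time. Paths are examples.
SRC=${SRC:-/install/statelite_data}
DEST=${DEST:-/install-gpfs/statelite_data}
if [ -d "$SRC" ]; then
    # -u (update) skips any file that is already newer at the destination,
    # so stale data from an older SN cannot overwrite an active SN's copy
    rsync -au "$SRC/" "$DEST/"
fi
```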
Note: You do not need to worry about changing /etc/exports to correctly export your statelite directory since you are placing it in /install-gpfs. Later in the process, you will rename the filesystem to /install, and xCAT will add the correct /install export for NFSv4 replication to /etc/exports when mkdsklsnode runs.
The statelite table will need to be set up so that each service node is the NFS server for its compute nodes. You should use the "$noderes.xcatmaster" substitution string instead of specifying the actual service node so that when xCAT changes the service node database values for the compute nodes during an snmove operation, this table will still have correct information. It should look something like:
_**#node,image,statemnt,mntopts,comments,disable**_
_**"compute",,"$noderes.xcatmaster:/install/statelite_data",,,**_
REMINDER: If you have an entry in your litefile table for persistent AIX logs, you MUST redirect your console log to another location, especially in this environment. The NFSv4 client replication failover support logs messages during failover, and if the console log location is in a persistent directory, which is actively failing over, you can hang your failover. If you have an entry in your litefile similar to:
tabdump litefile
_**#image,file,options,comments,disable**_
_**"ALL","/var/adm/ras/","persistent","for GPFS",**_
Be sure that you have a postscript that runs during node boot to redirect the console log:
_**/usr/sbin/swcons -p /tmp/conslog**_
(or some other local location)
For more information, see: [XCAT_AIX_Diskless_Nodes#Preserving_system_log_files]
If you have any other non-xCAT data in your local /install filesystems, you will first need to determine if this data is identical across all service nodes, or if you will need to create a directory structure to support unique files for each SN. Based on that determination, copy the data into /install-gpfs as appropriate.
Verify that the following attributes and values are set in the xCAT site definition:
nameservers="<xcatmaster>"
domain=<domain_name> (this is required by NFSv4)
useNFSv4onAIX="yes"
sharedinstall="sns"
You could set these values using the following command:
_**chdef -t site nameservers="<xcatmaster>" domain=mycluster.com**_
_**useNFSv4onAIX="yes" sharedinstall="sns"**_
Verify that all required software and updates are installed.
You can use the updatenode command to update the SNs.
If you intend to define dump resources for your compute nodes then make sure you have installed the prerequisite software. See [XCAT_AIX_Diskless_Nodes#ISCSI_dump_support] for details.
NOTE: If any software changes you are making require you to reboot the service node, you may wish to postpone this work until you shutdown the cluster nodes later in the process.
On each service node, the AIX OS startup order has to be changed to start GPFS before NFS. Edit /etc/inittab on each service node.
_**vi /etc/inittab**_
Move the call to /etc/rc.nfs to AFTER the start of GPFS, making sure GPFS is active before starting NFS.
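As an illustration only (the exact entry names and fields vary by AIX and GPFS release; these sample lines are assumptions, not copied from a real system), the relevant /etc/inittab ordering should end up with the GPFS entry (often mmfs/mmautoload) before the rcnfs entry:

```
mmfs:2:once:/usr/lpp/mmfs/bin/mmautoload >/dev/console 2>&1
rcnfs:23456789:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
```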
On each service node, the AIX OS shutdown order has to be changed to shutdown the NFS server before GPFS, so that NFS doesn't keep trying to serve files backed by GPFS. Add the following to /etc/rc.shutdown on each service node:
_**vi /etc/rc.shutdown and add:**_
_**stopsrc -s nfsd**_
_**exit 0**_
You may wish to keep copies of these files on the EMS and add them to synclists for your service nodes. Then, if you ever need to re-install your service nodes, these files will be updated correctly at that time.
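For example (the EMS file locations are assumptions; the `source -> destination` line format is the standard xCAT synclist syntax), a service node synclist could contain:

```
/install/custom/sn/inittab -> /etc/inittab
/install/custom/sn/rc.shutdown -> /etc/rc.shutdown
```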
(There is nothing unique required in this step for HASN support)
Create or update NIM installp_bundle files that you wish to use with your osimages.
Also, if you are upgrading to a new version of xCAT, you should check any installp_bundles that you use that were provided as sample bundle files by xCAT. If these sample bundle files are updated in the new version of xCAT you should update your NIM installp_bundle files appropriately.
The list of bundle files you should have defined include:
To define a NIM installp_bundle resource you can run a command similar to the following:
_**nim -Fo define -t installp_bundle -a location=/install/nim/installp_bundle/xCATaixCN71.bnd**_
_**-a server=master xCATaixCN71**_
You can modify a bundle file by simply editing it. It does not have to be re-defined.
If your cluster was setup with NFSv3 you will need to convert all existing NIM images to NFSv4. On the EMS, for each existing OS image definition, run:
_**mknimimage -u <osimage_name> nfs_vers=4**_
You will need to build images with the correct version of AIX, all of the required fixes for NFS v4 client replication failover support, and your desired HPC software stack. You can use existing xCAT osimage definitions or you can create new ones using the xCAT mknimimage command.
To create a new osimage you could run a command similar to the following:
_**mknimimage -V -r -s /myimages -t diskless <osimage name>**_
_**installp_bundle="xCATaixCN71,xCATaixHFIdd,IBMhpc_base,IBMhpc_all"**_
Whether you are using an existing lpp_source or you created a new one you must make sure you copy any new software prerequisites or updates to the NIM lpp_source resource for the osimage.
The easiest way to do this is to use the "nim -o update" command.
For example, to copy all software from the /tmp/myimages directory you could run the following command.
_**nim -o update -a packages=all -a source=/tmp/myimages <lpp_source name>**_
This command will automatically copy installp, rpm, and emgr packages to the correct location in the lpp_source subdirectories.
Once you have copied all your software to the lpp_source it would be good to run the following two commands.
_**nim -Fo check <lpp_source name>**_
and:
_**chkosimage -V <spot name>**_
See chkosimage for details.
You can use the xCAT mknimimage, xcatchroot, or xdsh commands to update the spot software on the EMS.
For example, to install the HPCstnfs.111202.epkg.Z ifix you could run the following command.
_**mknimimage -V -u <spot name> otherpkgs="E:HPCstnfs.111202.epkg.Z"**_
Check the spot.
_**nim -Fo check <spot name>**_
Verify that the ifixes are applied to the spot.
_**xcatchroot -i <spot name> "emgr -l"**_
Dump resource
Due to current NIM limitations a dump resource cannot be created in the shared file system.
If you wish to define a dump resource to be included in an osimage definition you must use NIM directly to create the resource in a separate local file system on the EMS. (For example /export/nim.)
Once the dump resource is created you can add its name to your osimage definition.
_**chdef -t osimage -o <osimage name> dump=<dump res name>**_
When the mkdsklsnode command creates the resources on the SNs it will create the dump resources in a local filesystem with the same name, e.g. /export/nim. If you want these directories to exist in filesystems on the external storage subsystem, you will need to create those filesystems and have them available on each SN before running the mkdsklsnode command.
Paging resource
[TBD_-_this_section_will_be_expanded_once_the_paging_failover_support_becomes_available]
On one SN, create the paging files for all of the compute nodes in your cluster in the shared /install filesystem. For example, to create 128G of swap space for each node do:
mkdir /install/paging
# For each compute node:
mkdir /install/paging/<compute node>
dd if=/dev/zero of=/install/paging/<node>/swapnfs1 bs=1024k count=65536
dd if=/dev/zero of=/install/paging/<node>/swapnfs2 bs=1024k count=65536
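The per-node commands above can be wrapped in a small loop; this is a sketch (NODES, PAGING_DIR, and COUNT are example parameters I am assuming; count=65536 with 1 MB blocks yields the 64 GB per file used above):

```shell
#!/bin/sh
# Sketch: create two NFS paging files for each compute node in the shared
# filesystem. With COUNT=65536 and bs=1024k each file is 64 GB, giving
# 128 GB of swap per node across the two files.
PAGING_DIR=${PAGING_DIR:-/install/paging}
COUNT=${COUNT:-65536}
for node in $NODES; do
    mkdir -p "$PAGING_DIR/$node"
    dd if=/dev/zero of="$PAGING_DIR/$node/swapnfs1" bs=1024k count=$COUNT
    dd if=/dev/zero of="$PAGING_DIR/$node/swapnfs2" bs=1024k count=$COUNT
done
```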
Set up a new postscript to run on the compute node to activate that paging space with replication/failover support and disable the default swapnfs0:
#!/bin/sh
# Remove any previously defined NFS paging spaces with these names
rmps swapnfs1
rmps swapnfs2
# Define the NFS paging spaces served by the xcatmaster ($MASTER is set by
# xCAT); the ':fur' flags enable replication/failover support
mkps -t nfs $MASTER /install/paging/`hostname -s`/swapnfs1:fur
mkps -t nfs $MASTER /install/paging/`hostname -s`/swapnfs2:fur
# Activate the new paging spaces, then retire the default swapnfs0
swapon /dev/swapnfs1
swapon /dev/swapnfs2
swapoff /dev/swapnfs0
rmps swapnfs0
NOTE: The paging space failover support is NOT available yet, so if a diskless node is paging during failover, the paging activity will hang. Also, the flags ':fur' are specific for failover support. If you are setting up paging in preparation for this future function, use the flags ':wam' instead.
Create node groups for each primary SN
[How_are_we_assigning_nodes_to_primary_and_backup_SNs????]
The following example assumes you are using a 'compute' nodegroup entry in your xCAT postscripts table.
_**chdef -t group compute -p postscripts=setupnfsv4replication**_
The "servicenode" attribute values must be the names of the service nodes as they are known by the EMS. The "xcatmaster" attribute value must be the name of the primary server as known by the nodes.
_**chdef -t node -o <SNgroupname> servicenode=<primarySN>,<backupSN> xcatmaster=<nodeprimarySN>**_
????? need postscript for creating additional paging???
What others????
In the following example, "compute" is the name of an xCAT node group containing all the cluster compute nodes.
_**xdsh compute "/usr/sbin/shutdown -F &"**_
The following command will remove all the NIM client definitions from both primary and backup service nodes. See the rmdsklsnode man page for additional details.
_**rmdsklsnode -V -f compute**_
The existing NIM resources need to be removed on each service node. (With the original /install filesystem still in place.)
In the following example, "service" is the name of the xCAT node group containing all the xCAT service nodes, and "<osimagename>" should be substituted with the actual name of an xCAT osimage object.
_**rmnimimage -V -f -d -s service <osimagename>**_
See rmnimimage for additional details.
When this command is complete it would be good to check the service nodes to make sure there are no other NIM resources still defined. For each service node (or from the EMS with 'xdsh service'), run lsnim to list any NIM resources that remain. Remove any leftover resources that are no longer needed (you should NOT remove basic NIM resources such as master, network, etc.).
On each service node, clean up the NFS exports.
Re-do the exports
exportfs -ua
exportfs -a (if there are any entries left in /etc/exports)
Use rsync to copy all the persistent data from your current statelite directory. Even if you did an initial copy earlier in the process, you will need to do this again now to pick up any changes that have been written since then. You should run this rsync from one SN at a time to copy data into the shared /install-gpfs filesystem. This will ensure that if you happen to have more than one SN that has a subdirectory for the same compute node, you will not run into collisions copying from multiple SNs at the same time. Make sure to use the rsync -u (update) option to ensure stale data from an older SN does not overwrite the data from an active SN.
On each service node, deactivate (in whatever way you choose: rename, overmount, etc.) the local /install filesystem. Change the mount point for your shared GPFS /install-gpfs filesystem to /install. Depending on how the old local /install filesystem was originally created, this may also require updates to /etc/filesystems.
If you postponed updating software on your service nodes because of required reboots, you should apply that software now and reboot the SNs.
After the SNs come back up, make sure that the admin GPFS cluster is running and NFS has started correctly.
Make sure /etc/exports on service nodes do not contain any old entries. If so, remove, and run:
_**exportfs -ua**_
When using a shared file system across the SNs you must run the mkdsklsnode command on the backup SNs first and then run it for the primary SNs.
This is necessary since there are some install-related files that are server-specific. The server that is configured last is the one the node will boot from first.
_**mkdsklsnode -V -S -b -i <osimage name> <noderange>**_
Use the -S flag to setup the NFSv4 replication settings on the SNs.
If you are using a dump resource you can specify the type of dump to be collected from the client. The values are "selective", "full", and "none". If the configdump attribute is set to "full" or "selective", the client will automatically be configured to dump to an iSCSI target device. A "selective" memory dump avoids dumping user data; a "full" memory dump dumps all the memory of the client partition. Selective and full memory dumps are stored in a subdirectory of the dump resource allocated to the client. This attribute is saved in the xCAT osimage definition.
For example:
_**mkdsklsnode -V -S -b -i <osimage name> <noderange> configdump=selective**_
To verify the setup on the SNs you could use xdsh to run the lsnim command on the SNs.
To check for the resource and node definitions you could run:
_**xdsh <SN name> "lsnim"**_
To get the details of a NIM client definition you could run:
_**xdsh <SN name> "lsnim -l <nim client name>"**_
To set up the primary service nodes run the same command you just ran on the backup SNs only use the "-p" option instead of the "-b" option.
_**mkdsklsnode -V -S -p -i <osimage name> <noderange>**_
Verify the NFSv4 replication is exported correctly for your service node pairs:
_**xdsh service cat /etc/exports | xcoll**_
====================================
c250f10c12ap01
====================================
/install -replicas=/install@20.10.12.1:/install@20.10.12.17,vers=4,rw,noauto,root=*
====================================
c250f10c12ap17
====================================
/install -replicas=/install@20.10.12.17:/install@20.10.12.1,vers=4,rw,noauto,root=*
_**rbootseq compute hfi**_
_**rpower compute on**_
If you specified a dump resource you can check if the primary dump device has been set on the node by running:
_**xdsh <node> "sysdumpdev"**_
Verify that NFSv4 replication is configured correctly on a compute node:
_**xdsh <node> nfs4cl showfs**_
_**xdsh <node> nfs4cl showfs /usr**_
_simple test of NFSv4 client replication failover:_
s1 = service node 1
s2 = service node 2
c1 = all compute nodes managed by s1, backup s2
c2 = all compute nodes managed by s2, backup s1
xdsh c1,c2 nfs4cl showfs | xcoll
should show c1 filesystems served by s1 and c2 filesystems served by s2
xdsh s1 stopsrc -s nfsd
xdsh c1,c2 ls /usr | xcoll
xdsh c1,c2 nfs4cl showfs | xcoll
should show all nodes getting /usr from s2 now (depending on NFS caching, it may take additional activity on the c1 nodes to have all filesystems fail over to s2)
_**TESTING NOTE**: At this point, you can restart NFS on s1. You can continue testing by shutting down NFS on s2 and watching all nodes fail over to s1. Once NFS is back up on both service nodes, over time, the clients should eventually switch back to using their primary server._
The nodes will continue running after the primary service node goes down; however, you should move the nodes to the backup SN as soon as possible.
Use the xCAT snmove command to move a set of nodes to the backup service node.
Note: Since we have already run mkdsklsnode on the backup SN we know that the NIM resources have been defined and nodes initialized.
In the case where a primary SN fails you can run the snmove command with the node group you created for this SN. For example, if the name of the node group is "SN27group" then you could run the following command:
_**snmove -V SN27group**_
You can also specify scripts to be run on the nodes by using the "-P" option.
_**snmove -V SN27group -P myscript**_
(Make sure the script has been added to the /install/postscripts directory and that it has the correct permissions.)
The snmove command performs several steps to both keep the nodes running and to prepare the backup service node for the next time the nodes need to be booted.
This includes the following:
You can verify some of these steps by running the following commands.
Check if the node definitions have been modified:
lsdef <noderange>
Check the primary dump device on the nodes.
xdsh <noderange> "/bin/sysdumpdev"
Make sure the primary dump device has been reset.
Check the default gateway.
xdsh <noderange> "/bin/netstat -rn"
Check the contents of the /etc/xcatinfo file.
xdsh <noderange> "/bin/cat /etc/xcatinfo"
See if the server is the name of the new SN.
The nodes should continue running after the primary SN goes down; however, it is advisable to reboot the nodes as soon as possible.
When the nodes are rebooted they will automatically boot from the new SN.
_**xdsh compute "shutdown -F &"**_
_**rpower compute on**_
The process for switching nodes back will depend on what must be done to recover the original service node. Essentially the SN must have all the NIM resources and definitions restored and operations completed before you can use it.
If you are using the xCAT statelite support then you must make sure you have the latest files and directories copied over and that you make any necessary changes to the statelite and/or litetree tables.
If all the configuration is still intact you can simply use the snmove command to switch the nodes back.
If the configuration must be restored then you will have to run the mkdsklsnode command. This command will reconfigure the SN using the common osimages defined on the xCAT management node.
Remember that this SN would now be considered the backup SN, so when you run mkdsklsnode you need to use the "-b" option.
Once the SN is ready you can run the snmove command to switch the node definitions to point to it. For example, to move all the nodes in the "SN27group" back to the original SN you could run the following command.
_**snmove -V SN27group**_
The next time you reboot the nodes they will boot from the original SN.
Because the TEAL GPFS monitoring must run in the application GPFS cluster, we must move the teal.gpfs-sn component (will we change this name?) off the service node to a utility node that is in the compute node cluster. Since the TEAL package requires access to the database server, we will also be installing and configuring a new db2driver package that provides a minimal DB2 client capable of running the required ODBC interface on a diskless node.
You will have to obtain the db2driver code and the required level of the teal.gpfs-sn code from IBM that supports this function. The db2driver code is available at the following location on fix central and is available to anyone holding the HPC DB2 license.
http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=ibm/Information+Management&product=ibm/Information+Management/IBM+Data+Server+Client+Packages&release=9.7.&platform=All&function=fixId&fixids=-dsdriver-*FP005&includeSupersedes=0
The following DB2 driver software package was tested and works with the DB2 9.7.4 or 9.7.5 WSER Server code.
v9.7fp5_aix64_dsdriver.tar.gz
You will need to obtain the appropriate level of TEAL. Only the teal.gpfs-sn lpp is required on the node.
We will configure the Data Server Client in the /db2client directory on the EMS machine. We will use this setup to update the image for the utility node that will run it.
*** Install unzip, if not already available**
Note the Data Server Client code requires unzip. Make sure it is available before continuing:
AIX:
Get unzip from Linux Toolbox, if not already available.
rpm -i unzip-5.51-1.aix5.1.ppc.rpm (for diskfull nodes; for AIX diskless nodes, add unzip to the statelite image).
*** Extract the Data Server Client Code on the EMS**
mkdir /db2client
cd /db2client
cp ..../v9.7fp5_aix64_dsdriver.tar.gz .
gunzip v9.7fp5_aix64_dsdriver.tar.gz
tar -xvf v9.7fp5_aix64_dsdriver.tar
Set the path to the Data Server Client code. You should add these to your .profile on AIX. (Linux TBD).
export PATH=/db2client/dsdriver/bin:$PATH
export LIBPATH=/db2client/dsdriver/lib:$LIBPATH
*** Install the Driver**
This script will only automatically set up the 64-bit driver. We must manually extract the 32-bit driver.
cd /db2client/dsdriver
./installDSdriver
cd odbc_cli_driver
cd *32
uncompress *.tar.Z
tar -xvf *.tar
Note: the package we downloaded had sub-directories that were not owned by the bin user and bin group. To be sure, do the following:
cd /db2client
chown -R bin *
chgrp -R bin *
*** Create the shared library on the 32-bit path (AIX)**
cd /db2client/dsdriver/odbc_cli_driver/aix32/clidriver/lib
ar -x libdb2.a
mv shr.o libdb2.so
The DB2 Data Server Client has several configuration files that must be set up.
The db2dsdriver.cfg configuration file contains database directory information and client configuration parameters in a human-readable format.
It is an XML file based on the db2dsdriver.xsd schema definition file. It contains various keywords and values that can be used to enable features for a supported database through ODBC, CLI, .NET, OLE DB, PHP, or Ruby applications. The keywords can be applied globally to all database connections, or associated with a specific data source name (DSN) or database connection.
cd /db2client/dsdriver/cfg
cp db2dsdriver.cfg.sample db2dsdriver.cfg
chmod 755 db2dsdriver.cfg
vi db2dsdriver.cfg
Here is a sample setup for a node accessing the xcatdb database on the Management Node p7saixmn1.p7sim.com
<configuration>
<dsncollection>
<dsn alias="xcatdb" name="xcatdb" host="p7saixmn1.p7sim.com" port="50001"/>
</dsncollection>
<databases>
<database name="xcatdb" host="p7saixmn1.p7sim.com" port="50001">
</database>
</databases>
</configuration>
The CLI/ODBC initialization file (db2cli.ini) contains various keywords and values that can be used to configure the behavior of CLI and the applications using it.
The keywords are associated with the database alias name, and affect all CLI and ODBC applications that access the database.
cd /db2client/dsdriver/cfg
cp db2cli.ini.sample db2cli.ini
chmod 0600 db2cli.ini
Here is a sample db2cli.ini file containing the information needed to access the xcatdb database, using instance xcatdb and password cluster. Note this file should only be readable by root.
[xcatdb]
uid=xcatdb
pwd=cluster
For 32 bit, copy the /db2client/dsdriver/cfg files to /db2client/dsdriver/odbc_cli_driver/aix32/clidriver/cfg
cd /db2client/dsdriver/cfg
cp db2cli.ini /db2client/dsdriver/odbc_cli_driver/aix32/clidriver/cfg
cp db2dsdriver.cfg /db2client/dsdriver/odbc_cli_driver/aix32/clidriver/cfg
The unixODBC files are still needed. The following are sample configurations:
cat /etc/odbc.ini
[xcatdb]
Driver = DB2
DATABASE = xcatdb
cat /etc/odbcinst.ini
[DB2]
Description = DB2 Driver
Driver = /db2client/dsdriver/odbc_cli_driver/aix32/clidriver/lib/libdb2.so
FileUsage = 1
DontDLClose = 1
Threading = 0
We have created a new diskless image for the teal.gpfs-sn node. Here is a sample bundle file:
# sample bundle file for teal-gpfs utility node
I:rpm.rte
I:openssl.base
I:openssl.license
I:openssh.base
I:openssh.man.en_US
I:openssh.msg.en_US
I:gpfs.base
I:gpfs.gnr
I:gpfs.msg.en_US
I:rsct.core.sensorrm
I:teal.gpfs-sn
# RPMs
R:popt*
R:rsync*
# using Perl 5.10.1
R:perl-Net_SSLeay.pm-1.30-3*
R:perl-IO-Socket-SSL*
R:unixODBC*
# unzip is optional here, since we are setting up the db2driver on the EMS
R:unzip*
With this additional bundle file, build the diskless image for the teal-gpfs utility node.
Copy the /db2client directory into the image.
Copy /etc/odbc.ini into the image.
Copy /etc/odbcinst.ini into the image.
Copy db2cli.ini into the image.
Copy /etc/xcat/cfgloc into the image.
Configure IP forwarding so that the utility node can access the DB2 server (EMS).
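One common way to enable this on AIX (an assumption about your network layout; the `no` tunable itself is standard AIX) is to turn on IP forwarding on the node that routes traffic between the utility node and the EMS:

```shell
# Assumption: run on the node routing between the utility node and the EMS.
# -p sets the "no" tunable now and makes it persistent across reboots.
no -p -o ipforwarding=1
```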