
This document assumes that you have already purchased your LoadLeveler product, have the install packages available, and are familiar with the LoadLeveler documentation in the Tivoli Workload Scheduler LoadLeveler library.
These instructions are based on LoadLeveler 4.1 and 5.1. If you are using a different version of LoadLeveler, you may need to make adjustments to the information provided here.
When installing LoadLeveler in an xCAT cluster, it is assumed that you will be using the xCAT MySQL or DB2 database to store your LoadLeveler configuration data. Different versions of LoadLeveler support different operating systems, hardware architectures, and databases. Refer to the LoadLeveler documentation for the support required for your cluster. For example, Power 775 requires LoadLeveler 5.1 on AIX 7.1 or Red Hat Enterprise Linux 6 with DB2.
Before proceeding with these instructions, you should have the following already completed for your xCAT cluster:
LoadLeveler requires that userids be common across all nodes in a LoadLeveler cluster, and that the user home directories are shared. There are many ways to handle user management and to set up a cluster-wide shared home directory (for example, using NFS or a global filesystem such as GPFS). These instructions assume that the shared home directory has already been created and mounted across the cluster, and that the xCAT management node and all xCAT service nodes are also using this directory. You may wish to have xCAT invoke your custom postbootscripts on nodes to help set this up.
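As one illustrative sketch of such a postbootscript (the server name mgtnode, the export path, and the direct edit of /etc/fstab are all assumptions for your site, not part of any xCAT-shipped script), an NFS-shared /home could be mounted like this:

```shell
#!/bin/sh
# Hypothetical postbootscript: mount a shared NFS home directory on a node.
# NFS_SERVER, EXPORT, MOUNTPOINT, and FSTAB are placeholders -- adjust all
# of them for your site before use.
NFS_SERVER=${NFS_SERVER:-mgtnode}
EXPORT=${EXPORT:-/home}
MOUNTPOINT=${MOUNTPOINT:-/home}
FSTAB=${FSTAB:-/etc/fstab}

# Add an fstab entry only if one is not already present, then mount it.
if ! grep -q "^${NFS_SERVER}:${EXPORT} " "$FSTAB" 2>/dev/null; then
    echo "${NFS_SERVER}:${EXPORT} ${MOUNTPOINT} nfs defaults 0 0" >> "$FSTAB"
fi
mount "${MOUNTPOINT}"
```

A script like this would go in /install/postscripts and be added to your nodes' postbootscripts list with chdef, following the same pattern as the other postscripts in this document.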
Note: For Linux statelite or stateless clusters, a problem exists when updating the LoadLeveler 4.1 rpms (e.g., PTF6 to PTF7). Currently, a new license rpm is shipped and must be installed and accepted before the other LL rpms will install correctly. This will be fixed in a future LL PTF so that customers only need to accept the license when installing the base LL rpms. Once fixed, no LL license rpm will be updated when LL is updated, and customers will not need to accept the license a second time. Until a fix is available, a workaround has been posted to [Updating_IBM_HPC_product_software].
Copy the LoadLeveler packages from your distribution media onto the xCAT management node (MN). Suggested target location to put the packages on the xCAT MN:
/install/post/otherpkgs/<os>/<arch>/loadl
For rhels6 ppc64, the target location is:
/install/post/otherpkgs/rhels6/ppc64/loadl
Note: LoadLeveler 4.1.1 on Linux requires a special Java rpm to run its license acceptance script. This is not required for LoadLeveler 5.1. The correct version of this rpm is identified in the LoadLeveler product documentation (at the time of this writing, the rpm was IBMJava2-142-ppc64-JRE-1.4.2-5.0.ppc64.rpm, but please verify with the LL documentation). Ensure the Java rpm is included in the loadl otherpkgs directory.
For Linux, you should create repodata in your /install/post/otherpkgs/<os>/<arch>/* directory so that yum or zypper can be used to install these packages and automatically resolve dependencies for you:
createrepo /install/post/otherpkgs/<os>/<arch>/loadl
If the createrepo command is not found, you may need to install the createrepo rpm package that is shipped with your Linux OS. For SLES 11, this is found on the SDK media.
Following the LoadLeveler Installation Guide, create the loadl group and userid:
On Linux:
groupadd loadl
useradd -G loadl loadl
For rhels6 ppc64:
groupadd loadl
useradd -g loadl loadl
On AIX:
mkgroup -a loadl
mkuser pgrp=loadl groups=loadl home=/<user_home>/loadl loadl
Commonly, <user_home> is the home directory (for example, /home); the administrator can change it as needed when creating the loadl group and userid. It is assumed that you have already created a common home directory in the cluster for all users, either in NFS, GPFS, or some other shared filesystem. LoadLeveler requires that its administrative userid have either rsh or ssh access across all nodes in the cluster and to the LL central manager. Make sure you have set this up for the loadl userid. For example, to create a .rhosts file (as root):
nodels compute > /<user_home>/loadl/.rhosts
echo "<MN hostname>" >> /<user_home>/loadl/.rhosts
chown loadl:loadl /<user_home>/loadl/.rhosts
Or, if you are using ssh for LoadLeveler communications, follow your ssh documentation to set up .ssh keys for the userid.
Note: xCAT does not provide any general function for just setting up a user's ssh keys. However, if the user will also be running xCAT xdsh and other commands, the xCAT wiki page on [Granting_Users_xCAT_privileges] includes instructions on how to provide the user with this access, including automatically setting up ssh keys for that user.
If the user will not be authorized to run xCAT commands, you can still "cheat" and take advantage of a side-effect of the xdsh command to set up your ssh keys:
su - <userid>
/opt/xcat/bin/xdsh xxx -K ## "xxx" can be any string
xdsh will prompt you for the user's password. Enter the correct password, and then xdsh will fail with:
Error: Permission denied for request
Even though the xdsh command failed, it still created a /<user_home>/<userid>/.ssh directory with ssh keys. Create an authorized_keys file for the user:
cat /<user_home>/<userid>/.ssh/id_rsa.pub >> /<user_home>/<userid>/.ssh/authorized_keys
Since the home directory is shared across the cluster, the userid now has non-password prompted ssh access to all nodes and to the xCAT management node.
Sync the loadl group and userid to all nodes in the cluster:
See the step below on "(Optional) Synchronize system configuration files" for more details.
The role of the central manager is to examine each job's requirements and find one or more machines in the LoadLeveler cluster that will run the job. Once it finds the machine(s), it notifies the scheduling machine. To set up the LoadLeveler Central Manager, install LoadLeveler on the node that will act as the Central Manager. When setting up LoadLeveler in an xCAT non-hierarchical cluster, it is recommended to use the xCAT management node as the LoadLeveler Central Manager. When setting up LoadLeveler in an xCAT hierarchical cluster, it is recommended to use one of your xCAT service nodes. If you have a different LoadLeveler Central Manager configuration, see the LoadLeveler documentation for more information.
To use the LoadLeveler database configuration option with the xCAT database, you will need to install LoadLeveler on your xCAT management node. You may also choose to configure your management node or service nodes as your LL central manager and resource manager. Following the LoadLeveler Installation Guide for details, install LoadLeveler on your xCAT management node. These are the high-level steps:
On Linux:
Make sure the following packages are installed on your management node:
compat-libstdc++-33.ppc64
libXmu.ppc64
libXtst.ppc64
libXp.ppc64
libXScrnSaver.ppc64
cd /install/post/otherpkgs/<os>/<arch>/loadl
IBM_LOADL_LICENSE_ACCEPT=yes rpm -Uvh ./LoadL-full-license*.rpm
rpm -Uvh ./LoadL-scheduler-full*.rpm ./LoadL-resmgr-full*.rpm
On AIX:
cd /install/post/otherpkgs/aix/ppc64/loadl
inutoc .
installp -Y -X -d . all
installp -X -B -d . all
You may choose to install LoadLeveler on your xCAT service node and configure it as your LL central manager or resource manager. Follow Setting_up_LoadLeveler_in_a_Stateful_Cluster/#linux to install LoadLeveler on your xCAT Linux service node. Follow Setting_up_LoadLeveler_in_a_Stateful_Cluster/#AIX to install LoadLeveler on your xCAT AIX service node.
Note: Installing LoadLeveler on xCAT service nodes follows the same process as installing LoadLeveler on compute nodes, with only the differences below:
Python
PyODBC
For example:
cp /opt/xcat/share/xcat/IBMhpc/loadl/loadl.bnd /install/nim/installp_bundle/loadl-sn.bnd
Make sure the LoadL.scheduler packages exist in /install/post/otherpkgs/aix/ppc64/loadl directory
Add a new line in /install/nim/installp_bundle/loadl-sn.bnd
I:LoadL.scheduler
nim -o define -t installp_bundle -a server=master -a location=/install/nim/installp_bundle/loadl-sn.bnd loadl-sn
chdef -t osimage -o <image_name> -p installp_bundle="IBMhpc_base,loadl-sn"
Make your own copy of /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install, then rename and edit it.
cp -p /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install /install/postscripts/loadl_install-sn
Modify /install/postscripts/loadl_install-sn, and change the "aix_loadl_bin" as:
aix_loadl_bin=/usr/lpp/LoadL/full/bin
For example:
cp -p /opt/xcat/share/xcat/IBMhpc/loadl/loadl-5103.otherpkgs.pkglist /opt/xcat/share/xcat/IBMhpc/loadl/loadl-5103-sn.otherpkgs.pkglist
#ENV:IBM_LOADL_LICENSE_ACCEPT=yes#
loadl/LoadL-full-license*
#loadl/LoadL-scheduler-full*
loadl/LoadL-resmgr-full*
To:
#ENV:IBM_LOADL_LICENSE_ACCEPT=yes#
loadl/LoadL-full-license*
loadl/LoadL-scheduler-full*
loadl/LoadL-resmgr-full*
Note: By default, this assumes LoadLeveler 5.1.0.3 or later is being installed. If you wish to install LoadLeveler 5.1.0.2 or earlier, make your own copy of /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install, then rename and edit it.
For example:
cp -p /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install /install/postscripts/loadl_install-sn
Modify /install/postscripts/loadl_install-sn and change the following three lines to:
#linux_loadl_license_script="/opt/ibmll/LoadL/sbin/install_ll -c resmgr"
linux_loadl_license_script="/opt/ibmll/LoadL/sbin/install_ll"
linux_loadl_bin=/opt/ibmll/LoadL/full/bin
After your xCAT management node and service nodes are installed with the LoadLeveler resmgr and scheduler packages, you can start to configure the xCAT management node or an xCAT service node as the LoadLeveler Central Manager and Resource Manager.
Generally, any LoadLeveler node can be specified as the LoadLeveler Central Manager and Resource Manager given that it has the LoadLeveler resmgr and scheduler packages installed, has remote access to the database when using the database configuration option, and has network connectivity to all of the nodes that will be part of your LoadLeveler cluster. For an xCAT HPC cluster without hierarchy, it is recommended that you set up your xCAT management node as the LoadLeveler Central Manager. For an xCAT HPC hierarchical cluster, it is recommended that you set up one of your xCAT service nodes as the LoadLeveler Central Manager. If you are setting up a service node as your central manager in an xCAT hierarchical cluster, you may need to also set up network routing so that the xCAT management node can communicate to the service node using the interface defined as the central manager. This may not be the same interface that the xCAT management node uses to communicate to the service node if they are on different networks. Follow the instructions in [Setup_IP_routes_between_the_MN_and_the_CNs].
To specify the LoadLeveler Central Manager, you can use the llinit command if you are using file-based configuration:
llinit -cm <central manager host>
OR edit LoadL_config configuration file:
CENTRAL_MANAGER_LIST = <list of central manager and alt central managers>
OR, if planning to use the database configuration option, edit the ll-config stanza in the cluster configuration file:
CENTRAL_MANAGER_LIST = <list of central manager and alt central managers>
OR if you have already set up and are running the LoadLeveler database configuration option (see instructions below), change the attribute directly in the database:
llconfig -c CENTRAL_MANAGER_LIST=<central manager host>
Unless otherwise specified, LoadLeveler will use the central manager as the resource manager.
LoadLeveler provides the option to use configuration data from files or from a database. When setting up LoadLeveler in an xCAT HPC cluster, it is recommended that you use the database configuration option. This will use the xCAT MySQL or DB2 database. The DB2 database is required for Power 775 clusters. The hints and tips provided here will allow you to use xCAT to help set up default LoadLeveler database support. However, be sure that you have read through all of the LoadLeveler documentation for this support and understand what needs to be done to set it up. You will need to make modifications to the processes outlined here to take advantage of advanced LoadLeveler features and to set this up correctly for your environment.
All LoadLeveler nodes that will access the database must have access to the database server, and must have ODBC installed and configured correctly. When setting up LoadLeveler in an xCAT non-hierarchical cluster, it is recommended that you set up your xCAT management node as the LoadLeveler DB access node. When setting up LoadLeveler in an xCAT hierarchical cluster, it is recommended that you set up your xCAT service nodes as the LoadLeveler DB access nodes. Each LoadLeveler DB access node will serve its xCAT compute nodes as LoadLeveler "remotely configured nodes". The xCAT service nodes already have database access granted.
If you are running xCAT with the MySQL database, you will need to set up LoadLeveler to use this same database. Note that MySQL is not supported on Power 775 clusters. You must use DB2.
If you do not already have xCAT running with MySQL, follow the instructions in
[Setting_Up_MySQL_as_the_xCAT_DB] to convert your xCAT database to MySQL on xCAT management node. After that, follow the instructions below to set up ODBC for LoadLeveler DB access.
After your xCAT cluster is running with MySQL, to add LoadLeveler DB access on your xCAT management node and configure the MySQL database for use with LoadLeveler, follow the instructions in Setting_Up_MySQL_as_the_xCAT_DB/#add-odbc-support to set up ODBC support. We list only the basic steps here; they are fully documented in the xCAT documentation:
Note: As of Oct 2010, the AIX deps package will automatically install the perl-DBD-MySQL , and unixODBC-* when installed on the Management or Service Nodes. On Redhat/Fedora and on SLES, MySQL comes as part of the OS. You may find these already installed.
cd <your xCAT-mysql rpm directory>
rpm -Uvh unixODBC-*
rpm -Uvh mysql-connector-odbc-*
With xCAT 2.6 and newer, run the command
mysqlsetup -o -L
This will set up /etc/odbcinst.ini, /etc/odbc.ini, and .odbc.ini in the root home directory and set the MySQL log-bin-trust-function-creators variable to on.
With xCAT 2.5 and older, run the command
mysqlsetup -o
and manually set the MySQL log-bin-trust-function-creators variable to ON using the MySQL interactive command:
mysql -u root -p
<enter password when prompted>
SET GLOBAL log_bin_trust_function_creators=1;
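To confirm the variable is now set, you can run a quick check from the shell (this requires a running MySQL server; you will be prompted for the MySQL root password):

```shell
# SHOW VARIABLES is standard MySQL; the value should be reported as ON.
mysql -u root -p -e "SHOW VARIABLES LIKE 'log_bin_trust_function_creators';"
```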
# On Linux:
cp /root/.odbc.ini /<user_home>/loadl
chown loadl:loadl /<user_home>/loadl/.odbc.ini
# On AIX:
cp /.odbc.ini /<user_home>/loadl
chown loadl:loadl /<user_home>/loadl/.odbc.ini
You can verify this access:
su - loadl
# On Linux:
/usr/bin/isql -v xcatdb
# On AIX:
/usr/local/bin/isql -v xcatdb
After your xCAT cluster is running with MySQL, to configure LoadLeveler DB access on xCAT service nodes, and configure the MySQL database for use with LoadLeveler, follow the instructions in Setting_Up_MySQL_as_the_xCAT_DB/#add-odbc-support - section "Setup the ODBC on the Service Node" to set up ODBC support. The basic steps are:
Note: As of Oct 2010, the AIX deps package will automatically install the perl-DBD-MySQL , and unixODBC-* when installed on the Management or Service Nodes. On Redhat/Fedora and on SLES, MySQL comes as part of the OS. With xCAT 2.6, the sample service package lists shipped with xCAT contain the ODBC rpms. You may find these already installed.
To include the rpms and ODBC files in the service node image, first add the rpms to the service node package list:
On Linux, add the rpms to the otherpkgs.pkglist file:
vi /install/custom/install/<ostype>/<service-profile>.otherpkgs.pkglist
# add the following entries:
unixODBC
mysql-connector-odbc
On AIX, add the rpms to the bundle file (assuming this bundle file is already defined to NIM and included in your xCAT osimage definition):
vi /install/nim/installp_bundle/xCATaixSN<version>.bnd
# add the following entries:
I:X11.base.lib
R:mysql-connector-odbc-*
For AIX61, the bundle file is /install/nim/installp_bundle/xCATaixSN61.bnd; For AIX71, the bundle file is /install/nim/installp_bundle/xCATaixSN71.bnd
With xCAT 2.6, xCAT provides an odbcsetup postbootscript. Add this to the list of postscripts run on your servicenode to create the required ODBC files:
chdef service -p postbootscripts=odbcsetup
With xCAT 2.5 and older, you will need to add the ODBC files to the synclist for your service node image:
vi /install/custom/install/<ostype>/<service-profile>.synclist
#add the following entries:
/etc/odbcinst.ini /etc/odbc.ini -> /etc/
# On Linux:
/root/.odbc.ini -> /root/
# On AIX:
/.odbc.ini -> /
and if you don't already have a synclist defined for your image:
chdef -t osimage -o <service node image> -p synclists=/install/custom/install/<ostype>/<service-profile>.synclist
If your service nodes are actively running, push out the changes now:
For xCAT 2.6 and newer:
updatenode -P odbcsetup
For xCAT 2.5 and older:
updatenode service -S
updatenode service -F
(These need to be run as two separate commands since the files need to get pushed out AFTER the packages are installed).
If you are running xCAT with the DB2 database, you will need to set up LoadLeveler to use this same database. If you do not already have xCAT running with DB2, follow the instructions in [Setting_Up_DB2_as_the_xCAT_DB] to convert your xCAT database to DB2 on xCAT management node. After that, follow the instruction below to set up ODBC for LoadLeveler DB access.
After your xCAT cluster is running with DB2, to configure LoadLeveler DB access on xCAT management node, and configure the DB2 database for use with LoadLeveler, follow the instructions in Setting_Up_DB2_as_the_xCAT_DB/#adding-odbc-support - section "Setup the ODBC on the Management Node" to set up ODBC support.
After your xCAT cluster is running with DB2, to configure LoadLeveler DB access on xCAT service nodes, and configure the DB2 database for use with LoadLeveler, follow the instructions in Setting_Up_DB2_as_the_xCAT_DB/#adding-odbc-support - section "Setup ODBC on the Service Nodes" to set up ODBC support. Also read the section on automatic setup of DB2 on the Service Nodes during install: Setting_Up_DB2_as_the_xCAT_DB/#setting-up-the-db2-client-on-the-service-nodes.
After your xCAT cluster is running with the MySQL or DB2 database, and your xCAT management node or service nodes are set up with ODBC support for LoadLeveler DB access following the instruction above, you can start to configure the xCAT management node or xCAT service nodes as the LoadLeveler DB access nodes.
Generally, any LoadLeveler node that has access to the database can be specified as a LoadLeveler DB access node. In an xCAT non-hierarchical cluster, it is recommended that you set up your xCAT management node as the LoadLeveler DB access node; in an xCAT hierarchical cluster, set up your xCAT service nodes as the LoadLeveler DB access nodes. If you have a different LoadLeveler DB access node configuration, please see the LoadLeveler documentation for more information.
Modify /etc/LoadL.cfg master configuration file on the xCAT management node or xCAT service nodes to add a line:
LoadLDB = xcatdb
Follow the LoadLeveler instructions to perform the necessary steps to initialize and configure your cluster using the database configuration option. This includes things like properly editing your /etc/LoadL.cfg master configuration file, determining your LoadLeveler configuration information, and running the llconfig -i command to initialize the database.
Note: By default, the xCAT HPC Integration support will only install the LoadLeveler resmgr rpm on the compute nodes in your cluster. Both the LoadLeveler resmgr and scheduler rpms are installed on your xCAT management node or your xCAT service nodes, so when you run llinit on your xCAT management node or your xCAT service nodes, it will configure the default LoadL_admin and LoadL_config files to reference these. You will need to modify the BIN and NEGOTIATOR values in the LoadL_config file to work correctly for all the compute nodes in your cluster. If you are running with the LoadLeveler database configuration option, use the llconfig -c command to change these values:
On Linux:
BIN = /opt/ibmll/LoadL/resmgr/full/bin/
NEGOTIATOR=/opt/ibmll/LoadL/scheduler/full/bin/LoadL_negotiator
On AIX:
BIN = /usr/lpp/LoadL/resmgr/full/bin/
NEGOTIATOR=/usr/lpp/LoadL/scheduler/full/bin/LoadL_negotiator
After you have made any needed updates, initialize the LoadLeveler database configuration:
llconfig -i -t <your cluster config file> -f <LoadL_config file you have edited>
Note: After llconfig -i is executed on the xCAT management node to initialize the LL database, it creates /install/postscripts/llserver.sh and /install/postscripts/llcompute.sh postscripts and the /install/postscripts/LoadL directory with files used by these scripts. You can use these postscripts via the xCAT postscript process to configure the selected nodes as the LoadLeveler database server nodes or compute nodes when running LoadLeveler with the database option. Please see the descriptions in llserver.sh and llcompute.sh for details.
To configure the LoadLeveler database server nodes, specify the llserver.sh as xCAT postbootscripts. For example:
To run during service node installs:
chdef <loadl db servers> -p postbootscripts=llserver.sh
To run immediately on a service node that is already installed:
updatenode <loadl db servers> -P llserver.sh
To configure the LoadLeveler compute nodes, specify the llcompute.sh as xCAT postbootscripts. For example:
To run during compute node installs or diskless boot:
chdef <loadl compute nodes> -p postbootscripts=llcompute.sh
To run immediately on a compute node that is already installed:
updatenode <loadl compute nodes> -P llcompute.sh
Note: When using service nodes make sure the postscripts and the LoadL subdirectory are copied to the /install/postscripts directories on each service node before the updatenode command is issued.
For example:
xdcp service -v -R /install/postscripts/* /install/postscripts
To continue to set up LoadLeveler in a Linux statelite or stateless cluster, follow these steps:
Include LoadLeveler in your diskless image:
Install the optional xCAT-IBMhpc rpm on your xCAT management node. This rpm is available with xCAT and should already exist in the zypper or yum repository that you used to install xCAT on your management node. A new copy can be downloaded from: Download xCAT.
To install the rpm in SLES:
zypper refresh
zypper install xCAT-IBMhpc
To install the rpm in Redhat:
yum install xCAT-IBMhpc
Add to pkglist: Edit your /install/custom/netboot/<ostype>/<profile>.pkglist and add the base IBMhpc pkglist:
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/IBMhpc.sles11.ppc64.pkglist#
For rhels6 ppc64, edit the /install/custom/netboot/rh/compute.pkglist,
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/IBMhpc.rhels6.ppc64.pkglist#
Verify that the above sample pkglist contains the correct packages. If you need to make changes, you can copy the contents of the file into your <profile>.pkglist and edit as you wish instead of using the #INCLUDE: ...# entry.
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/loadl/loadl-5103.otherpkgs.pkglist#
Note: If you are using LoadLeveler 5.1.0.2 or below, please use pkglist /opt/xcat/share/xcat/IBMhpc/loadl/loadl.otherpkgs.pkglist
Verify that the above sample pkglist contains the correct LoadLeveler packages. If you need to make changes, you can copy the contents of the file into your <profile>.otherpkgs.pkglist and edit as you wish instead of using the #INCLUDE: ...# entry.
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/IBMhpc.<osver>.<arch>.exlist#
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/loadl/loadl.exlist#
For rhels6 ppc64, edit the /install/custom/netboot/rh/compute.exlist
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/IBMhpc.rhels6.ppc64.exlist#
#INCLUDE:/opt/xcat/share/xcat/IBMhpc/loadl/loadl.exlist#
Verify that the above sample exclude lists contain the files and directories you want deleted from your diskless image. If you need to make changes, you can copy the contents of the file into your <profile>.exlist and edit as you wish instead of using the #INCLUDE: ...# entry.
Note: Several of the exclude list files shipped with xCAT-IBMhpc re-include files (with "+directory" syntax) that are normally deleted with the base exclude lists xCAT ships in /opt/xcat/share/xcat/netboot/<os>/compute.*.exlist. Keeping these files in the diskless image is required for the install and functionality of some of the HPC products.
If you are building a statelite image, refer to the xCAT documentation for statelite images for creating persistent files, identifying mount points, and configuring your xCAT cluster for working with statelite images. For your LoadLeveler support, add writable and persistent directories/files required by LoadLeveler to your litefile table in the xCAT database:
tabedit litefile
<in a separate window> copy the contents of /opt/xcat/share/xcat/IBMhpc/loadl/litefile.csv
paste into your tabedit session, modify as needed for your environment, and save
When using persistent files, you should also make sure that you have an entry in your xCAT database statelite table pointing to the location for storing those files for each node.
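For example, to point a nodegroup at a persistent-state location (the server name and path here are placeholders; substitute your own):

```shell
# statemnt in the statelite table names the server:/directory that stores
# each node's persistent files. "mgtnode:/nodedata" is only an example value.
chtab node=compute statelite.statemnt="mgtnode:/nodedata"
```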
LoadLeveler requires that the directories specified in the LOG, EXECUTE, and SPOOL configuration keywords be writable and persistent. If you are using GPFS filesystems, the preferred approach is to put these directories in a GPFS filesystem using $(host). For instance, the LOG directory can be specified as LOG = /LL/$(host)/log.
If you are not using GPFS filesystems, you will need to mount an NFS writeable directory for each node or include these directories in your litefile table to have xCAT manage the persistence. For detailed information about these LoadLeveler configuration keywords see TWS LoadLeveler: Using and Administering.
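Putting the GPFS suggestion above together, the three keywords might look like this in your LoadLeveler configuration (the /LL filesystem name is illustrative, not a required path):

```
LOG     = /LL/$(host)/log
EXECUTE = /LL/$(host)/execute
SPOOL   = /LL/$(host)/spool
```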
Included in this list is an entry for /var/loadl which is the default location for the LoadLeveler log files. This directory is also referenced by the /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install script. If you change this location, make sure to change it both in the litefile table and in the loadl_install script.
Included in this list is an entry for the /home directory. Depending on how you are managing your shared home directory for the cluster, you may need to implement a postbootscript that mounts the correct shared home directory on the node onto /.statelite/tmpfs/home.
Add to postinstall scripts:
Edit your /install/custom/netboot/<ostype>/<profile>.postinstall (make sure it has executable permission) and add:
/opt/xcat/share/xcat/IBMhpc/IBMhpc.sles.postinstall $1 $2 $3 $4 $5
installroot=$installroot loadldir=/install/post/otherpkgs/$osver/$arch/loadl NODESETSTATE=genimage /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install-5103
For rhels6 ppc64, edit the /install/custom/netboot/rh/compute.postinstall (make sure it has executable permission) and add:
/opt/xcat/share/xcat/IBMhpc/IBMhpc.rhel.postinstall $1 $2 $3 $4 $5
installroot=$1 loadldir=/install/post/otherpkgs/rhels6/ppc64/loadl NODESETSTATE=genimage /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install-5103
Note: If you are using LoadLeveler 5.1.0.2 or below, please instead use postinstall script /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install
Review these sample scripts carefully and make any changes required for your cluster. Note that some of these scripts may change tuning values and other system settings. They will be run by genimage after all of your rpms are installed into the image. First the basic IBMhpc script will be run to create filesystems, turn on services, and set some tunables. Then the script to accept the LoadLeveler license and install only the LoadL-resmgr-full rpm will be run. This script will also perform some configuration for using LoadLeveler in your xCAT cluster such as creating LoadLeveler directories in your diskless image and adding LoadLeveler paths to the default profile. Verify that these scripts will work correctly for your cluster. If you wish to make changes to one of these scripts, copy it to /install/postscripts and adjust the above entry in the postinstall script to invoke your updated copy.
(Optional) Synchronize system configuration files:
LoadLeveler requires that userids be common across the cluster. There are many tools and services available to manage userids and passwords across large numbers of nodes. One simple way is to use common /etc/password files across your cluster. You can do this using xCAT's syncfiles function. Create the following file:
vi /install/custom/netboot/<ostype>/<profile>.synclist
add the following line (modify as appropriate for the files you wish to synchronize):
/etc/hosts /etc/passwd /etc/group /etc/shadow -> /etc/
When packimage or liteimg is run, these files will be copied into the image. You can periodically re-sync these files as changes occur in your cluster. See the xCAT documentation for more details: [Sync-ing_Config_Files_to_Nodes]
Network boot your nodes:
To continue to set up LoadLeveler in an AIX diskless cluster, follow these steps:
Include LoadLeveler in your diskless image:
Install the optional xCAT-IBMhpc rpm on your xCAT management node. This rpm is available with xCAT and should already exist in the directory that you downloaded your xCAT rpms to. It did not get installed when you ran the instxcat script. A new copy can be downloaded from: Download xCAT.
To install the rpm:
cd <your xCAT rpm directory>
rpm -Uvh xCAT-IBMhpc*.rpm
If you skipped the previous optional step of installing LoadLeveler on your management node, copy the LoadLeveler product packages and PTFs from your distribution media onto the xCAT management node (MN). Suggested target location to put the packages on the xCAT MN:
/install/post/otherpkgs/aix/ppc64/loadl
The packages that will be installed by the xCAT HPC Integration support are listed in sample bundle files. Review the following file to verify you have all the product packages you wish to install (instructions are provided below for copying and editing this file if you choose to use a different list of packages):
/opt/xcat/share/xcat/IBMhpc/loadl/loadl.bnd
Add the LoadLeveler packages to the lpp_source used to build your diskless image:
inutoc /install/post/otherpkgs/aix/ppc64/loadl
nim -o update -a packages=all -a source=/install/post/otherpkgs/aix/ppc64/loadl <lpp_source_name>
Add additional base AIX packages to your lpp_source:
Some of the HPC products require additional AIX packages that may not be part of your default AIX lpp_source. Review the following file to verify all the AIX packages needed by the HPC products are included in your lpp_source (instructions are provided below for copying and editing this file if you choose to use a different list of packages):
/opt/xcat/share/xcat/IBMhpc/IBMhpc_base.bnd
To list the contents of your lpp_source, you can use:
nim -o showres <lpp_source_name>
And to add additional packages to your lpp_source, you can use the nim update command similar to above specifying your AIX distribution media and the AIX packages you need.
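For example, to pull additional filesets from AIX product media mounted at /dev/cd0 (the device and the fileset list are placeholders for your environment):

```shell
# Update an existing lpp_source with additional AIX filesets from media.
# Replace /dev/cd0 and the fileset list with your actual media and packages.
nim -o update -a packages="<fileset names>" -a source=/dev/cd0 <lpp_source_name>
```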
Create NIM bundle resources for base AIX prerequisites and for your LoadLeveler packages:
cp /opt/xcat/share/xcat/IBMhpc/IBMhpc_base.bnd /install/nim/installp_bundle
nim -o define -t installp_bundle -a server=master -a location=/install/nim/installp_bundle/IBMhpc_base.bnd IBMhpc_base
cp /opt/xcat/share/xcat/IBMhpc/loadl/loadl.bnd /install/nim/installp_bundle
nim -o define -t installp_bundle -a server=master -a location=/install/nim/installp_bundle/loadl.bnd loadl
Review these sample bundle files and make any changes as desired. Note that the loadl.bnd file will only install the LoadL.resmgr lpp. If you wish to install the full LoadLeveler product on all of your compute nodes, edit this bundle file, and make corresponding changes to the loadl_install postscript below.
chdef -t osimage -o <image_name> -p installp_bundle="IBMhpc_base,loadl"
Note: Verify that there are no nodes actively using the current diskless image. NIM will fail if there are any NIM machine definitions that have the SPOT for this image allocated. If there are active nodes accessing the image, you will either need to power them down and run rmdkslsnode for those nodes, or you will need to create a new image and then switch your nodes to that image later. For more information and detailed instructions on these options, see the xCAT document for updating software on AIX nodes: [Updating_AIX_Software_on_xCAT_Nodes]
mknimimage -u <image_name>
cp -p /opt/xcat/share/xcat/IBMhpc/IBMhpc.postscript /install/postscripts
cp -p /opt/xcat/share/xcat/IBMhpc/loadl/loadl_install-5103 /install/postscripts
chdef -t group -o <compute nodegroup> -p postscripts="IBMhpc.postscript,loadl_install-5103"
Review these sample scripts carefully and make any changes required for your cluster. Note that some of these scripts may change tuning values and other system settings. The scripts will be run on the node after it has booted as part of the xCAT diskless node postscript processing.
(Optional) Synchronize system configuration files:
LoadLeveler requires that userids be common across the cluster. There are many tools and services available to manage userids and passwords across large numbers of nodes. One simple way is to use common /etc/password files across your cluster. You can do this using xCAT's syncfiles function. Create the following file:
vi /install/custom/netboot/aix/<profile>.synclist
Add the following lines, modifying these entries based on the files you wish to synchronize:
/etc/hosts /etc/passwd /etc/group -> /etc/
/etc/security/passwd /etc/security/group /etc/security/limits /etc/security/roles -> /etc/security/
Add this syncfile to your image:
chdef -t osimage -o <imagename> synclists=/install/custom/netboot/aix/<profile>.synclist
Update the image:
mknimimage -u <imagename>
You can periodically re-sync these files to the nodes as changes occur in your cluster by running 'updatenode <noderange> -F'. See the xCAT documentation for more details: [Sync-ing_Config_Files_to_Nodes]
Follow the instructions in the xCAT AIX documentation [XCAT_AIX_Diskless_Nodes] to network boot your nodes:
Note: Before starting LoadLeveler on your cluster nodes, the HPC administrator should validate that the LoadL configuration files are properly set up. Refer to the LoadLeveler Installation Guide at http://publib.boulder.ibm.com/infocen.../llbooks.html, choosing the version of the Installation Guide that matches your operating system.
There are several ways you can start LoadLeveler on your cluster nodes.
Use the xCAT xdsh command to run the LoadLeveler llrctl start command individually on all nodes, or use LoadLeveler to distribute the commands by running "llrctl -g start". Note that for very large clusters, running the llrctl -g command to start the daemons on all the nodes in the cluster can take a long time since this is a serial operation from the LoadLeveler central manager. Therefore, using xdsh with appropriate fanout values may be a better choice.
You can set up /etc/inittab or /etc/init.d to start LoadLeveler automatically when your node boots. However, if your shared home directory is in GPFS and this is a large cluster whose network interfaces may take extra time to come up at node boot, this may not be a reliable way to start the daemons.
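If you choose the boot-time approach on AIX, an inittab entry can be added with the mkitab command. This is a sketch; the entry label is an arbitrary choice, and the path assumes only the resource manager is installed:

```shell
# Add an inittab entry so LoadLeveler starts once at boot (AIX).
# "rcll" is an arbitrary label; adjust the llrctl path if you installed
# the full LoadLeveler product rather than just the resource manager.
mkitab "rcll:2:once:/opt/ibmll/LoadL/resmgr/full/bin/llrctl start >/dev/null 2>&1"
```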
You can create your own postscript in /install/postscripts and add an entry to the xCAT postscripts table for your nodes. The postscript can be as simple as:
/opt/ibmll/LoadL/resmgr/full/bin/llrctl start
If your home directory is stored in GPFS, you may want to add a check to this script to verify that GPFS is running and your /u/loadl home directory is available before starting the LoadLeveler daemons.
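A minimal sketch of such a postscript, assuming the shared home directory is /u/loadl and the llrctl path shown above (both assumptions; tune the retry count and interval for your cluster's boot timing):

```shell
#!/bin/sh
# Hypothetical xCAT postscript: start LoadLeveler only once the shared
# (e.g. GPFS) home directory is available.  /u/loadl and the llrctl
# path are assumptions -- adjust for your cluster.

LLCTL=/opt/ibmll/LoadL/resmgr/full/bin/llrctl
LLHOME=/u/loadl

start_loadl() {
    tries=$1       # how many times to check for the home directory
    interval=$2    # seconds to wait between checks
    while [ "$tries" -gt 0 ]; do
        if [ -d "$LLHOME" ]; then
            break
        fi
        sleep "$interval"
        tries=$((tries - 1))
    done
    if [ ! -d "$LLHOME" ] || [ ! -x "$LLCTL" ]; then
        # Skip rather than fail, so node postscript processing continues.
        echo "LoadLeveler prerequisites not ready; skipping llrctl start" >&2
        return 0
    fi
    "$LLCTL" start
}

# In a real postscript you would likely wait longer, e.g. start_loadl 60 5
start_loadl 3 1
```

Returning 0 when the prerequisites never appear is a deliberate choice here, so that one missing mount does not abort the rest of the node's postscript processing; you may prefer to return nonzero and surface the failure instead.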
Wiki: Download_xCAT
Wiki: Granting_Users_xCAT_privileges
Wiki: IBM_HPC_Stack_in_an_xCAT_Cluster
Wiki: Power_775_Cluster_Recovery
Wiki: Power_775_Cluster_on_MN
Wiki: Setting_Up_IBM_HPC_Products_on_a_Statelite_or_Stateless_Login_Node
Wiki: Setting_up_all_IBM_HPC_products_in_a_Statelite_or_Stateless_Cluster
Wiki: Sync-ing_Config_Files_to_Nodes
Wiki: Updating_AIX_Software_on_xCAT_Nodes
Wiki: Updating_IBM_HPC_product_software
Wiki: XCAT_2.6.6_Release_Notes
Wiki: XCAT_AIX_Diskless_Nodes
Wiki: XCAT_pLinux_Clusters_775