This cookbook provides instructions on how to use xCAT to create and deploy a Linux cluster on IBM power system machines.
The power system machines have the following characteristics:
xCAT supports three types of installations for compute nodes: Diskfull installation (Statefull), Diskless Stateless, and Diskless Statelite. xCAT also supports hierarchical clusters where one or more service nodes are used to handle the installation and management of compute nodes. Please refer to [Setting_Up_a_Linux_Hierarchical_Cluster] for hierarchical usage.
This document will guide you through installing xCAT on your management node, configuring your cluster, deploying a Linux operating system to your compute nodes, and optionally upgrading firmware on your power system hardware.
To provide an easier understanding of the installation steps, this cookbook provides example commands for the following cluster configuration:
The management node:
Arch: an LPAR on a p5/p6/p7 machine
OS: Red Hat Enterprise Linux 5.2
Hostname: pmanagenode
IP: 192.168.0.1
HCP: HMC
The management Network:
Net: 192.168.0.0
NetMask: 255.255.255.0
Gateway: 192.168.0.1
Cluster-face-IF: eth1
dhcpserver: 192.168.0.1
tftpserver: 192.168.0.1
nameservers: 192.168.0.1
The compute nodes:
Arch: an LPAR on a p5/p6/p7 machine
OS: Red Hat Enterprise Linux 5.2
HCP: HMC
Hostname: pnode1 - this node will be installed statefull
IP: 192.168.0.10
Cluster-face-IF: eth0
Hostname: pnode2 - this node will be installed stateless
IP: 192.168.0.20
Cluster-face-IF: eth0
The Hardware Control Point (only for the non-DFM case):
Name: hmc1
IP: 192.168.0.100
Before proceeding to set up your pLinux cluster, you must first follow the instructions for installing the operating system, configuring your server to allow xCAT to set up valid defaults, and downloading and installing the xCAT rpms on your Linux management node:
[Setting_Up_a_Linux_xCAT_Mgmt_Node]
The details in the following chapters refer to various xCAT database tables. It may be helpful to briefly review the list of tables in the xCAT database and their descriptions to give you more background information before proceeding. See the xcatdb man page.
The tftp client in the Open Firmware of POWER5 machines is only compatible with tftp-server, not the atftpd package required by xCAT 2, so atftpd must be removed first and tftp-server installed instead. This is not required for POWER6 or later.
rpm -qa | grep atftp
You may find one or both of the following rpms:
atftp-xcat-*
atftp-*
If they are installed, stop the tftpd service and remove them:
service tftpd stop
rpm --nodeps -e atftp-xcat atftp
[RH]:
yum install tftp-server.ppc
[SLES]:
zypper install tftp
Note: make sure the entry "disable=no" is set in /etc/xinetd.d/tftp.
service xinetd restart
The xCAT database table passwd contains default userids and passwords that xCAT uses to access cluster components. This section describes how to set the default userids and passwords for the system and hmc keys in the passwd table.
This is the password that will be assigned to root when nodes are installed:
chtab key=system passwd.username=root passwd.password=cluster
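In the same way you can store the default HMC credentials so that xCAT can access the hardware control point. The userid and password below are placeholders only; substitute the values configured on your HMC:
chtab key=hmc passwd.username=hscroot passwd.password=abc123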
A basic networks table was created for you during the xCAT rpm install on the management node. Review that table, remove entries that are not relevant to your cluster management, and add additional networks based on your hardware configuration.
To list the existing network definitions:
lsdef -t network -l
To create an additional network that will be used for cluster management:
mkdef -t network -o net1 net=192.168.0.0 mask=255.255.255.0 gateway=192.168.0.1 mgtifname=eth1 \
dhcpserver=192.168.0.1 tftpserver=192.168.0.1 nameservers=192.168.0.1
The makenetworks command gathers network information from the management node and any configured xCAT service nodes and automatically saves it in the xCAT database. However, several services need to be configured before the service nodes are installed, so you must add networks such as the hfi, ml, fsp, and gpfs networks manually.
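If you simply want to capture the networks already configured on the management node, makenetworks can be run with no arguments and the result reviewed afterwards (an optional sketch of typical usage):
makenetworks
lsdef -t network -l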
To enable the NTP services on the cluster, first configure NTP on the management node:
vi /etc/ntp.conf
Set a list of time servers the management node will be using and any other configuration information you desire. If your management node will be a timeserver for the cluster, make sure you have a "restrict" stanza that will allow queries to this server. For example, if your cluster network is 192.168.0.0, you can add:
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap
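As a sketch, a minimal /etc/ntp.conf for the management node in this example cluster might look like the following; the pool server names are illustrations only and should be replaced with your site's time servers:
server 0.north-america.pool.ntp.org
server 1.north-america.pool.ntp.org
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap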
Adjust your system clock to be "close" to the time served by your timeserver. For example, if 0.north-america.pool.ntp.org is one of your timeservers, run:
ntpdate -u 0.north-america.pool.ntp.org
If your system clock varied greatly from the timeserver, you may need to run ntpdate several times to close the time gap. The "offset" return value tells you how much of a time change was made. Use caution when making large changes to the system clock since this can impact time-sensitive applications running on your management node.
Configure the ntpd service to start at system boot:
chkconfig ntpd on
Start ntpd:
service ntpd start
Next set the ntpservers attribute in the site table. Whatever time servers are listed in this attribute will be used by all the nodes that boot directly from the management node. In our example, the Management Node will be used as the ntp server.
chdef -t site ntpservers=pmanagenode
To have xCAT automatically set up ntp on the cluster nodes you must add the setupntp script to the list of postscripts that are run on the nodes.
To do this you can either modify the postscripts attribute for each node individually or you can just modify the definition of a group that all the nodes belong to.
For example, if all your nodes belong to the group compute, then you could add setupntp to the group definition by running the following command.
chdef -p -t group -o compute postscripts=setupntp
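To confirm that setupntp was added, you can list just the postscripts attribute of the group (an optional verification step):
lsdef -t group -o compute -i postscripts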
There are many different ways to set up name resolution for your cluster. You must ensure that all hostnames and IP addresses for your nodes, hmcs, fsps, etc., are resolvable. This section outlines some simple examples for setting up name resolution.
Specify all hostnames and IP addresses for your cluster nodes, hmcs, fsps, etc., in your /etc/hosts file:
vi /etc/hosts
127.0.0.1 localhost
192.168.0.1 pmanagenode
192.168.0.10 pnode1
192.168.0.20 pnode2
192.168.0.100 hmc1
.
.
.
Another way to create entries in your /etc/hosts file is by using the xCAT makehosts command. This is useful for large clusters when hostname/IP address mapping follows regular patterns. However, this requires entries in your xCAT hosts table, and node definitions must already exist for your cluster. You may wish to create a minimal /etc/hosts file now and come back to this step again after you have defined all of your nodes to xCAT.
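As a sketch of the regex approach, assuming all nodes belong to the group compute and follow the pnodeN-to-IP mapping used in this example (pnode1 is 192.168.0.10, pnode2 is 192.168.0.20), you could add a row like the following to the hosts table with tabedit hosts and then run makehosts; the regular expression is an illustration only:
"compute","|pnode(\d+)|192.168.0.($1*10)|",,,,
makehosts compute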
Add your cluster domain name and the management node's nameserver into /etc/resolv.conf:
vi /etc/resolv.conf
search cluster.net
nameserver 9.112.4.1
Setup the nameserver for your cluster nodes. In our example, the xCAT management node will be the nameserver for the cluster:
chdef -t site nameservers=192.168.0.1
Setup the external nameserver:
chdef -t site forwarders=9.112.4.1
Setup the local domain name:
chdef -t site domain=cluster.net
Set up your xCAT management node as a DNS server using named:
makedns
service named start
chkconfig --level 345 named on
If you add nodes or update the networks table at a later time, then rerun makedns:
makedns
service named restart
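You can quickly verify that the new DNS server resolves cluster hostnames, for example:
nslookup pnode1 192.168.0.1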
Set the type attributes (os, arch, and profile) of your nodes. For example:
chdef -t node -o pnode1,pnode2 os=<os> arch=ppc64 profile=compute
For valid options:
tabdump -d nodetype
The xCAT rcons command uses the conserver package to provide support for a single read-write console or multiple read-only consoles on a single node, as well as console logging. For example, if a user has a read-write console session open on node node1, other users can also log in to that console session on node1 as read-only users. This allows sharing a console server session between multiple users for diagnostic or other collaborative purposes. The console logging function logs the console output and activity for any node with remote console attributes set to the following file, which can be replayed using the xCAT replaycons command for debugging or other purposes:
/var/log/consoles/<node name>
Note: conserver=<management node> is the default, so it is optional in the command.
Each xCAT node with remote console attributes set should be added to the conserver configuration file for rcons to work. The xCAT command makeconservercf puts all the nodes into the conserver configuration file /etc/conserver.cf. The makeconservercf command must be run whenever there are node definition changes that affect the conserver, such as adding new nodes, removing nodes, or changing the nodes' remote console settings.
To add or remove new nodes for conserver support:
makeconservercf
service conserver stop
service conserver start
The rnetboot and getmacs functions depend on conserver; check that it is available:
rcons pnode1
If it works, you will get the console interface of pnode1. If it does not work, review your rcons setup as documented in the previous steps.
To see if your setup is correct at this point, run rpower to check the node status:
rpower pnode1 stat
Before running getmacs, make sure the node is powered off, because the HMC cannot shut down Linux nodes that are in the running state.
Check the node state and, if it is on, force the LPAR to shut down:
rpower pnode1 stat
rpower pnode1 off
If there is only one Ethernet adapter on the node, or you have specified the installnic or primarynic attribute of the node, the following command will get the correct MAC address.
Check for the *nic definitions by running:
lsdef pnode1
To set installnic or primarynic:
chdef -t node -o pnode1 installnic=eth0 primarynic=eth1
Get mac addresses:
getmacs pnode1
If there is more than one Ethernet adapter on the node and you do not know which one has been configured for the installation process, or the LPAR has just been created and has no active profile, or the LPAR is on a POWER5 system with no LHEA/SEA Ethernet adapters, then you have to specify more parameters so that getmacs can find an available interface using a ping test. Run this command:
getmacs pnode1 -D -S 192.168.0.1 -G 192.168.0.10
The output looks like the following:
pnode1:
Type Location Code MAC Address Full Path Name Ping Result Device Type
ent U9133.55A.10E093F-V4-C5-T1 f2:60:f0:00:40:05 /vdevice/l-lan@30000005 virtual
The MAC address will be written into the xCAT mac table. To verify, run:
tabdump mac
To set installnic or primarynic for a node with HFI interfaces (hf0, hf1):
chdef -t node -o c250f07c04ap13 installnic=hf0 primarynic=hf1
Get mac addresses:
getmacs c250f07c04ap13 -D
Add the defined nodes into the DHCP configuration:
makedhcp c250f07c04ap13
Restart the dhcp service:
service dhcpd restart
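If the dhcpd configuration has not been initialized on this management node yet, you may need to generate the base configuration for all defined networks before adding individual nodes; a typical sequence is:
makedhcp -n
makedhcp c250f07c04ap13
service dhcpd restart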
xCAT supports the running of customization scripts on the nodes when they are installed. You can see what scripts xCAT will run by default by looking at the xcatdefaults entry in the xCAT postscripts database table. The postscripts attribute of the node definition can be used to specify the comma separated list of the scripts that you want to be executed on the nodes. The order of the scripts in the list determines the order in which they will be run.
To check current postscript and postbootscripts setting:
tabdump postscripts
For example, if you want to have your two scripts called foo and bar run on node node01 you could add them to the postscripts table:
chdef -t node -o node01 -p postscripts=foo,bar
(The -p flag means to add these to whatever is already set.)
For more information on creating and setting up Post*scripts: [Postscripts_and_Prescripts]
You can use the iso file of the OS to be installed to extract the installation files. For example, if you have an iso file /iso/RHEL5.2-Server-20080430.0-ppc-DVD.iso:
copycds /iso/RHEL5.2-Server-20080430.0-ppc-DVD.iso
Note: If the iso cannot be mounted by the copycds command, make sure SELinux is disabled.
Before following the next installation steps, you need to understand the relationship between <os> and <platform>. <os> is the name of a specific operating system release, while <platform> is the family that contains many operating systems; in other words, a <platform> contains multiple <os> values.
For example, for Red Hat Enterprise Linux 6.0, rhels6 is the <os> and rh is the <platform>. For SUSE Linux Enterprise Server 11 SP1, sles11.1 is the <os> and sles is the <platform>.
Note: This naming convention is suitable for the installation of the Stateful/Stateless/Statelite Compute/Service nodes.
xCAT uses KickStart or AutoYaST installation templates and related installation scripts to complete the installation and configuration of the compute node.
You can find sample templates for common profiles in following directory:
/opt/xcat/share/xcat/install/<platform>/
If you customize a template, copy it to the following directory:
/install/custom/install/<platform>/
The profile, os, and architecture of the node were set up in "Set the type attributes of the node" above.
To check the setting of your node's profile, os, architecture run:
lsdef pnode1
Object name: pnode1
.
.
.
arch=ppc64
os=rhels5.5
profile=compute
For this example, the search order for the template file is as follows:
The directory /install/custom/install/<platform> is searched first, then /opt/xcat/share/xcat/install/<platform>.
Within that directory, the following order is honored:
compute.rhels5.5.ppc64.tmpl
compute.rhels5.ppc64.tmpl
compute.rhels.ppc64.tmpl
compute.rhels5.5.tmpl
compute.rhels5.tmpl
compute.rhels.tmpl
compute.ppc64.tmpl
compute.tmpl
If you want to customize a template for a node, copy the template to the /install/custom/install/<os>/ directory and make your modifications there, unless you rename your file. You need to copy it to the custom directory so the next install of xCAT will not wipe out your modifications, because the install updates the /opt/xcat/share directory. Keep in mind the search order above to make sure your template is picked up.
Note: Sometimes the directory /opt/xcat/share/xcat/install/scripts also needs to be copied to /install/custom/install/ to make the customized profile work, because the customized profiles will need to include the files in scripts directory as the prescripts and postscripts.
For example, if you need to install other packages, put the <profile>.otherpkgs.pkglist file into the /install/custom/install/<os>/ directory.
If you want to install a specific package like a specific .rpm onto the compute node, copy the rpm into the following directory:
/install/post/otherpkgs/<os>/<arch>
Another thing you MUST DO is to create repodata for this directory. You can use the "createrepo" command to create repodata.
On RHEL5.x, the "createrepo" rpm package can be found in the install ISO; on SLES11, it can be found in SLE-11-SDK-DVD Media 1 ISO.
After "createrepo" is installed, you need to create one text file which contains the complete list of files to include in the repository. For example, the name of the text file is rpms.list in /install/post/otherpkgs/<os>/<arch> directory. Create rpms.list:
cd /install/post/otherpkgs/<os>/<arch>
ls *.rpm >rpms.list
Then, please run the following command to create the repodata for the newly-added packages:
createrepo -i rpms.list /install/post/otherpkgs/<os>/<arch>
The createrepo command with -i rpms.list option will create the repository for the rpm packages listed in the rpms.list file. It won't destroy or affect the rpm packages that are in the same directory, but have been included into another repository.
Alternatively, if you create a sub-directory to contain the rpm packages (for example, one named other in /install/post/otherpkgs/<os>/<arch>), run the following command to create repodata for that directory:
createrepo /install/post/otherpkgs/<os>/<arch>/other
Note: Please replace other with your real directory name.
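For example, with the sample cluster in this cookbook (rhels5.2 on ppc64), the whole sequence might look like the following; adjust the os and arch values for your own installation:
cd /install/post/otherpkgs/rhels5.2/ppc64
ls *.rpm > rpms.list
createrepo -i rpms.list /install/post/otherpkgs/rhels5.2/ppc64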
Begin the installation and network boot the node:
nodeset pnode1 install
rnetboot pnode1
After the node installation completes successfully, the node's status will change to booted. Use the following command to check the node's status:
lsdef pnode1 -i status
When the node's status has changed to booted, you can also check that the ssh service on the node is working and that you can log in without a password. Note: Do not run ssh or xdsh against the node until the node installation has completed successfully. Running ssh or xdsh against the node before the installation completes may result in ssh host key issues.
If ssh is working but you cannot log in without a password, force an exchange of the ssh keys with the compute node using xdsh:
xdsh pnode1 -K
After exchanging the ssh keys, the following command should work without prompting for a password:
xdsh pnode1 date
Using a postinstall script (you could also use the updatenode method):
mkdir /install/postscripts/data
cp <kernel> /install/postscripts/data
Create the postscript updatekernel:
vi /install/postscripts/updatekernel
#!/bin/bash
rpm -Uvh data/kernel-*.rpm
chmod 755 /install/postscripts/updatekernel
Add the script to the postscripts table and run the install:
chdef -p -t group -o compute postscripts=updatekernel
rinstall compute
Typically, you can build your stateless compute node image on the Management Node if it has the same OS and architecture as the nodes. If you need an image for a different OS or architecture than the one installed on the Management Node, you will need a machine with the desired OS and architecture and must create the image on that node.
If the stateless image you are building doesn't match the OS/architecture of the Management Node, log on to the node with the desired architecture:
ssh <node>
mkdir /install
mount xcatmn:/install /install    (make sure the mount is read-write)
The default lists of rpms to add to or exclude from the diskless images are shipped in the following directory:
/opt/xcat/share/xcat/netboot/<platform>
If you want to modify the current defaults for .pkglist, .exlist, or *.postinstall, copy the shipped default lists to the following directory so your modifications will not be removed on the next xCAT rpm update. xCAT will look in the custom directory for the files before going to the share directory:
/install/custom/netboot/<platform>
If you want to exclude more packages, add them into the following exlist file:
/install/custom/netboot/<platform>/<profile>.exlist
Add more package names that need to be installed on the stateless node into the pkglist file:
/install/custom/netboot/<platform>/<profile>.pkglist
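For example, to start from the shipped defaults for the compute profile on a Red Hat platform (the exact shipped file names can vary by xCAT release), you might copy them like this:
mkdir -p /install/custom/netboot/rh
cp /opt/xcat/share/xcat/netboot/rh/compute.pkglist /install/custom/netboot/rh/
cp /opt/xcat/share/xcat/netboot/rh/compute.exlist /install/custom/netboot/rh/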
There are rules (release 2.4 or later) for which *.postinstall files will be selected for use by genimage.
If you are going to make modifications, copy the appropriate /opt/xcat/share/xcat/netboot/<platform>/*postinstall file to the
/install/custom/netboot/<platform> directory:
cp /opt/xcat/share/xcat/netboot/<platform>/*postinstall /install/custom/netboot/<platform>/.
Use these basic rules to edit the correct file in the /install/custom/netboot/<platform> directory. The rule allows you to customize your image down to the profile, os and architecture level, if needed.
You will find postinstall files of the following formats; genimage will process them in the order shown below:
<profile>.<os>.<arch>.postinstall
<profile>.<arch>.postinstall
<profile>.<os>.postinstall
<profile>.postinstall
This means, if "<profile>.<os>.<arch>.postinstall" is there, it will be used first.
Make sure you have the basic postinstall script setup in the directory to run for your genimage. The one shipped will setup fstab and rcons to work properly and is required.
You can add more postinstall processing if you want. The basic postinstall script (2.4) will be named <profile>.<arch>.postinstall (e.g. compute.ppc64.postinstall). You can create one for a specific os by copying the shipped one to, for example, compute.rhels5.4.ppc64.postinstall.
Note: you can use the sample here: /opt/xcat/share/xcat/netboot/<platform>/
[RH]:
Add the following package names into the <profile>.pkglist:
bash
nfs-utils
stunnel
dhclient
kernel
openssh-server
openssh-clients
busybox-anaconda
wget
vim-minimal
ntp
You can add any other packages that you want to install on your compute node. For example, if you want to have userids with passwords you should add the following:
cracklib
libuser
passwd
[SLES]:
Add the following package names into the <profile>.pkglist:
aaa_base
bash
nfs-utils
dhcpcd
kernel
openssh
psmisc
wget
sysconfig
syslog-ng
klogd
vim
The HFI kernel can be installed by xCAT automatically, but other packages, such as hfi_util and net-tools, require the rpm options --nodeps or --force, which xCAT cannot handle automatically. So you need to modify the postinstall file manually to install those packages during diskless image generation.
Add the following lines to /install/custom/netboot/rh/compute.rhels6.ppc64.postinstall. (rhels6 stands for the OS version - it should be the same as the previous step.)
cp /hfi/dd/* /install/netboot/rhels6/ppc64/compute/rootimg/tmp/
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/dhclient-4.1.1-13.P1.el6.ppc64.rpm --force
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/dhcp-4.1.1-13.P1.el6.ppc64.rpm --force
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/kernel-headers-2.6.32-71.el6.20110617.ppc64.rpm --force
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/net-tools-1.60-102.el6.ppc64.rpm --force
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/hfi_ndai-1.2-0.el6.ppc64.rpm --force
chroot /install/netboot/rhels6/ppc64/compute/rootimg/ /bin/rpm -ivh /tmp/hfi_util-1.12-0.el6.ppc64.rpm --force
Run genimage to generate the stateless image.
[RH]:
cd /opt/xcat/share/xcat/netboot/rh
./genimage -i eth0 -n ibmveth -o rhels5.2 -p compute
[SLES]:
cd /opt/xcat/share/xcat/netboot/sles
./genimage -i eth0 -n ibmveth -o sles11 -p compute
[RHEL]: On Power 775, an HFI-enabled kernel is required in the diskless image so that the compute nodes can boot this customized kernel and boot from the HFI interfaces:
cp /hfi/dd/kernel-2.6.32-71.el6.20110617.ppc64.rpm /install/kernels/
cd /opt/xcat/share/xcat/netboot/rh/
./genimage -i hf0 -n hf_if -o rhels6 -p compute -k 2.6.32-71.el6.20110617.ppc64
The /etc/hosts file is used by the hficonfig postscript to configure all the HFI interfaces on the compute nodes. Set up a synclist file containing this line:
/etc/hosts -> /etc/hosts
The file can be put anywhere, but let's assume you name it /tmp/synchosts.
Make sure you have an OS image object in the xCAT database associated with your nodes and issue command:
xdcp -i <imagepath> -F /tmp/synchosts
<imagepath> stands for the OS image path. Generally it will be located in the /install/netboot directory. For example, if the image path is /install/netboot/rhels6/ppc64/compute/rootimg, the command will be:
xdcp -i /install/netboot/rhels6/ppc64/compute/rootimg -F /tmp/synchosts
Pack the image:
[RH]:
packimage -o rhels5.2 -p compute -a ppc64
[SLES]:
packimage -o sles11 -p compute -a ppc64
Initiate the network boot of the stateless node:
nodeset pnode2 netboot
rnetboot pnode2
After the node installation completes successfully, the node's status will change to booted. Use the following command to check the node's status:
lsdef pnode2 -i status
When the node's status has changed to booted, you can also check that the ssh service on the node is working and that you can log in without a password.
Note: Do not run ssh or xdsh against the node until the node installation has completed successfully. Running ssh or xdsh against the node before the installation completes may result in ssh host key issues.
If ssh is working but you cannot log in without a password, force an exchange of the ssh keys with the compute node using xdsh:
xdsh pnode2 -K
After exchanging the ssh keys, the following command should work without prompting for a password:
xdsh pnode2 date
The kerneldir attribute in the linuximage table is used to assign a directory that holds the new kernel to be installed into the stateless/statelite image. Its default value is /install/kernels; you create a directory named <kernelver> under kerneldir, and genimage will pick the kernel up from there.
Assume you have the kernel in RPM format in /tmp and that kerneldir is not set (so the default value /install/kernels is used).
Before xCAT 2.6.1, the contents of the kernel packages need to be extracted for genimage to use. In 2.6.1 and later, the kernel is installed directly from the rpm packages.
The kernel RPM package is usually named kernel-<kernelver>.rpm, for example: kernel-2.6.32.10-0.5.ppc64.rpm is the kernel package for 2.6.32.10-0.5.ppc64.
2.6.1 and later
cp /tmp/kernel-2.6.32.10-0.5.ppc64.rpm /install/kernels/
Before 2.6.1
mkdir -p /install/kernels/2.6.32.10-0.5.ppc64
cd /install/kernels/2.6.32.10-0.5.ppc64
rpm2cpio /tmp/kernel-2.6.32.10-0.5.ppc64.rpm |cpio -idum
Usually, the kernel files for SLES are separated into two parts: kernel-<arch>-base and kernel, and the naming of the kernel RPM packages is different. For example, there are two RPM packages in /tmp:
kernel-ppc64-base-2.6.27.19-5.1.ppc64.rpm
kernel-ppc64-2.6.27.19-5.1.ppc64.rpm
2.6.27.19-5.1.ppc64 is NOT the kernel version; the actual kernel version is 2.6.27.19-5-ppc64. Follow this naming rule to determine the kernel version.
After the kernel version is determined for SLES, then:
2.6.1 and later
cp /tmp/kernel-ppc64-base-2.6.27.19-5.1.ppc64.rpm /install/kernels/
cp /tmp/kernel-ppc64-2.6.27.19-5.1.ppc64.rpm /install/kernels/
Before 2.6.1
mkdir -p /install/kernels/2.6.27.19-5-ppc64
cd /install/kernels/2.6.27.19-5-ppc64
rpm2cpio /tmp/kernel-ppc64-base-2.6.27.19-5.1.ppc64.rpm |cpio -idum
rpm2cpio /tmp/kernel-ppc64-2.6.27.19-5.1.ppc64.rpm |cpio -idum
Run genimage/packimage to update the image with the new kernel (using sles as an example):
2.6.1 and later
Since the kernel version is different from the rpm package version, the -g flag needs to be specified with the rpm version of the kernel packages:
genimage -i eth0 -n ibmveth -o sles11.1 -p compute -k 2.6.27.19-5-ppc64 -g 2.6.27.19-5.1
Before 2.6.1
genimage -i eth0 -n ibmveth -o sles11.1 -p compute -k 2.6.27.19-5-ppc64
packimage -o sles11.1 -p compute -a ppc64
Reboot the node with the new image:
nodeset pnode2 netboot
rnetboot pnode2
To show the new kernel, run:
xdsh pnode2 uname -a
If you want to remove an image, use the rmimage command to remove the Linux stateless or statelite image from the file system. It is better to use this command than to remove the file system yourself, because it also removes the appropriate links to real file systems that might otherwise be destroyed on your Management Node if you just use rm -rf.
You can specify the <os>, <arch> and <profile> values to the rmimage command:
rmimage -o <os> -a <arch> -p <profile>
Or, you can specify one imagename to the command:
rmimage <imagename>
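For example, to remove the rhels5.2 stateless compute image built earlier in this cookbook, either form could be used (the image name here follows the <os>-<arch>-netboot-<profile> convention described later):
rmimage -o rhels5.2 -a ppc64 -p compute
rmimage rhels5.2-ppc64-netboot-compute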
Statelite is an xCAT feature which allows you to have mostly stateless nodes (for ease of management), but tell xCAT that just a little bit of state should be kept in a few specific files or directories that are persistent for each node. If you would like to use this feature, refer to the [XCAT_Linux_Statelite] documentation.
Refer to [Using_Linux_Driver_Update_Disk].
Kdump is a kexec-based kernel crash dumping mechanism for Linux. Currently i386, x86_64, and ppc64 ports of kdump are available, and the mainstream distributions including Fedora, Red Hat Enterprise Linux, and SUSE Linux Enterprise Server ship the kdump rpm packages.
For RHELS6 and other Linux operating systems, there are two rpm packages for kdump:
kexec-tools, crash
Before creating the stateless/statelite Linux root images with kdump enabled, please add these two rpm packages into the <profile>.<os>.<arch>.pkglist file.
For Linux images, a new attribute called dump has been introduced in the linuximage table; it defines the remote NFS path where the crash information is dumped.
The dump attribute follows the standard URI format. Since currently only the NFS protocol is supported, its value should be set to:
nfs://<nfs_server_ip>/<kdump_path>
If you intend to use the Service Node, or the Management Node if no service node is available, as the NFS server, you can omit the <nfs_server_ip> field in the dump value and set it to:
nfs:///<kdump_path>
which treats the node's SN/MN as the NFS server for the kdump service.
Based on <profile>, <os> and <arch>, there should be the definitions for two Linux images, one is for diskless, the other one is for statelite, which are:
<os>-<arch>-netboot-<profile>
<os>-<arch>-statelite-<profile>
For the diskless image, set the value of the dump attribute with the following command:
chdef -t osimage <os>-<arch>-netboot-<profile> dump=nfs://<nfs_server_ip>/<kdump_path>
For example, if the image name is rhels6-ppc64-netboot-compute, the NFS server used for kdump is 10.1.0.1, and the path on the NFS server is /install/kdump, you can set the value with:
chdef -t osimage rhels6-ppc64-netboot-compute dump=nfs://10.1.0.1/install/kdump
For the statelite image, set the value of the dump attribute with the following command:
chdef -t osimage <os>-<arch>-statelite-<profile> dump=nfs://<nfs_server_ip>/<kdump_path>
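For example, mirroring the diskless example above, if the statelite image is rhels6-ppc64-statelite-compute and the same NFS server and path are used:
chdef -t osimage rhels6-ppc64-statelite-compute dump=nfs://10.1.0.1/install/kdump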
Note: If there are no such osimage definitions called <os>-<arch>-netboot-<profile> or <os>-<arch>-statelite-<profile> in linuximage table, please update the dump attribute after the genimage command is executed.
Please make sure that the remote NFS path (nfs://<nfs_server_ip>/<kdump_path>) set in the dump attribute is writable.
Once a kernel panic is triggered, the node will reboot into the capture kernel, and the kernel dump (vmcore) will automatically be saved to the <kdump_path>/var/crash/<node_ip>-<time>/ directory on the specified NFS server (<nfs_server_ip>). You do not need to create the /var/crash/ directory under the NFS path (nfs://<nfs_server_ip>/<kdump_path>); it will be created by the kdump service when the crash information is saved.
In most cases, the service nodes in a Linux hierarchical cluster will automatically mount the /install directory from the management node unless the installloc attribute in the site table is left blank.
If the installloc attribute is set in the site table and the service node is chosen to be the remote NFS server for the kdump service, the default value of dump will not be suitable, so you must assign an individual path (which should NOT be in the /install/ directory on the service node) to the dump attribute. Make sure this individual path is exported and writable to the compute nodes.
For example, you can create one directory called /kdump on the service node, and export it with write-able permission, then set the dump attribute as the following value:
nfs://<service_node_ip>/kdump
OR:
nfs:///kdump
This step is for statelite only.
For the statelite images, the /boot/ directory and the /etc/kdump.conf file should be added into the litefile table.
You can use the tabedit litefile command to update the litefile table. After they are added into the table, there should be two new entries as the following:
"ALL","/etc/kdump.conf",,,
"ALL","/boot/",,,
The <profile>.exlist file is located in the /opt/xcat/share/xcat/netboot/<os_platform>/ directory. In order to create your own .exlist file, copy it to /install/custom/netboot/<os_platform>/ and update the <profile>.exlist file there.
The kdump service needs to create its own initial ramdisk under the /boot/ directory of the rootimg, so the line which contains
/boot*
MUST be removed from <profile>.exlist.
In order to enable the kdump service for the specified node/nodegroup, you should add the enablekdump postscript by running the following command:
chdef <noderange> -p postscripts=enablekdump
If enablekdump is not added as a postscript, it will not be run and the kdump service will fail to start when the node/nodegroup boots up.
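For example, to enable kdump for the stateless node used in this cookbook:
chdef pnode2 -p postscripts=enablekdump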
Please follow Stateless_node_installation to generate the diskless rootimg; and please follow Create_Statelite_Image to generate the statelite image.
Please follow the documents including xCAT_pLinux_Clusters and xCAT_Linux_Statelite to setup the diskless/statelite image, and to make the specified noderange booting with the diskless/statelite image.
After the noderange has booted up with the diskless/statelite image, add a dynamic ip range to the network in the networks table that is used for compute node installation. This dynamic ip range should be large enough to accommodate all of the nodes on the network. For example:
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,nodehostname,ddnsdomain,vlanid,comments,disable
"hfinet","20.0.0.0","255.0.0.0","hf0","20.7.4.1","20.7.4.1","20.7.4.1","20.7.4.1",,,"20.7.4.100-20.7.4.200",,,,,