
Note: if you are using P775 hardware, do not use this document. Use: [XCAT_pLinux_Clusters_775].
If you are not very familiar with setting up pLinux clusters, you probably need more complete instructions on setting up the cluster; use the following documentation instead: [XCAT_pLinux_Clusters].
This guide is a general quick reference for installing xCAT on IBM Power machines. xCAT provides full documentation for many different pLinux scenarios. Start with the information here first, then refer to the specific documentation for your environment if you need more information on commands and detailed setup.
This guide was written using the following setup. Newer versions of the distros and xCAT, and other PowerLinux servers, should work correctly but may not have been explicitly tested:
Objects versus Tables:
lsdef, chdef, and mkdef are commands that manage object definitions in the xCAT database; since one object may reference more than one table, these commands are the preferred interface. chtab and tabdump manage individual tables in the xCAT database directly.
Remote commands:
rpower, rscan, rinstall, rnetboot - interfaces to hardware management operations. See more information at
Listing_and_Modifying_the_Database
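For example, the same information can be inspected from either side (a minimal sketch using the junoltc03 node defined later in this guide):
$ lsdef junoltc03 -i groups   # object view: one definition, possibly spanning several tables
$ tabdump nodelist            # table view: the groups attribute lives in the nodelist table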
Install your distro:
Add xCAT repositories (xcat-core and xcat-dep):
$ cd /etc/yum.repos.d
$ wget http://sourceforge.net/projects/xcat/files/yum/2.8/xcat-core/xCAT-core.repo
$ wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/rh6/ppc64/xCAT-dep.repo
Check network configuration
The management node (MN) will be configured with two interfaces: one facing the external network (LAN) and another facing the private network.
Example:
* LAN:
  * domain: austin.ibm.com
  * Management Node (MN) IP: 9.3.189.137/24
  * Hostname: junoltc01
  * Nameservers: 9.0.7.1,9.0.6.11
* Private network:
  * domain: mine.austin.ibm.com
  * Hostname: junoltc01
  * MN IP: 192.168.0.100/24
* HMC: aphmc5.austin.ibm.com (9.3.110.122)
Important files to check:
* /etc/sysconfig/network: HOSTNAME should be correctly set (HOSTNAME=junoltc01). In SLES, the file is /etc/HOSTNAME.
* /etc/sysconfig/network-scripts/ifcfg-eth#: eth0 is configured for the LAN, while eth1 is configured for the xCAT private network.
* /etc/resolv.conf: domain and search should point to the private domain, and nameserver should have the private MN IP.
$ cat /etc/resolv.conf
domain mine.austin.ibm.com
search mine.austin.ibm.com
nameserver 192.168.0.100
On xCAT 2.7 or later, you do not need to configure /etc/resolv.conf for the private network.
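For reference, a minimal static configuration for the private-facing interface mentioned above might look like this (values taken from the example network; exact field names vary between RH and SLES):
$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.0.100
NETMASK=255.255.255.0
ONBOOT=yes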
/etc/hosts (see the DNS configuration section below for other choices): it should contain name resolution information for the management node, the compute node(s), and the HMC.
$ cat /etc/hosts
...
192.168.0.100 junoltc01 junoltc01.mine.austin.ibm.com #MN
192.168.0.103 junoltc03 junoltc03.mine.austin.ibm.com #Compute node (diskful install)
192.168.0.109 junoltc09 junoltc09.mine.austin.ibm.com #Compute node (diskless install)
9.3.110.122 aphmc5 aphmc5.austin.ibm.com # HMC
Disable SELinux
Note: you can skip this step in xCAT 2.8.1 and above, because xCAT does it automatically when it is installed.
To disable SELinux manually:
$ echo 0 > /selinux/enforce
$ sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config
Disable the firewall
Note: you can skip this step in xCAT 2.8 and above, because xCAT does it automatically when it is installed.
The management node provides many services to the cluster nodes, but the firewall on the management node can interfere with them. If your cluster is on a secure network, the easiest thing to do is to disable the firewall on the Management Node:
For RH:
$ service iptables stop
$ chkconfig iptables off
If disabling the firewall completely isn't an option, configure iptables to allow the following services on the NIC that faces the cluster: DHCP, TFTP, NFS, HTTP, and DNS; a sketch follows.
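A minimal sketch of such rules, assuming eth1 is the cluster-facing NIC as in the example network above (exact ports and the persistence mechanism vary by site):
$ iptables -A INPUT -i eth1 -p udp --dport 67 -j ACCEPT    # DHCP
$ iptables -A INPUT -i eth1 -p udp --dport 69 -j ACCEPT    # TFTP
$ iptables -A INPUT -i eth1 -p tcp --dport 80 -j ACCEPT    # HTTP
$ iptables -A INPUT -i eth1 -p udp --dport 53 -j ACCEPT    # DNS
$ iptables -A INPUT -i eth1 -p tcp --dport 53 -j ACCEPT    # DNS over TCP
$ iptables -A INPUT -i eth1 -p tcp --dport 111 -j ACCEPT   # portmapper (for NFS)
$ iptables -A INPUT -i eth1 -p tcp --dport 2049 -j ACCEPT  # NFS
$ service iptables save                                    # persist the rules on RH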
For SLES:
$ SuSEfirewall2 stop
DNS configuration
This quickstart uses DNS configuration set up through /etc/hosts. For other DNS configuration methods, refer to: [Cluster_Name_Resolution]
Install xCAT
$ yum install xCAT
Check that the following services are running; restart them if necessary (a loop sketch follows the list).
$ service <name> status
$ service <name> restart
* httpd
* nfs
* named (dns) (optional)
* tftp/tftpd (it may be running as an xinetd service if atftp-xcat was not installed)
* xcatd
* firewall disabled (all policies set to ACCEPT in `iptables -L`)
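A quick pass over the daemons in that list (a sketch; named and tftpd may not apply to your setup, and service names vary slightly by distro):
$ for s in httpd nfs named xcatd; do service $s status; done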
Check site table
Important fields include master, domain, nameservers, forwarders, and dhcpinterfaces; they are set below.
To change the site table
$ chdef -t site master=192.168.0.100 domain=mine.austin.ibm.com nameservers=192.168.0.100 forwarders=9.0.7.1,9.0.6.11 dhcpinterfaces=eth1
To check the site table configuration
$ tabdump site
...
"master","192.168.0.100",,
"forwarders","9.0.7.1,9.0.6.11",,
"nameservers","192.168.0.100",,
"domain","mine.austin.ibm.com",,
"dhcpinterfaces","eth1",,
...
"consoleondemand","yes",,
Check networks table
No changes needed if the setup above was correctly performed.
$ tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,nodehostname,ddnsdomain,vlanid,domain,comments,disable
"9_3_189_0-255_255_255_0","9.3.189.0","255.255.255.0","eth0","9.3.189.1",,"9.3.189.137",,,,,,,,,,
"192_168_0_0-255_255_255_0","192.168.0.0","255.255.255.0","eth1","<xcatmaster>",,"192.168.0.100",,,,"192.168.0.1-192.168.0.99",,,,,,
Define the dynamic range for the DHCP server (optional)
The dynamic range is only needed by the hardware discovery process. If you will not do hardware discovery, setting a dynamic range might cause side effects in IP address assignment on the network.
$ chdef -t network -o 192_168_0_0-255_255_255_0 dynamicrange="192.168.0.1-192.168.0.99"
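To verify the change:
$ lsdef -t network -o 192_168_0_0-255_255_255_0 -i dynamicrange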
Add the default root password to be used when nodes are installed.
$ chtab key=system passwd.username=root passwd.password=cluster
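To verify the entry:
$ tabdump passwd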
Configure the HMC node
$ mkdef -t node -o aphmc5 groups=hmc,all nodetype=ppc hwtype=hmc mgt=hmc username=hscroot password=abc1234
We assume the virtualized partitions (LPARs) are already set using HMC (or IVM).
Create node definitions for the systems and frames under the HMC. (For other ways to configure this, including stanza files, see:
XCAT_System_p_Hardware_Management_for_HMC_Managed_Systems/#discover-hmcsframececs-and-define-them-in-xcat-db)
$ rscan -w aphmc5
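Alternatively, a sketch of the stanza-file route mentioned above (the file name here is arbitrary): dump the scan output, review it, then feed it to mkdef:
$ rscan -z aphmc5 > /tmp/scan.stanza
$ cat /tmp/scan.stanza | mkdef -z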
Run make* scripts
$ makeconservercf # enable remote console
$ makedhcp -n # configure dhcp server
$ makedhcp -a
$ makedns # configure dns server
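A quick sanity check of the DNS setup from the MN, querying the private nameserver defined in the site table:
$ nslookup junoltc03 192.168.0.100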
See [Debugging_xCAT_Problems] for more debugging information on those commands.
Copy the distro image to /install. You need to download the distro image (or mount it using NFS) to a directory on your disk. The example here uses RHEL 6.2 (note that xCAT names it rhels6.2), and the ISO image is in /iso/
$ copycds -n rhels6.2 -a ppc64 /iso/RHEL6.2-20111117.0-Server-ppc64-DVD1.iso
$ ls /install/rhels6.2/ppc64 # verify the copy was successful
Add group to identify nodes that will be managed
$ chdef junoltc0[39] groups=cn -p
$ lsdef cn -i groups
Object name: junoltc03
groups=lpar,all,cn
Object name: junoltc09
groups=lpar,all,cn
Test if console is working for nodes
$ rcons junoltc03
See [Debugging_xCAT_Problems] if you have problems with the console.
Get the nodes' MAC addresses. Attention: if a node has more than one interface, you need to specify which one to write into the mac table (see man getmacs).
$ getmacs cn
junoltc03:
#Type Phys_Port_Loc MAC_Address Adapter Port_Group Phys_Port Logical_Port VLan VSwitch Curr_Conn_Speed
virtualio N/A d2:08:3b:d6:7c:04 N/A N/A N/A N/A 1 ETHERNET1 N/A
junoltc09:
#Type Phys_Port_Loc MAC_Address Adapter Port_Group Phys_Port Logical_Port VLan VSwitch Curr_Conn_Speed
virtualio N/A d2:08:30:4d:6d:04 N/A N/A N/A N/A 1 ETHERNET1 N/A
$ tabdump mac
#node,interface,mac,comments,disable
"junoltc03",,"d2:08:3b:d6:7c:04",,
"junoltc09",,"d2:08:30:4d:6d:04",,
Complete the node definitions
$ chdef cn netboot=yaboot tftpserver=192.168.0.100 nfsserver=192.168.0.100 xcatmaster=192.168.0.100 installnic="eth1" primarynic="eth1"
Set up nodes for OS deployment. Here, we are going to deploy RHEL 6.2.
$ chdef cn os=rhels6.2 profile=compute arch=ppc64
We are ready now to deploy the distro in the compute nodes!
When you ran copycds, osimage definitions for your distro were automatically created:
$ lsdef -t osimage
For basic operating system installations, you can use one of these images without modification. Review the contents of the files referenced by the osimage definition you choose. For image customization, see [Using_Provmethod%3Dosimagename]
Choose and review a stateful osimage definition:
$ lsdef -t osimage -o rhels6.2-ppc64-install-compute -l
Set the node to a diskful install and run the installation:
#TRY THIS
$ nodeset junoltc03 osimage=rhels6.2-ppc64-install-compute
$ lsdef junoltc03 # check all node parameters are correctly set
...
$ rpower junoltc03 reset
# OR THIS
$ rinstall -O rhels6.2-ppc64-install-compute junoltc03
# to follow the installation process
$ rcons junoltc03 #run from another terminal
You need to generate the diskless image before proceeding with network booting. Review your chosen osimage definition:
$ lsdef -t osimage -o rhels6.2-ppc64-netboot-compute -l
Generate and pack your diskless image:
$ genimage rhels6.2-ppc64-netboot-compute
$ packimage rhels6.2-ppc64-netboot-compute
Set node to diskless install and network boot the node to load the image:
$ nodeset junoltc09 osimage=rhels6.2-ppc64-netboot-compute
$ lsdef junoltc09 # check all node parameters are correctly set
...
# NETWORK BOOT THE NODE:
$ rpower junoltc09 reset
# OR
$ rnetboot junoltc09
# to follow the installation process
$ rcons junoltc09 #run from another terminal
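Once a node comes up, a quick status check from the MN (nodestat reports states such as installing, netboot, or sshd):
$ nodestat cn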
Trouble with the TFTP server, console, remote installation, or configuration? See [Debugging_xCAT_Problems]
Complete guides:
Wiki: CSM_to_xCAT_Migration
Wiki: Debugging_xCAT_Problems
Wiki: Listing_and_Modifying_the_Database
Wiki: Using_Provmethod=osimagename
Wiki: XCAT_Overview,_Architecture,_and_Planning
Wiki: XCAT_pLinux_Clusters
Wiki: XCAT_pLinux_Clusters_775