
To install xCAT on RHEL or SLES, refer to https://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Cluster_Quick_Start/#prepare-the-management-node-for-xcat-installation
To install xCAT on Ubuntu, refer to https://sourceforge.net/p/xcat/wiki/Ubuntu_Quick_Start/#install-xcat
Hardware discovery is used to configure the FSP/BMC and to collect the hardware configuration information for a physical machine. This document uses the following configuration as an example:
machine type/model: 8247-22L
serial: 10112CA
IP address for the host: 10.1.101.1
IP address for the FSP/BMC: 10.2.101.1
password for the FSP/BMC: abc123
dynamic range for the service network (used for hosts): 10.1.100.1-10.1.100.100
dynamic range for the management network (used for FSP/BMCs): 10.2.100.1-10.2.100.100
NIC on the MN for the service network: eth1, 10.1.1.1/16
NIC on the MN for the management network: eth2, 10.2.1.1/16
Note: the management node needs NICs on both the management network and the service network.
The hardware discovery process is as follows:
Normally, after xCAT is installed there will be at least two entries in the "networks" table, one for each subnet on the MN. If not, run the following command to populate the "networks" table:
# makenetworks
To check the networks, use:
# tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,comments,disable
"10_1_0_0-255_255_0_0","10.1.0.0","255.255.0.0","eth1","<xcatmaster>",,"10.1.1.1",,,,,,,,,,,,
"10_2_0_0-255_255_0_0","10.2.0.0","255.255.0.0","eth2","<xcatmaster>",,"10.2.1.1",,,,,,,,,,,,
Set the NICs on which the DHCP server provides service:
# chdef -t site dhcpinterfaces=eth1,eth2
Update the DHCP configuration:
# makedhcp -n
# makedhcp -a
To set the external DNS forwarders and copy the hostname/IP pairs from /etc/hosts into the DNS on the MN:
# chdef -t site forwarders=1.2.3.4,1.2.5.6
The /etc/resolv.conf on the MN should point at the local DNS server:
search cluster
nameserver 10.1.1.1
# makedns -n
To configure the default passwords for FSP/BMCs, edit the passwd table:
# tabedit passwd
#key,username,password,cryptmethod,authdomain,comments,disable
"system","root","cluster",,,,
"ipmi",,"PASSW0RD",,,,
Note: at present, no username is supported for the FSP through IPMI.
The genesis packages are used to create the network boot root image; they must be installed before doing hardware discovery:
[RHEL]
# rpm -qa | grep genesis
xCAT-genesis-scripts-ppc64-xxxx.noarch
xCAT-genesis-base-ppc64-xxxx.noarch
[Ubuntu]
# dpkg -l | grep genesis
ii xcat-genesis-base-ppc64 2.9-xxxx all xCAT Genesis netboot image
ii xcat-genesis-scripts 2.9-xxxx ppc64el xCAT genesis
If the two packages are not installed yet, install them first, then run the following command to create the network boot root image:
# mknb ppc64
The dynamic ranges are used to assign temporary IP addresses to FSP/BMCs and hosts:
#chdef -t network 10_1_0_0-255_255_0_0 dynamicrange="10.1.100.1-10.1.100.100"
#chdef -t network 10_2_0_0-255_255_0_0 dynamicrange="10.2.100.1-10.2.100.100"
#makedhcp -n
#makedhcp -a
The attributes of the predefined nodes are planned by the system admin, who decides which IP address, BMC address, BMC password, and so on will be used for each host with a specific MTMS (machine type/model and serial number):
# nodeadd node[001-100] groups=pkvm,all
# chdef node001 mgt=ipmi cons=ipmi ip=10.1.101.1 bmc=10.2.101.1 netboot=petitboot bmcpassword=abc123 installnic=mac primarynic=mac
# chdef node001 mtm=8247-22L serial=10112CA
After the node and its IP address are defined, the admin needs to populate the /etc/hosts file from the node definitions:
# makehosts pkvm
Add the node/IP mappings into DNS:
# makedns -n
# makedns -a
The xCAT rcons command uses the conserver package to provide support for multiple read-only consoles on a single node and for console logging.
To add or remove nodes for conserver support:
# makeconservercf
# service conserver stop
# service conserver start
The FSP/BMCs are automatically powered on once the physical machine is powered on. Currently, SLP can be used to find all the FSPs for the Power8 LE hosts:
# lsslp -s PBMC -w
The discovered PBMC node will look like this:
# lsdef Server-8247-22L-SN10112CA
Object name: Server-8247-22L-SN10112CA
bmc=<fsp_ip1>,<fsp_ip2>
groups=pbmc,all
hidden=0
hwtype=pbmc
mgt=ipmi
mtm=8247-22L
nodetype=mp
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
serial=10112CA
If a specific FSP/BMC does not use the default password configured in the 'passwd' table, use the following command to set the password for that PBMC node:
# chdef Server-8247-22L-SN10112CA bmcpassword=<your_password>
Start the discovery for the predefined nodes, then power on the hosts:
# nodediscoverstart noderange=node[001-100]
# rpower pbmc on
After the hosts are powered on, the discovery process starts automatically. If you would like to monitor it, you can use:
# chdef pbmc cons=ipmi
# makeconservercf
# rcons Server-8247-22L-SN10112CA
After the discovery finishes, the hardware information will be updated on the predefined node:
# lsdef node001
Object name: node001
arch=ppc64
bmc=10.2.101.1
bmcpassword=abc123
cons=ipmi
cpucount=192
cputype=POWER8E (raw), altivec supported
groups=pkvm,all
installnic=mac
ip=10.1.101.1
mac=6c:ae:8b:02:12:50
memory=65118MB
mgt=ipmi
mtm=8247-22L
netboot=petitboot
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
primarynic=mac
serial=10112CA
statustime=10-15-2014 01:54:22
supportedarchs=ppc64
After all the MAC addresses are found and defined for the nodes, you can stop the discovery process:
# nodediscoverstop
The firmware update can be done during discovery or at a later time. The steps are:
[RHEL]
# rpm -i 01SV810_061_054.rpm --ignoreos
Then you will find the image file 01SV810_xxx_xxx.img under /tmp/fwupdate/.
[Ubuntu]
# apt-get install alien
# alien 01SVXXX_XXX_XXX.rpm    # generates a deb package like 01sv810xxx.deb
# dpkg -i 01svXXX_XXX_XXX*.deb
Then you will find the image file 01SV810_xxx_xxx.img under /tmp/fwupdate/:
# ls -lh /tmp/fwupdate/
total 197M
-rw-r--r-- 1 root root 197M Oct 10 08:10 01SV810_061_054.img
-rwxr-xr-x 1 root root  149 Oct 13 07:36 runme.sh
The runme.sh script runs the flash update:
# cat runme.sh
echo "================Start update"
/bin/update_flash -f ./01SV810_061_054.img
# chmod +x runme.sh
# tar -zcvf firmware-update.tgz .
./
./runme.sh
./01SV810_061_054.img
tar: .: file changed as we read it
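The "file changed as we read it" warning appears because the archive is being written inside the directory that tar is packing. A minimal sketch that avoids the warning (the /tmp/fwupdate path follows the example above) is to write the archive outside the directory and move it back:

```shell
# Write the tarball outside the directory being archived, then move it
# back in, so tar never sees its own output file change while reading.
mkdir -p /tmp/fwupdate
cd /tmp/fwupdate
tar -zcf /tmp/firmware-update.tgz .
mv /tmp/firmware-update.tgz .
```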
Option 1 - update during node discovery:
# chdef node001 chain="runcmd=bmcsetup,runimage=http://mgmtnode/install/firmware/firmware-update.tgz,shell"
Option 2 - update after node deployment:
If you are updating the firmware at a later time (i.e. not during the node discovery process), tell nodeset that you want to do the firmware update, and then set currchain to drop the nodes into a shell when they are done:
# nodeset node001 runimage=http://mgmtnode/install/firmware/firmware-update.tgz,boot
Option 3 - update with xCAT xdsh:
If the machine is up and running with an OS installed, you can use the following commands to update the firmware:
# xdcp node001 /tmp/fwupdate/01SV810_061_054.img /tmp/
# xdsh node001 "/usr/sbin/update_flash -f /tmp/01SV810_061_054.img"
# rpower node001 reset
Commit or reject the updated image after the machine is back up.
To commit:
# xdsh node001 "/usr/sbin/update_flash -c"
To reject:
# xdsh node001 "/usr/sbin/update_flash -r"
This is the process for setting up PowerKVM with xCAT:
copycds /iso/ibm-powerkvm-2.1.1.0-22.0-ppc64-gold-201410191558.iso
Currently, copycds only supports PowerKVM Release 2.1.1 Build 22 Gold; for other builds, use the -n option to specify the distro name:
copycds /iso/ibm-powerkvm-2.1.1.0-18.1-ppc64-gold-201410141637.iso -n pkvm2.1.1
To check the osimage object created by copycds, run the following:
lsdef -t osimage
pkvm2.1.1-ppc64-install-compute (osimage)
If you have done hardware discovery, it has already updated most of the attributes for the node; you only need to modify the following:
chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
The following steps are needed if you have not done hardware discovery:
mkdef node001 groups=all,kvm cons=ipmi mgt=ipmi
chdef node001 bmc=10.2.101.1 bmcpassword=abc123
chdef node001 mac=6c:ae:8b:02:12:50 installnic=mac primarynic=mac
chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
Note: in this case discovery is not used, so the MAC address must be obtained by the user.
Define the domain name for this cluster:
chdef -t site domain=cluster.com
Define the IP address for the node:
chdef node001 ip=10.1.101.1
makedns -n
Configure the DNS server; the resolv.conf file will be similar to:
cat /etc/resolv.conf
domain cluster.com
search cluster.com
nameserver 10.1.1.1
nslookup node001
Server: 10.1.1.1
Address: 10.1.1.1
Name: node001.cluster.com
Address: 10.1.101.1
chdef node001 netboot=petitboot
chdef node001 serialport=0 serialspeed=115200
makedhcp -n
makedhcp -a
nodeset node001 osimage=pkvm2.1.1-ppc64-install-compute
node001: install pkvm2.1.1-ppc64-compute
rsetboot node001 net
rpower node001 on/reset
After the PowerKVM host installation is done, you can see the bridge information like the following:
# brctl show
bridge name bridge id STP enabled interfaces
br0 8000.000000000000 no eth0
If you do not see the bridge, you probably did not use the xCAT post-install script. You can set it up quickly by running:
IPADDR=10.1.101.1/16
brctl addbr br0
brctl addif br0 eth0
brctl setfd br0 0
ip addr add dev br0 $IPADDR
ip link set br0 up
ip addr del dev eth0 $IPADDR
To install Ubuntu LE over the network, the VM needs Internet access during installation, so make sure the host can reach the Internet.
mkdef vm1 groups=vm,all
chdef vm1 vmhost=node001
chdef vm1 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
chdef vm1 ip=x.x.x.x
makehosts vm1
makedns -n
makedns -a
chdef vm1 mgt=kvm cons=kvm
chdef vm1 vmcpus=2 vmmemory=4096 vmnics=br0 vmnicnicmodel=virtio vmstorage=dir:///var/lib/libvirt/images/
Optional: set a VNC password so the installation can be monitored from Kimchi:
chtab node=vm1 vm.vidpassword=abc123
mkvm vm1 -s 20G
chdef vm1 serialport=0 serialspeed=115200
For more information about modifying VM attributes, refer to Define Virtual Machines attributes.
After you download the latest LE ISO, run the following command to create the osimage objects.
Ubuntu:
copycds trusty-server-ppc64el.iso
SLES:
copycds SLE-12-Server-DVD-ppc64le-GM-DVD1.iso
You can check that the /install/<os>/ppc64el directory has been created, and find the osimage objects with:
Ubuntu:
lsdef -t osimage
ubuntu14.04-ppc64el-install-compute (osimage)
ubuntu14.04-ppc64el-install-kvm (osimage)
ubuntu14.04-ppc64el-netboot-compute (osimage)
ubuntu14.04-ppc64el-statelite-compute (osimage)
SLES:
lsdef -t osimage
sles12-ppc64le-install-compute (osimage)
sles12-ppc64le-install-iscsi (osimage)
sles12-ppc64le-install-xen (osimage)
sles12-ppc64le-netboot-compute (osimage)
sles12-ppc64le-netboot-service (osimage)
sles12-ppc64le-stateful-mgmtnode (osimage)
sles12-ppc64le-statelite-compute (osimage)
sles12-ppc64le-statelite-service (osimage)
For Ubuntu, in order to boot from the network, you need to download mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/$(lsb_release -sc)/main/installer-ppc64el/current/images/netboot/", then mount mini.iso on a temporary directory:
mkdir /tmp/iso
mount -o loop mini.iso /tmp/iso
ls /tmp/iso/install
initrd.gz vmlinux
Then copy the file /tmp/iso/install/initrd.gz to /install/<ubuntu-version>/ppc64el/install/netboot:
mkdir -p /install/<ubuntu-version>/ppc64el/install/netboot
cp /tmp/iso/install/initrd.gz /install/<ubuntu-version>/ppc64el/install/netboot
Make sure the xCAT grub2 package has been installed on your management node:
rpm -qa | grep grub2
grub2-xcat-1.0-1.noarch
Note: if you are working with an xCAT-dep older than 20141012, the modules in the xCAT-shipped grub2 cannot support Ubuntu LE smoothly, so the following steps are needed to complete the grub2 setup:
rm /tftpboot/boot/grub2/grub2.ppc
cp /tftpboot/boot/grub2/powerpc-ieee1275/core.elf /tftpboot/boot/grub2/grub2.ppc
/bin/cp -rf /tmp/iso/boot/grub/powerpc-ieee1275/elf.mod /tftpboot/boot/grub2/powerpc-ieee1275/
Set the 'netboot' attribute to 'grub2':
chdef vm1 netboot=grub2
Configure the password for root:
chtab key=system passwd.username=root passwd.password=xxxxxx
Create the grub2 boot configuration file by running nodeset:
Ubuntu:
nodeset vm1 osimage=ubuntu14.04-ppc64el-install-compute
SLES:
nodeset vm1 osimage=sles12-ppc64le-install-compute
Send a hard reset to cycle the VM and start the OS installation:
rpower vm1 boot
On the PowerKVM host, make sure the firewalld service has been stopped:
chkconfig firewalld off
Note: Forwarding request to systemctl will disable firewalld.service.
rm /etc/systemd/system/basic.target.wants/firewalld.service
rm /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service
Then run wvid for vm1 on the MN:
wvid vm1
Alternatively, you can use Kimchi to monitor the installation process:
Open "https://<pkvm_ip>:8001" to open Kimchi.
Click the "connect" button under the "Actions" button and enter the password (abc123) you set with vm.vidpassword before running mkvm.
Then you will get the console.
To use the text console:
makeconservercf
rcons vm1
The steps below are used to provision Ubuntu LE on PowerNV.
copycds trusty-server-ppc64el.iso
lsdef -t osimage
ubuntu14.04-ppc64el-install-compute (osimage)
ubuntu14.04-ppc64el-install-kvm (osimage)
ubuntu14.04-ppc64el-netboot-compute (osimage)
ubuntu14.04-ppc64el-statelite-compute (osimage)
In order to boot from the network, you need to download mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/trusty/main/installer-ppc64el/current/images/netboot/" (for Ubuntu 14.10, use http://ports.ubuntu.com/ubuntu-ports/dists/utopic/main/installer-ppc64el/current/images/netboot/mini.iso), then mount mini.iso on a temporary directory:
mkdir /tmp/iso
mount -o loop mini.iso /tmp/iso
ls /tmp/iso/install
initrd.gz vmlinux
Then copy the file /tmp/iso/install/initrd.gz to /install/<ubuntu-version>/ppc64el/install/netboot:
mkdir -p /install/<ubuntu-version>/ppc64el/install/netboot
cp /tmp/iso/install/initrd.gz /install/<ubuntu-version>/ppc64el/install/netboot
If you have done hardware discovery, it has already updated most of the attributes for the node; you only need to modify the following:
chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
The following steps are needed if you have not done hardware discovery:
mkdef node001 groups=all,kvm cons=ipmi mgt=ipmi
chdef node001 bmc=10.2.101.1 bmcpassword=abc123
chdef node001 mac=6c:ae:8b:02:12:50 installnic=mac primarynic=mac
chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
Note: in this case discovery is not used, so the MAC address must be obtained by the user.
Define the domain name for this cluster:
chdef -t site domain=cluster.com
Define the IP address for the node:
chdef node001 ip=10.1.101.1
makedns -n
Configure the DNS server; the resolv.conf file will be similar to:
cat /etc/resolv.conf
domain cluster.com
search cluster.com
nameserver 10.1.1.1
nslookup node001
Server: 10.1.1.1
Address: 10.1.1.1
Name: node001.cluster.com
Address: 10.1.101.1
chdef node001 serialport=0 serialspeed=115200
nodeset node001 osimage=ubuntu14.04-ppc64el-install-compute
node001: install ubuntu14.04-ppc64el-compute
rsetboot node001 net
rpower node001 on/reset
IBM Power servers support energy management capabilities, such as querying and monitoring the power consumption and setting the power saving and capping values. xCAT offers the 'renergy' command to manipulate the energy-related features of Power servers. Refer to the renergy man page for details on usage.
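As an illustration (the node name follows the example above; the attribute names supported vary by platform, so check the renergy man page for your hardware), typical renergy invocations look like:
renergy node001 all
renergy node001 savingstatus cappingvalue
renergy node001 savingstatus=on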
The Ubuntu ISO used to install the compute nodes only includes the packages needed to run a base operating system, so users will likely need to install additional Ubuntu packages from the Internet Ubuntu repository or a local repository. This section describes how to install additional Ubuntu packages.
Note: the procedure for updating the Ubuntu kernel is slightly different from updating general Ubuntu packages. To update the kernel on the compute nodes, either add the specific kernel packages, such as "linux-image-extra-3.13.0-39-generic linux-headers-3.13.0-39-generic linux-headers-3.13.0-39 linux-generic linux-image-generic", to the otherpkglist file, or write a customized postscript that runs "apt-get -y --force-yes dist-upgrade" on the compute nodes.
Step 1: Specify the repository.
You can generate an Internet repository source list; refer to Ubuntu Sources List Generator. Use the Internet repository directly when defining the otherpkgdir attribute. For example, on Ubuntu 14.04:
chdef -t osimage <osimage name> otherpkgdir="http://ports.ubuntu.com/ubuntu-ports/ trusty main,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates main,http://ports.ubuntu.com/ubuntu-ports/ trusty universe,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates universe"
Step 2: Specify the otherpkglist file.
Create an otherpkglist file, for example /install/custom/install/ubuntu/compute.otherpkgs.pkglist, and add the package names to this file. Then set the otherpkglist attribute on the osimage object:
chdef -t osimage <osimage name> otherpkglist=/install/custom/install/ubuntu/compute.otherpkgs.pkglist
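As an illustration (the package names below are arbitrary examples, not anything xCAT requires), an otherpkglist file simply lists one Debian package name per line:
vim
nfs-common
build-essential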
Step 3: Use either (a) updatenode or (b) OS provisioning to install/update the packages on the compute nodes.
Option (a): the OS is already provisioned. Run "updatenode <nodename> -S" or "updatenode <nodename> -P otherpkgs".
Run updatenode -S to install/update the packages on the compute nodes:
updatenode <nodename> -S
Run updatenode otherpkgs to install/update the packages on the compute nodes:
updatenode <nodename> -P otherpkgs
Option (b): the OS is not provisioned.
Run rsetboot to instruct the nodes to boot from the network on the next boot:
rsetboot <nodename> net
The nodeset command tells xCAT what you want to do next with this node, and powering on the node starts the installation process:
nodeset <nodename> osimage=<osimage name>
rpower <nodename> boot
If the compute nodes cannot access the Internet, there are two ways to install additional packages: use an apt proxy, or use a local mirror.
Step 1: Install Squid on the server which can access the internet (Here uses management node as the proxy server)
apt-get install squid
Step 2: Edit the Squid configuration file /etc/squid3/squid.conf and find the line "#http_access deny to_localhost". Add the following two lines after it:
acl cn_apt src <compute node sub network>/<net mask length>
http_access allow cn_apt
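For the example cluster in this document, the compute nodes sit on the 10.1.0.0/16 service network, so these lines would be:
acl cn_apt src 10.1.0.0/16
http_access allow cn_apt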
For more information, refer to Squid configuring.
Step 3: Restart the proxy service
service squid3 restart
Step 4: Create a postscript named aptproxy under the /install/postscripts/ directory with the following content:
#!/bin/sh
PROXYSERVER=$1
if [ -z "$PROXYSERVER" ]; then
    PROXYSERVER=$MASTER
fi
PROXYPORT=$2
if [ -z "$PROXYPORT" ]; then
    PROXYPORT=3128
fi
if [ -e "/etc/apt/apt.conf" ]; then
    sed '/^Acquire::http::Proxy/d' /etc/apt/apt.conf > /etc/apt/apt.conf.new
    mv -f /etc/apt/apt.conf.new /etc/apt/apt.conf
fi
echo "Acquire::http::Proxy \"http://${PROXYSERVER}:${PROXYPORT}\";" >> /etc/apt/apt.conf
Step 5: Add this postscript to the compute nodes. The [proxy server ip] and [proxy server port] arguments are optional; if they are not specified, xCAT uses the management node IP and port 3128 by default.
chdef <node range> -p postscripts="aptproxy [proxy server ip] [proxy server port]"
Step 6: Edit the otherpkglist file, for example /install/custom/install/ubuntu/compute.otherpkgs.pkglist, and add the package names to this file. Then set the otherpkglist attribute on the osimage object:
chdef -t osimage <osimage name> otherpkglist=/install/custom/install/ubuntu/compute.otherpkgs.pkglist
Step 7: Set the otherpkgdir attribute on the osimage object; the Internet repositories can be used directly. For example, on Ubuntu 14.04:
chdef -t osimage <osimage name> otherpkgdir="http://ports.ubuntu.com/ubuntu-ports/ trusty main,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates main,http://ports.ubuntu.com/ubuntu-ports/ trusty universe,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates universe"
Step 8:
If the OS is not provisioned, run the rsetboot, nodeset, and rpower commands to provision the compute nodes:
rsetboot <nodename> net
nodeset <nodename> osimage=<osimage name>
rpower <nodename> boot
If the OS is already provisioned, run "updatenode <nodename> -S" or "updatenode <nodename> -P otherpkgs".
Run updatenode -S to install/update the packages on the compute nodes
updatenode <nodename> -S
Run updatenode otherpkgs to install/update the packages on the compute nodes
updatenode <nodename> -P otherpkgs
Find a server that can connect to the Internet and can be accessed by the compute nodes.
Step 1: Install apt-mirror:
apt-get install apt-mirror
Step 2: Configure apt-mirror; refer to Rsyncmirror:
vim /etc/apt/mirror.list
Step 3: Run apt-mirror to download the repositories (the needed space can be found in Ubuntu Mirrors):
apt-mirror /etc/apt/mirror.list
Step 4: Install Apache:
apt-get install apache2
Step 5: Set up a symlink from the local repository folder to the shared Apache directory:
ln -s /var/spool/apt-mirror/mirror/archive.ubuntu.com /var/www/archive-ubuntu
When setting the otherpkgdir attribute for the osimages, you can use http://<local mirror server ip>/archive-ubuntu/ precise main.
For more information about setting up a local repository mirror, refer to How to Setup Local Repository Mirror.
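As a concrete example (assuming the management node at 10.1.1.1 serves the mirror over Apache and the compute nodes run trusty; substitute your mirror's IP and release codename):
chdef -t osimage <osimage name> otherpkgdir="http://10.1.1.1/archive-ubuntu/ trusty main"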