XCAT_P8LE_Hardware_Management

Install xCAT on Management Node

To install xCAT on RHEL or SLES, please refer to https://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Cluster_Quick_Start/#prepare-the-management-node-for-xcat-installation
To install xCAT on Ubuntu, please refer to https://sourceforge.net/p/xcat/wiki/Ubuntu_Quick_Start/#install-xcat

MTMS based Hardware discovery for P8 LE machines

Hardware discovery is used to configure the FSP/BMC and to collect hardware configuration information for the physical machine. This document uses the following configuration as an example:

machine type/model: 8247-22L
serial: 10112CA
ip address for host: 10.1.101.1
ip address for FSP/BMC:10.2.101.1
password for FSP/BMC: abc123
the dynamic range for the service network (used for hosts): 10.1.100.1-10.1.100.100
the dynamic range for the management network (used for FSP/BMCs): 10.2.100.1-10.2.100.100
the nic information on MN for service network: eth1, 10.1.1.1/16
the nic information on MN for management network: eth2, 10.2.1.1/16

Note: the management node needs NICs for both the management network and the service network.
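As a sanity check on a plan like this, the subnet membership arithmetic can be sketched in plain shell (the helper function and addresses below are illustrative, not xCAT commands):

```shell
#!/bin/sh
# Sketch (illustrative, not part of xCAT): verify that a planned host IP
# falls inside its /16 network, using the example plan above.
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}
net=$(ip_to_int 10.1.0.0)       # service network
mask=$(ip_to_int 255.255.0.0)   # /16 netmask
host=$(ip_to_int 10.1.101.1)    # planned host address
if [ $(( host & mask )) -eq "$net" ]; then in_net=yes; else in_net=no; fi
echo "10.1.101.1 in 10.1.0.0/16: $in_net"
```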

The hardware discovery process will be:

Configure xCAT

configure network table

Normally, after xCAT is installed, there will be at least two entries in the "networks" table, one for each of the two subnets on the MN. If not, please run the following command to add the networks:

#makenetworks

To check the networks, use:

# tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,comments,disable
"10_1_0_0-255_255_0_0","10.1.0.0","255.255.0.0","eth1","<xcatmaster>",,"10.1.1.1",,,,,,,,,,,,
"10_2_0_0-255_255_0_0","10.2.0.0","255.255.0.0","eth2","<xcatmaster>",,"10.2.1.1",,,,,,,,,,,,

setup DHCP

Set the NICs from which the DHCP server provides service:

#chdef -t site dhcpinterfaces=eth1,eth2

Update DHCP configuration file:

#makedhcp -n
#makedhcp -a

setup DNS

To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:

  • Ensure that /etc/sysconfig/named does not have ROOTDIR set
  • Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can’t answer to these servers.
#chdef -t site forwarders=1.2.3.4,1.2.5.6
  • Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above.)
search cluster
nameserver 10.1.1.1
  • Run makedns
#makedns -n

config passwd table

To configure the default password for FSP/BMCs:

#tabedit passwd
#key,username,password,cryptmethod,authdomain,comments,disable
"system","root","cluster",,,,
"ipmi",,"PASSW0RD",,,,

Note: At present, no username is supported for FSP through IPMI.

Check the genesis packages:

The genesis packages are used to create the network boot root image; they must be installed before doing hardware discovery:

[RH]
#rpm -qa | grep genesis
xCAT-genesis-scripts-ppc64-xxxx.noarch
xCAT-genesis-base-ppc64-xxxx.noarch

[Ubuntu]
# dpkg -l | grep genesis
ii  xcat-genesis-base-ppc64             2.9-xxxx          all          xCAT Genesis netboot image
ii  xcat-genesis-scripts                2.9-xxxx          ppc64el      xCAT genesis

If the two packages are not installed yet, please install them first, then run the following command to create the network boot root image.

#mknb ppc64

Predefine node

Declare a dynamic range of addresses for discovery

The dynamic ranges are used to assign temporary IP addresses to FSP/BMCs and hosts during discovery:

#chdef -t network 10_1_0_0-255_255_0_0 dynamicrange="10.1.100.1-10.1.100.100"
#chdef -t network 10_2_0_0-255_255_0_0 dynamicrange="10.2.100.1-10.2.100.100"
#makedhcp -n
#makedhcp -a

Predefine node for discovering

The attributes for the predefined nodes are planned by the system admin, who decides which IP address, BMC address, BMC password, and so on will be used for the host with a specific machine type/model and serial (MTMS).

#nodeadd node[001-100] groups=pkvm,all
#chdef node001 mgt=ipmi cons=ipmi ip=10.1.101.1 bmc=10.2.101.1 netboot=petitboot bmcpassword=abc123 installnic=mac primarynic=mac
#chdef node001 mtm=8247-22L serial=10112CA
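Planning at this scale is easy to script. A minimal sketch (not an xCAT tool; the 10.1.101.x/10.2.101.x numbering is just the example plan above) that prints the chdef commands for a few nodes:

```shell
#!/bin/sh
# Sketch: emit predefine commands for a handful of nodes.
# The IP numbering scheme here is an illustrative assumption.
for n in 1 2 3 4; do
    node=$(printf "node%03d" "$n")
    echo "chdef $node mgt=ipmi cons=ipmi ip=10.1.101.$n bmc=10.2.101.$n netboot=petitboot"
done
```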

Setup /etc/hosts and DNS

After the node and its IP address are defined, the admin needs to create the /etc/hosts entries from the node definitions:

#makehosts pkvm

Add the node/ip mapping into DNS:

#makedns -n
#makedns -a

Configure conserver

The xCAT rcons command uses the conserver package to provide support for multiple read-only consoles on a single node and for console logging.
To add or remove nodes for conserver support:

#makeconservercf
#service conserver stop
#service conserver start

Discover node

discover FSP/BMCs

The FSP/BMCs will be powered on automatically once the physical machine is powered on. Currently, we can use SLP to find all the FSPs for the P8 LE hosts:

#lsslp -s PBMC -w

The resulting PBMC node definition will look like this:

# lsdef Server-8247-22L-SN01112CA
Object name: Server-8247-22L-SN01112CA
 bmc=<fsp_ip1>,<fsp_ip2>
 groups=pbmc,all
 hidden=0
 hwtype=pbmc
 mgt=ipmi
 mtm=8247-22L
 nodetype=mp
 postbootscripts=otherpkgs
 postscripts=syslog,remoteshell,syncfiles
 serial=10112CA

If you know that a specific FSP/BMC does not use the default password configured in the 'passwd' table, please use the following command to set the password for that PBMC node:

#chdef Server-8247-22L-SN01112CA bmcpassword=<your_password>

power on the hosts

#rpower pbmc on

discover the nodes

After the hosts are powered on, the discovery process will start automatically. If you'd like to monitor the discovery process, you can use:

#chdef pbmc cons=ipmi
#makeconservercf
#rcons Server-8247-22L-SN01112CA

After the discovery finishes, the hardware information is written to the predefined node:

#lsdef node001
Object name: node001
arch=ppc64
bmc=10.2.101.1
bmcpassword=abc123
cons=ipmi
cpucount=192
cputype=POWER8E (raw), altivec supported
groups=pkvm,all
installnic=mac
ip=10.1.101.1
mac=6c:ae:8b:02:12:50
memory=65118MB
mgt=ipmi
mtm=8247-22L
netboot=petitboot
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
primarynic=mac
serial=10112CA
statustime=10-15-2014 01:54:22
supportedarchs=ppc64

Switch based hardware discovery for P8 LE machines

Switch-based hardware discovery is a little different from MTMS-based hardware discovery. The admin needs to specify the switch and switch port to which each host is connected in the node definition (see the "Predefine node" section above).

Define switch information

#nodeadd switch[1-4] groups=switch,all  
#chdef switch1 ip=172.10.0.1
#tabch switch=switch1 switches.snmpversion=3 switches.username=xcat switches.password=passw0rd switches.auth=sha

or use the following commands to define a group of switches:

#chdef -t group -o switch ip="|\D+(\d+)|172.10.0.(\$1+0)|"
#tabch switch=switch switches.snmpversion=3 switches.username=xcat switches.password=passw0rd switches.auth=sha

Define node switch:port information.

#chdef node001 switch=switch1 switchport=1

or use the following commands to add switch information to a group of nodes:

# chdef -t group -o pkvm switch="|\D+(\d+).*|switch((((\$1-1)/40)+1))|" switchport="|\D+(\d+).*|((((\$1-1)%40)+1))|" 
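The regular expressions above map a node's numeric suffix to a switch and port, assuming 40 ports per switch. The arithmetic they encode can be checked with a small shell sketch (illustrative only):

```shell
#!/bin/sh
# Sketch of the arithmetic in the group-level regex above:
#   switch number = ((n-1)/40)+1,  port = ((n-1)%40)+1
n=41    # numeric suffix of node041
switch=$(( ((n - 1) / 40) + 1 ))
port=$(( ((n - 1) % 40) + 1 ))
echo "node041 -> switch$switch port $port"
```

So node041, the first node on the second switch, maps to switch2 port 1.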

Make sure your switch is configured correctly with snmp v3.

For more information about switch-based discovery and configuring the switches, please refer to the "Configure Ethernet Switches" section in https://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Cluster_Quick_Start/#create-a-separate-file-system-for-install-optional

After the switch information is updated, go back to the "Discover node" section above to continue node discovery. Switch-based discovery does not discover FSP/BMCs, only hosts: simply rpower reset the hosts and the discovery process will start automatically. The host information will be updated after discovery.

Sequential based hardware discovery for P8 LE machines

This is a simple approach in which you give xCAT a range of node names to be given to the discovered nodes, and then you power the nodes on sequentially (usually in physical order), and each node is given the next node name in the noderange.

Initialize the discovery process

#nodediscoverstart noderange=node[001-010]

Power on the hosts sequentially

If the order in which hosts are powered on matters, the admin needs to power on the hosts manually one by one; otherwise, you can use the following command to power on a host:

#rpower Server-8247-22L-SN01112CA on

Display information about the discovery process

There are additional nodediscover commands you can run during the discovery process. See their man pages for more details.

#nodediscoverstatus

Show the nodes that have been discovered so far:

#nodediscoverls -t seq -l

Stop the current sequential discovery process:

#nodediscoverstop

Note: The sequential discovery process will be stopped automatically when all of the node names in the node name pool are used up.

Firmware updating for P8 LE machine

The firmware updating process can be done during discovery or at a later time. The steps are:

  1. Download the firmware file from the IBM Support Portal (www.ibm.com). The firmware file name looks like: 01SVXXX_XXX_XXX.rpm
  2. Extract the firmware img file from the rpm file.
    [RH or SLES]
    # rpm -i 01SV810_061_054.rpm --ignoreos
    Then, you will find the image file 01SV810_xxx_xxx.img under /tmp/fwupdate/
    [Ubuntu]
    # apt-get install alien
    # alien 01SVXXX_XXX_XXX.rpm               #It will generate a deb pkg like 01sv810xxx.deb
    # dpkg -i 01svXXX_XXX_XXX*.deb
    Then, you will find the image file 01SV810_xxx_xxx.img under /tmp/fwupdate/
  3. Put the following into a tarball:
    • the firmware img file extracted from the rpm package
    • a runme.sh script that you create to run update_flash with the appropriate flags
    • For example:
      #cd /install/firmware/
      #ls -lh
total 197M
-rw-r--r-- 1 root root 197M Oct 10 08:10 01SV810_061_054.img
-rwxr-xr-x 1 root root  149 Oct 13 07:36 runme.sh
      #cat runme.sh
echo "================Start update"
/bin/update_flash -f ./01SV810_061_054.img
      # chmod +x runme.sh
      # tar -zcvf firmware-update.tgz .
./
./runme.sh
./01SV810_061_054.img
tar: .: file changed as we read it
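The "file changed as we read it" warning appears because tar is writing the archive into the directory it is reading. One way to avoid it is to write the tarball one level up; a sketch (using a placeholder image file and /tmp paths for illustration):

```shell
#!/bin/sh
# Sketch: build the firmware-update tarball outside the source directory
# so tar does not warn "file changed as we read it".
# The image file here is an empty placeholder for illustration.
mkdir -p /tmp/fwbuild && cd /tmp/fwbuild
cat > runme.sh <<'EOF'
echo "================Start update"
/bin/update_flash -f ./01SV810_061_054.img
EOF
chmod +x runme.sh
touch 01SV810_061_054.img
tar -zcf ../firmware-update.tgz .   # write the archive one level up
tar -ztf ../firmware-update.tgz     # list the archive contents
```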
  4. Start the firmware update:
    • Option 1 - update during discovery:
      If you want to update the firmware during the node discovery process, ensure you have already added a dynamic range to the networks table and run "makedhcp -n". Then update the chain table to do both bmcsetup and the firmware update:

#chdef node001 chain="runcmd=bmcsetup,runimage=http://mgmtnode/install/firmware/firmware-update.tgz,shell"

    • Option 2 - update after node deployment:
      If you are updating the firmware at a later time (i.e. not during the node discovery process), tell nodeset that you want to do the firmware update, and then set currchain to drop the nodes into a shell when they are done:

#nodeset node001 runimage=http://mgmtnode/install/firmware/firmware-update.tgz,boot

    • Option 3 - update with xCAT xdsh:
      If the machine is up and running with an OS installed, you can use the following commands to update the firmware:

#xdcp node001 /tmp/fwupdate/01SV810_061_054.img /tmp/
#xdsh node001 "/usr/sbin/update_flash -f /tmp/01SV810_061_054.img"
#rpower node001 reset

  5. Commit or reject the updated image after the machine is up and running.

    • To commit:
#xdsh node001 "/usr/sbin/update_flash -c"
    • To reject:
#xdsh node001 "/usr/sbin/update_flash -r"
  6. Check the firmware level.

    If the machine is up and running, you can use the "lsmcode" command to list firmware levels.
    For Ubuntu, the deb package "lsvpd" needs to be installed in order to use "lsmcode".

Provisioning OS for powerKVM and VMs

Provisioning PowerKVM for p8 LE machine

This is the process for setting up PowerKVM with xCAT.

create the osimage object

 copycds /iso/ibm-powerkvm-2.1.1.0-22.0-ppc64-gold-201410191558.iso

Currently, copycds only supports PowerKVM Release 2.1.1 Build 22 Gold out of the box; for other builds, you need to use the -n option to specify the distro name:

 copycds /iso/ibm-powerkvm-2.1.1.0-18.1-ppc64-gold-201410141637.iso -n pkvm2.1.1

To check the osimage object created by copycds, run the following:

   lsdef -t osimage
   pkvm2.1.1-ppc64-install-compute (osimage)

Define the node object

Option 1: use node object updated by hardware discovery process

The hardware discovery process has already filled in most of the attributes for the node; you just need to set the following:

   chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
Option 2: define the node object and its attribute by yourself

The following steps are needed if you have not done hardware discovery.

Define node
   mkdef node001 groups=all,kvm cons=ipmi mgt=ipmi
   chdef node001 bmc=10.2.101.1 bmcpassword=abc123
   chdef node001 mac=6c:ae:8b:02:12:50 installnic=mac primarynic=mac
   chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1

Note: discovery is not used here, so the MAC address must be supplied by the user.

Setup dns

define the name domain for this cluster

  chdef -t site domain=cluster.com

Define IP address for the node

  chdef node001 ip=10.1.101.1
  makedns -n

Configure DNS; the resolv.conf file will be similar to:

cat /etc/resolv.conf
domain cluster.com
search cluster.com
nameserver 10.1.1.1

Check the DNS setup:

nslookup node001
Server: 10.1.1.1
Address: 10.1.1.1

Name: node001.cluster.com
Address: 10.1.101.1

Prepare the petitboot, console and dhcpd configurations

  chdef node001 netboot=petitboot
  chdef node001 serialport=0 serialspeed=115200
  makedhcp -n
  makedhcp -a
  nodeset node001 osimage=pkvm2.1.1-ppc64-install-compute
  node001: install pkvm2.1.1-ppc64-compute
  rsetboot node001 net

Reboot the node to start the provisioning process

rpower node001 on/reset

Check bridge setting after installation finished

After the PowerKVM host installation is done, you should see bridge information like the following:

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.000000000000       no              eth0

If you don't see that, you probably did not use the xCAT post-install script. You can set it up quickly by running:

    IPADDR=10.1.101.1/16
    brctl addbr br0
    brctl addif br0 eth0
    brctl setfd br0 0
    ip addr add dev br0 $IPADDR
    ip link set br0 up
    ip addr del dev eth0 $IPADDR

Make sure the PowerKVM host is able to connect to the Internet

To install Ubuntu LE over the network, the VM needs Internet access during installation, so please make sure the host is able to reach the Internet.

Steps to install Ubuntu 14.x LE or SLES 12 LE on a PowerKVM VM over the network

Define node "vm1" as a normal vm node:

  mkdef vm1 groups=vm,all
  chdef vm1 vmhost=node001
  chdef vm1 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1
  chdef vm1 ip=x.x.x.x
  makehosts vm1
  makedns -n
  makedns -a

create VM

   chdef vm1 mgt=kvm cons=kvm
   chdef vm1 vmcpus=2 vmmemory=4096 vmnics=br0 vmnicnicmodel=virtio vmstorage=dir:///var/lib/libvirt/images/
   optional (to monitor the installation process from Kimchi):

   chtab node=vm1 vm.vidpassword=abc123

   mkvm vm1 -s 20G

Define the console attributes for VM

   chdef vm1 serialport=0 serialspeed=115200

For more information about modifying VM attributes, please refer to the "Define Virtual Machines attributes" documentation.

Create LE osimage object

After you download the latest LE ISO, please run the following command to create the osimage objects.

  Ubuntu:
  copycds trusty-server-ppc64el.iso
  SLES:
  copycds SLE-12-Server-DVD-ppc64le-GM-DVD1.iso

You can check that the /install/<os>/ppc64el directory has been created, and you can list the osimage objects with:

   Ubuntu:
   lsdef -t osimage
   ubuntu14.04-ppc64el-install-compute  (osimage)
   ubuntu14.04-ppc64el-install-kvm  (osimage)
   ubuntu14.04-ppc64el-netboot-compute  (osimage)
   ubuntu14.04-ppc64el-statelite-compute  (osimage)
   SLES:
   lsdef -t osimage
   sles12-ppc64le-install-compute  (osimage)
   sles12-ppc64le-install-iscsi  (osimage)
   sles12-ppc64le-install-xen  (osimage)
   sles12-ppc64le-netboot-compute  (osimage)
   sles12-ppc64le-netboot-service  (osimage)
   sles12-ppc64le-stateful-mgmtnode  (osimage)
   sles12-ppc64le-statelite-compute  (osimage)
   sles12-ppc64le-statelite-service  (osimage)

For Ubuntu, in order to boot from the network, you need to download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/$(lsb_release -sc)/main/installer-ppc64el/current/images/netboot/", then mount the mini.iso on a temporary directory.
For Ubuntu 14.04.2, please download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/trusty-updates/main/installer-ppc64el/current/images/utopic-netboot/".
For Ubuntu 14.04.3, please download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/trusty-updates/main/installer-ppc64el/current/images/vivid-netboot/".

  mkdir /tmp/iso
  mount -o loop mini.iso /tmp/iso
  ls /tmp/iso/install
  initrd.gz  vmlinux

Then, copy the file /tmp/iso/install/initrd.gz to /install/<ubuntu-version>/ppc64el/install/netboot:

  mkdir -p  /install/<ubuntu-version>/ppc64el/install/netboot
  cp  /tmp/iso/install/initrd.gz  /install/<ubuntu-version>/ppc64el/install/netboot

Prepare grub2 and dhcpd configurations

Make sure grub2 has been installed on your management node:

  rpm -qa | grep grub2
  grub2-xcat-1.0-1.noarch

Note: If you are working with an xCAT-dep older than 20141012, the modules in the xCAT-shipped grub2 cannot support Ubuntu LE smoothly, so the following steps are needed to complete the grub2 setup:

  rm /tftpboot/boot/grub2/grub2.ppc
  cp /tftpboot/boot/grub2/powerpc-ieee1275/core.elf /tftpboot/boot/grub2/grub2.ppc
  /bin/cp -rf /tmp/iso/boot/grub/powerpc-ieee1275/elf.mod /tftpboot/boot/grub2/powerpc-ieee1275/

Set 'netboot' attribute to 'grub2'

  chdef vm1 netboot=grub2

Configure the password for root:

chtab key=system passwd.username=root passwd.password=xxxxxx

Create grub2 boot configuration file by running nodeset:

Ubuntu:
nodeset vm1 osimage=ubuntu14.04-ppc64el-install-compute
SLES:
nodeset vm1 osimage=sles12-ppc64le-install-compute

Send a hard reset to cycle the VM to start OS installation

rpower vm1 boot

Use console to monitor the installing process

On the PowerKVM host, please make sure the firewalld service has been stopped:

chkconfig firewalld off

Note: Forwarding request to systemctl will disable firewalld.service.

  rm /etc/systemd/system/basic.target.wants/firewalld.service 
  rm /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service

Then, run wvid on the MN:

wvid vm1

Alternatively, you can use Kimchi to monitor the installation process:

  • Open "https://<pkvm_ip>:8001" to open Kimchi.
  • Click the "Connect" button under the "Actions" button and enter the password (abc123) you set before running mkvm.
  • Then you can get the console.

To use the text console

makeconservercf
rcons vm1

Provisioning Ubuntu 14.x for powerNV

The steps below are used to provision Ubuntu LE for PowerNV.

Create ubuntu LE osimage object

copycds trusty-server-ppc64el.iso
lsdef -t osimage
ubuntu14.04-ppc64el-install-compute (osimage)
ubuntu14.04-ppc64el-install-kvm (osimage)
ubuntu14.04-ppc64el-netboot-compute (osimage)
ubuntu14.04-ppc64el-statelite-compute (osimage)

In order to boot from the network, you need to download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/$(lsb_release -sc)/main/installer-ppc64el/current/images/netboot/", then mount the mini.iso on a temporary directory.
For Ubuntu 14.04.2, please download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/trusty-updates/main/installer-ppc64el/current/images/utopic-netboot/".
For Ubuntu 14.04.3, please download the mini.iso from "http://ports.ubuntu.com/ubuntu-ports/dists/trusty-updates/main/installer-ppc64el/current/images/vivid-netboot/".

  mkdir /tmp/iso
  mount -o loop mini.iso /tmp/iso
  ls /tmp/iso/install
  initrd.gz  vmlinux

Then, copy the file /tmp/iso/install/initrd.gz to /install/<ubuntu-version>/ppc64el/install/netboot:

  mkdir -p  /install/<ubuntu-version>/ppc64el/install/netboot
  cp  /tmp/iso/install/initrd.gz  /install/<ubuntu-version>/ppc64el/install/netboot

Define the powerNV object

Option 1: use node object updated by hardware discovery process

The hardware discovery process has already filled in most of the attributes for the node; you just need to set the following:

   chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1

Option 2: define the node object and its attribute by yourself

The following steps are needed if you have not done hardware discovery.

Define node
   mkdef node001 groups=all,kvm cons=ipmi mgt=ipmi
   chdef node001 bmc=10.2.101.1 bmcpassword=abc123
   chdef node001 mac=6c:ae:8b:02:12:50 installnic=mac primarynic=mac
   chdef node001 tftpserver=10.1.1.1 conserver=10.1.1.1 nfsserver=10.1.1.1

Note: discovery is not used here, so the MAC address must be supplied by the user.

Setup dns

define the name domain for this cluster

  chdef -t site domain=cluster.com

Define IP address for the node

  chdef node001 ip=10.1.101.1
  makedns -n

Configure DNS; the resolv.conf file will be similar to:

cat /etc/resolv.conf
domain cluster.com
search cluster.com
nameserver 10.1.1.1

Check the DNS setup:

nslookup node001
Server: 10.1.1.1
Address: 10.1.1.1

Name: node001.cluster.com
Address: 10.1.101.1

Prepare the petitboot, console and dhcpd configurations

chdef node001 netboot=petitboot
chdef node001 serialport=0 serialspeed=115200
makedhcp -n
makedhcp -a
nodeset node001 osimage=ubuntu14.04-ppc64el-install-compute

Configure node boot from network

rsetboot node001 net

Reboot the node to start the provisioning process

rpower node001 on/reset

Energy Management

IBM Power servers support energy management capabilities. You can query and monitor the following:

  • Power Saving Status
  • Power Capping Status
  • Power Consumption
  • CPU Frequency
  • Ambient temperature
  • Fan Speed
  • ...

and you can set the following:

  • Power Saving
  • Power Capping
  • CPU frequency

xCAT offers the 'renergy' command to manage the energy-related features of Power servers. Refer to the renergy man page for usage details.

Configure Raid Arrays

In many new compute machines, the disks have been formatted to a RAID-oriented format at the factory, so the admin must create RAID arrays for the disks manually before using them. Configuring RAID in an unattended way for hundreds of machines then becomes a problem. In xCAT, you can use the runimage facility to configure RAID during the hardware discovery procedure or in separate manual steps.

The following example shows how to use the runimage facility manually to create a RAID 0 array for each disk in a Power OPAL machine. You can customize the runme.sh script for your actual requirements. For x86 machines, you can work out a similar runme.sh script for your case.

IBM offers the iprconfig tool to configure RAID on IBM Power machines. xCAT includes this tool in genesis.ppc64 so that the admin can use genesis.ppc64 to configure RAID.

runimage is a facility that runs in genesis to copy and run an image from the xCAT management node. We put the RAID configuration logic into 'runme.sh' and then use the 'runimage' mechanism to execute it.

Steps to Set Up Runimage

mkdir -p /install/runimage
cd /install/runimage
edit ./runme.sh with your raid configuration script
chmod 755 runme.sh
tar cvf setupipr.tar .
nodeset <node> runimage=http://<MN IP>/install/runimage/setupipr.tar,shell
rsetboot <node> net
rpower <node> on

The Example of runme.sh for IBM Power Raid Setup

This example script finds all disks, formats them to the RAID-oriented format, and creates a RAID 0 array for each disk.

#!/bin/bash
# IBM(c) 2015 EPL license http://www.eclipse.org/legal/epl-v10.html

# Get all disks which are still in JBOD format
jbod_disks=`iprconfig -c show-jbod-disks | grep "^sd[a-z]\+.*Active" | awk '{print $1}' | sort`
disklist=`echo $jbod_disks`

# Format all the JBOD disks to 'Advanced Function' format
echo "Format disk [$disklist] to Advanced Function to be ready for raid array creation."
iprconfig -c format-for-raid $disklist

# Get all available IOAs
ioas=`iprconfig -c show-ioas | grep "^sg[0-9]\+.*Operational" | awk '{print $1}' | sort`

# Exit with an error if there's no available IOA
if [ -z "$ioas" ]; then
    echo "Error: No available IOA found."
    exit 1
fi

for ioa in $ioas; do
    # Figure out all available disks for the IOA
    disks=`iprconfig -c query-raid-create $ioa | grep "^sg[0-9]\+.*Active" | awk '{print $1}'`

    # Create arrays: a raid 0 array for each active disk
    for disk in $disks; do
        echo "Create raid 0 array with device /dev/$disk."
        iprconfig -c raid-create -r 0 /dev/$disk >/dev/null
    done
done

Monitor the Raid Configuration Process

The process of formatting a disk to the 'Advanced Function' (RAID-oriented) format can take tens of minutes or even hours. During this period, you can use the xCAT xdsh command to monitor the progress of the RAID configuration.

Below is the output of the iprconfig command displaying the RAID configuration for node p8le-42l. You can parse this output to monitor the process automatically.

#xdsh p8le-42l iprconfig -c show-config
p8le-42l: Name   PCI/SCSI Location         Description               Status
p8le-42l: ------ ------------------------- ------------------------- -----------------
p8le-42l:        0001:08:00.0/0:            PCI-E SAS RAID Adapter    Operational
p8le-42l:        0001:08:00.0/0:0:0:0       Advanced Function Disk    Active
p8le-42l: sda    0001:08:00.0/0:0:12:0      Physical Disk             5% Formatted
p8le-42l: sdb    0001:08:00.0/0:0:13:0      Physical Disk             5% Formatted
p8le-42l: sdc    0001:08:00.0/0:0:14:0      Physical Disk             5% Formatted
p8le-42l:        0001:08:00.0/0:0:1:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:2:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:3:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:4:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:5:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:6:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:7:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:8:0       Advanced Function Disk    Active
p8le-42l:        0001:08:00.0/0:0:10:0      Enclosure                 Active
p8le-42l:        0001:08:00.0/0:0:9:0       Enclosure                 Active
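The "% Formatted" lines in that output are easy to extract mechanically. A sketch that parses a captured listing (the sample text is copied from the example above; in practice you would capture the xdsh output instead) and counts disks still formatting:

```shell
#!/bin/sh
# Sketch: count disks still formatting in captured iprconfig output.
# The sample lines are copied from the example listing above.
cat > /tmp/iprconfig.out <<'EOF'
sda    0001:08:00.0/0:0:12:0      Physical Disk             5% Formatted
sdb    0001:08:00.0/0:0:13:0      Physical Disk             5% Formatted
sdc    0001:08:00.0/0:0:14:0      Physical Disk             5% Formatted
       0001:08:00.0/0:0:1:0       Advanced Function Disk    Active
EOF
formatting=$(grep -c "Formatted" /tmp/iprconfig.out)
echo "$formatting disk(s) still formatting"
```

When the count reaches zero, all disks have finished formatting and the raid-create step can proceed.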

Appendix A: Installing other packages with Ubuntu official mirror

The Ubuntu ISO used to install the compute nodes only includes the packages needed to run a base operating system, so users will likely need to install additional Ubuntu packages from the Internet Ubuntu repository or a local repository. This section describes how to install additional Ubuntu packages.

Note: the procedure for updating the Ubuntu kernel is a little different from updating general Ubuntu packages. If you need to update the Ubuntu kernel on the compute nodes, either add the specific kernel packages (such as "linux-image-extra-3.13.0-39-generic linux-headers-3.13.0-39-generic linux-headers-3.13.0-39 linux-generic linux-image-generic") to the otherpkglist file, or write a customized postscript that runs "apt-get -y --force-yes dist-upgrade" on the compute nodes.

A1: Compute nodes can access the internet

step1: Specify the repository

You can generate an Internet repository source list with the Ubuntu Sources List Generator, then use the Internet repository directly when defining the otherpkgdir attribute. For example, on Ubuntu 14.04:

    chdef -t osimage <osimage name> otherpkgdir="http://ports.ubuntu.com/ubuntu-ports/ trusty main,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates main,http://ports.ubuntu.com/ubuntu-ports/ trusty universe,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates universe"

step2: Specify the otherpkglist file

Create an otherpkglist file, for example /install/custom/install/ubuntu/compute.otherpkgs.pkglist, and add the package names to this file. Then set the otherpkglist attribute on the osimage object:

    chdef -t osimage <osimage name> otherpkglist=/install/custom/install/ubuntu/compute.otherpkgs.pkglist
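As an illustration, the otherpkglist file simply lists one package name per line; the package choices here are hypothetical:

```
vim
ntp
gcc
```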

step3: Use either (a) updatenode or (b) OS provisioning to install/update the packages on the compute nodes.

Option (a): If the OS is already provisioned, run "updatenode <nodename> -S" or "updatenode <nodename> -P otherpkgs":

Run updatenode -S to install/update the packages on the compute nodes:
    updatenode <nodename> -S
Run updatenode -P otherpkgs to install/update the packages on the compute nodes:
    updatenode <nodename> -P otherpkgs

Option (b): If the OS is not provisioned, run rsetboot to instruct the nodes to boot from the network on the next boot:

    rsetboot <nodename> net

The nodeset command tells xCAT what you want to do next with this node, and powering on the node starts the installation process:

    nodeset <nodename> osimage=<osimage name>
    rpower <nodename> boot

A2: Compute nodes can not access the internet

If compute nodes cannot access the Internet, there are two ways to install additional packages: use an apt proxy, or use a local mirror.

optional 1: Use apt proxy

Step 1: Install Squid on a server that can access the Internet (here the management node is used as the proxy server):

    apt-get install squid

Step 2: Edit the Squid configuration file /etc/squid3/squid.conf and find the line "#http_access deny to_localhost". Add the following 2 lines after it:

    acl cn_apt src <compute node sub network>/<net mask length>
    http_access allow cn_apt

For more details, refer to the Squid configuration documentation.

Step 3: Restart the proxy service

    service squid3 restart

Step 4: Create a postscript called aptproxy under the /install/postscripts/ directory with the following lines:

    #!/bin/sh
    PROXYSERVER=$1
    if [ -z $PROXYSERVER ];then
        PROXYSERVER=$MASTER
    fi

    PROXYPORT=$2
    if [ -z $PROXYPORT ];then
        PROXYPORT=3128
    fi

    if [ -e "/etc/apt/apt.conf" ];then
        sed '/^Acquire::http::Proxy/d' /etc/apt/apt.conf > /etc/apt/apt.conf.new
        mv -f /etc/apt/apt.conf.new /etc/apt/apt.conf
    fi
    echo "Acquire::http::Proxy \"http://${PROXYSERVER}:$PROXYPORT\";" >> /etc/apt/apt.conf

Step 5: Add this postscript to the compute nodes. The [proxy server ip] and [proxy server port] are optional parameters for this postscript; if they are not specified, xCAT uses the management node IP and port 3128 by default.

    chdef <node range> -p postscripts="aptproxy [proxy server ip] [proxy server port]"

Step 6: Edit the otherpkglist file, for example /install/custom/install/ubuntu/compute.otherpkgs.pkglist. Add the package names to this file, and set the otherpkglist attribute on the osimage object:

    chdef -t osimage <osimage name> otherpkglist=/install/custom/install/ubuntu/compute.otherpkgs.pkglist

Step 7: Set the otherpkgdir attribute for the osimage object; the Internet repositories can be used directly. For example, on Ubuntu 14.04:

    chdef -t osimage <osimage name> otherpkgdir="http://ports.ubuntu.com/ubuntu-ports/ trusty main,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates main,http://ports.ubuntu.com/ubuntu-ports/ trusty universe,http://ports.ubuntu.com/ubuntu-ports/ trusty-updates universe"

Step 8:

If the OS is not provisioned, run the rsetboot, nodeset, and rpower commands to provision the compute nodes:

    rsetboot <nodename> net
    nodeset <nodename> osimage=<osimage name>
    rpower <nodename> boot

If the OS is already provisioned, run "updatenode <nodename> -S" or "updatenode <nodename> -P otherpkgs":

Run updatenode -S to install/update the packages on the compute nodes:
    updatenode <nodename> -S
Run updatenode -P otherpkgs to install/update the packages on the compute nodes:
    updatenode <nodename> -P otherpkgs

Optional 2: Use local mirror

Find a server that can connect to the Internet and can be accessed by the compute nodes.

See more detailed steps in the Rsyncmirror doc.

step 1: Install apt-mirror

    apt-get install apt-mirror

step 2: Configure apt-mirror, refer to Rsyncmirror

    vim /etc/apt/mirror.list

step 3: Run apt-mirror to download the repositories (the required disk space can be found on Ubuntu Mirrors):

    apt-mirror /etc/apt/mirror.list

step 4: Install apache

    apt-get install apache2

step 5: Set up a symlink from the local repository folder to the shared Apache directory:

    ln -s /var/spool/apt-mirror/mirror/archive.ubuntu.com /var/www/archive-ubuntu

When setting the otherpkgdir attribute for the osimages, you can use http://<local mirror server ip>/archive-ubuntu/ precise main

For more information about setting up a local repository mirror, refer to How to Setup Local Repository Mirror.


Related

Wiki: XCAT_Documentation
