IBM Flex is the next-generation blade center. It consists of an IBM Flex chassis, Chassis Management Modules (CMM), and blade servers. The type of the management module for IBM Flex is 'cmm'. The blade servers include the IBM Flex System™ p260, p460, and p24L POWER7 servers, as well as the IBM Flex System™ x240 Compute Node, which is an Intel-processor-based server. This document covers only the management of the POWER7 blade servers.
IBM Flex System™ p260, p460, and p24L POWER7 server hardware management: Generally, xCAT uses the management type 'blade' to manage a blade center and its blade servers, with the management work done through the management module. For IBM Flex, xCAT uses the management type 'fsp' to manage the POWER7 blade servers, with the management work done through xCAT DFM (Direct FSP Management). The management approach for IBM Flex POWER7 servers is therefore a mix of 'blade' and 'fsp': most of the discovery work is done through the CMM, and most of the hardware management work is done directly through the blade's FSP.
The following terms will be used in this document:
xCAT DFM: Direct FSP Management is the name used to describe the ability of xCAT software to communicate directly with the service processor of an IBM Flex POWER7 blade, without using an HMC for management.
CMM node: Chassis Management Module. This term refers to the pair of adapters on the rear of the chassis that have an Ethernet connection.
blade node: a node with hwtype set to blade that represents the whole blade server. The hcp attribute of the blade node is set to the FSP's IP address.
xCAT DFM requires the new xCAT Direct FSP Management plugin (xCAT-dfm-*.ppc64.rpm), which is not part of the core xCAT open source code but is available as a free download from IBM. You must download it and install it on your xCAT management node (and possibly on your service nodes, depending on your configuration) before proceeding with this document.
Download xCAT-dfm RPM: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=ibm~ClusterSoftware&product=ibm/Other+software/IBM+direct+FSP+management+plug-in+for+xCAT&release=All&platform=All&function=all
Download ISNM-hdwr_svr RPM (Linux): http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=ibm~ClusterSoftware&product=ibm/Other+software/IBM+High+Performance+Computing+%28HPC%29+Hardware+Server&release=All&platform=All&function=all
Once you have downloaded these packages, install the hardware server package, and then install DFM:
rhels:
If you have been following the xCAT documentation, you should already have the yum repositories set up to pull in whatever xCAT dependencies and distro RPMs are needed (libstdc++.ppc, libgcc.ppc, openssl.ppc, etc.).
yum install xCAT-dfm-*.ppc64.rpm ISNM-hdwr_svr-*.ppc64.rpm
The discovery procedure simplifies cluster setup for the administrator, especially for clusters with thousands of nodes. Before starting discovery, the administrator needs to connect the Ethernet cables and supply power to the hardware. First discover and configure the CMM nodes, then discover and configure the blade servers and their FSPs.
1. The Ethernet interfaces of the CMMs and of the xCAT management node must be connected to the service VLAN so that the xCAT management node can reach the hardware for discovery and management.
2. Configure a DHCP dynamic range from which the CMMs and FSPs get temporary IP addresses to complete hardware discovery. In this example, 10.0.0.0/16 is used as the service VLAN, and 10.0.200.0/24 is used as the temporary network for CMM discovery.
Note: As of RH6.2, the dhcpd daemon requires a "dhcpd" user in the /etc/passwd file. The dhcpd user is normally added automatically when dhcp.ppc64 is installed. If it is missing, the admin can add it with the "adduser" command:
adduser -s /sbin/nologin -d / dhcpd
chdef -t network 10_0_0_0-255_255_0_0 dynamicrange=10.0.200.1-10.0.200.200
makedhcp -n
service dhcpd restart # linux
startsrc -s dhcpsd # AIX
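The dynamic range above is only meant to hand out temporary addresses, so the static addresses planned for the CMMs and FSPs would normally lie outside it. As a minimal sketch, assuming a POSIX shell, the following checks whether an address falls inside the example range. The names ip_to_int and in_range are hypothetical helpers for illustration, not xCAT commands.

```shell
#!/bin/sh
# Hypothetical helpers (not part of xCAT).

# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    (
        # Split on dots inside a subshell so the IFS change does not leak.
        IFS=.
        set -- $1
        echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
    )
}

# Succeed if address $1 lies within the inclusive range $2..$3.
in_range() {
    addr=$(ip_to_int "$1")
    low=$(ip_to_int "$2")
    high=$(ip_to_int "$3")
    [ "$addr" -ge "$low" ] && [ "$addr" -le "$high" ]
}

# Example: compare a planned static CMM address with the dynamic range.
if in_range 10.0.100.1 10.0.200.1 10.0.200.200; then
    echo "10.0.100.1 is inside the dynamic range"
else
    echo "10.0.100.1 is outside the dynamic range"
fi
```

The sketch assumes plain decimal octets without leading zeros, because shell arithmetic would read an octet such as 010 as octal.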
1. Power on all of the chassis. This causes the CMMs to get temporary DHCP IP addresses from the xCAT management node.
2. Run lsslp to discover the CMMs:
lsslp -m -z -s CMM > /tmp/cmm.stanza
3. Edit the stanza file to give the CMMs meaningful node names (the mpa attribute should have the same value as the node name). For simplicity, the names can be set as cmm01 to cmm99. These CMM node names require name resolution (for example, entries in /etc/hosts).
4. Define the CMMs to the xCAT database:
cat /tmp/cmm.stanza | mkdef -z
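The renaming described in step 3 can be scripted before the stanza file is passed to mkdef. The following is a minimal sketch, assuming the usual xCAT stanza layout (an object name ending in a colon in column one, followed by indented attr=value lines). The function name rename_cmm_stanza and the sample serial numbers are hypothetical, and real lsslp output carries more attributes than shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): rename stanza objects read from
# standard input to cmm01, cmm02, ... and set each mpa attribute to match.
rename_cmm_stanza() {
    awk '
        # An object header starts in column one and ends with a colon.
        /^[^ \t#].*:[ \t]*$/ {
            count++
            name = sprintf("cmm%02d", count)
            print name ":"
            next
        }
        # Keep the indentation and replace only the mpa value.
        /^[ \t]*mpa=/ {
            sub(/mpa=.*/, "mpa=" name)
        }
        { print }
    '
}

# Example with made-up stanza content.
rename_cmm_stanza <<'EOF'
Server--SN0000001:
    objtype=node
    mpa=Server--SN0000001
Server--SN0000002:
    objtype=node
    mpa=Server--SN0000002
EOF
```

Names are assigned in file order, so it is worth reviewing the result (for example, rename_cmm_stanza < /tmp/cmm.stanza > /tmp/cmm.renamed.stanza) and confirming that each name matches the intended chassis before defining the nodes.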
5. Define the static IP addresses for all the CMMs:
chdef -t group cmm ip='|cmm(\d+)|10.0.100.($1+0)|'
6. Add the CMM node names to /etc/hosts, and to DNS if it is being used for name resolution:
makehosts cmm
makedns cmm
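The ip value set in step 5 is a regular-expression attribute: the pattern between the first pair of bars is matched against each node name in the cmm group, and ($1+0) evaluates the captured digits as a number, which drops any leading zeros. The sketch below reproduces that arithmetic in a POSIX shell so the expected addresses can be checked by hand. The function name cmm_ip is hypothetical and is not an xCAT command.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): emulate the mapping
# |cmm(\d+)|10.0.100.($1+0)| for one node name of the form cmmNN.
cmm_ip() {
    digits=$(printf '%s\n' "$1" | sed -n 's/^cmm\([0-9][0-9]*\)$/\1/p')
    if [ -z "$digits" ]; then
        echo "not a CMM node name: $1" >&2
        return 1
    fi
    # expr treats the digits as decimal, so "01" becomes 1.
    echo "10.0.100.$(expr "$digits" + 0)"
}

for n in cmm01 cmm02 cmm10; do
    echo "$n -> $(cmm_ip "$n")"
done
```

With this scheme cmm01 maps to 10.0.100.1 and cmm10 maps to 10.0.100.10.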
1. To change the password for USERID, use the following command:
rspconfig cmm USERID=<new_passwd>
2. Initialize the network configuration of the CMMs. This configures the static IP address on each CMM:
rspconfig cmm initnetwork=*
3. Enable SSH and SNMP on all the CMMs:
rspconfig cmm sshcfg=enable snmpcfg=enable
1. Define the blade server nodes:
The attribute 'mpa' should be set to the node name of the CMM. The attribute 'slotid' should be set to the physical slot id of the blade. The attribute 'hcp' should be set to the IP address that the admin intends to assign to the FSP of the blade.
mkdef cmm[01-02]node[01-14] groups=all,blade mgt=fsp cons=fsp
chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' slotid='|cmm(\d+)node(\d+)|($2+0)|' hcp='|cmm(\d+)node(\d+)|10.0.($1+0).($2+0)|' mgt=fsp
[root@c870f3ap01 ~]# nodels blade
cmm01node01
cmm01node03
cmm01node05
cmm01node07
cmm01node09
cmm01node10
cmm01node11
[root@c870f3ap01 ~]# lsdef cmm01node01
Object name: cmm01node01
cons=fsp
groups=blade,all
hcp=12.0.0.32
hwtype=blade
id=1
mgt=fsp
mpa=cmm01
mtm=789542X
nodetype=ppc,osi
parent=cmm01
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
serial=10F752A
slotid=1
[root@c870f3ap01 ~]#
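The three regular-expression attributes set on the blade group derive mpa, slotid, and hcp from the node name alone. As a rough sketch of what they compute for a name of the form cmmXXnodeYY, the following POSIX shell function prints the derived values. The name blade_attrs is hypothetical, and the output reflects the 10.0.x.y scheme from the chdef example rather than the addresses in the sample lsdef output.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): show the values that the
# regular-expression attributes would yield for one blade node name.
blade_attrs() {
    parts=$(printf '%s\n' "$1" |
        sed -n 's/^cmm\([0-9][0-9]*\)node\([0-9][0-9]*\)$/\1 \2/p')
    if [ -z "$parts" ]; then
        echo "unexpected node name: $1" >&2
        return 1
    fi
    # Split "XX YY" into the chassis and slot digits.
    set -- $parts
    chassis=$1
    slot=$2
    # ($1+0) and ($2+0) drop leading zeros; cmm($1) keeps them.
    chassis_num=$(expr "$chassis" + 0)
    slot_num=$(expr "$slot" + 0)
    echo "mpa=cmm$chassis slotid=$slot_num hcp=10.0.$chassis_num.$slot_num"
}

blade_attrs cmm01node03
blade_attrs cmm02node14
```

For cmm01node03 this prints mpa=cmm01 slotid=3 hcp=10.0.1.3.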
2. Run rscan -u to discover all the blade servers and their FSPs. 'rscan -u' matches the discovered hardware against the nodes already defined in the xCAT database and updates them instead of creating new ones. It also reports an error for any blade node object that is not found in the xCAT database. This type of error is expected when the chassis contains both single-wide and double-wide blades. The admin can run the rmdef command to remove any unused blade and fsp node objects.
rscan cmm -u
If the chassis contains a mixture of single-wide and double-wide blades, the admin should remove the unused blade objects from the xCAT database:
rmdef <cmmxxnodeyy>