IBM Flex combines networking, storage, and compute nodes in a single offering. It consists of an IBM Flex Chassis, one or two Chassis Management Modules (CMM), and compute nodes. The compute nodes include the IBM Flex System™ p260, p460, and 24L Power 7 servers as well as the IBM Flex System™ x240 x86 server. This document covers only the management of the x240 blade server.
The following terms will be used in this document:
Chassis Management Module (CMM) - this term refers to the pair of management modules installed in the rear of the chassis and connected by Ethernet to the MN. The CMM is used to discover the compute nodes within the chassis and for some data collection regarding the nodes and chassis.
Compute node: This term is used to refer to the servers in an IBM Flex system. Compute nodes can be either Power 7 servers or x86 Intel based servers.
blade node: refers to a node with the hwtype attribute set to blade; it represents the whole compute node server.
Compute blade: refers to a compute server with a blade hwtype.
Here is a summary of the steps required to set up the cluster and what this document will take you through:
In the examples used throughout in this document, the following networks and naming conventions are used:
{{:Prepare the Management Node for xCAT Installation}}
Note: for Flex hardware, the switch configuration is only needed to discover (really to locate) the CMMs. The location of each blade is determined by the CMMs.
{{:Install xCAT on the Management Node}}
First just add the list of CMMs and the groups they belong to:
nodeadd cmm[01-15] groups=cmm,all
Now define attributes that are the same for all CMMs. These can be defined at the group level. For a description of the attribute names, see the node object definition.
chdef -t group cmm groups=cmm,all hwtype=cmm mgt=blade
Next define the attributes that vary for each CMM. There are two different ways to do this. Assuming your naming conventions follow a regular pattern, the fastest way is to use regular expressions at the group level:
chdef -t group cmm mpa='|(.*)|($1)|' ip='|cmm(\d+)|10.0.50.($1+0)|'
This might look confusing at first, but once you parse it, it's not too bad. The regular expression syntax in xCAT database attribute values follows the form:
|pattern-to-match-on-the-nodename|value-to-give-the-attribute|
You use parentheses to indicate what should be matched on the left side and substituted on the right side. So for example, the mpa attribute above is:
|(.*)|($1)|
This means match the entire nodename (.*) and substitute it as the value for mpa. This is what we want because for CMMs the mpa attribute should be set to itself.
For the ip attribute above, it is:
|cmm(\d+)|10.0.50.($1+0)|
This means match the number part of the node name and use it as the last part of the IP address. (Adding 0 to the value just converts it from a string to a number to get rid of any leading zeros, i.e. change 09 to 9.) So for cmm07, the ip attribute will be 10.0.50.7.
For more information on xCAT's database regular expressions, see http://xcat.sourceforge.net/man5/xcatdb.5.html . To verify that the regular expressions are producing what you want, run lsdef for a node and confirm that the values are correct.
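As a concrete illustration, here is a small shell sketch (not xCAT code; the function name is made up) that reproduces the effect of the ip pattern above:

```shell
#!/bin/sh
# Mimic ip='|cmm(\d+)|10.0.50.($1+0)|' for a CMM node name:
# the digits after "cmm" (leading zeros stripped, like the +0
# in the xCAT pattern) become the last octet of the address.
cmm_ip() {
    num=$(printf '%s\n' "${1#cmm}" | sed 's/^0*//')
    echo "10.0.50.$num"
}

cmm_ip cmm07    # prints 10.0.50.7
cmm_ip cmm15    # prints 10.0.50.15
```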
If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:
cmm01:
mpa=cmm01
ip=10.0.50.1
cmm02:
mpa=cmm02
ip=10.0.50.2
...
Then pipe this into chdef:
cat <stanzafile> | chdef -z
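For a long list of CMMs, the stanza file itself can be generated with a short loop rather than typed by hand; a sketch matching the naming scheme used in this document:

```shell
#!/bin/sh
# Generate a chdef stanza for cmm01..cmm15, matching the
# naming and addressing scheme used above.
gen_cmm_stanzas() {
    i=1
    while [ "$i" -le 15 ]; do
        printf 'cmm%02d:\nmpa=cmm%02d\nip=10.0.50.%d\n' "$i" "$i" "$i"
        i=$((i + 1))
    done
}

gen_cmm_stanzas > cmm.stanza
# then: cat cmm.stanza | chdef -z   (requires xCAT)
```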
When you are done defining the CMMs, listing one should look like this:
# lsdef cmm07
Object name: cmm07
groups=cmm,all
hwtype=cmm
ip=10.0.50.7
mgt=blade
mpa=cmm07
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
Add the list of blades and the groups they belong to:
nodeadd cmm[01-15]node[01-14] groups=blade,all
Define the node attributes that are the same for all blades:
chdef -t group blade cons=blade mgt=ipmi nodetype=osi,blade arch=x86_64 installnic=mac profile=compute \
    netboot=xnba getmac=blade serialport=0 serialspeed=115200
Now define the attributes that vary for each blade. You can use regular expressions (see the CMM section above for explanations):
chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' id='|cmm(\d+)node(\d+)|($2+0)|' ip='|cmm(\d+)node(\d+)|10.1.($1+0).($2+0)|'
Note: we don't set the bmc attribute, because during IMM discovery xCAT will automatically assign it an IPv6 address and set that in the bmc attribute.
To see what attribute values this results in, list one node:
# lsdef cmm03node05 -i mpa,ip,id
Object name: cmm03node05
ip=10.1.3.5
id=5
mpa=cmm03
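The same substitutions can be sketched in shell (illustrative only; this is not how xCAT evaluates the patterns):

```shell
#!/bin/sh
# For a blade name like cmm03node05 the patterns above yield:
#   mpa = cmm03   id = 5   ip = 10.1.3.5
blade_attrs() {
    c=$(printf '%s\n' "$1" | sed 's/^cmm0*\([0-9]*\)node.*/\1/')
    n=$(printf '%s\n' "$1" | sed 's/.*node0*\([0-9]*\)$/\1/')
    echo "mpa=${1%node*} id=$n ip=10.1.$c.$n"
}

blade_attrs cmm03node05   # prints: mpa=cmm03 id=5 ip=10.1.3.5
```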
If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:
cmm01node01:
ip=10.1.1.1
id=1
mpa=cmm01
cmm01node02:
ip=10.1.1.2
id=2
mpa=cmm01
...
Then pipe this into chdef:
cat <stanzafile> | chdef -z
Add the list of switches and the group they belong to, then set their IP addresses with a regular expression:
nodeadd switch[1-4] groups=switch,all
chdef -t group switch ip='|switch(\d+)|10.0.60.($1+0)|'
There are several passwords required for management:
Use tabedit to give the passwd table contents like:
#key,username,password,cryptmethod,comments,disable
"blade","USERID","Passw0rd",,,
"ipmi","USERID","Passw0rd",,,
"system","root","cluster",,,
All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is connected to. Now is the time to add to the networks table any other networks in the cluster, or update existing networks in the table.
For a sample Networks Setup, see the following example: [Setting_Up_a_Linux_xCAT_Mgmt_Node#Appendix_A:_Network_Table_Setup_Example]
If you want to use hardware discovery, 2 dynamic ranges must be defined in the networks table: one for the service network (CMMs and IMMs), and one for the management network (the OS for each blade). The dynamic range in the service network (in our example 10.0) is used while discovering the CMMs and IMMs using SLP. The dynamic range in the management network (in our example 10.1) is used when booting the blade with the genesis kernel to get the MACs.
chdef -t network 10_0_0_0-255_255_0_0 dynamicrange=10.0.255.1-10.0.255.254
chdef -t network 10_1_0_0-255_255_0_0 dynamicrange=10.1.255.1-10.1.255.254
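If you want to sanity-check that an address falls inside one of these dynamic ranges, a throwaway shell helper (not an xCAT command) can do the comparison:

```shell
#!/bin/sh
# Convert a dotted IP to an integer and test membership in the
# 10.0.255.1-10.0.255.254 dynamic range defined above.
ip2int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( $1*16777216 + $2*65536 + $3*256 + $4 ))
}

in_range() {
    v=$(ip2int "$1")
    [ "$v" -ge "$(ip2int 10.0.255.1)" ] && [ "$v" -le "$(ip2int 10.0.255.254)" ]
}

in_range 10.0.255.42 && echo "10.0.255.42 is in the dynamic range"
```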
Since the mappings between the xCAT node names and IP addresses have been added to the xCAT database, you can run the makehosts xCAT command to create the /etc/hosts file from the xCAT database. (You can skip this step if you are creating /etc/hosts manually.)
makehosts switch,blade,cmm
Verify the entries have been created in the file /etc/hosts.
To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:
Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can't answer to these servers.
chdef -t site forwarders=1.2.3.4,1.2.5.6
Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above, but is an easy way to test that your DNS is configured properly.)
search cluster
nameserver 10.1.0.1
Run makedns
makedns
For more information about name resolution in an xCAT Cluster, see [Cluster_Name_Resolution].
You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster facing NICs. For example:
chdef -t site dhcpinterfaces=eth1
Then this will get the network stanza part of the DHCP configuration (including the dynamic range) set:
makedhcp -n
The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.
Nothing to do here - the TFTP server configuration was done by xCAT when it was installed on the Management Node.
makeconservercf
When xCAT discovers hardware on the network, it needs a way to correlate each component returned by SLP to the corresponding node in the xCAT database. The recommended method is to configure the ethernet switches for SNMP access and to tell xCAT which switch port each CMM is plugged into. With this information, for each component returned by SLP, xCAT will get the MAC, find which switch port that MAC has been communicating on, and correlate that to a node name in the database.
The following section explains how to configure your switches for discovery. If you can't configure SNMP on your switches, then use the section after: [#Manually_Discovering_the_CMMs_Instead_of_Using_the_Switch_Ports].
Note: xCAT Flex discovery does not currently support a CMM configured with both a primary and a standby port.
In large clusters the best automated method for discovering and mapping the CMMs to the objects defined in the xCAT database is to allow xCAT to poll data from the site switch to which each chassis CMM ethernet port is connected. This allows xCAT to map each CMM to its definition via the switch table. You need to add the switch/port information below to the switch table, and the authentication information to the switches table.
# tabdump switch
#node,switch,port,vlan,interface,comments,disable
"cmm01","switch","0/1",,,,
"cmm02","switch","0/2",,,,
where:
node is the CMM node object name
switch is the hostname of the switch
port is the switch port id. Note that xCAT does not need the complete port name; preceding non-numeric characters are ignored.
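Because preceding non-numeric characters are ignored, port names reported by different switch models normalize to the same value; the trimming amounts to something like the following sketch (illustrative, not xCAT's exact code):

```shell
#!/bin/sh
# Strip non-numeric characters from the front of a port id,
# so a full interface name reduces to its numeric part.
trim_port() {
    printf '%s\n' "$1" | sed 's/^[^0-9]*//'
}

trim_port "GigabitEthernet0/1"   # prints 0/1
trim_port "0/1"                  # prints 0/1
```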
If you configured your switches to use SNMP V3, then you need to define several attributes in the switches table. Assuming all of your switches use the same values, you can set these attributes at the group level:
tabch switch=switch switches.snmpversion=3 switches.username=xcatadmin switches.password=passw0rd switches.auth=SHA
# tabdump switches
#switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,switchtype,comments,disable
"switch","3","xcatadmin","passw0rd",,"SHA",,,,,,
Note: It might also be necessary to allow authentication at the VLAN level
snmp-server group xcatadmin v3 auth context vlan-230
If you can't enable SNMP on your switches, use this more manual approach to discover your hardware. If you have already discovered your hardware using slpdiscover or lsslp --flexdiscover, skip this whole section.
Assuming your CMMs have at least received a dynamic address from the DHCP server, you can run lsslp to discover them and create a stanza file that contains their attributes that can be used to update the existing CMM nodes in the xCAT database. The problem is that without the switch port information, lsslp has no way to correlate the responses from SLP to the correct nodes in the database, so you must do that manually. Run:
lsslp -m -z -s CMM > cmm.stanza
and it will create a stanza file with entries for each CMM that look like this:
cmm01:
objtype=node
mpa=cmm01
nodetype=mp
mtm=789392X
serial=100037A
side=2
groups=cmm,all
mgt=blade
mac=5c:f3:fc:25:da:99
hidden=0
otherinterfaces=70.0.0.7
hwtype=cmm
Note: the otherinterfaces attribute is the dynamic IP address the CMM currently has.
The first thing we want to do is strip out the non-essential attributes, because we have already defined them at a group level:
grep -v -E '(mpa=|nodetype=|groups=|mgt=|hidden=|hwtype=)' cmm.stanza > cmm2.stanza
Now edit cmm2.stanza and change each "<node>:" line to have the correct node name. Then put these attributes into the database:
cat cmm2.stanza | chdef -z
Now use rspconfig to set the IP address of each CMM to the permanent (static) address specified in the ip attribute:
rspconfig cmm network=*
Note: The slpdiscover command has been replaced by lsslp --flexdiscover in xCAT 2.8. If you are using an earlier version (xCAT 2.7.x), use the slpdiscover command.
This command performs the final discovery and configuration steps needed to complete the hardware setup of both the CMM and the blades, and it works with both the switch port and CMM MAC discovery methods.
Note: If you have run slpdiscover (xCAT 2.7.x) / lsslp --flexdiscover (xCAT 2.8) before, delete the ipmi.bmcid attribute first.
slpdiscover (xCAT 2.7.x)/ lsslp --flexdiscover (xCAT 2.8) performs the following functions:
Note: xCAT only supports CMMs connected to VLAN 1 of the switch.
# lsslp --flexdiscover
cmm01: Found service:management-hardware.IBM:chassis-management-module at address 70.0.0.7
cmm01: Ignoring target in bay 7, no node found with mp.mpa/mp.id matching
cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching
Configuration of cmm01node05[70.0.0.16] commencing, configuration may take a few minutes to take effect
Note:
After running slpdiscover / lsslp --flexdiscover, the blades and their interfaces should be defined and configured properly.
You can list the blade definitions with nodels:
# nodels blade
cmm01node05
cmm01node06
cmm01node11
cmm01node12
# lsdef cmm01node11
Object name: cmm01node11
bmc=70.0.0.9
bmcpassword=Passw0rd1
bmcusername=USERID
cons=ipmi
getmac=blade
groups=all,blade
id=11
mac=34:40:b5:be:7e:f8
mgt=ipmi
mpa=cmm01
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
After running slpdiscover (xCAT 2.7.x) / lsslp --flexdiscover (xCAT 2.8), you can also list the ipmi table to verify that the bmcid attribute has been updated with the blade BMC MAC.
# tabdump ipmi
#node,bmc,bmcport,taggedvlan,bmcid,username,password,comments,disable
"blade","|cmm(\d+)node(\d+)|10.0.($1+0).($2+0)|",,,,"USERID","passw0rd",,
"cmm01node05","70.0.0.16",,,"5c:f3:fc:6e:00:41",,,,
"cmm01node06","70.0.0.15",,,"5c:f3:fc:6e:03:94",,,,
You can ping the BMC IP and try the hardware control commands to verify that the BMC IP and userid/password have been defined and configured correctly.
# rpower cmm01node11 stat
cmm01node11: on
# rinv cmm01node11 vpd
cmm01node11: System Description: IBM Flex System x240+10Gb Fabric
cmm01node11: System Model/MTM: 8737AC1
cmm01node11: System Serial Number: 23FFP63
cmm01node11: Chassis Serial Number: 23FFP63
cmm01node11: Device ID: 32
cmm01node11: Manufacturer ID: IBM (20301)
cmm01node11: BMC Firmware: 1.34 (1AOO27Q 2012/05/04 22:00:54)
cmm01node11: Product ID: 321
# rvitals cmm01node11 leds
cmm01node11: No active error LEDs detected
There are a few ASU settings that need to be changed from the defaults. This section discusses what needs to be changed and how to run an ASU update.
If you need to update CMOS/uEFI/BIOS settings on your nodes, download ASU (Advanced Settings Utility) from the IBM Fix Central web site:
Once you have the ASU RPM on your MN (management node), you have several choices of how to run it:
ASU can be run on the management node (MN) and told to connect to the IMM of a node. First install ASU on the MN:
rpm -i ibm_utl_asu_asut78c-9.21_linux_x86-64.rpm
cd /opt/ibm/toolscenter/asu
Determine the IP address, username, and password of the IMM (BMC):
lsdef node1 -i bmc,bmcusername,bmcpassword
tabdump passwd | grep ipmi # the default if username and password are not set for the node
Run ASU:
./asu64 show all --host <ip> --user <username> --password <pw>
./asu64 show uEFI.ProcessorHyperThreading --host <ip> --user <username> --password <pw>
./asu64 set uEFI.RemoteConsoleRedirection Enable --host <ip> --user <username> --password <pw> # a common setting that needs to be set
If you want to set a lot of settings, you can put them in a file and run:
./asu64 batch <settingsfile> --host <ip> --user <username> --password <pw>
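For example, a batch file collecting the settings used elsewhere in this document might look like the following (verify the exact setting names on your system with asu64 show all):

```
set uEFI.RemoteConsoleRedirection Enable
set Processors.Hyper-Threading Disable
set BootOrder.BootOrder "PXE Network=Hard Disk 0"
```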
These are the settings needed to enable the serial console:
TBD
Other settings that are needed:
UEFI Boot/Physical Serial:
loaddefault BootOrder
loaddefault uEFI
set Processors.Hyper-Threading Disable
set BootOrder.BootOrder "PXE Network=Hard Disk 0"
In order to successfully deploy the OS, you will need to associate the blade Ethernet MAC with the blade node object.
Listing the MAC
# rinv cmm01node11 mac
cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c
Setting the MAC
# chdef cmm01node11 mac=34:40:b5:be:c0:08
Listing the MAC in the node object
# lsdef cmm01node11 -i mac
Object name: cmm01node11
mac=34:40:b5:be:c0:08
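If you are doing this for many blades, the first MAC can be pulled out of the rinv-style output and turned into chdef commands; a sketch assuming output in the format shown above (the commands are echoed here rather than run):

```shell
#!/bin/sh
# Extract "MAC Address 1" from rinv-style output and print a
# chdef command for each node.
set_macs() {
    awk '/MAC Address 1:/ { sub(/:$/, "", $1); print "chdef " $1 " mac=" $NF }'
}

# Example with the sample output shown above;
# prints: chdef cmm01node11 mac=34:40:b5:be:c0:08
set_macs <<'EOF'
cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c
EOF
```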
Documentation for system x blade node provisioning is in another document, which describes the steps necessary to properly provision the nodes. When reading it, keep in mind that there are differences in attributes between that document, which describes iDataPlex nodes, and these blade nodes. Here is the link to the provisioning document:
https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup
The CMM firmware can be updated by loading the latest cmefs.uxp firmware file using the CMM update command through the HTTP interface. The administrator needs to download the firmware from IBM Fix Central. The compressed tar file must be unpacked to extract the firmware update files. Place the cmefs.uxp file in a specified directory on the xCAT MN.
Once the firmware is unzipped and the cmefs.uxp is placed in the directory on the xCAT MN you can use the CMM update command to update the new firmware on one chassis at a time or on all chassis managed by xCAT MN. More details on the CMM update command can be found at: http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp?topic=%2Fcom.ibm.acc.cmm.doc%2Fcli_command_update.html
The format of the update command is as follows.
To flash (-u) the file and reboot (-r) afterwards:
update -T system:mm[1] -r -u http://<server>/<MN directory>/<update file>
To flash (-u), show progress (-v), and reboot (-r) afterwards:
update -T system:mm[1] -v -r -u http://<server>/<MN directory>/<update file>
To update firmware and restart a single CMM cmm01 from xCAT MN 70.0.0.1 use:
ssh USERID@cmm01 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp
If passwordless SSH is set up to all CMMs, then you can use xCAT psh to update all CMMs in the cluster at once.
psh -l USERID cmm update -T system:mm[1] -v -u http://70.0.0.1/firmware/cmefs.uxp
If you see an "Unsupported security level" message after the CMM firmware is updated, run the following command to resolve the issue.
rspconfig cmm sshcfg=enable snmpcfg=enable
Documentation for the system x firmware updates is included in the link below. As with the provisioning document, keep in mind that there are some differences in attributes between iDataPlex and blade nodes in the documentation.
https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup#Updating_Node_Firmware