IBM Flex combines networking, storage, and compute nodes in a single offering. It consists of an IBM Flex Chassis, one or two Chassis Management Modules (CMMs), and compute nodes. The compute nodes include the IBM Flex System™ p260, p460, and 24L Power 7 servers as well as the IBM Flex System™ x240 x86 server. This document covers only the management of the x240 blade server.
The following terms will be used in this document:
Here is a summary of the steps required to set up the cluster and what this document will take you through:
In the examples used throughout this document, the following networks and naming conventions are used:
{{:Prepare the Management Node for xCAT Installation}}
Note: for Flex hardware, the switch configuration is only needed to discover (really to locate) the CMMs. The location of each blade is determined by the CMMs.
{{:Install xCAT on the Management Node}}
First just add the list of CMMs and the groups they belong to:
nodeadd cmm[01-15] groups=cmm,all
Now define attributes that are the same for all CMMs. These can be defined at the group level. For a description of the attribute names, see the node object definition.
chdef -t group cmm groups=cmm,all hwtype=cmm mgt=blade
Next define the attributes that vary for each CMM. There are 2 different ways to do this. Assuming your naming conventions follow a regular pattern, the fastest way is to use regular expressions at the group level:
chdef -t group cmm mpa='|(.*)|($1)|' ip='|cmm(\d+)|10.0.50.($1+0)|'
This might look confusing at first, but once you parse it, it's not too bad. The regular expression syntax in xCAT database attribute values follows the form:
|pattern-to-match-on-the-nodename|value-to-give-the-attribute|
You use parentheses to indicate what should be matched on the left side and substituted on the right side. So for example, the mpa attribute above is:
|(.*)|($1)|
This means match the entire nodename (.*) and substitute it as the value for mpa. This is what we want because for CMMs the mpa attribute should be set to itself.
For the ip attribute above, it is:
|cmm(\d+)|10.0.50.($1+0)|
This means match the number part of the node name and use it as the last part of the IP address. (Adding 0 to the value just converts it from a string to a number to get rid of any leading zeros, i.e. change 09 to 9.) So for cmm07, the ip attribute will be 10.0.50.7.
For more information on xCAT's database regular expressions, see http://xcat.sourceforge.net/man5/xcatdb.5.html . To verify that the regular expressions are producing what you want, run lsdef for a node and confirm that the values are correct.
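To preview what the ip expression should produce before checking with lsdef, the substitution can be mimicked in plain shell. This is only an illustrative sketch, not xCAT code: cmm_ip is a hypothetical helper name, and the anchored match is a simplification of the pattern shown above.

```shell
#!/bin/sh
# Illustration only: mimics what ip='|cmm(\d+)|10.0.50.($1+0)|' yields.
# cmm_ip is a hypothetical helper, not part of xCAT.

cmm_ip() {
    # Capture the digits after "cmm"; fail if the name does not match.
    num=$(printf '%s\n' "$1" | sed -n 's/^cmm\([0-9][0-9]*\)$/\1/p')
    [ -n "$num" ] || return 1
    # Equivalent of "+0": drop leading zeros but keep at least one digit.
    num=$(printf '%s\n' "$num" | sed 's/^0*\([0-9]\)/\1/')
    printf '10.0.50.%s\n' "$num"
}

cmm_ip cmm07    # prints 10.0.50.7
```

The value printed for a node name should match the ip attribute that lsdef reports for that node.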
If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:
cmm01:
mpa=cmm01
ip=10.0.50.1
cmm02:
mpa=cmm02
ip=10.0.50.2
...
Then pipe this into chdef:
cat <stanzafile> | chdef -z
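For more than a few CMMs, typing the stanza file by hand is tedious. The following sketch generates it, assuming the cmmNN naming and 10.0.50.N addressing used in this document. gen_cmm_stanzas is a hypothetical helper name, not an xCAT command, and the 4-space indentation of attribute lines is a formatting choice.

```shell
#!/bin/sh
# Sketch: print stanzas for cmm01..cmmNN with the mpa and ip attributes.
# gen_cmm_stanzas is a hypothetical helper, not an xCAT command.

gen_cmm_stanzas() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        name=$(printf 'cmm%02d' "$i")      # zero-padded node name, e.g. cmm07
        printf '%s:\n' "$name"
        printf '    mpa=%s\n' "$name"      # a CMM's mpa is itself
        printf '    ip=10.0.50.%d\n' "$i"  # last octet is the CMM number
        i=$((i + 1))
    done
}

gen_cmm_stanzas 15
```

Redirect the output to a file and review it before piping it into chdef -z.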
When you are done defining the CMMs, listing one should look like this:
# lsdef cmm07
Object name: cmm07
groups=cmm,all
hwtype=cmm
ip=10.0.50.7
mgt=blade
mpa=cmm07
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
Add the list of blades and the groups they belong to:
nodeadd cmm[01-15]node[01-14] groups=blade,all
Define the node attributes that are the same for all blades:
chdef -t group blade cons=blade mgt=ipmi nodetype=osi,blade arch=x86_64 installnic=mac profile=compute netboot=xnba getmac=blade serialport=0 serialspeed=115200
Now define the attributes that vary for each blade. You can use regular expressions (see the CMM section above for explanations):
chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' id='|cmm(\d+)node(\d+)|($2+0)|' ip='|cmm(\d+)node(\d+)|10.1.($1+0).($2+0)|'
Note: we don't set the bmc attribute, because during IMM discovery xCAT will automatically give the IMM an IPv6 address and store it in the bmc attribute.
To see what attribute values this results in, list one node:
# lsdef cmm03node05 -i mpa,ip,id
Object name: cmm03node05
ip=10.1.3.5
id=5
mpa=cmm03
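The three blade expressions can be previewed the same way in plain shell. This sketch is an illustration of the substitutions rather than xCAT's own evaluation; blade_attrs and strip_zeros are hypothetical helper names.

```shell
#!/bin/sh
# Illustration only: mimics the mpa, id, and ip expressions for blades.
# blade_attrs and strip_zeros are hypothetical helpers, not part of xCAT.

strip_zeros() {
    # Equivalent of "+0": drop leading zeros but keep at least one digit.
    printf '%s\n' "$1" | sed 's/^0*\([0-9]\)/\1/'
}

blade_attrs() {
    pat='^cmm\([0-9][0-9]*\)node\([0-9][0-9]*\)$'
    cmm=$(printf '%s\n' "$1" | sed -n "s/$pat/\1/p")   # $1 in the xCAT expression
    bay=$(printf '%s\n' "$1" | sed -n "s/$pat/\2/p")   # $2 in the xCAT expression
    [ -n "$cmm" ] && [ -n "$bay" ] || return 1
    # mpa keeps the digits as written; id and ip use the "+0" form.
    printf 'mpa=cmm%s id=%s ip=10.1.%s.%s\n' \
        "$cmm" "$(strip_zeros "$bay")" "$(strip_zeros "$cmm")" "$(strip_zeros "$bay")"
}

blade_attrs cmm03node05    # prints mpa=cmm03 id=5 ip=10.1.3.5
```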
If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:
cmm01node01:
ip=10.1.1.1
id=1
mpa=cmm01
cmm01node02:
ip=10.1.1.2
id=2
mpa=cmm01
...
Then pipe this into chdef:
cat <stanzafile> | chdef -z
nodeadd switch[1-4] groups=switch,all
chdef -t group switch ip='|switch(\d+)|10.0.60.($1+0)|'
There are several passwords required for management:
Use tabedit to make the passwd table contents look like this:
#key,username,password,cryptmethod,comments,disable
"blade","USERID","Passw0rd",,,
"ipmi","USERID","Passw0rd",,,
"system","root","cluster",,,
All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is connected to. Now is the time to add to the networks table any other networks in the cluster, or update existing networks in the table.
For a sample Networks Setup, see the following example: [Setting_Up_a_Linux_xCAT_Mgmt_Node#Appendix_A:_Network_Table_Setup_Example]
If you want to use hardware discovery, 2 dynamic ranges must be defined in the networks table: one for the service network (CMMs and IMMs), and one for the management network (the OS for each blade). The dynamic range in the service network (in our example 10.0) is used while discovering the CMMs and IMMs using SLP. The dynamic range in the management network (in our example 10.1) is used when booting the blade with the genesis kernel to get the MACs.
chdef -t network 10_0_0_0-255_255_0_0 dynamicrange=10.0.255.1-10.0.255.254
chdef -t network 10_1_0_0-255_255_0_0 dynamicrange=10.1.255.1-10.1.255.254
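When choosing dynamic ranges, it can help to confirm that the static addresses assigned to nodes do not fall inside them. That the two should not overlap is an assumption based on general DHCP practice, not something this document states. The sketch below checks a single IPv4 address against a range; ip_to_int and in_range are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: test whether an IPv4 address lies inside a dynamic range.
# ip_to_int and in_range are hypothetical helpers, not xCAT commands.

ip_to_int() (
    # Runs in a subshell so the IFS change does not leak to the caller.
    IFS=.
    set -- $1
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
)

in_range() {
    # Usage: in_range <ip> <range-start> <range-end>
    ip=$(ip_to_int "$1")
    lo=$(ip_to_int "$2")
    hi=$(ip_to_int "$3")
    [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

if in_range 10.0.50.7 10.0.255.1 10.0.255.254; then
    echo "10.0.50.7 is inside the dynamic range"
else
    echo "10.0.50.7 is outside the dynamic range"
fi
```

With the example addressing in this document, the static CMM address 10.0.50.7 is reported as outside the service network's dynamic range.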
Since the mapping between the xCAT node names and IP addresses has been added to the xCAT database, you can run the makehosts xCAT command to create the /etc/hosts file from the xCAT database. (You can skip this step if you are creating /etc/hosts manually.)
makehosts switch,blade,cmm
Verify the entries have been created in the file /etc/hosts.
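One way to automate this check is to look up each expected node name in the hosts file and report any that are absent. The sketch below assumes the usual hosts layout (an address followed by one or more names); missing_from_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print every given name that has no entry in a hosts file.
# missing_from_hosts is a hypothetical helper, not an xCAT command.

missing_from_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for the name as a whole field after the address,
        # skipping comment lines and trailing comments.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break
                    if ($i == n) found = 1
                }
            }
            END { exit !found }
        ' "$hosts_file" || printf '%s\n' "$name"
    done
}

# Example (names are taken from the xCAT database with nodels):
#   missing_from_hosts /etc/hosts $(nodels switch,blade,cmm)
```

No output means every name was found.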
To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:
Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can't answer to these servers.
chdef -t site forwarders=1.2.3.4,1.2.5.6
Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above, but is an easy way to test that your DNS is configured properly.)
search cluster
nameserver 10.1.0.1
Run makedns
makedns
For more information about name resolution in an xCAT Cluster, see [Cluster_Name_Resolution].
You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster facing NICs. For example:
chdef -t site dhcpinterfaces=eth1
Then run the following to create the network stanza part of the DHCP configuration (including the dynamic range):
makedhcp -n
The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.
Nothing to do here - the TFTP server configuration was done by xCAT when it was installed on the Management Node.
makeconservercf
When xCAT discovers hardware on the network, it needs a way to correlate each component returned by SLP to the corresponding node in the xCAT database. The recommended method is to configure the ethernet switches for SNMP access and to tell xCAT which switch port each CMM is plugged into. With this information, for each component returned by SLP, xCAT will get the MAC, find which switch port that MAC has been communicating on, and correlate that to a node name in the database.
The following section explains how to configure your switches for discovery. If you can't configure SNMP on your switches, then use the section after: [#Manually_Discovering_the_CMMs_Instead_of_Using_the_Switch_Ports].
Note: xCAT Flex discovery currently does not support a CMM with both a primary and a standby port.
In large clusters the best automated method for discovering and mapping the CMM to the objects defined in the xCAT database is to allow xCAT to poll data from the site switch to which each chassis CMM ethernet port is connected. This will allow xCAT to map the CMM to the definition in the switch table. You need to add the switch/port information below to the switch table, and the authentication information to the switches table.
# tabdump switch
#node,switch,port,vlan,interface,comments,disable
"cmm01","switch","0/1",,,,
"cmm02","switch","0/2",,,,
where:
node is the CMM node object name
switch is the hostname of the switch
port is the switch port ID. Note that xCAT does not need the complete port name; preceding non-numeric characters are ignored.
If you configured your switches to use SNMP V3, then you need to define several attributes in the switches table. Assuming all of your switches use the same values, you can set these attributes at the group level:
tabch switch=switch switches.snmpversion=3 switches.username=xcatadmin switches.password=passw0rd switches.auth=SHA
# tabdump switches
#switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,switchtype,comments,disable
"switch","3","xcatadmin","passw0rd",,"SHA",,,,,,
Note: It might also be necessary to allow authentication at the VLAN level
snmp-server group xcatadmin v3 auth context vlan-230
If you can't enable SNMP on your switches, use this more manual approach to discover your hardware. If you have already discovered your hardware using slpdiscover or lsslp --flexdiscover, skip this whole section.
Assuming your CMMs have at least received a dynamic address from the DHCP server, you can run lsslp to discover them and create a stanza file that contains their attributes that can be used to update the existing CMM nodes in the xCAT database. The problem is that without the switch port information, lsslp has no way to correlate the responses from SLP to the correct nodes in the database, so you must do that manually. Run:
lsslp -m -z -s CMM > cmm.stanza
and it will create a stanza file with entries for each CMM that look like this:
cmm01:
objtype=node
mpa=cmm01
nodetype=mp
mtm=789392X
serial=100037A
side=2
groups=cmm,all
mgt=blade
mac=5c:f3:fc:25:da:99
hidden=0
otherinterfaces=70.0.0.7
hwtype=cmm
Note: the otherinterfaces attribute is the dynamic IP address the CMM currently has.
The first thing we want to do is strip out the non-essential attributes, because we have already defined them at a group level:
grep -v -E '(mpa=|nodetype=|groups=|mgt=|hidden=|hwtype=)' cmm.stanza > cmm2.stanza
Now edit cmm2.stanza and change each "<node>:" line to have the correct node name. Then put these attributes into the database:
cat cmm2.stanza | chdef -z
Now use rspconfig to set the IP address of each CMM to the permanent (static) address specified in the ip attribute:
rspconfig cmm network=*
xCAT provides a command called slpdiscover (in xCAT 2.7) or lsslp --flexdiscover (in xCAT 2.8 and above) to detect the CMM and blade hardware and configure it. It does the following things:
Notes:
Run the discover command (tail -f /var/log/messages to follow the progress):
# lsslp --flexdiscover # or use slpdiscover for xCAT 2.7
cmm01: Found service:management-hardware.IBM:chassis-management-module at address 10.0.255.7
cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching
Configuration of cmm01node05[10.0.1.5] commencing, configuration may take a few minutes to take effect
Note: the message "cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching" means that xCAT could not find a blade in the database with these mpa and id attributes.
After slpdiscover/lsslp --flexdiscover completes, hardware control for the CMMs and blades should be configured properly. First check to see if the mac attribute is set for all of the CMMs and the ipmi.bmcid attribute is set for all of the blades:
lsdef cmm -c -i mac
nodels blade ipmi.bmcid
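With many CMMs it is easy to overlook one whose mac attribute is still empty. The sketch below filters the compact "node: attr=value" lines that lsdef -c prints and lists the nodes with no value. It assumes that an unset attribute is printed with an empty value; unset_attr_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read "node: attr=value" lines on stdin and print the nodes
# whose value is empty. unset_attr_nodes is a hypothetical helper.

unset_attr_nodes() {
    # The field separator is colon-space, so the colons inside a MAC
    # address do not split the line.
    awk -F': ' '{ split($2, kv, "="); if (kv[2] == "") print $1 }'
}

# Example:
#   lsdef cmm -c -i mac | unset_attr_nodes
```

No output means every listed node has a value for the attribute.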
If they are, then verify hardware control is working:
# rpower blade stat | xcoll
====================================
blade
====================================
on
# rinv cmm01node11 vpd
cmm01node11: System Description: IBM Flex System x240+10Gb Fabric
cmm01node11: System Model/MTM: 8737AC1
cmm01node11: System Serial Number: 23FFP63
cmm01node11: Chassis Serial Number: 23FFP63
cmm01node11: Device ID: 32
cmm01node11: Manufacturer ID: IBM (20301)
cmm01node11: BMC Firmware: 1.34 (1AOO27Q 2012/05/04 22:00:54)
cmm01node11: Product ID: 321
For Flex system x blades you need to set the following hardware settings to enable the console (for rcons):
set DevicesandIOPorts.Com1ActiveAfterBoot Enable
set DevicesandIOPorts.SerialPortSharing Enable
set DevicesandIOPorts.SerialPortAccessMode Dedicated
set DevicesandIOPorts.RemoteConsole Enable
See [XCAT_iDataPlex_Advanced_Setup#Using_ASU_to_Update_CMOS,_uEFI,_or_BIOS_Settings_on_the_Nodes] for how to set these ASU settings.
In order to successfully deploy the OS, you need to get the MAC of each blade's in-band NIC that is connected to the management network and store it in the blade node object.
You can display the MACs with:
# rinv cmm01node11 mac
cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c
To get the first MAC for each blade and store it in the database:
getmacs blade
To display the MAC in the node object:
# lsdef cmm01node11 -ci mac
cmm01node11: mac=34:40:b5:be:c0:08
{{:Using_Provmethod=osimagename}}
The CMM firmware can be updated by loading the latest cmefs.uxp firmware file using the CMM update command over the http interface. The administrator needs to download the firmware from IBM Fix Central. The compressed tar file must be uncompressed and extracted to obtain the firmware update files. Place the cmefs.uxp file in a specified directory on the xCAT MN.
Once the firmware is extracted and the cmefs.uxp file is placed in the directory on the xCAT MN, you can use the CMM update command to apply the new firmware to one chassis at a time or to all chassis managed by the xCAT MN. More details on the CMM update command can be found at: http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp?topic=%2Fcom.ibm.acc.cmm.doc%2Fcli_command_update.html
The format of the update command is as follows.
Flash (-u) the file and reboot (-r) afterwards:
update -T system:mm[1] -r -u http://<server>/<MN directory>/<update file>
Flash (-u), show progress (-v), and reboot (-r) afterwards:
update -T system:mm[1] -v -r -u http://<server>/<MN directory>/<update file>
To update the firmware and restart a single CMM (cmm01) from the xCAT MN (70.0.0.1), use:
ssh USERID@cmm01 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp
If passwordless login is set up on all CMMs, you can use the xCAT psh command to update all CMMs in the cluster at once.
psh -l USERID cmm update -T system:mm[1] -v -u http://70.0.0.1/firmware/cmefs.uxp
If you see an "Unsupported security level" message after the CMM firmware is updated, run the following command to resolve the issue:
rspconfig cmm sshcfg=enable snmpcfg=enable
The firmware of the blades can be updated by following: https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup#Updating_Node_Firmware .