IBM Flex combines networking, storage, and compute nodes in a single offering. It consists of an IBM Flex Chassis, one or two Chassis Management Modules (CMMs), and compute nodes. The compute nodes include the IBM Flex System™ p260, p460, and 24L Power 7 servers as well as the IBM Flex System™ x240 x86 server. This document covers only the management of the x240 blade server.
The following terms will be used in this document:
Here is a summary of the steps required to set up the cluster and what this document will take you through:
In the examples used throughout this document, the following networks and naming conventions are used:
{{:Prepare the Management Node for xCAT Installation}}
Note: for Flex hardware, the switch configuration is only needed to discover (really to locate) the CMMs. The location of each blade is determined by the CMMs.
{{:Install xCAT on the Management Node}}
Because the mapping between the xCAT node names and IP addresses has been added to the xCAT database, you can run the makehosts xCAT command to create the /etc/hosts file from the xCAT database. (You can skip this step if you are creating /etc/hosts manually.)
makehosts switch,blade,cmm
Verify the entries have been created in the file /etc/hosts.
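One way to do this check without reading the file by eye is sketched below. This is a minimal sketch, not part of xCAT: the helper name in_hosts is hypothetical, and the sample address and the "cluster" domain are only illustrative. The demo writes a temporary file so that it does not touch the real /etc/hosts.

```shell
# in_hosts FILE NAME: succeed if NAME appears as a host name or alias in FILE.
# (Hypothetical helper for illustration; not an xCAT command.)
in_hosts() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                                     # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 } # skip the IP in field 1
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Demo against a temporary file with one illustrative entry.
hosts_file=$(mktemp)
printf '10.0.1.5 cmm01node05 cmm01node05.cluster\n' > "$hosts_file"
for n in cmm01node05 cmm01node06; do
    in_hosts "$hosts_file" "$n" || echo "missing from hosts file: $n"
done
rm -f "$hosts_file"
```

On a real management node you would pass /etc/hosts as the file and loop over the names returned by nodels switch,blade,cmm.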
To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:
Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can't answer to these servers.
chdef -t site forwarders=1.2.3.4,1.2.5.6
Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above, but is an easy way to test that your DNS is configured properly.)
search cluster
nameserver 10.1.0.1
Run makedns
makedns
For more information about name resolution in an xCAT Cluster, see [Cluster_Name_Resolution].
You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster facing NICs. For example:
chdef -t site dhcpinterfaces=eth1
Then run makedhcp -n to create the network stanza portion of the DHCP configuration (including the dynamic range):
makedhcp -n
The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.
Nothing to do here - the TFTP server configuration was done by xCAT when it was installed on the Management Node.
makeconservercf
{{:Define the CMMs and Switches}}
{{:CMM_Discovery_and_Configuration}}
There are multiple options for defining the blades in the xCAT database. The first two options are for xCAT 2.8 and later releases. The third option is for the System x Flex blade support in xCAT 2.7.
This implementation is most useful when you have uniform blade configurations. If there is a mixture of single and double wide blades in the chassis, you will have to remove the unused blade node definitions in the database after doing the mkdef below.
First, pre-define the blades. It is easiest to base the node names on the cmm and slot location:
mkdef cmm[01-02]node[01-14] groups=blade,all
At a group level, define the node attributes that are the same for all blades:
chdef -t group blade mgt=ipmi cons=ipmi getmac=blade nodetype=mp,osi hwtype=blade installnic=mac \
profile=compute netboot=xnba arch=x86_64
Now define the node attributes that vary for each blade:
You can use regular expressions to define the attributes for all blades in one command (see [Listing_and_Modifying_the_Database#Using_Regular_Expressions_in_the_xCAT_Tables] for an explanation of how to use regular expressions in the xCAT database):
chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' \
slotid='|cmm(\d+)node(\d+)|($2+0)|' bmc='|cmm(\d+)node(\d+)|10.0.($1+0).($2+0)|'
To ensure that the attribute values are set the way you want them to be, list one node:
# lsdef cmm02node05 -i mpa,slotid,bmc
Object name: cmm02node05
bmc=10.0.2.5
mpa=cmm02
slotid=5
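To see what the regular expressions above compute for a given node name, the same mapping can be emulated outside of xCAT. The following is a minimal sketch, not xCAT code: the function name blade_attrs is hypothetical, and it assumes node names of the form cmmNNnodeMM and the 10.0.x.y BMC addressing used in this example.

```shell
# blade_attrs NODE: print the attribute values that the regular expressions
# in the chdef command would yield for NODE.
# (Hypothetical helper for illustration; not an xCAT command.)
blade_attrs() {
    printf '%s\n' "$1" | awk '
        # Same shape as the xCAT pattern cmm(\d+)node(\d+)
        /^cmm[0-9]+node[0-9]+$/ {
            split(substr($0, 4), parts, "node")   # parts[1] = CMM digits, parts[2] = slot digits
            printf "mpa=cmm%s\n", parts[1]        # cmm($1) keeps the digits as written
            printf "slotid=%d\n", parts[2] + 0    # ($2+0) drops leading zeros
            printf "bmc=10.0.%d.%d\n", parts[1] + 0, parts[2] + 0
            found = 1
        }
        END { exit (found ? 0 : 1) }              # non-zero exit for an unexpected name
    '
}

blade_attrs cmm02node05
# prints:
#   mpa=cmm02
#   slotid=5
#   bmc=10.0.2.5
```

The values match the lsdef output shown above; only the order of the lines differs.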
Now run rscan -u to discover all the blade servers and add the hardware-related attributes to the node skeleton definitions you previously created. The 'rscan -u' command will match the xCAT nodes which have been defined in the xCAT database with the actual blades in the chassis and get attributes like the serial number and mac.
rscan cmm -u
Note: If you get an error message in hardware control commands later on that a blade can't be communicated with, it could be that the chassis contains both single wide and double wide blade configurations, so you have some blade definitions in the database that don't actually exist in the chassis. If this is the case, use the rmdef command to remove the appropriate blade node objects.
Note: This method is suggested when you have a mix of single and double-wide flex blades.
The rscan -z command reads the actual configuration of the chassis and creates node definitions in a stanza file for the CMMs and each blade. The stanza file contains all of the correct node attributes and can be piped into chdef, except that the node names are not yet the ones you want: xCAT has no way of knowing what node name each blade should have. Therefore, you need to manually edit the file and change the node names to the ones you want to use.
Run the rscan command against all of the CMMs to create a stanza file that contains all of the blades:
rscan cmm -z >nodes.stanza
The following is a sample of the stanza data of one blade from rscan:
sn#y030bg168034:
objtype=node
nodetype=mp
slotid=5
mtm=8737AC1
serial=xxxxxxx
mpa=cmm02
groups=xblade,all
mgt=ipmi
cons=ipmi
hwtype=blade
For a description of each attribute, see node attributes.
Edit nodes.stanza and do 2 things:
Then pipe this into chdef to create the node definitions in the database:
cat nodes.stanza | chdef -z
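If you want the blades named after their CMM and slot, as in the pre-defined naming used elsewhere in this document, the renaming can be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: rename_stanzas is a hypothetical helper (not an xCAT command), the target name format cmmNNnodeMM is an assumption, and only stanzas with hwtype=blade plus both mpa and slotid are renamed. It covers the node names only; review the result before piping it into chdef.

```shell
# rename_stanzas [FILE...]: rewrite blade stanza names to <mpa>node<2-digit slotid>.
# (Hypothetical helper for illustration; not an xCAT command.)
rename_stanzas() {
    awk '
        # Print the buffered stanza, renaming it if it describes a blade.
        function emit(   name, i) {
            if (header == "") return
            name = header
            if (hwtype == "blade" && mpa != "" && slot != "")
                name = sprintf("%snode%02d:", mpa, slot)
            print name
            for (i = 1; i <= n; i++) print body[i]
            header = ""; mpa = ""; slot = ""; hwtype = ""; n = 0
        }
        # A stanza header has no "=" and ends with ":".
        /^[^ \t=]+:[ \t]*$/ { emit(); header = $0; next }
        header == "" { print; next }          # anything before the first stanza
        {
            body[++n] = $0
            line = $0
            gsub(/[ \t]/, "", line)           # ignore indentation
            if (line ~ /^mpa=/)    mpa    = substr(line, 5)
            if (line ~ /^slotid=/) slot   = substr(line, 8)
            if (line ~ /^hwtype=/) hwtype = substr(line, 8)
        }
        END { emit() }
    ' "$@"
}

# Demo on the sample stanza shown above (attribute list shortened).
rename_stanzas <<'EOF'
sn#y030bg168034:
    objtype=node
    slotid=5
    mpa=cmm02
    hwtype=blade
EOF
# The first line of the output is now: cmm02node05:
```

With a real file, rename_stanzas nodes.stanza > nodes.renamed.stanza keeps the original output of rscan untouched for comparison.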
The support for System x Flex blades in xCAT 2.7 is similar to the support for earlier System x blades. Modifications were made in xCAT 2.8 to enhance the xCAT Flex blade support. The main differences in xCAT 2.7 are that the hwtype and getmac attributes are not used, and that the id (not slotid) attribute is used to reference the physical slot location. In this release the rscan command does not support System x Flex blades; instead, the slpdiscover command is used to update the IPMI hardware information for each blade.
This implementation is most useful when you have uniform blade configurations. If there is a mixture of single and double wide blades in the chassis, you will have to remove the unused blade node definitions in the database after doing the mkdef below.
First, pre-define the blades. It is easiest to base the node names on the cmm and slot location:
mkdef cmm[01-02]node[01-14] groups=blade,all
At a group level, define the node attributes that are the same for all blades:
chdef -t group blade mgt=ipmi cons=ipmi nodetype=mp,osi installnic=mac \
profile=compute netboot=xnba arch=x86_64
Now define the node attributes that vary for each blade:
You can use regular expressions to define the attributes for all blades in one command (see [Listing_and_Modifying_the_Database#Using_Regular_Expressions_in_the_xCAT_Tables] for an explanation of how to use regular expressions in the xCAT database):
chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' \
id='|cmm(\d+)node(\d+)|($2+0)|' bmc='|cmm(\d+)node(\d+)|10.0.($1+0).($2+0)|'
To ensure that the attribute values are set the way you want them to be, list one node:
# lsdef cmm02node05 -i mpa,id,bmc
Object name: cmm02node05
bmc=10.0.2.5
mpa=cmm02
id=5
Now run slpdiscover to discover the blade servers and add the hardware-related attributes for the nodes. The 'slpdiscover' command matches the xCAT nodes defined in the xCAT database with the actual blades in the chassis and updates the ipmi table.
slpdiscover
Use the rspconfig command to set the IMM IP address to a permanent static IP address.
rspconfig blade network=*
rspconfig blade USERID=*
You may want to change the IMM device name of each blade (the name the CMM knows it by) to be the same as the xCAT node name of the blade:
rspconfig blade textid=*
For Flex system x blades you need to set the following hardware settings to enable the console (for rcons):
set DevicesandIOPorts.Com1ActiveAfterBoot Enable
set DevicesandIOPorts.SerialPortSharing Enable
set DevicesandIOPorts.SerialPortAccessMode Dedicated
set DevicesandIOPorts.RemoteConsole Enable
See [XCAT_iDataPlex_Advanced_Setup#Using_ASU_to_Update_CMOS,_uEFI,_or_BIOS_Settings_on_the_Nodes] for how to set these ASU settings.
In order to successfully deploy the OS, you need to get the MAC of each blade's in-band NIC that is connected to the management network and store it in the blade node object.
You can display all of the MACs for a blade:
# rinv cmm01node11 mac
cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c
To get the first MAC for each blade and store it in the database:
getmacs blade
If you want to use the MAC for an adapter other than the first one, use the -i option of getmacs. For example:
getmacs blade -i eth1
To display the MACs just collected:
# lsdef blade -ci mac
cmm01node01: mac=34:40:b5:be:c0:08
...
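With many blades it is easy to overlook one whose MAC was not collected. The following minimal sketch filters output of the form shown above and prints the nodes whose attribute value is empty. The helper name missing_attr is hypothetical, and the sketch assumes that an unset attribute is listed as "node: mac=" with nothing after the equals sign.

```shell
# missing_attr: read "node: attr=value" lines on stdin and print the node
# name of every line whose value is empty.
# (Hypothetical helper for illustration; not an xCAT command.)
missing_attr() {
    awk -F': ' '{
        eq = index($2, "=")                           # position of "=" in attr=value
        if (eq > 0 && substr($2, eq + 1) == "") print $1
    }'
}

# Demo with canned input: the second node has no MAC stored.
printf 'cmm01node01: mac=34:40:b5:be:c0:08\ncmm01node02: mac=\n' | missing_attr
# prints: cmm01node02
```

On the management node the input would come from lsdef blade -ci mac | missing_attr.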
{{:Using_Provmethod=osimagename}}
rsetboot compute net
rpower compute boot
{{:Installing Stateful Linux Nodes}}
The nodeset command tells xCAT what you want to do next with this node, rsetboot tells the node hardware to boot from the network for the next boot, and powering on the node using rpower starts the installation process:
nodeset compute osimage=mycomputeimage
rsetboot compute net
rpower compute boot
Tip: when nodeset is run, it processes the kickstart or autoyast template associated with the osimage, plugging in node-specific attributes, and creates a specific kickstart/autoyast file for each node in /install/autoinst. If you need to customize the template, make a copy of the template file that is pointed to by the osimage.template attribute and edit that file (or the files it includes).
{{:Monitor Installation}}
Now that your basic cluster is set up, here are suggestions for additional reading:
The CMM firmware can be updated by loading the latest cmefs.uxp firmware file with the CMM update command over the HTTP interface. The administrator needs to download the firmware from IBM Fix Central. The compressed tar file must be uncompressed and extracted to obtain the firmware update files. Place the cmefs.uxp file in a directory on the xCAT MN that the CMMs can reach over HTTP.
Once the firmware is extracted and cmefs.uxp is placed in that directory on the xCAT MN, you can use the CMM update command to apply the new firmware to one chassis at a time or to all chassis managed by the xCAT MN. More details on the CMM update command can be found at: http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp?topic=%2Fcom.ibm.acc.cmm.doc%2Fcli_command_update.html
The format of the update command is shown below. To flash (-u) the file and reboot (-r) afterwards:
update -T system:mm[1] -r -u http://<server>/<MN directory>/<update file>
To flash (-u), show progress (-v), and reboot (-r) afterwards:
update -T system:mm[1] -v -r -u http://<server>/<MN directory>/<update file>
To update the firmware and restart a single CMM, cmm01, from the xCAT MN at 70.0.0.1, use:
ssh USERID@cmm01 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp
If passwordless ssh login is set up on all CMMs, you can use the xCAT psh command to update all CMMs in the cluster at once.
psh -l USERID cmm update -T system:mm[1] -v -u http://70.0.0.1/firmware/cmefs.uxp
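If you prefer to update one chassis at a time, it can help to generate and review the exact command for each CMM before running anything. The following minimal sketch only prints the commands. The helper name cmm_update_cmd is hypothetical, and the CMM names, MN address, and firmware path are the example values used in this section.

```shell
# cmm_update_cmd SERVER PATH: print the CMM update command for one firmware file.
# SERVER is the HTTP server (the xCAT MN); PATH is the file location under its web root.
# (Hypothetical helper for illustration; not an xCAT command.)
cmm_update_cmd() {
    printf 'update -T system:mm[1] -v -r -u http://%s/%s\n' "$1" "$2"
}

# Dry run: show the ssh command for each CMM without executing it.
for cmm in cmm01 cmm02; do
    echo "ssh USERID@$cmm $(cmm_update_cmd 70.0.0.1 firmware/cmefs.uxp)"
done
# prints:
#   ssh USERID@cmm01 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp
#   ssh USERID@cmm02 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp
```

Because the loop only echoes, nothing is flashed until you copy and run a printed line yourself.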
If you see an "Unsupported security level" message after the CMM firmware has been updated, run the following command to resolve the issue:
rspconfig cmm sshcfg=enable snmpcfg=enable
The firmware of the blades can be updated by following the procedure at: https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup#Updating_Node_Firmware .
This section provides manual procedures for updating the firmware of the Ethernet and InfiniBand (IB) switch modules. More detailed information is available in the IBM Flex System documentation under Network switches: http://publib.boulder.ibm.com/infocenter/flexsys/information/
The IB6131 switch module is a Mellanox IB switch. Download the firmware (image-PPC_M460EX-SX_3.2.xxx.img) from the Mellanox website to your xCAT Management Node or to a server that can communicate with the Flex IB6131 switch module. The firmware update procedure for Mellanox IB switches, including the IB6131 switch module, is described in the xCAT document Managing the Mellanox Infiniband Network: https://sourceforge.net/apps/mediawiki/xcat/index.php?title=Managing_the_Mellanox_Infiniband_Network#Mellanox_Switch_and_Adapter_Firmware_Update
The IBM Flex System supports the EN2092 (1Gb) and EN4093 (10Gb) Ethernet switch modules, and the firmware is available from the IBM Support Portal (http://www-947.ibm.com/support/entry/portal/overview?brandind=hardware~puresystems~pureflex_system). The firmware update procedure for the Flex Ethernet (EN2092) switch module references two firmware images, one for the OS (GbScSE-1G-10G-7.5.1.xx_OS.img) and one for boot (GbScSE-1G-10G-7.5.1.x_Boot.img). Place these images in the /tftpboot directory on the xCAT MN or FTP server. Make sure that this server can communicate with the Ethernet switch module over Ethernet.
1) Log in to the Ethernet switch using the "admin" userid and specify the admin password.
_ssh admin@<switchipaddr>_
2) Change to the boot directory and list the current image settings with the cur command. The output includes the two OS images, called image1 and image2, and shows which image is the current boot image.
_>> boot_
_>> cur_
3) Use the gtimg command to get the new Ethernet OS image file from the FTP server and replace the older image on the Ethernet switch. The gtimg command prompts you for the full path of the OS image file, the FTP (root) userid, and the password. It then asks for the port to use ("data") and for confirmation, after which it downloads the image and flashes the update. In this example the EN2092 OS image "GbScSE-1G-10G-7.5.1.0_OS.img" replaces "image2" on the Ethernet switch.
_>> gtimg image2 <FTP server> GbScSE-1G-10G-7.5.1.0_OS.img_
Enter name of file on FTP/TFTP server: _/tftpboot/GbScSE-1G-10G-7.5.1.0_OS.img_
Enter username for FTP server or hit return for TFTP server: _root_
Enter password for username on FTP server: _<root password>_
Enter the port to use for downloading the image ["data"|"mgt"]: _"data"_
Confirm download operation [y/n]: _y_
4) Use the gtimg command to get the new Ethernet boot image file from the FTP server and replace the current boot image on the Ethernet switch. The gtimg command prompts you for the full path of the boot image file, the FTP (root) userid, and the password. It then asks for the port to use ("data") and for confirmation, after which it downloads the image and flashes the update. In this example the EN2092 boot image is "GbScSE-1G-10G-7.5.1.0_Boot.img", and the switch will point to the new boot image2.
_>> gtimg image2 <FTP server> GbScSE-1G-10G-7.5.1.0_Boot.img_
Enter name of file on FTP/TFTP server: _/tftpboot/GbScSE-1G-10G-7.5.1.0_Boot.img_
Enter username for FTP server or hit return for TFTP server: _root_
Enter password for username on FTP server: _<root password>_
Enter the port to use for downloading the image ["data"|"mgt"]: _"data"_
Confirm download operation [y/n]: _y_
5) Validate the current image settings with the cur command: image2 should now have the latest firmware level, and the current boot image should be working with the latest image2 file. You can then run the reset command to boot the Ethernet switch with the latest firmware level.
_>> cur_
_>> reset_
This section has been moved to an appendix because the discovery method for 2.7.7 and 2.8.1 was modified to be consistent for both p and x Flex blades. The methods below are no longer the preferred methods but are kept here for administrators who may have used them previously.
xCAT provides a command called slpdiscover (in xCAT 2.7) or lsslp --flexdiscover (in xCAT 2.8 and above) to detect the CMM and blade hardware and configure it. It does the following things:
Notes:
Run the discover command (tail -f /var/log/messages to follow the progress):
# lsslp --flexdiscover # or use slpdiscover for xCAT 2.7
cmm01: Found service:management-hardware.IBM:chassis-management-module at address 10.0.255.7
cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching
Configuration of cmm01node05[10.0.1.5] commencing, configuration may take a few minutes to take effect
Note: the message "cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching" means that no blade with these mpa and id attributes could be found in the database.
After slpdiscover/lsslp --flexdiscover completes, hardware control for the CMMs and blades should be configured properly. First check to see if the mac attribute is set for all of the CMMs and the ipmi.bmcid attribute is set for all of the blades:
lsdef cmm -c -i mac
nodels blade ipmi.bmcid
If they are, then verify hardware control is working:
# rpower blade stat | xcoll
====================================
blade
====================================
on
# rinv cmm01node11 vpd
cmm01node11: System Description: IBM Flex System x240+10Gb Fabric
cmm01node11: System Model/MTM: 8737AC1
cmm01node11: System Serial Number: 23FFP63
cmm01node11: Chassis Serial Number: 23FFP63
cmm01node11: Device ID: 32
cmm01node11: Manufacturer ID: IBM (20301)
cmm01node11: BMC Firmware: 1.34 (1AOO27Q 2012/05/04 22:00:54)
cmm01node11: Product ID: 321