xCAT System x Support for IBM Flex


Introduction

IBM Flex combines networking, storage, and compute nodes in a single offering. It consists of an IBM Flex chassis, one or two Chassis Management Modules (CMMs), and compute nodes. The compute nodes include the IBM Flex System™ p260, p460, and 24L Power 7 servers, as well as the IBM Flex System™ x240 x86 server. This document covers only the management of the x240 blade server.

Terminology

The following terms will be used in this document:

Chassis Management Module (CMM): the pair of management modules installed in the rear of the chassis and connected by ethernet to the MN. The CMM is used to discover the compute nodes within the chassis and to collect some data about the nodes and chassis.

Compute node: a server in an IBM Flex system. Compute nodes can be either Power 7 servers or x86 Intel-based servers.

Blade node: a node with the hwtype attribute set to blade; it represents the whole compute node server.

Compute blade: a compute server with a blade hwtype.

Overview of Cluster Setup Process

Here is a summary of the steps required to set up the cluster and what this document will take you through:

  1. Prepare the management node - doing these things before installing the xCAT software helps the process go more smoothly.
  2. Install the xCAT software on the management node.
  3. Configure some cluster-wide information.
  4. Define a little information in the xCAT database about the ethernet switches and nodes - this is necessary to direct the node discovery process.
  5. Have xCAT configure and start several network daemons - this is necessary for both node discovery and node installation.
  6. Discover the nodes - during this phase, xCAT configures the BMCs, collects many attributes about each node, and stores them in the database.
  7. Set up the OS images and install the nodes.

Example Networks and Naming Conventions Used in This Document

In the examples used throughout this document, the following networks and naming conventions are used:

  • The service network: 10.0.0.0/255.255.0.0
    • The CMMs have IP addresses like 10.0.50.<chassisnum> and hostnames like cmm<chassisnum>
    • The blade IMMs have IP addresses like 10.0.<chassisnum>.<bladenum>
    • The switch management ports have IP addresses like 10.0.60.<switchnum> and hostnames like switch<switchnum>
  • The management network: 10.1.0.0/255.255.0.0
    • The management node IP address is 10.1.0.1
    • The OS on the blades has IP addresses like 10.1.<chassisnum>.<bladenum> and hostnames like cmm<chassisnum>node<bladenum>

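As a quick sanity check of the convention, the addresses for any blade can be derived from its hostname with plain shell string handling. This is just an illustration; in the cluster itself, xCAT computes these addresses via its database regular expressions.

```shell
# Derive the addresses implied by the naming convention for one blade.
# Illustration only; not an xCAT command.
name=cmm03node05
c=${name#cmm}; c=${c%node*}       # chassis part: "03"
b=${name#*node}                   # blade part:   "05"
c=$(expr "$c" + 0)                # drop leading zeros: 3
b=$(expr "$b" + 0)                # drop leading zeros: 5
echo "OS IP:  10.1.$c.$b"
echo "IMM IP: 10.0.$c.$b"
```

For cmm03node05 this prints the management-network OS address 10.1.3.5 and the service-network IMM address 10.0.3.5.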
Distro-specific Steps

  • [RH] indicates that the step only needs to be done for RHEL and Red Hat-based distros (CentOS, Scientific Linux, and in most cases Fedora).
  • [SLES] indicates that the step only needs to be done for SLES.

Command Man Pages and Database Attribute Descriptions

Prepare the Management Node for xCAT Installation

{{:Prepare the Management Node for xCAT Installation}}

Note: for Flex hardware, the switch configuration is only needed to discover (really to locate) the CMMs. The location of each blade is determined by the CMMs.

Install xCAT on the Management Node

{{:Install xCAT on the Management Node}}

Define CMMs, Blades, and Switches in the xCAT Database

Define the CMMs

First just add the list of CMMs and the groups they belong to:

nodeadd cmm[01-15] groups=cmm,all

Now define attributes that are the same for all CMMs. These can be defined at the group level. For a description of the attribute names, see the node object definition.

chdef -t group cmm groups=cmm,all hwtype=cmm mgt=blade

Next define the attributes that vary for each CMM. There are 2 different ways to do this. Assuming your naming conventions follow a regular pattern, the fastest way to do this is use regular expressions at the group level:

chdef -t group cmm mpa='|(.*)|($1)|' ip='|cmm(\d+)|10.0.50.($1+0)|'

This might look confusing at first, but once you parse it, it's not too bad. The regular expression syntax in xCAT database attribute values follows the form:

|pattern-to-match-on-the-nodename|value-to-give-the-attribute|

You use parentheses to indicate what should be matched on the left side and substituted on the right side. So for example, the mpa attribute above is:

|(.*)|($1)|

This means match the entire nodename (.*) and substitute it as the value for mpa. This is what we want because for CMMs the mpa attribute should be set to itself.

For the ip attribute above, it is:

|cmm(\d+)|10.0.50.($1+0)|

This means match the number part of the node name and use it as the last part of the IP address. (Adding 0 to the value just converts it from a string to a number to get rid of any leading zeros, i.e. change 09 to 9.) So for cmm07, the ip attribute will be 10.0.50.7.

For more information on xCAT's database regular expressions, see http://xcat.sourceforge.net/man5/xcatdb.5.html . To verify that the regular expressions are producing what you want, run lsdef for a node and confirm that the values are correct.

If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:

cmm01:
  mpa=cmm01
  ip=10.0.50.1
cmm02:
  mpa=cmm02
  ip=10.0.50.2
...

Then pipe this into chdef:

cat <stanzafile> | chdef -z
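
Typing 15 stanzas by hand is error-prone; a short shell loop can generate the same file. This is a sketch that assumes the cmm01-cmm15 naming used in this document.

```shell
# Generate the stanza file for cmm01..cmm15.
# The expr call keeps the loop POSIX-sh friendly.
: > cmm.stanza
i=1
while [ $i -le 15 ]; do
  nn=$(printf 'cmm%02d' $i)
  printf '%s:\n  mpa=%s\n  ip=10.0.50.%d\n' "$nn" "$nn" $i >> cmm.stanza
  i=$(expr $i + 1)
done
```

The resulting cmm.stanza can then be piped into chdef -z exactly as above.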

When you are done defining the CMMs, listing one should look like this:

# lsdef cmm07
Object name: cmm07
    groups=cmm,all
    hwtype=cmm
    ip=10.0.50.7
    mgt=blade
    mpa=cmm07
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles

Define the Blades for each CMM

Add the list of blades and the groups they belong to:

nodeadd cmm[01-15]node[01-14] groups=blade,all

Define the node attributes that are the same for all blades:

chdef -t group blade cons=blade mgt=ipmi nodetype=osi,blade arch=x86_64 installnic=mac profile=compute \
netboot=xnba getmac=blade serialport=0 serialspeed=115200

Now define the attributes that vary for each blade. You can use regular expressions (see the CMM section above for explanations):

chdef -t group blade mpa='|cmm(\d+)node(\d+)|cmm($1)|' id='|cmm(\d+)node(\d+)|($2+0)|' ip='|cmm(\d+)node(\d+)|10.1.($1+0).($2+0)|'

Note: we don't set the bmc attribute, because during IMM discovery xCAT will automatically give the IMM an IPv6 address and set that in the bmc attribute.

To see what attribute values this results in, list one node:

# lsdef cmm02node05 -i mpa,ip,id
Object name: cmm02node05
   ip=10.1.2.5
   id=5
   mpa=cmm02

If you don't want to use regular expressions, you can create a stanza file containing the node attribute values:

cmm01node01:
   ip=10.1.1.1
   id=1
   mpa=cmm01
cmm01node02:
   ip=10.1.1.2
   id=2
   mpa=cmm01
...

Then pipe this into chdef:

cat <stanzafile> | chdef -z
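
As with the CMMs, the full blade stanza file can be generated with a nested loop rather than typed by hand. This sketch assumes the 15-chassis, 14-bay layout used in this document's examples.

```shell
# Generate stanzas for cmm01node01 .. cmm15node14.
: > blade.stanza
c=1
while [ $c -le 15 ]; do
  b=1
  while [ $b -le 14 ]; do
    nn=$(printf 'cmm%02dnode%02d' $c $b)
    printf '%s:\n   ip=10.1.%d.%d\n   id=%d\n   mpa=cmm%02d\n' "$nn" $c $b $b $c >> blade.stanza
    b=$(expr $b + 1)
  done
  c=$(expr $c + 1)
done
```

Then pipe blade.stanza into chdef -z as above.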

Define the Switches

nodeadd switch[1-4] groups=switch,all
chdef -t group switch ip='|switch(\d+)|10.0.60.($1+0)|'

Fill in More xCAT Tables

The passwd Table

There are several passwords required for management:

  • blade - The userid and password for the CMM.
  • ipmi - The userid and password used to communicate with the IPMI service on the IMM (BMC) of each blade. To avoid problems, this should be the same as the CMM userid and password above.
  • system - The root id and password which will be set on the node OS during node deployment and used later for the administrator to login to the node OS.

Use tabedit to give the passwd table contents like:

#key,username,password,cryptmethod,comments,disable
"blade","USERID","Passw0rd",,,
"ipmi","USERID","Passw0rd",,,
"system","root","cluster",,,

The networks Table

All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is connected to. Now is the time to add to the networks table any other networks in the cluster, or update existing networks in the table.

For a sample Networks Setup, see the following example: [Setting_Up_a_Linux_xCAT_Mgmt_Node#Appendix_A:_Network_Table_Setup_Example]

Declare a dynamic range of addresses for discovery

If you want to use hardware discovery, 2 dynamic ranges must be defined in the networks table: one for the service network (CMMs and IMMs), and one for the management network (the OS for each blade). The dynamic range in the service network (in our example 10.0) is used while discovering the CMMs and IMMs using SLP. The dynamic range in the management network (in our example 10.1) is used when booting the blade with the genesis kernel to get the MACs.

chdef -t network 10_0_0_0-255_255_0_0 dynamicrange=10.0.255.1-10.0.255.254
chdef -t network 10_1_0_0-255_255_0_0 dynamicrange=10.1.255.1-10.1.255.254

Use xCAT to Configure Services on the Management Node

Setup /etc/hosts File

Since the mappings between the xCAT node names and IP addresses have been added to the xCAT database, you can run the makehosts xCAT command to create the /etc/hosts file from the database. (You can skip this step if you are creating /etc/hosts manually.)

makehosts switch,blade,cmm

Verify that the entries have been created in the file /etc/hosts.

Setup DNS

To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:

  • Ensure that /etc/sysconfig/named does not have ROOTDIR set
  • Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can't answer to these servers.

    chdef -t site forwarders=1.2.3.4,1.2.5.6

  • Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above, but is an easy way to test that your DNS is configured properly.)

    search cluster
    nameserver 10.1.0.1

  • Run makedns

    makedns

For more information about name resolution in an xCAT Cluster, see [Cluster_Name_Resolution].

Setup DHCP

You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster facing NICs. For example:

chdef -t site dhcpinterfaces=eth1

Then this will get the network stanza part of the DHCP configuration (including the dynamic range) set:

makedhcp -n

The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.

Setup TFTP

Nothing to do here - the TFTP server configuration was done by xCAT when it was installed on the Management Node.

Setup conserver

makeconservercf

Prepare for Hardware Discovery

When xCAT discovers hardware on the network, it needs a way to correlate each component returned by SLP to the corresponding node in the xCAT database. The recommended method is to configure the ethernet switches for SNMP access and to tell xCAT which switch port each CMM is plugged into. With this information, for each component returned by SLP, xCAT will get the MAC, find which switch port that MAC has been communicating on, and correlate that to a node name in the database.

The following section explains how to configure your switches for discovery. If you can't configure SNMP on your switches, then use the section after: [#Manually_Discovering_the_CMMs_Instead_of_Using_the_Switch_Ports].

Note: xCAT Flex discovery does not currently support CMMs with both a primary and a standby port.

Mapping the CMMs to the switch port information

In large clusters, the best automated method for discovering and mapping the CMMs to the objects defined in the xCAT database is to let xCAT poll data from the site switch to which each chassis CMM ethernet port is connected. This allows xCAT to map each CMM to its definition using the switch table. Add the switch/port information below to the switch table, and the authentication information to the switches table.

# tabdump switch
#node,switch,port,vlan,interface,comments,disable
"cmm01","switch","0/1",,,,
"cmm02","switch","0/2",,,,

where:

  • node is the CMM node object name
  • switch is the hostname of the switch
  • port is the switch port id. Note that xCAT does not need the complete port name; preceding non-numeric characters are ignored.
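
To see what that stripping amounts to, you can mimic it with sed. This is an illustration only; xCAT does the equivalent matching internally when it compares port ids.

```shell
# Drop the leading non-numeric characters from a switch port name,
# e.g. "Ethernet0/12" -> "0/12", matching how xCAT treats port ids.
port='Ethernet0/12'
stripped=$(echo "$port" | sed 's/^[^0-9]*//')
echo "$stripped"
```

So a port recorded as "0/1" in the switch table matches a switch-side name like "Ethernet0/1".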

If you configured your switches to use SNMP V3, then you need to define several attributes in the switches table. Assuming all of your switches use the same values, you can set these attributes at the group level:

tabch switch=switch switches.snmpversion=3 switches.username=xcatadmin switches.password=passw0rd switches.auth=SHA


# tabdump switches
#switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,switchtype,comments,disable
"switch","3","xcatadmin","passw0rd",,"SHA",,,,,,

Note: it might also be necessary to allow authentication at the VLAN level:

snmp-server group xcatadmin v3 auth context vlan-230

Manually Discovering the CMMs Instead of Using the Switch Ports

If you can't enable SNMP on your switches, use this more manual approach to discover your hardware. If you have already discovered your hardware using slpdiscover or lsslp --flexdiscover, skip this whole section.

Assuming your CMMs have at least received a dynamic address from the DHCP server, you can run lsslp to discover them and create a stanza file that contains their attributes that can be used to update the existing CMM nodes in the xCAT database. The problem is that without the switch port information, lsslp has no way to correlate the responses from SLP to the correct nodes in the database, so you must do that manually. Run:

lsslp -m -z -s CMM > cmm.stanza

and it will create a stanza file with entries for each CMM that look like this:

cmm01:
        objtype=node
        mpa=cmm01
        nodetype=mp
        mtm=789392X
        serial=100037A
        side=2
        groups=cmm,all
        mgt=blade
        mac=5c:f3:fc:25:da:99
        hidden=0
        otherinterfaces=70.0.0.7
        hwtype=cmm

Note: the otherinterfaces attribute is the dynamic IP address the CMM currently has.

The first thing we want to do is strip out the non-essential attributes, because we have already defined them at a group level:

grep -v -E '(mpa=|nodetype=|groups=|mgt=|hidden=|hwtype=)' cmm.stanza > cmm2.stanza

Now edit cmm2.stanza and change each "<node>:" line to have the correct node name. Then put these attributes into the database:

cat cmm2.stanza | chdef -z

Now use rspconfig to set the IP address of each CMM to the permanent (static) address specified in the ip attribute:

rspconfig cmm network=*

Run Discovery

Note: The slpdiscover command has been replaced by lsslp --flexdiscover in xCAT 2.8. If you are using an earlier version (xCAT 2.7.x), use the slpdiscover command.

The command performs the last steps in discovery and configuration needed to complete the hardware setup of both the CMM and the blades, for both the switch port and CMM MAC discovery methods.

Note: If you have run slpdiscover (xCAT 2.7.x) / lsslp --flexdiscover (xCAT 2.8) before, delete the ipmi.bmcid attribute first.

slpdiscover (xCAT 2.7.x) / lsslp --flexdiscover (xCAT 2.8) performs the following functions:

  1. locate the CMM
  2. configure the CMM
  3. locate the blades by slot id
  4. configure the IMM IP/userid/password for each node that has been defined on the MN

Note: xCAT only supports CMMs connected to VLAN 1 of the switch.

# lsslp --flexdiscover
cmm01: Found service:management-hardware.IBM:chassis-management-module at address 70.0.0.7
cmm01: Ignoring target in bay 7, no node found with mp.mpa/mp.id matching
cmm01: Ignoring target in bay 8, no node found with mp.mpa/mp.id matching
Configuration of cmm01node05[70.0.0.16] commencing, configuration may take a few minutes to take effect

Note:

  1. The message "cmm01: Ignoring target in bay 7, no node found with mp.mpa/mp.id matching" means that bay 7 does not have a node defined on the MN for this CMM and slot id.
  2. The IMM password will be the same as the CMM password setting.

Checking the Result of the slpdiscover or lsslp --flexdiscover Command

After slpdiscover/lsslp --flexdiscover completes, the blades and their interfaces should be defined and configured properly.

List the blade node definitions

You can list the blade definitions with nodels:

# nodels blade
cmm01node05
cmm01node06
cmm01node11
cmm01node12

List a blade node's attributes

# lsdef cmm01node11
Object name: cmm01node11
    bmc=70.0.0.9
    bmcpassword=Passw0rd1
    bmcusername=USERID
    cons=ipmi
    getmac=blade
    groups=all,blade
    id=11
    mac=34:40:b5:be:7e:f8
    mgt=ipmi
    mpa=cmm01
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles

Check the ipmi table

After slpdiscover (xCAT 2.7.x) / lsslp --flexdiscover (xCAT 2.8), you can also list the ipmi table to verify that the bmcid column has been updated with the blade's BMC MAC.

# tabdump ipmi
#node,bmc,bmcport,taggedvlan,bmcid,username,password,comments,disable
"blade","|cmm(\d+)node(\d+)|10.0.($1+0).($2+0)|",,,,"USERID","passw0rd",,
"cmm01node05","70.0.0.16",,,"5c:f3:fc:6e:00:41",,,,
"cmm01node06","70.0.0.15",,,"5c:f3:fc:6e:03:94",,,,

Verify that hardware control is working

Ping the BMC IP and try the hardware control commands to verify that the BMC IP and userid/password have been defined and configured correctly.

# rpower cmm01node11 stat
cmm01node11: on


# rinv cmm01node11 vpd
cmm01node11: System Description: IBM Flex System x240+10Gb Fabric
cmm01node11: System Model/MTM: 8737AC1
cmm01node11: System Serial Number: 23FFP63
cmm01node11: Chassis Serial Number: 23FFP63
cmm01node11: Device ID: 32
cmm01node11: Manufacturer ID: IBM (20301)
cmm01node11: BMC Firmware: 1.34 (1AOO27Q 2012/05/04 22:00:54)
cmm01node11: Product ID: 321


# rvitals cmm01node11 leds
cmm01node11: No active error LEDs detected

Using ASU to Update CMOS, uEFI, or BIOS Settings on the Nodes

There are a few ASU settings that need to be changed from the defaults. This section will discuss what needs to be changed and how to run asu update.

Download ASU

If you need to update CMOS/uEFI/BIOS settings on your nodes, download ASU (Advanced Settings Utility) from the IBM Fix Central web site.

Once you have the ASU RPM on your MN (management node), you have several choices of how to run it:

Run ASU Out-of-Band

ASU can be run on the management node (MN) and told to connect to the IMM of a node. First install ASU on the MN:

rpm -i ibm_utl_asu_asut78c-9.21_linux_x86-64.rpm
cd /opt/ibm/toolscenter/asu

Determine the IP address, username, and password of the IMM (BMC):

lsdef node1 -i bmc,bmcusername,bmcpassword
tabdump passwd | grep ipmi      # the default if username and password are not set for the node

Run ASU:

./asu64 show all --host <ip> --user <username> --password <pw>
./asu64 show uEFI.ProcessorHyperThreading --host <ip> --user <username> --password <pw>
./asu64 set uEFI.RemoteConsoleRedirection Enable --host <ip> --user <username> --password <pw>  # a common setting that needs to be set

If you want to set a lot of settings, you can put them in a file and run:

./asu64 batch <settingsfile> --host <ip> --user <username> --password <pw>

These are the settings needed to enable the serial console:

TBD

These are other settings that may be needed.

UEFI Boot/Physical Serial:


loaddefault BootOrder
loaddefault uEFI
set Processors.Hyper-Threading Disable
set BootOrder.BootOrder "PXE Network=Hard Disk 0"

Collect and set the MAC address in preparation for deployment

In order to successfully deploy the OS, you will need to associate the blade's ethernet MAC with the blade node object.

Listing the MAC

# rinv cmm01node11 mac
cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c

Setting the MAC

# chdef cmm01node11 mac=34:40:b5:be:c0:08

Listing the MAC in the node object

# lsdef cmm01node11 -i mac
Object name: cmm01node11
   mac=34:40:b5:be:c0:08
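
With many blades, the first MAC can be pulled out of the rinv output with a one-line filter. The sketch below parses saved output text; the commented rinv/chdef lines show the xCAT commands you would actually run.

```shell
# Extract "MAC Address 1" from rinv-style output.
# In a real cluster:  mac=$(rinv cmm01node11 mac | sed -n 's/.*MAC Address 1: //p')
#                     chdef cmm01node11 mac=$mac
rinv_out='cmm01node11: MAC Address 1: 34:40:b5:be:c0:08
cmm01node11: MAC Address 2: 34:40:b5:be:c0:0c'
mac=$(printf '%s\n' "$rinv_out" | sed -n 's/.*MAC Address 1: //p')
echo "$mac"
```

Here the filter yields 34:40:b5:be:c0:08, the value set on the node above.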

Blade node provisioning

Documentation for System x blade node provisioning is in another document, which describes the steps necessary to properly provision the nodes. When reading it, keep in mind that there are differences in attributes between that document (which describes iDataPlex nodes) and these blade nodes. Here is the link to the provisioning document:

https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup

Update the CMM firmware

The CMM firmware can be updated by loading the latest cmefs.uxp firmware file using the CMM update command over the HTTP interface. The administrator needs to download the firmware from IBM Fix Central, then uncompress and unzip the tar file to extract the firmware update files. Place the cmefs.uxp file in a directory on the xCAT MN.

Once the firmware is unzipped and the cmefs.uxp file is placed in the directory on the xCAT MN, you can use the CMM update command to update the firmware on one chassis at a time or on all chassis managed by the xCAT MN. More details on the CMM update command can be found at: http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp?topic=%2Fcom.ibm.acc.cmm.doc%2Fcli_command_update.html

The format of the update command is as follows. To flash (-u) the file and reboot (-r) afterwards:

update -T system:mm[1] -r -u http://<server>/<MN directory>/<update file>

To flash (-u), show progress (-v), and reboot (-r) afterwards:

update -T system:mm[1] -v -r -u http://<server>/<MN directory>/<update file>

To update the firmware and restart a single CMM (cmm01) from the xCAT MN (70.0.0.1), use:

ssh USERID@cmm01 update -T system:mm[1] -v -r -u http://70.0.0.1/firmware/cmefs.uxp

If password-less (key-based) SSH is set up on all CMMs, you can use xCAT psh to update all CMMs in the cluster at once.

psh -l USERID cmm update -T system:mm[1] -v -u http://70.0.0.1/firmware/cmefs.uxp

If you see an "Unsupported security level" message after the CMM firmware is updated, run the following command to resolve the issue.

rspconfig cmm sshcfg=enable snmpcfg=enable

Blade node firmware updates

Documentation for System x firmware updates is included in the link below. As with the provisioning document, keep in mind that there are some differences in attributes between iDataPlex and blade nodes in the documentation.

https://sourceforge.net/apps/mediawiki/xcat/index.php?title=XCAT_iDataPlex_Advanced_Setup#Updating_Node_Firmware
