Isilon Troubleshooting Guide

Error #1: Node's Baseboard Management Controller (BMC) and/or Chassis Management Controller (CMC) are unresponsive. Hardware is no longer being monitored.

Issue: The Baseboard Management Controller (BMC) and/or Chassis Management Controller (CMC) on S210, X210, X410, NL410, and HD400 nodes can sometimes become unresponsive. When this happens, the affected node may raise event 900010011.

Resolution:

Check that the versions below meet the recommended minimums. If they do not, upgrade to the recommended versions:

BMC (Baseboard Management Controller) firmware: version 1.25 or later
CMC (Chassis Management Controller) firmware: version 02.05 or later
OneFS: version 8.0.0.4, 8.0.1.1, or later

How to check BMC and CMC firmware versions:
IsilonCluster1-X# isi upgrade cluster firmware devices

Device             Type     Firmware                Mismatch  Lnns
---------------------------------------------------------------------------
BMC_S1400FP        BMC      1.25.9722               -         1-9,14-16,19
BMC_S2600CP        BMC      1.25.9722               -         13,17-18
BMC_S2600CP        BMC      1.20.5446               -         10-12
CMC_HFHB           CMC      01.02                   -         10-12
CMC_HFHB           CMC      02.05                   -         13
CMC_Yeti           CMC                              -         8
CMC_Yeti           CMC      00.0b                   -         1-7,9
CMC_Yeti           CMC      02.05                   -         14
CMC_HFHB           CMC      02.07                   -         17-18
CMC_Yeti           CMC      02.07                   -         15-16,19
......................
---------------------------------------------------------------------------
Total: xx

If the Firmware column is blank for particular LNNs (LNN 8 in the output above), that node's controller may not be responding, and you may need to reboot the node.
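
On a large cluster the blank rows are easy to miss by eye. The following is a minimal sketch for picking them out, assuming the whitespace-separated layout shown in the sample output above, where a blank Firmware value leaves only four fields on the row. `missing_fw` is a hypothetical helper name, not an Isilon tool.

```shell
# missing_fw: hypothetical helper, not part of OneFS.
# Reads the firmware table on stdin and prints "device lnns" for every
# BMC/CMC row whose Firmware column is blank.
# Assumption: columns are Device, Type, Firmware, Mismatch, Lnns separated
# by whitespace, so a blank Firmware leaves four fields with the "-" from
# the Mismatch column in third place.
missing_fw() {
    awk 'NF == 4 && $3 == "-" && ($2 == "BMC" || $2 == "CMC") { print $1, $4 }'
}
```

It could be used as `isi upgrade cluster firmware devices | missing_fw`. For the rows shown above it would print `CMC_Yeti 8`.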

How to check the BMC version on a particular node:
IsilonCluster1-8# /usr/bin/isi_hwtools/isi_ipmicmc -d -V -a bmc | grep firmware
IPMI firmware version = 01.25
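
To turn that version line into a pass/fail check against the recommended BMC minimum of 1.25, something like the sketch below may help. It assumes the output contains a line of the form `IPMI firmware version = 01.25` as shown above; `bmc_ok` is a hypothetical helper name.

```shell
# bmc_ok: hypothetical helper, not part of OneFS.
# Reads isi_ipmicmc output on stdin, finds the "IPMI firmware version"
# line, and returns 0 when the version is 1.25 or later, 1 otherwise
# (including when no version line is found).
# Assumption: the version is printed as "major.minor", e.g. "01.25".
bmc_ok() {
    awk -F'= *' '/IPMI firmware version/ {
        split($2, v, "[.]")
        found = 1
        ok = (v[1] + 0 > 1) || (v[1] + 0 == 1 && v[2] + 0 >= 25)
    }
    END { exit !(found && ok) }'
}
```

For example: `/usr/bin/isi_hwtools/isi_ipmicmc -d -V -a bmc | bmc_ok || echo "BMC firmware below 1.25 or not reported"`.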

Reset BMC/CMC
- To reset the BMC on all nodes in the cluster, run the following command:
# isi_for_array -s /usr/bin/isi_hwtools/isi_ipmicmc -c -a bmc

- When this completes, reset the CMC on all nodes in the cluster by running the following command:
# isi_for_array -s /usr/bin/isi_hwtools/isi_ipmicmc -c -a cmc

- If the cluster contains HD400 or X210 nodes, also run the following command to reset the CAR:
# isi_for_array -s /usr/bin/isi_hwtools/isi_ipmicmc -c -a car
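
Since the order of the resets matters (BMC first, then CMC, then CAR where applicable), it may be convenient to print the sequence for review before running anything. The sketch below only echoes the commands listed above and does not execute them; `reset_plan` is a hypothetical helper name.

```shell
# reset_plan: hypothetical helper that prints the reset commands in the
# order described above. It does not run them.
# Pass "car" as the first argument when the cluster contains HD400 or
# X210 nodes.
reset_plan() {
    targets="bmc cmc"
    if [ "${1:-}" = "car" ]; then
        targets="$targets car"
    fi
    for target in $targets; do
        echo "isi_for_array -s /usr/bin/isi_hwtools/isi_ipmicmc -c -a $target"
    done
}
```

Run `reset_plan` for a cluster without HD400 or X210 nodes, or `reset_plan car` otherwise, then run the printed commands one at a time, waiting for each to complete.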

To upgrade the firmware, refer to KB article 000466373:
https://emcservice.force.com/CustomersPartners/kA2j0000000R5lGCAS

