
QLogic FCoE adapter disappeared after reboot

Hi Community members! If you experience problems with your FCoE adapters, hopefully the following post will help you. Enjoy!

Background

I ran into an environment where vSphere hosts on Dell blades were showing high uptime, which is great, but a couple of hosts had experienced a PSOD, and a solution was needed to make the platform stable and prevent the issue from recurring.
After checking possible causes, I noticed that the QLogic 57840 CNA providing network and storage connectivity to the hosts was running an old firmware and driver version with known issues that could cause network outages and PSODs, so it was time to patch all hosts.
After patching the hosts with the driver recommended in the compatibility guide, everything seemed to work until the hosts were rebooted again to install other patches (needed to fix VPXA issues; see the separate article on that issue). The hosts came back online without their FCoE adapters, causing all storage devices to disappear.

Symptoms

After some investigation of the vmkernel logs, I noticed messages that had not appeared before the driver was installed, as shown below.

2017-06-19T23:35:55.840Z cpu10:32868)<3>bnx2fc:vmhba38:0000:01:00.4: bnx2fc_stop:3168 Clean-up FCoE CNA Queues
2017-06-19T23:35:55.840Z cpu10:32868)<3>bnx2fc:vmhba38:0000:01:00.4: bnx2fc_stop:3175 dev->netq_state 0x0
2017-06-19T23:35:57.864Z cpu13:32826)<3>bnx2fc:vmhba38:0000:01:00.4: bnx2fc_indicate_kcqe:1333 DESTROY success

2017-06-19T23:35:42.683Z cpu7:32868)<3>bnx2fc:vmhba39:0000:01:00.5: bnx2fc_stop:3168 Clean-up FCoE CNA Queues
2017-06-19T23:35:42.683Z cpu7:32868)<3>bnx2fc:vmhba39:0000:01:00.5: bnx2fc_stop:3175 dev->netq_state 0x0
2017-06-19T23:35:44.704Z cpu11:35167)<3>bnx2fc:vmhba39:0000:01:00.5: bnx2fc_indicate_kcqe:1333 DESTROY success
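
To check other hosts for the same symptom, a filter along these lines can pull the relevant entries out of the log. This is only a minimal sketch: the function name find_bnx2fc_teardown is hypothetical, the pattern is built from the messages quoted above, and the default path /var/log/vmkernel.log is an assumption about where the host keeps its vmkernel log.

```shell
# Print vmkernel log lines that match the bnx2fc teardown messages shown above.
# $1: log file to search (assumed default: /var/log/vmkernel.log)
find_bnx2fc_teardown() {
    grep -E 'bnx2fc:vmhba[0-9]+:.*(bnx2fc_stop|DESTROY success)' "${1:-/var/log/vmkernel.log}"
}

# Usage on the host (not run here):
#   find_bnx2fc_teardown
#   find_bnx2fc_teardown /path/to/copied/vmkernel.log
```

No output means the pattern was not found in that log file; it does not by itself prove the adapters are healthy.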

While checking whether this had already been reported, I found an article by Dell describing the issue on ESXi 6. All the symptoms matched what I was seeing on ESXi 5.5, so I decided to test the workaround.

Resolution:

The workaround offered is to change the bnx2fc module parameter bnx2fc_autodiscovery from its default of 0 to 1 using the command below.

esxcfg-module -s bnx2fc_autodiscovery=1 bnx2fc

Verifying configuration of module bnx2fc in VMware ESXi 5.5.0 build-5230635

esxcli system module parameters list -m bnx2fc

~ # esxcli system module parameters list -m bnx2fc
Name                  Type  Value  Description
--------------------  ----  -----  -----------
bnx2fc_autodiscovery  long         parameter to control auto FCoE discovery during system boot. 1 = Enable auto discovery. 0 = Disable auto discovery (Default).

Changing module configuration as recommended by Dell.

~ # esxcli system module parameters set -m bnx2fc -p bnx2fc_autodiscovery=1
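
Before rebooting, it may be worth confirming that the value was stored. The sketch below reads the output of the list command on standard input and prints the value of bnx2fc_autodiscovery. The function name autodiscovery_value is hypothetical, and it assumes the column layout shown above (Name, Type, Value, Description), with the Value column left empty while the parameter is unset.

```shell
# Print the configured value of bnx2fc_autodiscovery, or "unset" if the
# Value column is empty. Reads `esxcli system module parameters list` output
# on stdin. Assumes the columns are Name, Type, Value, Description.
autodiscovery_value() {
    awk '$1 == "bnx2fc_autodiscovery" {
        # With an empty Value column, field 3 is the first word of the description.
        if ($3 ~ /^[0-9]+$/) print $3; else print "unset"
    }'
}

# Usage on the host (not run here):
#   esxcli system module parameters list -m bnx2fc | autodiscovery_value
```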

After rebooting the host to test the workaround, the storage adapters persist and storage access is healthy.

Dell will resolve this issue in the next release of the driver, in 2018. Until then, you will want to ensure your host profiles are updated to include this configuration.
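
For a handful of hosts without host profiles, the same setting could also be pushed over SSH. The following is a rough sketch, not the author's script: the function name apply_autodiscovery and the host names are hypothetical, and it assumes SSH is enabled on the hosts and root login is permitted. By default it only prints the commands it would run.

```shell
# Print (dry run) or execute the workaround command on each listed host.
# $1: "run" to execute over SSH, anything else to only print the commands.
# Remaining arguments: host names (placeholders, supply your own).
# Assumptions: SSH is enabled on the hosts and root login is permitted.
apply_autodiscovery() {
    mode="$1"; shift
    cmd="esxcli system module parameters set -m bnx2fc -p bnx2fc_autodiscovery=1"
    for host in "$@"; do
        if [ "$mode" = "run" ]; then
            ssh "root@$host" "$cmd"
        else
            echo "ssh root@$host '$cmd'"
        fi
    done
}

# Usage (not run here):
#   apply_autodiscovery dry esx01 esx02
#   apply_autodiscovery run esx01 esx02
```

Each host still needs a reboot for the module parameter to take effect.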

 

Script:

https://github.com/bakingclouds/PowerCLi-vCenter/blob/bakingclouds-vcenter-scripts/update-bnx2fc-module-config.ps1

Applies to:

Dell blades with QLogic 57840 adapter running ESXi 5.5
Dell blades with QLogic 57840 adapter running ESXi 6

Author: Guillermo Ramallo
