We have encountered an issue with some Mellanox cards where rdma_bind_addr succeeds but the ib_verbs pointer is NULL, which caused SPDK to crash when attempting to use this port. The reason seems to be an invalid GUID of 0 (below is a procedure to re-flash it).
I don't think this should cause SPDK to crash, so I submitted a patch for review that checks that the IB verbs pointer is not NULL after binding - https://review.gerrithub.io/c/spdk/spdk/+/417858
Hope this helps,
It seems that Mellanox has a "blank_guid" option in its mstflint flash interface, so some manufacturers may provide RNICs with a base GUID of 0. This issue can be fixed by using the same tool to flash a new GUID. We use the MAC to generate it.
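As a rough way to spot affected cards before SPDK touches them, the sketch below scans sysfs for all-zero GUIDs. It assumes that a zero base GUID shows up as an all-zero node_guid under /sys/class/infiniband, which I have not confirmed on every card; is_zero_guid is a helper name made up for this example.

```shell
# Sketch only: is_zero_guid is a hypothetical helper, not part of any tool.
# Returns success (0) if the given GUID string contains nothing but zeros.
is_zero_guid() {
    guid=$(printf '%s' "$1" | tr -d ':')   # sysfs prints xxxx:xxxx:xxxx:xxxx
    [ -n "$guid" ] || return 1             # empty input is not a GUID
    case "$guid" in
        *[!0]*) return 1 ;;                # any non-zero digit: GUID is set
    esac
    return 0
}

# Assumption: a blank base GUID is visible as an all-zero node_guid.
for f in /sys/class/infiniband/*/node_guid; do
    [ -r "$f" ] || continue                # no RDMA devices, or unreadable
    if is_zero_guid "$(cat "$f")"; then
        echo "zero GUID: $f"
    fi
done
```

Any device reported here would be a candidate for the re-flash procedure.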
#Here is an example of such an RNIC with 0 as a base GUID:
# First generate the GUID from the MAC (BASE_MAC must be 12 hex digits with no separators)
BASE_GUID=$(echo "$BASE_MAC" | cut -c 1-6)"0300"$(echo "$BASE_MAC" | cut -c 7-12)
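
# A worked version of the GUID derivation above, as a sketch. mac_to_guid is a
# hypothetical helper name and the MAC below is made up for illustration; the
# "0300" filler is the same one used in the line above.
mac_to_guid() {
    mac=$(printf '%s' "$1" | tr -d ':')    # accept aa:bb:cc:dd:ee:ff as well
    case "$mac" in
        *[!0-9a-fA-F]*) echo "invalid MAC: $1" >&2; return 1 ;;
    esac
    if [ "${#mac}" -ne 12 ]; then
        echo "invalid MAC length: $1" >&2
        return 1
    fi
    # First 6 hex digits of the MAC, then 0300, then the last 6 hex digits
    printf '%s0300%s\n' "$(printf '%s' "$mac" | cut -c 1-6)" "$(printf '%s' "$mac" | cut -c 7-12)"
}
# Example with a made-up MAC:
mac_to_guid "00:02:c9:a1:b2:c3"    # prints 0002c90300a1b2c3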
#Now I flash the new GUID: