While experimenting with unbinding NVMe controllers from the Linux nvme driver and then binding them to vfio-pci for use with SPDK, I ran into unusual behavior with one of the controllers.  For some initially inexplicable reason, one of the NVMe controllers did unbind from the nvme driver as desired, but it refused to bind to vfio-pci, whereas all of the other NVMe controllers had no trouble at all binding to vfio-pci.  Inspecting the kernel log (dmesg) didn't help.  After a bunch of debugging I uncovered the culprit: the /sys driver attribute, driver_override.  By default, all of my NVMe controllers appeared to have that attribute empty/null, e.g.:

# cat /sys/bus/pci/devices/0000:40:00.0/driver_override
(null)
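
(Aside: a quick way to eyeball this across every NVMe controller in the box is a small loop over sysfs.  This is just a sketch, but since 0x010802 is the standard PCI class code for an NVMe controller, something along these lines should work:)

    for dev in /sys/bus/pci/devices/*; do
        # 0x010802 = mass storage / non-volatile memory / NVMe I/O controller
        [ "$(cat "$dev/class")" = "0x010802" ] || continue
        echo "$dev: $(cat "$dev/driver_override")"
    done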

However, I discovered that for the NVMe controller that refused to bind to vfio-pci, its driver_override attribute contained the string “nvme”:

# cat /sys/bus/pci/devices/0000:40:00.0/driver_override
nvme

Per Linux kernel documentation, ABI/testing/sysfs-bus-pci:

This file allows the driver for a device to be specified which
will override standard static and dynamic ID matching.  When
specified, only a driver with a name matching the value written
to driver_override will have an opportunity to bind to the
device.

Eureka!  So, that explains why I had a particular NVMe device that refused to bind to vfio-pci.  I wanted to share this discovery with other folks in case they run into a similar issue.  Now, the mystery that remains: how and why did this particular NVMe controller get its driver_override attribute set to "nvme"?  It's not being used as a boot device, I've never attempted to use it with LVM (Logical Volume Manager), nor built any file systems on it, or anything of the sort.  I grep'd through my real rootfs's /etc and searched through my initramfs as well, but I've yet to discover what's responsible for setting that particular NVMe controller's driver_override.  Anyone have some ideas?
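
FWIW, for anyone else who hits this: IIRC the same sysfs-bus-pci doc also notes that the override can be cleared by writing an empty string, so something roughly like the following (using the example address from above) should get the device onto vfio-pci again.  I'm sketching this from memory, so treat it as a starting point rather than an exact recipe:

# echo vfio-pci > /sys/bus/pci/devices/0000:40:00.0/driver_override
# echo 0000:40:00.0 > /sys/bus/pci/drivers_probe

(Writing just a newline to driver_override clears the override entirely instead of repointing it at vfio-pci.)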

thanks,

--
Lance Hartmann
lance.hartmann@oracle.com