[PATCH 0/6] Support DAX for device-mapper dm-linear devices
by Toshi Kani
This patch-set adds DAX support to device-mapper dm-linear devices
used by LVM. It works with LVM commands as follows:
- Creating a logical volume from only DAX-capable devices (such
as pmem) makes the logical volume DAX capable as well.
- Once a logical volume is DAX capable, it may not be extended
with non-DAX-capable devices.
The direct_access interface is added to dm and dm-linear to map
a request to a target device.
- Patches 1-2 introduce the GENHD_FL_DAX flag to indicate DAX capability.
- Patches 3-4 add direct_access functions to dm and dm-linear.
- Patches 5-6 set GENHD_FL_DAX on the dm device when all targets are DAX capable.
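
For illustration only, here is a minimal userspace sketch of the two ideas described in this cover letter: a mapped device counts as DAX capable only when every target is, and a linear target translates a sector of the mapped device into a sector of the underlying device. The names toy_target, toy_table_supports_dax() and toy_linear_map() are hypothetical and are not the functions added by these patches; treating an empty table as not DAX capable is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one linear target; not the kernel's struct dm_target. */
struct toy_target {
	uint64_t begin;      /* first sector of this target in the mapped device */
	uint64_t len;        /* length of the target in sectors */
	uint64_t dev_start;  /* start sector on the underlying device */
	bool dax_supported;  /* whether the underlying device is DAX capable */
};

/* The mapped device is DAX capable only if every target is (assumed: empty table is not). */
static bool toy_table_supports_dax(const struct toy_target *t, size_t n)
{
	if (n == 0)
		return false;
	for (size_t i = 0; i < n; i++)
		if (!t[i].dax_supported)
			return false;
	return true;
}

/*
 * Translate a sector of the mapped device into a sector of the underlying
 * device. Returns false when the sector falls outside this target.
 */
static bool toy_linear_map(const struct toy_target *t, uint64_t sector,
			   uint64_t *dev_sector)
{
	if (sector < t->begin || sector - t->begin >= t->len)
		return false;
	*dev_sector = t->dev_start + (sector - t->begin);
	return true;
}
```

In this model, extending a DAX-capable table with a target whose dax_supported is false would make toy_table_supports_dax() return false, which mirrors why the series refuses such an extension.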
---
Toshi Kani (6):
1/6 genhd: Add GENHD_FL_DAX to gendisk flags
2/6 block: Check GENHD_FL_DAX for DAX capability
3/6 dm: Add dm_blk_direct_access() for mapped device
4/6 dm-linear: Add linear_direct_access()
5/6 dm, dm-linear: Add dax_supported to dm_target
6/6 dm: Enable DAX support for mapper device
---
drivers/block/brd.c | 2 +-
drivers/md/dm-linear.c | 19 +++++++++++++++++++
drivers/md/dm-table.c | 12 +++++++++---
drivers/md/dm.c | 38 ++++++++++++++++++++++++++++++++++++--
drivers/md/dm.h | 7 +++++++
drivers/nvdimm/pmem.c | 2 +-
drivers/s390/block/dcssblk.c | 1 +
fs/block_dev.c | 5 +++--
include/linux/device-mapper.h | 15 +++++++++++++++
include/linux/genhd.h | 2 +-
10 files changed, 93 insertions(+), 10 deletions(-)
4 years, 8 months
[PATCH v4 0/5] Introduce device_add_disk() to kill gendisk.driverfs_dev
by Dan Williams
Changes since v3 [1]:
1/ Broke out the non-trivial conversions into their own patches.
2/ Fix a behavior change in arch/um/drivers/ubd_kern.c. This driver
optionally creates parented and un-parented block devices. (Bart)
3/ Fix a behavior change in drivers/mmc/card/block.c. This driver does
not use blk-core infrastructure for partitions and instead registers
partitions as gendisk-children of the whole device gendisk.
Crediting Bart since his report triggered this re-review.
4/ Use the local CARD_TO_DEV() helper in drivers/block/rsxx/dev.c.
5/ Use the local 'dev' variable that represents the parent device in
drivers/scsi/sd.c.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-June/005988.html
---
Answer the "// FIXME: remove" in include/linux/genhd.h. This should be
functionally equivalent to the previous state. These patches received a
build success notification from the kbuild robot across 122 configs.
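
As a rough sketch of the shape of this conversion, the userspace model below passes the parent device at registration time instead of stashing it in the disk beforehand. The toy_ types and functions are hypothetical stand-ins; the exact kernel signatures are not quoted in this cover letter, so the two-argument form shown here is an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct gendisk. */
struct toy_device {
	const char *name;
};

struct toy_gendisk {
	struct toy_device *parent;  /* recorded at registration time */
	int registered;
};

/* Register the disk under an explicit parent (assumed shape of the new call). */
static void toy_device_add_disk(struct toy_device *parent,
				struct toy_gendisk *disk)
{
	disk->parent = parent;
	disk->registered = 1;
}

/* Un-parented registration becomes the parent == NULL case. */
static void toy_add_disk(struct toy_gendisk *disk)
{
	toy_device_add_disk(NULL, disk);
}
```

With this shape, a driver that previously set a parent field and then registered the disk makes a single call, and drivers without a parent keep using the one-argument helper.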
---
Dan Williams (5):
block: introduce device_add_disk()
mmc: move 'parent' tracking to mmc_blk_data
um: track 'parent' device in a local variable
block: convert to device_add_disk()
block: remove ->driverfs_dev
arch/powerpc/sysdev/axonram.c | 3 +--
arch/um/drivers/ubd_kern.c | 5 +++--
block/genhd.c | 18 +++++++++---------
drivers/block/cciss.c | 3 +--
drivers/block/floppy.c | 3 +--
drivers/block/mtip32xx/mtip32xx.c | 5 ++---
drivers/block/ps3disk.c | 3 +--
drivers/block/ps3vram.c | 3 +--
drivers/block/rsxx/dev.c | 4 +---
drivers/block/skd_main.c | 8 +++-----
drivers/block/sunvdc.c | 3 +--
drivers/block/virtio_blk.c | 3 +--
drivers/block/xen-blkfront.c | 3 +--
drivers/ide/ide-cd.c | 3 +--
drivers/ide/ide-gd.c | 3 +--
drivers/memstick/core/ms_block.c | 3 +--
drivers/memstick/core/mspro_block.c | 3 +--
drivers/mmc/card/block.c | 5 +++--
drivers/mtd/mtd_blkdevs.c | 4 +---
drivers/nvdimm/blk.c | 3 +--
drivers/nvdimm/btt.c | 3 +--
drivers/nvdimm/bus.c | 2 +-
drivers/nvdimm/pmem.c | 3 +--
drivers/nvme/host/core.c | 3 +--
drivers/s390/block/dasd_genhd.c | 3 +--
drivers/s390/block/dcssblk.c | 3 +--
drivers/s390/block/scm_blk.c | 3 +--
drivers/scsi/sd.c | 3 +--
drivers/scsi/sr.c | 3 +--
include/linux/genhd.h | 8 ++++++--
30 files changed, 50 insertions(+), 72 deletions(-)
4 years, 8 months
[PATCH] nfit: add Microsoft NVDIMM DSM command set to white list
by Stuart Hayes
Add the Microsoft _DSM command set to the white list of NVDIMM command sets.
This command set is documented at https://msdn.microsoft.com/library/windows/hardware/mt604741.
Signed-off-by: Stuart Hayes <stuart.w.hayes(a)gmail.com>
---
drivers/acpi/nfit.c | 9 ++++++---
drivers/acpi/nfit.h | 4 ++++
include/uapi/linux/ndctl.h | 1 +
3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 2215fc8..48fc575 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1130,11 +1130,11 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
}
/*
- * Until standardization materializes we need to consider up to 3
+ * Until standardization materializes we need to consider several
* different command sets. Note, that checking for function0 (bit0)
* tells us if any commands are reachable through this uuid.
*/
- for (i = NVDIMM_FAMILY_INTEL; i <= NVDIMM_FAMILY_HPE2; i++)
+ for (i = NVDIMM_FAMILY_INTEL; i <= NVDIMM_FAMILY_MSFT; i++)
if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i), 1, 1))
break;
@@ -1150,7 +1150,9 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
dsm_mask = 0x1fe;
if (disable_vendor_specific)
dsm_mask &= ~(1 << 8);
- } else {
+ } else if (nfit_mem->family == NVDIMM_FAMILY_MSFT)
+ dsm_mask = 0xffffffff;
+ else {
dev_err(dev, "unknown dimm command family\n");
nfit_mem->family = -1;
return force_enable_dimms ? 0 : -ENODEV;
@@ -2692,6 +2694,7 @@ static __init int nfit_init(void)
acpi_str_to_uuid(UUID_NFIT_DIMM, nfit_uuid[NFIT_DEV_DIMM]);
acpi_str_to_uuid(UUID_NFIT_DIMM_N_HPE1, nfit_uuid[NFIT_DEV_DIMM_N_HPE1]);
acpi_str_to_uuid(UUID_NFIT_DIMM_N_HPE2, nfit_uuid[NFIT_DEV_DIMM_N_HPE2]);
+ acpi_str_to_uuid(UUID_NFIT_DIMM_N_MSFT, nfit_uuid[NFIT_DEV_DIMM_N_MSFT]);
nfit_wq = create_singlethread_workqueue("nfit");
if (!nfit_wq)
diff --git a/drivers/acpi/nfit.h b/drivers/acpi/nfit.h
index 11cb383..f06fa91 100644
--- a/drivers/acpi/nfit.h
+++ b/drivers/acpi/nfit.h
@@ -31,6 +31,9 @@
#define UUID_NFIT_DIMM_N_HPE1 "9002c334-acf3-4c0e-9642-a235f0d53bc6"
#define UUID_NFIT_DIMM_N_HPE2 "5008664b-b758-41a0-a03c-27c2f2d04f7e"
+/* https://msdn.microsoft.com/library/windows/hardware/mt604741 */
+#define UUID_NFIT_DIMM_N_MSFT "1ee68b36-d4bd-4a1a-9a16-4f8e53d46e05"
+
#define ACPI_NFIT_MEM_FAILED_MASK (ACPI_NFIT_MEM_SAVE_FAILED \
| ACPI_NFIT_MEM_RESTORE_FAILED | ACPI_NFIT_MEM_FLUSH_FAILED \
| ACPI_NFIT_MEM_NOT_ARMED)
@@ -40,6 +43,7 @@ enum nfit_uuids {
NFIT_DEV_DIMM = NVDIMM_FAMILY_INTEL,
NFIT_DEV_DIMM_N_HPE1 = NVDIMM_FAMILY_HPE1,
NFIT_DEV_DIMM_N_HPE2 = NVDIMM_FAMILY_HPE2,
+ NFIT_DEV_DIMM_N_MSFT = NVDIMM_FAMILY_MSFT,
NFIT_SPA_VOLATILE,
NFIT_SPA_PM,
NFIT_SPA_DCR,
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index 309915f..ba5a8c7 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -298,6 +298,7 @@ struct nd_cmd_pkg {
#define NVDIMM_FAMILY_INTEL 0
#define NVDIMM_FAMILY_HPE1 1
#define NVDIMM_FAMILY_HPE2 2
+#define NVDIMM_FAMILY_MSFT 3
#define ND_IOCTL_CALL _IOWR(ND_IOCTL, ND_CMD_CALL,\
struct nd_cmd_pkg)
--
1.8.3.1
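
For readers skimming the diff above, the family detection it extends can be modelled in userspace as "try each known family in order and take the first one that answers". This is only a sketch: the toy_ names and the probe callback are hypothetical, and the callback stands in for the function-0 _DSM check done against each family's UUID.

```c
#include <stdbool.h>

/* Hypothetical mirror of the family numbering used in the patch. */
enum {
	TOY_FAMILY_INTEL = 0,
	TOY_FAMILY_HPE1 = 1,
	TOY_FAMILY_HPE2 = 2,
	TOY_FAMILY_MSFT = 3,
};

/* Stand-in for "does function 0 respond for this family's UUID?" */
typedef bool (*toy_probe_fn)(int family);

/* Return the first family whose probe succeeds, or -1 if none does. */
static int toy_detect_family(toy_probe_fn probe)
{
	for (int i = TOY_FAMILY_INTEL; i <= TOY_FAMILY_MSFT; i++)
		if (probe(i))
			return i;
	return -1;
}

/* Example probes: a DIMM that only answers the MSFT family, and one that answers none. */
static bool toy_probe_only_msft(int family)
{
	return family == TOY_FAMILY_MSFT;
}

static bool toy_probe_none(int family)
{
	(void)family;
	return false;
}
```

Before the patch the loop bound stopped at the HPE2 family, so a DIMM like the first example probe would have fallen through to the "unknown dimm command family" path.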
4 years, 8 months
[PATCH v3 0/3] Introduce device_add_disk() to kill gendisk.driverfs_dev
by Dan Williams
Changes since v2 [1]:
* Rebase on top of linux-block.git for-4.8/drivers
[1]: https://lkml.org/lkml/2016/3/11/414
---
Answer the "// FIXME: remove" in include/linux/genhd.h. This should be
functionally equivalent to the previous state. These patches received a
build success notification from the kbuild robot across 116 configs.
The following series implements...
---
Dan Williams (3):
block: introduce device_add_disk()
block: convert to device_add_disk()
block: remove ->driverfs_dev
arch/powerpc/sysdev/axonram.c | 3 +--
arch/um/drivers/ubd_kern.c | 3 +--
block/genhd.c | 18 +++++++++---------
drivers/block/cciss.c | 3 +--
drivers/block/floppy.c | 3 +--
drivers/block/mtip32xx/mtip32xx.c | 5 ++---
drivers/block/ps3disk.c | 3 +--
drivers/block/ps3vram.c | 3 +--
drivers/block/rsxx/dev.c | 4 +---
drivers/block/skd_main.c | 8 +++-----
drivers/block/sunvdc.c | 3 +--
drivers/block/virtio_blk.c | 3 +--
drivers/block/xen-blkfront.c | 3 +--
drivers/ide/ide-cd.c | 3 +--
drivers/ide/ide-gd.c | 3 +--
drivers/memstick/core/ms_block.c | 3 +--
drivers/memstick/core/mspro_block.c | 3 +--
drivers/mmc/card/block.c | 3 +--
drivers/mtd/mtd_blkdevs.c | 4 +---
drivers/nvdimm/blk.c | 3 +--
drivers/nvdimm/btt.c | 3 +--
drivers/nvdimm/bus.c | 2 +-
drivers/nvdimm/pmem.c | 3 +--
drivers/nvme/host/core.c | 3 +--
drivers/s390/block/dasd_genhd.c | 3 +--
drivers/s390/block/dcssblk.c | 3 +--
drivers/s390/block/scm_blk.c | 3 +--
drivers/scsi/sd.c | 3 +--
drivers/scsi/sr.c | 3 +--
include/linux/genhd.h | 8 ++++++--
30 files changed, 46 insertions(+), 72 deletions(-)
--
Signature
4 years, 8 months
[RFC] x86/mm: only allow memmap=XX!YY over existing RAM
by Yigal Korman
Before this patch, passing a range that lies beyond the physical memory
range succeeds: the user sees a /dev/pmem0 and is able to access it, but
reads always return 0 and writes are silently ignored.
I've gotten more than one bug report about mkfs.{xfs,ext4} or nvml
failures that were eventually tracked down to wrong values passed to
memmap.
This patch prevents the above issue: instead of adding a new memory
range, it only updates an existing RAM memory range to the PRAM type.
This way, passing the wrong memmap will either not give you a pmem at
all or give you a smaller one that actually has RAM behind it.
And if someone still needs to fake a pmem that doesn't have RAM behind
it, they can simply do memmap=XX@YY,XX!YY.
Signed-off-by: Yigal Korman <yigal(a)plexistor.com>
---
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 569c1e4..bcd2ebb1 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -877,7 +877,7 @@ static int __init parse_memmap_one(char *p)
e820_add_region(start_at, mem_size, E820_RESERVED);
} else if (*p == '!') {
start_at = memparse(p+1, &p);
- e820_add_region(start_at, mem_size, E820_PRAM);
+ e820_update_range(start_at, mem_size, E820_RAM, E820_PRAM);
} else
e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
--
1.9.3
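
To make the behavior change above concrete, here is a small userspace sketch of "only convert the part of the request that overlaps RAM". It models a single RAM range, whereas a real e820 map has many entries; toy_range and toy_clip_to_ram() are hypothetical names, and address wraparound is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_GIB (1ULL << 30)

/* Hypothetical address range: [start, start + size). */
struct toy_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Clip a requested pmem range to a RAM range. Returns false when there is
 * no overlap, i.e. the request would not produce a pmem range at all.
 * Assumes start + size does not wrap around.
 */
static bool toy_clip_to_ram(struct toy_range ram, struct toy_range req,
			    struct toy_range *out)
{
	uint64_t ram_end = ram.start + ram.size;
	uint64_t req_end = req.start + req.size;
	uint64_t lo = req.start > ram.start ? req.start : ram.start;
	uint64_t hi = req_end < ram_end ? req_end : ram_end;

	if (lo >= hi)
		return false;
	out->start = lo;
	out->size = hi - lo;
	return true;
}
```

With 8 GiB of RAM starting at 0, a 4 GiB request at 6 GiB is clipped to the 2 GiB that actually has RAM behind it, and a 4 GiB request at 12 GiB yields nothing, matching the two outcomes described in the commit message.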
4 years, 8 months
[PATCH] libnvdimm: use devm_add_action_or_reset()
by Dan Williams
Clean up needless calls to the action routine by letting
devm_add_action_or_reset() call it automatically. This does cause the
disk to be registered and immediately unregistered when a memory allocation
fails, but the block layer should be prepared for such an event.
Reported-by: Sudip Mukherjee <sudipm.mukherjee(a)gmail.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/nvdimm/blk.c | 11 ++++-------
drivers/nvdimm/pmem.c | 12 +++++-------
2 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 495e06d9f7e7..dfe691cf4d74 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -267,10 +267,8 @@ static int nsblk_attach_disk(struct nd_namespace_blk *nsblk)
q = blk_alloc_queue(GFP_KERNEL);
if (!q)
return -ENOMEM;
- if (devm_add_action(dev, nd_blk_release_queue, q)) {
- blk_cleanup_queue(q);
+ if (devm_add_action_or_reset(dev, nd_blk_release_queue, q))
return -ENOMEM;
- }
blk_queue_make_request(q, nd_blk_make_request);
blk_queue_max_hw_sectors(q, UINT_MAX);
@@ -282,10 +280,6 @@ static int nsblk_attach_disk(struct nd_namespace_blk *nsblk)
disk = alloc_disk(0);
if (!disk)
return -ENOMEM;
- if (devm_add_action(dev, nd_blk_release_disk, disk)) {
- put_disk(disk);
- return -ENOMEM;
- }
disk->driverfs_dev = dev;
disk->first_minor = 0;
@@ -296,6 +290,9 @@ static int nsblk_attach_disk(struct nd_namespace_blk *nsblk)
set_capacity(disk, 0);
add_disk(disk);
+ if (devm_add_action_or_reset(dev, nd_blk_release_disk, disk))
+ return -ENOMEM;
+
if (nsblk_meta_size(nsblk)) {
int rc = nd_integrity_init(disk, nsblk_meta_size(nsblk));
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 608fc4464574..2ef1229e8ded 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -269,10 +269,8 @@ static int pmem_attach_disk(struct device *dev,
* At release time the queue must be dead before
* devm_memremap_pages is unwound
*/
- if (devm_add_action(dev, pmem_release_queue, q)) {
- blk_cleanup_queue(q);
+ if (devm_add_action_or_reset(dev, pmem_release_queue, q))
return -ENOMEM;
- }
if (IS_ERR(addr))
return PTR_ERR(addr);
@@ -288,10 +286,6 @@ static int pmem_attach_disk(struct device *dev,
disk = alloc_disk_node(0, nid);
if (!disk)
return -ENOMEM;
- if (devm_add_action(dev, pmem_release_disk, disk)) {
- put_disk(disk);
- return -ENOMEM;
- }
disk->fops = &pmem_fops;
disk->queue = q;
@@ -305,6 +299,10 @@ static int pmem_attach_disk(struct device *dev,
nvdimm_badblocks_populate(to_nd_region(dev->parent), &pmem->bb, res);
disk->bb = &pmem->bb;
add_disk(disk);
+
+ if (devm_add_action_or_reset(dev, pmem_release_disk, disk))
+ return -ENOMEM;
+
revalidate_disk(disk);
return 0;
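
The cleanup pattern this patch relies on can be sketched in userspace as follows: registering a release action may fail, and the "or reset" variant runs the action immediately in that case so the caller only has to return the error. Everything prefixed toy_ is a hypothetical model, including the fixed-size table that stands in for an allocation failure.

```c
#include <stddef.h>

#define TOY_MAX_ACTIONS 2

typedef void (*toy_action_fn)(void *data);

/* Hypothetical model of a device's list of deferred release actions. */
struct toy_dev {
	toy_action_fn actions[TOY_MAX_ACTIONS];
	void *data[TOY_MAX_ACTIONS];
	size_t count;
};

/* Register an action; fails when the table is full (stands in for -ENOMEM). */
static int toy_add_action(struct toy_dev *dev, toy_action_fn fn, void *data)
{
	if (dev->count == TOY_MAX_ACTIONS)
		return -1;
	dev->actions[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* On registration failure, run the action right away before returning the error. */
static int toy_add_action_or_reset(struct toy_dev *dev, toy_action_fn fn,
				   void *data)
{
	int rc = toy_add_action(dev, fn, data);

	if (rc)
		fn(data);
	return rc;
}

/* Teardown: run the registered actions in reverse order of registration. */
static void toy_release_all(struct toy_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->actions[dev->count](dev->data[dev->count]);
	}
}

/* Example release action: counts how many times it ran. */
static void toy_count_release(void *data)
{
	(*(int *)data)++;
}
```

This is why the disk hunks move the registration to after add_disk(): if registering the release action fails, the action itself undoes the registration that just happened.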
4 years, 8 months
Mapping PCIe BAR as PMEM
by Vijairaj
Hi,
I am running the 4.4 kernel on AMD64 and was wondering what's a good way of
mapping the 16M of battery backed SRAM on a custom PCIe card as PMEM. Once
mapped, I will create a file system on the block device.
I am thinking about doing the following:
- Use the kernel parameter memmap=nn!ss to reserve the PCIe BAR
- Write a PCI driver that enables the PCI device with `pci_enable_device()`
- Load the PCI driver
- Load the nd_pmem driver
Is this sufficient or is there anything else required to be done in the PCI
driver?
thanks,
Vijairaj
4 years, 8 months
[PATCH] Fix reported kmemleak
by Shaun Tancheff
This fixes a memory leak that several people reported via kmemleak on
4.7-rc1, after 9082e87bfbf8 ("block: remove struct bio_batch").
This patch just formalizes the fix from this discussion:
https://lkml.kernel.org/r/20160606112620.GA29910@e104818-lin.cambridge.ar...
The same issue appears here:
https://lkml.kernel.org/r/20160607102651.GA6480@dhcp12-144.nay.redhat.com
Patch is also available at:
https://github.com/stancheff/linux.git
branch: v4.7-rc2+bio_put
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: axboe(a)kernel.dk
Cc: David Drysdale <drysdale(a)google.com>
Cc: Xiong Zhou <xzhou(a)redhat.com>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: linux-next(a)vger.kernel.org
Cc: linux-nvdimm(a)ml01.01.org
Cc: Larry Finger <Larry.Finger(a)lwfinger.net>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: LKML <linux-kernel(a)vger.kernel.org>
Cc: Jens Axboe <axboe(a)fb.com>
Cc: bart.vanassche(a)sandisk.com
Shaun Tancheff (1):
Missing bio_put following submit_bio_wait
block/blk-lib.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--
2.8.1
4 years, 8 months
[PATCH] libnvdimm, nfit: treat volatile virtual CD region as read-only pmem
by Lee, Chun-Yi
This patch adds code to treat a volatile virtual CD region as a
read-only pmem region, so that the resulting read-only /dev/pmem* device
can be mounted as iso9660.
This is useful together with HTTP Boot in EFI firmware, which pulls a
remote ISO file into a local memory region for booting and installation.
Wiki page of UEFI HTTPBoot with OVMF:
https://en.opensuse.org/UEFI_HTTPBoot_with_OVMF
Signed-off-by: Lee, Chun-Yi <jlee(a)suse.com>
Cc: Gary Lin <GLin(a)suse.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw(a)rjwysocki.net>
---
drivers/acpi/nfit.c | 8 +++++++-
drivers/nvdimm/region_devs.c | 26 +++++++++++++++++++++++++-
include/linux/libnvdimm.h | 2 ++
3 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index 2215fc8..b100a17 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1949,6 +1949,7 @@ static int acpi_nfit_init_mapping(struct acpi_nfit_desc *acpi_desc,
switch (nfit_spa_type(spa)) {
case NFIT_SPA_PM:
case NFIT_SPA_VOLATILE:
+ case NFIT_SPA_VCD:
nd_mapping->start = memdev->address;
nd_mapping->size = memdev->region_size;
break;
@@ -1995,7 +1996,7 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
if (nfit_spa->nd_region)
return 0;
- if (spa->range_index == 0) {
+ if (spa->range_index == 0 && nfit_spa_type(spa) != NFIT_SPA_VCD) {
dev_dbg(acpi_desc->dev, "%s: detected invalid spa index\n",
__func__);
return 0;
@@ -2059,6 +2060,11 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
ndr_desc);
if (!nfit_spa->nd_region)
rc = -ENOMEM;
+ } else if (nfit_spa_type(spa) == NFIT_SPA_VCD) {
+ nfit_spa->nd_region = nvdimm_vcd_region_create(nvdimm_bus,
+ ndr_desc);
+ if (!nfit_spa->nd_region)
+ rc = -ENOMEM;
}
out:
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 40fcfea..f155941 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -56,9 +56,19 @@ static struct device_type nd_volatile_device_type = {
.release = nd_region_release,
};
+static struct device_type nd_vcd_device_type = {
+ .name = "nd_vcd",
+ .release = nd_region_release,
+};
+
+bool is_nd_vcd(struct device *dev)
+{
+ return dev ? dev->type == &nd_vcd_device_type : false;
+}
+
bool is_nd_pmem(struct device *dev)
{
- return dev ? dev->type == &nd_pmem_device_type : false;
+ return dev ? dev->type == &nd_pmem_device_type || is_nd_vcd(dev) : false;
}
bool is_nd_blk(struct device *dev)
@@ -338,6 +348,9 @@ static ssize_t read_only_store(struct device *dev,
int rc = strtobool(buf, &ro);
struct nd_region *nd_region = to_nd_region(dev);
+ if (is_nd_vcd(dev))
+ return -ENXIO;
+
if (rc)
return rc;
@@ -687,6 +700,9 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
ro = 1;
}
+ if (dev_type == &nd_vcd_device_type)
+ ro = 1;
+
if (dev_type == &nd_blk_device_type) {
struct nd_blk_region_desc *ndbr_desc;
struct nd_blk_region *ndbr;
@@ -774,6 +790,14 @@ struct nd_region *nvdimm_pmem_region_create(struct nvdimm_bus *nvdimm_bus,
}
EXPORT_SYMBOL_GPL(nvdimm_pmem_region_create);
+struct nd_region *nvdimm_vcd_region_create(struct nvdimm_bus *nvdimm_bus,
+ struct nd_region_desc *ndr_desc)
+{
+ return nd_region_create(nvdimm_bus, ndr_desc, &nd_vcd_device_type,
+ __func__);
+}
+EXPORT_SYMBOL_GPL(nvdimm_vcd_region_create);
+
struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc)
{
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 0c3c30c..0a1f949 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -145,6 +145,8 @@ u32 nd_cmd_out_size(struct nvdimm *nvdimm, int cmd,
int nvdimm_bus_check_dimm_count(struct nvdimm_bus *nvdimm_bus, int dimm_count);
struct nd_region *nvdimm_pmem_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc);
+struct nd_region *nvdimm_vcd_region_create(struct nvdimm_bus *nvdimm_bus,
+ struct nd_region_desc *ndr_desc);
struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
struct nd_region_desc *ndr_desc);
struct nd_region *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
--
2.1.4
4 years, 8 months
Re: [PATCH] efi: kernel param for legacy NVDIMM support
by Dan Williams
On Fri, Jun 10, 2016 at 1:48 AM, Yigal Korman <yigal(a)plexistor.com> wrote:
> On Thu, Jun 9, 2016 at 11:27 PM, Dan Williams <dan.j.williams(a)intel.com> wrote:
>>
>> On Thu, Jun 9, 2016 at 10:42 AM, Yigal Korman <yigal(a)plexistor.com> wrote:
>> > The 'efi_legacy_pmem' parameter will convert EFI persistent memory range
>> > (type 14) into E820 legacy NVDIMM (type 12) memory range.
>> >
>> > Background:
>> >
>> > In contrast with the NVDIMM E820 types where we can clearly distinguish
>> > between old NVDIMMs (type-12) and ACPI 6.0 NVDIMMs (type-7), the EFI
>> > memory types for NVDIMMs are the same before ACPI 6.0 and after
>> > (type-14).
>> > This means that old NVDIMMs under EFI aren't supported even though
>> > they work fine if booted with BIOS (E820).
>> >
>> > So allow the user to explicitly request the kernel to identify NVDIMMs
>> > as legacy under EFI.
>> >
>>
>> I'm concerned with the potential for this command line parameter to
>> collide with NFIT defined ranges. At a minimum it should confirm that
>> there is not already an NFIT describing the same address ranges.
>>
>
> That's a valid concern, but not directly related to this patch; the
> same might happen today when a 'memmap=XX!YY' kernel parameter
> collides with an NFIT on the same range.
We should fix that too, i.e. always prefer NFIT over plain / imprecise
type-12 / efi-type-14 range. Actually, in the type-12 vs NFIT case we
should be even more strict...
> Or, albeit a far-fetched scenario, a platform vendor will decide to
> provide an NFIT for a non-ACPI 6.0 and leave the old E820 type-12.
If a platform has e820 type-12 (without memmap=) and an NFIT we should
throw a WARN_TAINT firmware warning, because that platform has taken
the time to update to ACPI 6.0 NFIT, but still uses the out-of-spec
type-12. That memory type is something we only supported as a
stop-gap for platforms that had no other option prior to ACPI 6.
>> However, we have the ability to override / inject ACPI tables and
>> methods from the kernel. Why not use that facility to custom craft an
>> NFIT when the BIOS fails to provide one? That way EFI type-14
>> maintains a constant interpretation as just a reserved memory range
>> with no other side effects.
>
> That might be an interesting way to implement memmap=XX!YY in general
> and can also replace the funny code in arch/x86/kernel/pmem.c.
>
> But it's more complex and probably has its own caveats; this patch is
> simpler and straightforward, providing direct value.
No, it makes a bad situation worse. In the type-12 case it is
obviously out of spec and the only thing that address range could be
is a legacy platform. The efi-type-14 case is accidentally in-spec
and collides / aliases with the ACPI 6 definition.
We should always look for an NFIT in the type-14 case. Injecting a
crafted NFIT for this legacy case removes confusion as there is only
one action the kernel takes and no possibility for collisions. I also
do not want to make it easier for firmware developers to skip the
necessary due diligence to decide whether attributes like numa
topology, posted write queue flush mechanisms, and media health,
etc... need description.
4 years, 8 months