Re: Detecting NUMA per pmem
by Oren Berman
Hi Ross
Thanks for the speedy reply. I am also adding the public list to this
thread as you suggested.
We have tried to dump the SPA table and this is what we get:
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20160108-64
* Copyright (c) 2000 - 2016 Intel Corporation
*
* Disassembly of NFIT, Sun Oct 22 10:46:19 2017
*
* ACPI Data Table [NFIT]
*
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
*/
[000h 0000 4] Signature : "NFIT" [NVDIMM Firmware Interface Table]
[004h 0004 4] Table Length : 00000028
[008h 0008 1] Revision : 01
[009h 0009 1] Checksum : B2
[00Ah 0010 6] Oem ID : "SUPERM"
[010h 0016 8] Oem Table ID : "SMCI--MB"
[018h 0024 4] Oem Revision : 00000001
[01Ch 0028 4] Asl Compiler ID : " "
[020h 0032 4] Asl Compiler Revision : 00000001
[024h 0036 4] Reserved : 00000000
Raw Table Data: Length 40 (0x28)
0000: 4E 46 49 54 28 00 00 00 01 B2 53 55 50 45 52 4D // NFIT(.....SUPERM
0010: 53 4D 43 49 2D 2D 4D 42 01 00 00 00 01 00 00 00 // SMCI--MB........
0020: 01 00 00 00 00 00 00 00
As you can see the memory region info is missing.
This specific check was done on a supermicro server.
We also performed a bios update but the results were the same.
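For reference, the 40-byte length is itself the telltale: an NFIT consists of a 36-byte ACPI header plus 4 reserved bytes, and every structure (SPA range, region mapping, and so on) follows after that. Below is a minimal Python sketch that walks the sub-structures of the raw bytes from the dump above. The function names are made up for illustration, and the SPA field offsets are taken from the layout shown in Ross's disassembly quoted further down, so treat it as a sketch rather than a complete NFIT parser:

```python
import struct

NFIT_HEADER_LEN = 40   # 36-byte ACPI table header + 4 reserved bytes
SPA_TYPE = 0           # "System Physical Address Range" sub-structure
PROXIMITY_VALID = 0x2  # flags bit 1, "Proximity Domain Valid"

def nfit_subtables(table):
    """Return a list of (type, raw_bytes) for each NFIT sub-structure."""
    signature, length = struct.unpack_from("<4sI", table, 0)
    if signature != b"NFIT":
        raise ValueError("not an NFIT table")
    end = min(length, len(table))
    offset, found = NFIT_HEADER_LEN, []
    while offset + 4 <= end:
        sub_type, sub_len = struct.unpack_from("<HH", table, offset)
        if sub_len < 4:  # malformed entry: stop rather than loop forever
            break
        found.append((sub_type, table[offset:offset + sub_len]))
        offset += sub_len
    return found

def spa_proximity(sub):
    """Return (valid, proximity_domain) for one SPA sub-structure."""
    _index, flags, _reserved, domain = struct.unpack_from("<HHII", sub, 4)
    return bool(flags & PROXIMITY_VALID), domain

def checksum_ok(table):
    """An ACPI table is consistent when all its bytes sum to 0 modulo 256."""
    return sum(table) % 256 == 0

# Raw table data copied from the dump above (40 bytes).
RAW = bytes.fromhex(
    "4E 46 49 54 28 00 00 00 01 B2 53 55 50 45 52 4D "
    "53 4D 43 49 2D 2D 4D 42 01 00 00 00 01 00 00 00 "
    "01 00 00 00 00 00 00 00"
)

spa_entries = [s for t, s in nfit_subtables(RAW) if t == SPA_TYPE]
print(len(RAW), checksum_ok(RAW), len(spa_entries))
```

With no sub-structures present there is no Proximity Domain field to read at all, which would be consistent with the node information being unavailable to the kernel.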
As mentioned before, the pmem devices are detected correctly, and we verified
that they correspond to different NUMA nodes using the PCM utility. However,
Linux still reports both pmem devices as being on the same NUMA node, node 0.
If this information is missing, why are the pmem devices and address ranges
still detected correctly?
Is there another table that we need to check?
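For anyone wanting to reproduce the sysfs check, a small Python sketch follows. It assumes that the libnvdimm region devices appear under /sys/bus/nd/devices and expose a numa_node attribute; on platforms where pmem is discovered some other way the path may differ, so both the default path and the helper name are assumptions:

```python
import glob
import os

def region_numa_nodes(root="/sys/bus/nd/devices"):
    """Map each region device name found under root to its numa_node value."""
    nodes = {}
    pattern = os.path.join(root, "region*", "numa_node")
    for path in sorted(glob.glob(pattern)):
        region = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            nodes[region] = int(f.read().strip())
    return nodes

if __name__ == "__main__":
    # Prints nothing when no region devices are present under the root.
    for region, node in region_numa_nodes().items():
        print(f"{region}: numa_node={node}")
```

On a system showing the problem described here, every region would be expected to print the same node.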
I also ran dmidecode and the NVDIMMs are listed (we tested with
Netlist NVDIMMs). I can also see the bank locator showing P0 and P1, which I
think indicates the NUMA node. Here is an example:
Handle 0x002D, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x002A
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: P1-DIMMA3
Bank Locator: P0_Node0_Channel0_Dimm2
Type: DDR4
Type Detail: Synchronous
Speed: 2400 MHz
Manufacturer: Netlist
Serial Number: 66F50006
Asset Tag: P1-DIMMA3_AssetTag (date:16/42)
Part Number: NV3A74SBT20-000
Rank: 1
Configured Clock Speed: 1600 MHz
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x003B, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0038
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: P2-DIMME3
Bank Locator: P1_Node1_Channel0_Dimm2
Type: DDR4
Type Detail: Synchronous
Speed: 2400 MHz
Manufacturer: Netlist
Serial Number: 66B50010
Asset Tag: P2-DIMME3_AssetTag (date:16/42)
Part Number: NV3A74SBT20-000
Rank: 1
Configured Clock Speed: 1600 MHz
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
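If the Bank Locator strings are to be used as a hint, they can be pulled apart mechanically. The Python sketch below assumes the P<socket>_Node<n>_Channel<c>_Dimm<d> pattern seen in the two records above. That format is chosen by the board vendor (SMBIOS only defines the field as a free-form string), so this is a heuristic and the helper name is made up:

```python
import re

# Pattern inferred from the two dmidecode records above; vendor specific.
BANK_LOCATOR = re.compile(r"P(\d+)_Node(\d+)_Channel(\d+)_Dimm(\d+)")

def parse_bank_locator(text):
    """Return socket/node/channel/dimm numbers, or None if the text differs."""
    match = BANK_LOCATOR.fullmatch(text.strip())
    if match is None:
        return None
    socket, node, channel, dimm = (int(g) for g in match.groups())
    return {"socket": socket, "node": node, "channel": channel, "dimm": dimm}

for locator in ("P0_Node0_Channel0_Dimm2", "P1_Node1_Channel0_Dimm2"):
    print(locator, "->", parse_bank_locator(locator))
```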
Did you encounter such a case? We would appreciate any insight you might
have.
BR
Oren Berman
On 20 October 2017 at 19:22, Ross Zwisler <ross.zwisler(a)linux.intel.com>
wrote:
> On Thu, Oct 19, 2017 at 06:12:24PM +0300, Oren Berman wrote:
> > Hi Ross
> > My name is Oren Berman and I am a senior developer at lightbitslabs.
> > We are working with NDIMMs but we encountered a problem that the
> kernel
> > does not seem to detect the numa id per PMEM device.
> > It always reports numa 0 although we have NVDIMM devices on both
> nodes.
> > We checked that it always returns 0 from sysfs and also from
> retrieving
> > the device of pmem in the kernel and calling dev_to_node.
> > The result is always 0 for both pmem0 and pmem1.
> > In order to make sure that indeed both numa sockets are used we ran
> > intel's pcm utlity. We verified that writing to pmem 0 increases
> socket 0
> > utilization and writing to pmem1 increases socket 1 utilization so
> the hw
> > works properly.
> > Only the detection seems to be invalid.
> > Did you encounter such a problem?
> > We are using kernel version 4.9 - are you aware of any fix for this
> issue
> > or workaround that we can use.
> > Are we missing something?
> > Thanks for any help you can give us.
> > BR
> > Oren Berman
>
> Hi Oren,
>
> My first guess is that your platform isn't properly filling out the
> "proximity
> domain" field in the NFIT SPA table.
>
> See section 5.2.25.2 in ACPI 6.2:
> http://uefi.org/sites/default/files/resources/ACPI_6_2.pdf
>
> Here's how to check that:
>
> # cd /tmp
> # cp /sys/firmware/acpi/tables/NFIT .
> # iasl NFIT
>
> Intel ACPI Component Architecture
> ASL+ Optimizing Compiler version 20160831-64
> Copyright (c) 2000 - 2016 Intel Corporation
>
> Binary file appears to be a valid ACPI table, disassembling
> Input file NFIT, Length 0xE0 (224) bytes
> ACPI: NFIT 0x0000000000000000 0000E0 (v01 BOCHS BXPCNFIT 00000001 BXPC
> 00000001)
> Acpi Data Table [NFIT] decoded
> Formatted output: NFIT.dsl - 5191 bytes
>
> This will give you an NFIT.dsl file which you can look at. Here is what my
> SPA table looks like for an emulated QEMU NVDIMM:
>
> [028h 0040 2] Subtable Type : 0000 [System Physical
> Address Range]
> [02Ah 0042 2] Length : 0038
>
> [02Ch 0044 2] Range Index : 0002
> [02Eh 0046 2] Flags (decoded below) : 0003
> Add/Online Operation Only : 1
> Proximity Domain Valid : 1
> [030h 0048 4] Reserved : 00000000
> [034h 0052 4] Proximity Domain : 00000000
> [038h 0056 16] Address Range GUID :
> 66F0D379-B4F3-4074-AC43-0D3318B78CDB
> [048h 0072 8] Address Range Base : 0000000240000000
> [050h 0080 8] Address Range Length : 0000000440000000
> [058h 0088 8] Memory Map Attribute : 0000000000008008
>
> So, the "Proximity Domain" field is 0, and this lets the system know which
> NUMA node to associate with this memory region.
>
> BTW, in the future it's best to CC our public list,
> linux-nvdimm(a)lists.01.org,
> as a) someone else might have the same question and b) someone else might
> know
> the answer.
>
> Thanks,
> - Ross
>
4 years, 1 month
[PATCH v3 0/2] Support ACPI 6.1 update in NFIT Control Region Structure
by Toshi Kani
ACPI 6.1, Table 5-133, updates NVDIMM Control Region Structure as
follows.
- Valid Fields, Manufacturing Location, and Manufacturing Date
are added from reserved range. No change in the structure size.
- IDs (SPD values) are stored as arrays of bytes (i.e. big-endian
format). The spec clarifies that they need to be represented
as arrays of bytes as well.
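To illustrate the byte-order point: when an ID field is treated as an array of bytes, printing it in storage order gives a different result from reading it as a native little-endian integer. A small Python sketch with made-up example bytes (not taken from any real DIMM):

```python
import struct

# Hypothetical 2-byte ID exactly as it would be stored in the table.
raw_id = bytes([0x12, 0x34])

as_byte_array = raw_id.hex()                       # bytes in storage order
as_big_endian = struct.unpack(">H", raw_id)[0]     # same order, as a number
as_little_endian = struct.unpack("<H", raw_id)[0]  # what a naive LE read gives

print(as_byte_array, f"{as_big_endian:04x}", f"{as_little_endian:04x}")
# prints: 1234 1234 3412
```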
Patch 1 changes the NFIT driver to comply with ACPI 6.1.
Patch 2 adds a new sysfs file "id" to show NVDIMM ID defined in ACPI 6.1.
The patch-set applies on linux-pm.git acpica.
link: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
---
v3:
- Need to coordinate with ACPICA update (Bob Moore, Dan Williams)
- Integrate with ACPICA changes in struct acpi_nfit_control_region.
(commit 138a95547ab0)
v2:
- Remove 'mfg_location' and 'mfg_date'. (Dan Williams)
- Rename 'unique_id' to 'id' and make this change as a separate patch.
(Dan Williams)
---
Toshi Kani (2):
1/2 acpi/nfit: Update nfit driver to comply with ACPI 6.1
2/2 acpi/nfit: Add sysfs "id" for NVDIMM ID
---
drivers/acpi/nfit.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
4 years, 1 month
[PATCH 0/18 v6] dax, ext4, xfs: Synchronous page faults
by Jan Kara
Hello,
here is the sixth version of my patches to implement synchronous page faults
for DAX mappings to make flushing of DAX mappings possible from userspace so
that they can be flushed on finer than page granularity and also avoid the
overhead of a syscall.
I think we are ready to get this merged - I've talked to Dan and he said he
could take the patches through his tree. It would just be nice to get final
ack from Christoph for the first patch implementing MAP_VALIDATE and someone
from XFS folks to check patch 17 (make xfs_filemap_pfn_mkwrite use
__xfs_filemap_fault()).
---
We use a new mmap flag MAP_SYNC to indicate that page faults for the mapping
should be synchronous. The guarantee provided by this flag is: While a block
is writeably mapped into page tables of this mapping, it is guaranteed to be
visible in the file at that offset also after a crash.
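From userspace, the intended usage looks roughly like the following Python sketch. The flag names and numeric values are assumptions taken from the mainline Linux uapi headers for x86 (MAP_SHARED_VALIDATE = 0x03, MAP_SYNC = 0x80000) and may not match the names used in this revision of the series; the helper name is made up. On a file that is not on a DAX filesystem the MAP_SYNC request is expected to be rejected, so the sketch falls back to a plain shared mapping:

```python
import mmap
import os

MAP_SHARED_VALIDATE = 0x03  # assumed value, from mainline uapi headers
MAP_SYNC = 0x80000          # assumed value, x86 / asm-generic

def map_file(path, length):
    """Map length bytes of path, preferring MAP_SYNC.

    Returns (mapping, is_sync); is_sync is False when the kernel or
    filesystem rejected the MAP_SYNC request and a plain MAP_SHARED
    mapping was used instead.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            mapping = mmap.mmap(fd, length,
                                flags=MAP_SHARED_VALIDATE | MAP_SYNC,
                                prot=prot)
            return mapping, True
        except (OSError, ValueError):
            mapping = mmap.mmap(fd, length, flags=mmap.MAP_SHARED, prot=prot)
            return mapping, False
    finally:
        os.close(fd)  # the mmap object keeps its own duplicate of the fd
```

When is_sync comes back true, the guarantee described above is meant to apply to the mapping; when it is false, the caller still needs msync() or fsync() to make writes durable.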
How I implement this is that ->iomap_begin() indicates by a flag that inode
block mapping metadata is unstable and may need flushing (use the same test as
whether fdatasync() has metadata to write). If yes, DAX fault handler refrains
from inserting / write-enabling the page table entry and returns special flag
VM_FAULT_NEEDDSYNC together with a PFN to map to the filesystem fault handler.
The handler then calls fdatasync() (vfs_fsync_range()) for the affected range
and after that calls DAX code to update the page table entry appropriately.
I did some basic performance testing on the patches over ramdisk - timed
latency of page faults when faulting 512 pages. I did several tests: with file
preallocated / with file empty, with background file copying going on / without
it, with / without MAP_SYNC (so that we get comparison). The results are
(numbers are in microseconds):
File preallocated, no background load, no MAP_SYNC:
min=9 avg=10 max=46
8 - 15 us: 508
16 - 31 us: 3
32 - 63 us: 1
File preallocated, no background load, MAP_SYNC:
min=9 avg=10 max=47
8 - 15 us: 508
16 - 31 us: 2
32 - 63 us: 2
File empty, no background load, no MAP_SYNC:
min=21 avg=22 max=70
16 - 31 us: 506
32 - 63 us: 5
64 - 127 us: 1
File empty, no background load, MAP_SYNC:
min=40 avg=124 max=242
32 - 63 us: 1
64 - 127 us: 333
128 - 255 us: 178
File empty, background load, no MAP_SYNC:
min=21 avg=23 max=67
16 - 31 us: 507
32 - 63 us: 4
64 - 127 us: 1
File empty, background load, MAP_SYNC:
min=94 avg=112 max=181
64 - 127 us: 489
128 - 255 us: 23
So here we can see the difference between MAP_SYNC vs non MAP_SYNC is about
100-200 us when we need to wait for transaction commit in this setup.
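As a quick check of that estimate, the averages and maxima reported above for the empty-file cases can be subtracted directly. A small Python sketch using only the numbers from the results above (the dictionary layout and function name are just for illustration):

```python
# (min, avg, max) page fault latency in microseconds, file empty.
RESULTS = {
    ("no background load", False): (21, 22, 70),
    ("no background load", True): (40, 124, 242),
    ("background load", False): (21, 23, 67),
    ("background load", True): (94, 112, 181),
}

def sync_overhead(load):
    """Return (avg difference, max difference) of MAP_SYNC vs no MAP_SYNC."""
    base = RESULTS[(load, False)]
    sync = RESULTS[(load, True)]
    return sync[1] - base[1], sync[2] - base[2]

for load in ("no background load", "background load"):
    d_avg, d_max = sync_overhead(load)
    print(f"{load}: avg +{d_avg} us, max +{d_max} us")
```

The differences come out at roughly 90 to 170 us, in line with the figure quoted above.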
Changes since v5:
* really updated the manpage
* improved comment describing IOMAP_F_DIRTY
* fixed XFS handling of VM_FAULT_NEEDDSYNC in xfs_filemap_pfn_mkwrite()
Changes since v4:
* fixed couple of minor things in the manpage
* make legacy mmap flags always supported, remove them from mask declared
to be supported by ext4 and xfs
Changes since v3:
* updated some changelogs
* folded fs support for VM_SYNC flag into patches implementing the
functionality
* removed ->mmap_validate, use ->mmap_supported_flags instead
* added some Reviewed-by tags
* added manpage patch
Changes since v2:
* avoid unnecessary flushing of faulted page (Ross) - I've realized it makes no
sense to remeasure my benchmark results (after actually doing that and seeing
no difference, sigh) since I use ramdisk and not real PMEM HW and so flushes
are ignored.
* handle nojournal mode of ext4
* other smaller cleanups & fixes (Ross)
* factor larger part of finishing of synchronous fault into a helper (Christoph)
* reorder pfnp argument of dax_iomap_fault() (Christoph)
* add XFS support from Christoph
* use proper MAP_SYNC support in mmap(2)
* rebased on top of 4.14-rc4
Changes since v1:
* switched to using mmap flag MAP_SYNC
* cleaned up fault handlers to avoid passing pfn in vmf->orig_pte
* switched to not touching page tables before we are ready to insert final
entry as it was unnecessary and not really simplifying anything
* renamed fault flag to VM_FAULT_NEEDDSYNC
* other smaller fixes found by reviewers
Honza
4 years, 3 months
Re: KVM "fake DAX" flushing interface - discussion
by Xiao Guangrong
On 11/22/2017 02:19 AM, Rik van Riel wrote:
> We can go with the "best" interface for what
> could be a relatively slow flush (fsync on a
> file on ssd/disk on the host), which requires
> that the flushing task wait on completion
> asynchronously.
I'd like to clarify the interface of "wait on completion
asynchronously" and KVM async page fault a bit more.
The current design of async page fault only works on RAM rather
than MMIO, i.e., if the page fault is caused by accessing the
device memory of an emulated device, it needs to go to
userspace (QEMU), which emulates the operation in the vCPU's
thread.
As I mentioned before, the memory region used for the vNVDIMM
flush interface should be MMIO, and considering its support
on other hypervisors, we had better push this async
mechanism into the flush interface design itself rather
than depend on KVM async page fault.
4 years, 6 months
Re: [PATCH] nvdimm: Remove minimum size requirement
by Soccer Liu
Hi:
As part of setting up the environment for running the unit tests, I was able to work through the instructions in https://github.com/pmem/ndctl/tree/0a628fdf4fe58a283b16c1bbaa49bb28b1842b... the way until I hit the following build error (Segmentation fault) when building libnvdimm.o.
Anyone hit this before?
root@ubuntu:/home/soccerl/nvdimm# make M=tools/testing/nvdimm
AR tools/testing/nvdimm/built-in.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/core.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/bus.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dimm_devs.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dimm.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/region_devs.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/region.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/namespace_devs.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/label.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/claim.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/btt_devs.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/pfn_devs.o
CC [M] tools/testing/nvdimm/../../../drivers/nvdimm/dax_devs.o
CC [M] tools/testing/nvdimm/config_check.o
LD [M] tools/testing/nvdimm/libnvdimm.o
Segmentation fault
scripts/Makefile.build:548: recipe for target 'tools/testing/nvdimm/libnvdimm.o' failed
make[1]: *** [tools/testing/nvdimm/libnvdimm.o] Error 139
Makefile:1511: recipe for target '_module_tools/testing/nvdimm' failed
make: *** [_module_tools/testing/nvdimm] Error 2
My devbox has Linux 4.13 on it. I am not sure whether this has anything to do with the fact that I didn't do anything with ndctl/ndctl.spec.in (because I am not sure how to apply those dependencies to my test box).
Any idea?
Thanks
Cheng-mean
On Thursday, August 31, 2017 3:31 PM, Dan Williams <dan.j.williams(a)intel.com> wrote:
On Mon, Aug 7, 2017 at 11:13 AM, Dan Williams <dan.j.williams(a)intel.com> wrote:
> On Mon, Aug 7, 2017 at 11:09 AM, Cheng-mean Liu (SOCCER)
> <soccerl(a)microsoft.com> wrote:
>> Hi Dan:
>>
>> I am wondering if failing on those unittests is still an issue for this minimum size requirement change.
>
> Yes, I just haven't had a chance to circle back and get this fixed up.
>
> You can reproduce by running:
>
> make TESTS=dpa-alloc check
>
> ...in a checkout of the ndctl project: https://github.com/pmem/ndctl
>
> If you attempt that, note the required setup of the nfit_test modules
> documented in README.md in that same repository.
I have not had any time to fix up the unit test for this. Soccer, can
you take a look?
4 years, 7 months
[PATCH v6 0/2] dax, dm: stop requiring dax for device-mapper
by Dan Williams
Changes since v5 [1]:
* Make DAX_DRIVER select DAX to simplify the Kconfig dependencies
(Michael)
* Rebase on 4.15-rc1 and add new IS_ENABLED(CONFIG_DAX_DRIVER) checks in
drivers/md/dm-log-writes.c.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-September/012569.html
---
Hi Mike,
Bart points out that the DAX core is unconditionally enabled if
device-mapper is enabled. Add some config machinery and some
stub/static-inline routines to allow dax infrastructure to be deleted
from device-mapper at compile time.
---
Dan Williams (2):
dax: introduce CONFIG_DAX_DRIVER
dm: allow device-mapper to operate without dax support
arch/powerpc/platforms/Kconfig | 2 -
drivers/dax/Kconfig | 5 ++
drivers/md/Kconfig | 1
drivers/md/dm-linear.c | 6 +++
drivers/md/dm-log-writes.c | 95 +++++++++++++++++++++-------------------
drivers/md/dm-stripe.c | 6 +++
drivers/md/dm.c | 10 +++-
drivers/nvdimm/Kconfig | 2 -
drivers/s390/block/Kconfig | 2 -
include/linux/dax.h | 30 ++++++++++---
10 files changed, 99 insertions(+), 60 deletions(-)
4 years, 7 months
[PATCH 00/15] dax: prep work for fixing dax-dma vs truncate collisions
by Dan Williams
This is hopefully the uncontroversial lead-in set of changes that lay
the groundwork for solving the dax-dma vs truncate problem. The overview
of the changes is:
1/ Disable DAX when we do not have struct page entries backing dax
mappings, or otherwise allow limited DAX support for axonram and
dcssblk. Is anyone actually using the DAX capability of axonram or
dcssblk?
2/ Disable code paths that establish potentially long lived DMA
access to a filesystem-dax memory mapping, i.e. RDMA and V4L2. In the
4.16 timeframe the plan is to introduce a "register memory for DMA
with a lease" mechanism for userspace to establish mappings but also
be responsible for tearing down the mapping when the kernel needs to
invalidate the mapping due to truncate or hole-punch.
3/ Add a wakeup mechanism for waiting for DAX pages to be released
from DMA access.
This overall effort started when Christoph noted during the review of
the MAP_DIRECT proposal:
get_user_pages on DAX doesn't give the same guarantees as on
pagecache or anonymous memory, and that is the problem we need to
fix. In fact I'm pretty sure if we try hard enough (and we might
have to try very hard) we can see the same problem with plain direct
I/O and without any RDMA involved, e.g. do a larger direct I/O write
to memory that is mmap()ed from a DAX file, then truncate the DAX
file and reallocate the blocks, and we might corrupt that new file.
We'll probably need a special setup where there is little other
chance but to reallocate those used blocks.
So what we need to do first is to fix get_user_pages vs unmapping
DAX mmap()ed blocks, be that from a hole punch, truncate, COW
operation, etc.
Included in the changes is a nfit_test mechanism to trivially trigger
this collision by delaying the put_page() that the block layer performs
after performing direct-I/O to a filesystem-DAX page.
Given the ongoing coordination of this set across multiple sub-systems
and the dax core my proposal is to manage this as a branch in the nvdimm
tree with acks from mm, rdma, v4l2, ext4, and xfs.
---
Dan Williams (15):
dax: quiet bdev_dax_supported()
mm, dax: introduce pfn_t_special()
dax: require 'struct page' by default for filesystem dax
brd: remove dax support
dax: stop using VM_MIXEDMAP for dax
dax: stop using VM_HUGEPAGE for dax
dax: stop requiring a live device for dax_flush()
dax: store pfns in the radix
tools/testing/nvdimm: add 'bio_delay' mechanism
IB/core: disable memory registration of fileystem-dax vmas
[media] v4l2: disable filesystem-dax mapping support
mm, dax: enable filesystems to trigger page-idle callbacks
mm, devmap: introduce CONFIG_DEVMAP_MANAGED_PAGES
dax: associate mappings with inodes, and warn if dma collides with truncate
wait_bit: introduce {wait_on,wake_up}_devmap_idle
arch/powerpc/platforms/Kconfig | 1
arch/powerpc/sysdev/axonram.c | 3 -
drivers/block/Kconfig | 12 ---
drivers/block/brd.c | 65 --------------
drivers/dax/device.c | 1
drivers/dax/super.c | 113 +++++++++++++++++++++----
drivers/infiniband/core/umem.c | 49 ++++++++---
drivers/media/v4l2-core/videobuf-dma-sg.c | 39 ++++++++-
drivers/nvdimm/pmem.c | 13 +++
drivers/s390/block/Kconfig | 1
drivers/s390/block/dcssblk.c | 4 +
fs/Kconfig | 8 ++
fs/dax.c | 131 +++++++++++++++++++----------
fs/ext2/file.c | 1
fs/ext2/super.c | 6 +
fs/ext4/file.c | 1
fs/ext4/super.c | 6 +
fs/xfs/xfs_file.c | 2
fs/xfs/xfs_super.c | 20 ++--
include/linux/dax.h | 17 ++--
include/linux/memremap.h | 24 +++++
include/linux/mm.h | 47 ++++++----
include/linux/mm_types.h | 20 +++-
include/linux/pfn_t.h | 13 +++
include/linux/vma.h | 33 +++++++
include/linux/wait_bit.h | 10 ++
kernel/memremap.c | 36 ++++++--
kernel/sched/wait_bit.c | 64 ++++++++++++--
mm/Kconfig | 5 +
mm/hmm.c | 13 ---
mm/huge_memory.c | 8 +-
mm/ksm.c | 3 +
mm/madvise.c | 2
mm/memory.c | 22 ++++-
mm/migrate.c | 3 -
mm/mlock.c | 5 +
mm/mmap.c | 8 +-
mm/swap.c | 3 -
tools/testing/nvdimm/Kbuild | 1
tools/testing/nvdimm/test/iomap.c | 62 ++++++++++++++
tools/testing/nvdimm/test/nfit.c | 34 ++++++++
tools/testing/nvdimm/test/nfit_test.h | 1
42 files changed, 650 insertions(+), 260 deletions(-)
create mode 100644 include/linux/vma.h
4 years, 7 months
[RFC PATCH v2 0/7] ndctl: nvdimmd: notify/monitor the features of over threshold event
by QI Fuli
Hi, here is the second version of my nvdimm daemon. It would be appreciated
if you could check it.
Change log since v1:
- Adding a config file(/etc/nvdimmd/nvdimmd.conf)
- Using struct log_ctx instead of syslog()
- Using log_syslog() to save the notify messages to syslog
- Using log_file() to save the notify messages to special file
- Adding LOG_NOTICE level into log_priority
- Using automake instead of Makefile
- Adding a new util file(nvdimmd/util.c) including helper functions needed
for nvdimm daemon.
- Adding nvdimmd_test program
---
This is a patch set of nvdimmd, a tiny daemon to monitor the features of over
threshold events. When an over threshold event fires, nvdimmd will output the
notification, including dimm health status, to syslog or a special file
according to the user's configuration. Users can choose the output format to
be structured json or text.
Here are output samples.
- json format:
2017/11/10 11:15:03 [28065] log_notify: nvdimm dimm over threshold notify
{
"dev":"nmem1",
"id":"cdab-0a-07e0-feffffff",
"handle":1,
"phys_id":1,
"health":{
"health_state":"non-critical",
"temperature_celsius":23,
"spares_percentage":75,
"alarm_temperature":true,
"alarm_spares":true,
"temperature_threshold":40,
"spares_threshold":5,
"life_used_percentage":5,
"shutdown_state":"clean"
}
}
- text format:
2017/11/10 16:21:53 [12479] log_notify: nvdimm dimm over threshold notify
dev: nmem1
health_state: non-critical
spares_percentage: 75
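A consumer of the json format could split such a record into its log prefix and payload as sketched below in Python. The record layout (one prefix line followed by a single JSON object) is inferred from the sample above, and the helper name is made up for illustration:

```python
import json

def parse_notification(record):
    """Split a record into (prefix line, decoded JSON payload)."""
    start = record.index("{")
    prefix = record[:start].strip()
    payload = json.loads(record[start:])
    return prefix, payload

# The json-format sample from above, copied verbatim.
SAMPLE = """2017/11/10 11:15:03 [28065] log_notify: nvdimm dimm over threshold notify
{
"dev":"nmem1",
"id":"cdab-0a-07e0-feffffff",
"handle":1,
"phys_id":1,
"health":{
"health_state":"non-critical",
"temperature_celsius":23,
"spares_percentage":75,
"alarm_temperature":true,
"alarm_spares":true,
"temperature_threshold":40,
"spares_threshold":5,
"life_used_percentage":5,
"shutdown_state":"clean"
}
}"""

prefix, event = parse_notification(SAMPLE)
health = event["health"]
alarms = [k for k in ("alarm_temperature", "alarm_spares") if health.get(k)]
print(event["dev"], health["health_state"], alarms)
```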
TODO list:
- The dimms to monitor should be filtered by namespace and region
- Add more information into the notify message
- Make nvdimmd_test an ndctl command or an option of ndctl inject-error
QI Fuli (7):
ndctl: nvdimmd: add LOG_NOTICE level into log_priority
ndctl: nvdimmd: add nvdimmd necessary util functions
ndctl: nvdimmd: add nvdimmd necessary functions
ndctl: nvdimmd: add body file of nvdimm daemon
ndctl: nvdimmd: add nvdimmd config file
ndctl: nvdimmd: add the unit file of systemd for nvdimmd service
ndctl: nvdimmd: add a temporary test for nvdimm daemon
Makefile.am | 2 +-
configure.ac | 1 +
nvdimmd/Makefile.am | 47 ++++++++
nvdimmd/libnvdimmd.c | 315 ++++++++++++++++++++++++++++++++++++++++++++++++
nvdimmd/libnvdimmd.h | 53 ++++++++
nvdimmd/nvdimmd.c | 112 +++++++++++++++++
nvdimmd/nvdimmd.conf | 25 ++++
nvdimmd/nvdimmd.service | 7 ++
nvdimmd/nvdimmd_test.c | 142 ++++++++++++++++++++++
nvdimmd/util.c | 80 ++++++++++++
nvdimmd/util.h | 33 +++++
util/log.c | 2 +
util/log.h | 3 +
13 files changed, 821 insertions(+), 1 deletion(-)
create mode 100644 nvdimmd/Makefile.am
create mode 100644 nvdimmd/libnvdimmd.c
create mode 100644 nvdimmd/libnvdimmd.h
create mode 100644 nvdimmd/nvdimmd.c
create mode 100644 nvdimmd/nvdimmd.conf
create mode 100644 nvdimmd/nvdimmd.service
create mode 100644 nvdimmd/nvdimmd_test.c
create mode 100644 nvdimmd/util.c
create mode 100644 nvdimmd/util.h
--
2.9.5
4 years, 8 months
[RFC PATCH 0/4] Teach EDAC driver about NVDIMMs
by Tony Luck
A Skylake server may have some DIMM slots filled with NVDIMMs
instead of normal DDR4 DIMMs. These are enumerated differently
by the memory controller.
Sadly there isn't an easy way to just peek at some memory controller
register to find the size of these DIMMs, so we have to rely on the
NFIT and SMBIOS tables to get that information.
This series only tackles the topology function of the EDAC
driver. A later series of patches will fix the address translation
parts so that errors in NVDIMMs will be reported correctly.
It's marked "RFC" because it depends on the new ACPICA version 20171110
which has only just made it to Rafael's tree.
Some of you may only care about some of the parts that touch code you
maintain, but I copied you on all four because you might like to see
the bigger picture.
Tony Luck (4):
acpi, nfit: Add function to look up nvdimm device and provide SMBIOS
handle
firmware: dmi: Add function to look up a handle and return DIMM size
edac: Add new memory type for non-volatile DIMMs
EDAC, skx_edac: Detect non-volatile DIMMs
drivers/acpi/nfit/core.c | 27 +++++++++++++++++++++
drivers/edac/Kconfig | 2 ++
drivers/edac/edac_mc.c | 1 +
drivers/edac/edac_mc_sysfs.c | 3 ++-
drivers/edac/skx_edac.c | 56 ++++++++++++++++++++++++++++++++++++++++----
drivers/firmware/dmi_scan.c | 29 +++++++++++++++++++++++
include/acpi/nfit.h | 19 +++++++++++++++
include/linux/dmi.h | 2 ++
include/linux/edac.h | 3 +++
9 files changed, 136 insertions(+), 6 deletions(-)
create mode 100644 include/acpi/nfit.h
base-commit: 3fc70f8be59950ee2deecefdddb68be19b8cddd1
--
2.14.1
4 years, 8 months
[fstests PATCH v4 0/4] add test for DAX MAP_SYNC support
by Ross Zwisler
The purpose of this series is to exercise the new MAP_SYNC mmap()
functionality [1]. It adds a test which uses dm-log-writes to try and
replay filesystem metadata operations for a file that is being written via
mmap().
If MAP_SYNC is active the dm-log-writes replay will recreate the file's
block allocations and you'll end up with a test file which is a known
size.
If MAP_SYNC is not active the metadata writes will most likely be lost and
the replay will either fail to create the test file at all or the file may
be smaller. In all of my testing the file simply doesn't exist on the
replay if MAP_SYNC is omitted.
This test relies on a kernel with both the MAP_SYNC mmap() functionality
and the DAX enabling in dm-log-writes. These will both appear in kernel
v4.15-rc1. For ease of testing I've posted a kernel that is v4.14 plus
just those two patch series [2].
This test also relies on xfsprogs having support for MAP_SYNC and for
dm-log-writes. I've posted a tree adding that support [3]. This xfsprogs
series is still under review so if the xfs_io interfaces change this
test will of course need to be updated as well.
Lastly, I've also posted a working version of this series [4].
Changes since v3:
- Enhanced xfs_io with MAP_SYNC and dm-log-writes functionality instead of
creating a one-off test program. (Dave Chinner)
- Improved dm target version checking. (Amir)
- Fixed dm-log-writes replay issue, some general cleanup, broke changes
out into a series.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-November/013164.html
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/zwisler/linux.git/log/?h=...
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/zwisler/xfsprogs-dev.git/...
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/zwisler/xfstests-dev.git/...
Ross Zwisler (4):
common/rc: add _scratch_has_mount_option()
dm-log-writes: only replay log to marks that exist
dm-log-writes: allow DAX to be used when possible
generic: add test for DAX MAP_SYNC support
common/dmlogwrites | 9 +++--
common/rc | 33 ++++++++++++++++--
doc/requirement-checking.txt | 5 +--
tests/generic/468 | 83 ++++++++++++++++++++++++++++++++++++++++++++
tests/generic/468.out | 3 ++
tests/generic/group | 1 +
6 files changed, 128 insertions(+), 6 deletions(-)
create mode 100755 tests/generic/468
create mode 100644 tests/generic/468.out
--
2.9.5
4 years, 8 months