Patch "acpi/nfit, x86/mce: Handle only uncorrectable machine checks" has been added to the 4.18-stable tree
by gregkh@linuxfoundation.org
This is a note to let you know that I've just added the patch titled
acpi/nfit, x86/mce: Handle only uncorrectable machine checks
to the 4.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=s...
The filename of the patch is:
acpi-nfit-x86-mce-handle-only-uncorrectable-machine-checks.patch
and it can be found in the queue-4.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 5d96c9342c23ee1d084802dcf064caa67ecaa45b Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Thu, 25 Oct 2018 18:37:28 -0600
Subject: acpi/nfit, x86/mce: Handle only uncorrectable machine checks
From: Vishal Verma <vishal.l.verma(a)intel.com>
commit 5d96c9342c23ee1d084802dcf064caa67ecaa45b upstream.
The MCE handler for nfit devices is called for memory errors on a
Non-Volatile DIMM and adds the error location to a 'badblocks' list.
This list is used by the various NVDIMM drivers to avoid consuming known
poison locations during IO.
The MCE handler gets called for both corrected and uncorrectable errors.
Until now, both kinds of errors have been added to the badblocks list.
However, corrected memory errors indicate that the problem has already
been fixed by hardware, and the resulting interrupt is merely a
notification to Linux.
As far as future accesses to that location are concerned, it is
perfectly fine to use, and thus doesn't need to be included in the above
badblocks list.
Add a check in the nfit MCE handler to filter out corrected mce events,
and only process uncorrectable errors.
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: Omar Avelar <omar.avelar(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
CC: Arnd Bergmann <arnd(a)arndb.de>
CC: Dan Williams <dan.j.williams(a)intel.com>
CC: Dave Jiang <dave.jiang(a)intel.com>
CC: elliott(a)hpe.com
CC: "H. Peter Anvin" <hpa(a)zytor.com>
CC: Ingo Molnar <mingo(a)redhat.com>
CC: Len Brown <lenb(a)kernel.org>
CC: linux-acpi(a)vger.kernel.org
CC: linux-edac <linux-edac(a)vger.kernel.org>
CC: linux-nvdimm(a)lists.01.org
CC: Qiuxu Zhuo <qiuxu.zhuo(a)intel.com>
CC: "Rafael J. Wysocki" <rjw(a)rjwysocki.net>
CC: Ross Zwisler <zwisler(a)kernel.org>
CC: stable <stable(a)vger.kernel.org>
CC: Thomas Gleixner <tglx(a)linutronix.de>
CC: Tony Luck <tony.luck(a)intel.com>
CC: x86-ml <x86(a)kernel.org>
CC: Yazen Ghannam <yazen.ghannam(a)amd.com>
Link: http://lkml.kernel.org/r/20181026003729.8420-1-vishal.l.verma@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
drivers/acpi/nfit/mce.c | 4 ++--
3 files changed, 5 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ static inline int umc_normaddr_to_sysadd
int mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
+bool mce_is_correctable(struct mce *m);
DECLARE_PER_CPU(unsigned, mce_exception_count);
DECLARE_PER_CPU(unsigned, mce_poll_count);
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -538,7 +538,7 @@ bool mce_is_memory_error(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_is_memory_error);
-static bool mce_is_correctable(struct mce *m)
+bool mce_is_correctable(struct mce *m)
{
if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
return false;
@@ -548,6 +548,7 @@ static bool mce_is_correctable(struct mc
return true;
}
+EXPORT_SYMBOL_GPL(mce_is_correctable);
static bool cec_add_mce(struct mce *m)
{
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -25,8 +25,8 @@ static int nfit_handle_mce(struct notifi
struct acpi_nfit_desc *acpi_desc;
struct nfit_spa *nfit_spa;
- /* We only care about memory errors */
- if (!mce_is_memory_error(mce))
+ /* We only care about uncorrectable memory errors */
+ if (!mce_is_memory_error(mce) || mce_is_correctable(mce))
return NOTIFY_DONE;
/*
Patches currently in stable-queue which might be from vishal.l.verma(a)intel.com are
queue-4.18/acpi-nfit-x86-mce-validate-a-mce-s-address-before-using-it.patch
queue-4.18/acpi-nfit-x86-mce-handle-only-uncorrectable-machine-checks.patch
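
The filter this patch adds can be looked at in isolation with a minimal
userspace sketch. It is not the kernel code: struct mce_sketch, the vendor
enum, the memory_error flag (standing in for mce_is_memory_error()) and
should_record_badblock() are hypothetical names, the bit positions are
assumed to match arch/x86/include/asm/mce.h, and the MCI_STATUS_UC test is
assumed from the upstream source because the diff context above elides it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the status bits in arch/x86/include/asm/mce.h. */
#define MCI_STATUS_UC       (1ULL << 61) /* uncorrected error */
#define MCI_STATUS_DEFERRED (1ULL << 44) /* AMD: deferred error */

/* Hypothetical vendor ids, used only by this sketch. */
enum vendor_sketch { VENDOR_INTEL_SKETCH, VENDOR_AMD_SKETCH };

/* Hypothetical, trimmed-down stand-in for struct mce. */
struct mce_sketch {
	enum vendor_sketch vendor;
	uint64_t status;
	bool memory_error; /* stands in for mce_is_memory_error() */
};

/* Deferred errors on AMD and anything flagged UC are not correctable. */
static bool is_correctable_sketch(const struct mce_sketch *m)
{
	assert(m != NULL);
	if (m->vendor == VENDOR_AMD_SKETCH && (m->status & MCI_STATUS_DEFERRED))
		return false;
	if (m->status & MCI_STATUS_UC)
		return false;
	return true;
}

/* Only uncorrectable memory errors should reach the badblocks list. */
static bool should_record_badblock(const struct mce_sketch *m)
{
	assert(m != NULL);
	return m->memory_error && !is_correctable_sketch(m);
}
```

With this shape, a corrected memory error returns early (NOTIFY_DONE in the
real handler) and never adds an entry to the badblocks list.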
3 years, 9 months
[GIT PULL] libnvdimm fixes for 4.20-rc3
by Williams, Dan J
Hi Linus, please pull from:
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-fixes-4.20-rc3
...to receive a small batch of fixes for v4.20-rc3.
These have soaked in -next for a few releases. The overflow
continuation fix addresses something that has been broken for several
releases. Arguably it could wait even longer, but it's a one line fix
and this finishes the last of the known address range scrub bug
reports. The revert addresses a lockdep regression. The unit tests are
not critical to fix, but no reason to hold this fix back.
---
The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a:
Linux 4.20-rc1 (2018-11-04 15:37:52 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-fixes-4.20-rc3
for you to fetch changes up to 2121db09630113e67b51ae78c18115f1858f648a:
Revert "acpi, nfit: Further restrict userspace ARS start requests" (2018-11-10 09:54:28 -0800)
----------------------------------------------------------------
libnvdimm 4.20-rc3
- Address Range Scrub overflow continuation handling has been broken
since it was initially merged. It was only recently that error injection
and platform-BIOS support enabled this corner case to be exercised.
- The recent attempt to provide more isolation for the kernel Address
Range Scrub state machine from userspace initiated sessions triggers a
lockdep report. Revert and try again at the next merge window.
- Fix a kasan reported buffer overflow in libnvdimm unit test
infrastructure (nfit_test)
----------------------------------------------------------------
Dan Williams (2):
acpi, nfit: Fix ARS overflow continuation
Revert "acpi, nfit: Further restrict userspace ARS start requests"
Masayoshi Mizuma (1):
tools/testing/nvdimm: Fix the array size for dimm devices.
drivers/acpi/nfit/core.c | 19 +++++--------------
tools/testing/nvdimm/test/nfit.c | 8 ++++----
2 files changed, 9 insertions(+), 18 deletions(-)
3 years, 9 months
Re: [Ksummit-discuss] [RFC PATCH 3/3] libnvdimm, MAINTAINERS: Subsystem Profile
by Mauro Carvalho Chehab
On Sat, 17 Nov 2018 11:38:50 +1100
NeilBrown <neilb(a)suse.com> wrote:
> On Fri, Nov 16 2018, Dan Williams wrote:
>
> > On Fri, Nov 16, 2018 at 12:37 PM Rodrigo Vivi <rodrigo.vivi(a)gmail.com> wrote:
> >>
> >> On Thu, Nov 15, 2018 at 8:38 AM Leon Romanovsky <leon(a)kernel.org> wrote:
> >> >
> >> > On Thu, Nov 15, 2018 at 06:10:36AM -0800, Mauro Carvalho Chehab wrote:
> >> > > On Thu, 15 Nov 2018 09:03:11 +0100
> >> > > Geert Uytterhoeven <geert(a)linux-m68k.org> wrote:
> >> > >
> >> > > > Hi Dan,
> >> > > >
> >> > > > On Thu, Nov 15, 2018 at 6:06 AM Dan Williams <dan.j.williams(a)intel.com> wrote:
> >> > > > > Document the basic policies of the libnvdimm subsystem and provide a
> >> > > > > first example of a Subsystem Profile for others to duplicate and edit.
> >> > > > >
> >> > > > > Cc: Ross Zwisler <zwisler(a)kernel.org>
> >> > > > > Cc: Vishal Verma <vishal.l.verma(a)intel.com>
> >> > > > > Cc: Dave Jiang <dave.jiang(a)intel.com>
> >> > > > > Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
> >> > > >
> >> > > > Thanks for your patch!
> >> > > >
> >> > > > > --- /dev/null
> >> > > > > +++ b/Documentation/nvdimm/subsystem-profile.rst
> >> > > >
> >> > > > > +Trusted Reviewers
> >> > > > > +-----------------
> >> > > > > +Johannes Thumshirn
> >> > > > > +Toshi Kani
> >> > > > > +Jeff Moyer
> >> > > > > +Robert Elliott
> >> > > >
> >> > > > Don't you want to add email addresses?
> >> > > > Only the first one is listed in MAINTAINERS.
> >> > >
> >> > > IMO, it makes sense to have their e-mails here, in a way that it could
> >> > > easily be parsed by get_maintainers.pl.
> >> >
> >> > I personally think that a list of "trusted reviewers" does more harm than
> >> > good. It creates unneeded negative feelings in those who wanted to be on
> >> > this list but, for whatever reason, are not. Those reviewers will feel
> >> > "untrusted".
> >>
> >> I'd like to +1 on this concern here. Besides leaving all the other
> >> people demotivated.
> >
> > Yes, that's a valid concern, I overlooked that unfortunate interpretation.
> >
> >>
> >> A small group of trusted reviewers doesn't scale. People will get overloaded.
> >> Or you won't be able to enforce that all patches need to get Reviews.
> >>
> >> Reviews should be coming from everywhere, with committers and maintainers
> >> deciding on what to trust or re-review.
> >>
> >> Also, the list is hard to maintain and keep updated.
> >
> > I understand the concern, and as I saw feedback come in I realized
> > there were more people that I would add to that reviewer list for
> > libnvdimm.
> >
> > Stepping back the end goal is to have an initial list of recommended
> > people to follow up with directly to seek a second opinion, or help in
> > cases where a contributor otherwise needs some direction / engagement
> > that they are not readily receiving from the maintainer. Typically
> > someone just lurks on the mailing list for a few weeks to get a feel
> > for who the usual suspects are in the subsystem, but for a new
> > contributor identifying those individuals may be difficult.
> >
> > One of the contributing factors to a lack of response to a patchset is
> > that it is sent with the implicit expectation that the maintainer
> > will get to it eventually, and typically other people feel content to sit
> > back and watch. If instead a contributor sent a direct mail to a
> > "trusted reviewer" saying, "Hey, Alice, Bob seems busy can you help me
> > out?" that seems more likely to rope in additional review help.
>
> In here is, I think, a real issue that listing "trusted reviewers" might
> help address.
> As you say, people don't feel the need to comment - particularly if they
> don't see anything wrong (often best to insert a bug to encourage
> responses!).
> Maybe if we list people, it will make them feel that their opinion is
> valuable (trusted!) and that will encourage them to Ack or Review a
> patch.
> I have found that being given a title of responsibility can change my
> thinking from "someone should" to "I should".
I heard similar feedback from some subsystems: giving someone
formal credit may actually help to get more reviews.
However, as Leon pointed out later in this thread:
On Sun, 18 Nov 2018 09:12:54 +0200
Leon Romanovsky <leon(a)kernel.org> wrote:
> On Fri, Nov 16, 2018 at 03:39:47AM -0800, Mauro Carvalho Chehab wrote:
> > On Thu, 15 Nov 2018 11:43:51 -0800
> > Joe Perches <joe(a)perches.com> wrote:
> >
> > > On Thu, 2018-11-15 at 19:40 +0000, Luck, Tony wrote:
> > > > > I would recommend removing this section altogether.
> > > > > New maintainers won't come out of the blue, but will come
> > > > > from the existing community, and such individuals will surely see
> > > > > and judge for themselves whom to trust and whom not.
> > > >
> > > > Perhaps this is more of a hint to contributors than to maintainers
> > > > (see earlier discussion on who is the target audience for these documents).
> > > >
> > > > It would help contributors know some names of useful reviewers (and
> > > > thus this list should be picked up by scripts/get_maintainer.pl to help
> > > > the user compose Cc: lists for e-mail patches).
> > >
> > > Trusted reviewers should be specifically listed
> > > in the MAINTAINERS file with an "R:" entry.
> > >
> > > get_maintainers should not look anywhere else.
> >
> > I know that making get_maintainers look elsewhere can make it more
> > complex and slower, but IMHO, by having a per-subsystem profile, this is
> > unavoidable.
> >
> > The thing is that touching a single MAINTAINERS file every time a new
> > reviewer comes is painful. Also, the MAINTAINERS file format doesn't allow
> > adding free text explaining the criteria for someone to become a
> > reviewer.
>
> You are pointing to the actual problem: someone needs to maintain such
> lists. Removing people from that list won't be an easy task either.
While adding new reviewers is easy (someone just needs to send a patch,
with the Acked-by from the reviewer to be included), getting someone
removed from it can be very painful (and will likely require some written
policy about how to do that).
Thanks,
Mauro
3 years, 9 months
Re: [Ksummit-discuss] [RFC PATCH 3/3] libnvdimm, MAINTAINERS: Subsystem Profile
by Mauro Carvalho Chehab
On Thu, 15 Nov 2018 21:35:20 +0200
Leon Romanovsky <leon(a)kernel.org> wrote:
> On Thu, Nov 15, 2018 at 11:09:34AM -0800, Mauro Carvalho Chehab wrote:
> > On Thu, 15 Nov 2018 18:20:08 +0200
> > Leon Romanovsky <leon(a)kernel.org> wrote:
> >
> > > On Thu, Nov 15, 2018 at 06:10:36AM -0800, Mauro Carvalho Chehab wrote:
> > > > On Thu, 15 Nov 2018 09:03:11 +0100
> > > > Geert Uytterhoeven <geert(a)linux-m68k.org> wrote:
> > > >
> > > > > Hi Dan,
> > > > >
> > > > > On Thu, Nov 15, 2018 at 6:06 AM Dan Williams <dan.j.williams(a)intel.com> wrote:
> > > > > > Document the basic policies of the libnvdimm subsystem and provide a
> > > > > > first example of a Subsystem Profile for others to duplicate and edit.
> > > > > >
> > > > > > Cc: Ross Zwisler <zwisler(a)kernel.org>
> > > > > > Cc: Vishal Verma <vishal.l.verma(a)intel.com>
> > > > > > Cc: Dave Jiang <dave.jiang(a)intel.com>
> > > > > > Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
> > > > >
> > > > > Thanks for your patch!
> > > > >
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/nvdimm/subsystem-profile.rst
> > > > >
> > > > > > +Trusted Reviewers
> > > > > > +-----------------
> > > > > > +Johannes Thumshirn
> > > > > > +Toshi Kani
> > > > > > +Jeff Moyer
> > > > > > +Robert Elliott
> > > > >
> > > > > Don't you want to add email addresses?
> > > > > Only the first one is listed in MAINTAINERS.
> > > >
> > > > IMO, it makes sense to have their e-mails here, in a way that it could
> > > > easily be parsed by get_maintainers.pl.
> > >
> > > I personally think that a list of "trusted reviewers" does more harm than
> > > good. It creates unneeded negative feelings in those who wanted to be on
> > > this list but, for whatever reason, are not. Those reviewers will feel
> > > "untrusted".
> >
> > Yeah, perhaps something like "most active reviewers" would sound
> > better.
>
> I would recommend removing this section altogether.
> New maintainers won't come out of the blue, but will come
> from the existing community, and such individuals will surely see
> and judge for themselves whom to trust and whom not.
I see your point, but, on the other hand, having a list of the people
who are actively doing reviews helps newcomers.
I would keep it, but perhaps it makes sense to add a note on the
criteria for being included in the "active reviewers" list
(the criteria probably belong in the subsystem profile), e.g.
something like:
"Active reviewers are developers who contribute more than 25
patches per year and do more than 50 reviews per year
on patches written for drivers to which they are not usual contributors"
Cheers,
Mauro
3 years, 9 months
RE: [Ksummit-discuss] [RFC PATCH 3/3] libnvdimm, MAINTAINERS: Subsystem Profile
by Luck, Tony
> I would recommend removing this section altogether.
> New maintainers won't come out of the blue, but will come
> from the existing community, and such individuals will surely see
> and judge for themselves whom to trust and whom not.
Perhaps this is more of a hint to contributors than to maintainers
(see earlier discussion on who is the target audience for these documents).
It would help contributors know some names of useful reviewers (and
thus this list should be picked up by scripts/get_maintainer.pl to help
the user compose Cc: lists for e-mail patches).
-Tony
3 years, 9 months
Re: [Ksummit-discuss] [RFC PATCH 3/3] libnvdimm, MAINTAINERS: Subsystem Profile
by Mauro Carvalho Chehab
On Thu, 15 Nov 2018 18:20:08 +0200
Leon Romanovsky <leon(a)kernel.org> wrote:
> On Thu, Nov 15, 2018 at 06:10:36AM -0800, Mauro Carvalho Chehab wrote:
> > On Thu, 15 Nov 2018 09:03:11 +0100
> > Geert Uytterhoeven <geert(a)linux-m68k.org> wrote:
> >
> > > Hi Dan,
> > >
> > > On Thu, Nov 15, 2018 at 6:06 AM Dan Williams <dan.j.williams(a)intel.com> wrote:
> > > > Document the basic policies of the libnvdimm subsystem and provide a
> > > > first example of a Subsystem Profile for others to duplicate and edit.
> > > >
> > > > Cc: Ross Zwisler <zwisler(a)kernel.org>
> > > > Cc: Vishal Verma <vishal.l.verma(a)intel.com>
> > > > Cc: Dave Jiang <dave.jiang(a)intel.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
> > >
> > > Thanks for your patch!
> > >
> > > > --- /dev/null
> > > > +++ b/Documentation/nvdimm/subsystem-profile.rst
> > >
> > > > +Trusted Reviewers
> > > > +-----------------
> > > > +Johannes Thumshirn
> > > > +Toshi Kani
> > > > +Jeff Moyer
> > > > +Robert Elliott
> > >
> > > Don't you want to add email addresses?
> > > Only the first one is listed in MAINTAINERS.
> >
> > IMO, it makes sense to have their e-mails here, in a way that it could
> > easily be parsed by get_maintainers.pl.
>
> I personally think that a list of "trusted reviewers" does more harm than
> good. It creates unneeded negative feelings in those who wanted to be on
> this list but, for whatever reason, are not. Those reviewers will feel
> "untrusted".
Yeah, perhaps something like "most active reviewers" would sound
better.
Cheers,
Mauro
3 years, 9 months
[PATCH] acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
by Dan Williams
Persistent memory, as described by the ACPI NFIT (NVDIMM Firmware
Interface Table), is the first known instance of a memory range
described by a unique "target" proximity domain. The "initiator" and
"target" proximity domains are the approach that the ACPI HMAT
(Heterogeneous Memory Attributes Table) uses to describe the unique
performance properties of a memory range relative to a given initiator
(e.g. CPU or DMA device).
Currently the numa-node for a /dev/pmemX block-device or /dev/daxX.Y
char-device follows the traditional notion of 'numa-node' where the
attribute conveys the closest online numa-node. That numa-node attribute
is useful for cpu-binding and memory-binding processes *near* the
device. However, when the memory range backing a 'pmem', or 'dax' device
is onlined (memory hot-add) the memory-only-numa-node representing that
address needs to be differentiated from the set of online nodes. In
other words, the numa-node association of the device depends on whether
you can bind processes *near* the cpu-numa-node in the offline
device-case, or bind processes *on* the memory-range directly after the
backing address range is onlined.
Allow for the case that platform firmware describes persistent memory
with a unique proximity domain, i.e. when it is distinct from the
proximity of DRAM and CPUs that are on the same socket. Plumb the Linux
numa-node translation of that proximity through the libnvdimm region
device to namespaces that are in device-dax mode. With this in place the
proposed kmem driver [1] can optionally discover a unique numa-node
number for the address range as it transitions the memory from an
offline state managed by a device-driver to an online memory range
managed by the core-mm.
[1]: https://lkml.org/lkml/2018/10/23/9
Reported-by: Fan Du <fan.du(a)intel.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Jérôme Glisse <jglisse(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
arch/powerpc/platforms/pseries/papr_scm.c | 1 +
drivers/acpi/nfit/core.c | 8 ++++++--
drivers/acpi/numa.c | 1 +
drivers/dax/bus.c | 4 +++-
drivers/dax/bus.h | 3 ++-
drivers/dax/dax-private.h | 4 ++++
drivers/dax/pmem/core.c | 4 +++-
drivers/nvdimm/e820.c | 1 +
drivers/nvdimm/nd.h | 2 +-
drivers/nvdimm/of_pmem.c | 1 +
drivers/nvdimm/region_devs.c | 1 +
include/linux/libnvdimm.h | 1 +
12 files changed, 25 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index ee9372b65ca5..6a0a35b872d1 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -233,6 +233,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
memset(&ndr_desc, 0, sizeof(ndr_desc));
ndr_desc.attr_groups = region_attr_groups;
ndr_desc.numa_node = dev_to_node(&p->pdev->dev);
+ ndr_desc.target_node = ndr_desc.numa_node;
ndr_desc.res = &p->res;
ndr_desc.of_node = p->dn;
ndr_desc.provider_data = p;
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index f8c638f3c946..2225e3de33ac 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2825,11 +2825,15 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
ndr_desc->res = &res;
ndr_desc->provider_data = nfit_spa;
ndr_desc->attr_groups = acpi_nfit_region_attribute_groups;
- if (spa->flags & ACPI_NFIT_PROXIMITY_VALID)
+ if (spa->flags & ACPI_NFIT_PROXIMITY_VALID) {
ndr_desc->numa_node = acpi_map_pxm_to_online_node(
spa->proximity_domain);
- else
+ ndr_desc->target_node = acpi_map_pxm_to_node(
+ spa->proximity_domain);
+ } else {
ndr_desc->numa_node = NUMA_NO_NODE;
+ ndr_desc->target_node = NUMA_NO_NODE;
+ }
/*
* Persistence domain bits are hierarchical, if
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 274699463b4f..b9d86babb13a 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -84,6 +84,7 @@ int acpi_map_pxm_to_node(int pxm)
return node;
}
+EXPORT_SYMBOL(acpi_map_pxm_to_node);
/**
* acpi_map_pxm_to_online_node - Map proximity ID to online node
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 568168500217..c620ad52d7e5 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -214,7 +214,7 @@ static void dax_region_unregister(void *region)
}
struct dax_region *alloc_dax_region(struct device *parent, int region_id,
- struct resource *res, unsigned int align,
+ struct resource *res, int target_node, unsigned int align,
unsigned long pfn_flags)
{
struct dax_region *dax_region;
@@ -244,6 +244,7 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
dax_region->id = region_id;
dax_region->align = align;
dax_region->dev = parent;
+ dax_region->target_node = target_node;
if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
kfree(dax_region);
return NULL;
@@ -348,6 +349,7 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
dev_dax->dax_dev = dax_dev;
dev_dax->region = dax_region;
+ dev_dax->target_node = dax_region->target_node;
kref_get(&dax_region->kref);
inode = dax_inode(dax_dev);
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index ce977552ffb5..8619e3299943 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -10,7 +10,8 @@ struct dax_device;
struct dax_region;
void dax_region_put(struct dax_region *dax_region);
struct dax_region *alloc_dax_region(struct device *parent, int region_id,
- struct resource *res, unsigned int align, unsigned long flags);
+ struct resource *res, int target_node, unsigned int align,
+ unsigned long flags);
enum dev_dax_subsys {
DEV_DAX_BUS,
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index a82ce48f5884..a45612148ca0 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -26,6 +26,7 @@ void dax_bus_exit(void);
/**
* struct dax_region - mapping infrastructure for dax devices
* @id: kernel-wide unique region for a memory range
+ * @target_node: effective numa node if this memory range is onlined
* @kref: to pin while other agents have a need to do lookups
* @dev: parent device backing this region
* @align: allocation and mapping alignment for child dax devices
@@ -34,6 +35,7 @@ void dax_bus_exit(void);
*/
struct dax_region {
int id;
+ int target_node;
struct kref kref;
struct device *dev;
unsigned int align;
@@ -46,6 +48,7 @@ struct dax_region {
* data while the device is activated in the driver.
* @region - parent region
* @dax_dev - core dax functionality
+ * @target_node: effective numa node if dev_dax memory range is onlined
* @dev - device core
* @pgmap - pgmap for memmap setup / lifetime (driver owned)
* @ref: pgmap reference count (driver owned)
@@ -54,6 +57,7 @@ struct dax_region {
struct dev_dax {
struct dax_region *region;
struct dax_device *dax_dev;
+ int target_node;
struct device dev;
struct dev_pagemap pgmap;
struct percpu_ref ref;
diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c
index bdcff1b14e95..f71019ce0647 100644
--- a/drivers/dax/pmem/core.c
+++ b/drivers/dax/pmem/core.c
@@ -20,6 +20,7 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
struct nd_namespace_common *ndns;
struct nd_dax *nd_dax = to_nd_dax(dev);
struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
+ struct nd_region *nd_region = to_nd_region(dev->parent);
ndns = nvdimm_namespace_common_probe(dev);
if (IS_ERR(ndns))
@@ -52,7 +53,8 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
memcpy(&res, &pgmap.res, sizeof(res));
res.start += offset;
dax_region = alloc_dax_region(dev, region_id, &res,
- le32_to_cpu(pfn_sb->align), PFN_DEV|PFN_MAP);
+ nd_region->target_node, le32_to_cpu(pfn_sb->align),
+ PFN_DEV|PFN_MAP);
if (!dax_region)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
index 521eaf53a52a..36be9b619187 100644
--- a/drivers/nvdimm/e820.c
+++ b/drivers/nvdimm/e820.c
@@ -47,6 +47,7 @@ static int e820_register_one(struct resource *res, void *data)
ndr_desc.res = res;
ndr_desc.attr_groups = e820_pmem_region_attribute_groups;
ndr_desc.numa_node = e820_range_to_nid(res->start);
+ ndr_desc.target_node = ndr_desc.numa_node;
set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
return -ENXIO;
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index e79cc8e5c114..623216545cc8 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -153,7 +153,7 @@ struct nd_region {
u16 ndr_mappings;
u64 ndr_size;
u64 ndr_start;
- int id, num_lanes, ro, numa_node;
+ int id, num_lanes, ro, numa_node, target_node;
void *provider_data;
struct kernfs_node *bb_state;
struct badblocks bb;
diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
index 0a701837dfc0..ecaaa27438e2 100644
--- a/drivers/nvdimm/of_pmem.c
+++ b/drivers/nvdimm/of_pmem.c
@@ -68,6 +68,7 @@ static int of_pmem_region_probe(struct platform_device *pdev)
memset(&ndr_desc, 0, sizeof(ndr_desc));
ndr_desc.attr_groups = region_attr_groups;
ndr_desc.numa_node = dev_to_node(&pdev->dev);
+ ndr_desc.target_node = ndr_desc.numa_node;
ndr_desc.res = &pdev->resource[i];
ndr_desc.of_node = np;
set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 174a418cb171..86cd425b786d 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1060,6 +1060,7 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
nd_region->flags = ndr_desc->flags;
nd_region->ro = ro;
nd_region->numa_node = ndr_desc->numa_node;
+ nd_region->target_node = ndr_desc->target_node;
ida_init(&nd_region->ns_ida);
ida_init(&nd_region->btt_ida);
ida_init(&nd_region->pfn_ida);
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 097072c5a852..941102c0c81f 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -124,6 +124,7 @@ struct nd_region_desc {
void *provider_data;
int num_lanes;
int numa_node;
+ int target_node;
unsigned long flags;
struct device_node *of_node;
};
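
To make the difference between the two node attributes concrete, here is a
small userspace sketch of what the changelog describes: numa_node is the
closest online node (useful for binding *near* an offline device), while
target_node is the node the range itself would occupy once it is onlined.
Only the numa_node / target_node field names and NUMA_NO_NODE come from the
patch; struct region_sketch and both helpers are hypothetical, and the two
integer arguments merely stand in for the results of
acpi_map_pxm_to_online_node() and acpi_map_pxm_to_node().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for the two fields carried in nd_region_desc. */
struct region_sketch {
	int numa_node;   /* closest online node: bind *near* the device */
	int target_node; /* node of the range once onlined: bind *on* it */
};

/*
 * Fill both fields the way the nfit hunk does: trust the firmware's
 * proximity information only when it is flagged valid, otherwise fall
 * back to NUMA_NO_NODE for both.
 */
static void region_set_nodes(struct region_sketch *r, bool proximity_valid,
			     int online_node, int own_node)
{
	assert(r != NULL);
	if (proximity_valid) {
		r->numa_node = online_node;
		r->target_node = own_node;
	} else {
		r->numa_node = NUMA_NO_NODE;
		r->target_node = NUMA_NO_NODE;
	}
}

/* Pick the node to report depending on whether the range is onlined. */
static int region_effective_node(const struct region_sketch *r, bool onlined)
{
	assert(r != NULL);
	return onlined ? r->target_node : r->numa_node;
}
```

On a platform where firmware places persistent memory in its own proximity
domain, the two values differ; on the e820, of_pmem and papr_scm paths in the
patch they are simply set to the same node.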
3 years, 9 months
Re: [driver-core PATCH v6 2/9] async: Add support for queueing on specific NUMA node
by Greg KH
On Sun, Nov 11, 2018 at 08:59:03PM +0100, Pavel Machek wrote:
> On Sun 2018-11-11 11:32:08, Greg KH wrote:
> > On Thu, Nov 08, 2018 at 10:06:50AM -0800, Alexander Duyck wrote:
> > > Introduce four new variants of the async_schedule_ functions that allow
> > > scheduling on a specific NUMA node.
> > >
> > > The first two functions, async_schedule_near and
> > > async_schedule_near_domain, end up mapping to async_schedule and
> > > async_schedule_domain, but provide NUMA node specific functionality. They
> > > replace the original functions which were moved to inline function
> > > definitions that call the new functions while passing NUMA_NO_NODE.
> > >
> > > The second two functions are async_schedule_dev and
> > > async_schedule_dev_domain which provide NUMA specific functionality when
> > > passing a device as the data member and that device has a NUMA node other
> > > than NUMA_NO_NODE.
> > >
> > > The main motivation behind this is to address the need to be able to
> > > schedule device specific init work on specific NUMA nodes in order to
> > > improve performance of memory initialization.
> > >
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck(a)linux.intel.com>
> > > ---
> >
> > No one else from Intel has reviewed/verified this code at all?
> >
> > Please take advantage of the resources you have that most people do
> > not, get reviews from your coworkers please before you send this out
> > again, as they can give you valuable help before the community has to
> > review the code...
>
> We always said to companies we want to see code as soon as
> possible. You don't have to review their code, but discouraging the
> posting seems wrong.
I have a long history of Intel using me for their basic "find the
obvious bugs" review process for new driver subsystems and core changes.
When I see new major patches show up from an Intel developer without
_any_ other review from anyone else, directed at me, I get suspicious it
is happening again.
If you note, Intel is the _only_ company whose developers I say this to,
because of this problem combined with the fact that they have a whole
load of developers that they should be running things by first.
And yes, to answer Dan's point, we do want to do review in public. But
this is v6 of a core patchset and there has been NO review from anyone
else at Intel on this. So if that review was going to happen, one would
have thought it would have by now, instead of relying on me to do it.
And yes, I am grumpy, but I am grumpy because of the history here. I am
not trying to discourage anything, only to take ADVANTAGE of resources
that almost no other company provides.
Hope this helps explain my statement here.
thanks,
greg k-h
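
For readers following the quoted patch description, the wrapper pattern it
outlines (a node-aware entry point, with the original call kept as an inline
wrapper that passes NUMA_NO_NODE, plus a device variant that uses the
device's node) can be sketched in userspace roughly as follows. Every name
ending in _sketch is hypothetical, the mock runs the callback synchronously
instead of queueing it, and none of this is the code under review.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

typedef unsigned long long cookie_sketch_t;
typedef void (*async_func_sketch_t)(void *data, cookie_sketch_t cookie);

/* Hypothetical device that may or may not know its NUMA node. */
struct device_sketch {
	int numa_node;
};

static int sketch_last_node = NUMA_NO_NODE; /* node of the last request */
static cookie_sketch_t sketch_next_cookie = 1;
static int sketch_calls;                    /* callbacks run so far */

/* Node-aware entry point; the mock records the node and runs func now. */
static cookie_sketch_t schedule_near_sketch(async_func_sketch_t func,
					    void *data, int node)
{
	cookie_sketch_t cookie = sketch_next_cookie++;

	assert(func != NULL);
	sketch_last_node = node;
	func(data, cookie);
	return cookie;
}

/* The original call survives as an inline wrapper with no node preference. */
static inline cookie_sketch_t schedule_sketch(async_func_sketch_t func,
					      void *data)
{
	return schedule_near_sketch(func, data, NUMA_NO_NODE);
}

/* Device variant: the device is the data, and its node is the hint. */
static inline cookie_sketch_t schedule_dev_sketch(async_func_sketch_t func,
						  struct device_sketch *dev)
{
	assert(dev != NULL);
	return schedule_near_sketch(func, dev, dev->numa_node);
}

/* Example callback that only counts invocations. */
static void count_call_sketch(void *data, cookie_sketch_t cookie)
{
	(void)data;
	(void)cookie;
	sketch_calls++;
}
```

Existing callers keep their signature and behaviour, and only callers that
know a device's node opt in to the node hint.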
3 years, 9 months
[Question] About atomic operations on nvdimm in Linux kernel
by Fumiya Shigemitsu
Hi all,
I have a question about atomic operations on nvdimm in the Linux kernel.
We can use atomic operations in the Linux kernel such as atomic_read(),
atomic_set(), etc., but they don't handle differences in
endianness. From my understanding, we should pay attention to endianness when
storing data to persistent storage (e.g. filesystem metadata, as in ext4).
Maybe I can handle this problem with a read/write lock, a spinlock, etc.
However, atomic_* operations are useful, so I think endian-aware versions
should be in the Linux kernel for nvdimm. I suppose that the implementation
is not difficult, but there is no version of it as far as I know.
Are they simply not implemented yet, or is there a reason that we don't need
them?
I'm sorry if this has already been discussed or if my understanding is
wrong.
Regards,
Fumiya Shigemitsu.
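
One way to picture what is being asked for is a thin wrapper that converts to
a fixed byte order before an atomic store and back after an atomic load. The
sketch below uses C11 atomics in userspace; all four function names are
hypothetical, and in the kernel the conversion role would presumably be
played by cpu_to_le64() / le64_to_cpu(). It only addresses byte order and
atomicity of the store itself, and says nothing about flushing the value to
the persistence domain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lay out a CPU-order value as little-endian bytes, on any host. */
static uint64_t to_le64_sketch(uint64_t v)
{
	unsigned char b[8];
	uint64_t out;

	for (int i = 0; i < 8; i++)
		b[i] = (unsigned char)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Inverse of to_le64_sketch(): read little-endian bytes as a CPU value. */
static uint64_t from_le64_sketch(uint64_t v)
{
	unsigned char b[8];
	uint64_t out = 0;

	memcpy(b, &v, sizeof(b));
	for (int i = 0; i < 8; i++)
		out |= (uint64_t)b[i] << (8 * i);
	return out;
}

/* Atomically store v so that memory always holds the little-endian form. */
static void atomic_store_le64_sketch(_Atomic uint64_t *p, uint64_t v)
{
	assert(p != NULL);
	atomic_store(p, to_le64_sketch(v));
}

/* Atomically load the little-endian form and return it in CPU order. */
static uint64_t atomic_load_le64_sketch(_Atomic uint64_t *p)
{
	assert(p != NULL);
	return from_le64_sketch(atomic_load(p));
}
```

Because the conversion happens outside the atomic operation, this works for
plain loads and stores; read-modify-write operations such as an atomic add on
a value kept in a non-native byte order would need a compare-and-exchange
loop instead.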
3 years, 9 months