---- On Fri, 31 Aug 2018 17:42:55 +0800 Jan Kara <jack(a)suse.cz> wrote ----
On Fri 31-08-18 09:38:09, Dave Chinner wrote:
> On Thu, Aug 30, 2018 at 03:47:32PM -0400, Mikulas Patocka wrote:
> > On Thu, 30 Aug 2018, Jeff Moyer wrote:
> > > Mike Snitzer <snitzer(a)redhat.com> writes:
> > >
> > > > Until we properly add DAX support to dm-snapshot I'm afraid we
> > > > need to tolerate this "regression", since the reality is that
> > > > support for snapshots of a DAX DM device never worked in a robust way.
> > >
> > > Agreed.
> > >
> > > -Jeff
> > You can't support dax on snapshot - if someone maps a block and the block
> > needs to be moved, then what?
> This is only a problem for access via mmap and page faults.
> At the filesystem level, it's no different to the existing direct IO
> algorithm for read/write IO - we simply allocate new space, copy the
> data we need to copy into the new space (may be no copy needed), and
> then write the new data into the new space. I'm pretty sure that for
> bio-based IO to dm-snapshot devices the algorithm will be exactly
> the same.
> However, for direct access via mmap, we have to modify how the
> userspace virtual address is mapped to the physical location. IOWs,
> during the COW operation, we have to invalidate all existing user
> mappings we have for that physical address. This means we have to do
> an invalidation after the allocate/copy part of the COW operation.
> If we are doing this during a page fault, it means we'll probably
> have to restart the page fault so it can look up the new physical
> address associated with the faulting user address. After we've done
> the invalidation, any new (or restarted) page fault finds the
> location of new copy we just made, maps it into the user address
> space, updates the ptes and we're all good.
> Well, that's the theory. We haven't implemented this for XFS yet, so
> it might end up a little different, and we might yet hit unexpected
> problems (it's DAX, that's what happens :/).
Yes, that's the outline of a plan :)
> It's a whole different ballgame for a dm-snapshot device - block
> devices are completely unaware of page faults to DAX file mappings.
Actually, block devices are not completely unaware of DAX page faults -
they will get ->direct_access callback for the fault range. It does not
currently convey enough information - we also need to inform the block
device whether it is read or write. But that's about all that's needed to
add AFAICT. And by comparing returned PFN with the one we have stored in
the radix tree (which we have if that file offset is mapped by anybody),
the filesystem / DAX code can tell whether remapping happened and do the
I am trying to investigate how to make dm-snapshot support DAX, and I have
posted a patchset upstream for comments. Any suggestions are welcome.
Initially, I had not considered the case of mmap write faults.
From Dan's reply and this email thread, now I have a more clear
The question is this: even if the virtual dm block device has been informed,
through PROT_WRITE, that the mapping may be written to, once userspace writes
directly to the mapped virtual address of the origin device (for example with
memcpy), dm-snapshot has no chance to detect that write.
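
To make the concern concrete, here is a minimal userspace sketch. It assumes
an ordinary temporary file stands in for the DAX-backed origin device, and
the helper name store_through_mapping is hypothetical. Once the shared
writable mapping exists, the update is a plain memory store: userspace
issues no write() call and no bio, so a block-layer target has no hook at
that point.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def store_through_mapping(payload: bytes) -> bytes:
    """Store payload via a shared writable mapping, then read it back from the file."""
    # Assumption: a temporary file stands in for the DAX-backed origin device.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, PAGE)
        with mmap.mmap(fd, PAGE, access=mmap.ACCESS_WRITE) as m:
            # Equivalent to a memcpy into the mapping: no write() syscall is made.
            m[:len(payload)] = payload
            m.flush()
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)
        os.unlink(path)
```

For example, store_through_mapping(b"hello") returns the same bytes it was
given, read back through the file descriptor rather than the mapping.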
Although dm-snapshot may get a chance to prepare a COW area to back up the
origin's blocks within the ->direct_access callback for the fault range, how
does it get the opportunity to read the data from the origin device and save
it to the COW area?
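
The sequence discussed in this thread (the device learns whether a fault is
a read or a write, copies on the first write, and the fault path compares
the returned PFN with the cached one) can be pictured with a toy model. This
is only a sketch under loud assumptions: ToyCowDevice, ToyFaultHandler, and
the write argument to direct_access are hypothetical, nothing here is kernel
code, and a dict stands in for the radix tree.

```python
class ToyCowDevice:
    """Toy block device whose blocks stay shared until first written (hypothetical)."""

    def __init__(self, blocks):
        self.memory = {}      # pfn -> data stored at that physical location
        self.mapping = {}     # block number -> pfn currently backing it
        self.shared = set()   # blocks that have not been copied yet
        self._next_pfn = 100
        for block, data in blocks.items():
            self.mapping[block] = self._alloc(data)
            self.shared.add(block)

    def _alloc(self, data):
        pfn = self._next_pfn
        self._next_pfn += 1
        self.memory[pfn] = data
        return pfn

    def direct_access(self, block, write):
        """Return the pfn for block; the write flag is the assumed extra information."""
        if write and block in self.shared:
            old_pfn = self.mapping[block]
            # Allocate new space and copy the old data before exposing a writable pfn.
            self.mapping[block] = self._alloc(self.memory[old_pfn])
            self.shared.discard(block)
        return self.mapping[block]


class ToyFaultHandler:
    """Toy stand-in for the filesystem / DAX fault path (hypothetical)."""

    def __init__(self, dev):
        self.dev = dev
        self.cached = {}          # block -> pfn; stands in for the radix tree
        self.invalidations = 0

    def fault(self, block, write):
        pfn = self.dev.direct_access(block, write)
        old_pfn = self.cached.get(block)
        if old_pfn is not None and old_pfn != pfn:
            # Remapping happened: existing user mappings would be invalidated here.
            self.invalidations += 1
        self.cached[block] = pfn
        return pfn
```

In this model a read fault followed by a write fault on the same block
returns two different PFNs and triggers exactly one invalidation, while
later write faults reuse the copied location. The model deliberately covers
only the fault path; it says nothing about stores made after the writable
mapping is established.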