Dear Linux kernel group,
my name is Ludwig Petrosyan; I work at DESY (Germany).
We are responsible for the control systems of all accelerators at DESY.
For the past 7-8 years we have been moving to MTCA.4 systems, which use PCIe as the interconnect between boards.
I am mostly responsible for the Linux drivers for the AMC cards (PCIe endpoints).
The idea is to start using peer-to-peer transactions between PCIe endpoints (DMA
and/or ordinary memory reads/writes); a rough sketch of the kind of thing I mean is below.
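Roughly what I have in mind (just a sketch: my_dma_engine_write() is a made-up placeholder for whatever DMA engine a given AMC card has, this ignores the IOMMU, and whether such transactions are routed at all depends on the switch/ACS configuration):

#include <linux/pci.h>

/*
 * Sketch: let one endpoint's DMA engine write straight into a BAR of a
 * peer endpoint, without staging the data in system memory.
 */
static int start_p2p_write(struct pci_dev *src, struct pci_dev *dst,
                           int dst_bar, size_t len)
{
        /* Bus address of the peer BAR as seen on the PCIe fabric. */
        dma_addr_t dst_addr = pci_bus_address(dst, dst_bar);

        if (!dst_addr || len > pci_resource_len(dst, dst_bar))
                return -EINVAL;

        /* Placeholder: program the source card's DMA engine to write
         * 'len' bytes to dst_addr (a peer-to-peer memory write TLP). */
        return my_dma_engine_write(src, dst_addr, len);
}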
Could you please advise me where to start? Is there any documentation on how to do this?
with best regards
On 11/21/2016 09:36 PM, Deucher, Alexander wrote:
This is certainly not the first time this has been brought up, but
I'd like to try and get some consensus on the best way to move this forward. Allowing
devices to talk directly improves performance and reduces latency by avoiding the use of
staging buffers in system memory. Also in cases where both devices are behind a switch,
it avoids the CPU entirely. Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that
deal with this are pointer based. Ideally we'd be able to take a CPU virtual address
and get to a physical address from it, taking IOMMUs etc. into account. Having struct
pages for the memory would allow it to work more generally and wouldn't require as
much explicit support in drivers that want to use it.
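To make the pointer-based flow concrete, here is roughly what the consuming driver side could look like (just a sketch; it assumes the device memory already has struct pages, and the get_user_pages_fast() signature differs between kernel versions, this uses the gup_flags variant):

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>

/*
 * Sketch: turn a user virtual address (backed by ordinary RAM or, given
 * struct pages for device memory, by a peer device's BAR) into DMA
 * addresses usable by this device, with the IOMMU taken into account.
 */
static int map_user_buffer(struct device *dev, unsigned long uaddr,
                           size_t len, struct sg_table *sgt)
{
        unsigned long offset = uaddr & ~PAGE_MASK;
        unsigned int nr_pages = DIV_ROUND_UP(offset + len, PAGE_SIZE);
        struct page **pages;
        int pinned, ret;

        pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return -ENOMEM;

        /* Returns struct pages regardless of what backs the mapping. */
        pinned = get_user_pages_fast(uaddr, nr_pages, FOLL_WRITE, pages);
        if (pinned < 0) {
                ret = pinned;
                goto out_free;
        }
        if (pinned != nr_pages) {
                ret = -EFAULT;
                goto out_put;
        }

        ret = sg_alloc_table_from_pages(sgt, pages, nr_pages, offset, len,
                                        GFP_KERNEL);
        if (ret)
                goto out_put;

        /* dma_map_sg() hands back bus addresses with the IOMMU applied. */
        if (!dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL)) {
                ret = -EIO;
                sg_free_table(sgt);
                goto out_put;
        }

        kvfree(pages);
        return 0;       /* pages stay pinned; put_page() them at teardown */

out_put:
        while (pinned > 0)
                put_page(pages[--pinned]);
out_free:
        kvfree(pages);
        return ret;
}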
Some use cases:
1. Storage devices streaming directly to GPU device memory
2. GPU device memory to GPU device memory streaming
3. DVB/V4L/SDI devices streaming directly to GPU device memory
4. DVB/V4L/SDI devices streaming directly to storage devices
Here is a relatively simple example of how this could work for testing. This is
obviously not a complete solution.
- Device memory will be registered with the Linux memory subsystem by creating
corresponding struct page structures for the device memory (a sketch of this
registration step follows the list)
- get_user_pages_fast() will return the corresponding struct pages when a CPU address
points to the device memory
- put_page() will deal with the struct pages for device memory
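For the first step, registering device memory so that it gets struct pages could look roughly like this (sketch only; it assumes the struct dev_pagemap based form of devm_memremap_pages(), whose signature and choice of memory type have changed across kernel versions):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/pci.h>
#include <linux/memremap.h>

/*
 * Sketch: create struct pages (ZONE_DEVICE) for a PCIe BAR so that the
 * rest of the kernel can refer to the device memory via ordinary pages.
 */
static void *register_device_memory(struct pci_dev *pdev, int bar)
{
        struct dev_pagemap *pgmap;

        pgmap = devm_kzalloc(&pdev->dev, sizeof(*pgmap), GFP_KERNEL);
        if (!pgmap)
                return ERR_PTR(-ENOMEM);

        pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
        pgmap->range.start = pci_resource_start(pdev, bar);
        pgmap->range.end = pci_resource_end(pdev, bar);
        pgmap->nr_range = 1;

        /* Hotplugs the BAR as ZONE_DEVICE memory and allocates struct pages. */
        return devm_memremap_pages(&pdev->dev, pgmap);
}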
Previously proposed solutions and related proposals:
1. DMA-API/PCI map_peer_resource support for peer-to-peer
Pros: Low impact, already largely reviewed.
Cons: Requires explicit support in all drivers that want to support it; doesn't
handle S/G in device memory.
2. ZONE_DEVICE IO
Direct I/O and DMA for persistent memory (https://lwn.net/Articles/672457/)
Add support for ZONE_DEVICE IO memory with struct pages.
Pros: Doesn't waste system memory for the ZONE metadata.
Cons: CPU access to the ZONE metadata is slow; the metadata may be lost or corrupted on a device reset.
3. RDMA subsystem DMA-BUF support (http://www.spinics.net/lists/linux-rdma/msg38748.html)
Pros: Uses the existing dma-buf interface.
Cons: dma-buf is handle based and requires explicit dma-buf support in drivers (see the importer sketch after this list).
4. iopmem: A block device for PCIe memory (https://lwn.net/Articles/703895/)
5. Heterogeneous Memory Management (HMM)
6. Some new mmap-like interface that takes a userptr and a length and returns a dma-buf
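To make the "handle based" point in option 3 concrete, the importer side that each participating driver would need with the existing dma-buf API looks roughly like this (sketch only, minimal error handling):

#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>

/*
 * Sketch: import a dma-buf fd exported by another driver and obtain a
 * scatterlist of DMA addresses for this device.
 */
static struct sg_table *import_peer_buffer(struct device *dev, int fd,
                                           struct dma_buf_attachment **out)
{
        struct dma_buf *dmabuf;
        struct dma_buf_attachment *attach;
        struct sg_table *sgt;

        dmabuf = dma_buf_get(fd);              /* fd handed over from userspace */
        if (IS_ERR(dmabuf))
                return ERR_CAST(dmabuf);

        attach = dma_buf_attach(dmabuf, dev);  /* tell the exporter who we are */
        if (IS_ERR(attach)) {
                dma_buf_put(dmabuf);
                return ERR_CAST(attach);
        }

        /* The exporter maps its (possibly device-resident) memory for us. */
        sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
        if (IS_ERR(sgt)) {
                dma_buf_detach(dmabuf, attach);
                dma_buf_put(dmabuf);
                return sgt;
        }

        *out = attach;
        return sgt;
}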