nvme_ctrlr.c:1224:nvme_ctrlr_process_init: ***ERROR*** Initialization timed out in state 3
by Oza Oza
I have ported SPDK to ARMv8, and DPDK is compiled at version 16.11.1.
Controller initialization is failing.
root@ubuntu:/home/oza/SPDK/spdk#
odepth=128 --size=4G --readwrite=read --filename=0000.01.00.00/1 --bs=4096
--i
/home/oza/fio /home/oza/SPDK/spdk
EAL: pci driver is being registered 0x1
readtest: (g=0): rw=read, bs=4096B-4096B,4096B-4096B,4096B-4096B, ioengine=spdk_fio, iodepth=128
fio-2.17-29-gf0ac1
Starting 1 process
Starting Intel(R) DPDK initialization ...
[ DPDK EAL parameters: fio -c 1 --file-prefix=spdk_pid6448
--base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 8 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: cannot open /proc/self/numa_maps, consider that all memory is in
socket_id 0
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
EAL: using IOMMU type 1 (Type 1)
EAL: vfio_group_fd=11 iommu_group_no=3 *vfio_dev_fd=13
EAL: reg=0x2000 fd=13 cap_offset=0x50
EAL: the msi-x bar number is 0 0x2000 0x200
EAL: Hotplug doesn't support vfio yet
spdk_fio_setup() is being called
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
EAL: vfio_group_fd=11 iommu_group_no=3 *vfio_dev_fd=16
EAL: reg=0x2000 fd=16 cap_offset=0x50
EAL: the msi-x bar number is 0 0x2000 0x200
EAL: inside pci_vfio_write_config offset=4
nvme_ctrlr.c:1224:nvme_ctrlr_process_init: ***ERROR*** Initialization
timed out in state 3
nvme_ctrlr.c: 403:nvme_ctrlr_shutdown: ***ERROR*** did not shutdown within
5 seconds
EAL: Hotplug doesn't support vfio yet
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:953 spdk_nvme
EAL: vfio_group_fd=11 iommu_group_no=3 *vfio_dev_fd=18
EAL: reg=0x2000 fd=18 cap_offset=0x50
EAL: the msi-x bar number is 0 0x2000 0x200
EAL: Hotplug doesn't support vfio yet
EAL: Requested device 0000:01:00.0 cannot be used
spdk_nvme_probe() failed
Regards,
Oza.
3 years, 5 months
An issue with maximum write latency when accessing the same block consecutively.
by 储
Hi, all
Recently, we used a small demo to observe write latency.
The demo is based on 'hello_world.c' in 'spdk/examples/nvme/hello_world'.
The modifications are described as follows.
---------------------------------------------------------------------------------------------------------------------------------------------------------------
static void write_complete(void *arg, const struct spdk_nvme_cpl *completion) {
    struct hello_world_sequence *sequence = arg;

    spdk_free(sequence->buf);
    sequence->is_completed = 1;
}
---------------------------------------------------------------------------------------------------------------------------------------------------------------
hello_world(int id) {
    ...
    clock_gettime(CLOCK_REALTIME, &time1);
    rc = spdk_nvme_ns_cmd_write(ns_entry->ns, ns_entry->qpair, sequence.buf,
                                id, /* LBA start */
                                1,  /* number of LBAs */
                                write_complete, &sequence, 0);
    ...
    while (!sequence.is_completed) {
        spdk_nvme_qpair_process_completions(ns_entry->qpair, 0);
    }
    clock_gettime(CLOCK_REALTIME, &time2);
    printf("%ld \n", diff(time1, time2).tv_nsec);
    ...
}
---------------------------------------------------------------------------------------------------------------------------------------------------------------
int main() {
    ...
    int i = 500;

    while (i > 0) {
        if (i-- % 4 == 0) {
            id += 10;
        }
        hello_world(id);
    }
    ...
}
---------------------------------------------------------------------------------------------------------------------------------------------------------------
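The diff() helper used in hello_world() above is not included in the mail; presumably it is a conventional timespec subtraction along the lines of the sketch below (an assumption, shown only so the snippet reads complete).

#include <time.h>

/* Assumed implementation of the diff() helper used above:
 * returns time2 - time1 as a struct timespec. */
static struct timespec diff(struct timespec time1, struct timespec time2)
{
    struct timespec d;

    d.tv_sec = time2.tv_sec - time1.tv_sec;
    d.tv_nsec = time2.tv_nsec - time1.tv_nsec;
    if (d.tv_nsec < 0) {
        d.tv_sec -= 1;
        d.tv_nsec += 1000000000L;
    }
    return d;
}

Note that printing only tv_nsec, as the demo does, is valid only while the measured latency stays below one second.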
We find that when we access the same block consecutively, the maximum latencies occur more often.
They can reach 2-3 ms and show a periodic pattern.
The related experiment results can be seen in the attached file 'result'.
Why?
(1) As shown in the file 'result', for the same block the latency of the first access is about 10-12 μs, while the second, third and fourth accesses can reach 700-900 μs or even 2-3 ms.
Why does this difference between the first access and the later ones exist?
(2) Why do the 2-3 ms maxima change periodically, as shown in the figure above?
Best wishes,
Jiajia Chu
3 years, 5 months
simulated NVMe device
by Crane Chu
Hi,
Recently, I started to read and experiment with the SPDK source code, focusing on the NVMe
driver layer. However, my laptop does not have an NVMe device, and I think
many of us have the same problem. So, I implemented a simulated NVMe device
in env_dpdk, and now I can run some examples (hello_world, perf, identify)
on it. If it is useful, I'd like to submit patches.
The code is derived from the QEMU NVMe device model (e.g. nvme.c, nvme.h), which is
GPL-licensed. Would that be a license problem for SPDK?
Thanks,
-Crane
3 years, 5 months
No free hugepages reported in hugepages
by Oza Oza
Hi All,
1) DPDK fails if hugepages are not reserved. Should it not fall
back to regular 4 KB pages?
2) And how should user space know the required number of hugepages
beforehand?
a. Or should they always be reserved on the system in advance?
With that approach we might waste some memory.
root@bcm958742t:~# /usr/share/spdk/examples/nvme/perf -r 'trtype:PCIe
traddr:0000:01:00.0' -q 128 -s 4096 -w write -d 2048 -t 30 -c 0x1
Starting DPDK 16.11.1 initialization...
[ DPDK EAL parameters: perf -c 1 -m 2048 --file-prefix=spdk_pid3465 ]
[  187.734815] audit: type=1701 audit(1500448013.777:3): auid=4294967295 uid=0
gid=0 ses=4294967295 pid=3465 comm="perf"
exe="/usr/share/spdk/examples/nvme/perf" sig=6 res=1
EAL: Detected 8 lcore(s)
EAL: No free hugepages reported in hugepages-2048kB
PANIC in rte_eal_init():
Cannot get hugepage information
1: [/usr/share/spdk/examples/nvme/perf() [0x4197d4]]
Aborted (core dumped)
Regards,
Oza.
3 years, 6 months
Re: [SPDK] NVMe-oF Target Library
by Walker, Benjamin
I'm reviving this old thread because I'm back to working in this area again. The
challenge for changes this large is to figure out how to do them in small
pieces. Some of the major changes to the threading model will need to be done in
one go, but I think there are a few incremental steps we can take to improve the
code base and prepare it for the big transition.
I think the first of those changes is to remove the "mode" parameter from
subsystems. Today, NVMe-oF subsystems can be in either direct (I/O routed to
nvme library) or virtual (I/O routed to bdev library) mode. Recently, a new bdev
command was added by the wider community (thanks for the patch!) that adds an
NVMe passthrough command to the bdev layer. That allows us to send an NVMe
command through the regular bdev stack. The commands are generally only
interpreted by the NVMe bdev module - the other backing devices don't report
support for the NVMe passthrough command - but that's good enough. Given that
new capability, we can do anything in virtual mode that we previously did in
direct mode.
The only reason we didn't remove direct mode immediately after the addition of
NVMe passthrough was because we wanted to do a full performance evaluation to
verify the bdev layer doesn't have a measurable amount of overhead. I'm glad to
report those results have come in and the overhead of routing I/O through the
bdev library instead of nvme isn't measurable for any hardware set up we were
able to build.
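As a point of reference (this is not quoted from the patch), the bdev-layer NVMe passthrough mentioned above is exposed through a call along the following lines; the exact function name and signature should be checked against the bdev headers of the release in use, so treat this as an illustrative sketch only.

#include "spdk/bdev.h"
#include "spdk/nvme_spec.h"

static void
passthru_done(struct spdk_bdev_io *bdev_io, bool success, void *cb_arg)
{
	/* Completion of the passthrough command; release the bdev_io. */
	spdk_bdev_free_io(bdev_io);
}

/* Send an arbitrary NVMe I/O command through the regular bdev stack.
 * desc, ch and buf are assumed to have been set up by the caller. */
static int
send_nvme_passthru(struct spdk_bdev_desc *desc, struct spdk_io_channel *ch,
		   void *buf, size_t nbytes)
{
	struct spdk_nvme_cmd cmd = {};

	cmd.opc = SPDK_NVME_OPC_READ;  /* any I/O opcode the backend understands */
	cmd.nsid = 1;

	return spdk_bdev_nvme_io_passthru(desc, ch, &cmd, buf, nbytes,
					  passthru_done, NULL);
}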
I wrote up a patch here: https://review.gerrithub.io/#/c/369496/
The next big step is probably to make some changes to the transport API to
accommodate the new ideas in my previous email.
Discussion and requests are always welcome!
Thanks,
Ben
On Fri, 2017-07-14 at 17:16 +0000, Walker, Benjamin wrote:
>
> -----Original Message-----
> From: Walker, Benjamin
> Sent: Wednesday, April 26, 2017 2:06 PM
> To: spdk(a)lists.01.org
> Subject: NVMe-oF Target Library
>
> Hi all,
>
> I was hoping to start a bit of a design discussion about the future of the
> NVMe-oF target library (lib/nvmf). The NVMe-oF target was originally created
> as
> part of a skunkworks project and was very much an application. It wasn't
> divided into a library and an app as it is today. Right before we released it,
> I decided to attempt to break it up into a library and an application, but I
> never really finished that task. I'd like to resume that work now, but let the
> entire community weigh in on what the library looks like.
>
> First, libraries in SPDK (most things that live in lib/) shouldn't enforce a
> threading model. They should, as much as possible, be entirely passive C
> libraries with as few dependencies as we can manage. Applications in SPDK
> (things that live in app/), on the other hand, necessarily must choose a
> particular threading model. We universally use our application/event framework
> (lib/event) for apps, which spawns one thread per core, etc. We'll continue
> this model for NVMe-oF where app/nvmf_tgt will be a full application with a
> threading model dictated by the application/event framework, while lib/nvmf
> will be a passive C library that will depend only on other passive C
> libraries.
> I don't think this distinction is at all reality today, but let's work to make
> it so.
>
> The other major issue with the NVMe-oF target implementation is that it has
> quite a few baked in assumptions about what the backing storage device looks
> like. In particular, it was written assuming that it was talking directly to
> an
> NVMe device (Direct mode), and the ability to route I/O to the bdev layer
> (Virtual mode) was added much later and isn't entirely fleshed out yet. One of
> these assumptions is that real NVMe devices don't benefit from multiple queues
> - you can get the full performance from an NVMe device using just one queue
> pair. That isn't necessarily true for bdevs, which may be arbitrarily
> complex virtualized devices. Given that assumption, the NVMe-oF target
> today only creates a single queue pair to the backing storage device and only
> uses a single thread to route I/O to it. We're definitely going to need to
> break that assumption.
>
> The first discussion that I want to have is around what the high level
> concepts
> should be. We clearly need to expose things like "subsystem", "queue
> pair/connection", "namespace", and "port". We should probably have an object
> that represents the entire target too, maybe "nvmf_tgt". However, in order to
> separate the threading model from the library I think we'll need at least two
> more concepts.
>
> First, some thread has to be in charge of polling for new connections. We
> typically refer to this as the "acceptor" thread today. Maybe the best way to
> handle this is to add an "accept" function that takes the nvmf_tgt object as
> an
> argument. This function can only be called on a single thread at a time and
> is
> repeatedly called to discover new connections. I think the user will end up
> passing a callback in to this function that will be called for each new
> connection discovered.
>
> Second, once a new connection is discovered, we need to hand it off to some
> collection that a dedicated thread can poll. This collection of connections
> would be tied specifically to that dedicated thread, but it wouldn't
> necessarily be tied to a subsystem or a particular storage device. I don't
> really know what to call this thing - right now I'm kind of thinking
> "io_handler".
>
> So the general flow for an application would be to construct a target, add
> subsystems, namespaces, and ports as needed, and then poll the target for
> incoming connections. For each new connection, the application would assign it
> to an io_handler (using whatever algorithm it wanted) and then poll the
> io_handlers to actually handle I/O on the connections. Does this seem like a
> reasonable design at a very high level? Feedback is very much welcome and
> encouraged.
>
> If I don't hear back with a bunch of "you're wrong!" or "that's stupid!" type
> replies over the next few days, the next step will be to write up a new header
> file for the library that we can discuss in more detail.
>
> Thanks,
> Ben
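A minimal sketch of the passive-library surface described in the quoted design above (an acceptor callback plus per-thread io_handlers) might look roughly like the following; the type and function names here are purely illustrative and are not the actual lib/nvmf headers.

/* Illustrative types only -- not the real SPDK structures. */
struct spdk_nvmf_tgt;
struct spdk_nvmf_conn;
struct spdk_nvmf_io_handler;

/* Invoked on the acceptor thread for each newly discovered connection. */
typedef void (*spdk_nvmf_new_conn_fn)(struct spdk_nvmf_conn *conn, void *cb_arg);

/* Poll the target for incoming connections; called from one thread at a time. */
void spdk_nvmf_tgt_accept(struct spdk_nvmf_tgt *tgt,
                          spdk_nvmf_new_conn_fn cb_fn, void *cb_arg);

/* Hand a discovered connection to a per-thread collection of connections. */
void spdk_nvmf_io_handler_add_conn(struct spdk_nvmf_io_handler *handler,
                                   struct spdk_nvmf_conn *conn);

/* Called repeatedly by the owning thread to process I/O on its connections. */
int spdk_nvmf_io_handler_poll(struct spdk_nvmf_io_handler *handler);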
3 years, 6 months
Re: [SPDK] SPDK Blob Store
by Harris, James R
Hi Neil,
Welcome to SPDK!
A Blobstore “Hello World” example is a fantastic idea – definitely something to put on our Trello backlog (which I’ve done just now). You can see our SPDK Trello boards at trello.com/spdk – your suggestion is now in the Low Hanging Fruit column on the Things to Do board.
Until that example is ready, I can point you to some of our unit test code that could help in the meantime. test/unit/lib/blob/blob.c/blob_ut.c has a number of tests that show some very basic operations with the SPDK Blobstore. All of these tests use a simple malloc buffer to simulate the underlying block device – this code is in test/unit/lib/blob/bs_dev_common.c.
The key functions to look at are (a short sketch of how they chain together follows this list):
Initialize new blobstore => spdk_bs_init
Load existing blobstore => spdk_bs_load
Create blob => spdk_bs_md_create_blob
Open blob => spdk_bs_md_open_blob
Read from blob => spdk_bs_io_read_blob
Write to blob => spdk_bs_io_write_blob
Close blob => spdk_bs_md_close_blob
Unload blobstore => spdk_bs_unload
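All of these calls are asynchronous, so each step starts the next one from its completion callback. A rough sketch of that chaining, using only the functions listed above, is shown below; the callback signatures are assumptions, so check include/spdk/blob.h and blob_ut.c for the exact prototypes and for the resize/sync steps a newly created blob needs before it can accept I/O.

#include "spdk/blob.h"

static void
open_done(void *cb_arg, struct spdk_blob *blob, int bserrno)
{
    if (bserrno) {
        return;
    }
    /* The blob is open. I/O would go through spdk_bs_io_write_blob() /
     * spdk_bs_io_read_blob() on an spdk_io_channel here, followed by
     * spdk_bs_md_close_blob() and spdk_bs_unload() to tear down. */
}

static void
create_done(void *cb_arg, spdk_blob_id blobid, int bserrno)
{
    struct spdk_blob_store *bs = cb_arg;

    if (bserrno) {
        return;
    }
    spdk_bs_md_open_blob(bs, blobid, open_done, bs);
}

static void
init_done(void *cb_arg, struct spdk_blob_store *bs, int bserrno)
{
    if (bserrno) {
        return;
    }
    spdk_bs_md_create_blob(bs, create_done, bs);
}

/* Entry point: dev is an spdk_bs_dev, e.g. one created on top of an NVMe bdev. */
static void
blobstore_hello_start(struct spdk_bs_dev *dev)
{
    spdk_bs_init(dev, NULL /* default opts */, init_done, NULL);
}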
One last note as an FYI – a “logical volume store” or lvolstore is under development for SPDK. Lvolstore will provide the ability to dynamically partition a block device into smaller block devices using Blobstore for managing the on-disk metadata. You can see some additional details on the corresponding Trello board.
Thanks,
-Jim
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Neil Kumar <nkumar(a)ocient.com>
Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
Date: Tuesday, July 11, 2017 at 7:20 AM
To: "spdk(a)lists.01.org" <spdk(a)lists.01.org>
Subject: [SPDK] SPDK Blob Store
Hello,
I am pretty new to SPDK but am currently trying to implement a basic blobstore application for an NVMe drive. It looks like all of the relevant code has already been written, but I do not see any example code for the implementation. Are there any recommended resources or places to start?
Basically, just for the purpose of understanding, I am attempting something very similar to the hello_world example. However, I am trying to write/read/list/delete blobs instead of just a basic string. Is it really as easy as creating an spdk_blob and then calling the corresponding functions from the blobstore?
Create blob -> spdk_bs_md_create_blob
Read blob -> spdk_bs_md_open_blob
Write blob -> _spdk_blob_persist
delete blob -> spdk_bs_md_delete_blob
Thanks,
Neil
3 years, 6 months
Re: [SPDK] [PATCH v2] igb_uio: issue FLR during open and release of device file
by Gregory Etelson
Hello Shijith,
Please apply the same change to the uio_pci_generic.c file in the Linux kernel as well.
We experience similar faults with NVMe devices.
On Wednesday, 12 July 2017 06:40:55 IDT Tan, Jianfeng wrote:
>
> > -----Original Message-----
> > From: Shijith Thotton [mailto:shijith.thotton@caviumnetworks.com]
> > Sent: Friday, July 7, 2017 7:14 PM
> > To: dev(a)dpdk.org
> > Cc: Yigit, Ferruh; Gregory Etelson; Thomas Monjalon; Stephen Hemminger;
> > Tan, Jianfeng; Lu, Wenzhuo
> > Subject: [PATCH v2] igb_uio: issue FLR during open and release of device file
> >
> > Set UIO info device file operations open and release. Call pci reset
> > function inside open and release to clear device state at start and end.
> > Copied this behaviour from vfio_pci kernel module code. With this patch,
> > it is not mandatory to issue FLR by PMD's during init and close.
> >
> > Bus master enable and disable are added in open and release respectively
> > to take care of device DMA.
> >
> > Signed-off-by: Shijith Thotton <shijith.thotton(a)caviumnetworks.com>
>
> Reviewed-by: Jianfeng Tan <jianfeng.tan(a)intel.com>
3 years, 6 months
SPDK Blob Store
by Neil Kumar
Hello,
I am pretty new to SPDK but am currently trying to implement a basic blobstore application for an NVMe drive. It looks like all of the relevant code has already been written, but I do not see any example code for the implementation. Are there any recommended resources or places to start?
Basically, just for the purpose of understanding, I am attempting something very similar to the hello_world example. However, I am trying to write/read/list/delete blobs instead of just a basic string. Is it really as easy as creating an spdk_blob and then calling the corresponding functions from the blobstore?
Create blob -> spdk_bs_md_create_blob
Read blob -> spdk_bs_md_open_blob
Write blob -> _spdk_blob_persist
delete blob -> spdk_bs_md_delete_blob
Thanks,
Neil
3 years, 6 months
nvmf in-capsule data test
by qseaxd
Hi SPDK experts,
I'm using the SPDK NVMe-oF target and would like to test in-capsule data hardware
support, but sending data with 'perf' on the host does not use in-capsule data.
Does the SPDK host support sending in-capsule data? If yes, can you please
explain how?
Thanks,
Q
3 years, 6 months
RDMA Queue Pair
by Kumaraparameshwaran Rathnavel
Hi All,
I would like to get a few pointers on how RDMA is used in the NVMf target. The statement below decides the queue depth of an NVMf queue pair:
nvmf_min(max_rw_depth, addr->attr.max_qp_rd_atom); anything beyond this is queued, irrespective of the queue depth requested by the upper layer.
The value of attr.max_qp_rd_atom is obtained by querying the device. I see that on most NICs the value is 16. So does this mean that a queue pair cannot have more outstanding RDMA requests than this value?
Please correct me if I am wrong.
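For reference, the limit in question is the attribute reported by the standard verbs device query; note that max_qp_rd_atom bounds the number of outstanding RDMA READ and atomic operations per queue pair (as initiator), not all RDMA requests. A minimal standalone check (plain libibverbs, not SPDK code) might look like the sketch below.

#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);

    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_device_attr attr;

    if (ctx && ibv_query_device(ctx, &attr) == 0) {
        /* Upper bound on outstanding RDMA READ/atomic operations per QP */
        printf("max_qp_rd_atom = %d\n", attr.max_qp_rd_atom);
    }

    if (ctx) {
        ibv_close_device(ctx);
    }
    ibv_free_device_list(devs);
    return 0;
}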
Thanking You,
Param.
3 years, 6 months