[lkp-robot] [ipc/sem.c] f4b5bafaf7: aim9.shared_memory.ops_per_sec 11.3% improvement
by kernel test robot
Greetings,
FYI, we noticed an 11.3% improvement of aim9.shared_memory.ops_per_sec due to commit:
commit: f4b5bafaf7c0a3b2f204e48c07b5335ed93266fa ("ipc/sem.c: avoid using spin_unlock_wait()")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
in testcase: aim9
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with following parameters:
testtime: 300s
test: shared_memory
cpufreq_governor: performance
test-description: Suite IX is the "AIM Independent Resource Benchmark," a famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
In addition, the commit also has a significant impact on the following test:
+------------------+------------------------------------------------------------------+
| testcase: change | aim9: aim9.shared_memory.ops_per_sec 11.5% improvement |
| test machine | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory |
| test parameters | cpufreq_governor=performance |
| | test=shared_memory |
| | testtime=300s |
+------------------+------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: aim9/300s-shared_memory-performance/ivb43
6487b8d2876d7d39 f4b5bafaf7c0a3b2f204e48c07
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1073533 ± 0% +11.3% 1194345 ± 0% aim9.shared_memory.ops_per_sec
3221639 ± 0% +11.2% 3584021 ± 0% aim9.time.minor_page_faults
28206 ± 8% -12.5% 24690 ± 0% meminfo.Active(file)
3.56 ± 1% -5.0% 3.38 ± 4% turbostat.RAMWatt
14128 ± 9% -12.8% 12326 ± 1% numa-meminfo.node0.Active(file)
14081 ± 8% -12.2% 12365 ± 0% numa-meminfo.node1.Active(file)
7051 ± 8% -12.5% 6172 ± 0% proc-vmstat.nr_active_file
7051 ± 8% -12.5% 6172 ± 0% proc-vmstat.nr_zone_active_file
3221639 ± 0% +11.2% 3584021 ± 0% time.minor_page_faults
41.48 ± 1% +9.7% 45.50 ± 1% time.user_time
3531 ± 9% -12.7% 3081 ± 0% numa-vmstat.node0.nr_active_file
3531 ± 9% -12.7% 3081 ± 0% numa-vmstat.node0.nr_zone_active_file
3520 ± 8% -12.2% 3091 ± 0% numa-vmstat.node1.nr_active_file
3520 ± 8% -12.2% 3091 ± 0% numa-vmstat.node1.nr_zone_active_file
1.26 ± 16% -70.4% 0.37 ± 71% perf-profile.calltrace.cycles-pp.pid_vnr.SYSC_semtimedop.sys_semop.entry_SYSCALL_64_fastpath
1.38 ± 18% -53.1% 0.65 ± 8% perf-profile.children.cycles-pp.pid_vnr
8.29 ± 8% -37.2% 5.20 ± 10% perf-profile.self.cycles-pp.SYSC_semtimedop
1.37 ± 19% -57.5% 0.58 ± 14% perf-profile.self.cycles-pp.pid_vnr
76641 ± 27% +64.8% 126335 ± 25% slabinfo.kmalloc-8.active_objs
76927 ± 27% +64.5% 126565 ± 25% slabinfo.kmalloc-8.num_objs
839.50 ± 4% -13.0% 730.00 ± 8% slabinfo.nsproxy.active_objs
839.50 ± 4% -13.0% 730.00 ± 8% slabinfo.nsproxy.num_objs
15877 ± 4% -6.7% 14819 ± 4% slabinfo.vm_area_struct.active_objs
15877 ± 4% -6.7% 14819 ± 4% slabinfo.vm_area_struct.num_objs
0.09 ±110% +188.8% 0.26 ± 31% sched_debug.cfs_rq:/.nr_spread_over.stddev
12.61 ± 31% -35.8% 8.10 ± 31% sched_debug.cfs_rq:/.removed_util_avg.stddev
7.42 ± 48% -48.3% 3.83 ± 83% sched_debug.cfs_rq:/.util_avg.min
341584 ± 4% +31.4% 448942 ± 5% sched_debug.cpu.avg_idle.min
138800 ± 3% -8.5% 127032 ± 1% sched_debug.cpu.avg_idle.stddev
1134 ± 12% -23.0% 873.83 ± 16% sched_debug.cpu.nr_switches.min
628.08 ± 39% -66.7% 209.39 ± 59% sched_debug.cpu.sched_count.min
215.04 ± 61% -91.7% 17.89 ± 88% sched_debug.cpu.sched_goidle.min
132.92 ± 30% -43.5% 75.06 ± 39% sched_debug.cpu.ttwu_count.min
3.713e+11 ± 4% +20.5% 4.476e+11 ± 7% perf-stat.branch-instructions
1.32 ± 5% -9.1% 1.20 ± 3% perf-stat.branch-miss-rate%
4.887e+09 ± 5% +9.4% 5.348e+09 ± 3% perf-stat.branch-misses
0.13 ± 6% -16.6% 0.11 ± 5% perf-stat.dTLB-load-miss-rate%
4.368e+11 ± 6% +17.7% 5.139e+11 ± 0% perf-stat.dTLB-loads
0.04 ± 10% -13.3% 0.04 ± 0% perf-stat.dTLB-store-miss-rate%
2.071e+12 ± 4% +20.4% 2.494e+12 ± 6% perf-stat.instructions
12681 ± 4% +18.0% 14964 ± 5% perf-stat.instructions-per-iTLB-miss
0.92 ± 1% +11.8% 1.03 ± 1% perf-stat.ipc
3784094 ± 0% +9.5% 4145210 ± 0% perf-stat.minor-faults
3784100 ± 0% +9.5% 4145210 ± 0% perf-stat.page-faults
perf-stat.page-faults
4.5e+06 ++----------------------------------------------------------------+
O OO O OO O O OO O O O O |
4e+06 *+**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
3.5e+06 ++ |
| |
3e+06 ++ |
2.5e+06 ++ |
| |
2e+06 ++ |
1.5e+06 ++ |
| |
1e+06 ++ |
500000 ++ |
| |
0 ++-----------------O----------------------------------------------+
perf-stat.minor-faults
4.5e+06 ++----------------------------------------------------------------+
O OO O OO O O OO O O O O |
4e+06 *+**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
3.5e+06 ++ |
| |
3e+06 ++ |
2.5e+06 ++ |
| |
2e+06 ++ |
1.5e+06 ++ |
| |
1e+06 ++ |
500000 ++ |
| |
0 ++-----------------O----------------------------------------------+
aim9.shared_memory.ops_per_sec
1.2e+06 O+OO-O-OO-O-O-OO-O--O-O-O-----------------------------------------+
*.**.*.**.*.*.**.*.**.*.**.*.* *.*.**. .**.*. *. *.*. *.*
1e+06 ++ *.*.*.* * * *.*.* * |
| |
| |
800000 ++ |
| |
600000 ++ |
| |
400000 ++ |
| |
| |
200000 ++ |
| |
0 ++-----------------O----------------------------------------------+
aim9.time.minor_page_faults
4e+06 ++----------------------------------------------------------------+
| O OO O O O |
3.5e+06 O+OO O OO.O.*. O*.*. |
3e+06 *+**.*.** **.*.**.*.* **.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
| |
2.5e+06 ++ |
| |
2e+06 ++ |
| |
1.5e+06 ++ |
1e+06 ++ |
| |
500000 ++ |
| |
0 ++-----------------O----------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
[lkp-robot] [PM / QOS] 3a8bc3c5ea: kernel_BUG_at_kernel/workqueue.c
by kernel test robot
FYI, we noticed the following commit:
commit: 3a8bc3c5ea623f1658dbe002b93976a4ff19a2f7 ("PM / QOS: Add 'performance' request")
https://git.linaro.org/people/vireshk/linux opp/genpd-performance-state
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -m 512M
caused below changes:
+---------------------------------------------------+------------+------------+
| | 6ece60ef12 | 3a8bc3c5ea |
+---------------------------------------------------+------------+------------+
| boot_successes | 1 | 0 |
| boot_failures | 93 | 193 |
| WARNING:at_lib/list_debug.c:#__list_add_valid | 31 | 54 |
| BUG:kernel_hang_in_test_stage | 90 | 121 |
| WARNING:at_drivers/usb/core/urb.c:#usb_submit_urb | 1 | 1 |
| general_protection_fault:#[##]SMP | 0 | 39 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 71 |
| kernel_BUG_at_kernel/workqueue.c | 0 | 19 |
| invalid_opcode:#[##]SMP | 0 | 30 |
| WARNING:at_fs/kernfs/dir.c:#kernfs_get | 0 | 1 |
| BUG:unable_to_handle_kernel | 0 | 7 |
| Oops | 0 | 7 |
| kernel_BUG_at_mm/slub.c | 0 | 11 |
+---------------------------------------------------+------------+------------+
[ 98.099740] evbug: Connected device: input3 (AT Translated Set 2 keyboard at isa0060/serio0/input0)
[ 98.228482] usb 1-1: new high-speed USB device number 2 using dummy_hcd
[ 98.287950] ------------[ cut here ]------------
[ 98.305941] kernel BUG at kernel/workqueue.c:3486!
[ 98.328787] invalid opcode: 0000 [#1] SMP
[ 98.345473] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc2-00051-g3a8bc3c #1
[ 98.376651] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 98.413393] task: ffff88001f720000 task.stack: ffff88001f724000
[ 98.434333] RIP: 0010:init_pwq+0xf/0xe3
[ 98.450692] RSP: 0000:ffff88001f727d38 EFLAGS: 00010202
[ 98.469653] RAX: ffffffff825e17c9 RBX: ffffffff825e17c9 RCX: 0000000000000441
[ 98.492725] RDX: ffff88001f409000 RSI: ffff880019c9f800 RDI: ffffffff825e17c9
[ 98.516020] RBP: ffff88001f727d58 R08: ffff88001f818180 R09: 0000000000000001
[ 98.541143] R10: ffffffff825e17c9 R11: 0000000000000003 R12: ffff88001f409000
[ 98.564148] R13: ffff880019c9f800 R14: ffff880019c9f800 R15: 0000000000000001
[ 98.587384] FS: 0000000000000000(0000) GS:ffff88001f800000(0000) knlGS:0000000000000000
[ 98.619557] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 98.639715] CR2: 0000000000000000 CR3: 000000000280f000 CR4: 00000000000006f0
[ 98.662730] Call Trace:
[ 98.676173] ? alloc_unbound_pwq+0xb0/0xe7
[ 98.693023] apply_wqattrs_prepare+0x167/0x305
[ 98.710819] apply_workqueue_attrs_locked+0xb9/0xd0
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
[lkp-robot] [x86, dax, pmem] 2e12109d1c: fio.write_bw_MBps -75% regression
by kernel test robot
Greetings,
FYI, we noticed a -75% regression of fio.write_bw_MBps due to commit:
commit: 2e12109d1c32c810088820478d21b5b7cd87a805 ("x86, dax, pmem: introduce 'copy_from_iter' dax operation")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/dax-pmem-move-cpu-c...
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with following parameters:
disk: 2pmem
fs: xfs
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: randwrite
bs: 2M
ioengine: sync
test_size: 200G
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
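The parameters above translate into a standalone fio job roughly as follows (a sketch under assumptions: /fs/pmem0 is a hypothetical mount point for the dax-mounted xfs, and numjobs=28 approximates nr_task=50% of the machine's 56 threads):

```ini
; Sketch of an equivalent fio job file (hypothetical paths/values noted above)
[global]
ioengine=sync
rw=randwrite
bs=2M
size=200G
time_based
runtime=200
numjobs=28
directory=/fs/pmem0
group_reporting

[pmem-randwrite]
```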
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: fio-basic/2pmem-xfs-dax-200s-50%-tb-randwrite-2M-sync-200G-performance/lkp-hsw-ep6
c42a4508649e40af 2e12109d1c32c810088820478d
---------------- --------------------------
%stddev change %stddev
\ | \
68769 ± 3% -75% 17370 fio.write_bw_MBps
34384 ± 3% -75% 8685 fio.write_iops
0.70 ± 20% 14149% 99.39 fio.latency_4ms%
745 ± 4% 327% 3182 fio.write_clat_mean_us
1580 ± 14% 137% 3752 ± 5% fio.write_clat_99%_us
1405 ± 17% 136% 3320 ± 3% fio.write_clat_90%_us
1456 ± 16% 133% 3392 fio.write_clat_95%_us
435 ± 27% -60% 175 ± 21% fio.write_clat_stddev
0.01 -100% 0.00 fio.latency_250us%
21.31 ± 36% -100% 0.00 fio.latency_2ms%
5122 8% 5530 fio.time.system_time
487.59 ± 12% -84% 79.88 ± 10% fio.time.user_time
59604 60810 vmstat.system.in
2351 -25% 1758 vmstat.system.cs
1191 19% 1413 turbostat.Avg_MHz
199 4% 206 turbostat.PkgWatt
51.36 50.78 turbostat.%Busy
205 -42% 118 turbostat.RAMWatt
0.00 ± 23% 224% 0.01 ± 4% perf-stat.dTLB-load-miss-rate%
0.09 66% 0.15 perf-stat.branch-miss-rate%
1.431e+13 ± 3% 13% 1.617e+13 perf-stat.cpu-cycles
2.893e+08 -19% 2.356e+08 perf-stat.branch-misses
468564 -26% 346074 perf-stat.context-switches
2.444e+08 ± 6% -39% 1.503e+08 perf-stat.iTLB-loads
1.26e+08 ± 27% -42% 72844209 ± 17% perf-stat.node-load-misses
3.148e+11 -51% 1.546e+11 perf-stat.branch-instructions
40098660 ± 22% -56% 17714563 ± 10% perf-stat.node-store-misses
326873 ± 34% -62% 122748 perf-stat.instructions-per-iTLB-miss
1.741e+12 -67% 5.7e+11 ± 7% perf-stat.dTLB-stores
1.776e+12 ± 11% -68% 5.699e+11 ± 12% perf-stat.dTLB-loads
5.629e+12 -69% 1.725e+12 perf-stat.instructions
0.39 ± 3% -73% 0.11 perf-stat.ipc
1.385e+11 ± 3% -75% 3.438e+10 perf-stat.cache-references
perf-stat.instructions
7e+12 ++------------------------------------------------------------------+
| *. *. *. |
6e+12 ++* : *. .* *. .*.*.*. .*.*.* * : * * : * .*. |
|: + : * : : * * : : + : : : + : : *. .* *.*
5e+12 ++ * : : : : * : : * : : *.* |
* * * * :: |
4e+12 ++ * |
| |
3e+12 ++ |
| |
2e+12 ++ |
O O O O O O O O O O O O O O O O O O O O O O O O O O |
1e+12 ++ |
| |
0 ++--------------O---------------------------------------------------+
perf-stat.cache-references
1.8e+11 ++----------------------------------------------------------------+
| *. *. * |
1.6e+11 ++ + * * .*.*. .*.* + * + * |
1.4e+11 ++*.* + + : * .* *.* : *.* : *.* : *. .*.*.*
|+ * : : * : + : + : : *.*.* |
1.2e+11 *+ :: * * :: |
1e+11 ++ * * |
| |
8e+10 ++ |
6e+10 ++ |
| |
4e+10 O+ O O O O O O O O O O O O O O OO |
2e+10 ++O O O O O O O O O |
| |
0 ++--------------O-------------------------------------------------+
perf-stat.branch-instructions
4e+11 ++----------------------------------------------------------------+
| *. *. * |
3.5e+11 ++ : * * * *. .*. .*.*. : * : * |
3e+11 ++*. : + + : : :+ * * * *. : : *. : : *. .*.*.*.*
|+ * * : : * + + * : + * : : *.* |
2.5e+11 *+ * * * :: |
| * |
2e+11 ++ |
| O |
1.5e+11 O+O O O O O O O O O O O O O O O O O O O O O O OO |
1e+11 ++ |
| |
5e+10 ++ |
| |
0 ++--------------O-------------------------------------------------+
perf-stat.dTLB-loads
3.5e+12 ++----------------------------------------------------------------+
| |
3e+12 ++ * |
| :: |
2.5e+12 ++ * : : |
| * *. .* :*.*. .*.*. .* : * *. * * *.* |
2e+12 ++ + + *.* : : * * :+ + + *. + + + * : + .*. |
* * : : * * * * + : * *.*.|
1.5e+12 ++ * * *
| |
1e+12 ++ |
| O |
5e+11 O+O O O O O O O O O O O O O O O O O O O O O O OO |
| |
0 ++--------------O-------------------------------------------------+
perf-stat.dTLB-stores
2.5e+12 ++----------------------------------------------------------------+
| .* |
| *. .* : *. * |
2e+12 ++ : *. .* * .*.*.*.*.* : : * : * |
| *. : * : :* : *. : : *. : : *. .*. .*.*
|+ * : : :+ * : + * : : *.* * |
1.5e+12 *+ :: * * :: |
| * * |
1e+12 ++ |
| |
| |
5e+11 O+O O O O O O O O O O O O O O O O O O O O O O O OO |
| |
| |
0 ++--------------O-------------------------------------------------+
perf-stat.context-switches
500000 ++------------*------------------------------------------*---*-----+
450000 *+ .*. + + .*. .*.*. .* .*. .*. + * *.*.*
| *.*.*.* * * *.*.* *.* *.*.*.* *.*.*.* * |
400000 ++ |
350000 O+O O O O O O O O O O O O O O O OO O O O O O O O O |
| |
300000 ++ |
250000 ++ |
200000 ++ |
| |
150000 ++ |
100000 ++ |
| |
50000 ++ |
0 ++--------------O--------------------------------------------------+
perf-stat.ipc
0.45 ++--------------*-------*--------------------------------------------+
| * *.*. :+ .*.* + .*. .* * *.* * *.* * .*. |
0.4 ++ + + *.* : * * * + : + + : : + + : :+ .* *.*
0.35 ++ * + : + : * : : * : : *.* |
* * * * :: |
0.3 ++ * |
0.25 ++ |
| |
0.2 ++ |
0.15 ++ |
| |
0.1 O+O O O O O O O O O O O O O O O O O O O O O O O O O |
0.05 ++ |
| |
0 ++--------------O----------------------------------------------------+
fio.write_bw_MBps
80000 ++------------------------------------------------------------------+
| * *.*. *. .*.*.*. .*.*.* * *.* * *.* *
70000 ++ + + *.* : * * : : + + : : + + : *. .*.*. +|
60000 ++ * : : : : * : : * : : *.* * |
* : : * * :: |
50000 ++ * * |
| |
40000 ++ |
| |
30000 ++ |
20000 ++ |
O O O O O O O O O O O O O O O O O O O O O O O O O O |
10000 ++ |
| |
0 ++--------------O---------------------------------------------------+
fio.write_iops
40000 ++------------------------------------------------------------------+
| * *.*. *. .*.*.*. .*.*.* * *.* * *.* *
35000 ++ + + *.* : * * : : + + : : + + : *. .*.*. +|
30000 ++ * : : : : * : : * : : *.* * |
* : : * * :: |
25000 ++ * * |
| |
20000 ++ |
| |
15000 ++ |
10000 ++ |
O O O O O O O O O O O O O O O O O O O O O O O O O O |
5000 ++ |
| |
0 ++--------------O---------------------------------------------------+
fio.write_clat_mean_us
4000 ++-------------------------------------------------------------------+
| O |
3500 O+O O O O O O O O O O O O O O O |
3000 ++ O O O O O O O O O |
| |
2500 ++ |
| |
2000 ++ |
| |
1500 ++ |
1000 ++ * |
*. .*. .*.*.*. .*. .*. .*. .*. .*. .*. + + .*.*.*. .*.*
500 ++* *.* * *.*.* *.*.*. * *.* * *.* * * |
| |
0 ++--------------O----------------------------------------------------+
fio.write_clat_90__us
4000 ++-------------------------------------------------------------------+
| O O O O O O O O O |
3500 O+ O O O O O O O O O O O O O O |
3000 ++ O O |
| |
2500 ++ |
| |
2000 ++ |
* .*. .* .* .* .*. *. *.|
1500 ++ .*.*.* * + .*. .*. .*. .*.*. + .*.*.* + .*.*.* *. + * : *
1000 ++* * * * * * * * + : |
| * |
500 ++ |
| |
0 ++--------------O----------------------------------------------------+
fio.write_clat_95__us
4500 ++-------------------------------------------------------------------+
| O |
4000 ++O O |
3500 O+ O O O O O O O O O O O O O O O O O O O O |
| O O |
3000 ++ |
2500 ++ |
| |
2000 *+ .* * * * |
1500 ++ .*. .*.* + .*. .*. .*. .*. .. + .*. + + .*. + + .*.* *.*
| * *.* * * * * * * *.* * *.* *.* + + |
1000 ++ * |
500 ++ |
| |
0 ++--------------O----------------------------------------------------+
fio.latency_4ms_
100 O+--O-------O-O----O-O---O---O-O-O---O-O-O-O---O-O-O--O---------------+
90 ++O O O O O O O O |
| |
80 ++ |
70 ++ |
| |
60 ++ |
50 ++ |
40 ++ |
| |
30 ++ |
20 ++ |
| |
10 ++ |
0 *+*-*-*-*-*-*-*-O--*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*--*-*-*-*-*-*-*-*-*
turbostat.Avg_MHz
1600 ++-------------------------------------------------------------------+
| |
1400 O+O O O O O O O O O O O O O O O O O O O O O O O O O |
1200 ++ .*.*.*. .*. .*.*.*. .*.*.*.*.. .*.*.*. .*.*.*. .*. .*
*.* * *.* * *.* *.* * *.*.*.*.* |
1000 ++ |
| |
800 ++ |
| |
600 ++ |
400 ++ |
| |
200 ++ |
| |
0 ++--------------O----------------------------------------------------+
turbostat.RAMWatt
250 ++--------------------------------------------------------------------+
| |
|.*. .*.*. .* *.. .*.*.*.*.*.*.*. .*. .*.*. .*. .*..*. .*. .*. .*
200 *+ * * + + * * * * * * *.*.* * |
| * |
| |
150 ++ |
O O O O O O O O O O O O O O O O O O O O O O O O O O |
100 ++ |
| |
| |
50 ++ |
| |
| |
0 ++--------------O-----------------------------------------------------+
fio.time.user_time
800 ++--------------------------------------------------------------------+
| *. |
700 ++ * : * |
600 ++ :: : : |
| : : .* : : * |
500 ++*. .*.*. *. : * + .* *. .* *. .* : : + + |
|+ *.* *.*.*.. + * * + + *.* + + *.*. +: *.* *.*
400 *+ * * * * |
| |
300 ++ |
200 ++ |
| |
100 ++ O O O |
O O O O O O O O O O O O O O O O O O O O O O O |
0 ++--------------O-----------------------------------------------------+
fio.time.system_time
6000 ++-------------------------------------------------------------------+
O O O O O O O O O O O O O O O O O O O O O O O O O O |
5000 *+*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*..*.*.*.*.*.*.*.*.*.*.*. .*.*.*.*.*
| *.* |
| |
4000 ++ |
| |
3000 ++ |
| |
2000 ++ |
| |
| |
1000 ++ |
| |
0 ++--------------O----------------------------------------------------+
fio.latency_250us_
0.6 ++--------------------------------------------------------------------+
| * |
0.5 ++ : |
| * * * : |
| : : : : |
0.4 ++ : : : : : |
| : : * : : : : : : |
0.3 ++ : : : : : : : : : |
| : : :: : : : : : : |
0.2 ++ : : : : : : : : : : |
| : : : : : : : : : : |
| : : : : : : : : : : |
0.1 ++: : : : : : : : : : |
| : : : : : : : : : : |
0 *+*---*-*-*---*-*--*-*-*-*-*-*-*-*-*-*---*-*-*-*---*--*---*-*-*-*-*-*-*
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
[lkp-robot] [mm, vmscan] fe23914ddd: fsmark.files_per_sec -11.1% regression
by kernel test robot
Greetings,
FYI, we noticed a -11.1% regression of fsmark.files_per_sec due to commit:
commit: fe23914dddb126a29ac4929415eb318f60d97cac ("mm, vmscan: consider eligible zones in get_scan_count")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
in testcase: fsmark
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with following parameters:
iterations: 1
nr_threads: 64
disk: 3HDD
md: RAID5
fs: btrfs
filesize: 4M
test_size: 130G
sync_method: NoSync
cpufreq_governor: performance
test-description: fsmark is a file system benchmark that tests synchronous write workloads, for example a mail server's workload.
test-url: https://sourceforge.net/projects/fsmark/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: fsmark/1-64-3HDD-RAID5-btrfs-4M-130G-NoSync-performance/ivb44
13b4e0592df44c15 fe23914dddb126a29ac4929415
---------------- --------------------------
57.60 -11% 51.20 fsmark.files_per_sec
608 10% 670 fsmark.time.elapsed_time
608 10% 670 fsmark.time.elapsed_time.max
141 5% 149 fsmark.time.system_time
613586 628565 fsmark.time.voluntary_context_switches
14682 ± 7% -25% 11031 ± 5% fsmark.time.involuntary_context_switches
301042 29% 389835 ± 5% interrupts.CAL:Function_call_interrupts
207732 -9% 188203 vmstat.io.bo
6654 6541 vmstat.system.cs
5.08 -4% 4.88 turbostat.RAMWatt
92 -5% 88 turbostat.Avg_MHz
3.23 ± 4% -8% 2.97 turbostat.%Busy
155081 ± 12% -1e+05 19550 ± 76% latency_stats.sum.btrfs_tree_lock.[btrfs].btrfs_search_slot.[btrfs].btrfs_insert_empty_items.[btrfs].btrfs_new_inode.[btrfs].btrfs_create.[btrfs].path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
1454653 ± 3% -1e+06 209533 ± 47% latency_stats.sum.btrfs_tree_lock.[btrfs].btrfs_lock_root_node.[btrfs].btrfs_search_slot.[btrfs].btrfs_insert_empty_items.[btrfs].insert_with_overflow.[btrfs].btrfs_insert_dir_item.[btrfs].btrfs_add_link.[btrfs].btrfs_create.[btrfs].path_openat.do_filp_open.do_sys_open.SyS_open
3.351e+08 ± 6% -3e+08 71080683 ± 24% latency_stats.sum.wait_current_trans.[btrfs].start_transaction.[btrfs].btrfs_start_transaction.[btrfs].btrfs_create.[btrfs].path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
17.26 ± 12% 31% 22.53 ± 15% perf-stat.node-store-miss-rate%
2.856e+11 ± 5% 14% 3.267e+11 ± 4% perf-stat.dTLB-loads
1.815e+09 ± 11% 14% 2.071e+09 ± 6% perf-stat.node-load-misses
1156825 10% 1268375 perf-stat.minor-faults
1156846 10% 1268396 perf-stat.page-faults
4058202 8% 4395013 perf-stat.context-switches
2.071e+08 ± 6% 8% 2.228e+08 perf-stat.iTLB-loads
32.23 ± 3% 7% 34.55 perf-stat.node-load-miss-rate%
2.322e+08 5% 2.441e+08 perf-stat.iTLB-load-misses
0.44 0.43 perf-stat.ipc
1.925e+10 ± 3% -7% 1.783e+10 ± 5% perf-stat.cache-references
1.702e+09 ± 9% -9% 1.542e+09 ± 6% perf-stat.branch-misses
49.17 -11% 43.60 ± 3% perf-stat.cache-miss-rate%
4.582e+09 -15% 3.904e+09 perf-stat.node-stores
9.459e+09 -18% 7.764e+09 ± 3% perf-stat.cache-misses
207580 -9% 188098 iostat.md0.wkB/s
12.24 ± 5% 42% 17.38 ± 10% iostat.sda.rrqm/s
103790 -9% 94071 iostat.sda.wkB/s
25722 -10% 23255 iostat.sda.wrqm/s
931 -16% 783 iostat.sda.avgrq-sz
120.88 -42% 69.92 ± 3% iostat.sda.r_await
26.14 -55% 11.88 ± 3% iostat.sda.avgqu-sz
117.72 -56% 52.03 ± 3% iostat.sda.await
117.47 -56% 51.76 ± 3% iostat.sda.w_await
15.66 ± 5% 37% 21.39 ± 8% iostat.sdb.rrqm/s
103763 -9% 94048 iostat.sdb.wkB/s
25717 -10% 23249 iostat.sdb.wrqm/s
930 -16% 783 iostat.sdb.avgrq-sz
120.42 -39% 73.21 ± 4% iostat.sdb.r_await
27.19 -54% 12.38 ± 3% iostat.sdb.avgqu-sz
122.32 -56% 54.23 ± 3% iostat.sdb.await
122.10 -56% 53.98 ± 3% iostat.sdb.w_await
11.84 ± 4% 41% 16.68 ± 7% iostat.sdc.rrqm/s
228 18% 270 iostat.sdc.w/s
103791 -9% 94074 iostat.sdc.wkB/s
25721 -10% 23246 iostat.sdc.wrqm/s
904 -21% 710 iostat.sdc.avgrq-sz
117.37 -40% 70.17 ± 3% iostat.sdc.r_await
26.85 -53% 12.71 iostat.sdc.avgqu-sz
118.03 -56% 52.24 ± 3% iostat.sdc.await
117.83 -56% 52.02 ± 3% iostat.sdc.w_await
perf-stat.page-faults
1.3e+06 ++---------------------------------------------------------------+
| |
1.28e+06 O+O O OO O O O O |
1.26e+06 ++ O O OO O O |
| O O O O |
1.24e+06 ++ |
| |
1.22e+06 ++ |
| |
1.2e+06 ++ |
1.18e+06 ++ |
| |
1.16e+06 ++ .*. *. *.*. .*.* *. .*. .*. *.|
*.**.* **. .* *.*.**.*.* *.** *.*.* * **.*.**.* * *
1.14e+06 ++----------*----------------------------------------------------+
perf-stat.minor-faults
1.3e+06 ++---------------------------------------------------------------+
| |
1.28e+06 O+O O OO O O O O |
1.26e+06 ++ O O OO O O |
| O O O O |
1.24e+06 ++ |
| |
1.22e+06 ++ |
| |
1.2e+06 ++ |
1.18e+06 ++ |
| |
1.16e+06 ++ .*. *. *.*. .*.* *. .*. .*. *.|
*.**.* **. .* *.*.**.*.* *.** *.*.* * **.*.**.* * *
1.14e+06 ++----------*----------------------------------------------------+
fsmark.time.elapsed_time
680 ++--------------------------------------------------------------------+
| O O O O O |
670 O+ O O O O O OO O O O O O |
660 ++ O |
| |
650 ++ |
| |
640 ++ |
| |
630 ++ |
620 ++ |
| |
610 ++ .*. *. .*. *.*. .* .**.*. *. .*.|
*.* * *.*.*.*.**.*.*.* * *.*.*.* *.*.*.*.* *.*.*.* * *
600 ++--------------------------------------------------------------------+
fsmark.time.elapsed_time.max
680 ++--------------------------------------------------------------------+
| O O O O O |
670 O+ O O O O O OO O O O O O |
660 ++ O |
| |
650 ++ |
| |
640 ++ |
| |
630 ++ |
620 ++ |
| |
610 ++ .*. *. .*. *.*. .* .**.*. *. .*.|
*.* * *.*.*.*.**.*.*.* * *.*.*.* *.*.*.*.* *.*.*.* * *
600 ++--------------------------------------------------------------------+
fsmark.files_per_sec
58 ++---------------------------------------------------------------------+
*.*.*.**.*.*.*.*.*.*.**.*.*.*.*. .*.**.*. .*.*.*.**.*.*.*.*.*.*.**.*.*.*
57 ++ * * |
| |
56 ++ |
| |
55 ++ |
| |
54 ++ |
| |
53 ++ |
| |
52 ++ |
| |
51 O+O-O-OO-O-O-O-O-O-O-OO-O-O-O-O-O-O------------------------------------+
vmstat.io.bo
210000 ++------------------------------*--------*-------------------------+
*.*.**. .*.**.*.*.**.*.*. *. + **.*. + *.*.*.**.*.*.**.*.*. *.*.*
| * * *.* * * |
205000 ++ |
| |
| |
200000 ++ |
| |
195000 ++ |
| |
| |
190000 ++ O O |
O OO O O O O O OO O O O O |
| O O O |
185000 ++-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
[lkp-robot] [mm/memblock] cc4a913fa5: WARNING:at_mm/memblock.c:#__next_mem_pfn_range
by kernel test robot
FYI, we noticed the following commit:
commit: cc4a913fa513cdac8777c2714e6388465691faf8 ("mm/memblock: switch to use NUMA_NO_NODE instead of MAX_NUMNODES in for_each_mem_pfn_range()")
url: https://github.com/0day-ci/linux/commits/Wei-Yang/mm-memblock-use-NUMA_NO...
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a Linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-i386 -enable-kvm -m 256M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+------------------------------------------------+------------+------------+
| | 2d1ec173d5 | cc4a913fa5 |
+------------------------------------------------+------------+------------+
| boot_successes | 6 | 0 |
| boot_failures | 0 | 8 |
| WARNING:at_mm/memblock.c:#__next_mem_pfn_range | 0 | 8 |
| calltrace:SyS_open | 0 | 8 |
+------------------------------------------------+------------+------------+
[ 0.000000] initial memory mapped: [mem 0x00000000-0x083fffff]
[ 0.000000] Base memory trampoline at [8009b000] 9b000 size 16384
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at mm/memblock.c:1088 __next_mem_pfn_range+0x5e/0x16e
[ 0.000000] Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6-00135-gcc4a913 #1
[ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 0.000000] 87511ec0 86e241ad 8743bde0 8743bde0 87511ed8 86c48c2a 00000440 11a426ae
[ 0.000000] 87511f5c 87511f54 87511ef4 86c48c68 00000009 00000000 87511eec 8743bde0
[ 0.000000] 87511f08 87511f1c 875eadb9 8743bd80 00000440 8743bde0 87511f5c 87511f54
[ 0.000000] Call Trace:
[ 0.000000] [<86e241ad>] dump_stack+0x76/0xa9
[ 0.000000] [<86c48c2a>] __warn+0xba/0xd0
[ 0.000000] [<86c48c68>] warn_slowpath_fmt+0x28/0x30
[ 0.000000] [<875eadb9>] __next_mem_pfn_range+0x5e/0x16e
[ 0.000000] [<875a84e6>] init_range_memory_mapping+0x3e/0x177
[ 0.000000] [<875e9fea>] ? memblock_find_in_range+0x3c/0x95
[ 0.000000] [<875a87c4>] init_mem_mapping+0x1a5/0x260
[ 0.000000] [<87594e3a>] setup_arch+0x7c0/0xb82
[ 0.000000] [<86c973b2>] ? vprintk_default+0x12/0x20
[ 0.000000] [<8758dcd8>] start_kernel+0x5b/0x40f
[ 0.000000] [<8758d2c8>] i386_start_kernel+0xaf/0xc7
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] BRK [0x07ea3000, 0x07ea3fff] PGTABLE
[ 0.000000] RAMDISK: [mem 0x0f4f1000-0x0ffcffff]
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong