Greetings,
FYI, we noticed a -8.7% regression of stress-ng.dup.ops_per_sec due to commit:
commit: 4facb95b7adaf77e2da73aafb9ba60996fe42a12 ("x86/entry: Unbreak 32bit fast syscall")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 30s
class: filesystem
cpufreq_governor: performance
ucode: 0x5002f01
fs: btrfs
In addition, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -4.1% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=50%                                                               |
|                  | test=getppid1                                                             |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -3.5% regression              |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=thread                                                               |
|                  | nr_task=16                                                                |
|                  | test=futex4                                                               |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -3.1% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=16                                                                |
|                  | test=poll1                                                                |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <rong.a.chen(a)intel.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
filesystem/gcc-9/performance/1HDD/btrfs/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/stress-ng/30s/0x5002f01
commit:
d5c678aed5 ("x86/debug: Allow a single level of #DB recursion")
4facb95b7a ("x86/entry: Unbreak 32bit fast syscall")
d5c678aed5eddb94 4facb95b7adaf77e2da73aafb9b
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.937e+08 -8.7% 1.768e+08 stress-ng.dup.ops
6455320 -8.7% 5891784 stress-ng.dup.ops_per_sec
1898586 -3.9% 1823791 stress-ng.time.involuntary_context_switches
3902753 -3.5% 3767258 ± 2% stress-ng.time.minor_page_faults
42502 ± 2% -11.2% 37751 ± 5% sched_debug.cpu.yld_count.min
37834 +1.8% 38531 proc-vmstat.nr_slab_reclaimable
7344206 -2.1% 7188882 proc-vmstat.pgfault
1.346e+08 +1.3% 1.363e+08 perf-stat.i.cache-references
200.92 -1.6% 197.63 perf-stat.i.cpu-migrations
9.69 +2.0% 9.88 perf-stat.overall.MPKI
1.344e+08 +1.3% 1.361e+08 perf-stat.ps.cache-references
200.79 -1.6% 197.56 perf-stat.ps.cpu-migrations
2.076e+09 -1.4% 2.046e+09 perf-stat.ps.dTLB-stores
1.696e+13 -1.2% 1.676e+13 perf-stat.total.instructions
149522 ± 7% -12.1% 131402 ± 5% softirqs.BLOCK
111270 ± 3% +10.1% 122480 ± 3% softirqs.CPU40.RCU
131299 ± 5% -16.1% 110136 ± 9% softirqs.CPU55.RCU
116708 ± 4% -9.5% 105607 ± 4% softirqs.CPU56.RCU
127637 ± 9% -14.8% 108805 ± 6% softirqs.CPU58.RCU
108133 ± 6% +10.8% 119806 ± 4% softirqs.CPU7.RCU
108681 ± 9% +17.6% 127829 ± 9% softirqs.CPU88.RCU
9521 ± 3% -9.6% 8607 ± 5% slabinfo.file_lock_cache.active_objs
9555 ± 3% -9.6% 8637 ± 5% slabinfo.file_lock_cache.num_objs
76202 ± 7% -12.4% 66736 ± 8% slabinfo.ftrace_event_field.active_objs
899.25 ± 7% -12.4% 788.00 ± 8% slabinfo.ftrace_event_field.active_slabs
76468 ± 7% -12.3% 67030 ± 8% slabinfo.ftrace_event_field.num_objs
899.25 ± 7% -12.4% 788.00 ± 8% slabinfo.ftrace_event_field.num_slabs
24041 ± 5% -10.3% 21566 ± 3% slabinfo.pid.active_objs
24141 ± 5% -10.5% 21610 ± 3% slabinfo.pid.num_objs
6.16 -0.4 5.73 ± 6% perf-profile.calltrace.cycles-pp.free_uid.put_cred_rcu.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.09 -0.4 5.67 ± 6% perf-profile.calltrace.cycles-pp.refcount_dec_not_one.refcount_dec_and_lock_irqsave.free_uid.put_cred_rcu.do_faccessat
6.11 -0.4 5.69 ± 6% perf-profile.calltrace.cycles-pp.refcount_dec_and_lock_irqsave.free_uid.put_cred_rcu.do_faccessat.do_syscall_64
0.85 ± 3% -0.1 0.75 ± 12% perf-profile.calltrace.cycles-pp.btrfs_get_delayed_node.btrfs_get_or_create_delayed_node.btrfs_delayed_update_inode.btrfs_update_inode.btrfs_dirty_inode
6.10 -0.4 5.67 ± 6% perf-profile.children.cycles-pp.refcount_dec_not_one
6.16 -0.4 5.73 ± 6% perf-profile.children.cycles-pp.free_uid
6.12 -0.4 5.70 ± 6% perf-profile.children.cycles-pp.refcount_dec_and_lock_irqsave
0.85 ± 3% -0.1 0.75 ± 12% perf-profile.children.cycles-pp.btrfs_get_delayed_node
0.25 ± 4% -0.1 0.17 ± 6% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.12 ± 3% +0.0 0.13 perf-profile.children.cycles-pp.lapic_next_deadline
0.03 ±102% +0.0 0.08 ± 19% perf-profile.children.cycles-pp.update_blocked_averages
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.__x64_sys_access
0.00 +0.1 0.07 ± 13% perf-profile.children.cycles-pp.__x64_sys_faccessat
6.06 -0.4 5.64 ± 6% perf-profile.self.cycles-pp.refcount_dec_not_one
1.34 ± 4% -0.2 1.10 ± 11% perf-profile.self.cycles-pp.menu_select
1.28 ± 5% -0.2 1.11 ± 9% perf-profile.self.cycles-pp.cpuidle_enter_state
0.84 ± 3% -0.1 0.74 ± 12% perf-profile.self.cycles-pp.btrfs_get_delayed_node
0.24 ± 3% -0.1 0.17 ± 6% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.11 ± 4% +0.0 0.13 perf-profile.self.cycles-pp.lapic_next_deadline
0.00 +0.1 0.05 ± 8% perf-profile.self.cycles-pp.__x64_sys_access
0.00 +0.1 0.06 ± 13% perf-profile.self.cycles-pp.__x64_sys_faccessat
0.00 +0.1 0.09 ± 14% perf-profile.self.cycles-pp.__x64_sys_fchmod
297777 ± 7% -58.2% 124580 ±100% interrupts.315:PCI-MSI.376832-edge.ahci[0000:00:17.0]
2.25 ± 79% +21266.7% 480.75 ± 98% interrupts.69:PCI-MSI.31981602-edge.i40e-eth0-TxRx-33
2.25 ±127% +3888.9% 89.75 ±134% interrupts.74:PCI-MSI.31981607-edge.i40e-eth0-TxRx-38
16244 ± 15% -19.2% 13128 ± 14% interrupts.CPU1.RES:Rescheduling_interrupts
91723 ± 9% +52.2% 139616 ± 46% interrupts.CPU10.CAL:Function_call_interrupts
10853 ± 6% -31.2% 7465 ± 15% interrupts.CPU10.RES:Rescheduling_interrupts
11509 ± 8% -29.8% 8080 ± 26% interrupts.CPU11.RES:Rescheduling_interrupts
123.00 ± 4% -19.7% 98.75 ± 25% interrupts.CPU14.NMI:Non-maskable_interrupts
123.00 ± 4% -19.7% 98.75 ± 25% interrupts.CPU14.PMI:Performance_monitoring_interrupts
8822 ± 13% -36.8% 5575 ± 19% interrupts.CPU14.RES:Rescheduling_interrupts
11485 ± 15% -45.3% 6283 ± 2% interrupts.CPU16.RES:Rescheduling_interrupts
125.00 ± 3% -19.8% 100.25 ± 24% interrupts.CPU17.NMI:Non-maskable_interrupts
125.00 ± 3% -19.8% 100.25 ± 24% interrupts.CPU17.PMI:Performance_monitoring_interrupts
11064 ± 18% -34.1% 7293 ± 23% interrupts.CPU17.RES:Rescheduling_interrupts
810.00 ±146% -87.7% 99.75 ± 24% interrupts.CPU18.NMI:Non-maskable_interrupts
810.00 ±146% -87.7% 99.75 ± 24% interrupts.CPU18.PMI:Performance_monitoring_interrupts
9896 ± 2% -31.6% 6769 ± 26% interrupts.CPU18.RES:Rescheduling_interrupts
137504 ± 14% -46.9% 72975 ± 44% interrupts.CPU18.TLB:TLB_shootdowns
10649 ± 17% -38.4% 6556 ± 28% interrupts.CPU21.RES:Rescheduling_interrupts
125.00 ± 3% -20.8% 99.00 ± 24% interrupts.CPU22.NMI:Non-maskable_interrupts
125.00 ± 3% -20.8% 99.00 ± 24% interrupts.CPU22.PMI:Performance_monitoring_interrupts
9267 ± 16% -24.6% 6990 ± 10% interrupts.CPU22.RES:Rescheduling_interrupts
19849 ± 21% -28.1% 14273 ± 20% interrupts.CPU24.RES:Rescheduling_interrupts
14181 ± 8% -17.4% 11709 ± 10% interrupts.CPU25.RES:Rescheduling_interrupts
185468 ± 20% -29.7% 130477 ± 17% interrupts.CPU27.CAL:Function_call_interrupts
159450 ± 5% -22.1% 124232 ± 18% interrupts.CPU31.CAL:Function_call_interrupts
122378 ± 10% -46.4% 65550 ± 30% interrupts.CPU31.TLB:TLB_shootdowns
6106 ± 15% +46.4% 8939 ± 28% interrupts.CPU32.RES:Rescheduling_interrupts
2.00 ± 79% +23887.5% 479.75 ± 98% interrupts.CPU33.69:PCI-MSI.31981602-edge.i40e-eth0-TxRx-33
5812 ± 15% +23.8% 7195 ± 6% interrupts.CPU36.RES:Rescheduling_interrupts
2.00 ±122% +4362.5% 89.25 ±135% interrupts.CPU38.74:PCI-MSI.31981607-edge.i40e-eth0-TxRx-38
132.75 ± 6% -31.6% 90.75 ± 34% interrupts.CPU44.NMI:Non-maskable_interrupts
132.75 ± 6% -31.6% 90.75 ± 34% interrupts.CPU44.PMI:Performance_monitoring_interrupts
157673 ± 44% -45.5% 85876 ± 44% interrupts.CPU45.CAL:Function_call_interrupts
8304 ± 6% -21.3% 6537 ± 17% interrupts.CPU48.RES:Rescheduling_interrupts
113272 ± 32% +66.1% 188101 ± 22% interrupts.CPU5.CAL:Function_call_interrupts
12178 ± 9% -48.7% 6249 ± 34% interrupts.CPU55.RES:Rescheduling_interrupts
574.50 ±132% -80.6% 111.25 ± 7% interrupts.CPU56.NMI:Non-maskable_interrupts
574.50 ±132% -80.6% 111.25 ± 7% interrupts.CPU56.PMI:Performance_monitoring_interrupts
10975 ± 21% -55.5% 4879 ± 34% interrupts.CPU56.RES:Rescheduling_interrupts
134.00 ± 9% -15.9% 112.75 ± 5% interrupts.CPU57.NMI:Non-maskable_interrupts
134.00 ± 9% -15.9% 112.75 ± 5% interrupts.CPU57.PMI:Performance_monitoring_interrupts
13877 ± 31% -48.6% 7138 ± 46% interrupts.CPU58.RES:Rescheduling_interrupts
10752 ± 25% -43.5% 6070 ± 48% interrupts.CPU59.RES:Rescheduling_interrupts
124.25 ± 3% -9.7% 112.25 ± 6% interrupts.CPU61.NMI:Non-maskable_interrupts
124.25 ± 3% -9.7% 112.25 ± 6% interrupts.CPU61.PMI:Performance_monitoring_interrupts
91014 ± 27% -49.4% 46018 ± 36% interrupts.CPU62.TLB:TLB_shootdowns
9574 ± 24% -49.4% 4841 ± 40% interrupts.CPU69.RES:Rescheduling_interrupts
8343 ± 17% -45.7% 4533 ± 34% interrupts.CPU70.RES:Rescheduling_interrupts
4009 ± 96% -80.5% 783.50 ±152% interrupts.CPU71.NMI:Non-maskable_interrupts
4009 ± 96% -80.5% 783.50 ±152% interrupts.CPU71.PMI:Performance_monitoring_interrupts
10207 ± 14% -24.9% 7670 ± 18% interrupts.CPU72.RES:Rescheduling_interrupts
127441 ± 20% -40.7% 75630 ± 18% interrupts.CPU74.CAL:Function_call_interrupts
106754 ± 23% -36.8% 67436 ± 20% interrupts.CPU74.TLB:TLB_shootdowns
10256 ± 20% -29.5% 7230 ± 22% interrupts.CPU8.RES:Rescheduling_interrupts
79980 ± 36% -49.4% 40484 ± 21% interrupts.CPU9.TLB:TLB_shootdowns
875684 ± 2% -12.4% 767074 ± 6% interrupts.RES:Rescheduling_interrupts
[Plot: stress-ng.dup.ops per sample, y-axis 1.5e+08 to 4e+08]
[Plot: stress-ng.dup.ops_per_sec per sample, y-axis 5e+06 to 1.3e+07]
[*] bisect-good sample
[O] bisect-bad sample
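As a sanity check on the headline figure, the %change column for throughput rows can be recomputed from the two commit columns. The sketch below assumes the column is the plain relative change, (new - base) / base; the helper name `percent_change` is illustrative and not part of lkp-tests.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, expressed in percent."""
    return (new - base) / base * 100.0


# Values copied from the stress-ng.dup.ops_per_sec row of the report.
base_ops_per_sec = 6455320  # parent commit d5c678aed5
new_ops_per_sec = 5891784   # commit 4facb95b7a

change = percent_change(base_ops_per_sec, new_ops_per_sec)
print(f"stress-ng.dup.ops_per_sec: {change:+.1f}%")
```

Rounded to one decimal place, this agrees with the -8.7% reported for stress-ng.dup.ops_per_sec.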
***************************************************************************************************
lkp-csl-2ap3: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap3/getppid1/will-it-scale/0x5002f01
commit:
d5c678aed5 ("x86/debug: Allow a single level of #DB recursion")
4facb95b7a ("x86/entry: Unbreak 32bit fast syscall")
d5c678aed5eddb94 4facb95b7adaf77e2da73aafb9b
---------------- ---------------------------
%stddev %change %stddev
\ | \
12038155 -4.1% 11545213 will-it-scale.per_process_ops
1.156e+09 -4.1% 1.108e+09 will-it-scale.workload
79181 ± 3% +10.8% 87715 ± 6% cpuidle.POLL.time
8.25 ± 6% -3.2 5.07 ± 9% perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
10.46 ± 6% +2.0 12.48 ± 9% perf-profile.calltrace.cycles-pp.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
8.26 ± 6% -2.9 5.34 ± 9% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
10.62 ± 6% +2.0 12.64 ± 9% perf-profile.children.cycles-pp.__x64_sys_getppid
8.06 ± 6% -2.9 5.18 ± 9% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
1.14 ± 6% +1.1 2.21 ± 10% perf-profile.self.cycles-pp.do_syscall_64
3.27 ± 6% +1.4 4.65 ± 9% perf-profile.self.cycles-pp.__x64_sys_getppid
5.086e+10 -4.1% 4.879e+10 perf-stat.i.branch-instructions
1.16 -0.1 1.06 perf-stat.i.branch-miss-rate%
5.856e+08 -12.2% 5.144e+08 perf-stat.i.branch-misses
1.18 +6.3% 1.25 perf-stat.i.cpi
8.872e+10 -6.6% 8.291e+10 perf-stat.i.dTLB-loads
0.00 ± 4% +0.0 0.00 ± 6% perf-stat.i.dTLB-store-miss-rate%
6.113e+10 -7.7% 5.643e+10 perf-stat.i.dTLB-stores
5.564e+08 -8.2% 5.106e+08 perf-stat.i.iTLB-load-misses
2.531e+11 -5.8% 2.384e+11 perf-stat.i.instructions
456.82 +2.5% 468.10 perf-stat.i.instructions-per-iTLB-miss
0.85 -5.9% 0.80 perf-stat.i.ipc
1045 -6.3% 979.65 perf-stat.i.metric.M/sec
0.04 +7.0% 0.05 perf-stat.overall.MPKI
1.15 -0.1 1.05 perf-stat.overall.branch-miss-rate%
1.18 +6.3% 1.25 perf-stat.overall.cpi
0.00 +0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
455.16 +2.6% 466.97 perf-stat.overall.instructions-per-iTLB-miss
0.85 -5.9% 0.80 perf-stat.overall.ipc
65906 -1.8% 64694 perf-stat.overall.path-length
5.068e+10 -4.1% 4.862e+10 perf-stat.ps.branch-instructions
5.834e+08 -12.2% 5.125e+08 perf-stat.ps.branch-misses
8.841e+10 -6.5% 8.262e+10 perf-stat.ps.dTLB-loads
6.092e+10 -7.7% 5.623e+10 perf-stat.ps.dTLB-stores
5.543e+08 -8.2% 5.088e+08 perf-stat.ps.iTLB-load-misses
2.522e+11 -5.8% 2.376e+11 perf-stat.ps.instructions
7.617e+13 -5.9% 7.17e+13 perf-stat.total.instructions
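The derived perf-stat rows in the getppid1 table can be cross-checked against the raw counters. The sketch below assumes that ipc is the reciprocal of cpi and that perf-stat.overall.path-length is total instructions divided by will-it-scale.workload; the second relation is an assumption that happens to match the reported numbers, not something the report states. Inputs are the rounded values from the table, so small deviations are expected.

```python
# Rounded values copied from the getppid1 table (base = d5c678aed5, new = 4facb95b7a).
cpi = {"base": 1.18, "new": 1.25}                        # perf-stat.overall.cpi
total_instructions = {"base": 7.617e13, "new": 7.17e13}  # perf-stat.total.instructions
workload = {"base": 1.156e9, "new": 1.108e9}             # will-it-scale.workload

# IPC is the reciprocal of CPI.
ipc = {k: 1.0 / v for k, v in cpi.items()}

# Assumed definition of path-length: instructions retired per operation.
instr_per_op = {k: total_instructions[k] / workload[k] for k in cpi}
path_change = (
    (instr_per_op["new"] - instr_per_op["base"]) / instr_per_op["base"] * 100.0
)

for k in ("base", "new"):
    print(f"{k}: ipc={ipc[k]:.2f} instructions/op={instr_per_op[k]:.0f}")
print(f"path-length change: {path_change:+.1f}%")
```

Under these assumptions, the operation count falls even though instructions per operation also fall slightly, which is consistent with the higher CPI reported for the new commit.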
***************************************************************************************************
lkp-csl-2ap2: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/futex4/will-it-scale/0x5002f01
commit:
d5c678aed5 ("x86/debug: Allow a single level of #DB recursion")
4facb95b7a ("x86/entry: Unbreak 32bit fast syscall")
d5c678aed5eddb94 4facb95b7adaf77e2da73aafb9b
---------------- ---------------------------
%stddev %change %stddev
\ | \
6661382 -3.5% 6425740 will-it-scale.per_thread_ops
1.066e+08 -3.5% 1.028e+08 will-it-scale.workload
673195 ± 12% -13.4% 582901 ± 5% sched_debug.cpu.max_idle_balance_cost.max
2863 ± 6% -8.7% 2615 ± 6% slabinfo.fsnotify_mark_connector.active_objs
2863 ± 6% -8.7% 2615 ± 6% slabinfo.fsnotify_mark_connector.num_objs
24713 ± 21% -30.5% 17177 ± 9% numa-meminfo.node2.KReclaimable
24713 ± 21% -30.5% 17177 ± 9% numa-meminfo.node2.SReclaimable
22491 ± 29% -31.4% 15432 ± 2% numa-meminfo.node3.Shmem
423370 ± 30% +49.6% 633375 ± 17% numa-vmstat.node0.numa_local
6178 ± 21% -30.5% 4293 ± 9% numa-vmstat.node2.nr_slab_reclaimable
504182 ± 22% -33.3% 336099 ± 15% numa-vmstat.node2.numa_local
63782 ± 57% +93.7% 123540 numa-vmstat.node2.numa_other
5616 ± 29% -31.4% 3851 ± 2% numa-vmstat.node3.nr_shmem
3728 ± 2% +428.9% 19720 ± 83% softirqs.CPU12.SCHED
11536 ± 8% +18.7% 13692 ± 9% softirqs.CPU141.RCU
39530 ± 3% +5.0% 41516 ± 4% softirqs.CPU58.SCHED
39627 ± 4% +5.1% 41636 ± 4% softirqs.CPU64.SCHED
11762 ± 6% +12.9% 13281 ± 3% softirqs.CPU94.RCU
2858 +103.4% 5813 ± 50% interrupts.CPU108.NMI:Non-maskable_interrupts
2858 +103.4% 5813 ± 50% interrupts.CPU108.PMI:Performance_monitoring_interrupts
65.00 ± 45% -88.5% 7.50 ± 55% interrupts.CPU112.RES:Rescheduling_interrupts
805.25 +67.5% 1348 ± 54% interrupts.CPU16.CAL:Function_call_interrupts
39.50 ±120% -85.4% 5.75 ± 39% interrupts.CPU2.RES:Rescheduling_interrupts
1404 ± 27% -42.9% 801.00 interrupts.CPU51.CAL:Function_call_interrupts
800.50 +39.6% 1117 ± 30% interrupts.CPU71.CAL:Function_call_interrupts
946.00 ± 25% +47.1% 1391 ± 44% interrupts.CPU73.CAL:Function_call_interrupts
1423 ± 3% -16.8% 1184 ± 19% interrupts.CPU8.CAL:Function_call_interrupts
895.25 ± 10% -10.8% 798.75 interrupts.CPU84.CAL:Function_call_interrupts
1004 ± 23% -20.5% 799.25 interrupts.CPU87.CAL:Function_call_interrupts
1024 ± 22% -21.9% 799.25 interrupts.CPU94.CAL:Function_call_interrupts
8.836e+09 -3.3% 8.541e+09 perf-stat.i.branch-instructions
0.95 +5.2% 1.00 perf-stat.i.cpi
1.571e+10 -4.8% 1.497e+10 perf-stat.i.dTLB-loads
1.192e+10 -5.2% 1.13e+10 perf-stat.i.dTLB-stores
56842974 -2.4% 55486396 perf-stat.i.iTLB-load-misses
5.901e+10 -4.1% 5.66e+10 perf-stat.i.instructions
1.06 -5.0% 1.00 perf-stat.i.ipc
190.07 -4.5% 181.55 perf-stat.i.metric.M/sec
40174 -14.6% 34309 ± 21% perf-stat.i.node-loads
0.95 +5.2% 1.00 perf-stat.overall.cpi
1.06 -5.0% 1.00 perf-stat.overall.ipc
8.806e+09 -3.3% 8.513e+09 perf-stat.ps.branch-instructions
1.566e+10 -4.8% 1.491e+10 perf-stat.ps.dTLB-loads
1.188e+10 -5.2% 1.126e+10 perf-stat.ps.dTLB-stores
56650485 -2.4% 55298787 perf-stat.ps.iTLB-load-misses
5.881e+10 -4.1% 5.641e+10 perf-stat.ps.instructions
40073 -14.5% 34257 ± 21% perf-stat.ps.node-loads
1.778e+13 -4.2% 1.704e+13 perf-stat.total.instructions
3.52 ± 9% -0.9 2.66 perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
1.11 ± 7% +0.2 1.35 ± 15% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
1.13 ± 7% +0.2 1.38 ± 16% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
1.14 ± 7% +0.2 1.39 ± 16% perf-profile.calltrace.cycles-pp.asm_call_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
1.75 ± 5% +0.5 2.22 ± 13% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
38.04 ± 8% +2.8 40.87 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
33.65 ± 8% +3.1 36.76 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
3.52 ± 9% -0.7 2.80 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.15 ± 14% +0.0 0.18 ± 6% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.16 ± 27% +0.2 0.35 ± 42% perf-profile.children.cycles-pp.tick_irq_enter
0.71 ± 9% +0.2 0.92 ± 26% perf-profile.children.cycles-pp.__hrtimer_run_queues
1.29 ± 8% +0.3 1.59 ± 12% perf-profile.children.cycles-pp.hrtimer_interrupt
1.31 ± 7% +0.3 1.63 ± 12% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
1.53 ± 6% +0.4 1.88 ± 14% perf-profile.children.cycles-pp.asm_call_on_stack
1.95 ± 6% +0.5 2.50 ± 12% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
38.11 ± 8% +2.8 40.96 perf-profile.children.cycles-pp.do_syscall_64
33.80 ± 8% +3.1 36.89 perf-profile.children.cycles-pp.__x64_sys_futex
3.46 ± 9% -0.7 2.73 perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.00 +0.1 0.05 perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
0.11 ± 11% +0.1 0.18 ± 23% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
1.21 ± 9% +0.4 1.62 perf-profile.self.cycles-pp.do_futex
0.58 ± 11% +0.4 1.02 ± 3% perf-profile.self.cycles-pp.do_syscall_64
1.60 ± 9% +2.0 3.57 perf-profile.self.cycles-pp.__x64_sys_futex
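Note that the change column of the perf-profile rows carries no % sign: it reads as an absolute difference in percentage points of cycles, not a relative change. A minimal sketch that recomputes it for three self-cycle rows of the futex4 table follows; the helper name `point_delta` is illustrative only.

```python
def point_delta(base_pct: float, new_pct: float) -> float:
    """Absolute difference in percentage points, rounded as in the report."""
    return round(new_pct - base_pct, 1)


# (base %, new %) copied from the perf-profile.self.cycles-pp rows for futex4.
self_cycles = {
    "syscall_enter_from_user_mode": (3.46, 2.73),
    "do_syscall_64": (0.58, 1.02),
    "__x64_sys_futex": (1.60, 3.57),
}

deltas = {name: point_delta(base, new) for name, (base, new) in self_cycles.items()}
for name, delta in deltas.items():
    print(f"{delta:+.1f}  {name}")
```

The recomputed deltas match the -0.7, +0.4 and +2.0 shown in the table for these symbols.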
***************************************************************************************************
lkp-csl-2ap2: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/poll1/will-it-scale/0x5002f01
commit:
d5c678aed5 ("x86/debug: Allow a single level of #DB recursion")
4facb95b7a ("x86/entry: Unbreak 32bit fast syscall")
d5c678aed5eddb94 4facb95b7adaf77e2da73aafb9b
---------------- ---------------------------
%stddev %change %stddev
\ | \
6604008 -3.1% 6400612 will-it-scale.per_process_ops
1.057e+08 -3.1% 1.024e+08 will-it-scale.workload
16484 -36.2% 10522 ± 28% numa-meminfo.node0.Mapped
180979 ± 30% -67.3% 59090 ± 80% numa-numastat.node3.local_node
196599 ± 21% -54.1% 90155 ± 53% numa-numastat.node3.numa_hit
545463 ± 19% -26.2% 402373 ± 7% numa-vmstat.node3.numa_hit
471870 ± 23% -39.0% 287855 ± 15% numa-vmstat.node3.numa_local
935.25 ± 7% -25.5% 696.38 ± 15% slabinfo.skbuff_fclone_cache.active_objs
935.25 ± 7% -25.5% 696.38 ± 15% slabinfo.skbuff_fclone_cache.num_objs
1055 ± 24% -23.3% 809.00 interrupts.CPU161.CAL:Function_call_interrupts
1549 ± 49% -39.3% 941.25 ± 37% interrupts.CPU5.CAL:Function_call_interrupts
1432 ± 55% -40.6% 850.12 ± 12% interrupts.CPU75.CAL:Function_call_interrupts
11812 ± 8% +13.9% 13457 ± 7% softirqs.CPU87.RCU
11812 ± 8% +11.7% 13190 ± 4% softirqs.CPU94.RCU
32469 ± 5% +7.1% 34781 ± 4% softirqs.TIMER
3.23 ± 9% -0.9 2.30 ± 9% perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
0.75 ± 10% +0.1 0.89 ± 10% perf-profile.calltrace.cycles-pp.___might_sleep.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll
3.24 ± 9% -0.8 2.43 ± 9% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.81 ± 10% +0.1 0.95 ± 10% perf-profile.children.cycles-pp.___might_sleep
3.18 ± 9% -0.8 2.38 ± 9% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.80 ± 9% +0.1 0.93 ± 9% perf-profile.self.cycles-pp.___might_sleep
0.70 ± 10% +0.4 1.12 ± 9% perf-profile.self.cycles-pp.do_syscall_64
1.71 ± 10% +1.8 3.49 ± 9% perf-profile.self.cycles-pp.__x64_sys_poll
1.289e+10 -3.1% 1.249e+10 perf-stat.i.branch-instructions
28.40 ± 11% -8.9 19.50 ± 43% perf-stat.i.cache-miss-rate%
6416617 ± 17% -23.0% 4943945 ± 6% perf-stat.i.cache-misses
0.85 +3.2% 0.88 perf-stat.i.cpi
9079 ± 16% +26.4% 11476 ± 8% perf-stat.i.cycles-between-cache-misses
1.754e+10 -3.8% 1.688e+10 perf-stat.i.dTLB-loads
1.319e+10 -4.6% 1.258e+10 perf-stat.i.dTLB-stores
6.64e+10 -3.7% 6.391e+10 perf-stat.i.instructions
1229 -4.7% 1171 perf-stat.i.instructions-per-iTLB-miss
1.18 -3.1% 1.14 perf-stat.i.ipc
227.32 -3.8% 218.71 perf-stat.i.metric.M/sec
62934 ± 32% -44.6% 34876 ± 22% perf-stat.i.node-loads
28.25 ± 11% -8.8 19.43 ± 43% perf-stat.overall.cache-miss-rate%
0.85 +3.2% 0.88 perf-stat.overall.cpi
9029 ± 15% +26.1% 11383 ± 8% perf-stat.overall.cycles-between-cache-misses
1227 -4.8% 1168 perf-stat.overall.instructions-per-iTLB-miss
1.18 -3.1% 1.14 perf-stat.overall.ipc
1.285e+10 -3.1% 1.244e+10 perf-stat.ps.branch-instructions
6396063 ± 17% -23.0% 4927357 ± 6% perf-stat.ps.cache-misses
1.748e+10 -3.8% 1.682e+10 perf-stat.ps.dTLB-loads
1.315e+10 -4.6% 1.254e+10 perf-stat.ps.dTLB-stores
6.618e+10 -3.7% 6.37e+10 perf-stat.ps.instructions
62806 ± 32% -44.6% 34787 ± 22% perf-stat.ps.node-loads
1.997e+13 -3.8% 1.922e+13 perf-stat.total.instructions
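For anyone post-processing these comparison tables, each row follows the layout "base [± stddev%] change new [± stddev%] metric". The sketch below parses one such row with a regular expression. It is not the lkp-tests parser; the names `Row` and `parse_row` are hypothetical, and it assumes each row sits on a single line with the metric name free of whitespace.

```python
import re
from typing import NamedTuple, Optional

NUM = r"\d+(?:\.\d+)?(?:e[+-]?\d+)?"
ROW = re.compile(
    rf"^\s*(?P<base>{NUM})"
    rf"(?:\s*±\s*(?P<base_sd>\d+)%)?"       # optional stddev of the base column
    rf"\s+(?P<change>[+-]{NUM})(?P<pct>%?)"  # signed change, with or without %
    rf"\s+(?P<new>{NUM})"
    rf"(?:\s*±\s*(?P<new_sd>\d+)%)?"        # optional stddev of the new column
    rf"\s+(?P<metric>\S+)\s*$"
)


class Row(NamedTuple):
    base: float
    base_stddev: Optional[int]
    change: float
    is_percent: bool  # False for perf-profile style percentage-point deltas
    new: float
    new_stddev: Optional[int]
    metric: str


def parse_row(line: str) -> Optional[Row]:
    """Parse one comparison row; return None if the line is not a data row."""
    m = ROW.match(line)
    if m is None:
        return None
    return Row(
        base=float(m["base"]),
        base_stddev=int(m["base_sd"]) if m["base_sd"] else None,
        change=float(m["change"]),
        is_percent=m["pct"] == "%",
        new=float(m["new"]),
        new_stddev=int(m["new_sd"]) if m["new_sd"] else None,
        metric=m["metric"],
    )


row = parse_row("6604008 -3.1% 6400612 will-it-scale.per_process_ops")
print(row)
```

Header lines such as "commit:" or the separator rows do not match the pattern and return None, so the function can be applied to every line of the report.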
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen