Re: [LKP] [lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by Rafael J. Wysocki
On Friday, May 06, 2016 03:06:44 PM Wanpeng Li wrote:
> 2016-05-05 21:46 GMT+08:00 Rafael J. Wysocki <rjw(a)rjwysocki.net>:
> > On Thursday, May 05, 2016 01:05:41 PM Wanpeng Li wrote:
> >> 2016-05-03 23:10 GMT+08:00 Rafael J. Wysocki <rjw(a)rjwysocki.net>:
> >> > On Tuesday, May 03, 2016 03:53:12 PM Rafael J. Wysocki wrote:
[cut]
> >>
> >> I can reproduce the bug and your patch fix it.
> >>
> >> Tested-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
> >
> > Thanks!
> >
> > Please also try this one:
> >
> > https://patchwork.kernel.org/patch/9012861/
> >
> > which is the final fix for this bug.
>
> The warning disappear.
>
> Tested-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Thank you!
4 years, 11 months
[i2c-dev] c5ba7c7346: BUG kmalloc-192 (Not tainted): Poison overwritten
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux Erico-Nunes/i2c-dev-switch-from-register_chrdev-to-cdev-API/20160504-025424
commit c5ba7c7346715117dc54a5cee9c8bf2a0e1af8ff
Author: Erico Nunes <erico.nunes(a)datacom.ind.br>
AuthorDate: Tue May 3 15:45:43 2016 -0300
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Wed May 4 02:54:27 2016 +0800
i2c-dev: switch from register_chrdev to cdev API
i2c-dev had never moved away from the older register_chrdev interface to
implement its char device registration. The register_chrdev API has the
limitation of enabling only up to 256 i2c-dev busses to exist.
Large platforms with lots of i2c devices (i.e. pluggable transceivers)
with dedicated busses may have to exceed that limit.
In particular, there are also platforms making use of the i2c bus
multiplexing API, which instantiates a virtual bus for each possible
multiplexed selection.
This patch removes the register_chrdev usage and replaces it with the
less old cdev API, which takes away the 256 i2c-dev bus limitation.
It should not have any other impact for i2c bus drivers or user space.
This patch has been tested on qemu x86 and qemu powerpc platforms with
the aid of a module which adds and removes 5000 virtual i2c busses, as
well as validated on an existing powerpc hardware platform which makes
use of the i2c bus multiplexing API.
i2c-dev busses with device minor numbers larger than 256 have also been
validated to work with the existing i2c-tools.
Signed-off-by: Erico Nunes <erico.nunes(a)datacom.ind.br>
+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+-----------------+
| | b3b8592253 | c5ba7c7346 | v4.6-rc6_050404 |
+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+-----------------+
| boot_successes | 63 | 0 | 0 |
| boot_failures | 0 | 22 | 24 |
| BUG_kmalloc-#(Not_tainted):Poison_overwritten | 0 | 22 | 24 |
| INFO:#-#.First_byte#instead_of | 0 | 22 | 24 |
| INFO:Allocated_in_i2cdev_attach_adapter_age=#cpu=#pid= | 0 | 22 | 24 |
| INFO:Freed_in_return_i2c_dev_age=#cpu=#pid= | 0 | 22 | 24 |
| INFO:Slab#objects=#used=#fp=0x(null)flags= | 0 | 22 | 24 |
| INFO:Object#@offset=#fp= | 0 | 22 | 24 |
| backtrace:of_unittest | 0 | 22 | 24 |
| backtrace:kernel_init_freeable | 0 | 22 | 24 |
| BUG_kmalloc-#(Tainted:G_B):Poison_overwritten | 0 | 0 | 20 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_open | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_startup | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_change_pm | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_change_speed | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_close | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_shutdown | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_ioctl | 0 | 0 | 24 |
| WARNING:at_drivers/tty/serial/serial_core.c:#uart_set_termios | 0 | 0 | 8 |
| backtrace:do_sys_open | 0 | 0 | 24 |
| backtrace:SyS_open | 0 | 0 | 24 |
| backtrace:do_vfs_ioctl | 0 | 0 | 24 |
| backtrace:SyS_ioctl | 0 | 0 | 24 |
| backtrace:do_group_exit | 0 | 0 | 24 |
| backtrace:SyS_exit_group | 0 | 0 | 24 |
| WARNING:at_drvers/tty/serial/serial_core.c:#uart_open | 0 | 0 | 1 |
| WARNING:at/kbuid/src/smoke/drivers/tty/serial/serial_core.c:#uart_open | 0 | 0 | 1 |
| WARNING:CPU:#PID:#at_drivers/tty/suedervdi[#a93l]/:sfeairlialed_ctoo_erxeecu.tec':/#s_biun/amrodtp_roobpe'e_n'+0x/s8bian//mo0dxpr1ob1e | 0 | 0 | 1 |
| WARNING:at_ivers/tty/serial/serial_core.c:#uart_open | 0 | 0 | 3 |
| WARNING:CPU:#PID:#at/kbui | 0 | 0 | 1 |
| WARNING:at/kbuiudlevdd[/#s3]r:c/smoke/drivers/tty/serial/serial_core.c:#uart_open | 0 | 0 | 1 |
| WARNING:at_drers/tty/serial/seriuadelv_d[c24o0]r:efa.icl:#eud_atro_te_open | 0 | 0 | 1 |
+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+-----------------+
[ 20.391731] i2c i2c-0: Added multiplexed i2c bus 2
[ 20.401461] =============================================================================
[ 20.401461] =============================================================================
[ 20.404517] BUG kmalloc-192 (Not tainted): Poison overwritten
[ 20.404517] BUG kmalloc-192 (Not tainted): Poison overwritten
[ 20.406438] -----------------------------------------------------------------------------
[ 20.406438]
[ 20.406438] -----------------------------------------------------------------------------
[ 20.406438]
[ 20.409842] Disabling lock debugging due to kernel taint
[ 20.409842] Disabling lock debugging due to kernel taint
[ 20.411762] INFO: 0xffff880011d31500-0xffff880011d31500. First byte 0x6a instead of 0x6b
[ 20.411762] INFO: 0xffff880011d31500-0xffff880011d31500. First byte 0x6a instead of 0x6b
[ 20.414539] INFO: Allocated in i2cdev_attach_adapter+0x5f/0x14b age=6 cpu=0 pid=1
[ 20.414539] INFO: Allocated in i2cdev_attach_adapter+0x5f/0x14b age=6 cpu=0 pid=1
[ 20.441159] INFO: Freed in return_i2c_dev+0x30/0x33 age=1 cpu=0 pid=1
[ 20.441159] INFO: Freed in return_i2c_dev+0x30/0x33 age=1 cpu=0 pid=1
[ 20.466035] INFO: Slab 0xffffea0000474c00 objects=15 used=15 fp=0x (null) flags=0x4080
[ 20.466035] INFO: Slab 0xffffea0000474c00 objects=15 used=15 fp=0x (null) flags=0x4080
[ 20.469035] INFO: Object 0xffff880011d314a8 @offset=5288 fp=0xffff880011d31088
[ 20.469035]
[ 20.469035] INFO: Object 0xffff880011d314a8 @offset=5288 fp=0xffff880011d31088
[ 20.469035]
[ 20.472090] Redzone ffff880011d314a0: bb bb bb bb bb bb bb bb ........
[ 20.472090] Redzone ffff880011d314a0: bb bb bb bb bb bb bb bb ........
[ 20.475128] Object ffff880011d314a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 20.475128] Object ffff880011d314a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 20.478405] Object ffff880011d314b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
git bisect start 148e07d104847a48b2b3fb79907ddaa885de231d 04974df8049fc4240d22759a91e035082ccd18b4 --
git bisect bad 3b68703e2a76a6c6a42d2b12063a17386f6ba32e # 13:53 0- 22 Merge 'gpio/devel' into devel-hourly-2016050404
git bisect bad 25d399026b624f0012e2cf6c84b1d045be0c2e07 # 13:53 0- 22 Merge 'leon-rdma/topic/fix-core' into devel-hourly-2016050404
git bisect bad d6faeaa0b859aef4dbb20a3c668a005f29211935 # 13:54 0- 22 Merge 'peterz-queue/locking/rfc' into devel-hourly-2016050404
git bisect good cbac98bb2d98de979c26784466af6b00f3cb76b6 # 13:54 22+ 0 Merge 'linux-review/Anders-Darander/leds-pca963x-add-bindings-to-invert-polarity/20160502-174457' into devel-hourly-2016050404
git bisect good 32609969fdd7595c0c27f98e72232ce3b239e3a2 # 13:54 22+ 0 Merge 'slave-dma/topic/dw' into devel-hourly-2016050404
git bisect good f679c2dadbd2679666b9e11bab651f98d550d800 # 14:24 22+ 0 Merge 'ia64/next' into devel-hourly-2016050404
git bisect bad 2750e147f76fb466a0fce94dad4a883920f08e1d # 14:33 0- 15 Merge 'linux-review/Erico-Nunes/i2c-dev-switch-from-register_chrdev-to-cdev-API/20160504-025424' into devel-hourly-2016050404
git bisect good a464347ea54e00f8d334ac86403ba2e9fb0c0d1e # 14:37 22+ 0 Merge 'pci/for-linus' into devel-hourly-2016050404
git bisect good 7d83118a57e296a11bb94f636b73c629e5040b9c # 14:41 22+ 0 Merge 'linux-review/Jani-Nikula/man-add-gitignore-with-defs-rst/20160503-223853' into devel-hourly-2016050404
git bisect good e00fed40f48e43bdb5e018156d077c65b61f93bf # 14:49 22+ 0 [media] m88ds3103: convert to use an explicit i2c mux core
git bisect good 8f8edd491ac64c3cfe4e8ffff2b7385df5c02860 # 14:56 21+ 0 i2c: s3c2410: Print errno code in error logs
git bisect good f3a36fbdd359608bf73d674533cc419bf7e65aae # 15:00 22+ 1 i2c: mv64xxx: remove CONFIG_HAVE_CLK conditionals
git bisect good 408c91ad9d335caccb233049a9d38ea3f86cb2c6 # 15:05 22+ 0 Merge branch 'i2c/for-4.7' into i2c/for-next
git bisect good b3b85922534861445a24f48f1d6125a99c6f3aa2 # 15:15 20+ 0 Merge branch 'i2c/for-4.7' into i2c/for-next
git bisect bad c5ba7c7346715117dc54a5cee9c8bf2a0e1af8ff # 15:18 0- 17 i2c-dev: switch from register_chrdev to cdev API
# first bad commit: [c5ba7c7346715117dc54a5cee9c8bf2a0e1af8ff] i2c-dev: switch from register_chrdev to cdev API
git bisect good b3b85922534861445a24f48f1d6125a99c6f3aa2 # 15:22 63+ 0 Merge branch 'i2c/for-4.7' into i2c/for-next
# extra tests with DEBUG_INFO
git bisect bad c5ba7c7346715117dc54a5cee9c8bf2a0e1af8ff # 15:28 0- 25 i2c-dev: switch from register_chrdev to cdev API
# extra tests on HEAD of linux-devel/devel-hourly-2016050404
git bisect bad 148e07d104847a48b2b3fb79907ddaa885de231d # 15:28 0- 24 0day head guard for 'devel-hourly-2016050404'
# extra tests on tree/branch linux-review/Erico-Nunes/i2c-dev-switch-from-register_chrdev-to-cdev-API/20160504-025424
git bisect bad c5ba7c7346715117dc54a5cee9c8bf2a0e1af8ff # 15:31 0- 22 i2c-dev: switch from register_chrdev to cdev API
# extra tests with first bad commit reverted
git bisect good 5961e7224ab939e37d9e929aeefa792bf3466dd0 # 15:41 66+ 0 Revert "i2c-dev: switch from register_chrdev to cdev API"
# extra tests on tree/branch linus/master
git bisect good 9caa7e78481f17fb6ff77dfaca774998e7440430 # 15:47 64+ 0 Merge branch 'akpm' (patches from Andrew)
# extra tests on tree/branch linux-next/master
git bisect good 41b4d7e361f4a0a24cc5ad89a9555c311cfa774c # 15:53 66+ 1 Add linux-next specific files for 20160506
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
initrd=quantal-core-x86_64.cgz
wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu kvm64
-kernel $kernel
-initrd $initrd
-m 300
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
4 years, 11 months
[lkp] [mm, thp] 409ca714ac: INFO: possible circular locking dependency detected
by kernel test robot
FYI, we noticed the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 409ca714ac58768342cd39ca79c16f51e1825b3e ("mm, thp: avoid unnecessary swapin in khugepaged")
on test machine: vm-kbuild-1G: 2 threads qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap with 1G memory
caused below changes:
[ 21.116124] ======================================================
[ 21.116124] ======================================================
[ 21.116124] [ INFO: possible circular locking dependency detected ]
[ 21.116124] [ INFO: possible circular locking dependency detected ]
[ 21.116127] 4.6.0-rc5-00302-g409ca71 #1 Not tainted
[ 21.116127] 4.6.0-rc5-00302-g409ca71 #1 Not tainted
[ 21.116127] -------------------------------------------------------
[ 21.116127] -------------------------------------------------------
[ 21.116128] udevadm/221 is trying to acquire lock:
[ 21.116128] udevadm/221 is trying to acquire lock:
[ 21.116138] (&mm->mmap_sem){++++++}, at: [<ffffffff81262543>] __might_fault+0x83/0x150
[ 21.116138] (&mm->mmap_sem){++++++}, at: [<ffffffff81262543>] __might_fault+0x83/0x150
[ 21.116138]
[ 21.116138] but task is already holding lock:
[ 21.116138]
[ 21.116138] but task is already holding lock:
[ 21.116144] (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
[ 21.116144] (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
[ 21.116144]
[ 21.116144] which lock already depends on the new lock.
[ 21.116144]
[ 21.116144]
[ 21.116144] which lock already depends on the new lock.
[ 21.116144]
[ 21.116145]
[ 21.116145] the existing dependency chain (in reverse order) is:
[ 21.116145]
[ 21.116145] the existing dependency chain (in reverse order) is:
[ 21.116148]
[ 21.116148] -> #2 (s_active#12){++++.+}:
[ 21.116148]
[ 21.116148] -> #2 (s_active#12){++++.+}:
[ 21.116152] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116152] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116155] [<ffffffff8132f50a>] __kernfs_remove+0x2da/0x410
[ 21.116155] [<ffffffff8132f50a>] __kernfs_remove+0x2da/0x410
[ 21.116158] [<ffffffff81330630>] kernfs_remove_by_name_ns+0x40/0x90
[ 21.116158] [<ffffffff81330630>] kernfs_remove_by_name_ns+0x40/0x90
[ 21.116160] [<ffffffff813339fb>] sysfs_remove_file_ns+0x2b/0x70
[ 21.116160] [<ffffffff813339fb>] sysfs_remove_file_ns+0x2b/0x70
[ 21.116164] [<ffffffff81ba8a16>] device_del+0x166/0x320
[ 21.116164] [<ffffffff81ba8a16>] device_del+0x166/0x320
[ 21.116166] [<ffffffff81ba943c>] device_destroy+0x3c/0x50
[ 21.116166] [<ffffffff81ba943c>] device_destroy+0x3c/0x50
[ 21.116170] [<ffffffff8105aa61>] cpuid_class_cpu_callback+0x51/0x70
[ 21.116170] [<ffffffff8105aa61>] cpuid_class_cpu_callback+0x51/0x70
[ 21.116173] [<ffffffff81131ce9>] notifier_call_chain+0x59/0x190
[ 21.116173] [<ffffffff81131ce9>] notifier_call_chain+0x59/0x190
[ 21.116177] [<ffffffff81132749>] __raw_notifier_call_chain+0x9/0x10
[ 21.116177] [<ffffffff81132749>] __raw_notifier_call_chain+0x9/0x10
[ 21.116180] [<ffffffff810fe6b0>] __cpu_notify+0x40/0x90
[ 21.116180] [<ffffffff810fe6b0>] __cpu_notify+0x40/0x90
[ 21.116182] [<ffffffff810fe890>] cpu_notify_nofail+0x10/0x30
[ 21.116182] [<ffffffff810fe890>] cpu_notify_nofail+0x10/0x30
[ 21.116185] [<ffffffff810fe8d7>] notify_dead+0x27/0x1e0
[ 21.116185] [<ffffffff810fe8d7>] notify_dead+0x27/0x1e0
[ 21.116187] [<ffffffff810fe273>] cpuhp_down_callbacks+0x93/0x190
[ 21.116187] [<ffffffff810fe273>] cpuhp_down_callbacks+0x93/0x190
[ 21.116192] [<ffffffff82096062>] _cpu_down+0xc2/0x1e0
[ 21.116192] [<ffffffff82096062>] _cpu_down+0xc2/0x1e0
[ 21.116194] [<ffffffff810ff727>] do_cpu_down+0x37/0x50
[ 21.116194] [<ffffffff810ff727>] do_cpu_down+0x37/0x50
[ 21.116197] [<ffffffff8110003b>] cpu_down+0xb/0x10
[ 21.116197] [<ffffffff8110003b>] cpu_down+0xb/0x10
[ 21.116201] [<ffffffff81038e4d>] _debug_hotplug_cpu+0x7d/0xd0
[ 21.116201] [<ffffffff81038e4d>] _debug_hotplug_cpu+0x7d/0xd0
[ 21.116205] [<ffffffff8435d6bb>] debug_hotplug_cpu+0xd/0x11
[ 21.116205] [<ffffffff8435d6bb>] debug_hotplug_cpu+0xd/0x11
[ 21.116208] [<ffffffff84352426>] do_one_initcall+0x138/0x1cf
[ 21.116208] [<ffffffff84352426>] do_one_initcall+0x138/0x1cf
[ 21.116211] [<ffffffff8435270a>] kernel_init_freeable+0x24d/0x2de
[ 21.116211] [<ffffffff8435270a>] kernel_init_freeable+0x24d/0x2de
[ 21.116214] [<ffffffff8209533a>] kernel_init+0xa/0x120
[ 21.116214] [<ffffffff8209533a>] kernel_init+0xa/0x120
[ 21.116217] [<ffffffff820a7972>] ret_from_fork+0x22/0x50
[ 21.116217] [<ffffffff820a7972>] ret_from_fork+0x22/0x50
[ 21.116221]
[ 21.116221] -> #1 (cpu_hotplug.lock#2){+.+.+.}:
[ 21.116221]
[ 21.116221] -> #1 (cpu_hotplug.lock#2){+.+.+.}:
[ 21.116223] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116223] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116226] [<ffffffff820a20d1>] mutex_lock_nested+0x71/0x4c0
[ 21.116226] [<ffffffff820a20d1>] mutex_lock_nested+0x71/0x4c0
[ 21.116228] [<ffffffff810ff526>] get_online_cpus+0x66/0x80
[ 21.116228] [<ffffffff810ff526>] get_online_cpus+0x66/0x80
[ 21.116232] [<ffffffff81246fb3>] sum_vm_event+0x23/0x1b0
[ 21.116232] [<ffffffff81246fb3>] sum_vm_event+0x23/0x1b0
[ 21.116236] [<ffffffff81293768>] collapse_huge_page+0x118/0x10b0
[ 21.116236] [<ffffffff81293768>] collapse_huge_page+0x118/0x10b0
[ 21.116238] [<ffffffff81294c5d>] khugepaged+0x55d/0xe80
[ 21.116238] [<ffffffff81294c5d>] khugepaged+0x55d/0xe80
[ 21.116240] [<ffffffff81130304>] kthread+0x134/0x1a0
[ 21.116240] [<ffffffff81130304>] kthread+0x134/0x1a0
[ 21.116242] [<ffffffff820a7972>] ret_from_fork+0x22/0x50
[ 21.116242] [<ffffffff820a7972>] ret_from_fork+0x22/0x50
[ 21.116244]
[ 21.116244] -> #0 (&mm->mmap_sem){++++++}:
[ 21.116244]
[ 21.116244] -> #0 (&mm->mmap_sem){++++++}:
[ 21.116246] [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
[ 21.116246] [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
[ 21.116248] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116248] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116251] [<ffffffff8126257e>] __might_fault+0xbe/0x150
[ 21.116251] [<ffffffff8126257e>] __might_fault+0xbe/0x150
[ 21.116253] [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
[ 21.116253] [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
[ 21.116256] [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
[ 21.116256] [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
[ 21.116258] [<ffffffff812a8d3a>] vfs_write+0xda/0x240
[ 21.116258] [<ffffffff812a8d3a>] vfs_write+0xda/0x240
[ 21.116260] [<ffffffff812a8f84>] SyS_write+0x44/0xa0
[ 21.116260] [<ffffffff812a8f84>] SyS_write+0x44/0xa0
[ 21.116263] [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 21.116263] [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 21.116264]
[ 21.116264] other info that might help us debug this:
[ 21.116264]
[ 21.116264]
[ 21.116264] other info that might help us debug this:
[ 21.116264]
[ 21.116268] Chain exists of:
[ 21.116268] &mm->mmap_sem --> cpu_hotplug.lock#2 --> s_active#12
[ 21.116268]
[ 21.116268] Chain exists of:
[ 21.116268] &mm->mmap_sem --> cpu_hotplug.lock#2 --> s_active#12
[ 21.116268]
[ 21.116268] Possible unsafe locking scenario:
[ 21.116268]
[ 21.116268] Possible unsafe locking scenario:
[ 21.116268]
[ 21.116269] CPU0 CPU1
[ 21.116269] CPU0 CPU1
[ 21.116269] ---- ----
[ 21.116269] ---- ----
[ 21.116270] lock(s_active#12);
[ 21.116270] lock(s_active#12);
[ 21.116271] lock(cpu_hotplug.lock#2);
[ 21.116271] lock(cpu_hotplug.lock#2);
[ 21.116272] lock(s_active#12);
[ 21.116272] lock(s_active#12);
[ 21.116273] lock(&mm->mmap_sem);
[ 21.116273] lock(&mm->mmap_sem);
[ 21.116274]
[ 21.116274] *** DEADLOCK ***
[ 21.116274]
[ 21.116274]
[ 21.116274] *** DEADLOCK ***
[ 21.116274]
[ 21.116274] 3 locks held by udevadm/221:
[ 21.116274] 3 locks held by udevadm/221:
[ 21.116278] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff812ad64d>] __sb_start_write+0x6d/0x120
[ 21.116278] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff812ad64d>] __sb_start_write+0x6d/0x120
[ 21.116280] #1: (&of->mutex){+.+.+.}, at: [<ffffffff813315e6>] kernfs_fop_write+0x86/0x250
[ 21.116280] #1: (&of->mutex){+.+.+.}, at: [<ffffffff813315e6>] kernfs_fop_write+0x86/0x250
[ 21.116282] #2: (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
[ 21.116282] #2: (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
[ 21.116283]
[ 21.116283] stack backtrace:
[ 21.116283]
[ 21.116283] stack backtrace:
[ 21.116284] CPU: 1 PID: 221 Comm: udevadm Not tainted 4.6.0-rc5-00302-g409ca71 #1
[ 21.116284] CPU: 1 PID: 221 Comm: udevadm Not tainted 4.6.0-rc5-00302-g409ca71 #1
[ 21.116287] ffff88003f698000 ffff88003f077bf0 ffffffff81444ef3 0000000000000011
[ 21.116287] ffff88003f698000 ffff88003f077bf0 ffffffff81444ef3 0000000000000011
[ 21.116288] ffffffff84bdd8f0 ffffffff84bf2630 ffff88003f077c40 ffffffff81173e91
[ 21.116288] ffffffff84bdd8f0 ffffffff84bf2630 ffff88003f077c40 ffffffff81173e91
[ 21.116290] 0000000000000000 ffffffff84fbdbc0 00ff88003f077c40 ffff88003f698bb8
[ 21.116290] 0000000000000000 ffffffff84fbdbc0 00ff88003f077c40 ffff88003f698bb8
[ 21.116290] Call Trace:
[ 21.116290] Call Trace:
[ 21.116293] [<ffffffff81444ef3>] dump_stack+0x86/0xd3
[ 21.116293] [<ffffffff81444ef3>] dump_stack+0x86/0xd3
[ 21.116294] [<ffffffff81173e91>] print_circular_bug+0x221/0x360
[ 21.116294] [<ffffffff81173e91>] print_circular_bug+0x221/0x360
[ 21.116296] [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
[ 21.116296] [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
[ 21.116297] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116297] [<ffffffff8117da2c>] lock_acquire+0xac/0x180
[ 21.116299] [<ffffffff81262543>] ? __might_fault+0x83/0x150
[ 21.116299] [<ffffffff81262543>] ? __might_fault+0x83/0x150
[ 21.116300] [<ffffffff8126257e>] __might_fault+0xbe/0x150
[ 21.116300] [<ffffffff8126257e>] __might_fault+0xbe/0x150
[ 21.116302] [<ffffffff81262543>] ? __might_fault+0x83/0x150
[ 21.116302] [<ffffffff81262543>] ? __might_fault+0x83/0x150
[ 21.116303] [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
[ 21.116303] [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
[ 21.116304] [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
[ 21.116304] [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
[ 21.116306] [<ffffffff8116fe0d>] ? update_fast_ctr+0x1d/0x80
[ 21.116306] [<ffffffff8116fe0d>] ? update_fast_ctr+0x1d/0x80
[ 21.116308] [<ffffffff8116ffe7>] ? percpu_down_read+0x57/0xa0
[ 21.116308] [<ffffffff8116ffe7>] ? percpu_down_read+0x57/0xa0
[ 21.116310] [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
[ 21.116310] [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
[ 21.116311] [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
[ 21.116311] [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
[ 21.116312] [<ffffffff812a8d3a>] vfs_write+0xda/0x240
[ 21.116312] [<ffffffff812a8d3a>] vfs_write+0xda/0x240
[ 21.116314] [<ffffffff812a8f84>] SyS_write+0x44/0xa0
[ 21.116314] [<ffffffff812a8f84>] SyS_write+0x44/0xa0
[ 21.116315] [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 21.116315] [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
FYI, raw QEMU command line is:
qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel /pkg/linux/x86_64-randconfig-s5-04281751/gcc-5/409ca714ac58768342cd39ca79c16f51e1825b3e/vmlinuz-4.6.0-rc5-00302-g409ca71 -append 'root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-kbuild-1G-1/bisect_boot-1-debian-x86_64-2015-02-07.cgz-x86_64-randconfig-s5-04281751-409ca714ac58768342cd39ca79c16f51e1825b3e-20160429-79826-1u5edix-0.yaml ARCH=x86_64 kconfig=x86_64-randconfig-s5-04281751 branch=linux-devel/devel-catchup-201604281803 commit=409ca714ac58768342cd39ca79c16f51e1825b3e BOOT_IMAGE=/pkg/linux/x86_64-randconfig-s5-04281751/gcc-5/409ca714ac58768342cd39ca79c16f51e1825b3e/vmlinuz-4.6.0-rc5-00302-g409ca71 max_uptime=600 RESULT_ROOT=/result/boot/1/vm-kbuild-1G/debian-x86_64-2015-02-07.cgz/x86_64-randconfig-s5-04281751/gcc-5/409ca714ac58768342cd39ca79c16f51e1825b3e/0 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw ip=::::vm-kbuild-1G-1::dhcp' -initrd /fs/sdb1/initrd-vm-kbuild-1G-1 -m 1024 -smp 2 -device e1000,netdev=net0 -netdev user,id=net0,hostfwd=tcp::23000-:22 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -device virtio-scsi-pci,id=scsi0 -drive file=/fs/sdb1/disk0-vm-kbuild-1G-1,if=none,id=hd0,media=disk,aio=native,cache=none -device scsi-hd,bus=scsi0.0,drive=hd0,scsi-id=1,lun=0 -drive file=/fs/sdb1/disk1-vm-kbuild-1G-1,if=none,id=hd1,media=disk,aio=native,cache=none -device scsi-hd,bus=scsi0.0,drive=hd1,scsi-id=1,lun=1 -drive file=/fs/sdb1/disk2-vm-kbuild-1G-1,if=none,id=hd2,media=disk,aio=native,cache=none -device scsi-hd,bus=scsi0.0,drive=hd2,scsi-id=1,lun=2 -drive file=/fs/sdb1/disk3-vm-kbuild-1G-1,if=none,id=hd3,media=disk,aio=native,cache=none -device scsi-hd,bus=scsi0.0,drive=hd3,scsi-id=1,lun=3 -drive file=/fs/sdb1/disk4-vm-kbuild-1G-1,if=none,id=hd4,media=disk,aio=native,cache=none -device scsi-hd,bus=scsi0.0,drive=hd4,scsi-id=1,lun=4 -pidfile /dev/shm/kboot/pid-vm-kbuild-1G-1 -serial file:/dev/shm/kboot/serial-vm-kbuild-1G-1 -daemonize -display none -monitor null
Thanks,
Xiaolong
4 years, 11 months
Re: [LKP] [lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by Rafael J. Wysocki
On Thursday, May 05, 2016 01:05:41 PM Wanpeng Li wrote:
> 2016-05-03 23:10 GMT+08:00 Rafael J. Wysocki <rjw(a)rjwysocki.net>:
> > On Tuesday, May 03, 2016 03:53:12 PM Rafael J. Wysocki wrote:
> >> On Tuesday, May 03, 2016 03:22:24 PM Rafael J. Wysocki wrote:
> >> > On Tue, May 3, 2016 at 2:58 PM, Rafael J. Wysocki <rafael(a)kernel.org> wrote:
> >> > > On Tue, May 3, 2016 at 2:54 PM, Rafael J. Wysocki <rafael(a)kernel.org> wrote:
> >> > >> On Tue, May 3, 2016 at 2:15 PM, Rafael J. Wysocki <rafael(a)kernel.org> wrote:
> >> > >>> On Tue, May 3, 2016 at 10:32 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> >> > >>>> On Tue, May 03, 2016 at 09:10:51AM +0800, kernel test robot wrote:
> >> > >>>>> FYI, we noticed the following commit:
> >> > >>>>>
> >> > >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
> >> > >>>>> commit 41e0d37f7ac81297c07ba311e4ad39465b8c8295 ("sched/fair: Do not call cpufreq hook unless util changed")
> >> > >>>>
> >> > >>>>
> >> > >>>>> [ 14.860950] Freeing unused kernel memory: 260K (ffff88103edbf000 - ffff88103ee00000)
> >> > >>>>> [ 14.873013] systemd[1]: RTC configured in localtime, applying delta of 480 minutes to system time.
> >> > >>>>> [ 14.884474] random: systemd urandom read with 5 bits of entropy available
> >> > >>>>> [ 14.903975] divide error: 0000 [#1] SMP
> >> > >>>>> [ 14.908375] Modules linked in:
> >> > >>>>> [ 14.911793] CPU: 39 PID: 1 Comm: systemd Not tainted 4.6.0-rc4-00016-g41e0d37 #1
> >> > >>>>> [ 14.920051] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
> >> > >>>>> [ 14.931509] task: ffff8810101d8000 ti: ffff88081ab20000 task.ti: ffff88081ab20000
> >> > >>>>> [ 14.939862] RIP: 0010:[<ffffffff8176ad32>] [<ffffffff8176ad32>] intel_pstate_get+0x32/0x40
> >> > >>>>> [ 14.949202] RSP: 0018:ffff88081ab23d70 EFLAGS: 00010006
> >> > >>>>> [ 14.955129] RAX: 0000000000000000 RBX: 0000000000000024 RCX: ffff8808091e0300
> >> > >>>>> [ 14.963094] RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000024
> >> > >>>>> [ 14.971057] RBP: ffff88081ab23d88 R08: 0000000000001000 R09: 00000000096a1000
> >> > >>>>> [ 14.979022] R10: 0000000000ffff10 R11: 000000000000000f R12: 0000000000000202
> >> > >>>>> [ 14.986984] R13: ffff88101390a040 R14: ffff88100e48e180 R15: ffff88101390a040
> >> > >>>>> [ 14.994950] FS: 00007f66fe117880(0000) GS:ffff8810139c0000(0000) knlGS:0000000000000000
> >> > >>>>> [ 15.003982] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > >>>>> [ 15.010393] CR2: 000055f78b760098 CR3: 000000103d759000 CR4: 00000000001406e0
> >> > >>>>> [ 15.018359] Stack:
> >> > >>>>> [ 15.020602] ffffffff81764dad 0000000000000024 ffff88100e48e180 ffff88081ab23dc8
> >> > >>>>> [ 15.028899] ffffffff81040267 ffff88101390a0ac 0000000000000340 ffff88081ab23f20
> >> > >>>>> [ 15.037197] ffff88103cd7c400 ffff88100e48e180 ffff88101390a040 ffff88081ab23e30
> >> > >>>>> [ 15.045493] Call Trace:
> >> > >>>>> [ 15.048223] [<ffffffff81764dad>] ? cpufreq_quick_get+0x3d/0x90
> >> > >>>>> [ 15.054832] [<ffffffff81040267>] show_cpuinfo+0x3c7/0x410
> >> > >>>>> [ 15.060956] [<ffffffff8121f5c4>] seq_read+0x2c4/0x3a0
> >> > >>>>> [ 15.066685] [<ffffffff81266ea8>] proc_reg_read+0x48/0x70
> >> > >>>>> [ 15.072713] [<ffffffff811f9d58>] __vfs_read+0x28/0xd0
> >> > >>>>> [ 15.078451] [<ffffffff813bab63>] ? security_file_permission+0xa3/0xc0
> >> > >>>>> [ 15.085737] [<ffffffff811faa97>] ? rw_verify_area+0x57/0xd0
> >> > >>>>> [ 15.092054] [<ffffffff811fab96>] vfs_read+0x86/0x130
> >> > >>>>> [ 15.097691] [<ffffffff811fbf96>] SyS_read+0x46/0xa0
> >> > >>>>> [ 15.103234] [<ffffffff818f71b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> >> > >>>>> [ 15.110421] Code: 05 dc 1b c3 00 89 ff 55 48 89 e5 48 8b 0c f8 48 85 c9 74 1f 48 63 51 1c 48 63 41 20 5d 48 0f af c2 31 d2 48 0f af 81 88 00 00 00 <48> f7 b1 90 00 00 00 c3 31 c0 5d c3 66 90 0f 1f 44 00 00 8b 77
> >> > >>>>> [ 15.132161] RIP [<ffffffff8176ad32>] intel_pstate_get+0x32/0x40
> >> > >>>>> [ 15.138875] RSP <ffff88081ab23d70>
> >> > >>>>> [ 15.142770] ---[ end trace e5d5a8bedf5502e1 ]---
> >> > >>>>> [ 15.149323] Kernel panic - not syncing: Fatal exception
> >> > >>>>>
> >> > >>>>
> >> > >>>> That's intel_pstate.c:get_avg_frequency(), which assumes mperf != 0. It
> >> > >>>> being 0 seems to suggest intel_pstate_sample() hasn't been called yet or
> >> > >>>> so.
> >> > >>>
> >> > >>> Well, what's the tree based on?
> >> > >>>
> >> > >>> The mainline does this:
> >> > >>>
> >> > >>> bool sample_taken = intel_pstate_sample(cpu, time);
> >> > >>>
> >> > >>> if (sample_taken && !hwp_active)
> >> > >>> intel_pstate_adjust_busy_pstate(cpu);
> >> > >>>
> >> > >>> and (the mainline version of) intel_pstate_sample() returns false when
> >> > >>> it is called for the first time after setting the update_util hook.
> >> > >>
> >> > >> If that helps, I can expose my pm-cpufreq-fixes branch to pull from.
> >> > >> It contains all cpufreq material that went into the Linus' tree to
> >> > >> date and is based on 4.5-rc3.
> >> > >
> >> > > In fact, it is exposed already:
> >> > >
> >> > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> >> > > pm-cpufreq-fixes
> >> > >
> >> > > and the top-most commit is 1becf03545a0859ceaaf9e8c2d9861882a71cb01
> >> > > (cpufreq: intel_pstate: Fix processing for turbo activation ratio).
> >> >
> >> > Ah, that will fail as well.
> >> >
> >> > The problem is that intel_pstate_get() can be called before we take
> >> > the first sample.
> >> >
> >> > I need to think about how to fix that.
> >>
> >> Maybe something like the below (untested, but builds).
> >>
> >> It will make intel_pstate_get() return 0 until avg_frequency gets populated
> >> which is actually OK.
> >
> > The previous one would break the HWP case, so below is a new one (still
> > untested).
>
> I can reproduce the bug and your patch fix it.
>
> Tested-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Thanks!
Please also try this one:
https://patchwork.kernel.org/patch/9012861/
which is the final fix for this bug.
4 years, 11 months
[lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by kernel test robot
FYI, we noticed the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
commit 41e0d37f7ac81297c07ba311e4ad39465b8c8295 ("sched/fair: Do not call cpufreq hook unless util changed")
on test machine: ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory
caused below changes:
+------------------------------------------+------------+------------+
| | 21e96f8877 | 41e0d37f7a |
+------------------------------------------+------------+------------+
| boot_successes | 6 | 5 |
| boot_failures | 0 | 3 |
| divide_error:#[##]SMP | 0 | 3 |
| RIP:intel_pstate_get | 0 | 3 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 3 |
| backtrace:vfs_read | 0 | 3 |
| backtrace:SyS_read | 0 | 3 |
+------------------------------------------+------------+------------+
[ 14.860950] Freeing unused kernel memory: 260K (ffff88103edbf000 - ffff88103ee00000)
[ 14.873013] systemd[1]: RTC configured in localtime, applying delta of 480 minutes to system time.
[ 14.884474] random: systemd urandom read with 5 bits of entropy available
[ 14.903975] divide error: 0000 [#1] SMP
[ 14.908375] Modules linked in:
[ 14.911793] CPU: 39 PID: 1 Comm: systemd Not tainted 4.6.0-rc4-00016-g41e0d37 #1
[ 14.920051] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 14.931509] task: ffff8810101d8000 ti: ffff88081ab20000 task.ti: ffff88081ab20000
[ 14.939862] RIP: 0010:[<ffffffff8176ad32>] [<ffffffff8176ad32>] intel_pstate_get+0x32/0x40
[ 14.949202] RSP: 0018:ffff88081ab23d70 EFLAGS: 00010006
[ 14.955129] RAX: 0000000000000000 RBX: 0000000000000024 RCX: ffff8808091e0300
[ 14.963094] RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000024
[ 14.971057] RBP: ffff88081ab23d88 R08: 0000000000001000 R09: 00000000096a1000
[ 14.979022] R10: 0000000000ffff10 R11: 000000000000000f R12: 0000000000000202
[ 14.986984] R13: ffff88101390a040 R14: ffff88100e48e180 R15: ffff88101390a040
[ 14.994950] FS: 00007f66fe117880(0000) GS:ffff8810139c0000(0000) knlGS:0000000000000000
[ 15.003982] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.010393] CR2: 000055f78b760098 CR3: 000000103d759000 CR4: 00000000001406e0
[ 15.018359] Stack:
[ 15.020602] ffffffff81764dad 0000000000000024 ffff88100e48e180 ffff88081ab23dc8
[ 15.028899] ffffffff81040267 ffff88101390a0ac 0000000000000340 ffff88081ab23f20
[ 15.037197] ffff88103cd7c400 ffff88100e48e180 ffff88101390a040 ffff88081ab23e30
[ 15.045493] Call Trace:
[ 15.048223] [<ffffffff81764dad>] ? cpufreq_quick_get+0x3d/0x90
[ 15.054832] [<ffffffff81040267>] show_cpuinfo+0x3c7/0x410
[ 15.060956] [<ffffffff8121f5c4>] seq_read+0x2c4/0x3a0
[ 15.066685] [<ffffffff81266ea8>] proc_reg_read+0x48/0x70
[ 15.072713] [<ffffffff811f9d58>] __vfs_read+0x28/0xd0
[ 15.078451] [<ffffffff813bab63>] ? security_file_permission+0xa3/0xc0
[ 15.085737] [<ffffffff811faa97>] ? rw_verify_area+0x57/0xd0
[ 15.092054] [<ffffffff811fab96>] vfs_read+0x86/0x130
[ 15.097691] [<ffffffff811fbf96>] SyS_read+0x46/0xa0
[ 15.103234] [<ffffffff818f71b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[ 15.110421] Code: 05 dc 1b c3 00 89 ff 55 48 89 e5 48 8b 0c f8 48 85 c9 74 1f 48 63 51 1c 48 63 41 20 5d 48 0f af c2 31 d2 48 0f af 81 88 00 00 00 <48> f7 b1 90 00 00 00 c3 31 c0 5d c3 66 90 0f 1f 44 00 00 8b 77
[ 15.132161] RIP [<ffffffff8176ad32>] intel_pstate_get+0x32/0x40
[ 15.138875] RSP <ffff88081ab23d70>
[ 15.142770] ---[ end trace e5d5a8bedf5502e1 ]---
[ 15.149323] Kernel panic - not syncing: Fatal exception
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
Kernel Test Robot
4 years, 11 months
Re: [LKP] [lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by Rafael J. Wysocki
On Wed, May 4, 2016 at 1:51 PM, Wanpeng Li <kernellwp(a)gmail.com> wrote:
> 2016-05-04 19:44 GMT+08:00 Rafael J. Wysocki <rafael(a)kernel.org>:
>> On Wed, May 4, 2016 at 2:58 AM, Wanpeng Li <kernellwp(a)gmail.com> wrote:
>>> 2016-05-03 20:15 GMT+08:00 Rafael J. Wysocki <rafael(a)kernel.org>:
>>>> On Tue, May 3, 2016 at 10:32 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
>>>>> On Tue, May 03, 2016 at 09:10:51AM +0800, kernel test robot wrote:
>>>>>> FYI, we noticed the following commit:
>>>>>>
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
>>>>>> commit 41e0d37f7ac81297c07ba311e4ad39465b8c8295 ("sched/fair: Do not call cpufreq hook unless util changed")
>>>>>
>>
>> [cut]
>>
>>>>>
>>>>> That's intel_pstate.c:get_avg_frequency(), which assumes mperf != 0. It
>>>>> being 0 seems to suggest intel_pstate_sample() hasn't been called yet or
>>>>> so.
>>>>
>>>> Well, what's the tree based on?
>>>>
>>>> The mainline does this:
>>>>
>>>> bool sample_taken = intel_pstate_sample(cpu, time);
>>>>
>>>> if (sample_taken && !hwp_active)
>>>> intel_pstate_adjust_busy_pstate(cpu);
>>>>
>>>> and (the mainline version of) intel_pstate_sample() returns false when
>>>> it is called for the first time after setting the update_util hook.
>>>
>>> The callsites in scheduler will set time to rq_clock(rq) when trigger
>>> sample, so when time 0 will be used even if it is set just before
>>> setting the update_util hook?
>>
>> I'm not sure what you mean.
>>
>> time=0 is special as it will cause intel_pstate_sample() to return
>> false on the next invocation.
>
> Sample is driven by cpufreq_update_util() which uses rq_clock(rq) as
> time parameter, so there is no opportunity to pass time 0 to
> intel_pstate_sample().
Right.
So I should have said that had time=0 been passed to
intel_pstate_sample(), it would have caused it to return false on the
next invocation. :-)
The way it works is that sample.time is 0 initially, so
intel_pstate_sample() returns false first time it is called and the
second invocation gets all of the deltas as needed.
4 years, 11 months
Re: [LKP] [lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by Rafael J. Wysocki
On Wed, May 4, 2016 at 2:58 AM, Wanpeng Li <kernellwp(a)gmail.com> wrote:
> 2016-05-03 20:15 GMT+08:00 Rafael J. Wysocki <rafael(a)kernel.org>:
>> On Tue, May 3, 2016 at 10:32 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
>>> On Tue, May 03, 2016 at 09:10:51AM +0800, kernel test robot wrote:
>>>> FYI, we noticed the following commit:
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
>>>> commit 41e0d37f7ac81297c07ba311e4ad39465b8c8295 ("sched/fair: Do not call cpufreq hook unless util changed")
>>>
[cut]
>>>
>>> That's intel_pstate.c:get_avg_frequency(), which assumes mperf != 0. It
>>> being 0 seems to suggest intel_pstate_sample() hasn't been called yet or
>>> so.
>>
>> Well, what's the tree based on?
>>
>> The mainline does this:
>>
>> bool sample_taken = intel_pstate_sample(cpu, time);
>>
>> if (sample_taken && !hwp_active)
>> intel_pstate_adjust_busy_pstate(cpu);
>>
>> and (the mainline version of) intel_pstate_sample() returns false when
>> it is called for the first time after setting the update_util hook.
>
> The callsites in scheduler will set time to rq_clock(rq) when trigger
> sample, so when time 0 will be used even if it is set just before
> setting the update_util hook?
I'm not sure what you mean.
time=0 is special as it will cause intel_pstate_sample() to return
false on the next invocation.
4 years, 11 months
Re: [LKP] [lkp] [sched/fair] 41e0d37f7a: divide error: 0000 [#1] SMP
by Rafael J. Wysocki
On Wed, May 4, 2016 at 2:53 AM, Wanpeng Li <kernellwp(a)gmail.com> wrote:
> 2016-05-03 21:33 GMT+08:00 Rafael J. Wysocki <rafael(a)kernel.org>:
>> On Tue, May 3, 2016 at 11:25 AM, Wanpeng Li <kernellwp(a)gmail.com> wrote:
>>> 2016-05-03 17:19 GMT+08:00 Wanpeng Li <kernellwp(a)gmail.com>:
>>>> 2016-05-03 16:32 GMT+08:00 Peter Zijlstra <peterz(a)infradead.org>:
>>>>> On Tue, May 03, 2016 at 09:10:51AM +0800, kernel test robot wrote:
[cut]
>>>> ----snip----
>>>>
>>>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
>>>> index 8b5a415..57b3843 100644
>>>> --- a/drivers/cpufreq/intel_pstate.c
>>>> +++ b/drivers/cpufreq/intel_pstate.c
>>>> @@ -1241,6 +1241,7 @@ static int intel_pstate_init_cpu(unsigned int cpunum)
>>>> intel_pstate_get_cpu_pstates(cpu);
>>>>
>>>> intel_pstate_busy_pid_reset(cpu);
>>>> + intel_pstate_sample(cpu);
>>>
>>> intel_pstate_sample(cpu, 0);
>>>
>>>>
>>>> cpu->update_util.func = intel_pstate_update_util;
>>
>> That would avoid the divide by 0, but the value returned by
>> intel_pstate_get() would still be bogus.
>
> If your bogus means that some data is stale and could you explain more?
get_avg_frequency() expects sample.aperf to be a delta between two
different values of the APERF register obtained at two different
instants of time, and analogously for sample.mperf, because that's
when the formula used by it is guaranteed to be valid. This means
that it generally is not sufficient to read those registers just once
to get a meaningful result, they need to be read at least twice for
that (with some time between the reads to let the counters grow
sufficiently).
With your modification sample.aperf and sample.mperf would simply
contain the values of APERF and MPERF, respectively, at the the
intel_pstate_sample(cpu, 0) invocation time, so using them in the
computation would not be guaranteed to lead to a meaningful result.
4 years, 11 months
[serial] 13a7238eea: WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 uart_change_pm+0x35/0x162
by kernel test robot
FYI, we noticed the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing
commit 13a7238eeab5718bf968e2a835205ba659a38a77 ("serial: core: Prevent unsafe uart port access, part 3")
on test machine: vm-vp-quantal-x86_64: 2 threads qemu-system-x86_64 -enable-kvm with 360M memory
caused below changes:
[ 7.249323] ------------[ cut here ]------------
[ 7.249323] ------------[ cut here ]------------
[ 7.250172] WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 uart_change_pm+0x35/0x162
[ 7.250172] WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 uart_change_pm+0x35/0x162
[ 7.252110] Modules linked in:
[ 7.252110] Modules linked in:
[ 7.252679] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 4.6.0-rc5-00015-g13a7238 #1
[ 7.252679] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 4.6.0-rc5-00015-g13a7238 #1
[ 7.254232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 7.254232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 7.255806] 0000000000000000
[ 7.255806] 0000000000000000 ffff88000f42f668 ffff88000f42f668 ffffffff815a2566 ffffffff815a2566 0000000000000286 0000000000000286
[ 7.257180] 0000000041b58ab3
[ 7.257180] 0000000041b58ab3 ffffffff81f66067 ffffffff81f66067 ffffffff815a240c ffffffff815a240c ffffffff81f7647d ffffffff81f7647d
[ 7.258601] ffffffff81238abf
[ 7.258601] ffffffff81238abf 000000000042f6a8 000000000042f6a8 ffffed0001e85ecb ffffed0001e85ecb 0000000000000000 0000000000000000
[ 7.259973] Call Trace:
[ 7.259973] Call Trace:
[ 7.260422] [<ffffffff815a2566>] dump_stack+0x15a/0x1de
[ 7.260422] [<ffffffff815a2566>] dump_stack+0x15a/0x1de
[ 7.261374] [<ffffffff815a240c>] ? _atomic_dec_and_lock+0xaf/0xaf
[ 7.261374] [<ffffffff815a240c>] ? _atomic_dec_and_lock+0xaf/0xaf
[ 7.262470] [<ffffffff81238abf>] ? is_module_text_address+0x1f/0x1f
[ 7.262470] [<ffffffff81238abf>] ? is_module_text_address+0x1f/0x1f
[ 7.263595] [<ffffffff817af33b>] ? uart_change_pm+0x35/0x162
[ 7.263595] [<ffffffff817af33b>] ? uart_change_pm+0x35/0x162
[ 7.264623] [<ffffffff8110d1fc>] __warn+0x16e/0x189
[ 7.264623] [<ffffffff8110d1fc>] __warn+0x16e/0x189
[ 7.265523] [<ffffffff8110d37a>] warn_slowpath_null+0x18/0x1a
[ 7.265523] [<ffffffff8110d37a>] warn_slowpath_null+0x18/0x1a
[ 7.266561] [<ffffffff817af33b>] uart_change_pm+0x35/0x162
[ 7.266561] [<ffffffff817af33b>] uart_change_pm+0x35/0x162
[ 7.267553] [<ffffffff817b6ba1>] uart_add_one_port+0xb36/0x1008
[ 7.267553] [<ffffffff817b6ba1>] uart_add_one_port+0xb36/0x1008
[ 7.268637] [<ffffffff817b606b>] ? uart_unregister_driver+0x150/0x150
[ 7.268637] [<ffffffff817b606b>] ? uart_unregister_driver+0x150/0x150
[ 7.269786] [<ffffffff81b45c0e>] ? mutex_trylock+0x4eb/0x4eb
[ 7.269786] [<ffffffff81b45c0e>] ? mutex_trylock+0x4eb/0x4eb
[ 7.270838] [<ffffffff817bcba7>] serial8250_register_8250_port+0xf68/0x14e5
[ 7.270838] [<ffffffff817bcba7>] serial8250_register_8250_port+0xf68/0x14e5
[ 7.272116] [<ffffffff817bed3f>] serial_pnp_probe+0x2d8/0x5f1
[ 7.272116] [<ffffffff817bed3f>] serial_pnp_probe+0x2d8/0x5f1
[ 7.273146] [<ffffffff817bea67>] ? check_name+0x6e/0x6e
[ 7.273146] [<ffffffff817bea67>] ? check_name+0x6e/0x6e
[ 7.274125] [<ffffffff817bea67>] ? check_name+0x6e/0x6e
[ 7.274125] [<ffffffff817bea67>] ? check_name+0x6e/0x6e
[ 7.275076] [<ffffffff817625ff>] pnp_device_probe+0x1c1/0x1dc
[ 7.275076] [<ffffffff817625ff>] pnp_device_probe+0x1c1/0x1dc
[ 7.276109] [<ffffffff817f3514>] driver_probe_device+0x325/0x63c
[ 7.276109] [<ffffffff817f3514>] driver_probe_device+0x325/0x63c
[ 7.277191] [<ffffffff817f382b>] ? driver_probe_device+0x63c/0x63c
[ 7.277191] [<ffffffff817f382b>] ? driver_probe_device+0x63c/0x63c
[ 7.278299] [<ffffffff817f3963>] __driver_attach+0x138/0x179
[ 7.278299] [<ffffffff817f3963>] __driver_attach+0x138/0x179
[ 7.279316] [<ffffffff817eee66>] bus_for_each_dev+0x10d/0x143
[ 7.279316] [<ffffffff817eee66>] bus_for_each_dev+0x10d/0x143
[ 7.280372] [<ffffffff817eed59>] ? next_device+0x52/0x52
[ 7.280372] [<ffffffff817eed59>] ? next_device+0x52/0x52
[ 7.281334] [<ffffffff817f27c5>] driver_attach+0x50/0x53
[ 7.281334] [<ffffffff817f27c5>] driver_attach+0x50/0x53
[ 7.282302] [<ffffffff817f1cb3>] bus_add_driver+0x31d/0x53e
[ 7.282302] [<ffffffff817f1cb3>] bus_add_driver+0x31d/0x53e
[ 7.283346] [<ffffffff817f5063>] driver_register+0x24b/0x2c3
[ 7.283346] [<ffffffff817f5063>] driver_register+0x24b/0x2c3
[ 7.284370] [<ffffffff81761ce6>] pnp_register_driver+0x98/0x9f
[ 7.284370] [<ffffffff81761ce6>] pnp_register_driver+0x98/0x9f
[ 7.285420] [<ffffffff834431e7>] ? univ8250_console_init+0x28/0x28
[ 7.285420] [<ffffffff834431e7>] ? univ8250_console_init+0x28/0x28
[ 7.286579] [<ffffffff817bf068>] serial8250_pnp_init+0x10/0x12
[ 7.286579] [<ffffffff817bf068>] serial8250_pnp_init+0x10/0x12
[ 7.287635] [<ffffffff83443259>] serial8250_init+0x72/0x232
[ 7.287635] [<ffffffff83443259>] serial8250_init+0x72/0x232
[ 7.288650] [<ffffffff834431e7>] ? univ8250_console_init+0x28/0x28
[ 7.288650] [<ffffffff834431e7>] ? univ8250_console_init+0x28/0x28
[ 7.289775] [<ffffffff81000598>] do_one_initcall+0x287/0x36d
[ 7.289775] [<ffffffff81000598>] do_one_initcall+0x287/0x36d
[ 7.290802] [<ffffffff81000311>] ? try_to_run_init_process+0x31/0x31
[ 7.290802] [<ffffffff81000311>] ? try_to_run_init_process+0x31/0x31
[ 7.291954] [<ffffffff833d900a>] kernel_init_freeable+0x60e/0x6a9
[ 7.291954] [<ffffffff833d900a>] kernel_init_freeable+0x60e/0x6a9
[ 7.293060] [<ffffffff81b3e0fd>] kernel_init+0xc/0x11b
[ 7.293060] [<ffffffff81b3e0fd>] kernel_init+0xc/0x11b
[ 7.293998] [<ffffffff81b4f6a2>] ret_from_fork+0x22/0x40
[ 7.293998] [<ffffffff81b4f6a2>] ret_from_fork+0x22/0x40
[ 7.294964] [<ffffffff81b3e0f1>] ? rest_init+0xe8/0xe8
[ 7.294964] [<ffffffff81b3e0f1>] ? rest_init+0xe8/0xe8
[ 7.295921] ---[ end trace 3dfe53a1b1117774 ]---
[ 7.295921] ---[ end trace 3dfe53a1b1117774 ]---
FYI, raw QEMU command line is:
qemu-system-x86_64 -enable-kvm -kernel /pkg/linux/x86_64-randconfig-s0-04291414/gcc-5/13a7238eeab5718bf968e2a835205ba659a38a77/vmlinuz-4.6.0-rc5-00015-g13a7238 -append 'root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-vp-quantal-x86_64-44/bisect_boot-1-quantal-core-x86_64.cgz-x86_64-randconfig-s0-04291414-13a7238eeab5718bf968e2a835205ba659a38a77-20160429-40248-hp2gqc-0.yaml ARCH=x86_64 kconfig=x86_64-randconfig-s0-04291414 branch=linux-devel/devel-spot-201604291342 commit=13a7238eeab5718bf968e2a835205ba659a38a77 BOOT_IMAGE=/pkg/linux/x86_64-randconfig-s0-04291414/gcc-5/13a7238eeab5718bf968e2a835205ba659a38a77/vmlinuz-4.6.0-rc5-00015-g13a7238 max_uptime=600 RESULT_ROOT=/result/boot/1/vm-vp-quantal-x86_64/quantal-core-x86_64.cgz/x86_64-randconfig-s0-04291414/gcc-5/13a7238eeab5718bf968e2a835205ba659a38a77/0 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw ip=::::vm-vp-quantal-x86_64-44::dhcp drbd.minor_count=8' -initrd /fs/sdi1/initrd-vm-vp-quantal-x86_64-44 -m 360 -smp 2 -device e1000,netdev=net0 -netdev user,id=net0 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -pidfile /dev/shm/kboot/pid-vm-vp-quantal-x86_64-44 -serial file:/dev/shm/kboot/serial-vm-vp-quantal-x86_64-44 -daemonize -display none -monitor null
Thanks,
Xiaolong
4 years, 11 months
[net] 5413d1babe: INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
by kernel test robot
FYI, we noticed the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 5413d1babe8f10de13d72496c12b862eef8ba613 ("net: do not block BH while processing socket backlog")
on test machine: vm-lkp-wsx03-1G: 1 threads qemu-system-x86_64 -enable-kvm -cpu host with 1G memory
caused below changes:
[ 35.296699] mount.nfs (3510) used greatest stack depth: 11536 bytes left
[ 42.600792]
[ 42.601632] ======================================================
[ 42.603227] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[ 42.606492] 4.6.0-rc5-01190-g5413d1b #1 Not tainted
[ 42.607747] ------------------------------------------------------
[ 42.609227] wget/3661 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
[ 42.610880] (&(&tcp_hashinfo.bhash[i].lock)->rlock){+.+...}, at: [<ffffffff81bb184c>] __inet_twsk_hashdance+0x52/0x11d
[ 42.613890]
[ 42.613890] and this task is already holding:
[ 42.615802] (slock-AF_INET){+.-...}, at: [<ffffffff81bb7ca1>] tcp_close+0x134/0x3ba
[ 42.618697] which would create a new lock dependency:
[ 42.620137] (slock-AF_INET){+.-...} -> (&(&tcp_hashinfo.bhash[i].lock)->rlock){+.+...}
[ 42.623119]
[ 42.623119] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 42.625296] (slock-AF_INET){+.-...}
... which became SOFTIRQ-irq-safe at:
[ 42.627562] [<ffffffff810e9da5>] __lock_acquire+0x970/0x192a
[ 42.629139] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.630562] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.632030] [<ffffffff81bd3a82>] udp_queue_rcv_skb+0x204/0x4e8
[ 42.633558] [<ffffffff81bd45d8>] __udp4_lib_rcv+0x872/0x9ce
[ 42.634980] [<ffffffff81bd4aa9>] udp_rcv+0x1a/0x1c
[ 42.636301] [<ffffffff81ba7ccd>] ip_local_deliver_finish+0x1aa/0x329
[ 42.637836] [<ffffffff81ba853a>] ip_local_deliver+0xb8/0xbf
[ 42.639277] [<ffffffff81ba82e2>] ip_rcv_finish+0x496/0x4ca
[ 42.640702] [<ffffffff81ba8818>] ip_rcv+0x2d7/0x333
[ 42.642039] [<ffffffff81b4674f>] __netif_receive_skb_core+0x6a9/0x811
[ 42.643647] [<ffffffff81b46eba>] __netif_receive_skb+0x1d/0x5f
[ 42.645141] [<ffffffff81b46fc2>] netif_receive_skb_internal+0xc6/0x187
[ 42.646697] [<ffffffff81b48203>] napi_gro_receive+0x15c/0x224
[ 42.648221] [<ffffffff8192ce2d>] e1000_clean_rx_irq+0x374/0x3f5
[ 42.649722] [<ffffffff8192c0ab>] e1000_clean+0x521/0x6b5
[ 42.651121] [<ffffffff81b4866d>] net_rx_action+0x12f/0x422
[ 42.652545] [<ffffffff81cb7997>] __do_softirq+0x207/0x4d0
[ 42.654067] [<ffffffff810a4948>] run_ksoftirqd+0x25/0x62
[ 42.655477] [<ffffffff810c1bce>] smpboot_thread_fn+0x1cc/0x1e3
[ 42.656953] [<ffffffff810be641>] kthread+0xfb/0x103
[ 42.658296] [<ffffffff81cb4ef2>] ret_from_fork+0x22/0x50
[ 42.659725]
[ 42.659725] to a SOFTIRQ-irq-unsafe lock:
[ 42.661558] (&(&tcp_hashinfo.bhash[i].lock)->rlock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
[ 42.664127] ... [<ffffffff810e9e17>] __lock_acquire+0x9e2/0x192a
[ 42.665706] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.667116] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.668668] [<ffffffff81bb184c>] __inet_twsk_hashdance+0x52/0x11d
[ 42.670170] [<ffffffff81bc9fca>] tcp_time_wait+0x1da/0x210
[ 42.671770] [<ffffffff81bbdba0>] tcp_fin+0x70/0x370
[ 42.673109] [<ffffffff81bbe187>] tcp_data_queue+0x2e7/0xab9
[ 42.674552] [<ffffffff81bbfd97>] tcp_rcv_state_process+0xae1/0xc22
[ 42.676071] [<ffffffff81bc6780>] tcp_v4_do_rcv+0x106/0x14f
[ 42.677501] [<ffffffff81b313b1>] release_sock+0xd2/0x1bb
[ 42.678915] [<ffffffff81bb7c88>] tcp_close+0x11b/0x3ba
[ 42.680287] [<ffffffff81bdcb5e>] inet_release+0x52/0x59
[ 42.681761] [<ffffffff81b2b672>] sock_release+0x1f/0x78
[ 42.683215] [<ffffffff81b2b6dd>] sock_close+0x12/0x16
[ 42.684679] [<ffffffff811ee737>] __fput+0x10d/0x1c6
[ 42.686024] [<ffffffff811ee826>] ____fput+0xe/0x10
[ 42.687342] [<ffffffff810bd199>] task_work_run+0x67/0x90
[ 42.688760] [<ffffffff810a2571>] do_exit+0x556/0xb30
[ 42.690108] [<ffffffff810a3de1>] do_group_exit+0x4a/0xb5
[ 42.691505] [<ffffffff810a3e60>] __wake_up_parent+0x0/0x28
[ 42.692926] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.694459]
[ 42.694459] other info that might help us debug this:
[ 42.694459]
[ 42.697180] Possible interrupt unsafe locking scenario:
[ 42.697180]
[ 42.699260] CPU0 CPU1
[ 42.700448] ---- ----
[ 42.701627] lock(&(&tcp_hashinfo.bhash[i].lock)->rlock);
[ 42.703104] local_irq_disable();
[ 42.704490] lock(slock-AF_INET);
[ 42.706017] lock(&(&tcp_hashinfo.bhash[i].lock)->rlock);
[ 42.708427] <Interrupt>
[ 42.709351] lock(slock-AF_INET);
[ 42.710586]
[ 42.710586] *** DEADLOCK ***
[ 42.710586]
[ 42.712967] 1 lock held by wget/3661:
[ 42.714042] #0: (slock-AF_INET){+.-...}, at: [<ffffffff81bb7ca1>] tcp_close+0x134/0x3ba
[ 42.716659]
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[ 42.718884] -> (slock-AF_INET){+.-...} ops: 163 {
[ 42.720519] HARDIRQ-ON-W at:
[ 42.721609] [<ffffffff810e9df5>] __lock_acquire+0x9c0/0x192a
[ 42.723853] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.725475] [<ffffffff81cb41a8>] _raw_spin_lock_bh+0x3c/0x72
[ 42.727632] [<ffffffff81b30e0e>] lock_sock_nested+0x3b/0x9a
[ 42.729803] [<ffffffff81b33330>] sock_setsockopt+0x21e/0x8f9
[ 42.731979] [<ffffffff81b2e515>] SyS_setsockopt+0x63/0x9e
[ 42.733607] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.735936] IN-SOFTIRQ-W at:
[ 42.737020] [<ffffffff810e9da5>] __lock_acquire+0x970/0x192a
[ 42.739196] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.740869] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.742503] [<ffffffff81bd3a82>] udp_queue_rcv_skb+0x204/0x4e8
[ 42.744687] [<ffffffff81bd45d8>] __udp4_lib_rcv+0x872/0x9ce
[ 42.746838] [<ffffffff81bd4aa9>] udp_rcv+0x1a/0x1c
[ 42.748380] [<ffffffff81ba7ccd>] ip_local_deliver_finish+0x1aa/0x329
[ 42.750655] [<ffffffff81ba853a>] ip_local_deliver+0xb8/0xbf
[ 42.752827] [<ffffffff81ba82e2>] ip_rcv_finish+0x496/0x4ca
[ 42.754986] [<ffffffff81ba8818>] ip_rcv+0x2d7/0x333
[ 42.756547] [<ffffffff81b4674f>] __netif_receive_skb_core+0x6a9/0x811
[ 42.758830] [<ffffffff81b46eba>] __netif_receive_skb+0x1d/0x5f
[ 42.760997] [<ffffffff81b46fc2>] netif_receive_skb_internal+0xc6/0x187
[ 42.763252] [<ffffffff81b48203>] napi_gro_receive+0x15c/0x224
[ 42.765407] [<ffffffff8192ce2d>] e1000_clean_rx_irq+0x374/0x3f5
[ 42.767657] [<ffffffff8192c0ab>] e1000_clean+0x521/0x6b5
[ 42.769266] [<ffffffff81b4866d>] net_rx_action+0x12f/0x422
[ 42.771401] [<ffffffff81cb7997>] __do_softirq+0x207/0x4d0
[ 42.773017] [<ffffffff810a4948>] run_ksoftirqd+0x25/0x62
[ 42.774619] [<ffffffff810c1bce>] smpboot_thread_fn+0x1cc/0x1e3
[ 42.776855] [<ffffffff810be641>] kthread+0xfb/0x103
[ 42.778408] [<ffffffff81cb4ef2>] ret_from_fork+0x22/0x50
[ 42.780013] INITIAL USE at:
[ 42.781094] [<ffffffff810e9e5d>] __lock_acquire+0xa28/0x192a
[ 42.783228] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.784825] [<ffffffff81cb41a8>] _raw_spin_lock_bh+0x3c/0x72
[ 42.786948] [<ffffffff81b30e0e>] lock_sock_nested+0x3b/0x9a
[ 42.789093] [<ffffffff81b33330>] sock_setsockopt+0x21e/0x8f9
[ 42.791251] [<ffffffff81b2e515>] SyS_setsockopt+0x63/0x9e
[ 42.792853] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.795088] }
[ 42.795898] ... key at: [<ffffffff83855d90>] af_family_slock_keys+0x10/0x160
[ 42.798089] ... acquired at:
[ 42.799074] [<ffffffff810e88b5>] check_irq_usage+0x55/0xaa
[ 42.800522] [<ffffffff810ea742>] __lock_acquire+0x130d/0x192a
[ 42.801982] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.803414] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.804826] [<ffffffff81bb184c>] __inet_twsk_hashdance+0x52/0x11d
[ 42.806340] [<ffffffff81bc9fca>] tcp_time_wait+0x1da/0x210
[ 42.807780] [<ffffffff81bb7db2>] tcp_close+0x245/0x3ba
[ 42.809185] [<ffffffff81bdcb5e>] inet_release+0x52/0x59
[ 42.810584] [<ffffffff81b2b672>] sock_release+0x1f/0x78
[ 42.811987] [<ffffffff81b2b6dd>] sock_close+0x12/0x16
[ 42.813362] [<ffffffff811ee737>] __fput+0x10d/0x1c6
[ 42.814790] [<ffffffff811ee826>] ____fput+0xe/0x10
[ 42.816198] [<ffffffff810bd199>] task_work_run+0x67/0x90
[ 42.817611] [<ffffffff810a2571>] do_exit+0x556/0xb30
[ 42.818973] [<ffffffff810a3de1>] do_group_exit+0x4a/0xb5
[ 42.820380] [<ffffffff810a3e60>] __wake_up_parent+0x0/0x28
[ 42.821797] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.823327]
[ 42.824105]
the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[ 42.839607] -> (&(&tcp_hashinfo.bhash[i].lock)->rlock){+.+...} ops: 19 {
[ 42.841688] HARDIRQ-ON-W at:
[ 42.842789] [<ffffffff810e9df5>] __lock_acquire+0x9c0/0x192a
[ 42.844954] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.846581] [<ffffffff81cb41a8>] _raw_spin_lock_bh+0x3c/0x72
[ 42.848754] [<ffffffff81bb21b9>] inet_csk_get_port+0x78/0x435
[ 42.850925] [<ffffffff81bdcd6d>] inet_bind+0x148/0x1c6
[ 42.852524] [<ffffffff81b2ddd5>] SyS_bind+0x6a/0x97
[ 42.854078] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.856336] SOFTIRQ-ON-W at:
[ 42.857425] [<ffffffff810e9e17>] __lock_acquire+0x9e2/0x192a
[ 42.859623] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.861257] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.862881] [<ffffffff81bb184c>] __inet_twsk_hashdance+0x52/0x11d
[ 42.865114] [<ffffffff81bc9fca>] tcp_time_wait+0x1da/0x210
[ 42.867351] [<ffffffff81bbdba0>] tcp_fin+0x70/0x370
[ 42.869032] [<ffffffff81bbe187>] tcp_data_queue+0x2e7/0xab9
[ 42.871183] [<ffffffff81bbfd97>] tcp_rcv_state_process+0xae1/0xc22
[ 42.873565] [<ffffffff81bc6780>] tcp_v4_do_rcv+0x106/0x14f
[ 42.875801] [<ffffffff81b313b1>] release_sock+0xd2/0x1bb
[ 42.877408] [<ffffffff81bb7c88>] tcp_close+0x11b/0x3ba
[ 42.878999] [<ffffffff81bdcb5e>] inet_release+0x52/0x59
[ 42.880606] [<ffffffff81b2b672>] sock_release+0x1f/0x78
[ 42.882205] [<ffffffff81b2b6dd>] sock_close+0x12/0x16
[ 42.883876] [<ffffffff811ee737>] __fput+0x10d/0x1c6
[ 42.885520] [<ffffffff811ee826>] ____fput+0xe/0x10
[ 42.887060] [<ffffffff810bd199>] task_work_run+0x67/0x90
[ 42.889781] [<ffffffff810a2571>] do_exit+0x556/0xb30
[ 42.891522] [<ffffffff810a3de1>] do_group_exit+0x4a/0xb5
[ 42.893122] [<ffffffff810a3e60>] __wake_up_parent+0x0/0x28
[ 42.895331] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.897630] INITIAL USE at:
[ 42.898709] [<ffffffff810e9e5d>] __lock_acquire+0xa28/0x192a
[ 42.900929] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.902664] [<ffffffff81cb41a8>] _raw_spin_lock_bh+0x3c/0x72
[ 42.904895] [<ffffffff81bb21b9>] inet_csk_get_port+0x78/0x435
[ 42.907129] [<ffffffff81bdcd6d>] inet_bind+0x148/0x1c6
[ 42.908721] [<ffffffff81b2ddd5>] SyS_bind+0x6a/0x97
[ 42.910250] [<ffffffff81cb4cbc>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 42.912481] }
[ 42.913278] ... key at: [<ffffffff8386e0e8>] __key.62185+0x0/0x8
[ 42.914811] ... acquired at:
[ 42.915781] [<ffffffff810e88b5>] check_irq_usage+0x55/0xaa
[ 42.917204] [<ffffffff810ea742>] __lock_acquire+0x130d/0x192a
[ 42.918684] [<ffffffff810eb59d>] lock_acquire+0x188/0x223
[ 42.920100] [<ffffffff81cb4064>] _raw_spin_lock+0x34/0x6a
[ 42.921529] [<ffffffff81bb184c>] __inet_twsk_hashdance+0x52/0x11d
[ 42.923031] [<ffffffff81bc9fca>] tcp_time_wait+0x1da/0x210
[ 42.924463] [<ffffffff81bb7db2>] tcp_close+0x245/0x3ba
[ 42.925839] [<ffffffff81bdcb5e>] inet_release+0x52/0x59
[ 42.927229] [<ffffffff81b2b672>] sock_release+0x1f/0x78
[ 42.928642] [<ffffffff81b2b6dd>] sock_close+0x12/0x16
[ 42.930009] [<ffffffff811ee737>] __fput+0x10d/0x1c6
[ 42.931477] [<ffffffff811ee826>] ____fput+0xe/0x10
FYI, raw QEMU command line is:
qemu-system-x86_64 -enable-kvm -cpu host -kernel /pkg/linux/x86_64-nfsroot/gcc-5/5413d1babe8f10de13d72496c12b862eef8ba613/vmlinuz-4.6.0-rc5-01190-g5413d1b -append 'root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-lkp-wsx03-1G-23/bisect_boot-1-debian-x86_64-2015-02-07.cgz-x86_64-nfsroot-5413d1babe8f10de13d72496c12b862eef8ba613-20160504-63843-5pexbh-1.yaml ARCH=x86_64 kconfig=x86_64-nfsroot branch=internal-eywa/master commit=5413d1babe8f10de13d72496c12b862eef8ba613 BOOT_IMAGE=/pkg/linux/x86_64-nfsroot/gcc-5/5413d1babe8f10de13d72496c12b862eef8ba613/vmlinuz-4.6.0-rc5-01190-g5413d1b max_uptime=600 RESULT_ROOT=/result/boot/1/vm-lkp-wsx03-1G/debian-x86_64-2015-02-07.cgz/x86_64-nfsroot/gcc-5/5413d1babe8f10de13d72496c12b862eef8ba613/1 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw ip=::::vm-lkp-wsx03-1G-23::dhcp' -initrd /fs/sdc1/initrd-vm-lkp-wsx03-1G-23 -m 1024 -smp 1 -device e1000,netdev=net0 -netdev user,id=net0,hostfwd=tcp::23622-:22 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -drive file=/fs/sdc1/disk0-vm-lkp-wsx03-1G-23,media=disk,if=virtio -drive file=/fs/sdc1/disk1-vm-lkp-wsx03-1G-23,media=disk,if=virtio -pidfile /dev/shm/kboot/pid-vm-lkp-wsx03-1G-23 -serial file:/dev/shm/kboot/serial-vm-lkp-wsx03-1G-23 -daemonize -display none -monitor null
Thanks,
Xiaolong
4 years, 11 months