Nathan Chancellor <nathan(a)kernel.org> writes:
> On 6/20/2021 4:59 PM, Nicholas Piggin wrote:
>> Excerpts from kernel test robot's message of April 3, 2021 8:47 pm:
>>> tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>> head: d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
>>> commit: 97e4910232fa1f81e806aa60c25a0450276d99a2 linux/compiler-clang.h:
>>> date: 3 weeks ago
>>> config: powerpc64-randconfig-r006-20210403 (attached as .config)
>>> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project
>>> reproduce (this is a W=1 build):
>>> chmod +x ~/bin/make.cross
>>> # install powerpc64 cross compiling tool for clang build
>>> # apt-get install binutils-powerpc64-linux-gnu
>>> git remote add linus
>>> git fetch --no-tags linus master
>>> git checkout 97e4910232fa1f81e806aa60c25a0450276d99a2
>>> # save the attached .config to linux build tree
>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross
>>> If you fix the issue, kindly add following tag as appropriate
>>> Reported-by: kernel test robot <lkp(a)intel.com>
>>> All errors (new ones prefixed by >>):
>>>>> arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest'
>>> long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>>> 1 error generated.
>>> vim +/kvmhv_enter_nested_guest +264 arch/powerpc/kvm/book3s_hv_nested.c
>> Not much changed here recently. It's not that big a concern because
>> only called in the KVM ioctl path, not in any deep IO paths or anything,
>> and doesn't recurse. Might be a bit of inlining or stack spilling put it
>> over the edge.
> It appears to be the fact that LLVM's PowerPC backend does not emit
> efficient byteswap assembly:
>> powerpc does make it an error though, would be good to avoid that so the
>> robot doesn't keep tripping over.
> Marking byteswap_pt_regs as 'noinline_for_stack' drastically reduces the
> stack usage. If that is an acceptable solution, I can send it along
Yeah that should be OK. Can you post the before/after disassembly when
you post the patch?
It should just be two extra function calls, which shouldn't be enough
overhead to be measurable.
The diff is pretty large so I have attached it here along with the full
disassembly of the files before and after the patch I am about to send.
I will reply to this message so the history is there.